Harvard study: AI beats doctors in diagnosis and care management

By Akash Pandey

May 01, 2026

02:50 pm

What's the story

A recent study published in the journal Science has revealed that an artificial intelligence (AI) model can outperform experienced doctors in diagnosing patients and managing their care. The research, conducted by a team from Harvard Medical School and Beth Israel Deaconess Medical Center, tested an AI reasoning model developed by OpenAI against human physicians and an earlier OpenAI model, GPT-4.

Testing methods

AI model's clinical acumen tested through real-life cases

The researchers put the AI model through a series of experiments to evaluate its clinical acumen. These tests included real-life cases, such as that of a patient with lupus who had previously visited Beth Israel's emergency department in Boston. The team assessed how accurately the AI model could diagnose at three stages of care, including during triage in the ER and after the patient was admitted to the hospital.

Performance comparison

AI model excelled in providing accurate diagnoses

The study found that the AI model outperformed two experienced physicians in providing accurate diagnoses. The model worked solely from electronic health records and the limited information available to the doctors at the time. Dr. Adam Rodman, a clinical researcher at Beth Israel and one of the study authors, said, "This is the big conclusion for me - it works with the messy real-world data of the emergency department."

Benchmark achievement

Model also excelled on case reports and clinical vignettes

The study also tested the AI model on clinical vignettes and on case reports published in the New England Journal of Medicine. This was done to see if it could meet established "benchmarks" and tackle complex diagnostic questions. Raj Manrai, an assistant professor of Biomedical Informatics at Harvard Medical School who was also part of the study, said, "The model outperformed our very large physician baseline."

Limitations acknowledged

Study highlights limitations of AI model compared to physicians

The authors of the study noted that the AI model relied solely on text for diagnosis, while real-life clinicians have to consider other inputs like images, sounds, and nonverbal cues. Dr. David Reich, chief clinical officer for Mount Sinai Health System in New York, who wasn't involved in this research, said, "This paper is a beautiful summary of just how much things have improved."

Future implications

Need for rigorous testing of AI models emphasized

Despite the model's impressive performance, the study authors don't believe these results support replacing doctors with AI. Manrai said, "I think it does mean that we're witnessing a really profound change in technology that will reshape medicine." The findings highlight the need for rigorous testing of AI models, ideally through forward-looking trials, to better understand their impact on clinical practice.