News Dailys Lifestyle
An AI out-diagnosed ER doctors on real patient records

By Nikola Gjakovski
Author | Life Coach | Hard Work Advocate | Social Media Expert — Inspiring people to build the lives they actually want.
Last updated: June 3, 2026

Contents
  • What the Test Was Actually Measuring
  • What This Doesn’t Mean

In a study published in 2026, researchers affiliated with Harvard Medical School and a major Boston academic medical center quietly did something the medical establishment had been dreading. They handed an AI the same messy, incomplete electronic health records that ER doctors work from every day, with no curated test questions and no clean data sets, and let it try to diagnose real patients. The AI won.

An OpenAI reasoning model correctly identified diagnoses in a majority of real emergency-room triage cases. Two experienced attending physicians, working from the identical records, scored notably lower. The study tested the model against a cohort of real ER patients from the study site at three stages of care: initial triage, first doctor interaction, and hospital admission.

That last detail is the one worth sitting with. Every previous AI-versus-doctor study used cleaned-up, structured cases, the kind of tidy problem sets that appear on board exams. This one used the real thing: incomplete notes, ambiguous symptoms, and records typed fast by nurses during shift change. The researchers called this the study’s most important distinction.

And here’s where it gets strange. The AI wasn’t just matching doctors on routine cases. On structured treatment-planning problems, o1 scored substantially higher than physicians, who scored less than half as high as the model even when they had access to external reference resources.

According to the study, one illustrative case involved a patient with a complex presentation, including a clot that wasn’t responding. The AI flagged a possible lupus history buried in the patient’s records, a connection that the clinical team had not made and that could explain why the clot wasn’t responding. Whether that flag led to a changed outcome isn’t stated in the published findings, but the point stands. The model found a thread the doctors missed, buried in the same chart they were all reading.

What the Test Was Actually Measuring


The natural question is: how? A language model isn’t trained to diagnose anything. It’s trained to predict text. So how does predicting the next word in a sentence become something that looks like clinical reasoning?

The short answer is that medical knowledge is encoded in language. Everything a physician learns, from textbooks, case studies, treatment protocols, and published research, exists as text. A model trained on enough of that material develops something that functions like pattern recognition. It reads a chart the way a senior clinician reads a chart: looking for what doesn’t fit, what’s missing, what one symptom combination tends to mean.

One of the study’s co-authors put it plainly: AI models now score close to 100% on multiple-choice medical licensing exams. “We can’t track progress anymore,” he said. In the authors’ view, performance on standardized medical exams has become so high that such benchmarks can no longer meaningfully track progress.

The ceiling comment is the part that should get attention. Medical AI benchmarks were designed to measure progress. They no longer do, because the thing being measured has outgrown them. The field is now running a different kind of race, against real-world conditions, real patient records, real clinical noise, and this study is the first serious lap time.

What This Doesn’t Mean


None of this is an argument for replacing physicians. A diagnosis is a decision, and decisions require accountability, judgment about what a patient can tolerate, and a conversation about risk. An AI doesn’t sit across from anyone. It doesn’t notice that the patient seems more confused than the chart suggests, or that the family in the hallway looks terrified and needs five minutes.

What the study actually shows is something more specific and more useful: that the information-processing part of diagnosis, the part that requires holding dozens of variables in mind simultaneously and checking them against a vast store of prior cases, is something AI does better than humans. Not sometimes. Reliably.

The question that follows isn’t whether to hand medicine over to a language model. It’s whether a doctor reading an ER chart in 2027 who isn’t using one of these tools is making a decision with less information than they could have. That’s the gap this study opened. It won’t close on its own.

This article was researched, written, and edited by our human editorial team. AI tools were used in a limited research-assistant capacity. All claims were independently verified.

© 2026 News Daily. All Rights Reserved.