There has been a major earthquake in science and medicine, and the world will feel its aftershocks for years to come.

The effects of artificial intelligence (AI) are being discussed in every corner of modern society, yet health care has so far remained somewhat sheltered from its broader impacts. Interest is widespread, but adoption has been uneven globally, slowed by regulation, the high stakes of getting things wrong, and the deeply human nature of clinical work.

Dr. Robert Wachter, the chair of medicine at the University of California, San Francisco, and one of the most thoughtful observers of clinical AI, argues in his brilliant new book A Giant Leap that the question of where AI fits in medicine is best understood along two axes: feasibility and risk. Diagnosis, he writes, is high risk and low feasibility, and thus difficult to execute.

There is little people agree on these days, but I would wager that most of us hold doctors who make quick and accurate diagnoses in high regard. “Our reverence is partly because effective medical care begins with the correct diagnosis, and partly because it is the most interesting thing we do,” Dr. Wachter writes, adding that it is also vitally important to the patient. Wrong diagnoses cost time, money, the chance to deliver the right treatment, and too often, life itself.

Anyone who has watched an adept diagnostician at work knows there is something almost magical about how they tease out a disease through a Sherlockian process of elimination, Bayesian theory, and a slate of differential diagnoses. This is why we venerate fictional detectives like House, and why real diagnostic brilliance still feels like one of medicine’s highest arts.

Diagnosis has remained one of medicine’s hardest frontiers for AI, until now.

In a study just published in Science, Dr. Peter G.
Brodeur and colleagues at Harvard Medical School and Beth Israel Deaconess Medical Center fed OpenAI’s o1-preview reasoning model the same triage notes a nurse might scribble at the front door of an emergency room (vital signs, a basic history, the first impression of a sick patient).

The results were astonishing.

On 76 real cases drawn from a major Boston emergency department, the model arrived at the exact or a very close diagnosis 67 percent of the time. The two attending doctors tested against it managed 55 percent and 50 percent. When the same doctors were later handed a stack of differential diagnoses and asked to guess which had been written by a colleague and which by AI, they could not tell the difference. One got 15 percent of his guesses right; the other got 3 percent.

A brief word on what these systems are. A large language model is trained on vast quantities of text drawn from the internet, books, and other sources. By processing trillions of words, it learns the statistical patterns of language. ChatGPT, Gemini, and Claude are large language models. Given a question, they generate an answer one word at a time by predicting what should come next.

Until recently, these models produced answers in one breath. A reasoning model, such as OpenAI’s o1 series, works differently. It is trained to slow down, to work through a problem step by step, and to check its own work along the way. The shift from one-breath answers to deliberate reasoning has produced one of the largest jumps in AI performance yet.

The Boston emergency department was not the only test. The same model was put through five other experiments, including the New England Journal of Medicine’s published case puzzles, long used as demanding tests of diagnostic reasoning, and a separate set of management cases drawn from real patients in which the question was not what the diagnosis is but what to do next.
Across the study, the model was compared with hundreds of doctors, earlier AI systems, and historical human baselines. In nearly every experiment, the model performed at or above the level of the doctors.

The authors also checked whether the model was simply remembering cases it had been trained on. It was not. The model was not merely retrieving information. It appeared to be reasoning.

So, are doctors finished? Of course not.

In an accompanying Perspective in Science, Drs. Ashley M. Hopkins and Erik Cornelisse of Flinders University in Australia put it plainly: “Passing examinations is not the same as being a doctor.”

Reasoning over text is not the same as being a doctor either. The model worked from words alone. It did not see the patient. It could not order a test and revise its thinking when the result came back. It did not have to deliver bad news to a family.

Medicine is not just a set of inferences. It is a relationship and a calling.

There are also reasons to worry about how AI changes the doctor who uses it. Aviation has lived with a version of this problem for decades. When machines take over, human skills atrophy, which is why pilots are still trained for the moments when automation fails. There is no good reason to think clinical reasoning is exempt.

A radiologist who reads images alongside AI may grow dull on exactly the cases the AI gets wrong. A junior doctor trained in this era may never develop the pattern recognition that comes from years of unaided practice. A clinician will often order the test the AI flagged even when their own judgment says it is unnecessary, because the cost of ignoring an algorithmic flag, in litigation and in conscience, is heavier than the cost of acting on it.

Sometimes, paradoxically, a clinician with AI performs worse than a clinician without it. One study in JAMA found that systematically biased AI predictions reduced clinicians’ diagnostic accuracy.
The presence of an apparently authoritative second opinion changes how the first one is offered.

This study is also already old. The model tested was released in September 2024. In artificial intelligence, that is a generation. Current AI reasoning systems are multimodal, processing not only text but also images, audio, and video. The next generation will be more capable still.

But there are important questions the medical community and society at large have not yet answered, questions that cannot be resolved by building smarter models alone. We need evaluation frameworks that measure performance in the noisy ambiguity of real care rather than the clean prose of a case puzzle, transparency that lets a patient know an algorithm has shaped their diagnosis, and clear lines of accountability when something goes wrong.

(Anirban Mahapatra is a scientist and author. His most recent book is When the Drugs Don’t Work. The views expressed are personal.)