ChatGPT outperforms physicians in providing high-quality, empathetic advice to patient questions
There has been widespread speculation about how advances in artificial intelligence (AI) assistants like ChatGPT could be used in medicine. A new study published in JAMA Internal Medicine provides an early glimpse into the role that AI assistants could play.
The study compared written responses from physicians and those from ChatGPT to real-world health questions. A panel of licensed healthcare professionals preferred ChatGPT’s responses 79% of the time and rated ChatGPT’s responses as of higher quality and more empathetic.
What the researchers say: “The opportunities for improving healthcare with AI are massive,” said the lead author. “AI-augmented care is the future of medicine.”
In the new study, the research team set out to answer the question: Can ChatGPT respond accurately to the questions patients send to their doctors? If so, AI models could be integrated into health systems to improve physician responses to patient questions and ease the ever-increasing burden on physicians.
“ChatGPT might be able to pass a medical licensing exam,” said the study’s co-author, “but directly answering patient questions accurately and empathetically is a different ballgame.”
“The COVID-19 pandemic accelerated virtual healthcare adoption,” explained the researchers. “While this made accessing care easier for patients, physicians are now burdened by a barrage of electronic patient messages seeking medical advice, which has contributed to record-breaking levels of physician burnout.”
To obtain a large and diverse sample of healthcare questions and physician answers that did not contain identifiable personal information, the team turned to social media where millions of patients publicly post medical questions to which doctors respond: Reddit’s AskDocs.
r/AskDocs is a subreddit with approximately 452,000 members, where users post medical questions and verified healthcare professionals submit answers. While anyone can respond to a question, moderators verify healthcare professionals’ credentials, and each response displays the respondent’s level of credentials. The result is a large and diverse set of patient medical questions and accompanying answers from licensed medical professionals.
While some may wonder if question-answer exchanges on social media are a fair test, team members noted that the exchanges were reflective of their clinical experience.
The team randomly sampled 195 exchanges from AskDocs where a verified physician responded to a public question. The team provided the original question to ChatGPT and asked it to author a response. A panel of three licensed healthcare professionals assessed each question and the corresponding responses and were blinded to whether the response originated from a physician or ChatGPT. They compared responses based on information quality and empathy, noting which one they preferred.
The panel of healthcare professional evaluators preferred ChatGPT responses to physician responses 79% of the time.
“ChatGPT messages responded with nuanced and accurate information that often addressed more aspects of the patient’s questions than physician responses,” the authors said.
Additionally, ChatGPT responses were rated significantly higher in quality than physician responses: the prevalence of responses rated good or very good was 3.6 times higher for ChatGPT (78.5% versus 22.1% for physicians). ChatGPT responses were also more empathetic: the prevalence of responses rated empathetic or very empathetic was 9.8 times higher for ChatGPT (45.1% versus 4.6% for physicians).
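The reported multipliers follow directly from the published rating percentages; a quick sketch in Python confirms the arithmetic (the percentages are the study's figures, and "prevalence ratio" here is simply one proportion divided by the other):

```python
# Proportions of responses rated "good or very good" in quality
physician_quality, chatgpt_quality = 0.221, 0.785
# Proportions of responses rated "empathetic or very empathetic"
physician_empathy, chatgpt_empathy = 0.046, 0.451

# Prevalence ratios: how many times more often ChatGPT received the top ratings
quality_ratio = chatgpt_quality / physician_quality
empathy_ratio = chatgpt_empathy / physician_empathy

print(round(quality_ratio, 1))  # 3.6
print(round(empathy_ratio, 1))  # 9.8
```

Both ratios match the figures quoted in the study (3.6 and 9.8), so the multipliers are consistent with the underlying percentages.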
“I never imagined saying this,” added the study coauthor, “but ChatGPT is a prescription I’d like to give to my inbox. The tool will transform the way I support my patients.”
“While our study pitted ChatGPT against physicians, the ultimate solution isn’t throwing your doctor out altogether,” cautioned the researchers. “Instead, a physician harnessing ChatGPT is the answer for better and empathetic care.”
“Our study is among the first to show how AI assistants can potentially solve real world healthcare delivery problems,” they added. “These results suggest that tools like ChatGPT can efficiently draft high quality, personalized medical advice for review by clinicians.”
So, what? If all the physician does is review AI answers, why not have the review done by another AI? This is a conundrum faced in many professions: lawyers whose knowledge of the law will never match AI's, accountants who cannot match AI's grasp of tax and accounting rules, screen and television writers and novelists who are already being displaced by AI, and board members who first use AI and are then replaced by it (this is already happening).
In the end there may have to be jobs reserved for humans regardless of the skills of ChatGPT and its generative AI (GAI) successors. The problem then will be: who makes that decision, and how will it be enforced? By GAI, perhaps?
Join our tribe
Subscribe to Dr. Bob Murray’s Today’s Research, a free weekly roundup of the latest research in a wide range of scientific disciplines. Explore leadership, strategy, culture, business and social trends, and executive health.