Health chatbots can leave patients with murkier advice

A recent Oxford-led study found that people using health chatbots did not make better decisions than those relying on online searches or their own judgment. Participants often left out key details, struggled to interpret responses, and were more likely to underestimate the severity of conditions they identified.

AI-powered health chatbots are becoming a common place to start when people feel unwell. In overburdened healthcare systems, long waiting lists and rising costs are pushing more people toward tools like ChatGPT for medical self-diagnosis.

A recent Oxford-led study suggests that this shift comes with a serious limitation: the chatbot is only one side of the problem. People also have to know what to ask, what details to include, and how to judge the answer they receive.

Why people are turning to health chatbots

The appeal is easy to understand. Chatbots are available quickly, they can respond in plain language, and they can feel more direct than searching through pages of online medical information.

According to one recent survey, about one in six American adults already use chatbots for health advice at least monthly. That makes chatbot health advice more than a niche behavior. It is already part of how many people try to understand symptoms and possible next steps.

But the Oxford-led study described a gap between access to AI and useful medical guidance. A chatbot can produce a confident answer, but that does not mean the exchange has captured the right facts or led the user toward the safest interpretation.

The study found a communication problem

For the study, the authors recruited around 1,300 people in the U.K. The participants were given medical scenarios written by a group of doctors and asked to identify possible health conditions and courses of action, such as seeing a doctor or going to the hospital.

Participants used several chatbots: GPT-4o (the default AI model powering ChatGPT), Cohere's Command R+, and Meta's Llama 3, which once underpinned the company's Meta AI assistant. They also used their own methods to decide what the scenarios might mean.

Adam Mahdi, director of graduate studies at the Oxford Internet Institute and a co-author of the study, told TechCrunch, "The study revealed a two-way communication breakdown." He added, "Those using [chatbots] didn't make better decisions than participants who relied on traditional methods like online searches or their own judgment."

That finding matters because the promise of health chatbots is not just speed. The stronger claim is that they could help people make better choices. In this study, that improvement did not appear.

Where chatbot health advice breaks down

The study points to several weak spots in the interaction between people and AI systems. The problem was not described as a single bad answer or one model failing in isolation. It was a broader issue with how humans and chatbots exchange medical information.

  • Participants often omitted key details when asking chatbots for help.
  • Some chatbot answers were difficult for participants to interpret.
  • Responses frequently mixed useful guidance with weaker recommendations.
  • Using chatbots made participants less likely to identify a relevant health condition.
  • Participants using chatbots were more likely to underestimate the severity of conditions they did identify.

Mahdi said, "[T]he responses they received [from the chatbots] frequently combined good and poor recommendations." That is a difficult format for a person making a health decision, especially when they are already unsure what symptoms matter.

In medical self-diagnosis, missing context can change the entire direction of the advice. A chatbot may not know which detail is absent. A user may not know that the missing detail is important. That is the two-way communication problem at the center of the study.

Tech companies are still pushing deeper into health

The findings arrive as major technology companies continue to explore AI in healthcare and wellness. Apple is reportedly developing an AI tool that can dispense advice related to exercise, diet, and sleep. Amazon is exploring an AI-based way to analyze medical databases for "social determinants of health." Microsoft is helping build AI to triage the messages that patients send to care providers.

These projects are not identical to chatbot self-diagnosis, but they sit in the same wider trend: companies are looking for ways to use AI to improve health outcomes. The Oxford-led study is a reminder that real-world use can be more complicated than model performance alone.

TechCrunch has previously reported that professionals and patients are divided on whether AI is ready for higher-risk health applications. The American Medical Association recommends against physician use of chatbots like ChatGPT for assistance with clinical decisions. Major AI companies, including OpenAI, warn against making diagnoses based on their chatbots' outputs.

What the study implies for patients

The clearest takeaway is not that people will stop asking chatbots health questions. The survey data suggests many already do. The more practical issue is whether people understand the limits of the answers they receive.

Mahdi's recommendation was direct: "We would recommend relying on trusted sources of information for healthcare decisions." He also said, "Current evaluation methods for [chatbots] do not reflect the complexity of interacting with human users. Like clinical trials for new medications, [chatbot] systems should be tested in the real world before being deployed."

That real-world testing point is central. A chatbot may perform differently when judged on a controlled benchmark than when used by a person who is worried, missing details, or trying to decide whether a symptom is serious.

For health chatbots to become safer and more useful, the interaction itself has to be evaluated, not only the model's answer in isolation. The study suggests that the future of AI in healthcare depends as much on communication design and user behavior as on the underlying model.