AI chatbots miss key signs of psychiatric drug reactions, lag behind expert advice

MedicalXpress Breaking News-and-Events May 20, 2025

Asking artificial intelligence for advice can be tempting. Powered by large language models (LLMs), AI chatbots are available 24/7, are often free to use, and draw on troves of data to answer questions. Now, people with mental health conditions are asking AI for advice when experiencing potential side effects of psychiatric medicines—a decidedly higher-risk situation than asking it to summarize a report.

One question puzzling the AI research community is how AI performs when asked about mental health emergencies. Globally, including in the US, there is a significant gap in mental health treatment, with many individuals having limited to no access to mental health care. It's no surprise that people have started turning to AI chatbots with urgent health-related questions.

Now, researchers at the Georgia Institute of Technology have developed a new framework to evaluate how well AI chatbots can detect potential adverse drug reactions in chat conversations and how closely their advice aligns with that of human experts. The study was led by Munmun De Choudhury, the J.Z. Liang Associate Professor in the School of Interactive Computing, and Mohit Chandra, a third-year computer science PhD student, and is available on the arXiv preprint server.

Putting AI to the test

Going into their research, De Choudhury and Chandra wanted to answer two main questions: first, can AI chatbots accurately detect whether someone is having side effects or adverse reactions to medication? Second, if they can accurately detect these scenarios, can AI agents then recommend good strategies or action plans to mitigate or reduce harm?

The researchers collaborated with a team of psychiatrists and psychiatry students to establish clinically accurate answers from a human perspective and used those to analyse AI responses.

To build their dataset, they went to the internet's public square, Reddit, where many have gone for years to ask questions about medication and side effects.

They evaluated nine LLMs, including general-purpose models (such as GPT-4o and Llama-3.1) and specialised medical models trained on medical data. Using the evaluation criteria provided by the psychiatrists, they computed how precise the LLMs were in detecting adverse reactions and in correctly categorising the types of adverse reactions caused by psychiatric medications.
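As a rough illustration of what scoring detection against expert judgement can look like, the Python sketch below compares a model's yes/no flags with expert labels and reports precision and recall. The function name, the toy labels, and the choice to report recall alongside precision are assumptions made for illustration; the article does not describe the study's actual scoring code or data.

```python
from typing import Sequence, Tuple


def precision_recall(expert: Sequence[bool], model: Sequence[bool]) -> Tuple[float, float]:
    """Score binary adverse-reaction flags against expert labels.

    expert[i] is True when the experts judged post i to describe an
    adverse drug reaction; model[i] is the model's flag for the same post.
    """
    if len(expert) != len(model):
        raise ValueError("expert and model label lists must have the same length")

    pairs = list(zip(expert, model))
    true_pos = sum(1 for e, m in pairs if e and m)
    false_pos = sum(1 for e, m in pairs if not e and m)
    false_neg = sum(1 for e, m in pairs if e and not m)

    # Precision: of the posts the model flagged, how many did experts also flag?
    precision = true_pos / (true_pos + false_pos) if (true_pos + false_pos) else 0.0
    # Recall: of the posts experts flagged, how many did the model catch?
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return precision, recall


# Hypothetical labels for six posts (not data from the study).
expert_labels = [True, True, False, True, False, False]
model_labels = [True, False, False, True, True, False]

precision, recall = precision_recall(expert_labels, model_labels)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy case the model flags three posts, two of which the experts also flagged, and misses one post the experts flagged, so both scores come out at two thirds.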

Additionally, they prompted LLMs to generate answers to queries posted on Reddit and compared the alignment of LLM answers with those provided by the clinicians over four criteria: (1) emotion and tone expressed, (2) answer readability, (3) proposed harm-reduction strategies, and (4) actionability of the proposed strategies.

The research team found that LLMs stumble when comprehending the nuances of an adverse drug reaction and distinguishing different types of side effects. They also discovered that while LLMs sounded like human psychiatrists in their tones and emotions, such as being helpful and polite, they had difficulty providing true, actionable advice aligned with the experts.

Better bots, better outcomes

The team's findings could help AI developers build safer, more effective chatbots. Chandra's ultimate goals are to inform policymakers of the importance of accurate chatbots and help researchers and developers improve LLMs by making their advice more actionable and personalised.

Chandra notes that improving AI for psychiatric and mental health concerns would be particularly life-changing for communities that lack access to mental health care.

"When you look at populations with little or no access to mental health care, these models are incredible tools for people to use in their daily lives," Chandra said. "They are always available, they can explain complex things in your native language, and they become a great option to go to for your queries.

"When the AI gives you incorrect information by mistake, it could have serious implications on real life," Chandra added. "Studies like this are important because they help reveal the shortcomings of LLMs and identify where we can improve."
