AI chatbots fail medical misinformation test, returning inaccurate and fabricated advice

Sahwa@reddthat.com · 13 days ago

AI chatbots fail medical misinformation test, returning inaccurate and fabricated advice

pooterbroo@programming.dev · 12 days ago

They presented five generative AI chatbots—Gemini (2.0, Google; version available December 2024), DeepSeek (V3, High-Flyer; version available December 2024), Meta AI (Llama 3.3, Meta; version available December 2024), ChatGPT (3.5, OpenAI; version available November 2022) and Grok (2, xAI; version available August 2024)—with a series of closed- and open-ended prompts across five misinformation-prone categories.

Couldn’t the researchers at least bother to use the latest models?

Rimu@piefed.social · 12 days ago

The study was done in Feb 2025 and they probably wrote the research proposal months before then, waited for approval / funding, etc. I don’t know the process of how academia works but I imagine it to be very slow and bureaucratic.

https://bmjopen.bmj.com/content/16/4/e112695

pooterbroo@programming.dev · 12 days ago

Well they didn’t even use the latest models in Feb 2025. They should’ve used DeepSeek R1 and OpenAI o3-mini which use additional test time compute to arrive at better answers. They used GPT 3.5 which was about 2½ years old at the time.

AI chatbots fail medical misinformation test, returning inaccurate and fabricated advice

AI chatbots fail medical misinformation test, returning inaccurate and fabricated advice

Just a moment...