October 25, 2023

ChatGPT “Pretty Good” at Basic IBS Info, but Misses Details

Important to recognize limitations when it comes to healthcare and research

The artificial intelligence (AI) language model ChatGPT works reasonably well for answering basic queries from the public about irritable bowel syndrome (IBS), but less well for medical professionals seeking detailed, referenced information.

That’s what researchers from Cleveland Clinic’s Digestive Disease Institute concluded when they tested ChatGPT version 4.0 with 15 commonly asked questions about IBS. Its overall accuracy was 80%, and no answers were judged fully inaccurate, making it adequate for basic inquiries.

However, the model still missed details and gave some outdated answers. Moreover, physicians seeking medical literature references will be disappointed, explain study co-author Anthony Lembo, MD, Director of Research at the Institute, and presenting co-author Brian Baggott, MD, a gastroenterologist in Cleveland Clinic’s Department of Gastroenterology, Hepatology & Nutrition.

“For the most part, the model was reasonable. Where it gets lost – and I think this is important to emphasize about ChatGPT – it only has what’s available in the public domain. It can’t find papers that aren’t publicly available,” notes Dr. Baggott.

Regardless of what clinicians might think of ChatGPT, Dr. Lembo says it’s important they understand its advantages and limitations as patients use it to seek information. “ChatGPT is becoming more and more popular among laypeople. It used to be Google, but now they’re turning to ChatGPT because it’s more sophisticated.”

Study methods

For the study, the investigators tested the most current version, ChatGPT 4.0, with 15 common questions about IBS that they derived from both ChatGPT itself and from Google Trends. For each question, they asked ChatGPT to provide references from the medical literature along with the answers. Three independent gastroenterologists then assessed ChatGPT’s answers in three ways:

1) An overall assessment as either “accurate” or “inaccurate.”

2) Granular assessments as either “100% accurate,” “accurate with missing information,” “partly inaccurate” or “100% inaccurate.”

3) An assessment of the references as “suitable,” “unsuitable” or “nonexistent.”

Overall, the researchers deemed 80% of the answers “accurate” and 20% “inaccurate.” For the granular assessments, they considered just under two-thirds of ChatGPT’s answers to be 100% accurate, about one-third to be “partly inaccurate,” and a small number “accurate with missing information.” None were considered 100% inaccurate.

Strengths and weaknesses of ChatGPT

The two questions that ChatGPT 4.0 answered best were “What causes IBS?” and “What foods should I avoid if I have IBS?” For both, the answers earned overall and granular grades of “accurate,” and the references provided were judged “suitable.”

Just two questions were answered inaccurately overall. These were “How is IBS diagnosed?” and “Can [cannabidiol] improve IBS symptoms?”

However, several more answers were granularly considered “partly inaccurate,” including “Is there a test for IBS?” Two more were deemed “accurate with missing information,” including “What support resources are available for people with IBS?”

Dr. Baggott points out that even though ChatGPT can’t access subscriber-restricted journal text, it should be able to find other sources referencing that content. That can take time, however. For example, it didn’t know that recent guidelines advise against using probiotics for IBS.

“It’s not always up to date. Plus, the results in that example may be confounded by the fact that a lot of people write materials saying that probiotics work for them,” Dr. Baggott notes.

He also points out that ChatGPT may miss subtle points that may or may not make a difference clinically. “So it’s not that it’s completely wrong, but it’s just not the way I would explain it to a patient.”

As for the references, they were “nonexistent” for the questions “What are the treatment options for IBS?” and “How to manage IBS during pregnancy?” Eight more were deemed “unsuitable,” while just five were “suitable.”

“ChatGPT is pretty good, but it doesn’t get the references. It will eventually, though,” Dr. Lembo predicts.

The takeaway from this study, he says, is to “caution patients that ChatGPT can give you good information about general topics, but for specific answers you should always consult your doctor.”

The study’s first author, post-doctoral research fellow Joseph El Dahdah, MD, is scheduled to present the new findings, “ChatGPT-4.0 Answers Common Irritable Bowel Syndrome Patient Queries: Accuracy and References Validity,” at the American College of Gastroenterology meeting in Vancouver in late October.
