
ChatGPT Performs ‘Better Than Expected’ in Responding to Basic Cardiology Queries

Cleveland Clinic leads pioneering assessment of the AI model for health information purposes


A popular online artificial intelligence (AI) model was able to answer simple questions about cardiovascular disease prevention, possibly pointing the way to future clinical use of the technology, Cleveland Clinic researchers say.


In a study led by Ashish Sarraju, MD, of Cleveland Clinic’s Section of Preventive Cardiology and Rehabilitation, the dialogue-based AI language model Chat Generative Pre-trained Transformer (ChatGPT) gave appropriate responses to 84% of basic questions that patients might search online or ask their clinician in a patient portal. The work was published as a research letter in JAMA (Epub 3 Feb 2023).

ChatGPT essentials

Since its November 2022 launch, ChatGPT has been widely discussed, particularly with regard to its implications for academia and cybersecurity. It works by integrating information from a variety of sources across the internet and presenting the results in an understandable way. It was not developed for medical use, and investigations of its medical applications have thus far been few, but that may well change, Dr. Sarraju says.

“ChatGPT just burst on the scene with such media attention that everybody began to use it to query things,” he remarks. “We know that our preventive cardiology patients tend to look up much of the critical information we discuss with them at visits. So we figured that as ChatGPT becomes more popular, our patients might start using it to ask questions. Before that begins to happen, we wanted to see how it performed.”

He and his colleagues also wanted to explore potential uses of ChatGPT in medical practice. “We were interested in whether it might have a place somewhere in the medical workflow where there’s a bottleneck,” Dr. Sarraju says.

The study in brief

For the study, the researchers developed a list of 25 questions related to preventive cardiology that patients often ask, such as “What’s the best diet for the heart?” and “How can I lose weight?” They posed each question to the ChatGPT interface three times. The responses were graded by an experienced preventive cardiology clinician as “appropriate,” “inappropriate” or, if the three responses differed, “unreliable.”


The grading was done separately for two hypothetical situations: as a response on a patient-facing platform, such as a hospital informational website, and as an AI-generated response to an electronic message sent by a patient to their clinician.

Of the ChatGPT responses to the 25 questions, 21 were deemed appropriate in both hypothetical contexts and four were deemed inappropriate in both contexts; none of the responses were judged unreliable.

Two of the four inappropriate responses pertained to exercise — one to amount and the other to type. The AI firmly recommended both cardiovascular activity and weightlifting for all, rather than reflecting the fact that those activities may be harmful for some people.

“Exercise counseling is very individualized,” Dr. Sarraju explains. “An AI model that is trained on publicly available general information won’t be able to provide that level of personalized information.”

The other two inappropriate responses may be easier to correct. In one case, ChatGPT failed to mention familial hypercholesterolemia in response to a question about interpreting an LDL cholesterol level above 200 mg/dL. In the other, its response to a question about the cholesterol-lowering agent inclisiran (Leqvio®) indicated that the drug wasn’t commercially available when in fact it was licensed in the U.S. in December 2021.

“That speaks to training bias,” Dr. Sarraju notes. “Any AI model is only as good as the data it’s trained on.”

Exceeding expectations — but not ready for clinical use

Nonetheless, the investigators were impressed at how well ChatGPT performed overall. “We expected it to do well with basic questions that are more factual in nature, that it presumably would have been trained on during its training timeline,” Dr. Sarraju says. “We found that even with more nuanced questions — like what someone should do if their cholesterol isn’t controlled on a statin — its responses were quite reasonable and nuanced in return. It was surprising. It did better than we expected.”


Of course, much more work is needed before a model like ChatGPT can be considered for use in clinical practice, notes study co-author Leslie Cho, MD, Co-Section Head of Preventive Cardiology and Rehabilitation at Cleveland Clinic. “Where can it enter the workflow, and what level of information can be safely delegated to the AI model before a human needs to step in? Do we need continued quality control? Do we need somebody fact-checking it regularly? These are all things we don’t yet know,” Dr. Cho points out.

There are also regulatory questions to be resolved. “If an AI model is developed for direct patient use, it needs to be regulated,” Dr. Sarraju observes. “But who’s going to do that regulation? How do we assess the quality? I assume this would go through the FDA’s device process, but that needs to be determined and spelled out in a robust manner.”

