Cleveland Clinic leads pioneering assessment of the AI model for health information purposes
A popular online artificial intelligence (AI) model was able to answer simple questions about cardiovascular disease prevention, possibly pointing the way to future clinical use of the technology, Cleveland Clinic researchers say.
In a study led by Ashish Sarraju, MD, of Cleveland Clinic’s Section of Preventive Cardiology and Rehabilitation, the dialogue-based AI language model Chat Generative Pre-trained Transformer (ChatGPT) gave appropriate responses to 84% of basic questions that patients might search online or ask their clinician in a patient portal. The work was published as a research letter in JAMA (Epub 3 Feb 2023).
Since its November 2022 launch, ChatGPT has been widely discussed, particularly with regard to its implications for academia and cybersecurity. It is a large language model trained on text from a wide variety of internet sources, and it presents its answers in conversational, easy-to-understand language. It was not developed for medical use, and investigations of its medical applications have so far been few, but that may well change, Dr. Sarraju says.
“ChatGPT just burst on the scene with such media attention that everybody began to use it to query things,” he remarks. “We know that our preventive cardiology patients tend to look up much of the critical information we discuss with them at visits. So we figured that as ChatGPT becomes more popular, our patients might start using it to ask questions. Before that begins to happen, we wanted to see how it performed.”
He and his colleagues also wanted to explore potential uses of ChatGPT in medical practice. “We were interested in whether it might have a place somewhere in the medical workflow where there’s a bottleneck,” Dr. Sarraju says.
For the study, the researchers developed a list of 25 questions related to preventive cardiology that patients often ask, such as “What’s the best diet for the heart?” and “How can I lose weight?” They posed each question to the ChatGPT interface three times. The responses were graded by an experienced preventive cardiology clinician as “appropriate,” “inappropriate” or, if the three responses differed, “unreliable.”
The grading was done separately for two hypothetical situations: as a response on a patient-facing platform, such as a hospital's informational website, and as an AI-generated response to an electronic message that a patient sends to their clinician.
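To make the protocol concrete, the sketch below shows how such a query loop might be run programmatically. It is a minimal illustration only, assuming access to OpenAI's chat completions API via its Python client; the study itself posed the questions through the public ChatGPT web interface, and the model name and abbreviated question list are placeholders.

```python
# Illustrative sketch of the study's query protocol (assumption: OpenAI's
# Python client; the study used the public ChatGPT web interface).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

questions = [
    "What is the best diet for the heart?",
    "How can I lose weight?",
    # ...the remaining 23 preventive cardiology questions
]

responses = {}
for q in questions:
    # Pose each question three times; under the study's grading scheme,
    # a clinician would mark a question "unreliable" if the answers differed.
    responses[q] = [
        client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[{"role": "user", "content": q}],
        ).choices[0].message.content
        for _ in range(3)
    ]
```

Repeating each query is a simple consistency check: a model that gives materially different answers to the same question could not be trusted on a patient-facing platform.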
Of the ChatGPT responses to the 25 questions, 21 (84%) were deemed appropriate in both hypothetical contexts and four were deemed inappropriate in both contexts; no response set was judged unreliable.
Two of the four inappropriate responses pertained to exercise — one to amount and the other to type. The AI firmly recommended both cardiovascular activity and weightlifting for all, rather than reflecting the fact that those activities may be harmful for some people.
“Exercise counseling is very individualized,” Dr. Sarraju explains. “An AI model that is trained on publicly available general information won’t be able to provide that level of personalized information.”
The other two inappropriate responses may be easier to correct. In one case, ChatGPT failed to mention familial hypercholesterolemia in response to a question about interpreting an LDL cholesterol level above 200 mg/dL. In the other, its response to a question about the cholesterol-lowering agent inclisiran (Leqvio®) indicated that the drug wasn't commercially available when in fact it was approved in the U.S. in December 2021.
“That speaks to training bias,” Dr. Sarraju notes. “Any AI model is only as good as the data it’s trained on.”
Nonetheless, the investigators were impressed at how well ChatGPT performed overall. “We expected it to do well with basic questions that are more factual in nature, that it presumably would have been trained on during its training timeline,” Dr. Sarraju says. “We found that even with more nuanced questions — like what someone should do if their cholesterol isn’t controlled on a statin — its responses were quite reasonable and nuanced in return. It was surprising. It did better than we expected.”
Of course, much more work is needed before a model like ChatGPT can be considered for use in clinical practice, notes study co-author Leslie Cho, MD, Co-Section Head of Preventive Cardiology and Rehabilitation at Cleveland Clinic. “Where can it enter the workflow, and what level of information can be safely delegated to the AI model before a human needs to step in? Do we need continued quality control? Do we need somebody fact-checking it regularly? These are all things we don’t yet know,” Dr. Cho points out.
There are also regulatory questions to be resolved. “If an AI model is developed for direct patient use, it needs to be regulated,” Dr. Sarraju observes. “But who’s going to do that regulation? How do we assess the quality? I assume this would go through the FDA’s device process, but that needs to be determined and spelled out in a robust manner.”