LLM-driven system uses both structured and unstructured data, provides auditable justifications
Image: Male doctor working at a laptop with a high-tech algorithmic overlay.
A medically trained artificial intelligence (AI) system deployed within a health system firewall accurately identified eligible and ineligible patients for a rare disease clinical trial, providing auditable and valid justifications. The findings, from a new study published by Cleveland Clinic researchers, suggest that this approach may enhance the efficiency of chart review and recruitment processes for clinical research.
“Our study showed that an AI system that uses both structured and unstructured data from complex real-world electronic health records [EHRs] was 96% accurate in assessing patients’ trial eligibility across nine prespecified domains of trial criteria,” says M. Trejeeve Martyn, MD, MSc, first author of the investigation, which was presented at the 2026 Technology and Heart Failure Therapeutics meeting and simultaneously published in the Journal of Cardiac Failure, official journal of the Heart Failure Society of America.
“These findings have implications, which we are actively evaluating, for how AI can help us use EHR data more broadly across research beyond clinical trials, with potential applications to retrospective studies, implementation studies and quality reporting for registries,” Dr. Martyn continues. “The number of potential use cases is large.”
Contemporary EHRs are complex, making manual chart review for clinical research time-consuming and costly. Previous efforts to automate chart review using large language models (LLMs) have shown promise but faced limitations such as reliance on simulated data, use of exclusively structured or unstructured data without synthesis, and lack of auditable justifications for decisions.
“Even though there are tremendous amounts of data that live in the EHR, the ability to use these data at scale for purposes like determining clinical trial eligibility has historically been limited,” Dr. Martyn notes.
The new study evaluated an AI system (Synapsis AI, Dyania Health) that has an LLM component and aims to overcome earlier limitations by synthesizing both structured and unstructured data from real-world EHRs and by providing interpretable justifications for its conclusions. The researchers studied the system’s ability to assess eligibility for a transthyretin amyloidosis therapeutics trial.
The AI system was deployed in August 2024 within the firewall of a unified EHR system covering multiple Cleveland Clinic hospitals and clinics in Ohio and Florida. Patients with cardiac amyloidosis-related diagnosis codes were prefiltered before AI system processing.
Cleveland Clinic investigators and the vendor created a “scoping document” based on the DepleTTR-CM phase 3 trial protocol, which served as the rubric for evaluating the AI system’s performance.
The system assessed 32 inclusion/exclusion criteria for the trial: 10 based on structured data alone, 18 combining structured data and LLM outputs (unstructured data) and four relying solely on LLM outputs.
“Structured data come from discrete fields that can readily be pulled, such as ICD-10 codes or lab values,” Dr. Martyn explains. “Unstructured data are things like imaging reports, pathology reports and clinical notes, and around 80% of EHR data is housed there. Unstructured data need to be accurately abstracted and organized, and that’s where the LLM comes in.”
The system could assign one of four labels for each criterion: accept, reject, borderline or missing information. Based on the collective criteria assessments, patients were then categorized as a complete match, partial match, borderline match or rejection.
The process ended with a “human in the loop” review in which a study investigator reviewed all partial, complete and borderline matches. Final calls on eligibility were made by the study investigators.
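The triage flow described above, per-criterion labels rolled up into a patient-level category, with matches escalated to human review, can be sketched as a toy example. The function name, label strings and roll-up rules here are illustrative assumptions, not the Synapsis AI implementation:

```python
# Illustrative sketch of the eligibility triage described above.
# Labels and roll-up rules are simplified assumptions, not the
# vendor's actual logic.

def categorize(criteria: dict[str, str]) -> str:
    """Map per-criterion labels to a patient-level category.

    Each value is one of: 'accept', 'reject', 'borderline', 'missing'.
    """
    labels = set(criteria.values())
    if "reject" in labels:
        return "rejected"
    if labels == {"accept"}:
        return "complete match"    # every criterion satisfied
    if "borderline" in labels:
        return "borderline match"  # deferred pending human review
    return "partial match"         # accepts plus missing information

# Matches are then escalated to a study investigator ("human in the
# loop") for the final eligibility call; only rejections bypass review.
patient = {"age": "accept", "nyha_class": "accept", "egfr": "missing"}
print(categorize(patient))  # -> partial match
```

In the study workflow, everything except a rejection reached an investigator, which is why the sketch funnels all non-rejected patients into match categories rather than making a final eligibility call itself.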
The primary outcome was the AI system’s accuracy in evaluating 77 trial criteria-related questions across nine broad criteria categories in a random sample of 100 patients.
Prefiltering yielded 1,476 patient EHRs with amyloid-related diagnosis codes. The AI system processed these records over six days using two graphics processing units.
In a random sample of 100 patients, the LLM answered 7,409 out of 7,700 questions correctly, achieving an accuracy of 96.2% against physician review. Patient-level accuracy averaged 96%, with a minimum of 86% (in 1 patient) and a maximum of 100% (in 12 patients).
The AI system identified 46 matches, of which 43 were deemed appropriate after human review (93.4% accuracy): 4/4 complete matches and 39/42 partial/borderline matches. The three matches overturned on human review were attributable to LLM errors. Some patients were excluded for non-protocol reasons or deferred because of borderline criteria; after excluding borderline deferrals, 30 patients were deemed immediately recruitable. Among AI-identified matches, 100% of complete matches and 76.5% of partial matches were considered eligible for recruitment.
Justifications provided by the system were judged 100% interpretable without further chart review.
Of the 1,446 patients rejected by the AI system, a random sample of 200 was physician-reviewed; 198 rejections were deemed appropriate, yielding a negative predictive value of 99%.
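The headline figures can be recomputed directly from the counts reported above, a quick sanity check on the question-level accuracy and the negative predictive value:

```python
# Recomputing the reported performance figures from the raw counts.

# Question-level accuracy: 7,409 of 7,700 criteria questions answered
# correctly against physician review.
correct_answers, total_answers = 7_409, 7_700
accuracy = correct_answers / total_answers
print(f"Question-level accuracy: {accuracy:.1%}")  # 96.2%

# Negative predictive value: 198 of 200 sampled rejections confirmed
# appropriate on physician review.
true_rejections, sampled_rejections = 198, 200
npv = true_rejections / sampled_rejections
print(f"Negative predictive value: {npv:.0%}")  # 99%
```

Note that NPV here is estimated from a 200-patient random sample of the 1,446 rejections, not from review of every rejected record.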
Notably, 29 of the 30 patients identified as readily recruitable had not been identified through routine screening processes in the prior 90 days. Exploratory analysis showed that AI-assisted screening identified more patients (30 in 7 days) than routine care (14 patients), with a higher proportion of Black patients and less established connection to heart failure specialists among AI-identified patients.
“This AI system demonstrated rapid processing of structured and unstructured data to provide accurate eligibility assessments with interpretable justifications,” Dr. Martyn observes. “The LLM showed consistent high performance (above 96%) across multiple trial criteria domains.”
He identified several key takeaways from the study.
“Clinical trials are the backbone of evidence generation in cardiology, but we know that they are expensive, time-consuming and often have trouble reaching enrollment goals,” notes study co-author Ashish Sarraju, MD, a Cleveland Clinic preventive cardiologist. “Efforts like this to incorporate auditable AI into clinical trial workflows are crucial opportunities to see if clinical trial conduct can be improved meaningfully with new technologies.”
“Our next steps are to deploy this technology for use in recruitment for more trials, both in rare conditions and more common diseases,” Dr. Martyn says. “Even common-disease trials have extensive inclusion and exclusion criteria, so this could save a lot of time spent screening for ultimately ineligible patients in that setting as well.”