Turning real-world data into insights
A lot has been said and written about the potential of artificial intelligence (AI) to advance and improve health care. And while there are still many challenges to overcome and questions to answer, one area in particular shows dramatic positive results.
The use of AI to unscramble the unstructured data in electronic health records (EHRs) is uncovering previously hidden insights and bringing significant advances to health care and clinical research.
EHR data, which captures patients’ medical histories, has long been recognized as a remarkable wealth of information. De-identified to protect patient privacy, the data can be mined to gain insights into disease trends, treatment effectiveness, and utilization patterns.
The challenge, however, is that most information in EHRs isn’t in formats that can be easily queried, compared, or analyzed. Structured data—things like patient vitals, medication records, diagnostic codes, procedure histories, and lab results—makes up only a portion of the EHR.
The remainder is unstructured: clinician notes, radiology reports, and images. This unstructured data holds great value, but making sense of it en masse was resource-intensive, if not impossible, until the advent of AI.
Turning real-world data into insights
Sophisticated machine learning (ML) and natural language processing (NLP) tools can rapidly extract key information from free-text clinical notes and images. With careful oversight and validation, this real-world data (RWD) becomes an invaluable asset to the medical community.
Momentum is growing for the use of real-world data in patient care and drug development. Recent FDA guidance on using RWD from EHRs and medical claims data in regulatory decision-making marks a milestone in regulatory acceptance of AI-validated, real-world datasets.
Already, clinical researchers are leveraging AI tools to find important insights into disease progression, treatment decisions, and long-term patient outcomes.
For example, an AI-powered analysis of language patterns in unstructured EHR clinical notes identified metastatic progression that was not coded in the patient record. The effort produced a fivefold increase in the number of metastatic patients identified for a prostate cancer study.
In another case, researchers studying geographic atrophy (GA) used AI to extract disease signals from unstructured clinical notes and images of the eye. The approach expanded a GA cohort by nearly half a million patients and improved insight into disease progression and treatment outcomes.
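To make the first example above more concrete, here is a minimal sketch of how patients with metastatic disease mentioned in notes, but not coded in the record, might be surfaced. It is a toy, rule-based stand-in for the ML and NLP models described here, not the method used in the study. The record layout, the function names, and the keyword and negation patterns are all assumptions made for illustration; the only external fact relied on is that ICD-10 categories C77 to C79 denote secondary malignant neoplasms.

```python
import re

# Hypothetical patterns for illustration; a real system would rely on a
# clinically validated NLP model rather than keywords.
METASTASIS = re.compile(r"\bmetasta(?:sis|ses|tic)\b", re.IGNORECASE)
NEGATION = re.compile(
    r"\b(?:no|without|negative for|no evidence of)\b[^.]*$", re.IGNORECASE
)

# ICD-10 category prefixes for secondary (metastatic) malignant neoplasms.
SECONDARY_CODES = ("C77", "C78", "C79")


def note_mentions_metastasis(note: str) -> bool:
    """Return True if any sentence affirms metastasis (naive negation check)."""
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        match = METASTASIS.search(sentence)
        # Treat the mention as negated if a negation cue precedes it
        # in the same sentence.
        if match and not NEGATION.search(sentence[: match.start()]):
            return True
    return False


def find_uncoded_metastatic(patients):
    """List IDs whose notes affirm metastasis but whose codes do not."""
    found = []
    for patient in patients:
        coded = any(code.startswith(SECONDARY_CODES) for code in patient["codes"])
        if not coded and any(note_mentions_metastasis(n) for n in patient["notes"]):
            found.append(patient["id"])
    return found


# Made-up, de-identified sample records (assumed layout).
patients = [
    {"id": "P1", "codes": ["C61"],
     "notes": ["Bone scan shows new metastatic lesions in the pelvis."]},
    {"id": "P2", "codes": ["C61"],
     "notes": ["PSA stable. No evidence of metastatic disease."]},
    {"id": "P3", "codes": ["C61", "C79.51"],
     "notes": ["Known bone metastases, continuing therapy."]},
]

uncoded = find_uncoded_metastatic(patients)
print(uncoded)
```

In this sample only P1 is flagged: P2's mention is negated and P3 is already coded. Keyword rules like these miss context that trained models and clinician review are meant to catch, which is why oversight and validation matter.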
Growing potential in research and health care
With AI-powered tools, life sciences companies are unlocking critical insights trapped in unstructured, real-world data, improving trial efficiency, and accelerating drug development.
- Researchers can optimize site selection by analyzing historical recruitment, patient demographics, and disease burden.
- They can drive patient recruitment by matching eligible patients to trials using predictive analytics.
- They can improve study design and reduce complexity.
- They can enhance patient safety by identifying safety signals and adverse event patterns across diverse populations.
At the same time, AI’s ability to process vast amounts of data points to a new era of personalized health care, with improved diagnostics, algorithmic predictions of patient responses, and tailored treatment plans.
With so much potential, it’s important to remember that ML and NLP models are only as good as the data that powers them. Realizing the promise of AI-driven health care depends on high-quality, properly validated data-curation processes.
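One common way to check that a curation process is "properly validated" is to compare what a model flags against a clinician-reviewed reference set. The sketch below computes precision and recall for that comparison. The function name, the patient IDs, and the idea of a chart-review reference set are illustrative assumptions, not a description of any specific validation protocol.

```python
def precision_recall(predicted: set, reference: set):
    """Compare model-flagged patient IDs with clinician-confirmed IDs."""
    true_positives = len(predicted & reference)
    # Precision: share of flagged patients that reviewers confirmed.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of confirmed patients that the model found.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Made-up IDs: four patients flagged by a model, four confirmed by chart review.
flagged = {"P1", "P2", "P3", "P4"}
confirmed = {"P1", "P2", "P3", "P5"}

precision, recall = precision_recall(flagged, confirmed)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Here three of the four flagged patients are confirmed and three of the four confirmed patients are found, so both measures come out to 0.75. Tracking both matters: a model can look precise while still missing many patients, or find most patients while flagging many who do not qualify.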
Key considerations for success
Ensuring the accuracy of AI outputs starts with a comprehensive and rigorous approach to the underlying data.
This requires an agnostic approach to vendors and health care systems, with a focus on patients and the clinicians responsible for their care, not the EHR technology used to store the medical records or the facilities providing treatment.
De-identified RWD sourced from an array of health care settings and technologies, including active longitudinal records (information collected repeatedly over time from the same patients), is essential. Creating AI-curated data sets that can support transformative change in health care and research outcomes requires a deliberate approach and clinically informed collaboration at each step, from ingestion to curation to application.
While the “machines” get top billing in AI-powered health care advances, it takes a team of clinicians, nurses, informaticians, data scientists, epidemiologists, biostatisticians, and engineers working together to make effective decisions on how to curate and standardize data while retaining its original clinical context. This data must also be harmonized (integrating structured and unstructured data), and models must be frequently refined to prevent bias and maintain accuracy.
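As a rough illustration of harmonizing structured and unstructured data while keeping clinical context, the sketch below groups findings from both sources under one patient-and-concept key and retains the original code or text snippet as evidence. The `Finding` type, its field names, and the `harmonize` function are hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    patient_id: str
    concept: str   # normalized clinical concept, e.g. "metastasis"
    source: str    # "structured" or "note"
    evidence: str  # original code or text snippet, kept for clinical context


def harmonize(structured, extracted):
    """Group findings by (patient, concept), keeping every source as evidence."""
    merged = {}
    for finding in list(structured) + list(extracted):
        key = (finding.patient_id, finding.concept)
        merged.setdefault(key, []).append(finding)
    return merged


# Made-up findings: one from a diagnosis code, two extracted from notes.
structured = [Finding("P1", "metastasis", "structured", "C79.51")]
extracted = [
    Finding("P1", "metastasis", "note", "new lesions in the pelvis"),
    Finding("P2", "metastasis", "note", "suspicious for nodal spread"),
]

merged = harmonize(structured, extracted)
for (patient_id, concept), findings in merged.items():
    print(patient_id, concept, sorted({f.source for f in findings}))
```

Keeping the evidence alongside each finding lets a reviewer trace a harmonized record back to the code or sentence it came from, and see at a glance which findings rest on a single source.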
The effort and organizational experience required to execute and deliver at this high level are significant. But the rewards make it worthwhile.
With the right measures and oversight, AI is helping to turn jumbled RWD into remarkable insights and improvements that can benefit us all.
Sujay Jadhav is a health care executive.