Using real-world data to understand a patient’s journey through the healthcare ecosystem can be challenging, particularly for rare or less common diseases. The relevant data are often fragmented, incomplete, and inaccessible—further limiting the data available to study inherently small cohorts of patients.
To showcase how expert-led AI is helping overcome these challenges, leaders from Truveta and UCB recently co-hosted a webinar where they presented a compelling use case. Ryan Ahern, Chief Medical Officer at Truveta, explained how the company is making massive daily streams of healthcare data available for clinical research using the Truveta Language Model (TLM), a multimodal large language model trained on the largest collection of complete medical records. Eric McCulley, Head of Portfolio Innovation, US Immunology at UCB, showcased how those data are driving research his team recently conducted on the patient journey of individuals with Hidradenitis Suppurativa (HS), a chronic, inflammatory skin disease affecting an estimated 0.1% of the US population. Here are some key takeaways from the webinar:
Stitching together a comprehensive patient journey—especially for smaller populations—requires a tremendous amount of well-linked data
Nearly a third of webinar attendees indicated that lack of data availability for patients of interest is the biggest challenge they face in using real-world data. Eric agreed that this has historically been a major research challenge, especially for less common diseases like HS, and even more so for rare diseases (those affecting fewer than 200,000 people in the US). The rarity of these conditions means that the affected population is inherently small, and smaller still once additional inclusion/exclusion criteria are applied for research. To address this problem, Truveta provides the most complete real-world dataset, with full patient medical records from nearly 100M patients across all 50 states. These data are also linked to other high-value data types, such as social drivers of health data, mortality data, and pharmacy data, to enrich the dataset and make it more complete. Having access to such large, representative populations enables researchers to study rare or “rarer” diseases of interest with sufficient sample sizes, even once additional requirements are layered on.
Expert-led AI is enabling us to extract previously inaccessible insights at scale
EHR data from Truveta’s member health systems are updated daily. These frequent updates, combined with the sheer volume of healthcare data to be processed, require the support of AI to make the data usable and research ready. Ryan explained how AI, backed by world-class clinical informaticists, epidemiologists, and clinical annotators, is enabling Truveta to extract key clinical concepts at scale from semi-structured and unstructured data, so that researchers can learn faster from insights that were previously inaccessible.
The presenters zeroed in on concept extraction from free-text clinician notes as an especially powerful opportunity to apply expert-led AI. An estimated 80% of clinical data elements necessary for research and quality improvement purposes are captured in unstructured notes. When partners approach Truveta with a research question, a foundational step is exploring what is documented in clinician notes on that topic in the first place. From there, the team decides on the specific endpoints, outcomes, or concepts of interest to translate from free text into structured data fields. A team of clinical informaticists then writes standard operating procedures that a team of annotators follows to manually build a high-fidelity, human-curated extraction set. Truveta engineers then train TLM on that curated dataset until the model surpasses the accuracy of human experts, and then deploy it at scale. Depending on the complexity of the concept, the process can take 3-6 months.
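For readers who want a concrete picture of the "train until the model surpasses human accuracy" step, here is a minimal Python sketch of one way such a deployment gate could be checked. It is an illustration only, not Truveta's actual pipeline: the `LabeledNote` structure, the `accuracy` and `ready_to_deploy` helpers, the label names, and the five toy records are all hypothetical, and it assumes each held-out note has an adjudicated gold label, one human annotator's label, and the model's prediction.

```python
from dataclasses import dataclass


@dataclass
class LabeledNote:
    """One held-out clinician note with three labels (all hypothetical)."""
    note_id: str
    gold: str       # adjudicated label from the human-curated set
    annotator: str  # label assigned by a single human annotator
    model: str      # label predicted by the trained model


def accuracy(notes, field):
    """Fraction of notes whose `field` label matches the gold label."""
    if not notes:
        raise ValueError("need at least one labeled note")
    hits = sum(1 for n in notes if getattr(n, field) == n.gold)
    return hits / len(notes)


def ready_to_deploy(notes):
    """True when model accuracy strictly exceeds annotator accuracy."""
    return accuracy(notes, "model") > accuracy(notes, "annotator")


# Toy held-out set; real evaluations would use far more notes.
held_out = [
    LabeledNote("n1", "HS_present", "HS_present", "HS_present"),
    LabeledNote("n2", "HS_absent", "HS_present", "HS_absent"),
    LabeledNote("n3", "HS_present", "HS_present", "HS_present"),
    LabeledNote("n4", "HS_absent", "HS_absent", "HS_present"),
    LabeledNote("n5", "HS_present", "HS_absent", "HS_present"),
]

print(f"model accuracy:     {accuracy(held_out, 'model'):.2f}")
print(f"annotator accuracy: {accuracy(held_out, 'annotator'):.2f}")
print(f"ready to deploy:    {ready_to_deploy(held_out)}")
```

In practice, a team would likely look at more than raw accuracy (for example, precision and recall per concept), but the basic idea of comparing model and human performance against the same curated labels stays the same.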
These insights can meaningfully improve patient quality of life
Eric highlighted some of the research UCB has been conducting with Truveta on the HS patient journey. He noted that HS patients are diagnosed an average of 7 years after the onset of symptoms and may not receive effective treatment for years even after an appropriate diagnosis. In the meantime, those patients may experience debilitating pain, irreparable scarring, social stigma, and psychosocial impacts. Using Truveta Data to understand the HS patient journey—including sites of care and providers accessed across multiple specialties, intervention history, time to diagnosis, and more—is enabling UCB to better understand opportunities to improve provider education and facilitate earlier diagnosis and appropriate intervention. This, in turn, has the potential to limit disease progression, reduce unnecessary healthcare utilization, and improve HS patient quality of life. Moving forward, UCB plans to explore the impact of social drivers of health on access to treatment and adherence for HS patients.
View the full webinar recording or contact us to discuss how Truveta Data can support your research needs.