Truveta Data
Clinical notes
Largest collection of clinical notes integrated with EHR data
Nearly 80% of data relevant to research is hidden in unstructured notes
The Truveta Language Model extracts data from notes at scale, giving researchers access to data from more than 5 billion free-text notes.
Understand clinical context for patients
With access to complete EHR data — including notes — linked with social drivers of health, mortality, and claims data, researchers can understand the complete patient journey and address previously unanswerable questions.
Identifying key moments in the patient journey
Repeated outpatient visit in January
First dermatology visit in February
Hidradenitis suppurativa diagnosis in November
Primary care visit
Specialist visit
Diagnosis
Redacted patient note showing delays in diagnosis and treatment for a less common condition
Unlock access to any clinical concept of interest
Truveta receives all clinical notes generated during a patient’s care. This includes progress notes, nursing evaluations, procedure/operative reports, referral notes, discharge summaries, imaging reports, and more.
Notes are available across disease areas, including heart failure, vessel disease, migraines, seizures, NASH, hypercholesterolemia, colon cancer, and rare diseases.
Example cardiovascular concepts extracted from notes
Sampling of normalized echocardiogram data in Truveta Data
Sampling of normalized cardiac catheterization data in Truveta Data
Answer novel research questions
Example applications of notes data
Classify disease severity and monitor disease progression to inform R&D
Using echocardiogram data to classify aortic stenosis severity
Assess lifestyle behaviors and symptom prevalence to optimize clinical trial design
Analyzing diet data for a rare disease requiring dietary modifications
Identify potential confounders relevant to comparative effectiveness research
Identifying confounders before head-to-head SGLT2i study
AI enables accuracy at scale
The Truveta Language Model, a large language model trained on medical records data, is designed to identify and structure clinical data from notes and account for nuances such as negation, hypotheticals/conditionals, and family history. The model is continuously evaluated and fine-tuned to ensure clinical accuracy.
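To make these nuances concrete, here is a minimal sketch of the context flags that can accompany a concept extracted from a note. It is a toy rule-based tagger written for illustration, not the Truveta Language Model and not Truveta's schema: the cue lists, the `ConceptMention` fields, and the `tag_mentions` function are all hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical cue patterns, for illustration only. A production system
# (such as a trained language model) would not rely on fixed word lists.
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b")
HYPOTHETICAL = re.compile(r"\b(if|should|in case of)\b")
FAMILY = re.compile(r"\b(family history|mother|father|sibling)\b")


@dataclass
class ConceptMention:
    concept: str
    sentence: str
    negated: bool          # the concept is stated as absent
    hypothetical: bool     # the concept is conditional, not observed
    family_history: bool   # the concept refers to a relative, not the patient


def tag_mentions(note: str, concept: str) -> list[ConceptMention]:
    """Find each sentence mentioning `concept` and attach context flags."""
    mentions = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", note.strip()):
        lowered = sentence.lower()
        idx = lowered.find(concept.lower())
        if idx == -1:
            continue
        # Negation and hypothetical cues only count if they precede the concept.
        prefix = lowered[:idx]
        mentions.append(ConceptMention(
            concept=concept,
            sentence=sentence,
            negated=bool(NEGATION.search(prefix)),
            hypothetical=bool(HYPOTHETICAL.search(prefix)),
            family_history=bool(FAMILY.search(lowered)),
        ))
    return mentions


note = (
    "Patient denies chest pain. Mother had chest pain at age 60. "
    "Return if chest pain develops. Reports chest pain on exertion."
)
for mention in tag_mentions(note, "chest pain"):
    print(mention)
```

In this invented note, only the last sentence describes a symptom the patient currently has. The first three are negated, about a relative, or conditional, which is why flags like these matter when counting how many patients have a condition.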
Learn more about the depth of Truveta Data
Complete and clean EHR data
Truveta offers complete, timely, and clean EHR data linked with SDOH, mortality, and claims data for more than 120M patients representing the full diversity of the US.
Medical images and metadata
Truveta provides access to millions of medical images across all modalities, including MRI, CT, X-ray, ultrasound, mammogram, PET, and nuclear medicine, searchable by modality and protocol.