Truveta Data

Clinical notes

Largest collection of clinical notes integrated with EHR data

Nearly 80% of data relevant to research is hidden in unstructured notes

The Truveta Language Model (TLM) extracts data from notes at scale, giving researchers access to data from more than 5 billion free-text notes.

Understand clinical context for patients

With access to complete EHR data, including notes, linked with social drivers of health (SDOH), mortality, and claims data, researchers can understand the complete patient journey and address previously unanswerable questions.

Identifying key moments in the patient journey

Primary care visit: Repeated outpatient visit in January
  • Recurrent issue
  • Prior treatment
  • Suspected diagnosis
  • Referral
Specialist visit: First dermatology visit in February
  • Suspected diagnosis
  • Recent surgery
  • Not starting a biologic
  • Next steps
Diagnosis: Hidradenitis suppurativa diagnosis in November
  • Disease flare-up
  • Formal diagnosis
  • Antibiotic treatment

Figure: Redacted patient note showing delays in diagnosis and treatment for a less common condition

Unlock access to any clinical concept of interest

Researchers can access a continually expanding library of clinical concepts spanning diverse therapeutic areas and clinical scenarios.

Cardiovascular
  • Tricuspid valve regurgitation
  • Echocardiograms: Quantitative results
  • Cardiac catheterization reports: Hemodynamic measurements, CAD concepts
  • NYHA and KCCQ scores
  • QRS duration
  • Vessel disease
Neurology
  • Seizure frequency
  • Migraine frequency & severity
  • Migraine treatment response
  • Migraine symptoms & triggers
  • Migraine treatment (triptans) status and discontinuation reason
Oncology
  • Hepatocellular carcinoma: ECOG performance status scores, Child-Pugh scores, and Barcelona Clinic Liver Cancer stage
  • Colon cancer: Staging, family history, pathology report findings
Rare disease
  • HoFH: Confirmation of diagnosis
  • OTC deficiency: Symptoms, disease progression, & dietary intake
Metabolics
  • GLP-1 status and discontinuation reason
Hepatology
  • Fibrosis stage, steatosis, and hepatocyte ballooning
Pulmonology
  • Pulmonary function test results

Truveta receives all clinical notes generated during a patient’s care. This includes progress notes, nursing evaluations, procedure/operative reports, referral notes, discharge summaries, imaging reports, and more.

TLM extracts clinical concepts from notes at scale, linking them to structured data to enable robust research across therapeutic areas, including cardiovascular, metabolic diseases, oncology, neurology, hepatology, pulmonology, and rare diseases. Broader clinical insights, such as nutrition details, apply across disease states.
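To make the idea of linking note-derived concepts to structured data concrete, here is a minimal sketch in Python. It assumes both sources share a patient identifier. The field names (`patient_id`, `concept`, `diagnosis_code`) and the sample rows are hypothetical and are not Truveta's schema or data.

```python
from collections import defaultdict

# Hypothetical structured EHR rows; field names are assumptions for illustration.
structured_encounters = [
    {"patient_id": "p1", "encounter_date": "2023-02-10", "diagnosis_code": "I35.0"},
    {"patient_id": "p2", "encounter_date": "2023-03-05", "diagnosis_code": "G43.909"},
]

# Hypothetical concepts extracted from free-text notes.
note_concepts = [
    {"patient_id": "p1", "concept": "aortic_valve_peak_velocity", "value": 4.2, "unit": "m/s"},
    {"patient_id": "p1", "concept": "lvef", "value": 55, "unit": "%"},
    {"patient_id": "p3", "concept": "seizure_frequency", "value": 2, "unit": "per month"},
]


def link_concepts(encounters, concepts):
    """Attach note-derived concepts to each structured row by patient_id."""
    by_patient = defaultdict(list)
    for concept in concepts:
        by_patient[concept["patient_id"]].append(concept)
    # Encounters with no matching notes get an empty list rather than being dropped.
    return [
        {**encounter, "note_concepts": by_patient.get(encounter["patient_id"], [])}
        for encounter in encounters
    ]


linked = link_concepts(structured_encounters, note_concepts)
```

In this sketch, a structured row with no extracted concepts is kept with an empty list, so cohort counts based on structured data are not silently reduced by the join.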

Detailed example of concepts extracted from echocardiograms

Sampling of normalized echocardiogram data in Truveta Data

Answer novel research questions

Accelerating therapy adoption, improving clinical trials, and enhancing patient care.

Example applications of notes data

Classify disease severity and monitor disease progression to inform R&D

Using echocardiogram data to classify aortic stenosis severity

Assess lifestyle behaviors and symptom prevalence to optimize clinical trial design

Analyzing diet data for a rare disease requiring dietary modifications

Identify potential confounders relevant to comparative effectiveness research

Identifying confounders before head-to-head SGLT2i study
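As an illustration of the first application above (classifying aortic stenosis severity from echocardiogram data), the following Python sketch grades severity from a single extracted measurement, aortic valve peak velocity. The cutoffs are simplified from commonly cited guideline ranges and are an assumption for illustration. This is not Truveta's method, and clinical grading also weighs mean gradient, valve area, and flow state.

```python
def classify_aortic_stenosis(peak_velocity_m_s: float) -> str:
    """Grade aortic stenosis from peak jet velocity alone (simplified sketch)."""
    if peak_velocity_m_s < 0:
        raise ValueError("peak velocity must be non-negative")
    if peak_velocity_m_s >= 4.0:
        return "severe"
    if peak_velocity_m_s >= 3.0:
        return "moderate"
    if peak_velocity_m_s >= 2.0:
        return "mild"
    return "below mild threshold"


# Hypothetical velocities (m/s), as they might appear after normalization.
grades = [classify_aortic_stenosis(v) for v in (1.5, 2.4, 3.3, 4.2)]
```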

AI enables accuracy at scale

The Truveta Language Model, a large language model trained on medical records data, is designed to identify and structure clinical data from notes and account for nuances such as negation, hypotheticals/conditionals, and family history. The model is continuously evaluated and fine-tuned to ensure clinical accuracy.
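The sketch below shows one way the nuances named above could be represented once a model has labeled each mention. It does not show how TLM works internally. The `Mention` structure, the label names, and the example phrases are hypothetical. The point is that a downstream analysis can keep only findings that are affirmed and that belong to the patient.

```python
from dataclasses import dataclass
from enum import Enum


class Assertion(Enum):
    AFFIRMED = "affirmed"
    NEGATED = "negated"            # e.g. "denies chest pain"
    HYPOTHETICAL = "hypothetical"  # e.g. "if symptoms worsen, consider ..."


class Experiencer(Enum):
    PATIENT = "patient"
    FAMILY = "family"              # family history, not the patient's own finding


@dataclass(frozen=True)
class Mention:
    text: str
    concept: str
    assertion: Assertion
    experiencer: Experiencer


def patient_findings(mentions):
    """Keep only concepts that are affirmed and experienced by the patient."""
    return [
        m.concept
        for m in mentions
        if m.assertion is Assertion.AFFIRMED and m.experiencer is Experiencer.PATIENT
    ]


# Hypothetical labeled mentions from a single note.
mentions = [
    Mention("reports recurrent migraines", "migraine", Assertion.AFFIRMED, Experiencer.PATIENT),
    Mention("denies chest pain", "chest_pain", Assertion.NEGATED, Experiencer.PATIENT),
    Mention("if symptoms worsen, consider a biologic", "biologic_therapy",
            Assertion.HYPOTHETICAL, Experiencer.PATIENT),
    Mention("mother had colon cancer", "colon_cancer", Assertion.AFFIRMED, Experiencer.FAMILY),
]

findings = patient_findings(mentions)
```

Without these labels, a keyword search over the same note would count chest pain, biologic therapy, and colon cancer as patient findings, which is the kind of error that negation and family-history handling is meant to prevent.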

Learn more about the depth of Truveta Data

Complete and clean EHR data

Truveta offers complete, timely, and clean EHR data linked with SDOH, mortality, and claims data for more than 120M patients representing the full diversity of the US.

Medical images and metadata

Truveta provides access to millions of medical images across all modalities, including MRI, CT, X-ray, ultrasound, mammogram, PET, and nuclear medicine, searchable by modality and protocol.