arXiv:2407.00242 Abstract | arXiv Analytics

arXiv:2407.00242 [cs.CL]

EHRmonize: A Framework for Medical Concept Abstraction from Electronic Health Records using Large Language Models

João Matos, Jack Gallifant, Jian Pei, A. Ian Wong

Published 2024-06-28, Version 1

Electronic health records (EHRs) contain vast amounts of complex data, but harmonizing and processing this information remains a challenging and costly task requiring significant clinical expertise. While large language models (LLMs) have shown promise in various healthcare applications, their potential for abstracting medical concepts from EHRs remains largely unexplored. We introduce EHRmonize, a framework leveraging LLMs to abstract medical concepts from EHR data. Our study uses medication data from two real-world EHR databases to evaluate five LLMs on two free-text extraction and six binary classification tasks across various prompting strategies. GPT-4o with 10-shot prompting achieved the highest performance in all tasks, joined by Claude-3.5-Sonnet in a subset of tasks. GPT-4o achieved an accuracy of 97% in identifying generic route names, 82% for generic drug names, and 100% in performing binary classification of antibiotics. While EHRmonize significantly enhances efficiency, reducing annotation time by an estimated 60%, we emphasize that clinician oversight remains essential. Our framework, available as a Python package, offers a promising tool to assist clinicians in EHR data abstraction, potentially accelerating healthcare research and improving data harmonization processes.
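The abstract describes n-shot prompting and accuracy-based evaluation without showing either. As a rough illustration only, the sketch below assembles a few-shot prompt for a binary classification task and scores predictions against gold labels. It is not the EHRmonize API: the function names, the prompt layout, and the example medication strings are all assumptions made for this sketch, and no LLM is called.

```python
from typing import List, Tuple


def build_few_shot_prompt(task: str, examples: List[Tuple[str, str]], query: str) -> str:
    """Assemble an n-shot prompt: instruction, labelled examples, then the query.

    The "Input:/Output:" layout is a hypothetical choice, not the paper's template.
    """
    lines = [task, ""]
    for raw_text, label in examples:
        lines.append(f"Input: {raw_text}")
        lines.append(f"Output: {label}")
        lines.append("")
    # The final entry is left unlabelled for the model to complete.
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)


def accuracy(predictions: List[str], gold: List[str]) -> float:
    """Fraction of predictions that match the gold label (case-insensitive)."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    matches = sum(p.strip().lower() == g.strip().lower() for p, g in zip(predictions, gold))
    return matches / len(gold)


if __name__ == "__main__":
    # Illustrative examples only; these strings are not taken from the study data.
    shots = [
        ("vancomycin 1 g IV", "1"),
        ("acetaminophen 500 mg tablet", "0"),
    ]
    prompt = build_few_shot_prompt(
        "Is the following medication an antibiotic? Answer 1 for yes, 0 for no.",
        shots,
        "ceftriaxone 2 g IV",
    )
    print(prompt)
    print(accuracy(["1", "0", "1"], ["1", "0", "0"]))
```

A 10-shot prompt in this layout would simply pass ten labelled pairs in `shots`. The model's completion for each query would then be collected and compared with clinician-assigned labels through `accuracy`.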

Comments: submitted for review, total of 10 pages

Categories: cs.CL

Keywords: large language models, electronic health records, medical concept abstraction

Related articles:

arXiv:2308.06354 [cs.CL] (Published 2023-08-11)

Large Language Models to Identify Social Determinants of Health in Electronic Health Records

Marco Guevara et al.

arXiv:2401.06088 [cs.CL] (Published 2024-01-11)

Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models

K M Sajjadul Islam, Ayesha Siddika Nipu, Praveen Madiraju, Priya Deshpande

arXiv:2212.06040 [cs.CL] (Published 2022-11-14)

Semantic Decomposition Improves Learning of Large Language Models on EHR Data

David A. Bloore, Romane Gauriau, Anna L. Decker, Jacob Oppenheim
