Resources


Database Credentialed Access

DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries

Jayetri Bardhan, Anthony Colas, Kirk Roberts, Daisy Zhe Wang

DrugEHRQA is a QA dataset containing question-answer pairs from MIMIC-III tables and discharge summaries.

question-answer qa

Published: April 12, 2022. Version: 1.0.0


Database Contributor Review

CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools

Eulalia Farre Maduell, Salvador Lima-Lopez, Santiago Andres Frid, Artur Conesa, Elisa Asensio, Antonio Lopez-Rueda, Helena Arino, Elena Calvo, Maria Jesús Bertran, Maria Angeles Marcos, Montserrat Nofre Maiz, Laura Tañá Velasco, Antonia Marti, Ricardo Farreres, Xavier Pastor, Xavier Borrat Frigola, Martin Krallinger

CARMEN-I is a Spanish corpus of 2,000 clinical records from Hospital Clínic, Barcelona. It covers COVID-19 patients and their comorbidities, and serves as a resource for training clinical NLP models and for researchers applying NLP to clinical documents.

de-identification anonymization clinical ner

Published: Nov. 2, 2023. Version: 1.0


Database Restricted Access

Hospitalized patients with heart failure: integrating electronic healthcare records and external outcome data

Zhongheng Zhang, Linghong Cao, Yan Zhao, Ziyin Xu, Rangui Chen, Lukai Lv, Ping Xu

The new version adds beta blockers to the dat_md.csv file. The dataset comprises hospital-level data on patients admitted with heart failure to Zigong Fourth People’s Hospital, Sichuan, China, between 2016 and 2019.

heart failure china electronic health record

Published: May 22, 2022. Version: 1.3


Database Credentialed Access

NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data

Harlin Lee, Boyue Li, Yungui Huang, Yuejie Chi, Simon Lin

The NCH Sleep DataBank includes 3,984 pediatric sleep studies on 3,673 unique patients conducted at Nationwide Children's Hospital between 2017 and 2019. It contains polysomnography (PSG), clinical annotations, and longitudinal clinical data.

eeg ehr polysomnography pediatrics clinical decision support sleep disorders sleep study electronic health records ecg

Published: Oct. 27, 2021. Version: 3.1.0


Database Credentialed Access

Curated Data for Describing Blood Glucose Management in the Intensive Care Unit

Aldo Robles Arévalo, Roselyn Mateo-Collado, Leo Anthony Celi

The data subsets consist of time series files that include all the curated entries of glucose readings and insulin inputs from the MIMIC-III database.

insulin replacement therapy glycemic control critical care

Published: April 19, 2021. Version: 1.0.1


Database Credentialed Access

MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing

Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, Danai Koutra, Michael Sjoding, Jenna Wiens

Features and labels from MIMIC-III and eICU-CRD produced by FIDDLE, an EHR preprocessing pipeline.

preprocessing machine learning electronic health record

Published: April 28, 2021. Version: 1.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie Charpignon, Xiaoli Liu, Jaime dos Santos Cardoso, Leo Anthony Celi, An Kwok Wong

An open-source pulse oximetry and arterial blood gas dataset, derived from MIMIC-III, MIMIC-IV, and eICU-CRD.

electronic health records health equity pulse oximetry intensive care unit

Published: Nov. 8, 2023. Version: 1.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, Majid Afshar

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0


Database Open Access

MIMIC-IV Clinical Database Demo

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Steven Horng, Leo Anthony Celi, Roger Mark

An openly available subset of patients in the MIMIC-IV database.

mimic critical care electronic health record

Published: Jan. 31, 2023. Version: 2.2