Resources


Database Credentialed Access

DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries

Jayetri Bardhan, Anthony Colas, Kirk Roberts, Daisy Zhe Wang

DrugEHRQA is a QA dataset containing question-answers from MIMIC-III tables and discharge summaries.

question-answer qa

Published: April 12, 2022. Version: 1.0.0


Database Contributor Review

CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools

Eulalia Farre Maduell, Salvador Lima-Lopez, Santiago Andres Frid, Artur Conesa, Elisa Asensio, Antonio Lopez-Rueda, Helena Arino, Elena Calvo, Maria Jesús Bertran, Maria Angeles Marcos, Montserrat Nofre Maiz, Laura Tañá Velasco, Antonia Marti, Ricardo Farreres, Xavier Pastor, Xavier Borrat Frigola, Martin Krallinger

CARMEN-I is a Spanish corpus of 2,000 clinical records from Hospital Clínic, Barcelona. It covers COVID-19 patients and comorbidities, serving as a resource for training clinical NLP models and researchers in NLP applied to clinical documents.

de-identification anonymization clinical ner

Published: Nov. 2, 2023. Version: 1.0


Database Credentialed Access

Curated Data for Describing Blood Glucose Management in the Intensive Care Unit

Aldo Robles Arévalo, Roselyn Mateo-Collado, Leo Anthony Celi

The data subsets consist of time series files that includes all the curated entries of glucose readings and insulin inputs from MIMIC-III database.

insulin replacement therapy glycemic control critical care

Published: April 19, 2021. Version: 1.0.1


Database Credentialed Access

NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data

Harlin Lee, Boyue Li, Yungui Huang, Yuejie Chi, Simon Lin

The NCH Sleep DataBank includes 3,984 pediatric sleep studies on 3,673 unique patients conducted at Nationwide Children's Hospital between 2017 and 2019. It contains polysomnography (PSG), clinical annotations, and longitudinal clinical data.

eeg ehr polysomnography pediatrics clinical decision support sleep disorders sleep study electronic health records ecg

Published: Oct. 27, 2021. Version: 3.1.0


Database Credentialed Access

MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing

Shengpu Tang, Parmida Davarmanesh, Yanmeng Song, Danai Koutra, Michael Sjoding, Jenna Wiens

Features and labels from MIMIC-III and eICU-CRD produced by FIDDLE, an EHR preprocessing pipeline.

preprocessing machine learning electronic health record

Published: April 28, 2021. Version: 1.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie Charpignon, Xiaoli Liu, Jaime dos Santos Cardoso, Leo Anthony Celi, An Kwok Wong

An open-source pulse oximetry and arterial blood gas dataset, derived from MIMIC-III, MIMIC-IV, and eICU-CRD

electronic health records health equity pulse oximetry intensive care unit

Published: Nov. 8, 2023. Version: 1.0


Database Credentialed Access

NCH Sleep DataBank: A Large Collection of Real-world Pediatric Sleep Studies with Longitudinal Clinical Data

Harlin Lee, Boyue Li, Yungui Huang, Yuejie Chi, Simon Lin

The NCH Sleep DataBank includes 3,984 pediatric sleep studies on 3,673 unique patients conducted at Nationwide Children's Hospital between 2017 and 2019. It contains polysomnography (PSG), clinical annotations, and longitudinal clinical data.

eeg ehr polysomnography pediatrics clinical decision support sleep disorders sleep study electronic health records ecg

Published: Oct. 27, 2021. Version: 3.1.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, Majid Afshar

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

clinical natural language processing bionlp electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0


Database Open Access

MIMIC-IV Clinical Database Demo

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Steven Horng, Leo Anthony Celi, Roger Mark

An openly available subset of patients in the MIMIC-IV database.

mimic critical care electronic health record

Published: Jan. 31, 2023. Version: 2.2


Database Credentialed Access

Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries

Melissa Poulsen, Vanessa Troiani, Philip Freda, Danielle Mowery, Anahita Davoudi

The database contains a corpus of annotated data from the MIMIC-III Critical Care Database from a study that aimed to develop and apply an annotation schema to characterize opioid use disorder and related contextual factors.

natural language processing clinical notes opioid use disorder substance use

Published: Feb. 8, 2023. Version: 1.0.0