
BIDMC PPG and Respiration Dataset

Published: June 20, 2018. Version: 1.0.0

When using this resource, please cite the original publication:

Pimentel, M.A.F. et al. Towards a Robust Estimation of Respiratory Rate from Pulse Oximeters. IEEE Transactions on Biomedical Engineering, 64(8), pp.1914-1923, 2016. DOI: 10.1109/TBME.2016.2613124

Please include the standard citation for PhysioNet:

Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals. Circulation. 2000;101(23):e215-e220.


This dataset contains signals and numerics extracted from the much larger MIMIC II Matched Waveform Database, along with manual breath annotations made by two annotators using the impedance respiratory signal.

Data Collection

The original data were acquired from critically ill patients during hospital care at the Beth Israel Deaconess Medical Center (Boston, MA, USA). Two annotators manually annotated individual breaths in each recording using the impedance respiratory signal. The dataset comprises 53 recordings, each 8 minutes long and each containing:

  • Physiological signals, such as the PPG, impedance respiratory signal, and electrocardiogram (ECG). These are sampled at 125 Hz.
  • Physiological parameters, such as the heart rate (HR), respiratory rate (RR), and blood oxygen saturation level (SpO2). These are sampled at 1 Hz.
  • Fixed parameters, such as age and gender.
  • Manual annotations of breaths.
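
The stated sampling rates and recording length imply a nominal number of samples per recording. The arithmetic is sketched below in Python; the function and constant names are illustrative, and the actual files may differ from the nominal counts by a sample or so.

```python
# Sampling rates and recording length as stated in the dataset description.
SIGNAL_FS_HZ = 125     # PPG, impedance respiratory signal, ECG
NUMERICS_FS_HZ = 1     # HR, RR, SpO2
DURATION_MIN = 8       # length of each recording


def nominal_sample_count(fs_hz: float, duration_min: float) -> int:
    """Nominal number of samples in a recording of the given length."""
    return int(round(fs_hz * duration_min * 60))


def sample_to_seconds(sample_index: int, fs_hz: float) -> float:
    """Time in seconds of a zero-based sample index."""
    return sample_index / fs_hz


print(nominal_sample_count(SIGNAL_FS_HZ, DURATION_MIN))    # 60000
print(nominal_sample_count(NUMERICS_FS_HZ, DURATION_MIN))  # 480
```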

This dataset was first reported in the referenced publication, in which the data were used to evaluate the performance of different algorithms for estimating respiratory rate from the pulse oximetry signal, also known as the photoplethysmogram (PPG).

Data Files

The dataset is distributed in three formats:

  1. WFDB (WaveForm DataBase) format, which is the standard format used by PhysioNet.
  2. CSV (comma-separated-value) format.
  3. MATLAB® format, in a manner which is compatible with the RRest Toolbox of respiratory rate algorithms.

WFDB Format

Five files are provided for each recording (where ## is the subject number):

  • bidmc##.breath: Manual breath annotations
  • bidmc##.dat: Waveform data file
  • bidmc##.hea: Waveform header file
  • bidmc##n.dat: Numerics data file
  • bidmc##n.hea: Numerics header file

Further details on the contents of each file are provided here
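
As a rough illustration of how the five WFDB files of one recording fit together, the Python sketch below builds the file names and reads them with the third-party `wfdb` package. It assumes that `##` is the subject number zero-padded to two digits and that the files sit in a local directory; the helper names `wfdb_file_names` and `read_recording` are hypothetical.

```python
def wfdb_file_names(subject: int) -> dict:
    """File names for one recording, following the bidmc## pattern.

    Assumption: ## is the subject number zero-padded to two digits.
    """
    base = f"bidmc{subject:02d}"
    return {
        "breath_annotations": f"{base}.breath",
        "waveform_data": f"{base}.dat",
        "waveform_header": f"{base}.hea",
        "numerics_data": f"{base}n.dat",
        "numerics_header": f"{base}n.hea",
    }


def read_recording(subject: int, data_dir: str = "."):
    """Read waveforms, numerics and breath annotations for one subject.

    Needs the third-party `wfdb` package, imported lazily so that the
    naming helper above works without it.
    """
    import os
    import wfdb

    base = os.path.join(data_dir, f"bidmc{subject:02d}")
    signals = wfdb.rdrecord(base)         # reads bidmc##.hea and bidmc##.dat
    numerics = wfdb.rdrecord(base + "n")  # reads bidmc##n.hea and bidmc##n.dat
    breaths = wfdb.rdann(base, "breath")  # reads bidmc##.breath
    return signals, numerics, breaths


print(wfdb_file_names(1)["waveform_header"])  # bidmc01.hea
```

With the files present, `signals.sig_name`, `signals.fs` and `signals.p_signal` give the channel names, sampling frequency and sample values, and `breaths.sample` gives the annotated sample numbers.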

CSV Format

Separate CSV files are provided for each recording (where ## is the subject number), containing:

  • bidmc_##_Breaths.csv: Manual breath annotations
  • bidmc_##_Signals.csv: Physiological signals
  • bidmc_##_Numerics.csv: Physiological parameters
  • bidmc_##_Fix.txt: Fixed variables
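
The CSV files can be read with any standard CSV reader. The Python sketch below uses only the standard library and takes the column names from each file's header row rather than assuming them. The two-digit zero padding of `##`, the helper names, and the column names in the small in-memory example are all assumptions made for illustration.

```python
import csv
import io
import math


def csv_file_names(subject: int) -> dict:
    """File names for one recording (assumes ## is zero-padded to two digits)."""
    base = f"bidmc_{subject:02d}"
    return {
        "breaths": f"{base}_Breaths.csv",
        "signals": f"{base}_Signals.csv",
        "numerics": f"{base}_Numerics.csv",
        "fixed": f"{base}_Fix.txt",
    }


def read_numeric_csv(handle) -> dict:
    """Read a CSV with a header row into {column name: list of floats}.

    Values that cannot be parsed as numbers are stored as NaN.
    """
    reader = csv.DictReader(handle, skipinitialspace=True)
    columns = {name.strip(): [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name, value in row.items():
            if name is None:  # extra, unnamed fields in a malformed row
                continue
            try:
                columns[name.strip()].append(float(value))
            except (TypeError, ValueError):
                columns[name.strip()].append(math.nan)
    return columns


# Made-up two-row example; the real column names come from the file header.
sample = io.StringIO("Time [s], HR\n0, 92\n1, 93\n")
print(read_numeric_csv(sample))  # {'Time [s]': [0.0, 1.0], 'HR': [92.0, 93.0]}
```

For a file on disk, pass an open handle, for example `open(csv_file_names(1)["numerics"], newline="")`.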

MATLAB® Format

The *bidmc_data.mat* file contains a subset of the dataset in a single MATLAB® variable named *data*. The following fields are provided for each of the 53 recordings:

  • ekg: Lead II ECG signal. Each signal is provided in a structure, where the *v* field denotes the signal values, and *fs* is the sampling frequency.
  • ppg: Photoplethysmogram signal
  • ref.resp_sig.imp: Impedance respiratory signal
  • ref.breaths: Manual annotations of breaths provided by two independent annotators. Each set of annotations is a vector of sample numbers, which correspond to the sample numbers of the signals.
  • ref.params: Physiological parameters: rr (respiratory rate, derived by the monitor from the impedance signal, breaths per minute), hr (heart rate, derived from the ECG, beats per minute), pr (pulse rate, derived from the PPG, beats per minute), spo2 (blood oxygen saturation level, %).
  • fix: A structure of fixed variables, including: id (the MIMIC II matched waveform database subject ID and recording identifier), loc (the ward location), and source (the URLs from which the original data were downloaded).
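
The structure described above can also be inspected from Python. The sketch below is one possible approach, not the authors' own tooling: it assumes *bidmc_data.mat* is stored in a version that `scipy.io.loadmat` can read (not MATLAB v7.3), and it adds a hypothetical helper, `breath_rate_bpm`, that turns a vector of breath sample numbers into a mean rate over the annotated span. That rate is a simple illustration and is not the same quantity as the monitor-derived *rr* parameter.

```python
def load_bidmc_mat(path: str = "bidmc_data.mat"):
    """Load the `data` variable as an array of struct-like objects.

    Needs the third-party SciPy package, imported lazily here.
    Assumption: the file is not in MATLAB v7.3 (HDF5) format.
    """
    from scipy.io import loadmat

    mat = loadmat(path, squeeze_me=True, struct_as_record=False)
    return mat["data"]


def breath_rate_bpm(breath_samples, fs_hz: float) -> float:
    """Mean breaths per minute between the first and last annotation.

    Only differences between sample numbers are used, so it does not
    matter whether the numbering starts at 0 or 1.
    """
    ordered = sorted(breath_samples)
    if len(ordered) < 2:
        raise ValueError("need at least two breath annotations")
    span_s = (ordered[-1] - ordered[0]) / fs_hz
    if span_s <= 0:
        raise ValueError("annotations must span a positive duration")
    return 60.0 * (len(ordered) - 1) / span_s


# Synthetic annotations: one breath every 500 samples at 125 Hz (every 4 s).
print(breath_rate_bpm([1, 501, 1001, 1501], 125))  # 15.0
```

After loading, a recording `rec = load_bidmc_mat()[0]` exposes the fields by attribute, for example `rec.ppg.v` for the PPG samples and `rec.ppg.fs` for their sampling frequency.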


For more information about the dataset, please contact the authors.


Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.

License (for files):
Open Data Commons Attribution License v1.0




Total uncompressed size: 207.7 MB.
