
Name

parsescp - parse SCP-ECG, optionally save in PhysioBank-compatible format

Synopsis

parsescp [ options ... ]

Description

parsescp converts SCP-ECG output produced by Spacelabs/Burdick ECG carts into more easily usable formats. It was written in 2000 and has been used to convert about a million ECGs collected at Boston’s Beth Israel Deaconess Medical Center since then. parsescp can also be used to create deidentified SCP-ECG files, although it does not perform this function by default.

Options include:

-a
Anonymize: copy standard input to standard output, removing protected health information (PHI). This option suppresses all other output.
-b
Show baselines (residuals after template subtraction).
-f
Force the input to be parsed even if it contains CRC errors.
-h
Print a usage summary.
-l
Low-pass filter (smooth) the output waveforms.
-o record
Set the record name (default: ecg) for output files.
-s N
Shift templates by N samples before adding them to the baselines.
-S N
Skip parsing of SCP-ECG section N.
-t
Show templates; suppress baselines (complement of -b option).
-v
Verbose mode: print a very detailed analysis of the SCP-ECG input, and write record.txt (specify record using -o).
-w
Create a PhysioBank-compatible record (specify the record name using -o).
-x
Show a hexadecimal data dump (implies -v).
-z
Suppress final transients and zero-mean the ECGs.

Unless the -a option is used, this program produces at least these three files:

record.des
(text) description (age, sex, recording bandwidth, measurements, diagnoses)
record.ecg
(binary) reconstructed ECGs (see comments in parsescp.c, the parsescp source file, for format)
record.key
(text) patient’s name and ID (medical record number)

If invoked with the -v option, parsescp produces:

record.txt
(text) reconstructed ECGs

With -v, parsescp also writes a (very) detailed analysis of the contents of the SCP-ECG input on the standard output.

If parsescp was compiled with the WFDB library, and if it is invoked with the -w option, it also produces a pair of PhysioBank-compatible output files:

record.dat
(binary) signal file containing 12 continuous leads
record.hea
(text) header file describing record.dat

Supported SCP versions

This program was written using AAMI SCP-1999 (Standard communications protocol for computer-assisted electrocardiography, 25 October 1999 draft) as a reference for SCP format. It has been tested only with SCP records produced by Spacelabs/Burdick ECG carts (these produce second-difference encoded data with reference beat subtraction using a single reference beat, Huffman encoded using the SCP standard Huffman table). Amplitude (unencoded) data and first-difference encoded data should be readable using this program, but these formats have not been tested. Custom Huffman tables and multiple reference beats are recognized but not otherwise supported.

ECG signals in Spacelabs/Burdick SCP-ECG files

Spacelabs/Burdick ECG carts of the type for which this program was designed record 2 of the 3 Einthoven leads and all 6 precordial leads simultaneously for 10 seconds, at 500 samples per second per lead, with 16-bit precision over a range of +/-32.767 mV. Thus the sampling interval is 2 ms (1/500 s), and the amplitude resolution is 1 microvolt (1000 nanovolts) per ADC unit (32.767 mV spread over 32767 ADC units on each side of zero).

Note that although the SCP standard specifies how to record the sampling frequency and amplitude resolution in SCP-ECG files, the Spacelabs/Burdick carts don’t do this, so parsescp assumes the sampling frequency and resolution above. parsescp will need modification in order to convert ECGs with other sampling frequencies or resolutions correctly.

ECG signals in parsescp’s output files

This program derives the third Einthoven lead and the three augmented leads using the standard relationships among the leads:
   III = II - I

   aVR = -(I + II)/2

   aVL = II/2 - III

   aVF = I/2 + III
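
The relationships above can be checked with a short Python sketch. This is an illustration only, not the code used in parsescp.c: the function name derive_limb_leads is hypothetical, the results are left as floating-point values because the rounding applied by parsescp is not described here, and missing samples are not handled.

```python
def derive_limb_leads(lead_i, lead_ii):
    """Return (III, aVR, aVL, aVF) computed sample by sample from leads I and II."""
    # III = II - I
    lead_iii = [ii - i for i, ii in zip(lead_i, lead_ii)]
    # aVR = -(I + II)/2
    avr = [-(i + ii) / 2 for i, ii in zip(lead_i, lead_ii)]
    # aVL = II/2 - III
    avl = [ii / 2 - iii for ii, iii in zip(lead_ii, lead_iii)]
    # aVF = I/2 + III
    avf = [i / 2 + iii for i, iii in zip(lead_i, lead_iii)]
    return lead_iii, avr, avl, avf


# Two made-up samples of leads I and II, in ADC units.
iii, avr, avl, avf = derive_limb_leads([100, 200], [300, 100])
```

Substituting III = II - I shows that the last two expressions are equivalent to aVL = I - II/2 and aVF = II - I/2.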

In all of its output formats, parsescp represents the samples of each signal as a sequence of unscaled integers, exactly as they appear in the original SCP-ECG input file. Thus, in the .ecg, .txt, and .dat output files, the unit of amplitude is equivalent to 1 microvolt (1000 nanovolts), as in the SCP input. If the recording is shorter than 10 seconds, or if a signal is missing and cannot be reconstructed from the relationships above, each missing sample is assigned a special value (WFDB_INVALID_SAMPLE, or -32768).
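
Programs that read these files need to treat the missing-sample marker separately from real amplitudes. A minimal Python sketch follows; the function name to_millivolts is hypothetical, and the scale factor is supplied by the caller rather than hard-coded, since it depends on the resolution of the recording.

```python
WFDB_INVALID_SAMPLE = -32768  # marker for a missing sample, as given in the text


def to_millivolts(samples, microvolts_per_unit):
    """Convert raw ADC units to millivolts, mapping missing samples to None."""
    return [
        None if s == WFDB_INVALID_SAMPLE else s * microvolts_per_unit / 1000.0
        for s in samples
    ]
```

Returning None for missing samples keeps them from being mistaken for a large negative deflection in later processing.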

The -l and -z options modify the input values as noted above; if neither option is used, the output sample values are numerically identical to the input sample values.

The .ecg file contains selected and rearranged segments of the signals in the commonly-used layout of twelve 2.5 second segments arranged in groups of 4 above a continuous 10-second lead II. Each sample is represented as a big-endian 16-bit two’s complement signed integer. The file begins with a 512-byte prolog containing the record name and recording date and time, which are HIPAA-defined protected health information (PHI) unless the input SCP-ECG has been deidentified. The prolog is followed by four "traces", each representing the same 10-second interval. The first three of these traces are made by concatenating 2.5 second segments (1250 samples) of each of the 12 leads, in this order:
   ( I aVR V1 V4)

   ( II aVL V2 V5)

   (III aVF V3 V6)

The fourth trace is a continuous 10-second segment (5000 samples) of lead II.
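
The layout just described can be unpacked with a few lines of Python. This sketch relies only on the description above, not on the comments in parsescp.c, and it assumes that the four traces are stored back to back immediately after the prolog; any bytes after the fourth trace are ignored. The names split_ecg_traces and TRACE_LEADS are hypothetical.

```python
import struct

PROLOG_BYTES = 512       # prolog size stated in the text
SEGMENT_SAMPLES = 1250   # one 2.5 second segment
TRACE_SAMPLES = 5000     # one 10 second trace
TRACE_LEADS = [("I", "aVR", "V1", "V4"),
               ("II", "aVL", "V2", "V5"),
               ("III", "aVF", "V3", "V6")]


def split_ecg_traces(data):
    """Split the bytes of a .ecg file into 12 lead segments plus the lead II trace."""
    trace_bytes = TRACE_SAMPLES * 2  # 16-bit samples
    if len(data) < PROLOG_BYTES + 4 * trace_bytes:
        raise ValueError("file is shorter than the prolog plus four traces")
    traces = []
    for t in range(4):
        start = PROLOG_BYTES + t * trace_bytes
        # ">h" = big-endian signed 16-bit integer, as stated in the text
        traces.append(struct.unpack(">%dh" % TRACE_SAMPLES,
                                    data[start:start + trace_bytes]))
    segments = {}
    for trace, leads in zip(traces[:3], TRACE_LEADS):
        for k, lead in enumerate(leads):
            segments[lead] = list(trace[k * SEGMENT_SAMPLES:(k + 1) * SEGMENT_SAMPLES])
    return segments, list(traces[3])
```

The function takes the whole file as a bytes object (for example, the result of reading the file in binary mode) and returns a dictionary of 1250-sample segments keyed by lead name, together with the 5000-sample lead II trace.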

The optional .txt and .dat files contain the ECG signals only (no metadata, and no PHI). The signals appear in the standard order:
I, II, III, aVR, aVF, aVL, V1, V2, V3, V4, V5, V6

In the .txt file, each line begins with a sample number (0 to 4999) and is followed by a sample from each of the 12 leads, in order. Each sample is represented as a base 10 numeral, with spaces inserted between samples so that the columns line up. Thus the sample numbers are in column 0, samples of lead I are in column 1, those of lead II are in column 2, etc.
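
A .txt file in this form can be read back into per-lead lists with a short Python sketch such as the one below. The function name parse_txt is hypothetical; the lead order is copied from the list above, and lines that do not contain exactly 13 fields are skipped, which is a choice made for this example rather than documented behavior.

```python
LEADS = ["I", "II", "III", "aVR", "aVF", "aVL",
         "V1", "V2", "V3", "V4", "V5", "V6"]  # order as listed in the text


def parse_txt(text):
    """Return {lead name: list of integer samples} from the contents of a .txt file."""
    signals = {lead: [] for lead in LEADS}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 1 + len(LEADS):
            continue  # skip blank or unexpected lines
        # fields[0] is the sample number; the rest are the 12 leads in order
        for lead, value in zip(LEADS, fields[1:]):
            signals[lead].append(int(value))
    return signals
```

Splitting on whitespace makes the parser independent of the exact column widths used to line up the output.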

In the .dat file, the first 24 bytes contain the first sample of each signal, in the standard order as for the .txt file. As in the .ecg file, each sample is represented as a big-endian 16-bit two’s complement signed integer. The next 24 bytes contain the second sample of each signal, etc.
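
The frame structure described above can be illustrated with Python's struct module. This is a sketch of the layout only: the byte order is taken from the description above, the function name read_dat_frames is hypothetical, and in practice a WFDB application such as rdsamp(1), which consults the .hea file, is the usual way to read the record.

```python
import struct

N_SIGNALS = 12
FRAME_BYTES = 2 * N_SIGNALS  # 24 bytes: one 16-bit sample per signal


def read_dat_frames(data):
    """Return a list of 12-sample tuples, one per complete 24-byte frame."""
    n_frames = len(data) // FRAME_BYTES  # ignore any trailing partial frame
    return [
        # ">12h" = twelve big-endian signed 16-bit integers, as stated in the text
        struct.unpack(">12h", data[k * FRAME_BYTES:(k + 1) * FRAME_BYTES])
        for k in range(n_frames)
    ]
```

Each returned tuple holds one sample of every signal, in the same order as the columns of the .txt file.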

Other output files

The .des file contains a variety of information extracted from the SCP-ECG input file, in human-readable form. It does not contain the ECG signals themselves, or the patient’s name or medical record number. Note that .des files made from SCP-ECG files that have not been anonymized will generally contain HIPAA-defined PHI (protected health information) such as the recording date and the patient’s age (even if over 90).

The .key file contains the recording date and time, the patient’s name, and the medical record number, if recorded in the input file.

The .hea file, if generated, contains metadata (information about the corresponding .dat file) only; it does not contain any PHI, even if the input was not anonymized. Age and sex are recorded in the .hea file if present in the input file, except that ages of 90 and more are recorded as 90. The recording date and time are not recorded in the .hea file.

Using parsescp to create deidentified SCP-ECG files

The SCP-ECG standard defines how to record a variety of information that includes elements defined by HIPAA as PHI (protected health information). These include the patient’s name, medical record number, birth day and month, recording day and month, and (if the age is over 90) birth year and age.

If invoked with the -a option, parsescp reads the input SCP-ECG file and writes an anonymized (deidentified) version of it to the standard output. For example:
   parsescp -a <12345678.scp >anonymous.scp

In this case, none of the other output files are produced.

parsescp removes all of the PHI as well as names of physicians and technicians, names of hospitals or clinics, and room numbers, replacing them with ’xxx’. It changes all dates to January 1, and if the age is over 90, it resets the age to 90 and the birth year to 90 years before the recording year. Finally, it recalculates the SCP-ECG CRCs so that the output is still a valid SCP-ECG file. Note that the original input file is not modified.
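
The date and age rules in the paragraph above can be restated as a small Python sketch. It operates on ordinary Python date objects and is meant only to make the rule concrete; it does not show how parsescp edits the SCP-ECG fields or recalculates the CRCs, and the function name deidentify_dates is hypothetical.

```python
import datetime


def deidentify_dates(birth_date, recording_date, age):
    """Apply the date and age rules described above.

    Returns (birth_date, recording_date, age) after deidentification.
    """
    # All dates are moved to January 1 of their year.
    recording = datetime.date(recording_date.year, 1, 1)
    birth_year = birth_date.year
    if age > 90:
        # Ages over 90 are reset to 90, and the birth year is set to
        # 90 years before the recording year.
        age = 90
        birth_year = recording_date.year - 90
    return datetime.date(birth_year, 1, 1), recording, age
```

For a patient aged 95 at a recording made in 2001, this yields an age of 90, a birth date of 1 January 1911, and a recording date of 1 January 2001.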

Note that parsescp does not deidentify other types of data (including its own .des and .key files); it can only deidentify SCP-ECG files.

Environment

It may be necessary to set and export the shell variable WFDB (see setwfdb(1)).

Examples


   parsescp -o 12345 <12345.scp

The command above converts an SCP-ECG file named 12345.scp into a set of three files (12345.des, 12345.ecg, and 12345.key), as described above. The argument following -o need not match the name of the input file as in this example, but such a choice may reduce opportunities for confusion.
   parsescp -o 12345 -w <12345.scp

Same as the first example, but this command also creates a PhysioBank-compatible record named 12345 (consisting of two files named 12345.dat and 12345.hea).
   parsescp -a <12345.scp >a001.scp

The final example reads its input (12345.scp), removes all PHI, and writes the deidentified data to a new SCP-ECG file (a001.scp).

Note that none of these commands modify the original input file (12345.scp).

See Also

rdsamp(1), setwfdb(1), xform(1), signal(5)

Author

George B. Moody (george@mit.edu) and Edna S. Moody

Source

http://www.physionet.org/physiotools/wfdb/convert/parsescp.c


Updated 10 June 2022