Database Open Access
Tappy Keystroke Data
Published: Oct. 20, 2017. Version: 1.0.0
Goldberger AL, Amaral LAN, Glass L, Hausdorff JM, Ivanov PCh, Mark RG, Mietus JE, Moody GB, Peng C-K, Stanley HE. PhysioBank, PhysioToolkit, and PhysioNet: Components of a New Research Resource for Complex Physiologic Signals (2000). Circulation. 101(23):e215-e220.
This is the keystroke dataset for the study titled 'High-accuracy detection of early Parkinson's Disease using multiple characteristics of finger movement while typing'. This research report is currently under review for publication by PLOS ONE.
The dataset contains keystroke logs collected from over 200 subjects, with and without Parkinson's Disease (PD), as they typed normally and unsupervised on their own computers over a period of weeks or months, after installing a custom keystroke recording app, Tappy. The data were collected and analyzed to show that routine interaction with computer keyboards can be used to detect changes in the characteristics of finger movement in the early stages of PD.
The participants, from the U.S., Canada, UK and Australia, had visited the project website and agreed to participate in the study. The research was approved by the Human Research Ethics Committee at Charles Sturt University, Australia, protocol number H17013.
Each data file includes timing information from typing activity as the participants used their usual Windows applications (such as email, word processing and web searches). The keystroke acquisition software ('Tappy') recorded key press and release timestamps accurate to within several milliseconds.
The data files comprise two Zip archives, one with the participant detail files and the other with the keystroke data files for each user.
The filename of each participant detail file contains a 10-character code, which is used to cross-reference the keystroke data files for that user. The fields are:
- Birth Year: Year of birth
- Gender: [Male/Female]
- Parkinsons: Whether they have Parkinson's Disease [True/False]
- Tremors: Whether they have tremors [True/False]
- Diagnosis Year: If they have Parkinson's, the year it was first diagnosed
- Sided: Whether there is sidedness of movement [Left/Right/None] (self reported)
- UPDRS: The UPDRS score (if known) [1 to 5]
- Impact: The severity of Parkinson's disease or its impact on their daily life [Mild/Medium/Severe] (self reported)
- Levadopa: Whether they are using Sinemet and the like [Yes/No]
- DA: Whether they are using a dopamine agonist [Yes/No]
- MAOB: Whether they are using an MAO-B inhibitor [Yes/No]
- Other: Whether they are taking another Parkinson's medication [Yes/No]
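As a minimal sketch of reading one participant detail file, the snippet below assumes each line has the form `Field: value`, as the field list above suggests; the exact on-disk layout is an assumption and should be checked against the files. The function names `parse_user_details` and `as_bool` are hypothetical, and the sample values are invented for illustration, not taken from the dataset.

```python
def parse_user_details(text):
    """Parse 'Field: value' lines into a dict (assumed file layout)."""
    details = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, _, value = line.partition(":")
        details[key.strip()] = value.strip()
    return details


def as_bool(value):
    """Map True/False or Yes/No text to a bool; None if unrecognized."""
    lowered = value.strip().lower()
    if lowered in ("true", "yes"):
        return True
    if lowered in ("false", "no"):
        return False
    return None


# Invented sample content, for illustration only.
sample = """Birth Year: 1950
Gender: Female
Parkinsons: True
Tremors: False
Diagnosis Year: 2015
Impact: Mild
"""

details = parse_user_details(sample)
has_pd = as_bool(details["Parkinsons"])
```

Values are kept as strings unless converted explicitly, since some fields (such as UPDRS) are only filled in "if known" and may hold non-numeric text.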
Each keystroke data file contains comma-separated keystroke data for one month for a particular user. The filename comprises the 10-character code (matching the user details file) and the YYMM of the data. The fields are:
- UserKey: 10-character code for that user
- Date: YYMMDD
- Timestamp: HH:MM:SS.SSS
- Hand: Whether the key pressed was a left-hand (L) or right-hand (R) key
- Hold time: Time between press and release of the current key, in milliseconds (format mmmm.m)
- Direction: Hand transition from the previous key to the current key [LL, LR, RL, RR] (S indicates a space key)
- Latency time: Time between pressing the previous key and pressing the current key, in milliseconds
- Flight time: Time between releasing the previous key and pressing the current key, in milliseconds
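The record layout above can be sketched as a small parser. This example assumes the fields appear in the order listed and takes the delimiter as a parameter (a comma, following the description; verify it against the actual files). The names `FIELDS`, `parse_keystroke_line` and `latency_matches` are hypothetical, and the two sample records are invented. From the definitions above, latency should equal the previous key's hold time plus the current flight time; the 5 ms tolerance used in the check is an assumption.

```python
from datetime import datetime

# Field order assumed to follow the list in the description.
FIELDS = ("user_key", "date", "timestamp", "hand",
          "hold_ms", "direction", "latency_ms", "flight_ms")


def parse_keystroke_line(line, delimiter=","):
    """Split one record into a dict and convert the timing fields."""
    parts = [p.strip() for p in line.strip().split(delimiter)]
    if len(parts) < len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # Date is YYMMDD and Timestamp is HH:MM:SS.SSS per the field list.
    record["when"] = datetime.strptime(
        record["date"] + " " + record["timestamp"], "%y%m%d %H:%M:%S.%f")
    for name in ("hold_ms", "latency_ms", "flight_ms"):
        record[name] = float(record[name])
    return record


def latency_matches(prev, cur, tolerance_ms=5.0):
    """Check latency == previous hold + current flight, within a tolerance."""
    expected = prev["hold_ms"] + cur["flight_ms"]
    return abs(cur["latency_ms"] - expected) <= tolerance_ms


# Invented sample records, for illustration only.
prev_rec = parse_keystroke_line(
    "ABCDEFGHIJ,170301,09:15:02.125,L,100.0,LL,250.0,150.0")
cur_rec = parse_keystroke_line(
    "ABCDEFGHIJ,170301,09:15:02.425,R,80.0,LR,300.0,200.0")
consistent = latency_matches(prev_rec, cur_rec)
```

A check like `latency_matches` may help flag records where the previous keystroke was not captured, although real data may need a different tolerance.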
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Open Data Commons Attribution License v1.0
Total uncompressed size: 85.1 MB.