Task 2 - Person Identification

For the second task of the person identification challenge, we recorded the dynamics of typing with a JavaScript application. In each typing session, the users were asked to type some sentences, and the keyboard events "keyup", "keydown" and "keypress" were captured by the JavaScript application and stored on our web server. In the file task2-keystrokes-12users-raw-data.txt, each record starts with the keyword TYPING PATTERN, which is followed by the identifier of the typing session. Each subsequent line corresponds to one keyboard event (keydown, keyup or keypress) and contains the following pieces of information:

  • type of the keyboard event (keydown/keyup/keypress)

  • event.keyCode (the keyCode field of the corresponding JavaScript event)

  • event.which (the which field of the corresponding JavaScript event)

  • event.charCode (the charCode field of the corresponding JavaScript event)

  • event.shiftKey (the shiftKey field of the corresponding JavaScript event)

  • the number of milliseconds since the first keystroke event (based on the difference of the values returned by JavaScript's Date.getTime() function).
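The record layout described above can be read with a short parser. The following is a minimal sketch, not the organizers' code: the page does not state the field separator, so the sketch assumes the six fields are separated by commas and/or whitespace and appear in the order listed. The names KeyEvent, to_int and parse_sessions are hypothetical.

```python
import re
from collections import namedtuple

# Hypothetical container for one event line; field order follows the list above.
KeyEvent = namedtuple("KeyEvent", "kind key_code which char_code shift_key time_ms")

HEADER = "TYPING PATTERN"


def to_int(text):
    """Return text as an int, or None if it is not numeric (e.g. 'undefined')."""
    try:
        return int(text)
    except ValueError:
        return None


def parse_sessions(lines):
    """Group event lines by the session identifier that follows the header keyword."""
    sessions = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(HEADER):
            current = line[len(HEADER):].strip(" :,\t")
            sessions[current] = []
            continue
        if current is None:
            continue  # ignore anything before the first header
        # Assumption: fields are separated by commas and/or whitespace.
        fields = [f for f in re.split(r"[,\s]+", line) if f]
        if len(fields) != 6:
            continue  # line does not match the assumed six-field layout
        kind, key_code, which, char_code, shift, time_ms = fields
        sessions[current].append(
            KeyEvent(kind, to_int(key_code), to_int(which), to_int(char_code),
                     shift.lower() in ("true", "1"), to_int(time_ms))
        )
    return sessions
```

With the real file, the function could be called as parse_sessions(open("task2-keystrokes-12users-raw-data.txt")). If the file uses a different separator or field order, the split and the unpacking need to be adjusted.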

Additionally, you are given the true identity of the users (coded by integer numbers from 1 to 12) for five typing patterns per user in the file task2-keystrokes-12users-train-labels.txt. Each line of this file contains two numbers separated by a comma:

  • the identifier of a typing pattern (pattern id for short), and
  • the identifier of the user who typed that pattern (user id for short).
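Reading the label file amounts to splitting each line at the comma. A minimal sketch follows, assuming the file has no header row and no extra columns; the function name parse_labels is hypothetical.

```python
def parse_labels(lines):
    """Map pattern id -> user id from lines of the form 'pattern_id,user_id'."""
    labels = {}
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        pattern_id, user_id = (int(part) for part in line.split(","))
        labels[pattern_id] = user_id
    return labels
```

For example, parse_labels(open("task2-keystrokes-12users-train-labels.txt")) would return a dictionary with one entry per labelled pattern.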

Your task is to recognize the user identities for the rest of the patterns, i.e. for the typing patterns that are not included in task2-keystrokes-12users-train-labels.txt.


Your solution for the above task should be a list of pairs. The first number of each pair should be the pattern id. The second number is your prediction, i.e. an integer between 1 and 12 denoting the predicted identity of the user. An example of a file that can be uploaded as a submission is task2-random-submission.txt. This file contains all the pattern ids for which predictions are expected, but the predictions (integers between 1 and 12) were produced by a random number generator.
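A submission in this form can be produced from a dictionary of predictions. The sketch below is only an illustration: the page does not spell out the exact file layout, so it assumes one comma-separated pair per line; task2-random-submission.txt should be checked for the actual format. The name format_submission is hypothetical.

```python
def format_submission(predictions, n_users=12):
    """Render {pattern_id: predicted_user_id} as 'pattern_id,user_id' lines."""
    lines = []
    for pattern_id in sorted(predictions):
        user_id = predictions[pattern_id]
        # Predictions must be integers between 1 and n_users (12 in this task).
        if not 1 <= user_id <= n_users:
            raise ValueError(f"invalid user id {user_id} for pattern {pattern_id}")
        lines.append(f"{pattern_id},{user_id}")
    return "\n".join(lines) + "\n"
```

The returned string can then be written to a text file and uploaded.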

When submitting your solutions, you will need to provide your "login". Please use your nickname as "login". Your solutions will only be displayed on the leaderboard once the administrator has approved your nickname. You may also provide a short description of your solution. Your nickname and the accuracy of your predictions will appear on the leaderboard.


Evaluation of the submissions is based on accuracy. The results reported on the leaderboard are calculated on a subset of the predictions. In order to ensure the validity of the results, the organizers of the challenge calculate the accuracy on the rest of the predictions as well. Submissions are not displayed on the leaderboard if the accuracies on the two subsets are substantially different.
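Accuracy here is the fraction of pattern ids whose predicted user matches the true user. The sketch below computes it on an arbitrary subset of pattern ids. How the organizers split the predictions, and what counts as a substantial difference, is not stated on this page, so any split used with this function is purely illustrative. The name accuracy is hypothetical.

```python
def accuracy(predictions, truth, pattern_ids):
    """Fraction of the given pattern ids whose prediction equals the true user id."""
    ids = list(pattern_ids)
    if not ids:
        raise ValueError("pattern_ids must not be empty")
    # A missing prediction counts as an error.
    correct = sum(1 for p in ids if predictions.get(p) == truth[p])
    return correct / len(ids)
```

Calling the function once per subset and comparing the two values mirrors the consistency check described above.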

What to win?

If you have a model that performs well and you would like to write a joint research paper, please feel free to contact us by e-mail: buza at biointelligence dot hu.

"Honest Usage" Policy

Finally, we note that there are many ways of cheating. As this is a machine learning challenge, the intended behaviour is to train a classifier and use it to make predictions. You are expected to use the labelled data as training data to fit your model. Any form of cheating or dishonest behaviour is discouraged. If dishonest behaviour is detected, the affected results will be removed from the leaderboard.

If you have any questions, please feel free to contact us by e-mail:
buza at biointelligence dot hu

Good luck with the challenge!