Analysis
The raw data collected about each stroke is then analyzed to generate a set of statistics and other characteristics. These per-stroke values are then combined across the strokes of the same letter, and an overall set of statistics and characteristics is calculated and stored in a .csv file. These statistics and features include the following:
Handwriting Pressure
  One of the best identifiers of handwriting is the handwriting pressure, as each person writes with a different amount of force. For each stroke and each character, a set of 5 handwriting pressure statistics is calculated: minimum pressure, maximum pressure, total pressure, average pressure, and the standard deviation of the pressures.
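  As a rough illustration, the five pressure statistics could be computed as in the sketch below. The function name, the dictionary keys, and the sample values are hypothetical, and the population form of the standard deviation is an assumption, since the variant actually used is not stated here.

```python
import math

def pressure_stats(pressures):
    """Return the five pressure statistics for one stroke or character."""
    if not pressures:
        raise ValueError("at least one pressure sample is required")
    total = sum(pressures)
    mean = total / len(pressures)
    # Assumption: population standard deviation (divide by n, not n - 1).
    variance = sum((p - mean) ** 2 for p in pressures) / len(pressures)
    return {
        "min": min(pressures),
        "max": max(pressures),
        "total": total,
        "average": mean,
        "std_dev": math.sqrt(variance),
    }

# Hypothetical pressure samples for a single stroke.
stats = pressure_stats([0.2, 0.4, 0.6, 0.8])
```

  The same function can be applied to a whole character by passing the concatenated pressure samples of all of its strokes.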
Handwriting Speed
  The duration of each stroke is recorded and summed to calculate how long it took to write each character, since each user writes each character at a different speed (i.e., some write slower or faster than others).
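  The summation can be sketched as follows. Representing a stroke as a (start, end) timestamp pair in milliseconds is an assumption made for the example, as are the function name and the sample values.

```python
def character_duration(strokes):
    """Sum the durations of the strokes that make up one character.

    Each stroke is a hypothetical (start_ms, end_ms) pair. Pauses between
    strokes are not counted, because only stroke durations are summed.
    """
    return sum(end - start for start, end in strokes)

# Two hypothetical strokes of one character.
duration_ms = character_duration([(0, 180), (260, 390)])
```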
Handwriting Size
  Since each person's writing is a different size, we included statistics about the size of each letter, such as width, height, and the boundary coordinates (top left corner and bottom right corner).
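  A minimal sketch of the size statistics is given below. It assumes each stroke is a list of (x, y) points in screen coordinates, with y growing downward so that the top left corner has the smallest x and y. The function name and dictionary keys are hypothetical.

```python
def bounding_box(strokes):
    """Compute size statistics for a letter from the points of all its strokes."""
    xs = [x for stroke in strokes for x, _ in stroke]
    ys = [y for stroke in strokes for _, y in stroke]
    left, right = min(xs), max(xs)
    # Assumption: screen coordinates, so the smallest y is the top edge.
    top, bottom = min(ys), max(ys)
    return {
        "top_left": (left, top),
        "bottom_right": (right, bottom),
        "width": right - left,
        "height": bottom - top,
    }

# A hypothetical two-stroke letter.
box = bounding_box([[(10, 5), (30, 45)], [(12, 25), (28, 25)]])
```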
Strokes
  The number of strokes that comprise each letter is also stored.
Context
  The context of each character instance is also stored, such as what character the set of strokes actually represents and the preceding and following characters.
Bézier Curves
  The points of each stroke are extrapolated and a Bézier curve is fitted to them, and the cusps of the Bézier curves are stored. This gives information on the start and end of each stroke and the points at which the stroke changes curvature.
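  The sketch below does not fit Bézier curves. As a simpler stand-in for locating where a stroke changes curvature, it works directly on the sampled points and reports where the turning direction flips between left and right, using the sign of the cross product of consecutive segments. The function name and the sample stroke are hypothetical.

```python
def turning_direction_changes(points):
    """Return indices of points where the stroke switches turning direction."""
    changes = []
    previous_sign = 0
    for i in range(1, len(points) - 1):
        (ax, ay), (bx, by), (cx, cy) = points[i - 1], points[i], points[i + 1]
        # Cross product of segment a->b with segment b->c.
        cross = (bx - ax) * (cy - by) - (by - ay) * (cx - bx)
        sign = (cross > 0) - (cross < 0)
        if sign != 0:  # collinear points carry no turning information
            if previous_sign != 0 and sign != previous_sign:
                changes.append(i)
            previous_sign = sign
    return changes

# A hypothetical S-shaped stroke: it bends one way, then the other.
s_curve = [(0, 0), (1, 1), (2, 1), (3, 0), (4, 0), (5, 1)]
changes = turning_direction_changes(s_curve)
```

  Together with the first and last points of the stroke, these indices give a coarse version of the start, end, and curvature-change information described above.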
Grid
  Each cell in the collector is split into 200 additional cells, 10 across and 20 high. Each stroke is overlaid on this 200-cell grid and the cells that the stroke intersects are marked. This grid data is then combined by addition with the data from every other stroke in the letter. This takes into account cells that are visited more frequently than others, most importantly the intersections between strokes. A grid was used instead of actual (x,y) data points because a person's handwriting varies from one instance to another, so the grid allows some flexibility and leeway in that regard. Below is a graphical representation of the grid and Bézier analysis.
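  The grid step can be sketched as follows. The 10 by 20 dimensions come from the description above; everything else is an assumption for illustration: the function names, the collector cell size passed in as parameters, and the approach of approximating "intersected" cells by sampling points along each segment rather than computing exact intersections.

```python
COLS, ROWS = 10, 20  # 10 cells across, 20 cells high

def stroke_grid(points, cell_width, cell_height, samples_per_segment=20):
    """Return a ROWS x COLS grid of 0/1 flags for the cells one stroke visits."""
    grid = [[0] * COLS for _ in range(ROWS)]

    def mark(x, y):
        # Map a point to its grid cell, clamping points on or past the border.
        col = min(max(int(x * COLS / cell_width), 0), COLS - 1)
        row = min(max(int(y * ROWS / cell_height), 0), ROWS - 1)
        grid[row][col] = 1

    if len(points) == 1:
        mark(*points[0])
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Approximation: sample along the segment instead of exact intersection.
        for i in range(samples_per_segment + 1):
            t = i / samples_per_segment
            mark(x0 + t * (x1 - x0), y0 + t * (y1 - y0))
    return grid

def letter_grid(strokes, cell_width, cell_height):
    """Combine the per-stroke grids of one letter by addition."""
    total = [[0] * COLS for _ in range(ROWS)]
    for points in strokes:
        single = stroke_grid(points, cell_width, cell_height)
        for r in range(ROWS):
            for c in range(COLS):
                total[r][c] += single[r][c]
    return total

# A hypothetical plus sign in a 100 x 200 collector cell:
# one horizontal stroke and one vertical stroke that cross.
plus = letter_grid([[(5, 105), (95, 105)], [(55, 5), (55, 195)]], 100, 200)
```

  In this example the cell where the two strokes cross ends up with a count of 2, while every other visited cell has a count of 1, which is how the addition highlights intersections.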



Zuye Zheng | Ananda Gunawardena