GDT Tutorial

Purpose

The purpose of the Gesture Design Tool (gdt) is to aid designers in creating gesture sets for pen-based user interfaces (PUIs). A gesture is a mark made with a pen to cause a command to be executed, such as the copy-editing pigtail mark for delete. (Some people use "gesture" to mean motions made in three dimensions with the hands, such as pointing with the index finger. That is not what is meant by a gesture here.)

gdt helps gesture designers by letting them explore the attributes of their gestures that influence recognition, so they can determine whether their gestures will be difficult for the computer to disambiguate.

Terminology and Concepts

This section introduces terms that are useful for talking about gestures and concepts that are needed to understand gdt.

Terminology

A pen-based application that uses gestures allows the user to draw a number of different kinds of gestures to perform different tasks. For example, a pigtail may indicate delete, a caret insert, and a circle select. We call each of these different kinds of gestures a gesture class, and a collection of gesture classes a gesture set.

How feature-based recognition works

Although we hope that in the future designers can use gdt effectively without knowing anything about recognition, that is presently not the case. Therefore this section will introduce some basic concepts from feature-based recognition that gdt users will need to understand. The description is intentionally not mathematically rigorous in the hope that it will therefore be more accessible.

Feature-based recognizers categorize gestures using certain attributes of the gestures. These attributes are called features and may include such properties as total length of the gesture, total angle of the gesture, and size of bounding box of the gesture.

gdt has thirteen built-in features, although the two related to time are not generally used. gdt uses an implementation of Rubine's recognizer, as described in his SIGGRAPH '91 paper "Specifying Gestures by Example", and uses the features described in that paper:

  1. Cosine of the initial angle
  2. Sine of the initial angle
  3. Length of the bounding box diagonal
  4. Angle of the bounding box diagonal
  5. Distance between first and last point
  6. Cosine of the angle between the first and last points
  7. Sine of the angle between the first and last points
  8. Total length
  9. Total angle traversed
  10. Sum of the absolute value of the angle at each point
  11. Sum of the squared value of those angles
  12. Maximum speed (squared) [normally not used]
  13. Duration [normally not used]

For every gesture, the recognizer computes a vector of these features, called the feature vector. The feature vector is used in training and recognition as follows.
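To make the feature list above concrete, here is a minimal sketch in Python of how a few of these features could be computed from a stroke (a list of pen points). This is illustrative only, not gdt's actual code; in particular, following Rubine's paper, the initial angle is measured from the first point to the third to smooth over pen jitter.

```python
import math

def feature_vector(points):
    """Compute an illustrative subset of Rubine's features.

    `points` is a list of (x, y) tuples with at least four points.
    Returns [f1, f2, f3, f5, f8] from the numbered list above.
    """
    (x0, y0), (x2, y2) = points[0], points[2]
    # Features 1-2: cosine and sine of the initial angle
    # (first point to third point, per Rubine's paper).
    d = math.hypot(x2 - x0, y2 - y0)
    f1, f2 = (x2 - x0) / d, (y2 - y0) / d
    # Feature 3: length of the bounding-box diagonal.
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    f3 = math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    # Feature 5: distance between the first and last points.
    xn, yn = points[-1]
    f5 = math.hypot(xn - x0, yn - y0)
    # Feature 8: total length of the stroke.
    f8 = sum(math.hypot(points[i + 1][0] - points[i][0],
                        points[i + 1][1] - points[i][1])
             for i in range(len(points) - 1))
    return [f1, f2, f3, f5, f8]
```

For a straight horizontal stroke, for example, the initial-angle cosine is 1, its sine is 0, and the bounding-box diagonal, endpoint distance, and total length all equal the stroke's length.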

Training

The recognizer works by first being trained on a gesture set. Then it is able to compare new gestures with the training set to determine to which gesture class the new gesture belongs.

During training, the recognizer takes the feature vectors of each class's examples and computes a mean feature vector and a covariance matrix (i.e., a table indicating how the features vary together and what their standard deviations are) for the class.
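The per-class statistics just described can be sketched as follows. This is a pure-Python illustration, not gdt's code; note that it normalizes the covariance by the number of examples, which is one of several common conventions.

```python
def class_statistics(examples):
    """Compute a class's mean feature vector and covariance matrix.

    `examples` is a list of feature vectors (lists of equal length).
    """
    n, k = len(examples), len(examples[0])
    mean = [sum(v[i] for v in examples) / n for i in range(k)]
    # Covariance entry (i, j): how features i and j vary together;
    # the diagonal entries are the feature variances.
    cov = [[sum((v[i] - mean[i]) * (v[j] - mean[j]) for v in examples) / n
            for j in range(k)]
           for i in range(k)]
    return mean, cov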

Recognition

When a gesture to be recognized is entered, its feature vector is computed and it is compared to the mean feature vector of all gesture classes in the gesture set. The candidate gesture is recognized as being part of the gesture class whose mean feature vector is closest to the feature vector of the candidate gesture.
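A simplified sketch of this recognition step is shown below, using plain Euclidean distance between feature vectors. (Rubine's full classifier instead weights the features using the covariance statistics gathered during training; this sketch only illustrates the "closest mean" idea described above.)

```python
import math

def classify(candidate, class_means):
    """Return the class whose mean feature vector is closest
    (by Euclidean distance) to the candidate's feature vector.

    `class_means` maps class names to mean feature vectors.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(class_means, key=lambda name: dist(candidate, class_means[name]))
```

For example, with hypothetical means {"caret": [0.0, 1.0], "pigtail": [5.0, 5.0]}, a candidate with feature vector [0.2, 0.9] would be recognized as a caret.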

For a feature-based recognizer to work perfectly, the values of each feature should be normally distributed within a class and should vary greatly between classes. The next section describes how to use gdt to determine whether this is the case.

Usage

First, a general overview of using gdt will be given. Then the method of entering gestures will be explained, along with other common operations. The last section will discuss how to use gdt to find potential recognition problems.

Overview

gdt is used to evaluate a proposed gesture set. The designer enters examples of each type of gesture in the gesture set. (The designer may also or instead have others enter examples to get more variation.) Then the designer uses the visualizations provided by gdt to determine what possible problems may exist in the gesture set. The designer then modifies an example or class and reexamines the set.

Entering the gestures

gdt starts up with an empty gesture set. To make a new class, use the "Class/New" menu item (i.e., select the "New" command from the "Class" menu). First, enter the class name in the Name box. To add example gestures to the class, simply draw them in the white area at the bottom of the class window.

Individual gestures or classes may be selected by clicking on their icons. The Edit menu contains the common cut, copy, paste, and delete operations, which can be used on gestures or classes. You can use these operations for tasks such as removing mistakenly entered examples.

The File menu allows the user to save and load gesture sets and gesture classes. The File menu in the gesture set window saves and loads gesture sets whereas the File menu in the gesture class window saves and loads individual classes. gdt is indifferent to file extensions, but for the sake of other users you may wish to use ".gs" and ".gc" extensions to denote gesture set and gesture class files, respectively.

Finding potential recognition problems

This section discusses how gdt can be used to find different aspects of a gesture set that may cause problems for the recognizer.

A potential problem that may arise in training feature-based recognizers is bad example gestures. You can use gdt to detect these with "Set/Classification matrix", which classifies the training examples and creates a table showing what percentage of the examples from each class is classified into each class. The vertical axis is the class an example really belongs to; the horizontal axis is how the recognizer classifies it. If there are no extreme outlying examples, this table will simply have 100s along the diagonal. If the table does show outlying examples, clicking on the table cell will bring up the appropriate class with the offending example(s) highlighted. You can then delete them (with "Edit/Delete") and enter new examples.
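The table that "Set/Classification matrix" produces is a standard confusion matrix, and could be sketched as follows. Here `recognize` stands in for the trained recognizer; it is any function mapping an example to a class name.

```python
def classification_matrix(examples_by_class, recognize):
    """For each true class (row), the percentage of its training
    examples that the recognizer assigns to each class (column).

    A perfectly trained set yields 100s along the diagonal.
    """
    classes = sorted(examples_by_class)
    matrix = {}
    for true_class in classes:
        examples = examples_by_class[true_class]
        row = {c: 0.0 for c in classes}
        for ex in examples:
            row[recognize(ex)] += 100.0 / len(examples)
        matrix[true_class] = row
    return matrix
```

Any off-diagonal entry greater than zero points at examples worth inspecting in the class window.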

Another problem may be that two gesture classes are simply too similar to one another for the recognizer to disambiguate. gdt can compute the distance between each pair of classes, which is proportional to how different they are for recognition purposes. Use "Set/Distance matrix" to see this table. It is read like distance tables on maps: cross-index class A with class B to see the distance between the classes. (The table is symmetric, but the entire table is shown for convenience.) There is a threshold slider on the right to gray out uninteresting inter-class distances (i.e., large distances).
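The distance table can be sketched as below, here using plain Euclidean distance between class mean feature vectors. This is illustrative only; the distance gdt actually reports also accounts for how the features vary within classes, which Euclidean distance ignores.

```python
import math

def distance_matrix(class_means):
    """Pairwise distances between class mean feature vectors.

    `class_means` maps class names to mean feature vectors.
    Returns a symmetric dict-of-dicts, read like a map's
    distance table: matrix[a][b] == matrix[b][a].
    """
    names = sorted(class_means)
    return {a: {b: math.sqrt(sum((x - y) ** 2
                                 for x, y in zip(class_means[a],
                                                 class_means[b])))
                for b in names}
            for a in names}
```

Small entries flag class pairs that the recognizer is likely to confuse.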

If it happens that two classes are too close together, you can look at the individual features for the two classes by clicking on their cell in the table. The feature graph shows the values of each feature for each class. Using this graph, you can see which features are too similar and change one of the gestures to differentiate them.

The way the weights are assigned to features may also cause the recognizer to have difficulty classifying some gestures. This may happen if, for example, two classes (let's call them A and B) are distinguished from each other by their size while another class (C) in the same set has examples at two or more substantially different sizes. In order to classify C correctly, the features related to size will probably be given a low weight, which will make disambiguating A and B difficult. This can be discovered using the feature graph ("Set/Feature graph") and corrected by breaking C up into two or more separate classes (e.g., "C-big" and "C-little").

Important: For performance reasons, the classifier does not automatically retrain when the gesture set is changed. You must manually retrain by selecting "Set/Train" in the gesture set window.


allanl@cs.berkeley.edu
Last modified: Wed Mar 25 16:31:57 1998