Reading #8: A Lightweight Multistroke Recognizer for User Interface Prototypes (2010)

by Lisa Anthony and Jacob Wobbrock (paper)

Comments: Danielle

This paper presents $N, a multistroke extension of Wobbrock's $1 single-stroke recognizer. It takes the same approach as $1 to incorporating recognition into any program, and the pseudocode for the full algorithm fits on less than one page.

$N supports multiple strokes by connecting them into one long stroke. All of the possible permutations of stroke order and direction are generated when a new template is created. Resampling, scaling, and rotation all occur as in $1, with some adjustments that extend $1's capabilities.
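
As a rough illustration of that permutation step, here is a minimal Python sketch (assuming each stroke is a list of (x, y) tuples; the function name is mine, not the paper's). Every ordering of the strokes is combined with every choice of direction for each stroke, giving n! * 2^n candidate unistrokes for an n-stroke template:

```python
from itertools import permutations, product

def multistroke_to_unistrokes(strokes):
    """Yield every unistroke a multistroke template could produce:
    all stroke orders, with each stroke traversed in either direction."""
    for order in permutations(strokes):
        for flips in product((False, True), repeat=len(order)):
            unistroke = []
            for stroke, flipped in zip(order, flips):
                unistroke.extend(reversed(stroke) if flipped else stroke)
            yield unistroke
```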

In addition to supporting multiple strokes, $N addresses some problems with the $1 algorithm. $N allows 1-dimensional gestures (such as lines) by calculating the ratio of the sides of the bounding box and using a threshold to decide whether the gesture is 1D. $N allows rotation in the gestures as well: if rotation is desired, the gestures are rotated to the template angles instead of to 0. Finally, $N provides optimizations to increase recognition speed. Templates are only compared if the starting angle of the stroke is "about the same," and the developer can choose to limit the number of strokes for gestures that will likely always have a fixed stroke count (for example, the + and = gestures).
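
A couple of those ideas are simple enough to sketch in Python. The ratio threshold, angle tolerance, and sample index below are illustrative guesses, not values from the paper, and the helper names are mine:

```python
import math

def is_one_dimensional(points, ratio_threshold=0.3):
    """Treat a gesture as 1D (e.g., a line) when its bounding box is much
    longer in one dimension than the other."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    return min(width, height) / max(width, height) < ratio_threshold

def start_angle(points, k=8):
    """Direction from the first point to the k-th resampled point."""
    dx, dy = points[k][0] - points[0][0], points[k][1] - points[0][1]
    return math.atan2(dy, dx)

def start_angles_match(candidate, template, tolerance=math.radians(30)):
    """Only compare a template whose starting angle is 'about the same'."""
    diff = abs(start_angle(candidate) - start_angle(template))
    return min(diff, 2 * math.pi - diff) <= tolerance
```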

The drawbacks to $N include problems with scale invariance, gestures drawn with more strokes than the template, collisions between gestures, and the large number of templates that must be generated.

To test the algorithm, 40 middle and high school students used a sketch input program to enter simple algebraic equations.

__________

This paper is nice in the same ways $1 was nice: it allows a programmer at any level to add pen-gesture interaction to any program. Because $N is better than $1 overall (despite slightly lower accuracy on single-stroke gestures), it is a worthwhile successor.

There are more possibilities with $N than with $1: more ways it can be used, and more ways it can be enhanced. It also seems that $N could be extended fairly easily to 3D (with more computation required, of course) for hand gestures or similar input.

Reading #7: Sketch Based Interfaces: Early Processing for Sketch Understanding (2001)

by Tevfik Metin Sezgin, Thomas Stahovich, and Randall Davis (paper)

Comments: Marty

This paper describes a system that analyzes a sketch after it is drawn, recognizing what was drawn rather than how it was drawn. It also allows multiple strokes in a sketch, something we have yet to discuss in this course.

One of the main features of Sezgin's system is its vertex detection, or corner finding, implementation. It uses a combination of pen speed and curvature to detect corners. After segmentation, the straight edges of a sketch are stored as a polyline.
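
A toy version of that speed-plus-curvature idea is sketched below, assuming each point comes with a timestamp. The thresholds and the exact rule for combining speed and curvature are my assumptions, not the paper's actual method:

```python
import math

def find_corners(points, times, speed_frac=0.25, turn_thresh=0.75):
    """Mark interior points where pen speed is low AND the turn angle is
    high as candidate corners (illustrative thresholds)."""
    speeds, turns = [], []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        dt = (times[i + 1] - times[i - 1]) or 1e-9
        speeds.append(math.hypot(x2 - x0, y2 - y0) / dt)
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        d = abs(a2 - a1)
        turns.append(min(d, 2 * math.pi - d))  # wrap-around turn angle
    mean_speed = sum(speeds) / len(speeds)
    return [i + 1 for i, (s, t) in enumerate(zip(speeds, turns))
            if s < speed_frac * mean_speed and t > turn_thresh]
```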

The second feature is curve handling. The system is able to model curves as Bézier curves by approximating the control points using a least squares method.
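
Least-squares Bézier fitting is a standard technique; here is a generic sketch (not the authors' exact formulation) for a cubic curve using chord-length parameterization:

```python
import numpy as np

def fit_cubic_bezier(points):
    """Least-squares fit of a cubic Bezier to a stroke segment."""
    pts = np.asarray(points, dtype=float)
    # chord-length parameter t in [0, 1] for each sample
    d = np.r_[0, np.cumsum(np.linalg.norm(np.diff(pts, axis=0), axis=1))]
    t = d / d[-1]
    # Bernstein basis matrix: one row per sample, one column per control point
    B = np.stack([(1 - t) ** 3,
                  3 * t * (1 - t) ** 2,
                  3 * t ** 2 * (1 - t),
                  t ** 3], axis=1)
    # solve B @ C ~= pts for the four control points C
    C, *_ = np.linalg.lstsq(B, pts, rcond=None)
    return C  # 4x2 array of control points
```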

The system beautifies the drawn strokes "primarily to make it look as intended." Lines meant to be parallel are made parallel (and similarly for perpendicular lines), straight lines are made straight, and curves are rendered properly.

Finally, the system performs primitive object recognition. It uses simple geometric constraints to recognize ovals, circles, rectangles, and squares.
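
To give a flavor of what "simple geometric constraints" can look like, here is a hypothetical classifier that compares a closed stroke to its bounding box and inscribed ellipse; the thresholds are mine and surely differ from the paper's:

```python
def classify_primitive(points):
    """Illustrative primitive recognizer: circle/oval if the points hug
    the ellipse inscribed in the bounding box, square/rectangle otherwise."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    w, h = max(xs) - min(xs), max(ys) - min(ys)
    cx, cy = (max(xs) + min(xs)) / 2, (max(ys) + min(ys)) / 2
    # mean deviation from the inscribed ellipse (0 means a perfect fit)
    err = sum(abs((x - cx) ** 2 / (w / 2) ** 2 +
                  (y - cy) ** 2 / (h / 2) ** 2 - 1)
              for x, y in points) / len(points)
    nearly_equal_sides = abs(w - h) / max(w, h) < 0.1
    if err < 0.2:
        return "circle" if nearly_equal_sides else "oval"
    return "square" if nearly_equal_sides else "rectangle"
```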

A user study was done to test the usability of the program and compare it to a tool-based drawing program. The participants found the authors' system easier to use, since any shape can be drawn instantly without having to select the corresponding tool. The authors report an accuracy of 96% when approximating drawn shapes from a set of 10 figures.

__________

This paper is an early beautification paper that turns sketched drawings into actual technical drawings such as schematics and diagrams. It does this by applying corner and curve finding to determine the user's intended sketch. I think this paper helps show that sketching can be a superior input method to traditional menu- and toolbar-based drawing programs. Such interfaces were rare then, and still are now, and hopefully we can build upon this work to help popularize sketch-based interfaces.

Reading #6: Protractor: A Fast and Accurate Gesture Recognizer (2010)

by Yang Li (paper)

Comments: Wenzhe

Protractor is a modified $1 algorithm. The enhancements include support for up to eight gesture orientations, scale invariance, and greater speed.

Protractor resamples as $1 does, but uses N = 16 points ($1 used 64 in its testing). Rotation invariance can be toggled on or off. If the gesture is to be rotation-invariant, Protractor rotates it around the centroid until the indicative angle is 0, just as $1 does. If orientation sensitivity is enabled, it instead rotates the indicative angle to the nearest of eight equidistant base angles. Protractor does not scale the strokes as $1 does, so it is scale-invariant. The rotation adjustment step is also modified: instead of iteratively searching for the optimal orientation, Protractor calculates, in closed form, an angle that is close to optimal.
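
That closed-form angle computation is the heart of Protractor. Here is a small Python sketch of the matching step, assuming gestures have already been resampled and vectorized into unit-length vectors [x1, y1, x2, y2, ...] (variable names are mine):

```python
import math

def protractor_distance(v1, v2):
    """Optimal angular distance between two equal-length, unit-normalized
    gesture vectors, with the best rotation found in closed form."""
    a = sum(v1[i] * v2[i] + v1[i + 1] * v2[i + 1]
            for i in range(0, len(v1), 2))
    b = sum(v1[i] * v2[i + 1] - v1[i + 1] * v2[i]
            for i in range(0, len(v1), 2))
    angle = math.atan2(b, a)  # best rotation in one step, no iteration
    similarity = a * math.cos(angle) + b * math.sin(angle)
    return math.acos(max(-1.0, min(1.0, similarity)))
```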

Because of these modifications, Protractor performs significantly faster than $1 as the number of training examples increases, while its recognition rates are not significantly different from $1's. The speed enhancements make Protractor ideally suited for mobile device applications.

__________

I like this extension of the $1 algorithm. It sounds like it isn't much more difficult to implement than $1, and the speed enhancements come without sacrificing accuracy. It is also nice to be able to specify orientation-dependent gestures; this, along with the scale invariance, can help expand the limited 16-gesture set used in the $1 paper. The paper did evaluate a 26-class gesture set, and Protractor performed significantly better than $1 on it.

Reading #5: Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (2007)


by Jacob Wobbrock, Andrew Wilson, and Yang Li (paper)

Comments: George

This paper describes the $1 gesture recognizer, a sketch/gesture recognition algorithm intended to be simple and easy to program so that it can be implemented anywhere. The hope is that gestures can then be incorporated into rapidly prototyped interfaces that otherwise might not have been able to use gesture input: most user interface designers and programmers don't have the knowledge or skills needed to implement complex recognition algorithms, and existing recognition toolkits are not available in every language or environment, especially the environments human-computer interaction experts tend to use.

The authors describe the algorithm in four parts: point resampling, rotation to the indicative angle, scaling and translation, and finding the angle that gives the best score. These transformations, applied to each input stroke, allow it to be matched against a few template strokes per gesture. The recognition result is the template gesture with the smallest Euclidean distance to the input stroke.
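
The final matching step is easy to picture in code. Below is a minimal Python sketch of it, assuming the candidate and templates have already been run through the first three steps, and ignoring $1's golden-section search for the best rotation (names and data layout are mine):

```python
import math

def path_distance(a, b):
    """Mean point-to-point Euclidean distance between two resampled paths."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(candidate, templates):
    """Return the (name, distance) of the nearest template.
    `templates` is an iterable of (name, points) pairs."""
    best_name, best_dist = None, float("inf")
    for name, template in templates:
        d = path_distance(candidate, template)
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name, best_dist
```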

The $1 algorithm is compared to the DTW and Rubine algorithms and is found to compete well against them, achieving high recognition rates at high speed. Pseudocode for $1 is given as well to aid programmers.

__________

This paper is very clearly written and the $1 algorithm is indeed very simple. I find it interesting that such a simple, almost naive, approach can perform very well if executed intelligently. It is easy to imagine improvements and how to add more recognition capabilities to this algorithm, such as rotation-dependent or time-dependent gestures.

Reading #4: Sketchpad: A Man-Machine Graphical Communication System (1963)

by Ivan E. Sutherland (paper)

Comments: Jonathan

This paper presents the initial sketch-based interaction work of Ivan Sutherland. This was one of the first systems to use a pen to draw on a screen, ushering in a new form of human-computer interaction.

To use the system, the user has a set of buttons and switches that activate certain modes and tools, such as a line tool or a delete mode. Once the desired settings are in place, interaction with the pen accomplishes the task. It is important to note that the pen does not perform any free-form drawing; rather, it creates geometry using only pre-defined tools or issues commands by pointing and dragging. This makes the system more like a CAD program that uses a pen for input (note that the mouse did not exist at the time of this work).

The paper shows its age by emphasizing things like the data structures and memory usage as well as generic representations of sketch elements. A "light pen" is used as the input device.

Most of the paper details the various constraints and tools and how they were implemented using non-procedural, object-oriented methods, all of which were new ideas at the time (as discussed in this video).

__________

This paper introduced many new ideas about human-computer interaction, graphical displays, and programming; it was the first of its kind in almost every aspect. Without reading commentary from many years ago, it is hard to appreciate now, because much of the paper seems trivial to implement using current software development languages and tools. I found it interesting that many ideas introduced back in 1962 are still active, and hard, research problems today (such as recognizing artistic drawings and electrical schematics).

Reading #3: "Those Look Similar!" Issues in Automating Gesture Design Advice (2001)

by Long, Landay, and Rowe

Comments: Sam

This paper presents quill, a gesture design tool aimed at helping developers create pen gesture-based interfaces. The quill software gives advice to developers when multiple gestures might be ambiguous to the computer or visually similar to people.

The authors conducted some experiments to determine what kinds of gestures are perceived as similar by people by having a few hundred participants judge a large number of gestures and pick the most complex ones. They then developed an algorithm for predicting gesture similarity.

Interface designers use quill to input gestures for their interfaces. quill uses the similarity algorithm along with Rubine's methods to give feedback to the users and to train and recognize the gestures. The paper discusses in detail the challenges of giving advice: how, when, and where advice is displayed, in addition to what advice is displayed.

The authors conclude that the quill system, while it could use many refinements and improvements, is a good start and can possibly inspire other advice-giving systems for gesture-based interfaces.

__________

I can appreciate the assistance given to developers around gesture definitions; there still are not many systems that can do this, especially for 3D hand gestures. I have run into issues in my own research where two gestures I didn't think were similar actually were. Re-defining gestures can be a pain, especially if you discover the similarity after a large gesture set has been defined, at which point it is difficult to think of a new, unique gesture. I would really appreciate more development of these tools for 2D and 3D gestures.

Reading #2: Specifying Gestures by Example (1991)

by Dean Rubine (paper)

Comments: Danielle

This paper presents Rubine's gesture-recognition algorithm and his implementation of a program that doesn't require a hand-coded recognizer. His goal is to increase the adoption of sketch-based gesture recognition in user interfaces by making recognition easier to integrate: example gestures are fed into a learning algorithm rather than the recognizer being hand-coded.

Rubine has implemented a gestural drawing program in which simple single-stroke gestures are used to create and manipulate a drawing. Example gestures include rectangle creation, ellipse creation, copy, rotate-scale, and delete. The user of the program is able to add new gesture examples to aid recognition as well as modify the structure of each gesture.

He presents his simple gesture recognition algorithm, which assumes stroke segmentation is already taken care of. For the stroke drawn as the gesture, 13 features are computed. Rubine states that these 13 features are capable of recognizing many gestures, but fail in some cases. Once the features are calculated, they are input to a linear classifier that gives the class name of the stroke. He discusses how the classifier is trained, which is basically the standard method of training a linear classifier.
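
To make this concrete, here is a hedged Python sketch of two of the 13 features and the linear classification rule v_c = w_c0 + sum_i(w_ci * f_i); the weight layout and names are hypothetical choices of mine, not Rubine's code:

```python
import math

def initial_angle_cos(points):
    """Cosine of the initial stroke angle, one of Rubine's features,
    measured from the first point to the third."""
    dx = points[2][0] - points[0][0]
    dy = points[2][1] - points[0][1]
    return dx / math.hypot(dx, dy)

def total_length(points):
    """Total arc length of the stroke, another of the 13 features."""
    return sum(math.dist(points[i], points[i + 1])
               for i in range(len(points) - 1))

def classify(features, weights):
    """Linear classifier: score each class as w_c0 + sum(w_ci * f_i) and
    return the argmax. `weights` maps class name -> [w_c0, w_c1, ...]."""
    scores = {c: w[0] + sum(wi * fi for wi, fi in zip(w[1:], features))
              for c, w in weights.items()}
    return max(scores, key=scores.get)
```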

The classifier always returns one of the gesture classes. A probability function is used to estimate the probability that the gesture was classified correctly, and if that value falls below a threshold, the classification is rejected as ambiguous. He also rejects gestures that lie too many standard deviations from the mean of the winning gesture class.
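
That probability estimate has a compact form: the winning class c is accepted only if 1 / sum_j(e^(v_j - v_c)) clears a threshold. A small sketch of the check (the 0.95 cutoff here is an illustrative choice, not necessarily the paper's):

```python
import math

def accept(scores, best, threshold=0.95):
    """Reject ambiguous gestures: estimate the probability that the
    winning class `best` is correct from the per-class scores and
    require it to clear `threshold`."""
    v_best = scores[best]
    denom = sum(math.exp(v - v_best) for v in scores.values())
    return (1.0 / denom) >= threshold
```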

Rubine says his methods perform well in practice using 10 different gesture sets. He reports recognition rates in the mid to high 90s for varying numbers of examples per class, gesture classes per set, and test gestures per class.

__________

This paper seems to have been on the cutting edge of sketch recognition technology for its time; indeed, the concepts it presents are still widely used and studied today. Very little work and very few non-hand-coded recognition applications existed in 1991. I was impressed by the high accuracy achieved on the gesture sets using the linear classifier, though the accuracy reporting didn't seem complete. I have seen other systems, such as in our lab, that can recognize much larger classes of data. I am particularly interested in 3D extensions of this method, as well as other classification algorithms I have been brainstorming and look forward to implementing.

Reading #1: Gesture Recognition (2010)

by Tracy Hammond (paper)

Comments: Chris

This paper is a summary of some well-known gesture recognition techniques in sketch recognition. It begins with a presentation of Rubine's feature-based algorithm, describing the 13 features with examples of how they are used and illustrations to help understand them. It briefly touches on the training and recognition system Rubine used, but doesn't go into much detail. Long's 22-feature extension of Rubine's algorithm is then presented in the same way, including Long's 11 additional features. Finally, Wobbrock's $1 algorithm is described.

__________

This is a pretty good summary of these well-known sketch recognition methods. I think it summarizes the key points of each approach and would allow the implementation of each method without much trouble. It could probably make up part of a handbook/reference guide. There were many grammatical errors/typos and some missing figures and details, however.

Homework #1: CSCE 624 Introduction

picture of myself

dalogsdon gmail
2nd year Master of Computer Science

I am taking sketch recognition to gain a deeper knowledge of current work in the field. I have a creative and artistic background in addition to computer science, so I feel I might have a unique viewpoint to the issues we will discuss in the class.

Ten years from now, I expect to know what the next big technological advancement in computer science was. As an undergraduate, I rather enjoyed my computer-human interaction, software engineering, figure drawing, painting, and photography courses. My favorite movies are action/adventures and comedies. If I could travel back in time, I would not meet anyone for fear of destroying my existence, but I might go to my parents' house and peek in the windows to see myself as a baby. It is perhaps interesting to note that 30 of my spinal vertebrae are fused.