tag:blogger.com,1999:blog-89098033658206451462024-02-07T23:17:10.172-06:00Stuff.Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.comBlogger70125tag:blogger.com,1999:blog-8909803365820645146.post-13066301804496567682010-12-15T19:26:00.006-06:002010-12-15T20:06:24.395-06:00Reading #26: Picturephone: A Game for Sketch Data Capture (2009)<div style="text-align: center;">by Gabe Johnson</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://christalks624.blogspot.com/2010/12/reading-26-picturephone-game-for-sketch.html#c738349861950687151">Chris</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper talks about Picturephone, a sketching game that was mentioned in a previous paper. Picturephone is inspired by the children's game of telephone, in which a message is passed down a line of people, whispered from one person to the next. The message usually changes drastically and in some cases can wind up being totally different from the original message. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">For example, the original message might be, "Marty took a drink of water," and after one pass might change to "Marty drank some water." Eventually it might become "Marty took a drink of soda" and inevitably will become "Marty was arrested for arson while not wearing pants" after one or two more passes.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Picturephone has the first player sketch a story. The second player then writes a new story based on that sketch, a third player sketches the new story, and so on down the line. The resulting stories and sketches are then compared and graded somehow, and the sketches are also labeled in the process.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This looks fun. I would like to play it. Who wants to play it with me? 
I would like to see what would happen if the sketch/story were repeated 20 times.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-44817044016459592062010-12-14T13:45:00.004-06:002010-12-14T14:31:18.009-06:00Reading #25: A descriptor for large scale image retrieval based on sketched feature lines (2009)<div style="text-align: center;">by Mathias Eitz, Kristian Hildebrand, Tamy Boubekeur, and Marc Alexa</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://pacocomputer.blogspot.com/2010/12/reading-25-descriptor-for-large-scale.html#c8719400494454051237">Francisco</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper deals with sketch-based image search, in which the user sketches the image to be searched for. The authors use a few asymmetric descriptors that match the main features of a stroke with objects in the images. They tested with a set of 1.5 million pictures of outdoor scenes. They tested 27 sketches, and the returned results look similar to the queries. They illustrate several example sketches and their top results.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">I have thought about searching using sketch queries. We don't even have image search (where we input a normal image, not even a sketch) widely available.</div>
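<div style="text-align: left;"><br /></div><div style="text-align: left;">Just to make the matching idea concrete for myself, below is a toy Python sketch that reduces a drawing to a length-weighted histogram of its line-segment orientations and compares two such descriptors. This is my own simplification for illustration only, not the descriptor from the paper (which is computed from local image edge features and indexed over the whole collection).</div><div style="text-align: left;"><br /></div><pre>
import math

def orientation_histogram(segments, bins=6):
    """Toy global descriptor: a length-weighted histogram of feature-line
    orientations. segments is a list of ((x1, y1), (x2, y2)) pairs; this
    input format is my assumption, purely for illustration."""
    hist = [0.0] * bins
    for (x1, y1), (x2, y2) in segments:
        length = math.hypot(x2 - x1, y2 - y1)
        if length == 0.0:
            continue
        angle = math.atan2(y2 - y1, x2 - x1) % math.pi   # undirected line
        hist[min(int(angle / math.pi * bins), bins - 1)] += length
    total = sum(hist)
    return [h / total for h in hist] if total else hist

def similarity(h1, h2):
    """Cosine similarity between two descriptors (1.0 = most similar)."""
    dot = sum(a * b for a, b in zip(h1, h2))
    n1 = math.sqrt(sum(a * a for a in h1))
    n2 = math.sqrt(sum(b * b for b in h2))
    return dot / (n1 * n2) if n1 and n2 else 0.0
</pre><div style="text-align: left;"><br /></div>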
<div style="text-align: left;">Hopefully image search and sketch-based search will both become widespread soon, as they would be very useful.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-23537931390034976342010-12-14T13:31:00.006-06:002010-12-14T13:45:15.475-06:00Reading #24: Games For Sketch Data Collection (2009)<div style="text-align: center;">by Gabe Johnson and Ellen Yi-Luen Do</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://christalks624.blogspot.com/2010/12/reading-24-games-for-sketch-data.html#c4262696707701929462">Chris</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper discusses the use of games to collect data, specifically sketch data. The authors wish to understand "how people make and describe hand-made drawings." The paper describes two games: Picturephone (like the telephone game) and Stellasketch. Picturephone gives a description of a sketch for player 1 to draw, and player 2 must then describe the sketch that player 1 drew. More players can then draw the sketch based on player 2's text instead of the original text. This is fun. Stellasketch is like Pictionary. One player draws something based on a clue, and other players privately label the sketch. The point of using the games is to collect much more data for sketch research than the typical handful of user-study participants could provide.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is a cool idea. I actually want to play these games right now (I want to be in a user study). Making the study itself fun is a free way to reward users for participating. 
Work is nice if it doesn't feel like work.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-89182029866350244302010-12-14T13:19:00.006-06:002010-12-14T13:31:50.491-06:00Reading #23: InkSeine: In Situ Search for Active Note Taking (2007)<div style="text-align: center;">by Ken Hinckley, Shengdong Zhao, Raman Sarin, Patrick Baudisch, Ed Cutrell, Michael Shilman, and Desney Tan</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://eyce9000.blogspot.com/2010/12/624-23-hinckley-etc-inkseine.html#c5335561613074278907">George</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper presents a note-taking application that integrates search and content gathering so the user can pull references directly into notes. While taking notes, the user can trigger a search by circling some previously written text; these actions are performed with pen gestures. Users can add reference icons to a sketch, which appear like normal desktop icons and can link to files or URLs. Five users tested the system.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">For sketching to replace the mouse and keyboard, many unique applications such as this need to be invented and developed. Sketching introduces some interface navigation problems, which can be frustrating to the user, especially during sensitive activities like note taking. 
We need many novel solutions such as this.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-5217358310422998212010-12-14T13:04:00.003-06:002010-12-14T13:19:37.180-06:00Reading #22: Plushie: An Interactive Design System for Plush Toys<div style="text-align: center;">by Yuki Mori and Takeo Igarashi</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://christalks624.blogspot.com/2010/12/reading-22-plushie-interactive-design.html#c3469433894428364332">Chris</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is a follow-up to Teddy. This paper presents a system that can generate sewing patterns for creating plush toys. The program creates 3d models from 2d sketch input and finds a good pattern that can be printed and applied to fabric in such a way that the resulting plush looks like the 3d model. The program incorporates 3d conversion similar to Teddy's, and it also includes some editing tools, such as cut, part creation, and seam insertion and deletion. </div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is definitely unique. Once again, I like the conversion of the 2d stroke to a 3d shape. I am wondering how complex you can make the shape, since it seems like you have to make a big blob and carve away parts of it, maybe. 
Also, I took a computer-aided sculpting course this semester, and I could have used this to make one of my sculptures (too bad I didn't read this paper when we were doing that project).</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-52733101111561363912010-12-14T12:54:00.006-06:002010-12-14T13:03:55.781-06:00Reading #21: Teddy: A Sketching Interface for 3D Freeform Design (1999)<div style="text-align: center;">by Takeo Igarashi, Satoshi Matsuoka, and Hidehiko Tanaka</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://christalks624.blogspot.com/2010/12/reading-21teddy-sketching-interface-for.html#c219912957085831981">Chris</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper showcases a program that constructs 3d models from 2d sketched outlines. Basically, it makes wider areas thicker. Once a sufficient stroke is drawn, a 3d model is generated that can be rotated in 3d space and drawn on in different orientations to create new 3d features. The interface supports cutting and erasing geometry.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">This program is intended to open up new areas of 3d design and to contribute to the rapid prototyping stage of design. Some of the tools include create, bend, paint, extrude, and smooth. Feedback from the people who tried it was generally positive.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">I am always intrigued by the conversion of 2d drawings to 3d. I hadn't seen this paper before, and it is pretty interesting.</div>
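<div style="text-align: left;"><br /></div><div style="text-align: left;">Out of curiosity, here's a toy version of the "wider areas are thicker" effect on a pixel grid: raise each pixel inside the outline in proportion to its distance from the outline. This is not Teddy's actual mesh-based algorithm (which inflates a triangulation around the chordal axis); the scipy helper and the square-root height profile are just my assumptions for a quick demo.</div><div style="text-align: left;"><br /></div><pre>
import numpy as np
from scipy.ndimage import distance_transform_edt

def inflate(mask):
    """Raise each pixel inside a filled silhouette by a function of its
    distance to the outline, so wide regions end up thick and narrow
    regions thin. The square-root profile is an arbitrary choice that
    gives rounded, Teddy-ish cross sections."""
    d = distance_transform_edt(mask)   # distance to nearest outside pixel
    return np.sqrt(d)                  # height field; 0 outside the shape

# A crude circular blob standing in for a drawn outline:
yy, xx = np.mgrid[0:64, 0:64]
blob = 24 ** 2 >= (xx - 32) ** 2 + (yy - 32) ** 2
z = inflate(blob)                      # thickest at the center
</pre><div style="text-align: left;"><br /></div>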
<div style="text-align: left;">This paper doesn't really do much recognition, and there are many possibilities for expansion by including sketch recognition techniques.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-27473360393876990312010-12-14T06:14:00.004-06:002010-12-14T06:24:31.636-06:00Reading #20: MathPad 2: A System for the Creation and Exploration of Mathematical Sketches<div style="text-align: center;">by Joseph J. LaViola Jr. and Robert C. Zeleznik</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://martysimpossibletorememberurl.blogspot.com/2010/12/reading-20-mathpad2.html#c1863228670941667818">Marty</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper presents a cool math sketch program that can do many things and simulate some math stuff... The cool thing about this is the interface. It is like a big sheet of graph paper on which you can write lots of different equations, systems of equations, and diagrams. Gestures are used to help perform segmentation and identification. It can also generate graphs and plots. 12 people or so tested the system and gave some positive feedback. The interface is easy to use, and the authors want to be able to include even more stuff and things... and bits.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This seems like a sketch interface for MATLAB or something. It can simulate many mathy things and is a general-purpose math tool. I don't know what its current state is, since this is a fairly old paper.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-50602981298495977842010-12-14T06:06:00.002-06:002010-12-14T06:14:16.296-06:00Reading #19: Diagram Structure Recognition by Bayesian Conditional Random Fields<div style="text-align: center;">by Yuan Qi, Martin Szummer, and Thomas P. Minka</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="https://www.blogger.com/comment.g?blogID=19209095&postID=3418490961403350957&isPopup=true">Sam</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">The authors use Bayesian Conditional Random Fields (BCRFs) to analyze sketched diagrams, gathering contextual information to better recognize complex diagrams... which are complex. There are many equations, which are boggling my mind at this time. 
Seventeen users drew some diagrams, and the algorithms achieved high recognition rates in the low to high 90s.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is a cool approach for recognizing large, context-sensitive drawings, of which diagrams are excellent examples. The mathy approach works pretty well.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-87190967424091803362010-12-14T05:56:00.002-06:002010-12-14T06:06:11.709-06:00Reading #18: Spatial Recognition and Grouping of Text and Graphics (2004)<div style="text-align: center;">by Michael Shilman and Paul Viola</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://martysimpossibletorememberurl.blogspot.com/2010/12/reading-18-spatial-recognition-and.html#c7317328961416437056">Marty</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper discusses grouping and recognition of sketched diagrams. They take a big canvas and identify many different symbols in it. This is cool. You can draw the strokes in any order, and it will segment out each symbol. This is hard and probably the main contribution of the paper, but I am sleepy. The grouping had 99% accuracy... sweet man. Combined grouping and recognition accuracy was 97%.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">I am dealing with the segmentation problem in hand gestures, and I can relate to this problem. It is nice to have this problem solved with high accuracy.</div>
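<div style="text-align: left;"><br /></div><div style="text-align: left;">For a sense of what the easy version of grouping looks like, here is about the crudest possible stroke grouper: union strokes whose padded bounding boxes overlap. The paper's grouping is learned and searched, not rule-based like this; the fixed pixel gap below is purely my own placeholder.</div><div style="text-align: left;"><br /></div><pre>
def group_strokes(strokes, gap=10.0):
    """Crude spatial grouping with union-find: merge strokes whose
    bounding boxes, padded by `gap`, overlap. strokes is a list of
    point lists [(x, y), ...]. Toy stand-in, not the paper's method."""
    boxes = []
    for pts in strokes:
        xs = [p[0] for p in pts]
        ys = [p[1] for p in pts]
        boxes.append((min(xs), min(ys), max(xs), max(ys)))

    parent = list(range(len(strokes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    def overlap(a, b):
        return not (b[0] > a[2] + gap or a[0] > b[2] + gap or
                    b[1] > a[3] + gap or a[1] > b[3] + gap)

    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlap(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(strokes)):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())   # lists of stroke indices per symbol
</pre><div style="text-align: left;"><br /></div>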
<div style="text-align: left;">This makes more complicated sketch interaction possible.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-12724989201204325542010-12-14T05:48:00.003-06:002010-12-14T05:56:32.401-06:00Reading #17: Distinguishing Text from Graphics in On-line Handwritten Ink (2004)<div style="text-align: center;">by Christopher M. Bishop and Geoffrey E. Hinton</div><div style="text-align: center;"><br /></div><div style="text-align: center;">Comments: <a href="http://martysimpossibletorememberurl.blogspot.com/2010/12/reading-17-distinguishing-text-from.html#c8349513963074921039">Marty</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is an earlier text-vs-shape paper. It uses stroke features, gap features, and time data to help separate text from other strokes. They used HMMs for recognition. They collected data from some dudes. The dudes drew some stuff, whatever they wanted, as long as the sketches contained some text elements and some non-text strokes. Recognition results were mixed, with some groups getting accuracies in the mid 90s and some in the mid 70s.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">Shape vs. text is a hard problem, and there are many proposed solutions. I don't like this gaps-and-time solution, however. It just doesn't make sense to me... I think I would like entropy better, or simply visual approaches. Also, I think we can combine gestures into the mix to denote text.</div>
Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com1tag:blogger.com,1999:blog-8909803365820645146.post-48394753914970706212010-12-14T05:41:00.005-06:002010-12-14T05:48:27.539-06:00Reading #16: An Efficient Graph-Based Symbol Recognizer (2006)<div style="text-align: center;">by WeeSan Lee, Levent Burak Kara, and Thomas F. Stahovich</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: <a href="http://mogonen.blogspot.com/2010/12/reading-16-graph-based-symbol.html#c7537677901683392607">Ozgur</a></i></div><div style="text-align: center;"><i><br /></i></div><div style="text-align: left;">This paper takes a graph-based approach to sketch (symbol) recognition and explores several graph-matching techniques. They represent symbols as graphs and compute several error metrics for matching the graphs. They collected several types of symbols from some users and ran their four matching algorithms on the data, getting results in the mid- to high-90s for most algorithms.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">Some symbols can naturally be represented as graphs. We have seen that graph matching can yield high accuracy for appropriate shapes. 
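<div style="text-align: left;"><br /></div><div style="text-align: left;">As a toy illustration of the flavor of graph matching (not the paper's actual algorithms, whose whole point is avoiding this brute force), here's an exhaustive matcher for tiny symbol graphs: try every node assignment and keep the one with the lowest mismatch.</div><div style="text-align: left;"><br /></div><pre>
from itertools import permutations

def match_error(g1, g2):
    """Exhaustive attributed-graph matching for tiny symbol graphs.
    A graph is (nodes, edges): nodes is a list of primitive type codes,
    edges maps (low, high) node-index pairs to a relation code. Try
    every node assignment and keep the smallest mismatch count. Only
    feasible for the handful of primitives in a typical symbol."""
    nodes1, edges1 = g1
    nodes2, edges2 = g2
    if len(nodes1) != len(nodes2):
        return float("inf")
    best = float("inf")
    for perm in permutations(range(len(nodes2))):
        err = sum(1 for i in range(len(nodes1))
                  if nodes1[i] != nodes2[perm[i]])
        for (i, j), rel in edges1.items():
            key = (min(perm[i], perm[j]), max(perm[i], perm[j]))
            if edges2.get(key) != rel:
                err += 1
        best = min(best, err)
    return best
</pre><div style="text-align: left;"><br /></div>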
I think that this could be one component of a good general purpose recognizer.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-66229211775861733872010-12-14T05:28:00.008-06:002010-12-14T05:41:02.755-06:00Reading #15: An Image-Based, Trainable Symbol Recognizer for Hand-drawn Sketches (2005)<div style="text-align: center;">by Levent Burak Kara, Thomas F Stahovich</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: <a href="http://jonsblog-grove.blogspot.com/2010/12/reading-15-image-based-trainable-symbol.html?showComment=1292326846644_AIe9_BGArtJfyRuRQmMj0_lcxdgzd5o_iolyHrCynMwXABrtmv1OGays0xmg9Xr4X7py2-f-hrNEX0hbz-1vgUeoh9xh0nXmUWhc_JgGBvSgnrOl4hX5h6xsKIOKkncr4hLi5uMWGcqZGJT-W3tvEncFNMyXDhDtpIVSOx4lWA0UdaKiT1y2gzYkZSttyU-el1Igl-Ord6zgbvl5wsMR3Rdx8ySpx2PoEJd7LEitq3zrWfFCimeeZjzAA90OjjrTnvFELoTaXVd-O8kSJYYqmqAW1gqznA5bS--VhioYbLNBgbF3QkhkqYIJ-iF6bBsvAmQEDf1JZqfzmcO_1IWA_W2DPUF5-t-0Itr9Nfgr_Sw-LRVMkSUgk6SyGi4APN_b-t0A91E50KQ9vLl3eKfbgzHTlPgP2Ki2mI7ia6gOtSr84klAXXd-DxkSdEaQmi3SUX-eMVQOq8UJgmmpxnLl1RlkXuEBMYbyAmKBP5cC_11oj1314pRSuHCBU8G55H6YUwr70RhlMEKHyl7k4b_B1RMCRTWueXrRRQcWhgVo9QokX9Rebz-ZMyaK7WQMl5pzMTfAt5oMHegRQIpCYnTd8g2dlbuTHVOccMTw5Baolq06D-U35UTA-j_6dr04QLbJeGwf0bAy03U9DcV5fiXqPSkpVvVIfpaH5BaQ53wB242hKvU1jEJgmsyNVdnoL9BYJs8XTMxqXsq9RBYXtHWLfmPfvP01C0le1iRzwKhQwAL7CuhxJF_oYpGjI5FK7DW138-cHaHrp1xWOQxFXGqQomO2_nhZIB1ugqhQYAGxjlokMYooRzZdJNv6eLplQHHfRazWyUjcDthA#c8732169298508215386">Jonathan</a></i></div><div style="text-align: center;"><i><br /></i></div><div style="text-align: left;">This paper takes an image-based approach to sketch recognition using an ensemble classifier consisting of four different classifiers. They want a system that can recognize sketches very fast (real time for interaction) and that is also rotation invariant (using a fast polar coordinate technique). </div><div style="text-align: left;"><br /></div><div style="text-align: left;">This paper really focuses on the sketch interface and making it an attractive alternative to paper. To be a viable alternative, interaction (and therefore recognition) must be able to occur in real-time with no interruptions to the user. They also want to be able to recognize many shapes as well as "sketchy" shapes.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">They used 20 shapes collected from some users. They achieved recognition rates in the mid to high 90s.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is a good paper for an introduction to image-based approaches. It is also useful for understanding sketch interfaces. 
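<div style="text-align: left;"><br /></div><div style="text-align: left;">The polar-coordinate trick for rotation handling is the part I find neatest, so here's a toy version of just that idea, with my own made-up bin counts (the paper's actual recognizer combines four image-based classifiers; this is only the polar intuition): rasterize points into a coarse polar grid around the centroid, so rotating the symbol roughly becomes a cyclic shift of the angle axis.</div><div style="text-align: left;"><br /></div><pre>
import math

def polar_grid(points, r_bins=4, t_bins=12):
    """Rasterize a symbol's points into a coarse polar occupancy grid
    around the centroid. points: [(x, y), ...]."""
    cx = sum(x for x, y in points) / len(points)
    cy = sum(y for x, y in points) / len(points)
    rmax = max(math.hypot(x - cx, y - cy) for x, y in points) or 1.0
    grid = [[0] * t_bins for _ in range(r_bins)]
    for x, y in points:
        r = math.hypot(x - cx, y - cy) / rmax            # 0..1
        t = math.atan2(y - cy, x - cx) % (2 * math.pi)   # 0..2pi
        ri = min(int(r * r_bins), r_bins - 1)
        ti = min(int(t / (2 * math.pi) * t_bins), t_bins - 1)
        grid[ri][ti] = 1
    return grid

def polar_distance(g1, g2):
    """Best cell-mismatch count over all cyclic theta shifts of g2,
    which is what makes the comparison rotation-tolerant."""
    t_bins = len(g1[0])
    best = None
    for shift in range(t_bins):
        err = sum(g1[r][t] != g2[r][(t + shift) % t_bins]
                  for r in range(len(g1)) for t in range(t_bins))
        best = err if best is None else min(best, err)
    return best
</pre><div style="text-align: left;"><br /></div>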
Considering the year (2005), the sketches were recognized very quickly and would be recognized even faster on today's machines.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-77870436427198789372010-12-14T05:22:00.003-06:002010-12-14T05:28:47.723-06:00Reading #14: Using Entropy to Distinguish Shape Versus Text in Hand-Drawn Diagrams (2009)<div style="text-align: center;">by Akshay Bhat and Tracy Hammond</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: <a href="http://ayden-kim.blogspot.com/2010/12/reading-14-using-entropy-to-distinguish.html?showComment=1292326109932_AIe9_BG3hyoAjohktt8avKKzcW6IukYXcgHpijt2AACu2hmtORg9tKISwoZNFHknqNpWYnAeeY-CnBrbKsuLS_Zb5LHNvxl0Up8Il7iq6d_P3jxIfYtWm07K1aC_0vzAF2LFzVd_xMrLAcPBA0YMzbWGl10FjP1AB6SQNNW0zmMA1iCDms4J04Yh1brFS-ZgQSr1RA90K51eurimEblmWZsVuAT8ro2X9uMSh-99FqpvlofR21japeJGhjyTktwce1C-lms1B5JJk-lsHrW4YCqo8lPWjjcgXF0CJI9jSm28DMUTqY_H2NXkAlEXccqW4LvucT-f8vI0osOhbhEt3Oh6HAxYecOMo0FEl8bsbCV2_oIUUPSynfEpeLMY3VAQJp4KOeid7CWHUqUISHA2wiyVJX1x4CEwV773fAbUzgRFvUvPJmKxud3I6wf0FLPCjZuCYcQ3Fov4fw8Xeq-jJ2rAjO13o4FcUkC3YkR3S_ZOEGQ1m7fytXOSV_Rola2q5jDs2vTwl5yQS20PXYsGAQtr5PprjlAOfXr6hZKSPNwKMdd3CpBVOGZZj352wO_Z9sFe1dI33nEpdqr6MHOTdQHhAEDl5bAdyPpGrRTMyI7K9i-zt6sotur26sEv2XVhfi5pxIpL_JUcPT5TdhIl4Dsg6httD76P1Fs3B5vyHdDeH3SK1EIcQH1h6rpvYS-gxq-nBbN9ZG6b91B_C_Sax5jQF1T5LXCo7KwwwL6hTdD7shNQrkBieTY#c2323697357113138560">Ayden</a></i></div><div style="text-align: center;"><i><br /></i></div><div style="text-align: left;">The authors propose that entropy rates are higher for text strokes than for non-text strokes and attempt to separate shapes from text using this idea. They achieved a 92% recognition rate. They define entropy, calculate entropy for all letters of the alphabet, and perform classification on collected sketches.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">I agree that text shapes have high entropy, and it is interesting to note that this approach has not been taken earlier in the history of sketch recognition. Obviously some primitive shapes, such as circle and rectangle, will have lower entropy than text, but what about helixes or more complex shapes? 
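<div style="text-align: left;"><br /></div><div style="text-align: left;">The core computation is simple enough to sketch in a few lines. Here's my own toy version (the alphabet size and binning are placeholders, not the paper's exact formulation): quantize the turning angle at each stroke point and take the Shannon entropy of the resulting distribution, with higher values suggesting text-like ink.</div><div style="text-align: left;"><br /></div><pre>
import math

def direction_change_entropy(points, bins=7):
    """Shannon entropy of quantized turning angles along a stroke.
    points: [(x, y), ...], at least three points assumed."""
    angles = []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        a1 = math.atan2(y1 - y0, x1 - x0)
        a2 = math.atan2(y2 - y1, x2 - x1)
        turn = (a2 - a1 + math.pi) % (2 * math.pi) - math.pi  # -pi..pi
        angles.append(turn)
    if not angles:
        return 0.0
    counts = [0] * bins
    for turn in angles:
        idx = int((turn + math.pi) / (2 * math.pi) * bins)
        counts[min(idx, bins - 1)] += 1
    n = len(angles)
    return -sum(c / n * math.log2(c / n) for c in counts if c)
</pre><div style="text-align: left;"><br /></div>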
This might be good in some diagramming domains.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-85669081240020771832010-12-14T05:04:00.008-06:002010-12-14T05:22:06.290-06:00Reading #13: Ink Features for Diagram Recognition<div style="text-align: center;">by Rachel Patel, Beryl Plimmer, John Grundy, and Ross Ihaka</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: <a href="http://jianjiezhang-sr2010.blogspot.com/2010/10/reading-13-ink-features-for-diagram.html#c7567457136859922718">Jianjie</a></i></div><div style="text-align: center;"><i><br /></i></div><div style="text-align: left;">This paper aims at more accurate diagram recognition through a statistical analysis of the features used to recognize various diagram components from sketched samples. It is pretty much an introduction to some of the important concepts in sketch recognition and illustrates some general approaches to the problem. The paper particularly focuses on shape vs. text.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">The authors took 46 features grouped into 7 categories. They collected sketches from 26 participants that contained diagram elements and text. They used a statistical partitioning technique to find which features best split the strokes into shape or text strokes and then constructed decision trees with the most significant features near the root of the tree.</div><div style="text-align: left;"><br /></div><div style="text-align: left;">They compared their methods with some existing shape-vs-text systems and found some interesting results...</div><div style="text-align: left;"><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">Sketch recognition still remains in its infancy despite its age, and formal analyses like this are important to help us understand the processes and achieve greater recognition performance. This work seems kind of inconclusive, however, and I didn't understand the results very well.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-15023130051568721432010-10-11T17:28:00.007-05:002010-12-14T05:04:02.806-06:00Reading #12: Constellation Models for Sketch Recognition (2006)<div style="text-align: center;">by D. Sharon and M. 
van de Panne (<a href="http://srl.csdl.tamu.edu/courses/SR2008/papers/05_vision_techniques/Sharon.pdf">paper</a>)</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments:</i> <a href="http://pacocomputer.blogspot.com/2010/10/reading-12-constellation-models-for.html#c8865282445307301514">Francisco</a></div><div><br /></div><div>A constellation is a collection of objects arranged in a certain pattern, forming an image. Drawings can be imagined as constellations, especially with regard to common objects, such as the human face. Each face has the same features arranged in the same manner, with slight variations in size, shape, and location. However, several main relationships are always present: the two eyes lie on a horizontal line above the nose, for example.</div><div><br /></div><div>The authors have used this concept for sketch recognition. Required drawing elements and their relative positions are set as features and classified using a maximum-likelihood search. Many sketches were collected for training data, and common features were identified and labeled. Five classes were defined: faces, flowers, sailboats, airplanes, and characters. </div><div><br /></div><div style="text-align: center;">_______</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is an interesting way to identify common objects within sketches. Since the authors use only five classes, I wonder whether this technique would scale to many more. Also, it would be interesting to see how detailed this could get, to perform more intricate recognition, such as multiple types of faces, or even individuals.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-56299900202196482772010-10-11T16:53:00.015-05:002010-10-12T08:58:27.215-05:00Reading #11: LADDER, a sketching language for user interface developers (2007)<div style="text-align: center;">by Tracy Hammond and Randall Davis (<a href="http://tinyurl.com/2c6pxgp">paper</a>)</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: </i><a href="http://jonsblog-grove.blogspot.com/2010/10/reading-11-ladder-sketching-language.html">Jonathan</a></div><div><br /></div><div>LADDER is a language to "describe how sketched diagrams in a domain are drawn, displayed, and edited." It is intended to help interface developers create sketch-based interfaces. LADDER is used to create shape descriptions. Shapes consist of components (such as lines), constraints (such as intersections), aliases, editing, and display properties. LADDER descriptions allow domain-specific definitions that can be used with domain-independent recognizers.</div>
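<div><br /></div><div>To make that concrete, here is roughly what a shape description looks like, paraphrased as a Python data structure rather than actual LADDER syntax. The component and constraint names below mirror the general vocabulary but are illustrative, not copied from the paper.</div><div><br /></div><pre>
# A LADDER-style shape description, paraphrased as Python data.
# Real LADDER has its own grammar; names here are illustrative.
arrow = {
    "name": "arrow",
    "components": {            # sub-shapes the arrow is built from
        "shaft": "Line",
        "head1": "Line",
        "head2": "Line",
    },
    "constraints": [           # geometric relations that must hold
        ("coincident", "shaft.p2", "head1.p1"),
        ("coincident", "shaft.p2", "head2.p1"),
        ("equalLength", "head1", "head2"),
        ("acute", "head1", "shaft"),
        ("acute", "head2", "shaft"),
    ],
    "aliases": {"head": "shaft.p2", "tail": "shaft.p1"},
    "display": "ideal strokes",   # one of the display methods listed below
}
</pre><div><br /></div>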
<div><br /></div><div>The paper gives many examples of shapes that can be modeled using LADDER. Such shapes include arrows and UML diagrams.</div><div><br /></div><div>There are many predefined shapes (such as point, line, curve, ellipse), constraints (such as perpendicular, collinear, tangent, larger, acute), orientation-dependent constraints (such as horizontal, negative slope, above, centered below), editing methods (such as click, draw, encircle), and display methods (such as original strokes, ideal strokes, circle, rectangle, text, image).</div><div><br /></div><div>Some shapes are made up of a fixed number of segments, while others can contain an arbitrary number of segments.</div><div><br /></div><div>When recognition occurs, primitive shapes are recognized first. Shapes are generated that contain the original strokes and their interpretations. Some shapes contain sub-shapes. Once the primitive shapes are recognized, domain-specific shapes are recognized using the domain descriptions.</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>LADDER is a nice tool for interface developers. I like that it allows complex shapes to be completely described using a programming language. However, I do see certain drawbacks, namely that each shape must be explicitly defined in LADDER. The paper mentions the capability to generate descriptions based on drawn examples, and I think this would be a great idea. It would be very tedious to define explicit shapes for large domains (COA...). I don't know if it has been implemented yet.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com1tag:blogger.com,1999:blog-8909803365820645146.post-48835383956622124662010-10-11T16:09:00.018-05:002010-10-11T16:52:53.594-05:00Reading #10: Graphical Input Through Machine Recognition of Sketches (1976)<div style="text-align: center;">by Christopher F. Herot (<a href="https://docs.google.com/fileview?id=0B_g-1RWJ688bZDY2ODk3MzgtN2IwOS00ZDc5LWFkMDAtNmYzNWVhYjVkNjlh&hl=en&authkey=COOK9oUP">paper</a>)</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments</i>: <a href="http://jonsblog-grove.blogspot.com/2010/09/reading-10-graphical-input-through.html">Jonathan</a></div><div><br /></div><div>This is an early sketch recognition system aiming to allow sketch input to computer programs. It describes a system called HUNCH that is used to recognize primitive sketches. It uses speed alone to detect corners in a sketch. Curves were viewed "to be a special case of corners," and were modeled using b-splines. Speed was also used to determine how "careful" the user was; for instance, faster strokes were considered less careful. This was used to help identify curves and decide whether to draw them as b-splines or just corners.</div>
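<div><br /></div><div>Since the speed-only idea is so simple, here is a minimal sketch of it (the threshold ratio is my guess, not a value from the paper): flag local speed minima that fall well below the stroke's average speed.</div><div><br /></div><pre>
import math

def speed_corners(points, times, ratio=0.4):
    """Find corner indices using pen speed alone, in the spirit of
    HUNCH: flag local speed minima below ratio * average speed.
    points: [(x, y), ...]; times: one timestamp per point.
    The 0.4 ratio is a placeholder, not a value from the paper."""
    speeds = [0.0]
    for i in range(1, len(points)):
        dist = math.hypot(points[i][0] - points[i - 1][0],
                          points[i][1] - points[i - 1][1])
        dt = (times[i] - times[i - 1]) or 1e-6
        speeds.append(dist / dt)
    avg = sum(speeds) / len(speeds)
    corners = []
    for i in range(1, len(speeds) - 1):
        local_min = not (speeds[i] > speeds[i - 1] or
                         speeds[i] > speeds[i + 1])
        if local_min and ratio * avg > speeds[i]:
            corners.append(i)
    return corners
</pre><div><br /></div>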
<div><br /></div><div>The programs used to convert sketches to straight segments and curves were called STRAIT and CURVIT, respectively. STRAIT and CURVIT did not always create the same interpretation as humans. They seemed to be user-dependent, as sketches were interpreted better for some users than for others. An improved method of straight segmenting was implemented, called STRAIN. It used a function of speed to determine which line endpoints to join (instead of a fixed distance, which is what STRAIT did).</div><div><br /></div><div>Programs were also developed to detect overtraced lines and turn 2D sketches into 3D.</div><div><br /></div><div>The paper discusses the importance of context in sketch recognition systems. The HUNCH system does not use context in its recognition schemes. For example, all its subroutines are always called in a fixed sequence and always perform recognition in the same way. However, similar strokes will probably be interpreted in different ways given their context.</div><div><br /></div><div>The paper discusses an interactive system and examines the hierarchical structures of recognized sketches. It discusses various ways to tune the algorithms to work for a "truly interactive system."</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>This is an early approach to sketch recognition, and it asks many questions as well as answers some. It can be rather dry and boring, but it does bring up some questions that are still relevant today and whose solutions can still be improved on.</div><div><br /></div><div>For example, the latching problem: when should close endpoints be merged? It depends on the context, which is another contemporary issue.</div>
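<div><br /></div><div>To make the latching question concrete, the naive version looks something like this toy snippet (the fixed tolerance is mine); all of the interesting work is in choosing that tolerance from context.</div><div><br /></div><pre>
import math

def latch_endpoints(endpoints, tol=8.0):
    """Naive latching: greedily snap endpoints that lie within a fixed
    tolerance of each other to their shared midpoint. endpoints:
    [(x, y), ...]. A context-aware system would vary `tol` with speed,
    stroke length, or what is being drawn."""
    merged = list(endpoints)
    for i in range(len(merged)):
        for j in range(i + 1, len(merged)):
            (x1, y1), (x2, y2) = merged[i], merged[j]
            if tol > math.hypot(x2 - x1, y2 - y1):
                mid = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
                merged[i] = merged[j] = mid
    return merged
</pre><div><br /></div>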
<div><br /></div><div>While I was working on my truss recognizer, I dealt with the latching problem when merging close line endpoints to form truss nodes. I used a distance computed based on stroke lengths, but that was obviously a poor choice.</div><div><br /></div><div>This paper begins to explore machine learning techniques for interpreting sketches. It introduces many questions and proposes possible solutions by hypothesizing extensions to an existing primitive corner finding and beautification system. I believe the authors asked good and relevant questions, as we are still asking ourselves how to solve many of the same problems in better ways.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-42417959840645916872010-10-11T15:39:00.013-05:002010-10-11T16:08:36.722-05:00Reading #9: PaleoSketch: Accurate Primitive Sketch Recognition and Beautification (2008)<div style="text-align: center;">by Brandon Paulson and Tracy Hammond (<a href="http://tinyurl.com/2dsykj3">paper</a>)</div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments</i>: <a href="http://jianjiezhang-sr2010.blogspot.com/2010/10/reading-9-paleosketch-accurate.html">Jianjie</a></div><div style="text-align: center;"><br /></div><div style="text-align: left;">PaleoSketch is a low-level sketch recognition system that can identify primitive shapes from single strokes. It is capable of recognizing lines, polylines, circles, ellipses, arcs, curves, spirals, and helixes. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">Recognition occurs in three steps: pre-recognition, individual shape tests, and result ranking. Pre-recognition is essentially some pre-processing to make recognition easier. This includes removing duplicate points, generation of feature graphs (speed, curvature, etc.) that can be used during recognition, tail removal, and two new features, NDDE and DCR. NDDE is the normalized distance between direction extremes: the stroke length between the point with the highest direction value and the point with the lowest direction value, divided by the total stroke length (the direction value being the angle of travel, roughly the change in y over the change in x). DCR is the direction change ratio, and it is calculated by taking the "maximum change in direction divided by the average change in direction."</div><div style="text-align: left;"><br /></div><div style="text-align: left;">Individual tests are done for lines, polylines, circles, ellipses, arcs, curves, spirals, and helixes. Each test compares some stroke features and calculates the confidence that a stroke is that shape. When all tests are complete, the results are ordered using properties of the corner finding algorithm.</div>
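<div style="text-align: left;"><br /></div><div style="text-align: left;">Here is a quick Python sketch of how the two new features might be computed from the definitions above. It's my reading of the summary, not the paper's implementation: PaleoSketch also smooths its direction graph, and I ignore angle wrap-around, which a real implementation must handle.</div><div style="text-align: left;"><br /></div><pre>
import math

def ndde_dcr(points):
    """NDDE and DCR per the definitions above (toy version; assumes a
    stroke with at least three points and uses atan2 for direction
    rather than a raw dy/dx slope)."""
    dirs = [math.atan2(points[i + 1][1] - points[i][1],
                       points[i + 1][0] - points[i][0])
            for i in range(len(points) - 1)]
    seglens = [math.hypot(points[i + 1][0] - points[i][0],
                          points[i + 1][1] - points[i][1])
               for i in range(len(points) - 1)]
    total = sum(seglens) or 1e-6

    # NDDE: stroke length between the direction extremes / total length
    hi = max(range(len(dirs)), key=dirs.__getitem__)
    lo = min(range(len(dirs)), key=dirs.__getitem__)
    a, b = min(hi, lo), max(hi, lo)
    ndde = sum(seglens[a:b + 1]) / total

    # DCR: max direction change / average direction change
    changes = [abs(dirs[i + 1] - dirs[i]) for i in range(len(dirs) - 1)]
    avg = sum(changes) / len(changes) if changes else 0.0
    dcr = max(changes) / avg if avg else 0.0
    return ndde, dcr
</pre><div style="text-align: left;"><br /></div>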
<div style="text-align: left;"><br /></div><div style="text-align: left;">To test the system, data was collected from 10 users and run on PaleoSketch, both with all features enabled and with some features disabled, as well as on Sezgin's algorithm for comparison. PaleoSketch achieved a recognition rate of 98.56% across all shapes.</div><div style="text-align: left;"><br /></div><div style="text-align: center;">__________</div><div style="text-align: left;"><br /></div><div style="text-align: left;">PaleoSketch is a good primitive recognition algorithm, essentially combining ideas from previous work and adding some new stroke features and result ranking. It performs much better than the other algorithms we have covered so far, including Rubine's and Sezgin's. I have used PaleoSketch some, and I have found it to be fairly accurate in practice. Its biggest drawback, in my opinion, is its speed. It runs fairly quickly for smaller strokes and collections of a few strokes, but when strokes become long or there are many strokes being analyzed, PaleoSketch slows down and can take a second or more to execute. This makes it a poor choice for online recognition systems. However, it is a great improvement over previous systems.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com1tag:blogger.com,1999:blog-8909803365820645146.post-54935025916669291182010-09-27T21:12:00.021-05:002010-10-11T14:57:52.231-05:00Reading #8: A Lightweight Multistroke Recognizer for User Interface Prototypes (2010)<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMAR7rwb3aPDKmeOWysQmVs9KiT-akz38dGOuJ-Ny7_N0tlAoV8khXp6I5scc3KMYlsNWBIOlxmdu1Ug9jDVXDnGl3HXCtAdEg4b1ZVU27zV6OoGF2KXR8BD7SyWKpZ7tBOA0AbDXQGD7s/s1600/NDollar.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 458px; height: 200px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgMAR7rwb3aPDKmeOWysQmVs9KiT-akz38dGOuJ-Ny7_N0tlAoV8khXp6I5scc3KMYlsNWBIOlxmdu1Ug9jDVXDnGl3HXCtAdEg4b1ZVU27zV6OoGF2KXR8BD7SyWKpZ7tBOA0AbDXQGD7s/s320/NDollar.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5521817498481263506" /></a><div style="text-align: center;">by Lisa Anthony and Jacob Wobbrock <a href="http://sketch-rec-2010.googlegroups.com/web/Anthony.%20A%20Lightweight%20Multistroke%20Recognizer%20for%20User%20Interface%20Prototypes.pdf">(paper)</a></div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: </i><a href="http://danibits.blogspot.com/2010/09/reading-8-lightweight-multistroke.html#c8650539802640254074">Danielle</a></div><div style="text-align: center;"><i><br /></i></div><div>This paper presents $N, a multi-stroke extension to Wobbrock's $1 single-stroke recognizer. It takes the same approach of making recognition easy to incorporate into any program, just as $1 did, and it includes pseudo-code for the algorithm that fits on less than one page.</div><div><br /></div><div>$N handles multiple strokes by connecting them into one long stroke. The many possible permutations that could occur are generated when new templates are created. Resampling, scaling, and rotation all occur as in $1, with some adjustments to enhance the capabilities of $1.</div><div><br /></div><div>In addition to providing support for multiple strokes, $N addresses some problems with the $1 algorithm. $N allows 1-dimensional gestures (such as lines) by calculating the ratio of the sides of the bounding box and using a threshold to determine whether the gesture is 1D or not. $N allows rotation in the gestures as well: if rotation sensitivity is desired, the gestures are rotated to the template angles instead of to 0. Finally, $N provides optimizations to increase recognition speed. Templates are only compared if the starting angles of the strokes are "about the same." Also, the developer can choose to limit the number of strokes in gestures that will probably always have a set number of strokes (for example, + and = gestures).</div><div><br /></div><div>The drawbacks to $N are scale invariance, using more strokes than in the template, collision of gestures, and large numbers of templates.</div><div><br /></div><div>To test the algorithm, 40 middle and high school students used a sketch input program to enter simple algebraic equations.</div><div><br /></div><div style="text-align: center;">__________</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This paper is nice in the same ways $1 was nice. It can allow any level of programmer to implement pen-gesture interaction in any program. Because $N is overall better than $1 (despite slightly lower accuracy on single-stroke gestures), it is a worthwhile successor. </div><div style="text-align: left;"><br /></div><div style="text-align: left;">There are more possibilities with $N than with $1. There are more ways it can be used, and more ways it can be enhanced as well.</div>
<div style="text-align: left;"><br /></div><div style="text-align: left;">It also seems that $N could easily be extended to 3D (of course with more computation required) for use with hand gestures or something similar.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-70889491818417494492010-09-15T20:04:00.023-05:002010-10-11T15:27:02.068-05:00Reading #7: Sketch Based Interfaces: Early Processing for Sketch Understanding (2001)<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgY2hgcR4jZQZ0Uz_aAAxCPBmGTvBoH0gnCA2bdMEGFOQ0xypngS3aeWgAJAonVGmDJjVKhlJ-ZJ6K9hXXggyOpw9wl_HdZzAp6ITpFXHzBeZ5c06J3LuxkqrfWPrnZ0bmVMPXRvYOezxNl/s1600/SezginFigures.png"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 403px; height: 262px;" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgY2hgcR4jZQZ0Uz_aAAxCPBmGTvBoH0gnCA2bdMEGFOQ0xypngS3aeWgAJAonVGmDJjVKhlJ-ZJ6K9hXXggyOpw9wl_HdZzAp6ITpFXHzBeZ5c06J3LuxkqrfWPrnZ0bmVMPXRvYOezxNl/s320/SezginFigures.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5521781212105060962" /></a><div style="text-align: center;">by Tevfik Metin Sezgin, Thomas Stahovich, and Randall Davis <a href="http://sketch-rec-2010.googlegroups.com/web/Sezgin.%20Sketch%20Based%20Interfaces%20Early%20Processing%20for%20Sketch.pdf">(paper)</a></div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: </i><a href="http://martysimpossibletorememberurl.blogspot.com/2010/09/reading-7-sezgin-early-stroke.html#c6698552450326744466">Marty</a></div><div><i><br /></i></div><div>This paper describes a system that analyzes a sketch after it is drawn, focusing on what was drawn rather than on how the sketch was drawn. It also allows multiple strokes in a sketch, something we have yet to discuss in this course.</div><div><br /></div><div>One of the main features of Sezgin's system is its vertex detection, or corner finding, implementation. He uses a combination of speed and curvature to detect corners. After segmentation, the straight edges of a sketch are stored as a polyline.</div><div><br /></div><div>The second feature is curve handling. 
<div><br /></div><div>The second feature is curve handling. The system models curved regions as Bézier curves, approximating the control points with a least-squares method.</div><div><br /></div><div>The system beautifies the drawn strokes "primarily to make it look as intended." Lines meant to be parallel are made parallel (and similarly for perpendicular lines), straight lines are made straight, and curves are rendered properly.</div><div><br /></div><div>Finally, the system performs primitive object recognition. It uses simple geometric constraints to recognize ovals, circles, rectangles, and squares.</div><div><br /></div><div>A user study was done to test the usability of the program and compare it to a tool-based drawing program. The participants found the authors' system easier to use, since any shape can be drawn instantly without having to select a corresponding tool first. The authors report an accuracy of 96% when approximating drawn shapes from a set of 10 figures.</div><div><br /></div><div style="text-align: center;">__________</div><div style="text-align: center;"><br /></div><div style="text-align: left;">This is an early beautification paper: it turns sketched drawings into actual technical drawings such as schematics and diagrams by applying corner and curve finding to determine the user's intended sketch. I think it helps show that sketching can be a superior input method to traditional menu- and toolbar-based drawing programs. Such interfaces were rare then, and still are now, and hopefully we can build upon this to help popularize sketch-based interfaces.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-28640070811644397752010-09-14T10:27:00.007-05:002010-10-11T15:32:44.142-05:00Reading #6: Protractor: A Fast and Accurate Gesture Recognizer (2010)<div style="text-align: center;">by Yang Li <a href="http://sketch-rec-2010.googlegroups.com/web/Li.%20Protractor%20A%20Fast%20and%20Accurate%20Gesture%20Recognizer.pdf?gda=2hfL02wAAACmZdthqLQpj9Vl0k7M0HYd8yxnFbMipdzNbinwFk4JPxPOC-_zR4uLDsyFVX9Ezg7IacI152rc1d_rItaqtWFq6-8LvlVMhZfPUcO-3zApTio_8GdtfP-pIWfmntpyqqP9Wm-ajmzVoAFUlE7c_fAt">(paper)</a></div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments: </i><a href="http://liwenzhe.blogspot.com/2010/09/reading-6-protractor-fast-and-accurate.html">Wenzhe</a></div><div><i><br /></i></div><div>Protractor is a modified $1 algorithm. The enhancements include orientation sensitivity with up to 8 base directions, scale invariance, and greater speed.</div><div><br /></div><div>Protractor resamples strokes just as $1 does, but uses N=16 points ($1 used 64 in its testing); a toy version of this shared resampling step is sketched below.</div>
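<div><br /></div><div>This is my own rendering of the published resampling pseudo-code, not code from any of the papers:</div><div><br /></div><pre>
# Rough rendering of the resampling step shared by $1, $N, and
# Protractor: space n points evenly along the stroke's arc length.
import math

def resample(points, n=16):
    pts = [tuple(p) for p in points]
    interval = sum(math.dist(pts[i - 1], pts[i])
                   for i in range(1, len(pts))) / (n - 1)
    out = [pts[0]]
    accumulated = 0.0
    i = 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and accumulated + d >= interval:
            t = (interval - accumulated) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            out.append(q)
            pts.insert(i, q)   # measure the rest of the segment from q
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    while len(out) < n:        # guard against floating-point shortfall
        out.append(pts[-1])
    return out[:n]
</pre>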
<div><br /></div><div>Rotation invariance can be toggled on or off. If a gesture is to be rotation-independent, Protractor rotates it around the centroid until the indicative angle is 0, just as $1 does. If orientation sensitivity is enabled instead, it rotates the indicative angle to the nearest of 8 equidistant base angles. Protractor does not scale the strokes to a reference square as $1 does; it normalizes the resampled points into a vector, so it remains scale-invariant. The rotation adjustment step is also modified: instead of iteratively searching for the optimal orientation, Protractor calculates, in closed form, an angle that is close to optimal (sketched below).</div>
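<div><br /></div><div>Here is my sketch of that closed-form alignment, based on my reading of the paper; it assumes both vectors were already normalized to unit length during preprocessing and omits details such as bounding the rotation:</div><div><br /></div><pre>
# My sketch of Protractor's closed-form rotation: for two preprocessed
# unit vectors [x0, y0, x1, y1, ...] of equal length, the rotation that
# maximizes cosine similarity has a direct solution, no iteration needed.
import math

def optimal_cosine_distance(template, gesture):
    a = sum(template[i] * gesture[i] + template[i + 1] * gesture[i + 1]
            for i in range(0, len(template), 2))
    b = sum(template[i] * gesture[i + 1] - template[i + 1] * gesture[i]
            for i in range(0, len(template), 2))
    angle = math.atan2(b, a)   # best rotation of gesture toward template
    similarity = a * math.cos(angle) + b * math.sin(angle)
    return math.acos(max(-1.0, min(1.0, similarity)))  # smaller is better
</pre><div><br /></div><div>Because of these modifications, Protractor performs significantly faster than $1 as the number of training examples increases, while its recognition rates are not significantly different. Because of the speed enhancements, Protractor is ideally suited to mobile device applications.</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>I like this extension of the $1 algorithm. It sounds like it isn't much more difficult to implement than $1, and the speed gains without sacrificing accuracy are nice. It is also nice to be able to specify orientation-dependent gestures. This, along with the scale invariance, can help expand the limited 16-gesture set used in the $1 paper. The paper does evaluate a 26-class gesture set, and Protractor performed significantly better on it than $1 did.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com1tag:blogger.com,1999:blog-8909803365820645146.post-91703059830925368812010-09-14T07:46:00.005-05:002010-10-11T15:34:46.803-05:00Reading #5: Gestures without Libraries, Toolkits or Training: A $1 Recognizer for User Interface Prototypes (2007)<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://ecommwire.com/primages/manions_one_dollar_auction.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 300px; height: 364px;" src="http://ecommwire.com/primages/manions_one_dollar_auction.jpg" border="0" alt="" /></a><br /><div style="text-align: center;">by Jacob Wobbrock, Andrew Wilson, and Yang Li <a href="http://sketch-rec-2010.googlegroups.com/web/Wobbrock.+Gestures+without+Libraries,+Toolkits+or+Training+A+$1+Recognizer+for+User+Interface+Prototypes.pdf?gda=bmZ75agAAACmZdthqLQpj9Vl0k7M0HYdVwQwartRmWl27HYZep_T7Z66IXh41LoqXCfrDrJ1BOoNd2Lm5Zn4MRS9fkabygFT6E9p7p9N7Cy_HwvzOQ-b9QeTvbH1uvYrCJFzoRwWg9WL3W-AmsQXUBXBDiXZfFolX0NUGgM92ALQEML-Wuv0lAy5QA3m7RjvVQ-4e7sMExsw_EwnBJKD4g7L12urrrzAGjVgdwNi-BwrUzBGT2hOzg&gsc=crKkryMAAAAg-0SRQLcz-VbLZQCO2m-SapWLqfQ5qBcEhoV5ofYFwK0IoyLhPG2x5smOr2otMGI&pli=1">(paper)</a></div><div style="text-align: center;"><br /></div><div style="text-align: center;"><i>Comments:</i> <a href="http://eyce9000.blogspot.com/2010/09/624-5-1-recognizer-wobbrock.html">George</a></div>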
<div><br /></div><div>This paper describes the $1 gesture recognizer, a sketch/gesture recognition algorithm intended to be simple and easy to program so that it can be implemented anywhere. The hope is that gestures can then be incorporated into rapidly prototyped interfaces that otherwise might not have used gesture input: most user interface designers and programmers don't have the knowledge or skills to implement complex recognition algorithms, and existing recognition toolkits are not available in every language or environment, especially those that human-computer interaction experts tend to use.</div><div><br /></div><div>The authors describe the algorithm in 4 parts: point resampling, indicative angle rotation, scaling and translation, and finding the optimal angle for the best score. These transformations, applied to each input stroke, let it be matched easily against a few template strokes for each gesture. The recognition result is the template gesture with the smallest average Euclidean distance to the input stroke (a toy version of this matching step follows my comments below).</div><div><br /></div><div>The $1 algorithm is compared to the DTW and Rubine algorithms and is found to compete well against them, achieving high recognition rates at high speed. The $1 pseudocode is given as well to aid programmers.</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>This paper is very clearly written and the $1 algorithm is indeed very simple. I find it interesting that such a simple, almost naive, approach can perform very well if executed intelligently. It is easy to imagine improvements and additional recognition capabilities for this algorithm, such as rotation-dependent or time-dependent gestures.</div>
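<div><br /></div><div>To make that simplicity concrete, here is a toy condensation of the matching step. It is mine, not the paper's full pseudo-code; it skips the golden-section search for the best rotation and assumes the candidate and templates are already preprocessed:</div><div><br /></div><pre>
# Toy $1-style matcher. Candidate and templates are assumed to be
# resampled, rotated so the indicative angle is 0, scaled to a unit
# square, and translated to the origin.
import math

def path_distance(a, b):
    # Average point-to-point Euclidean distance between two strokes.
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(candidate, templates):
    """templates: dict mapping gesture name to preprocessed points."""
    best_name, best_dist = None, float("inf")
    for name, points in templates.items():
        d = path_distance(candidate, points)
        if d < best_dist:
            best_name, best_dist = name, d
    half_diagonal = 0.5 * math.sqrt(2.0)       # of the unit square
    score = 1.0 - best_dist / half_diagonal    # 1.0 means a perfect match
    return best_name, score
</pre>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-49298934512777254192010-09-08T21:07:00.011-05:002010-09-09T01:50:29.493-05:00Reading #4: Sketchpad: A Man-Machine Graphical Communication System (1963)<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://www.mprove.de/diplom/_media/fig3.1_Sketchpad.jpg"><img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 360px; height: 300px;" src="http://www.mprove.de/diplom/_media/fig3.1_Sketchpad.jpg" border="0" alt="" /></a><div style="text-align: center;">by Ivan E.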
Sutherland <a href="http://sketch-rec-2010.googlegroups.com/web/Sutherland.+Sketchpad+A+Man-Made+Graphical+Communication+System.pdf?gda=iUvo2nsAAACmZdthqLQpj9Vl0k7M0HYdIK0V6MMGMszhbbKX_XV_9iAkECoKbqaWyJ-hfiKlR6fQp-62bYA-h9B_uSm-okuaxGOmVRJfu44G-mN_WP28QSZmdOuMlGvMwdX0ne9k6boJEDlKUDFuLeq1uRevr6JzBkXa90K8pT5MNmkW1w_4BQ&gsc=GdPMvBYAAAD9nyZdlAd08ZuGwOEdD8XpQyp1D5ajuSYvgTF5re8aIA&pli=1">(paper)</a></div><div><br /></div><div style="text-align: center;"><i>Comments: <a href="http://jonsblog-grove.blogspot.com/2010/09/reading-4-sketchpad-man-made-graphical.html?showComment=1284014879075_AIe9_BEdPK8oHTSEJ8d7iIS3mwjXjXo8bakbSkV4oEurvP9pthhGptqHdLpTdFqIDOJIXHWjlJmOfHdXRKnMARPBmG_tnlD05AEmtOeQNqXORVbtR0ovp7EaQk80kgVhQ1SdjJ18YAtW3yTU2d46puk7ENUAoveHZVw1xSsQNal8i7bEW0NlElwB2c5dHRzG27kSOdWZFa1qpXmL_uk5hbhwR7j0gO_MLB7RXBaU2OioN81BrmIKEBDcbP6UKBCAUHw_6GzhyMKlIzVqVy0_i0A1wEBrB_u1TazVFCu6nE5YEAV5-uUxhXD8EZ0dGg9krVmQhyLSoEFN99Z_SAuuiLjNpSFln8xNO2l_oW5NMEUk6uj89AIuS8o_9nsEhtULpvE0qcxn4BGoEFxCMLd1G3mIeBSMATiAIM7tie4GIJ_q2BerShbnpSe9NZkpGkDHGXesxRVc_6EpEwCr-o_zn_Mhn1g845mBnNm35X58NlbHr7FEW1PswMajtsX5trJBgcEdAtwpSqa4RbZ3XiGzK2VOXgaP-trY3BgpVrwVpbeIvrfAgoJ6J-f6nnzJ4QbX224tulTN8Ms9DlHYm8oGTkibruxr7R8cwMwYCNGaMbKVtIHKN0BpYDso_2Pigz_b1YYPyNe8VnzJ73CQKgzQtZT637aIm2H3gawBQdrYpjfWtibUHvvpLeNPTSmIeGqGxg6OIzTW4R2ngKMx8V3AQ0C2x9lQnwF3ZK5gjuib9FUAbA42IIftCFGcF7jI9J4dhajmjaBLdBT1MG2Xy88PVEOnYncbSw4XAbfCmB0XogUIoB0TGcmt8e05XUfMH3HFDBcp-FHdVv-aLDYk5tEMevAhSf0CxdR1hbfiy6VEylGTIDwU22mROKuPz7FA2icGPKVdloHTLHXm#c6095034833086104732">Jonathan</a></i></div><div><i><br /></i></div><div>This paper presents the initial sketch-based interaction work of Ivan Sutherland. This was one of the first systems to use a pen to draw on a screen, ushering in a new form of human-computer interaction.</div><div><br /></div><div>To use the system, the user has a set of buttons and switches that activate certain modes and tools, such as a line tool or a delete mode. Once the desired settings are in place, the pen accomplishes the task. It is important to note that the pen does not perform any free-form drawing; rather, it creates geometry using only pre-defined tools or issues commands by pointing or dragging. This makes Sketchpad more like a CAD system that uses a pen for input (note that the mouse did not exist at the time of this work).</div><div><br /></div><div>The paper shows its age by emphasizing things like data structures and memory usage as well as generic representations of sketch elements. A "light pen" is used as the input device.</div><div><br /></div><div>Most of the paper details the various constraints and tools and how they were implemented using non-procedural, object-oriented methods, all of which were new ideas (as discussed in <a href="http://www.youtube.com/results?search_query=sketchpad&aq=f">this video</a>).</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>This paper introduced many new ideas about human-computer interaction, graphical displays, and programming. It was the first of its kind in almost every aspect, and it is hard to appreciate now without reading commentary from the era. Much of the paper seems trivial to implement using our current software development languages and tools.
I found it interesting that many ideas introduced back in 1962 remain active, and hard, research problems today (such as recognizing artistic drawings and electrical schematics).</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com1tag:blogger.com,1999:blog-8909803365820645146.post-26059772103054472532010-09-07T10:34:00.011-05:002010-09-07T22:21:42.509-05:00Reading #3: "Those Look Similar!" Issues in Automating Gesture Design Advice (2001)by Long, Landay, and Rowe<div><br /></div><div><i>Comments: <a href="https://www.blogger.com/comment.g?blogID=19209095&postID=3666698975417336249&isPopup=true#form">Sam</a></i></div><div><i><br /></i></div><div>This paper presents <i>quill</i>, a gesture design tool aimed at helping developers create pen gesture-based interfaces. The <i>quill</i> software warns developers when gestures might be ambiguous to the computer or look similar to people.</div><div><br /></div><div>The authors conducted experiments to determine what kinds of gestures people perceive as similar, having a few hundred participants judge a large number of gestures and identify the most complex ones. They then developed an algorithm for predicting gesture similarity.</div><div><br /></div><div>Interface designers use <i>quill</i> to input gestures for their interfaces. <i>quill</i> uses the similarity algorithm and Rubine's methods to give feedback to the users and to train and recognize the gestures. The paper discusses in detail the challenges of giving advice, such as how, when, and where advice is displayed in addition to what advice is displayed.</div><div><br /></div><div>The authors conclude that the <i>quill</i> system, while it could use many refinements and improvements, is a good start and can possibly inspire other advice-giving systems for gesture-based interfaces.</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>I can appreciate the assistance given to developers relating to gesture definitions. There still are not many systems that can do this, especially with 3D hand gestures. I have run into issues in my own research where two gestures I didn't think were similar actually were. Re-defining gestures can be a pain, especially if you discover the similarity only after a large gesture set has been defined, which makes it difficult to think of a new, unique gesture (a naive sketch of this kind of similarity check follows).</div>
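<div><br /></div><div>This is not quill's actual metric (the authors derive theirs from human similarity judgments); it is just a naive stand-in to show the kind of trigger such advice tools need:</div><div><br /></div><pre>
# Naive similarity advisor (illustrative only, not quill's algorithm):
# warn when two gesture classes have nearby feature vectors.
import math

def similarity_warnings(class_features, threshold=1.0):
    """class_features: dict gesture name -> feature vector of floats
    (e.g. Rubine-style features). Returns possibly confusable pairs."""
    names = sorted(class_features)
    warnings = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            gap = math.dist(class_features[first], class_features[second])
            if gap < threshold:
                warnings.append((first, second, gap))
    return warnings
</pre>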
<div><br /></div><div>I would really appreciate more development of these tools for 2D and 3D gestures.</div>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0tag:blogger.com,1999:blog-8909803365820645146.post-64090992026795693182010-09-05T20:40:00.017-05:002010-09-07T22:13:19.140-05:00Reading #2: Specifying Gestures by Example (1991)by Dean Rubine <a href="http://sketch-rec-2010.googlegroups.com/web/Rubine.+Specifying+Gestures+by+Example.pdf?gda=s9STXGIAAACmZdthqLQpj9Vl0k7M0HYdj83YVemy-kWytZX0967Ii7l3MwoiU-msJ6-IZBCjpj5RFTgQ9qaAKic8vW9Fsai3ycgTEzaGbirw5C9RE7ZvD9NRrmTnuor7OslvrLgXK1CCHqjxxwsG8_oKG53kozMh&gsc=PEijrRgAAADgz-a1iy9a0R7QtBlQMuje5B_sS8CzPblP06bAUsb7Eg&pli=1">(paper)</a><div><br /></div><div><i>Comments: <a href="http://danibits.blogspot.com/2010/09/reading-2-specifying-gestures-by.html?showComment=1283915567128_AIe9_BGrDEuBB5XWSRoBPXvA9CxX8ukrK66D6KlRGVslOEsrBcP6U8qkhg_nBiyYrMdMbBzJ368iW0I5kEz7ia2Y285TbIeFW9Q3mO37U_KjU0yr27L1KubwhvgYX10ZTBYm4u-9fQlk8D9mT8v4YY_Ogg44hGjLqi7Fg7c3fCu0E_7K7byKxqumkCCmkfYLW9k5QAFhew6p3bQdpvheXbayjk84f3WK30aCdap8b2JA4e2si4suaLvbRK8m90sb9VEs9EtPEQlXVM5ErqMkVH58pHF8eU6UTKZXMEore0BtwJLs4GcVaH1QWl-fb12ZB6tbzhIKDf5SkVYs1lHpX4PpR2o8klrLdaD3aD6IAglXBbNVJDIeU8SG9KC_Z6WPu2ZOWD1GAhZAC8YxqgkIQH8oEjwwtfvl3mX1r5krR8x7vxrNA29cbSYhpiHCXzWzkb8SUl15dmgPDKz2T8HXe0xAvx15dSf6LZLNH38JMmEANQElA-nksS4tnvR7cobBm6dAtDLDWd4D7cH4mZY-JE1MwCtnxZjN3D5BAj135aLQuTBpBgPLfcISfp0tIucz9elJMoF82ZIWofVRFJBMJ8GpcFfedWw0YsUtXs7SVRmI1nmftBS9uHAvWY19zd2MKmXcU_5qCiLt8ud44045cQRsQPGEQva_5LmVQmFk4xpARKoL8oT5XvZETgjfeJzb1RLHoAO0WZm74XCbPVSyUZkfSFpDzUoIv991bOM-RQxTwRuqC0TKdm-80VmPtRyG7S8iOch3l9RIbTdSqlVsSRlOtX3MjxsTalMD6Hop9j5YJtoDXI9x501lJT4I7RZXWTu8BpwMTb0VoqD0jY3UjmudPn6kUUGLZFKDg4oaQcTOfcyh6qhdI9CtuulZwTChOHOOq0HgUF2XVV9b_qSJzPt5CYRs_WB-Q9ghADXEK9G8UQYFpEM-gai_ecPMIRzbwOrk5UhEanViR_wox0q0XzyYyxThh6a16urqkmdCloWmuvqcT1PUTZnWsk804NrP_KVwSy-YrubbVqqEUaFQo3MPoVeS87sJQHD0kugz_B5cWfiavTEVLiGTwPEZL-z-aEYl8-lWa7b4Cgr4nc4KfxsUKUbjSdhfXQ#c6717764075464639339">Danielle</a></i></div><div><i><br /></i></div><div>This paper presents Rubine's gesture-recognition algorithm and his implementation of a program that doesn't require a hand-coded recognizer. His goal is to increase the adoption of sketch-based gesture recognition in user interfaces by making it easier to integrate: developers provide example gestures that are fed into a learning algorithm instead of hand-coding a recognizer.</div><div><br /></div><div>Rubine has implemented a gestural drawing program in which simple single-stroke gestures are used to create and manipulate a drawing. Example gestures include rectangle creation, ellipse creation, copy, rotate-scale, and delete. The user of the program can add new gesture examples to aid recognition as well as modify the structure of each gesture.</div><div><br /></div><div>He presents his simple gesture recognition algorithm, which assumes stroke segmentation is already taken care of. For the stroke drawn as the gesture, 13 features are computed (a few are sketched below). Rubine states that these 13 features are capable of distinguishing many gestures but fail in some cases. Once the features are calculated, they are input to a linear classifier that gives the class name of the stroke. He discusses how the classifier is trained, which is basically the standard method of training a linear classifier.</div>
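<div><br /></div><div>Here are a few of the 13 features as I read them (illustrative, not the complete set; the function name is mine):</div><div><br /></div><pre>
# A few of Rubine's 13 features: initial direction, bounding-box size
# and angle, and total stroke length. Requires at least 3 points.
import math

def some_rubine_features(points):
    (x0, y0), (x2, y2) = points[0], points[2]
    d = math.dist((x0, y0), (x2, y2)) or 1.0    # avoid dividing by zero
    f1 = (x2 - x0) / d                          # cosine of initial angle
    f2 = (y2 - y0) / d                          # sine of initial angle
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    f3 = math.hypot(width, height)              # bounding-box diagonal
    f4 = math.atan2(height, width)              # bounding-box angle
    f5 = sum(math.dist(points[i - 1], points[i])
             for i in range(1, len(points)))    # total stroke length
    return [f1, f2, f3, f4, f5]
</pre>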
<div><br /></div><div>The classifier always outputs one of the known gesture classes. To handle ambiguous input, an estimate of the probability that the gesture was classified correctly is computed, and if it falls below a threshold the classification is rejected. He also rejects gestures whose features lie too many standard deviations from the mean of the winning gesture class (a toy version of this classify-then-reject step closes this post).</div><div><br /></div><div>Rubine says his methods perform well in practice on 10 different gesture sets. He reports recognition rates in the mid to high 90s for varying numbers of examples per class, gesture classes per set, and test gestures per class.</div><div><br /></div><div style="text-align: center;">__________</div><div><br /></div><div>This paper seems to have been on the cutting edge of sketch recognition technology for its time; indeed, the concepts it presents are still widely used and studied today. Very little work and very few non-hand-coded recognition applications existed in 1991. I was impressed by the high accuracy achieved on the gesture sets using the linear classifier, though the accuracy reporting didn't seem complete. I have seen other systems, such as in our lab, that can recognize much larger sets of classes, and I am particularly interested in 3D extensions of this method as well as other classification algorithms I have been brainstorming, which I look forward to implementing.</div>
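<div><br /></div><div>Here is that classify-then-reject step as I understand it, assuming already-trained weights (a toy illustration, not Rubine's code):</div><div><br /></div><pre>
# Toy classify-then-reject: linear scores per class, then the
# probability estimate P = 1 / sum_j exp(v_j - v_best), rejecting
# low-confidence gestures.
import math

def classify(features, weights, biases, reject_below=0.95):
    """weights: dict class -> per-feature weight list (already trained);
    biases: dict class -> bias term. Returns a class name or None."""
    scores = {name: biases[name] + sum(w * f for w, f in zip(ws, features))
              for name, ws in weights.items()}
    best = max(scores, key=scores.get)
    p_correct = 1.0 / sum(math.exp(v - scores[best]) for v in scores.values())
    return best if p_correct >= reject_below else None  # None = rejected
</pre>Drewhttp://www.blogger.com/profile/08854974777586455891noreply@blogger.com0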