Non-contact Method for Producing Tactile Sensation Using Airborne Ultrasound (2008)

Takayuki Iwamoto, Mari Tatezono, and Hiroyuki Shinoda
The University of Tokyo
Comments: Manoj and others...

This paper is an introduction to the idea that ultrasonic transducers can be used for airborne interaction. The idea is that a person can interact with a virtual 3D object by manipulating it with the bare hand. The paper gives examples of hand-tracking systems that create 3D models of the hand, though the authors have not yet done work in that area.

This paper mostly discusses the feasibility of using ultrasonic vibrations to give tactile feedback to the user. Most of the paper gives technical details about the hardware and some calculations of the sound pressure the user is likely to feel.

The paper gives results of a setup with an electronic balance and a microphone to determine the force and resolution of a prototype the authors have created. They also report that people can feel the pulses at 250-300 mm above the device.
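
As a rough sketch of the kind of sound-pressure calculation involved, the code below assumes the commonly used radiation-pressure model P = alpha * p^2 / (rho * c^2), where p is the RMS sound pressure, rho is the density of air, c is the speed of sound, and alpha is a constant between 1 and 2 that depends on how much of the wave the surface reflects. The function names and default values are my own illustrative assumptions, not figures taken from the paper.

```python
def spl_to_pascal(spl_db, p_ref=20e-6):
    """Convert a sound pressure level in dB SPL to RMS pressure in Pa."""
    return p_ref * 10 ** (spl_db / 20)


def radiation_pressure(p_rms, rho=1.2, c=340.0, alpha=2.0):
    """Radiation pressure in Pa under the assumed model P = alpha * p^2 / (rho * c^2).

    rho (kg/m^3) and c (m/s) are approximate values for air; alpha=2.0
    assumes a perfectly reflecting surface. All defaults are illustrative.
    """
    return alpha * p_rms ** 2 / (rho * c ** 2)


def force_in_grams(pressure_pa, area_m2, g=9.81):
    """Force on a flat area, expressed in gram-force as a balance would read it."""
    return pressure_pa * area_m2 / g * 1000.0
```

Chaining the three functions turns a microphone reading in dB into a number that can be compared with what an electronic balance shows, which is roughly how the two measurements in the setup relate to each other.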

Since the ultrasonic device is deemed a success, the next step of this research is to create a 3D interaction system that uses the device to give tactile feedback.

This video shows some of the more recent work, including hand tracking and holographic display.


I think this system is an interesting idea. I personally have been thinking about methods of interaction with 3D objects using direct manipulation with the hand and providing some form of tactile feedback. The most obvious solution is the CyberTouch glove, which the authors of this paper denounce since it is always in contact with the hand even when not vibrating. I am interested in reading more papers in this area, and I hope we get to some more in this course.

Computer Vision-Based Gesture Recognition for an Augmented Reality Interface (2004)

Storring, et al. paper

This paper talks about gesture recognition using vision-based techniques to recognize 6 hand gestures for use in augmented reality.

Because the gestures are recognized using a vision based approach, the only hardware needed is a standard camera. The user does not need to wear a glove.

The hand gestures are recognized by manipulating the image of the hand's pixels in such a way that the hand becomes white and the background becomes solid black, even if there are other objects in the background.
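
To make the white-hand, black-background idea concrete, here is a minimal sketch of one generic way to segment skin pixels: thresholding on normalized color (chromaticity). I am assuming this approach for illustration; the threshold ranges and the function name `skin_mask` are hypothetical placeholders, not the values or method from the paper.

```python
def skin_mask(image, r_range=(0.36, 0.60), g_range=(0.25, 0.37)):
    """Return a binary mask (1 = hand/skin, 0 = background).

    image is a list of rows, each row a list of (R, G, B) tuples.
    The chromaticity ranges are illustrative assumptions only.
    """
    mask = []
    for row in image:
        mask_row = []
        for (R, G, B) in row:
            total = R + G + B
            if total == 0:
                # Pure black has no defined chromaticity; treat as background.
                mask_row.append(0)
                continue
            r, g = R / total, G / total
            is_skin = (r_range[0] <= r <= r_range[1]
                       and g_range[0] <= g <= g_range[1])
            mask_row.append(1 if is_skin else 0)
        mask.append(mask_row)
    return mask
```

Normalizing by R + G + B removes most of the brightness from the comparison, which is why this kind of test can be less sensitive to lighting than thresholding raw RGB values.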

To count the number of fingers that are outstretched, concentric circles about the hand are examined.
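
One plausible reading of the concentric-circle idea is to walk around a circle centered on the hand and count how many separate white segments it crosses. The sketch below does that for a single circle on a binary mask (1 = hand, 0 = background). The function name, the sampling density, and the idea of counting background-to-hand transitions are my assumptions, not details from the paper.

```python
import math


def count_runs_on_circle(mask, center, radius, samples=360):
    """Count separate hand segments crossed by a circle on a binary mask.

    mask is a list of rows of 0/1 values, center is (row, col).
    Points that fall outside the image count as background.
    """
    cy, cx = center
    values = []
    for i in range(samples):
        angle = 2 * math.pi * i / samples
        y = int(round(cy + radius * math.sin(angle)))
        x = int(round(cx + radius * math.cos(angle)))
        inside = 0 <= y < len(mask) and 0 <= x < len(mask[0])
        values.append(mask[y][x] if inside else 0)
    # A new segment starts wherever background (0) is followed by hand (1).
    # values[i - 1] wraps around for i == 0, closing the circle.
    return sum(1 for i in range(samples)
               if values[i - 1] == 0 and values[i] == 1)
```

On a real hand the wrist would also cross the circle, so the caller would need to discount that segment, and checking several radii would make the count more robust than relying on one circle.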

One extended finger is recognized as a pointing gesture, and a transition between three states is used for a click gesture, in which the thumb is used to "click."
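
A three-state click like this can be sketched as a small sequence detector over per-frame gesture labels. I don't know the exact states the paper uses, so the labels below ("thumb_out", "thumb_in", "thumb_out") and the function name are purely hypothetical.

```python
def detect_clicks(frames, pattern=("thumb_out", "thumb_in", "thumb_out")):
    """Return the frame indices at which the three-state pattern completes.

    frames is a list of per-frame gesture labels. The default pattern is
    an assumed example, not the state sequence from the paper.
    """
    # Collapse consecutive duplicates, remembering where each state begins.
    collapsed = []
    for i, label in enumerate(frames):
        if not collapsed or collapsed[-1][0] != label:
            collapsed.append((label, i))

    clicks = []
    for k in range(2, len(collapsed)):
        window = tuple(label for label, _ in collapsed[k - 2:k + 1])
        if window == pattern:
            clicks.append(collapsed[k][1])
    return clicks
```

Collapsing repeated labels first means the detector does not care how long the user holds each state, only the order in which the states occur.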

The generality of this algorithm means people with varying hands can use it. Also, it works for both left and right hands.

The author claims that this gesturing method was easily understood and used in a simple AR application, which I don't feel like describing...


I like the vision-based approach used in this method. The CyberGlove is not easy to wear for a long period of time, so I like any solution that works with a bare hand as input. I have not studied any computer-vision methods, so this is new to me. I was wondering what the resolution was of the camera they use, because it seems that doing per-pixel calculations would take a long time.

I also wonder how sensitive the method is to different skin tones. They mentioned an initialization step, but I don't remember whether they specified that it included setting the skin color.

Motion Editing with Data Glove (2004)

Lam, Zou, and Komura paper

Comments: Franck and Sashi also Manoj... Murat too

I'm sure we have all made hand puppets in which we use two fingers as legs. Lam, et al. have used this idea to map this motion to 3D models to control animation. Specifically, they use the finger-leg motions to manipulate previously-captured motion data. This can potentially turn an existing walking animation into a running, dancing, or hopping animation.

The finger motion data was captured using a P5 glove. This glove does not provide as good or as much data as our CyberGloves, but it is adequate for this research, especially when considering the limitations of human fingers themselves when compared to legs.

The human body has many more degrees of freedom than the fingers, which the researchers acknowledge and try to compensate for in their mapping function. The mapping function they have implemented converts the finger-leg motion into full body motion. To control things like arms, simple, common rules are used, such as the opposite movement of the arms to the legs when walking or running. As expected, this method produces unnatural animations. The researchers propose some methods of remedying this for future work.
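
A minimal sketch of what such a mapping might look like, assuming a simple linear rescaling from finger flexion to leg angle plus the "arms move opposite to the legs" rule. The angle ranges, the `arm_gain` factor, and the function names are my own illustrative assumptions and not the mapping function from the paper.

```python
def map_finger_to_leg(finger_deg, finger_range=(0.0, 90.0),
                      leg_range=(-30.0, 60.0)):
    """Linearly rescale a finger flexion angle to a leg (hip) angle.

    Both ranges are assumed values in degrees; input is clamped to
    the finger range before rescaling.
    """
    lo, hi = finger_range
    t = (min(max(finger_deg, lo), hi) - lo) / (hi - lo)
    return leg_range[0] + t * (leg_range[1] - leg_range[0])


def pose_from_fingers(index_deg, middle_deg, arm_gain=0.5):
    """Build a tiny pose: legs follow the fingers, arms swing opposite."""
    left_leg = map_finger_to_leg(index_deg)
    right_leg = map_finger_to_leg(middle_deg)
    return {
        "left_leg": left_leg,
        "right_leg": right_leg,
        # Rule-based arms: each arm swings opposite to the leg on its side.
        "left_arm": -arm_gain * left_leg,
        "right_arm": -arm_gain * right_leg,
    }
```

Because the arms are a fixed function of the legs here, they can never do anything the rule does not anticipate, which gives a feel for why purely rule-based arm motion tends to look unnatural.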

A small experiment was described in which an animator manipulated the motion-capture data by performing walking, running, and hopping motions with the hand. A complete analysis was not presented, nor was a complete description of the users.


I think this is an interesting concept that can be expanded on. I like the fact that Lam tries to tackle the problem of key-frame, static animation by introducing the dynamic, real-time finger controlled animation. I think this method would be good for establishing timing and rhythm, but I think the animations could be more defined by physics-based calculations rather than a direct mapping of the hand to the body.

EyePoint: Practical Pointing and Selection Using Gaze and Keyboard (2007)

Kumar, et al. Paper

Comments: Frank and Murat also Manoj

EyePoint™ is a gaze-based system for mouse cursor control that aims to provide a worthwhile alternative to the mouse to the degree that the average computer user might choose this system over a mouse, depending on the task at hand. Kumar wants to move eye tracking systems out of the domain of disabled users and into everyday computing with normal people.

Design of the system

EyePoint's design incorporates four principles:

1. They do not want to "overload the visual channel for pointing," i.e. don't map interaction directly to eye movement.

2. They aim to increase selection accuracy by including zooming/magnification in such a way that it enhances accuracy but does not inhibit performance. The solution to this is to magnify a square around the gaze area and overlay a grid of orange dots, called "focus points," to enhance fine selection. The magnification avoids distortion of the interface that some techniques, such as fish-eye magnification, introduce. The magnified area is partially transparent as well, to avoid obscuring the interface, as some zooming techniques do.

3. They want to reduce jitter, which they do with "fixation detection and [a] smoothing algorithm."

4. The selection method must be "fluid" and simple in order to make fast selections while maintaining usability by both disabled and normal functioning people. The solution to this is to perform selections with some "hotkeys" (corresponding to click, double click, right click, mouse over, start click-and-drag, and end click-and-drag) on a standard keyboard. The thought is that pressing keys is faster and better than dwell and zooming selection methods.
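
The jitter-reduction principle can be illustrated with a generic sketch: average the recent gaze samples while the eye stays within a small distance (a fixation), and restart the average when the gaze jumps (a saccade). This is an assumed, simplified approach with made-up parameter values, not the fixation detection or smoothing algorithm EyePoint actually uses.

```python
def smooth_gaze(samples, max_dist=40.0, window=5):
    """Smooth raw gaze samples with a fixation-aware moving average.

    samples is a list of (x, y) screen points. max_dist (pixels) and
    window (samples) are illustrative assumptions.
    """
    smoothed = []
    current = []  # samples belonging to the current fixation
    for x, y in samples:
        if current:
            cx = sum(p[0] for p in current) / len(current)
            cy = sum(p[1] for p in current) / len(current)
            if ((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 > max_dist:
                current = []  # large jump: treat as a saccade, start over
        current.append((x, y))
        current = current[-window:]  # keep only the most recent samples
        smoothed.append((sum(p[0] for p in current) / len(current),
                         sum(p[1] for p in current) / len(current)))
    return smoothed
```

Resetting on a large jump keeps the smoothed point from lagging behind when the user deliberately looks somewhere new, while small tremors within a fixation are averaged away.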

This system uses a Tobii 1750 eye tracking system.

Usage of the EyePoint software

1. The user looks at the target on the screen.
2. The user presses down one of the hotkeys (click, double click, right click, mouse over, start click-and-drag, and end click-and-drag)
3. A square around the estimated gaze point is magnified in the manner described above.
4. The user looks at the target in the magnified area (this helps enhance the selection)
5. The user releases the hotkey
6. The action of the hotkey is performed
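
The look, press, look, release steps above can be sketched as a coordinate mapping: the second gaze point lands in the magnified view, and it has to be scaled back to the original screen location before the action is performed. The sketch assumes the magnified view is centered on the first gaze estimate; the zoom factor, region size, and function names are hypothetical.

```python
def resolve_target(first_gaze, second_gaze, zoom=4.0, region=60.0):
    """Map a gaze point in the magnified view back to screen coordinates.

    first_gaze is the estimate when the hotkey is pressed, second_gaze
    the estimate when it is released. region is the side length (pixels)
    of the original square that was magnified. Values are assumptions.
    """
    fx, fy = first_gaze
    sx, sy = second_gaze
    half = region / 2
    # Offsets shrink by the zoom factor and are clamped to the region.
    dx = max(-half, min(half, (sx - fx) / zoom))
    dy = max(-half, min(half, (sy - fy) / zoom))
    return (fx + dx, fy + dy)


def on_hotkey_release(action, first_gaze, second_gaze):
    """Return the action and the screen point it should be applied to."""
    return action, resolve_target(first_gaze, second_gaze)
```

Dividing the offset by the zoom factor is what buys the extra accuracy: an eye tracker error of 40 pixels in the magnified view corresponds to only 10 pixels on the real screen with a 4x zoom.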

The researchers think that this system can equally be used by able-bodied and disabled users. They propose changing the hotkeys to some other input device depending on the situations of the users. No testing was done in this area, however.

User Study and evaluation

A user study was conducted with 20 able-bodied participants who were very experienced computer users.

A quantitative evaluation was performed which had the users perform specific tasks, including navigating through web pages and clicking targets. Data including timing and error rates were collected. The tasks were varied to account for any learning curves.

A qualitative evaluation was also performed using a questionnaire to get users' opinions of the EyePoint system compared to a standard mouse input.


Web browsing:
The "focus points" did not help with timing or accuracy.
The average time to click a target with EyePoint was about 400 ms slower than a mouse.
The mouse had a lower error rate by about 10% in the normal EyePoint tests.
Users were divided on which system they thought was faster and easier to use (mouse or EyePoint).
75% of the users said they might choose EyePoint in certain situations.
Most users preferred the focus points.

Target selection:
EyePoint is only marginally slower than the mouse for selection.
EyePoint has much higher error rates than the mouse.
Users generally preferred EyePoint for its ease of use despite its lower performance.


It was found that EyePoint is similar in speed to the mouse. The error rates for EyePoint vary from user to user. The study participants preferred EyePoint to the mouse.

The researchers conclude that it is possible and practical for an eye tracking system to be used by the common computer user in place of the mouse.



I like this eye tracking system the best of the papers we have read so far this semester. I have thought of using eye tracking to augment the mouse in some way, which would mean a system is mostly mouse-based with some eye help. This system is the opposite. It is an eye-based system with some hand click help. This is pretty cool, and the speed and accuracy results look a lot better than some of the other work we looked at.

I am most intrigued by the magnification solution to enhance accuracy and reduce error by introducing a second focusing saccade at the cost of speed. I do think emphasizing accuracy over speed was a good move by Kumar, et al. since the mouse is already pretty darn fast and the time to actually focus on a target doesn't seem that it could be any faster than the mouse, at least for experienced computer users.

The EyePoint system might increase selection speed for computer novices who might not be used to using a mouse and therefore use it more slowly than we experienced users do. After all, not all people are experienced computer users, but nearly all people can see.


I have a few ideas about how we can aid the instruction of salsa dancing using the tools we have available.

We can use the HMDs to overlay some sort of 3d model, whether it be in first person or just showing a 3d model floating before the user. This way, the user can get real-time visual feedback as he naturally moves his head. Because the head is probably not going to remain still while dancing, this allows the user to constantly receive feedback without interrupting the dance. If feedback was given on a separate monitor, it would interrupt the dancing for the user to look at the screen.

We could use the eye tracker to determine which direction the user is looking to give feedback or keep the user from being distracted.

The vibrating gloves can be used in a variety of ways. They could vibrate when the user needs to do something, like grab his partner or perform a specific move.

---------- ---------- ----------

I also have a few ethnography suggestions for the undergrad CHI class.

First of all, they could scan through a list of student groups on campus and pick one that looks interesting. They could then attend a meeting (with permission if required, but hopefully anonymously) and observe the group.

Another possibility lies in the on-campus parking lots. The students could find a good vantage point to see a large portion of a particular lot (such as lot 50 since it is large and full during the days) and observe drivers' behaviors.

Similarly, they could watch the crosswalks to observe the students' behaviors in that situation.

VARK Questionnaire

So I filled out the VARK questionnaire that Josh e-mailed our class about. This questionnaire is supposed to give an idea of how you learn (Visual, Aural, Read/Write, Kinesthetic). Here are my results:

Visual: 6
Aural: 8
Read/Write: 8
Kinesthetic: 8

I was surprised at my results, as I thought I was mostly a visual learner. According to this, I am pretty equal across the board, but visual is slightly lower. The first thing I thought of to perhaps explain this is that I like to learn from demonstration, which I suppose can be a combination of these, and I might think of that as visual. However, this questionnaire is short, and obviously isn't going to be as accurate as a professional evaluation. This makes me wonder what a thorough examination of myself would say... perhaps I don't know as much about myself as I think I do, or perhaps I perceive myself differently from reality...

Lab Day!

So today we went into the lab to see some of the devices we have available for this class.

3D Display Glasses.

These glasses have 2 cameras in front and 2 displays, one for each eye. A control box is attached that has video outputs and inputs so a computer can process and alter/augment the user's vision.

This device was overall pretty easy to use. I was able to instantly function normally in an office or lab setting, though I may be impaired in other settings.

I can immediately see the benefits of these glasses, and I have a few ideas of applications. The first thing I thought of when I put them on was 3D object manipulation using augmented reality techniques. I'm sure many of us have seen the demos with the papers with markers on them that show a 3D model on the paper when viewed with a webcam on a computer. This could make those kinds of things feel more real, since it would be as if the 3D model is actually in the user's hand instead of on a computer monitor.

On a more whimsical note, I think these glasses could be coupled with facial recognition software and Facebook to allow the user to identify people they know on Facebook. Information about that person could be displayed around the head or body so the user can remember things about that person, such as a favorite food, color, or the person's name. This could help people get to know each other.

I did notice three possible problems. First, depth perception is altered while wearing the device. Everything seems farther away. While I was fine and adjusted to this quickly, other people may not be able to adapt or might become impaired by this. The long-term effects of this altered depth perception might be detrimental as well, but that would require more testing. Second, the resolution of the device limits the details the user can see. The biggest problem I noticed with this was the difficulty of reading small or far away text. Third, peripheral vision is blocked by the glasses. This, coupled with the extended depth perception, gives the user a kind of tunnel vision which makes it difficult to walk around. The biggest problem with this is the difficulty of seeing your feet while wearing it, so it is easy to trip over things. I also experienced problems going around corners, as my shoulder hit a few times.

Josh had us use the 3D glasses and fill out a questionnaire. We went through some actions to get a feel for the device and help Josh understand the device's capabilities.

Eye tracking glasses

These glasses have a camera attached that feeds data to a software application that recognizes the human eye and can determine the location the user is looking.

My impression of the device was mixed. I was impressed with the eye recognition software and that it worked. However, the accuracy was not great, and I had trouble moving the cursor with my eyes. I also had some difficulty positioning the camera so my eye was completely in the viewing window, but I don't know if this is a problem or not. The device also requires that the user's head remain still, which is an impossible task. This device would be awesome if we could calibrate it better and were able to move our heads.

In its current state, I can see this device augmenting other input devices, such as a mouse or glove. I don't think it can currently be used as a standalone input device. For this class, I think we could implement some sort of eye gesture system similar to the EOG glasses we read about. The glasses seem to work fine for coarse input like that, though I had some difficulty with the up-left gesture.

CyberTouch gloves

I have been working with the gloves since last semester. My initial thoughts on the glove were very positive once I got one connected to a computer. The 3D hand in the configuration utility really illustrates what this glove can do, and the vibrotactile feedback adds a new layer of functionality and opens up a new realm of possibilities.

I have thought of a few ways this glove can control the mouse. First of all, it could be combined with a 3D location sensor, such as the Flock of Birds, to create a system for controlling the cursor on a large display (similar to Vogel et al.'s work). The vibrotactile actuators could add the feedback that Vogel was missing from his system. We could also just use the angles of one or more fingers to control the mouse. Finally, we could combine this glove with Josh's 3D glasses and the Flock of Birds to create a 3D interaction augmented reality application, in which 3D objects can be grabbed by the user and manipulated.

I have noticed a few problems with the glove. First, the calibration is not perfect, especially with the thumb. Second, the vibrotactile actuators have 5 discrete levels of vibration intensity instead of adjusting continuously. The lowest level of intensity is still pretty intense as well. I would like to have some gloves with a finer intensity akin to a cell phone vibrotactile actuator.

These are my initial thoughts and ideas for the devices we played with in class today.

Comments: Manoj, Paul