ASA PRESSROOM


Acoustical Society of America
157th Meeting Lay Language Papers



Talking and Singing With Your Hands

Sidney Fels - ssfels@ece.ubc.ca
Electrical and Computer Engineering, Univ. of British Columbia
2332 Main Mall, Vancouver, BC, Canada, V6T 1Z4

Bob Pritchard - bob@interchange.ubc.ca
Eric Vatikiotis-Bateson - evb@interchange.ubc.ca
University of British Columbia
Vancouver, BC, Canada, V6T 1Z2

Popular version of paper 1aSCa2
Presented Monday morning, May 18, 2009
157th ASA Meeting, Portland, OR

We have created a wearable system that allows a person to talk and sing just by moving his or her hands. We call this system a digital ventriloquized actor, or DIVA for short, since it is reminiscent of a person moving a puppet's mouth to make it speak. Figure 1 shows a picture of a performer wearing a DIVA. With a DIVA, the performer shapes sounds with her hands to generate artificial speech and song. Unlike sign language, though, playing speech using a DIVA is closer to playing a musical instrument: each movement causes vocal sounds. If played wrong, it sounds like a baby babbling. If played well, the sounds form intelligible speech and song in whatever language the performer wants to speak.

Figure 1. A performer using a DIVA. Notice that the two gloves she is wearing contain sensors.

We do this by having a performer think of her hands as a mouth. By moving her fingers and arms like a jaw moving up and down and a tongue moving around in her mouth, she forms the shapes of sounds, just as a normal vocal tract does. Our system converts these movements into control signals that drive a speech synthesizer directly. We use a parallel-formant speech synthesizer created by John Holmes, which requires a set of frequencies and amplitudes every 10 ms to produce speech.
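As a rough illustration of what "a set of frequencies and amplitudes every 10 ms" can mean in practice, the sketch below renders a stream of parameter frames with a bank of simple resonators connected in parallel. It is not the Holmes synthesizer: the sample rate, the Frame fields, the fixed bandwidth, and the impulse-train source are all assumptions chosen to keep the example short.

```python
import math
from dataclasses import dataclass

SAMPLE_RATE = 16000                                   # Hz (assumed)
FRAME_MS = 10                                         # one parameter frame every 10 ms
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000    # 160 samples per frame


@dataclass
class Frame:
    """Hypothetical control frame: one is consumed every 10 ms."""
    f0: float      # pitch of the voiced source, in Hz
    freqs: tuple   # formant centre frequencies, in Hz
    amps: tuple    # one linear gain per formant


def synthesize(frames, bandwidth=80.0):
    """Render frames through two-pole resonators wired in parallel."""
    n_formants = len(frames[0].freqs)
    # Each resonator remembers its two previous outputs: [y[n-1], y[n-2]].
    state = [[0.0, 0.0] for _ in range(n_formants)]
    r = math.exp(-math.pi * bandwidth / SAMPLE_RATE)  # pole radius, below 1
    phase = 0.0
    out = []
    for frame in frames:
        for _ in range(SAMPLES_PER_FRAME):
            # Impulse-train excitation at the frame's pitch.
            phase += frame.f0 / SAMPLE_RATE
            if phase >= 1.0:
                phase -= 1.0
                x = 1.0
            else:
                x = 0.0
            sample = 0.0
            for k, (freq, amp) in enumerate(zip(frame.freqs, frame.amps)):
                theta = 2.0 * math.pi * freq / SAMPLE_RATE
                y1, y2 = state[k]
                # (1 - r) is only a rough input scaling for the resonator.
                y = (1.0 - r) * x + 2.0 * r * math.cos(theta) * y1 - r * r * y2
                state[k] = [y, y1]
                sample += amp * y      # parallel branches are summed
            out.append(sample)
    return out


# Half a second of a steady vowel-like sound (illustrative formant values).
frames = [Frame(120.0, (700.0, 1200.0, 2600.0), (1.0, 0.5, 0.25))] * 50
audio = synthesize(frames)
```

In a gesture-driven setting, each new Frame would be filled in from the latest sensor readings rather than held constant as it is here.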

Each player has a unique style of playing a DIVA and shaping sounds. At first, a performer learns to speak using a rough guideline of the hand movements needed to make speech sounds, such as moving her index finger over her thumb to mimic her tongue tip moving along her hard palate. As she progresses, she can adjust the relationship between her hand movements and the sounds through an adaptive system. The adaptive system allows the performer to provide examples of the personal hand movements she likes for each sound, giving her a personal accent. Using these examples, the system learns the relationship between gestures and speech sounds for each performer. This adaptive system makes learning to speak and sing with a DIVA easier. We provide each player with an accent, as well as other specific information about her preferences, to make it easier for her to learn and sing. With practice, performers are able to speak intelligibly in whichever language they prefer, as well as sing.
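One simple way to picture learning a gesture-to-sound relationship from a performer's own examples is distance-weighted interpolation: a new hand pose produces a blend of the example sounds, weighted by how close the pose is to each example gesture. The sketch below uses that stand-in technique. It is an assumption for illustration and is not claimed to be the learning method used in the DIVA; the two-number gesture vectors, the formant targets, and the width parameter are all hypothetical.

```python
import math


def make_mapper(examples, width=0.3):
    """Build a gesture -> sound function from (gesture, sound) example pairs."""
    def mapper(gesture):
        # Squared distance from the new gesture to every example gesture.
        d2 = [sum((a - b) ** 2 for a, b in zip(gesture, g)) for g, _ in examples]
        nearest = min(d2)
        # Gaussian weights; subtracting the nearest distance avoids underflow
        # and leaves the normalized weights unchanged.
        weights = [math.exp(-(d - nearest) / (2.0 * width ** 2)) for d in d2]
        total = sum(weights)
        dims = len(examples[0][1])
        return [
            sum(w * sound[i] for w, (_, sound) in zip(weights, examples)) / total
            for i in range(dims)
        ]
    return mapper


# Hypothetical examples from one performer: a normalized hand position
# (x, y) paired with target first and second formant frequencies in Hz.
examples = [
    ((0.0, 1.0), (270.0, 2290.0)),   # an "ee"-like vowel
    ((1.0, 0.0), (730.0, 1090.0)),   # an "ah"-like vowel
]
mapper = make_mapper(examples)
at_example = mapper((0.0, 1.0))      # very close to the "ee" targets
halfway = mapper((0.5, 0.5))         # an even blend of the two vowels
```

Giving the mapper a different set of examples changes the sounds that the same hand pose produces, which is one way to think about a personal accent.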

Technically, a DIVA consists of two specially instrumented gloves and a hand tracker that the performer wears. All the components are shown in Figure 2. The right-hand glove is a CyberGlove, which measures the bend angles of the performer's hand movements. The performer is wearing this device on her right hand in Figure 1. Attached to the CyberGlove is a Polhemus Patriot receiver that measures the position of the performer's right hand relative to a transmitter that she wears on her belt, shown in Figure 3. Movements of her right hand control vowels and most consonant sounds, like F, V, SH, and ZH, as well as pitch. The left-hand glove, shown in Figure 4, is a TouchGlove, which has eight contact pads used for creating plosive sounds like B, D, G, J, P, T, K, and CH when touched by her thumb. The performer also has a foot switch to turn sound on and off. All of the sensor data goes to a computer in her backpack, which runs software we wrote to convert the sensor data to sound. We implemented the software in a visual programming language called Max/MSP. The software produces sound that is played through a speaker worn by the performer, which acts as her second mouth. During performance, we also use a wireless audio transmitter that allows her to speak through a stage sound system.
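The routing from sensors to sound controls described above can be sketched in a few lines. This is only an illustration written in Python rather than Max/MSP, and several details are assumptions: the text does not say which pad triggers which plosive, so the pad order below is hypothetical, as are the height range, the pitch range, the exponential pitch scale, and the function names.

```python
# The eight plosive sounds named in the text. Which pad produces which
# sound is NOT given in the source; this pad order is an assumption.
PLOSIVES = ("B", "D", "G", "J", "P", "T", "K", "CH")
PAD_TO_PLOSIVE = dict(enumerate(PLOSIVES))   # pad index 0..7 -> plosive


def height_to_pitch(height_m, low_m=0.0, high_m=0.5, f_low=80.0, f_high=400.0):
    """Map right-hand height to pitch in Hz (all ranges are assumed)."""
    t = (height_m - low_m) / (high_m - low_m)
    t = min(1.0, max(0.0, t))                # clamp to the tracked range
    # Exponential interpolation, so equal hand movements give equal
    # musical intervals.
    return f_low * (f_high / f_low) ** t


def control_frame(touched_pad, height_m, sound_on):
    """Combine left-hand pad, right-hand height, and foot switch state."""
    if not sound_on:                         # foot switch has turned sound off
        return {"sound_on": False, "plosive": None, "pitch_hz": None}
    return {
        "sound_on": True,
        "plosive": PAD_TO_PLOSIVE.get(touched_pad),  # None if no pad touched
        "pitch_hz": height_to_pitch(height_m),
    }


speaking = control_frame(touched_pad=0, height_m=0.25, sound_on=True)
silent = control_frame(touched_pad=3, height_m=0.10, sound_on=False)
```

The vowel and non-plosive consonant controls from the right hand are left out here to keep the sketch small.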

Figure 2. All the parts of a DIVA. There are two sensor gloves, a hand tracker, a foot pedal, a computer, a speaker, and a harness to hold everything.

Figure 3. The gray cube on her belt is the transmitter used for tracking the position of a receiver coil attached to her right-hand glove. Large movements of the hand are like moving the tongue body to create vowel sounds. Also, the height of the performer's right hand controls pitch.

Figure 4. Close-up of the TouchGlove that has eight pads on the fingers that the performer touches with her thumb to make plosive sounds.

DIVAs will be used in three composed stage works of increasing complexity performed internationally, starting with one performer and culminating in three performers simultaneously using their natural voices as well as the hand-based synthesizer. Our first performance was in Kyoto, Japan, in January 2009. Training performances will be used to study the processes associated with skill acquisition, the coordination of multiple voices within and among performers, and the intelligibility and realism of this new form of audio-visual speech production. We are also building a robotic face and a computer-graphics face that will be gesture-controlled and synchronized with the speech and song.

Using DIVAs, we have created a new way for a person to speak and sing using their hands. Performers shape the sounds with their hands to produce speech. This technique provides a new type of sound gesture language that can be used for performing, for understanding speech acquisition, and for exploring new ways to think about communication techniques for people with speech disabilities.

This project is funded by the Canada Council for the Arts and the Natural Sciences and Engineering Research Council of Canada. More information is at: www.magic.ubc.ca/VisualVoice.htm.

Video example of speaking with DIVA (Japanese, with subtitles) - http://www.magic.ubc.ca/artisynth/uploads/VisualVoice/DIVAJapanese.m4v
Video sample of first demonstration of DIVA (Kyoto, Japan, Jan. 2009) - http://www.magic.ubc.ca/artisynth/uploads/VisualVoice/DIVA-Performance-Japan09HQ.mov

