A Robot That Mimics Human Speech
Kotaro Fukui - kotaro@toki.waeda.jp
Kazufumi Nishikawa, Toshiharu Kuwae, Masaaki Honda, Atsuo Takanishi
Waseda University
59-308, Ookubo, Shinjuku-ku, Tokyo, 169-8555, Japan
Hideaki Takanobu
Kogakuin University
Takemi Mochida
Communication Science Laboratories, Nippon Telegraph and Telephone Corporation
Popular version of paper 4aSC1
Presented Thursday morning, May 19, 2005
149th ASA Meeting, Vancouver, BC
1. Introduction
The purpose of this research is to investigate the human vocal mechanism from
an engineering viewpoint by reproducing the entire process of articulated speech
with a talking robot. A mechanical talking robot has several engineering
applications, such as audio-visual talking heads, medical support devices for
vocally challenged people, and lifelike learning devices for foreign languages.
2. WT-4 (Waseda Talker No. 4)
We developed an anthropomorphic talking robot, WT-4 (Waseda Talker No. 4), to produce
human speech. WT-4 consists of lungs, vocal cords and articulators (tongue,
lips, teeth, nasal cavity and soft palate). Together these parts have 19 degrees
of freedom (DOF), with each DOF indicating an independent direction of movement.
The lips and the tongue are made of elastic material and controlled by a looped
wire mechanism (see below). The elastic material and its framework can deform
by large amounts, just as the human tongue and lips can, and the elastic material
prevents air and sound leaks. The robot uses its lungs to power its sounds. The
vocal cords contract to produce voiced consonants such as the sound /b/ in "bus."
The vocal cords open to produce voiceless consonants such as the sound /k/ in
"walk." The robot's lips, teeth, tongue, nasal cavity and soft palate
are all constructed to move just like their human counterparts. WT-4 can also produce
other phonetic sounds, such as stops, fricatives and nasal consonants, in all Japanese
syllables, as well as the five Japanese vowels, with intelligible sound quality.
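The division of labor described above (lungs and vocal cords supply a sound source, and the articulators shape it) loosely resembles the source-filter view of speech. The following Python sketch is only a digital analogy under stated assumptions, not a description of how the mechanical robot works. The function names, the sampling rate, the 100 Hz bandwidth and the resonance frequencies (700 Hz and 1200 Hz) are illustrative choices that do not come from the paper.

```python
import math
import random


def sound_source(n_samples, fs, pitch_hz=None, seed=0):
    """Return a toy sound source.

    Voiced (pitch_hz given): an impulse train, standing in for vibrating cords.
    Voiceless (pitch_hz is None): white noise, standing in for turbulent air flow.
    """
    if pitch_hz is None:
        rng = random.Random(seed)
        return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]
    period = round(fs / pitch_hz)  # samples between pulses
    return [1.0 if n % period == 0 else 0.0 for n in range(n_samples)]


def resonator(signal, freq_hz, bandwidth_hz, fs):
    """Two-pole digital resonator: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * math.pi * freq_hz / fs          # pole angle from frequency
    a1 = 2.0 * r * math.cos(theta)
    a2 = -r * r
    y1 = y2 = 0.0
    out = []
    for x in signal:
        y = x + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesize(n_samples, fs, resonances_hz, pitch_hz=None):
    """Pass the source through one resonator per assumed resonance frequency."""
    sound = sound_source(n_samples, fs, pitch_hz)
    for freq in resonances_hz:
        sound = resonator(sound, freq, 100.0, fs)  # 100 Hz bandwidth is assumed
    return sound


# Assumed, illustrative values: 0.1 s at 8 kHz, 100 Hz pitch, two resonances.
voiced_sound = synthesize(800, 8000, (700.0, 1200.0), pitch_hz=100.0)
voiceless_sound = synthesize(800, 8000, (700.0, 1200.0))
```

Switching `pitch_hz` between a number and `None` plays the role of the vocal cords contracting or opening, while the list of resonance frequencies stands in for the shape set by the articulators.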
Figure: WT-4 (section view)
Figure: Vibration of WT-4's vocal cords (high-speed camera, 1000 fps)
3. Mimicking Mechanism
WT-4 not only talks; it also hears and imitates sounds autonomously. The robot tracks
acoustic information from a human speaker, then generates sound through its vocal
mechanisms. By imitating in this way, it can repeat sentences as a person talks (this
is illustrated in the last video demonstration below). The talking robot produces
vowel and consonant sounds by mimicking the vibration of the vocal cords and the
generation of fricative and plosive sound sources by air flow, as well as through
dynamically controlled acoustic resonance. Articulatory control of the talking robot
is designed to track the acoustic goals of the speech (pitch, sound power, two formant
frequencies, and voiced/unvoiced timing). This mimicking speech control proved
effective in producing fluent continuous speech with the talking robot.
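The idea of tracking acoustic goals can be illustrated with a small feedback loop. The Python sketch below is a hypothetical toy, not the control method used for WT-4, which this summary does not detail. The `AcousticGoal` fields mirror the goals listed above. The linear `toy_robot` model, its four actuator commands, all gain values and the `track` function are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class AcousticGoal:
    pitch_hz: float
    power_db: float
    f1_hz: float   # first formant frequency
    f2_hz: float   # second formant frequency
    voiced: bool


# Hypothetical toy plant: each acoustic feature depends linearly on one
# actuator command in [0, 1]. Real articulatory acoustics are far more coupled.
TRUE_OFFSET = (80.0, 40.0, 250.0, 800.0)
TRUE_GAIN = (120.0, 30.0, 600.0, 1600.0)
# The controller only has a rough estimate of the gains (deliberately inexact).
ASSUMED_GAIN = (100.0, 25.0, 500.0, 1500.0)


def toy_robot(commands):
    """Return (pitch, power, F1, F2) 'heard' for the given actuator commands."""
    return tuple(o + g * c for o, g, c in zip(TRUE_OFFSET, TRUE_GAIN, commands))


def track(goal, commands, steps=20, rate=0.5):
    """Repeatedly listen, compare with the goal, and nudge each command."""
    target = (goal.pitch_hz, goal.power_db, goal.f1_hz, goal.f2_hz)
    commands = list(commands)
    for _ in range(steps):
        heard = toy_robot(commands)
        for i in range(4):
            if i == 0 and not goal.voiced:
                continue  # unvoiced frame: there is no pitch to track
            error = target[i] - heard[i]
            commands[i] += rate * error / ASSUMED_GAIN[i]
            commands[i] = min(1.0, max(0.0, commands[i]))  # actuator limits
    return commands, toy_robot(commands)


# Illustrative goal values, not measurements from the paper.
goal = AcousticGoal(pitch_hz=120.0, power_db=60.0, f1_hz=700.0, f2_hz=1200.0,
                    voiced=True)
final_commands, heard = track(goal, [0.5, 0.5, 0.5, 0.5])
```

Because the loop corrects against what is actually heard, it settles on the goal even though the assumed gains differ from the true ones. That is the general appeal of closing the loop in acoustic space rather than relying on a fixed articulatory table.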
4. Demonstration of Talking Robot WT-4
Click the following pictures to see movies of the talking robot.