Vocal Cord Model to Produce Human-like Vibrations
Kotaro Fukui - k-fukui@takanishi.mech.waseda.ac.jp
Eiji Shintaku,
Yuma Ishikawa,
Nana Sakakibara,
Yoshikazu Mukaeda,
Atsuo Takanishi,
Masaaki Honda
Waseda University
Popular version of paper 1aSCa1
Presented Monday morning, May 18, 2009
157th ASA Meeting, Portland, OR
In human speech production, airflow from the lungs causes the vocal cords to vibrate, producing a source sound. The resonance characteristics of the vocal tract are controlled by articulating the tongue, the jaw, the lips, and the velum. Using these mechanisms, humans can produce various vowels and consonants. Voices also vary widely in quality, such as “high-pitched,” “breathy,” “falsetto,” and “creaky.” These variations are produced mostly by differences in the vocal cord vibration pattern.
We have developed a series of anthropomorphic talking robots, the Waseda Talker series, which mimics human speech production mechanisms and has mechanical vocal cords and tongues. In the earlier model WT-4 (Waseda Talker No. 4, described in the lay-language paper for the 149th meeting in Vancouver), the vocal cord model consisted of two pieces of thin, soft rubber, and the vocal tract was a two-dimensional mechanism. That robot can produce Japanese vowel and consonant sounds; however, it cannot adequately mimic the movement of the human speech organs, because those organs have a three-dimensional structure. We therefore started developing a three-dimensional talking robot with a vocal cord model and a vocal tract.
The new vocal cord model is made of the thermoplastic rubber Septon (Kuraray Corp.); the material is very strong and highly stretchable. The shape and size of the fold are designed to mimic the vocal cord of an adult male. This vocal cord model can reproduce the vibration of human modal (normal) speech production. The upper and lower parts of the fold vibrate with different phases. This complex vibration is important for producing the human voice.
In WT-7 (Waseda Talker No. 7), we reproduced creaky voice and breathy voice, two characteristic voice qualities in human speech. Creaky voice, also called vocal fry, is a low-pitched, periodic vibration. To reproduce it, we pushed on the lower part of the vocal cord to stabilize the vibration, applied no tension, and drove the vocal cord with very low air pressure. With this method, the vibration has double peaks, and the glottis is open for only a short time in each cycle. (The glottis opens only once during a double-peak vibration.) Breathy voice is a voice with noise caused by breath passing through the glottis. To reproduce it, we held the glottis slightly open. With this method, the glottis retains an open area even during the closed phase of the vibration cycle, as in humans. These vibrations have characteristics similar to those of human vocal cord vibrations.
In the future, we would like to develop a control system that reproduces human speech, including such voice variations.
Figure 1. Three-dimensional talking robot
Figure 2. Three-dimensional vocal cord model mimicking the human shape