Adult imitating child speech: A case study using 3D ultrasound

Colette Feehan – cmfeehan@iu.edu
Steven M. Lulich – slulich@iu.edu
Indiana University

Popular version of paper 2aSC11
Presented Tuesday morning, November 6, 2018
176th ASA meeting, Victoria

Many people do not realize that many of the “child” voices they hear in animated TV shows and movies are actually produced by adults.¹ Animation has a long tradition of using adults to voice child characters, as in Peter Pan (1953), The Jetsons (1962-63), Rugrats (1991-2004), The Wild Thornberrys (1998-2004), and The Boondocks (2005-2014), to name just a few.¹ There are several reasons for casting adults: children are hard to direct, they legally cannot work long hours, and their voices change as they grow up.¹ Had a series like The Simpsons (1989-) cast real children, it might be on Bart number seven by now, whereas with the talented Nancy Cartwright, Bart has kept the vocal spunk of his 1980s self.⁸

Voice actors are an interesting population for linguistic study because they are essentially professional folk linguists⁹: without formal training in linguistics, they skillfully and reliably perform complex linguistic tasks. Previous studies¹⁰⁻¹⁷ of voice actors investigated how changes in pitch, movement of the vocal tract, and voice quality (e.g. how breathy or scratchy a voice sounds) affect the way listeners and viewers understand and interpret an animated character. The current investigation uses 3D ultrasound data from an amateur voice actor to address the question: What do adult voice actors do with their vocal tracts in order to sound like a child?

Ultrasound works by emitting high-frequency sound and measuring the time it takes for the sound to echo back. For this study, an ultrasound probe (like the kind used for prenatal imaging) was placed under the participant’s chin and held in place using a customized helmet. The sound waves travel through the tissues of the face and tongue, a fairly dense medium, and when they reach the air along the surface of the tongue, a much lower-density medium, they echo back.
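To make the echo-timing idea concrete, here is a minimal sketch in Python. It is not the processing used in the study. The speed of sound (1540 m/s, a conventional average for soft tissue), the approximate acoustic impedance values, the 80-microsecond echo time, and the function names are all assumptions chosen for illustration. The sketch converts a round-trip echo time into a depth and estimates how much of the sound is reflected where tissue meets air.

```python
# Illustrative values only; none of these numbers come from the study.
SPEED_OF_SOUND_TISSUE = 1540.0  # m/s, conventional average for soft tissue (assumed)
Z_TISSUE = 1.6e6                # approximate acoustic impedance of soft tissue, in rayl (assumed)
Z_AIR = 4.0e2                   # approximate acoustic impedance of air, in rayl (assumed)


def echo_depth_cm(round_trip_time_s, speed_m_per_s=SPEED_OF_SOUND_TISSUE):
    """Depth of a reflecting surface in cm, given the pulse's round-trip time."""
    # The pulse travels to the surface and back, so halve the total path.
    one_way_m = speed_m_per_s * round_trip_time_s / 2.0
    return one_way_m * 100.0


def reflected_fraction(z_first, z_second):
    """Fraction of sound intensity reflected at a boundary (normal incidence)."""
    return ((z_second - z_first) / (z_second + z_first)) ** 2


# Hypothetical echo arriving 80 microseconds after the pulse was emitted.
depth = echo_depth_cm(80e-6)
print(f"Estimated depth of the tongue surface: {depth:.2f} cm")

# Nearly all of the sound is reflected at the tissue-air boundary,
# which is why the tongue surface shows up as a bright line.
print(f"Fraction reflected at tissue-air boundary: {reflected_fraction(Z_TISSUE, Z_AIR):.4f}")
```

Under these assumed values, almost all of the sound energy is reflected at the tongue surface, so the scan shows the surface clearly but nothing beyond it.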

These echoes are represented in ultrasound images as a bright line (see Figure 1).

Multiple images can be analyzed and assembled into 3D representations of the tongue surface (see Figure 2).

This study identified three strategies for imitating a child’s voice. First, the actor raised the hyoid bone (a small bone in the neck), which is visible as an acoustic “shadow” circled in Figure 3.

This gesture effectively shortens the vocal tract, helping the actor to sound like a smaller person. Second, the actor pushed tongue movements forward in the mouth (visible in Figure 4).

This gesture shortens the front part of the vocal tract, which also helps the actor to sound like a smaller person. Third, the actor produced a prominent groove down the middle of the tongue (visible in Figure 2), effectively narrowing the vocal tract. These three strategies together help voice actors sound like people with smaller vocal tracts, which is very effective when voicing an animated child character!
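Why does a shorter vocal tract sound like a smaller person? A rough way to see it is the textbook approximation of the vocal tract as a uniform tube, closed at the vocal folds and open at the lips, whose resonances are F_n = (2n - 1)c / (4L). The Python sketch below applies that formula. The uniform-tube model, the speed of sound in air (350 m/s), and both tract lengths are assumptions for illustration, not measurements from this study.

```python
# Illustrative values only; none of these numbers come from the study.
SPEED_OF_SOUND_AIR = 350.0  # m/s, approximate value for warm, humid air (assumed)


def tube_resonances_hz(length_cm, count=3, speed_m_per_s=SPEED_OF_SOUND_AIR):
    """Lowest resonances of a uniform tube closed at one end and open at the other."""
    length_m = length_cm / 100.0
    # Quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * L)
    return [(2 * n - 1) * speed_m_per_s / (4.0 * length_m) for n in range(1, count + 1)]


# Hypothetical tract lengths: one adult-like, one effectively shortened.
for label, length_cm in [("adult-like tract (17.5 cm)", 17.5),
                         ("shortened tract (14.0 cm)", 14.0)]:
    resonances = [round(f) for f in tube_resonances_hz(length_cm)]
    print(f"{label}: {resonances} Hz")
```

In this simplified model, every resonance rises as the tube gets shorter. Higher resonances are one of the cues listeners associate with a smaller speaker. A real vocal tract is not a uniform tube, so the numbers indicate only the direction of the effect.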

References

1. Holliday, C. “Emotion Capture: Vocal Performances by Children in the Computer-Animated Film”. Alphaville: Journal of Film and Screen Media 3 (Summer 2012). Web. ISSN: 2009-4078.
2. Disney, W. (Producer), Geronimi, C., Jackson, W., Luske, H. (Directors). (1953). Peter Pan [Motion Picture]. Burbank, CA: Walt Disney Productions.
3. Hanna, W., & Barbera, J. (1962). The Jetsons [Television Series]. Los Angeles, CA: Hanna Barbera Productions.
4. Klasky, A., Csupo, G., Coffey, V., Germain, P., Harrington, M. (Executive Producers). (1991). Rugrats [Television Series]. Hollywood, CA: Klasky/Csupo, Inc.
5. Klasky, A., & Csupo, G. (Executive Producers). (1998). The Wild Thornberrys [Television Series]. Hollywood, CA: Klasky/Csupo, Inc.
6. McGruder, A., Hudlin, R., Barnes, R., Cowan, B., Jones, C. (Executive Producers). (2005). The Boondocks [Television Series]. Culver City, CA: Adelaide Productions Television.
7. Brooks, J., & Groening, M. (Executive Producers). (1989). The Simpsons [Television Series]. Los Angeles, CA: Gracie Films.
8. Cartwright, N. (2001). My Life as a 10-Year-Old Boy. New York: Hyperion Books.
9. Preston, D. R. (1993). Folk dialectology. American dialect research, 333-378.
10. Starr, R. L. (2015). Sweet voice: The role of voice quality in a Japanese feminine style. Language in Society, 44(01), 1-34.
11. Teshigawara, M. (2003). Voices in Japanese animation: a phonetic study of vocal stereotypes of heroes and villains in Japanese culture. Dissertation.
12. Teshigawara, M. (2004). Vocally expressed emotions and stereotypes in Japanese animation: Voice qualities of the bad guys compared to those of the good guys. Journal of the Phonetic Society of Japan, 8(1), 60-76.
13. Teshigawara, M., & Murano, E. Z. (2004). Articulatory correlates of voice qualities of good guys and bad guys in Japanese anime: An MRI study. In Proceedings of INTERSPEECH (pp. 1249-1252).
14. Teshigawara, M., Amir, N., Amir, O., Wlosko, E., & Avivi, M. (2007). Effects of random splicing on listeners’ perceptions. In 16th International Congress of Phonetic Sciences (ICPhS).
15. Teshigawara, M. (2009). Vocal expressions of emotions and personalities in Japanese anime. In Izdebski, K. (ed.), Emotions of the Human Voice, Vol. III Culture and Perception. San Diego: Plural Publishing, 275-287.
16. Teshigawara, K. (2011). Voice-based person perception: two dimensions and their phonetic properties. ICPhS XVII, 1974-1977.
17. Uchida, T. (2007). Effects of F0 range and contours in speech upon the image of speakers’ personality. Proc. 19th ICA, Madrid. http://www.seaacustica.es/WEB_ICA_07/fchrs/papers/cas-03-024.pdf
