How voice training changes the tongue in chest versus head voice

Jiu Song – jiusongjd@gmail.com
Integrated Speech Research Lab
University of British Columbia
Vancouver, British Columbia, V6T 1Z4
Canada

Additional authors:
Jaida Siu – jaidasiu@gmail.com
Jahurul Islam – jahurul.islam@ubc.ca
Bryan Gick – gick@mail.ubc.ca

Popular version of 1aMU8 – Effect of years of voice training on chest and head register tongue shape variability
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me/web/page.php?page=IntHtml&project=ASAFALL24&id=3767562

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


Imagine being in a voice lesson, and as you try to hit a high note, your voice coach says, “suppress your tongue” or “pretend your tongue doesn’t exist!” What does this mean, and why do singers do this?

One vocal technique used by professional singers is to sing in different vocal registers. Generally, a man’s natural speaking voice and the voice people use to sing lower notes are called the chest voice—you can feel a vibration in your chest if you place your hand over it as you vocalize. When moving to higher notes, singers shift to their head voice, where vibrations feel stronger in the head. But what role does the tongue play in this transition? Do all singers, including amateurs, naturally adjust their tongue when switching registers, or is this adjustment a learned skill?

Figure 1: Approximate location of feeling/sensation for chest and head voice.

We are interested in vowels and the pitch range during the passaggio, which is the shift or transition point between different vocal registers. The voice is very unstable and prone to audible cracking during the passaggio, and singers are trained to navigate it smoothly. We also know that different vowels are produced in different locations in the mouth and possess different qualities. One way that singers successfully navigate the passaggio is by altering the vowel through slight adjustments to tongue shape. To study this, we utilized ultrasound imaging to monitor the position and shape of the tongue while participants with varying levels of vocal training sang vowels across their pitch range, similar to a vocal warm-up.
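To give a feel for the kind of comparison ultrasound makes possible, here is a minimal sketch (using invented contour heights, not data from the study) of how the mean tongue height of two registers might be compared from traced ultrasound contours:

```python
import numpy as np

# Hypothetical traced tongue contours: tongue height (mm above the probe)
# sampled at 10 points from tongue root to tip, one row per sung token.
chest_tokens = np.array([
    [40, 44, 47, 49, 50, 49, 47, 44, 41, 38],
    [41, 45, 48, 50, 51, 50, 48, 45, 42, 39],
])
head_tokens = np.array([
    [43, 47, 50, 52, 53, 52, 50, 47, 44, 41],
    [44, 48, 51, 53, 54, 53, 51, 48, 45, 42],
])

# Average across tokens to get one mean contour per register.
chest_mean = chest_tokens.mean(axis=0)
head_mean = head_tokens.mean(axis=0)

# Point-by-point height difference: positive values mean the tongue
# sits higher in head voice at that part of the contour.
difference = head_mean - chest_mean
print(f"Mean height difference: {difference.mean():.1f} mm")
```

With these illustrative numbers the tongue sits uniformly higher in head voice, which is the pattern our results show for real contours.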

Video 1: Example of ultrasound recording

The results indicated that, in head voice, the tongue is generally positioned higher in the mouth than in chest voice. Unsurprisingly, this difference is more pronounced for certain vowels than for others.

Figure 2: Tongue position in chest and head voice for front and back vowel groups. Overlapping shades indicate that there is virtually no difference.

Singers’ tongues are also shaped by training. Recall the voice coach’s advice to suppress the tongue: lowering the jaw and tongue creates more space in the mouth, enhancing resonance and vocal projection. Indeed, trained singers generally have a lower overall tongue position.

Because professional singers’ transitions between registers sound more seamless, we speculated that trained singers would show smaller differences in tongue position between registers than untrained singers, whose tongue control is less developed. In fact, the opposite is true: the tongue behaves differently in chest voice and head voice, but only in individuals with vocal training.

Figure 3: Tongue position in chest and head voice for singers with different levels of training.

In summary, our research suggests that tongue adjustments for register shifts may be a learned technique. The manner in which singers adjust their tongues for different vowels and vocal registers could be an essential component in achieving a seamless transition between registers, as well as in the effective use of various vocal qualities. Understanding the interactions among vowels, registers, and the tongue provides insight into the mechanisms of human vocal production and voice pedagogy.

Vowel Adjustments: The Key to High-Pitched Singing

May Pik Yu Chan – pikyu@sas.upenn.edu

University of Pennsylvania, 3401-C Walnut Street, Suite 300, C Wing, Philadelphia, PA, 19104, United States

Jianjing Kuang

Popular version of 4aMU6 – Ultrasound tongue imaging of vowel spaces across pitches in singing
Presented at the 186th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0027410

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

Singing isn’t just for the stage – everyone enjoys finding their voices in songs, regardless of whether they are performing in an auditorium or merely humming in the shower. Singing well is more than just hitting the right notes; it’s also about using your voice as an instrument effectively. One technique that professional opera singers master is to change how they pronounce their vowels based on the pitch they are singing. But why do singers change their vowels? Is it only to sound more beautiful, or is it necessary for hitting these higher notes?

We explore this question by studying what non-professional singers do: if changing vowels is necessary to reach higher notes, then non-professional singers should do the same at higher notes. The participants were asked to sing various English vowels across their pitch range, much like a vocal warm-up exercise. These vowels included [i] (like “beat”), [ɛ] (like “bet”), [æ] (like “bat”), [ɑ] (like “bot”), and [u] (like “boot”). Since vowels are made by different tongue gestures, we used ultrasound imaging to capture images of the participants’ tongue positions as they sang. This allowed us to see how the tongue moved across different pitches and vowels.

We found that participants who could sing a wider range of pitches did indeed adjust their tongue shapes when reaching high notes. The trend holds even when we isolate the participants who said they had never sung in a choir or a cappella group. In contrast, participants who could not sing a wide pitch range generally did not change their vowels based on pitch.

We then compared this to pilot data from an operatic soprano, who showed gradual adjustments in tongue positions across her whole pitch range, effectively neutralising the differences between vowels at her highest pitches. In other words, all the vowels at her highest pitches sounded very similar to each other.
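This “neutralisation” idea can be made concrete with a toy calculation (the numbers below are invented for illustration, not the soprano’s data): if each vowel is summarised as a tongue contour, the average distance between vowel pairs should shrink as pitch rises.

```python
import numpy as np
from itertools import combinations

# Hypothetical mean tongue contours (height in mm at 5 points) for
# three vowels, at a low pitch and at a very high pitch.
low_pitch = {
    "i": np.array([42, 50, 54, 50, 44]),
    "a": np.array([38, 40, 42, 44, 46]),
    "u": np.array([40, 44, 48, 52, 50]),
}
high_pitch = {
    "i": np.array([44, 47, 49, 48, 46]),
    "a": np.array([43, 45, 47, 47, 46]),
    "u": np.array([44, 46, 48, 49, 47]),
}

def mean_pairwise_distance(contours):
    # Average Euclidean distance between every pair of vowel contours;
    # smaller values mean the vowels are more alike (more neutralised).
    dists = [np.linalg.norm(contours[a] - contours[b])
             for a, b in combinations(contours, 2)]
    return sum(dists) / len(dists)

low_sep = mean_pairwise_distance(low_pitch)
high_sep = mean_pairwise_distance(high_pitch)
print(f"Vowel separation: low pitch {low_sep:.1f}, high pitch {high_sep:.1f}")
```

In this sketch the vowels are far apart at low pitch and nearly merge at high pitch, mirroring what the soprano’s pilot data showed.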

Overall, these findings suggest that changing our mouth shape and tongue position may be necessary when singing high pitches. The way singers modify their vowels could be an essential part of achieving a well-balanced, efficient voice, especially for hitting high notes. By better understanding how vowels and pitch interact, this research opens the door to further studies on how singers use their vocal instruments and what makes voice production effective. Together, this research offers insights not only into our appreciation for the art of singing, but also into the complex mechanisms of human vocal production.


Video 1: Example of sung vowels at relatively lower pitches.
Video 2: Example of sung vowels at relatively higher pitches.

Why Australian Aboriginal languages have small vowel systems

Andrew Butcher – endymensch@gmail.com

Flinders University, GPO Box 2100, Adelaide, SA, 5001, Australia

Popular version of 1pSC6 – On the Small Flat Vowel Systems of Australian Languages
Presented at the 185th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0022855

Please keep in mind that the research described in this Lay Language Paper may not have yet been peer reviewed.

Australia originally had 250-350 Aboriginal languages. Today, about 20 of these survive and none has more than 5,000 speakers. Most of the original languages shared very similar sound systems. About half of them had just three vowels, another 10% or so had four, and a further 25% or so had a five-vowel system. Only 16% of the world’s languages have a vowel inventory of four or fewer (the average number is six; some Germanic languages, such as Danish, have 20 or so).

This paper asks why many Australian languages have so few vowels. Our research shows that the vowels of Aboriginal languages are much more ‘squashed down’ in the acoustic space than those of European languages (Fig 1), indicating that the tongue does not come as close to the roof of the mouth as in European languages. The two ‘closest’ vowels are [e] (a sound with the tongue at the front of the mouth, between ‘pit’ and ‘pet’) and [o] (at the back of the mouth with rounded lips, between ‘put’ and ‘pot’). The ‘open’ (low-tongue) vowel is best transcribed [ɐ], a sound between ‘pat’ and ‘putt’, but with a less open jaw. Four- and five-vowel systems squeeze the extra vowels in between these, adding [ɛ] (between ‘pet’ and ‘pat’) and [ɔ] (more or less exactly as in ‘pot’), with little or no expansion of the acoustic space. Thus, the majority of Australian languages lack any true close (high-tongue) vowels (as in ‘peat’ and ‘pool’).
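One common way to quantify how “squashed” a vowel space is uses the area of the polygon the vowels span in the F1-F2 (first and second resonance) plane. The sketch below uses rough textbook-style formant values, not data from this study, to illustrate the idea:

```python
# Illustrative sketch: how compressed an F1-F2 vowel space is can be
# quantified as the area of the polygon spanned by the vowels.
# Formant values below are rough textbook-style numbers, not data
# from the study.

def polygon_area(points):
    # Shoelace formula for the area of a polygon given (F1, F2)
    # vertices listed in order around its perimeter.
    area = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        area += x1 * y2 - x2 * y1
    return abs(area) / 2.0

# European-style point vowels [i, a, u]: (F1, F2) in Hz.
european = [(280, 2250), (700, 1200), (310, 750)]
# A flatter, Australian-style three-vowel system [e, a, o]:
australian = [(450, 1900), (650, 1300), (480, 900)]

print(f"European triangle area:   {polygon_area(european):,.0f} Hz^2")
print(f"Australian triangle area: {polygon_area(australian):,.0f} Hz^2")
```

With these illustrative values the flatter system covers well under half the acoustic area of the European one, the kind of difference Fig 1 depicts.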

So why do Australian languages have a ‘flattened’ vowel space? The answer may lie in the ears of the speakers rather than in their mouths. Aboriginal Australians have by far the highest prevalence of chronic middle ear infection in the world. Our research with Aboriginal groups of diverse age, language and geographical location shows 30-60% of speakers have a hearing impairment in one or both ears (Fig 2). Nearly all Aboriginal language groups have developed an alternate sign language to complement the spoken one. Our previous analysis has shown that the sound systems of Australian languages resemble those of individual hearing-impaired children in several important ways, leading us to hypothesise that the consonant systems and the word structure of these languages have been influenced by the effects of chronic middle ear infection over generations.

A reduction in the vowel space is another of these resemblances. Middle ear infection affects the low-frequency end of the scale (under 500 Hz), thus reducing the prominence of the distinctive lower resonances of close vowels, such as in ‘peat’ and ‘pool’ (Fig 3). It is possible that, over generations, speakers have raised the frequencies of these resonances to make them more audible, thereby constricting the acoustic space the languages use. If so, we may ask whether, on purely acoustic grounds, communicating in an Aboriginal language in the classroom – using a sound system optimally attuned to the typical hearing profile of the speech community – might offer improved educational outcomes for indigenous children in the early years.

1aPP – The Role of Talker/Vowel Change in Consonant Recognition with Hearing Loss

Ali Abavisani – aliabavi@illinois.edu
Jont B. Allen – jontalle@illinois.edu
Dept. of Electrical and Computer Engineering
University of Illinois at Urbana-Champaign
405 N Mathews Ave
Urbana, IL, 61801

Popular version of paper 1aPP
Presented Monday, May 13, 2019
177th ASA Meeting, Louisville, KY

Hearing loss can have a serious impact on the social lives of those who experience it. Its effects become more complicated in environments such as restaurants, where the background noise is itself similar to speech. Although hearing aids of various designs are intended to address these issues, users complain about their performance in precisely the social situations where they are most needed. Part of the problem lies in the nature of hearing aids, which do not use speech as part of the design and fitting process. If we could incorporate real-life speech sounds into the fitting process, it might be possible to address many of the shortcomings that frustrate users.

There have been many studies on the features that are important for identifying speech sounds such as isolated consonant + vowel (CV) phones (i.e., meaningless speech sounds). Most of these studies ran experiments on normal-hearing listeners to identify the effects of different speech features on correct recognition. It turned out that manipulating speech sounds—such as replacing a vowel, or amplifying/attenuating certain parts of the sound in the time-frequency domain—leads normal-hearing listeners to identify new speech sounds. One goal of the current study is to investigate whether listeners with hearing loss respond similarly to such manipulations.

We designed a speech-based test that audiologists could use to determine, for each individual with hearing loss, which speech phones are vulnerable. The design includes a perceptual measure that corresponds to speech understanding in speech-like background noise: the noise level at which an average normal-hearing listener can still recognize the sound with at least 90% accuracy. The test sounds combine 14 consonants {p, t, k, f, s, S, b, d, g, v, z, Z, m, n} with four vowels {A, ae, I, E}, covering a range of the features present in speech. All the test sounds were pre-evaluated to make sure normal-hearing listeners can recognize them under the noise conditions of the experiments. Two sets of sounds, T1 and T2, containing the same consonant-vowel combinations but recorded from different talkers, were presented to listeners at their most comfortable loudness level (regardless of their specific hearing loss). The two sets had distinct perceptual measures.

When two sounds with a similar perceptual measure and the same consonant but different vowels are presented to a listener with hearing loss, the responses show how that listener’s particular hearing function causes errors in understanding this speech sound, and why it leads them to recognize a different sound in place of the one presented. Presenting sounds from both sets also lets us gauge how well the perceptual measure, which is based on normal-hearing listeners, predicts performance in listeners with hearing loss. When a listener’s recognition score increases as the result of a change in the presented speech sounds, that indicates how the hearing aid fitting process should proceed for that particular (listener, speech sound) pair.
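The perceptual measure—the noise level at which normal-hearing listeners still score at least 90% correct—can be sketched as a simple interpolation over a measured recognition-vs-SNR curve. The scores below are invented for illustration:

```python
import numpy as np

def snr_at_accuracy(snrs, scores, target=0.90):
    # Linearly interpolate the SNR at which recognition accuracy
    # crosses the target (e.g. 90% correct). Assumes scores rise
    # monotonically with SNR.
    return float(np.interp(target, scores, snrs))

# Hypothetical recognition scores for one CV token at several
# signal-to-noise ratios (dB), from a normal-hearing panel.
snrs = [-12, -6, 0, 6, 12]
scores = [0.20, 0.55, 0.80, 0.95, 1.00]

snr90 = snr_at_accuracy(snrs, scores)
print(f"SNR at 90% recognition: {snr90:.1f} dB")
```

A token that stays recognizable at a lower (noisier) SNR has a more robust perceptual measure; comparing two talkers’ tokens of the same CV on this scale is what distinguishes the T1 and T2 sets.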

While the study shows that whether a speech sound improves or degrades is listener-dependent, on average 85% of sounds improved when we replaced a CV with the same CV from the talker with the better perceptual measure. Additionally, among CVs with a similar perceptual measure, 28% improved when the vowel was replaced with {A}, 28% when replaced with {E}, 25% when replaced with {ae}, and 19% when replaced with {I}.

The confusion pattern in each case provides insight into how these changes affect phone recognition in each ear. We propose prescribing hearing aid amplification tailored to individual ears, based on the confusion pattern, the response to a change in perceptual measure, and the response to a change in vowel.

These tests are directed at fine-tuning hearing aid insertion gain, with the ultimate goal of improving speech perception, and at precisely identifying when, and for which consonants, an ear with hearing loss needs treatment to enhance speech recognition.