World Hearing Day 2024

The Acoustical Society of America (ASA) takes pride in its mission to generate, disseminate, and promote the knowledge and practical applications of acoustics. This also aligns with one of the objectives of World Hearing Day 2024; to reshape public perceptions surrounding ear and hearing based on accurate, evidence-based information. In support of World Hearing Day[i], we would like to draw attention to a couple Special Issues of the Journal of the Acoustical Society of America (JASA) that delve into the clinical and investigational facets of noise-induced hearing disorders.

Noise-Induced Hearing Disorders: Clinical and Investigational Tools
Guest Editors: Colleen G. Le Prell (Liaison Guest Editor), Odile H. Clavier, and Jianxin Bao

This special issue provides valuable insights into cutting-edge clinical and investigational tools designed to sensitively detect noise injury in the cochlea. Emphasizing the importance of sound exposure monitoring and protection, the collection explores tools available for characterizing individual noise hazards and attenuation. Throughout, there is a concentrated focus on the suitability of diverse functional measures for hearing and balance-related clinical trials, including considerations for boothless auditory test technology in decentralized clinical trials. Furthermore, the issue offers guidance on designing clinical trials to prevent noise-induced hearing deficits such as hearing loss and tinnitus.

World Hearing Day JASA special issue

Issue Highlights

Noise-Induced Hearing Loss: Translating Risk from Animal Models to Real-World Environments
Guest Editors: Colleen G. Le Prell, CAPT William J. Murphy, Tanisha L. Hammill, and J. R. Stefanson

Noise-induced hearing loss (NIHL) stands as a common injury for service members and civilian workers exposed to noise. This special issue focuses on translating knowledge from animal models to real-world environments. Contributors delve into the cellular and molecular events in the inner ear post-noise exposure, exploring potential pharmaceutical prevention of NIHL. The collection includes insights into methods and models used during preclinical assessments of investigational new drug agents, as well as information about human populations at risk for NIHL.

World Hearing Day JASA Special Issue

Issue Highlights

Together, these special issues provide an exploration of noise-induced hearing disorders, offering valuable insights and potential solutions for both clinical and real-world settings. Be sure the share this post to make ear and hearing care a reality for all! For more information about World Hearing Day 2024, visit https://www.who.int/campaigns/world-hearing-day/2024.

[i] Due to an unexpected site wide issue, the posting of this content was unfortunately delayed to after March 3, 2024.

4aSC12 – When it comes to recognizing speech, being in noise is like being old

Kristin Van Engen – kvanengen@wustl.edu
Avanti Dey
Nichole Runge
Mitchell Sommers
Brent Spehar
Jonathen E. Peelle

Washington University in St. Louis
1 Brookings Drive
St. Louis, MO 63130

Popular version of paper 4aSC12
Presented Thursday morning, May 10, 2018
175th ASA Meeting, Minneapolis, MN

How hard is it to recognize a spoken word?

Well, that depends. Are you old or young? How is your hearing? Are you at home or in a noisy restaurant? Is the word one that is used often, or one that is relatively uncommon? Does it sound similar to lots of other words in the language?

As people age, understanding speech becomes more challenging, especially in noisy situations like parties or restaurants. This is perhaps unsurprising, given the large proportion of older adults who have some degree of hearing loss. However, hearing measurements do not actually do a very good job of predicting the difficulty a person will have with speech recognition, and older adults tend to do worse than younger adults even when their hearing is good.

We also know that some words are more difficult to recognize than others. Words that are used rarely are more difficult than common words, and words that sound similar to many other words in the language are recognized less accurately than unique-sounding words. Relatively little is known, however, about how these kinds of challenges interact with background noise to affect the process of word recognition or how such effects might change across the lifespan.

In this study, we used eye tracking to investigate how noise and word frequency affect the process of understanding spoken words. Listeners were shown a computer screen displaying four images, and listened the instruction “Click on the” followed by a target word (e.g., “Click on the dog.”). As the speech signal unfolds, the eye tracker records the moment-by-moment direction of the person’s gaze (60 times per second). Since listeners direct their gaze toward the visual information that matches incoming auditory information, this allows us to observe the process of word recognition in real time.

Our results indicate that word recognition is slower in noise than in quiet, slower for low-frequency words than high-frequency words, and slower for older adults than younger adults. Interestingly, young adults were more slowed down by noise than older adults. The main difference, however, was that young adults were considerably faster to recognize words in quiet conditions. That is, word recognition by older adults didn’t differ much from quiet to noisy conditions, but young listeners looked like older listeners when tasked with listening to speech in noise.

1pSC26 – Acoustics and Perception of Charisma in Bilingual English-Spanish

Rosario Signorello – rsignorello@ucla.edu
Department of Head and Neck Surgery
31-20 Rehab Center,
Los Angeles, CA 90095-1794
Phone: +1 (323) 703-9549

Popular version of paper 1pSC26 “Acoustics and Perception of Charisma in Bilingual English-Spanish 2016 United States Presidential Election Candidates”
Presented at the 171st Meeting on Monday May 23, 1:00 pm – 5:00 pm, Salon F, Salt Lake Marriott Downtown at City Creek Hotel, Salt Lake City, Utah,

Charisma is the set of leadership characteristics, such as vision, emotions, and dominance used by leaders to share beliefs, persuade listeners and achieve goals. Politicians use voice to convey charisma and appeal to voters to gain social positions of power. “Charismatic voice” refers to the ensemble of vocal acoustic patterns used by speakers to convey personality traits and arouse specific emotional states in listeners. The ability to manipulate charismatic voice results from speakers’ universal and learned strategies to use specific vocal parameters (such as vocal pitch, loudness, phonation types, pauses, pitch contours, etc.) to convey their biological features and their social image (see Ohala, 1994; Signorello, 2014a, 2014b; Puts et al., 2006). Listeners’ perception of the physical, psychological and social characteristics of the leader is influenced by universal ways to emotionally respond to vocalizations (see Ohala, 1994; Signorello, 2014a, 2014b) combined with specific, culturally-mediated, habits to manifest emotional response in public (Matsumoto, 1990; Signorello, 2014a).

Politicians manipulate vocal acoustic patterns (adapting them to the culture, language, social status, educational background and the gender of the voters) to convey specific types of leadership fulfilling everyone’s expectation of what charisma is. But what happen to leaders’ voice when they use different languages to address voters? This study investigates speeches of bilingual politicians to find out the vocal acoustic differences of leaders speaking in different languages. It also investigates how the acoustical differences in different languages can influence listeners’ perception of type of leadership and the emotional state aroused by leaders’ voices.

We selected vocal samples from two bilingual America-English/American-Spanish politicians that participated to the 2016 United States presidential primaries: Jeb Bush and Marco Rubio. We chose words with similar vocal characteristics in terms of average vocal pitch, vocal pitch range, and loudness range. We asked listeners to rate the type of charismatic leadership perceived and to assess the emotional states aroused by those voices. We finally asked participants how the different vocal patterns would affect their voting preference.

Preliminary statistical analyses show that English words like “terrorism” (voice sample 1) and “security” (voice sample 2), characterized by mid vocal pitch frequencies, wide vocal pitch ranges, and wide loudness ranges, convey an intimidating, arrogant, selfish, aggressive, witty, overbearing, lazy, dishonest, and dull type of charismatic leadership. Listeners from different language and cultural backgrounds also reported these vocal stimuli triggered emotional states like contempt, annoyance, discomfort, irritation, anxiety, anger, boredom, disappointment, and disgust. The listeners who were interviewed considered themselves politically liberal and they responded that they would probably vote for a politician with the vocal characteristics listed above.

Speaker Jeb Bush. Mid vocal pitch frequencies (126 Hz), wide vocal pitch ranges (97 Hz), and wide loudness ranges (35 dB)

Speaker Marco Rubio. Mid vocal pitch frequencies 178 Hz), wide vocal pitch ranges (127 Hz), and wide loudness ranges (30 dB)

Results also show that Spanish words like “terrorismo” (voice sample 3) and “ilegal” (voice sample 4) characterized by an average of mid-low vocal pitch frequencies, mid vocal pitch ranges, and narrow loudness ranges convey a personable, relatable, kind, caring, humble, enthusiastic, witty, stubborn, extroverted, understanding, but also weak and insecure type of charismatic. Listeners from different language and cultural backgrounds also reported these vocal stimuli triggered emotional states like happiness, amusement, relief, and enjoyment. The listeners who were interviewed considered themselves politically liberal and they responded that they would probably vote for a politician with the vocal characteristics listed above.  

Speaker Jeb Bush. Mid-low vocal pitch frequencies (95 Hz), mid vocal pitch ranges (40 Hz), and narrow loudness ranges (17 dB) 

Speaker Marco Rubio. Mid vocal pitch frequencies 146 Hz), wide vocal pitch ranges (75 Hz), and wide loudness ranges (25 dB)

Voice is a very dynamic non-verbal behavior used by politicians to persuade the audience and manipulate voting preference. The results of this study show how acoustic differences in voice convey different types of leadership and arouse differently the emotional states of the listeners. The voice samples studied show how speakers Jeb Bush and Marco Rubio adapt their vocal delivery to audiences of different backgrounds. The two politicians voluntary manipulate their voice parameters while speaking in order to appear as they were endowed of different leadership qualities. The vocal pattern used in English conveys the threatening and dark side of their charisma, inducing the arousal of negative emotions, which triggers a positive voting preference in listeners. The vocal pattern used in English conveys the charming and caring side of their charisma, inducing the arousal of positive emotions, which triggers a negative voting preference in listeners.

The manipulation of voice arouses emotional states that will induce voters to consider a certain type of leadership as more appealing. Experiencing emotions help voters to assess the effectiveness of a political leader. If the emotional arousing matches with voters’ expectation of how a charismatic leader should make them feel then voters would help the charismatic speaker to became their leader.

References
Signorello, R. (2014a). Rosario Signorello (2014). La Voix Charismatique : Aspects Psychologiques et Caractéristiques Acoustiques. PhD Thesis. Université de Grenoble, Grenoble, France and Università degli Studi Roma Tre, Rome, Italy.

Signorello, R. (2014b). The biological function of fundamental frequency in leaders’ charismatic voices. The Journal of the Acoustical Society of America 136 (4), 2295-2295.

Ohala, J. (1984). An ethological perspective on common cross-language utilization of F0 of voice. Phonetica, 41(1):1–16.

Puts, D. A., Hodges, C. R., Cárdenas, R. A. et Gaulin, S. J. C. (2007). Men’s voices as dominance signals : vocal fundamental and formant frequencies influence dominance attributions among men. Evolution and Human Behavior, 28(5):340–344.

2aMU4 – Yelling vs. Screaming in Operatic and Rock Singing

Lisa Popeil – lisa@popeil.com
Voiceworks®
14431 Ventura Blvd #200
Sherman Oaks, CA 91423

Popular version of paper 2aMU4
Presented Tuesday morning, May 24, 2016

There exist a number of ways the human vocal folds can vibrate which create unique sounds used in singing.  The two most common vibrational patterns of the vocal folds are commonly called “chest voice” and “head voice”, with chest voice sounding like speaking or yelling and head voice sounding more flute-like or like screaming on high pitches.  In the operatic singing tradition, men sing primarily in chest voice while women sing primarily in their head voice.  However, in rock singing, men often emit high screams using their head voice while female rock singers use almost exclusively their chest voice for high notes.

Vocal fold vibrational pattern differences are only a part of the story though, since the shaping of the throat, mouth and nose (the vocal tract) play a large part in the perception of the final sound.  That means that head voice can be made to “sound” like chest voice on high screams using vocal tract shaping and only the most experienced listener can determine if the vocal register used was chest or head voice.

Using spectrographic analysis, differences and similarities between operatic and rock singers can be seen.  One similarity between the two is the heightened output of a resonance commonly called “ring”.  This resonance, when amplified by vocal tract shaping, creates a piercing sound that’s perceived by the listener as extremely loud. The amplified ring harmonics can be seen in the 3,000 Hz band in both the male opera sample and in rock singing samples:

MALE OPERA – HIGH B (B4…494 Hz) CHEST VOICEPopeil1  Check Voice SingingFigure 1 MALE ROCK – HIGH E (E5…659 Hz) CHEST VOICEPopeil 2 Chest voice singingFigure 2 MALE ROCK – HIGH G (G5…784 Hz)    HEAD VOICEPopeil 3 Head voice singingFigure 3

Though each of these three male singers exhibit a unique frequency signature and whether singing in chest or head voice, each singer is using the amplified ring strategy in the 3,000Hz range amplify their sound and create excitement.

2aMU5 – Do people find vocal fry in popular music expressive?

Mackenzie Parrott – mackenzie.lanae@gmail.com
John Nix – john.nix@utsa.edu

Popular version of paper 2aMU5, “Listener Ratings of Singer Expressivity in Musical Performance.”
Presented Tuesday, May 24, 2016, 10:20-10:35 am, Salon B/C, ASA meeting, Salt Lake City

Vocal fry is the lowest register of the human voice.  Its distinct sound is characterized by a low rumble interspersed with uneven popping and crackling.  The use of fry as a vocal mannerism is becoming increasingly common in American speech, fueling discussion about the implications of its use and how listeners perceive the speaker [1].  Previous studies have suggested that listeners find vocal fry to be generally unpleasant in women’s speech, but associate it with positive characteristics in men’s speech [2].

As it has become more prevalent, fry has perhaps not surprisingly found its place in many commercial song styles as well.  Many singers are implementing fry as a stylistic device at the onset or offset of a sung tone.  This can be found very readily in popular musical styles, presumably to impact and amplify the emotion that the performer is attempting to convey.

Researchers at the University of Texas at San Antonio conducted a survey to analyze whether listeners’ ratings of a singer’s expressivity in musical samples in two contemporary commercial styles (pop and country) were affected by the presence of vocal fry, and to see if there was a difference in listener ratings according to the singer’s gender.  A male and a female singer recorded musical samples for the study in a noise reduction booth.  As can be seen in the table below, the singers were asked to sing most of the musical selections twice, once using vocal fry at phrase onsets, and once without fry, while maintaining the same vocal quality, tempo, dynamics, and stylization.  Some samples were presented more than one time in the survey portion of the study to test listener reliability.

Song Singer Gender Vocal Mode
(Hit Me) Baby One More Time Female Fry Only
If I Die Young Female With and Without Fry
National Anthem Female With and Without Fry
Thinking Out Loud Male Without Fry Only
Amarillo By Morning Male With and Without Fry
National Anthem Male With and Without Fry

Across all listener ratings of all the songs, the recordings which included vocal fry were rated as being only slightly more expressive than the recordings which contained no vocal fry.  When comparing the use of fry between the male and female singer, there were some differences between the genders.  The listeners rated the samples where the female singer used vocal fry higher (e.g., more expressive) than those without fry, which was surprising considering the negative association with women using vocal fry in speech.  Conversely, the listeners rated the male samples without fry as being more expressive than those with fry. Part of this preference pattern may have also been an indication of the singer; the male singer was much more experienced with pop styles than the female singer, who is primarily classically trained.  The overall expressivity ratings for the male singer were higher than those of the female singer by a statistically significant margin.

There were also listener rating trends between the differing age groups of participants.  Younger listeners drove the gap of preference between the female singer’s performances with fry versus non-fry and the male singer’s performances without fry versus with fry further apart.  Presumably they are more tuned into stylistic norms of current pop singers.  However, this could also imply a gender bias in younger listeners.  The older listener groups rated the mean expressivity of the performers as being lower than the younger listener groups.  Since most of the songs that we sampled are fairly recent in production, this may indicate a generational trend in preference.  Perhaps listeners rate the style of vocal production that is most similar to what they listened to during their young adult years as the most expressive style of singing. These findings have raised many questions for further studies about vocal fry in pop and country music.

 

  1. Anderson, R.C., Klofstad, C.A., Mayew, W.J., Venkatachalam, M. “Vocal Fry May Undermine the Success of Young Women in the Labor Market. “ PLoS ONE, 2014. 9(5): e97506. doi:10.1371/journal.pone.0097506.
  2. Yuasa, I. P. “Creaky Voice: A New Feminine Voice Quality for Young Urban-Oriented Upwardly Mobile American Women.” American Speech, 2010. 85(3): 315-337.