3pSC10 – Does increasing the playback speed of men’s and women’s voices reduce their intelligibility by the same amount?

Eric M. Johnson – eric.martin.johnson@utah.edu
Sarah Hargus Ferguson – sarah.ferguson@hsc.utah.edu

Department of Communication Sciences and Disorders
University of Utah
390 South 1530 East, Room 1201
Salt Lake City, UT 84112

Popular version of poster 3pSC10, “Gender and rate effects on speech intelligibility.”
Presented Wednesday afternoon, May 25, 2016, 1:00, Salon G
171st ASA Meeting, Salt Lake City

Older adults seeking hearing help often report having an especially hard time understanding women’s voices. However, this anecdotal observation doesn’t always agree with the findings from scientific studies. For example, Ferguson (2012) found that male and female talkers were equally intelligible for older adults with hearing loss. Moreover, several studies have found that young people with normal hearing actually understand women’s voices better than men’s voices (e.g. Bradlow et al., 1996; Ferguson, 2004). In contrast, Larsby et al. (2015) found that, when listening in background noise, groups of listeners with and without hearing loss were better at understanding a man’s voice than a woman’s voice. The Larsby et al. data suggest that female speech might be more affected by distortion like background noise than male speech is, which could explain why women’s voices may be harder to understand for some people.

We wanted to find out whether another type of distortion, speeding up the speech, affects the intelligibility of men’s and women’s voices equally. Speech that has been sped up (or time-compressed) is less intelligible than unprocessed speech (e.g., Gordon-Salant & Friedman, 2011), but no studies have explored whether time compression causes an equal loss of intelligibility for male and female talkers. If increasing the playback speed reduces the intelligibility of women’s speech more than men’s, it could reveal another possible reason why so many older adults with hearing loss report difficulty understanding women’s voices. To this end, our study tested whether the intelligibility of time-compressed speech decreases more for female talkers than for male talkers.

We tested 32 listeners with normal hearing, measuring how much the intelligibility of two male and two female talkers dropped when the playback speed of their speech was increased by 50%. These four talkers were selected because their conversational speaking rates were nearly equivalent. We used digital recordings of each talker and made two versions of each sentence they spoke: a normal-speed version and a fast version. The software we used allowed us to speed up the recordings without making them sound high-pitched.

Audio sample 1: A sentence at its original speed.

Audio sample 2: The same sentence sped up to 50% faster than its original speed.
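For readers curious about the processing itself, the sketch below shows one common way to create this kind of pitch-preserving speedup using the open-source librosa library. It is only an illustration under assumed file names and settings, not the actual software or processing chain used in the study.

    # Illustrative sketch: speed a recording up by 50% without raising its pitch.
    # File names are hypothetical; this is not the study's actual processing chain.
    import librosa
    import soundfile as sf

    y, fs = librosa.load("sentence_original.wav", sr=None)  # keep the native sample rate
    y_fast = librosa.effects.time_stretch(y, rate=1.5)      # 1.5x faster, pitch preserved
    sf.write("sentence_fast.wav", y_fast, fs)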

All of the sentences were presented to the listeners in background noise. We found that the men and women were essentially equally intelligible when listeners heard the sentences at their original speed. Speeding up the sentences made all of the talkers harder to understand, but the effect was much greater for the female talkers than for the male talkers. In other words, there was a significant interaction between talker gender and playback speed. The results suggest that time compression has a greater negative effect on the intelligibility of female speech than it does on male speech.


Figure 1: Overall percent correct key-word identification performance for male and female talkers in unprocessed and time-compressed conditions. Error bars indicate 95% confidence intervals.
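For readers who like to see the statistics spelled out, the sketch below shows how a gender-by-speed interaction like the one in Figure 1 could be tested with a two-way ANOVA in Python. The data file, column names, and model (which ignores the repeated-measures structure for simplicity) are invented for illustration; this is not the analysis code used in the study.

    # Hypothetical sketch of testing a talker gender x playback speed interaction.
    import pandas as pd
    from statsmodels.formula.api import ols
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("scores.csv")  # assumed columns: listener, gender, rate, pct_correct
    model = ols("pct_correct ~ C(gender) * C(rate)", data=df).fit()
    print(anova_lm(model, typ=2))   # the C(gender):C(rate) row is the interaction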

These results confirm the negative effects of time compression on speech intelligibility and suggest that audiologists should counsel the communication partners of their patients to avoid speaking excessively fast, especially if the patient reports difficulty understanding women’s voices. This counsel may be even more important for the communication partners of patients who experience particular difficulty understanding speech in noise.

 

  1. Bradlow, A. R., Torretta, G. M., and Pisoni, D. B. (1996). “Intelligibility of normal speech I: Global and fine-grained acoustic-phonetic talker characteristics,” Speech Commun. 20, 255-272.
  2. Ferguson, S. H. (2004). “Talker differences in clear and conversational speech: Vowel intelligibility for normal-hearing listeners,” J. Acoust. Soc. Am. 116, 2365-2373.
  3. Ferguson, S. H. (2012). “Talker differences in clear and conversational speech: Vowel intelligibility for older adults with hearing loss,” J. Speech Lang. Hear. Res. 55, 779-790.
  4. Gordon-Salant, S., and Friedman, S. A. (2011). “Recognition of rapid speech by blind and sighted older adults,” J. Speech Lang. Hear. Res. 54, 622-631.
  5. Larsby, B., Hällgren, M., Nilsson, L., and McAllister, A. (2015). “The influence of female versus male speakers’ voice on speech recognition thresholds in noise: Effects of low-and high-frequency hearing impairment,” Speech Lang. Hear. 18, 83-90.

1aAA4 – Optimizing the signal to noise ratio in classrooms using passive acoustics

Peter D’Antonio – pdantonio@rpginc.com

RPG Diffusor Systems, Inc.
651 Commerce Dr
Upper Marlboro, MD 20774

Popular version of paper 1aAA4 “Optimizing the signal to noise ratio in classrooms using passive acoustics”
Presented on Monday May 23, 10:20 AM – 5:00 pm, SALON I
171st ASA Meeting, Salt Lake City

The 2012 Programme for International Student Assessment (PISA) carried out an international comparison of student performance in reading comprehension, mathematics, and natural science. The US ranked 36th out of 64 countries that tested half a million 15-year-olds, as shown in Figure 1.


Figure 1 PISA Study

What is the problem? Existing acoustical designs and products have not evolved to incorporate the current state of the art, and the result is schools that fail to meet their intended goals. Learning areas are only beginning to include adjustable-intensity and adjustable-color lighting, which has been shown to increase reading speed, reduce testing errors, and reduce hyperactivity; meanwhile, existing acoustical designs are limited to conventional absorption-only materials, like thin fabric-wrapped panels and acoustical ceiling tiles, which cannot address all of the challenges of speech intelligibility and music appreciation.

What is the solution? Adopt modern products and designs for core and ancillary learning spaces that use binary, ternary, quaternary, and other transitional hybrid surfaces, which simultaneously scatter the consonant-carrying high-frequency early reflections and absorb mid and low frequencies, passively improving the signal-to-noise ratio. In addition, adopt the recommendations of ANSI S12.60 to control reverberation, background noise, and noise intrusion, and integrate lighting that adjusts to the task at hand.

Let’s begin by considering how we hear and understand what is being said when information is delivered via the spoken word. We often hear people say, “I can hear what he or she is saying, but I cannot understand what is being said.” The understanding of speech is referred to as speech intelligibility. How do we interpret speech? The ear/brain processor can fill in a substantial amount of missing information in music, but it requires more detailed information to understand speech. The speech power is delivered in the vowels (a, e, i, o, u and sometimes y), which lie predominantly in the frequency range of 250 Hz to 500 Hz. The speech intelligibility is delivered in the consonants (b, c, d, f, g, h, j, k, l, m, n, p, q, r, s, t, v, w), which occur in the 2,000 Hz to 6,000 Hz range. People who suffer from noise-induced hearing loss typically have a 4,000 Hz notch, which causes severe degradation of speech intelligibility. This raises the question: why would we cover the entire ceiling of a speech room with absorption, and a significant proportion of the wall area with thin fabric-wrapped panels, when these porous materials absorb the important consonant frequencies and prevent them from fusing with the direct sound to make it louder and more intelligible?
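As a rough, hands-on illustration of these two bands (an assumption-laden sketch, not part of the paper), the following Python snippet measures how much of a mono speech recording’s energy falls in the “vowel” band (250-500 Hz) versus the “consonant” band (2-6 kHz); the file name is hypothetical.

    # Compare energy in the vowel band (250-500 Hz) and consonant band (2-6 kHz).
    # Assumes a mono WAV file; the file name is made up for illustration.
    import numpy as np
    from scipy.io import wavfile
    from scipy.signal import butter, sosfiltfilt

    fs, x = wavfile.read("speech_sample.wav")
    x = x.astype(float)

    def band_energy(signal, low, high, fs):
        sos = butter(4, [low, high], btype="bandpass", fs=fs, output="sos")
        return np.sum(sosfiltfilt(sos, signal) ** 2)

    vowel = band_energy(x, 250, 500, fs)
    consonant = band_energy(x, 2000, 6000, fs)
    print(f"Vowel-band to consonant-band energy ratio: {vowel / consonant:.1f}")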

Exclusive use of absorbing material on the ceiling of the room may excessively reduce the high-frequency consonant sounds and result in their being masked by low-frequency vowel sounds, thereby reducing the signal-to-noise ratio (SNR).

The signal has two contributions: the direct line-of-sight sound and the early reflections arriving from the walls, ceiling, floor, and the people and items in the room. Our auditory system, the ears and brain, has a unique ability called temporal fusion, which combines or fuses these two contributions into one apparently louder and more intelligible signal. The goal, then, is to use these passive early reflections as efficiently as possible to increase the signal. The denominator of the SNR consists of external noise intrusion, occupant noise, HVAC noise, and reverberation. These ideas are summarized in Figure 2.


Figure 2 Signal to Noise Ratio
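To make the ratio concrete, here is a minimal sketch (not from the paper) of how an SNR in decibels is computed from two sampled waveforms, one standing in for the signal (direct sound plus early reflections) and one for the noise.

    # Minimal SNR-in-decibels sketch with synthetic signals.
    import numpy as np

    def snr_db(signal, noise):
        """SNR in dB from two sampled waveforms."""
        rms_signal = np.sqrt(np.mean(np.square(signal)))
        rms_noise = np.sqrt(np.mean(np.square(noise)))
        return 20.0 * np.log10(rms_signal / rms_noise)

    t = np.arange(0, 1.0, 1 / 16000)
    tone = np.sin(2 * np.pi * 500 * t)       # stand-in "signal"
    noise = 0.1 * np.random.randn(t.size)    # stand-in "noise"
    print(snr_db(tone, noise))               # roughly 17 dB for these amplitudes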

In Figure 3, we illustrate a concept model for an improved speech environment, whether it is a classroom, a lecture hall, a meeting/conference room, essentially any room in which information is being conveyed.

The design includes a reflective front, because the vertical and horizontal divergence of the consonants is roughly 120 degrees; if a speaker turns away from the audience, the consonants must reflect from the front wall and the ceiling overhead. The perimeter of the ceiling is absorptive to control the reverberation (noise). The center of the ceiling is diffusive to provide early reflections that increase the signal and its coverage in the room. The middle third of the walls uses novel binary, ternary, quaternary, and other transitional diffsorptive (diffusive/absorptive) panels, which scatter the information above 1 kHz (the signal) and absorb the sound below 1 kHz (the reverberation, i.e., noise). This design suggests that the current exclusive use of acoustical ceiling tile and traditional fabric-wrapped panels is counterproductive for improving the SNR, speech intelligibility, and coverage.


Figure 3 Concept model for a classroom with a high SNR

2pSCb11 – Effect of Menstrual Cycle Hormone Variations on Dichotic Listening Results

Richard Morris – Richard.morris@cci.fsu.edu
Alissa Smith

Florida State University
Tallahassee, Florida

Popular version of poster presentation 2pSCb11, “Effect of menstrual phase on dichotic listening”
Presented Tuesday afternoon, November 3, 2015, 3:30 PM, Grand Ballroom 8

How speech is processed by the brain has long been of interest to researchers and clinicians. One method to evaluate how the two sides of the brain work when hearing speech is called a dichotic listening task. In a dichotic listening task two words are presented simultaneously to a participant’s left and right ears via headphones. One word is presented to the left ear and a different one to the right ear. These words are spoken at the same pitch and loudness levels. The listener then indicates what word was heard. If the listener regularly reports hearing the words presented to one ear, then there is an ear advantage. Since most language processing occurs in the left hemisphere of the brain, most listeners attend more closely to the right ear. The regular selection of the word presented to the right ear is termed a right ear advantage (REA).
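As a simple illustration (a hedged sketch, not necessarily the scoring used in this study), an ear advantage is often summarized with a laterality index that compares the number of correct reports from each ear; positive values indicate a right ear advantage.

    # One common laterality index for dichotic listening scores (illustrative only).
    def laterality_index(right_correct: int, left_correct: int) -> float:
        """Positive = right ear advantage, negative = left ear advantage."""
        return 100.0 * (right_correct - left_correct) / (right_correct + left_correct)

    print(laterality_index(right_correct=40, left_correct=25))  # about +23, a REA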

Previous researchers reported different responses from males and females to dichotic presentation of words. Those investigators found that males more consistently heard the word presented to the right ear and demonstrated a stronger REA. The female listeners in those studies exhibited more variability as to the ear of the word that was heard. Further research seemed to indicate that women exhibit different lateralization of speech processing at different phases of their menstrual cycle. In addition, data from recent studies indicate that the degree to which women can focus on the input to one ear or the other varies with their menstrual cycle.

However, the previous studies used small numbers of participants. The purpose of the present study was to complete a dichotic listening study with a larger sample of female participants. In addition, the previous studies focused on women who did not take oral contraceptives, since women taking oral contraceptives were assumed to have smaller shifts in the lateralization of speech processing. Although this assumption is reasonable, it needs to be tested. For this study, it was hypothesized that the women would exhibit a greater REA during the days that they menstruate than during other days of their menstrual cycle. This hypothesis was based on the previous research reports. In addition, it was hypothesized that the women taking oral contraceptives would exhibit smaller fluctuations in the lateralization of their speech processing.

Participants in the study were 64 females, 19-25 years of age. Among the women 41 were taking oral contraceptives (OC) and 23 were not. The participants listened to the sound files during nine sessions that occurred once per week. All of the women were in good general health and had no speech, language, or hearing deficits.

The dichotic listening task was executed using the Alvin software package for speech perception research. The sound file consisted of consonant-vowel syllables composed of the six plosive consonants /b/, /d/, /g/, /p/, /t/, and /k/ paired with the vowel “ah”. The listeners heard the syllables over stereo headphones, and each listener set the loudness of the syllables to a comfortable level.

At the beginning of each listening session, the participant wrote down the date of the onset of her most recent menstrual period on a participant sheet identified by her participant number. She then heard the recorded syllables and indicated the consonant she heard by striking that key on the computer keyboard. Each listening session consisted of three presentations of the syllables, with a different randomization of the syllables for each presentation. In the first presentation, the stimuli were presented in a non-forced condition, in which the listener indicated the plosive that she heard most clearly. After the first presentation, the experimental files were presented in forced-left and forced-right conditions, in which the participant was directed to focus on the signal in the left or the right ear. The order of focusing on the left ear or the right ear was counterbalanced over the sessions.
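To make the session structure concrete, here is a simplified, hypothetical sketch of how the randomization and counterbalancing described above could be set up; the syllable spellings, the session logic, and the single pass through the six syllables are all assumptions for illustration and greatly compress the actual protocol.

    # Hypothetical sketch of per-session randomization and counterbalancing.
    import random

    syllables = ["ba", "da", "ga", "pa", "ta", "ka"]   # assumed spellings of the CV stimuli

    def session_plan(session_number: int):
        plan = {"non-forced": random.sample(syllables, len(syllables))}
        # Alternate which forced condition comes first from session to session.
        if session_number % 2:
            forced_order = ["forced-left", "forced-right"]
        else:
            forced_order = ["forced-right", "forced-left"]
        for condition in forced_order:
            plan[condition] = random.sample(syllables, len(syllables))
        return plan

    print(session_plan(session_number=1))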

The statistical analyses of the listeners’ responses revealed no significant differences between the women using oral contraceptives and those who were not. In addition, correlations between the day of the women’s menstrual cycle and their responses were consistently low. However, some patterns did emerge in the women’s responses across the experimental sessions, as opposed to the days of their menstrual cycle. The participants in both groups exhibited a higher REA and a lower percentage of errors in the final sessions than in the earlier sessions.

The results from the current participants differ from those previously reported. Possibly the larger sample size of the current study, the additional month of data collection, or the data recording method affected the results. The larger sample size might have better represented how most women respond to dichotic listening tasks. The additional month of data collection may have allowed the women to learn how to respond to the task and then respond in a more consistent manner. In earlier studies, the short data collection period may have confounded learning to respond to a novel task with a hormonally dependent response. Finally, previous studies had the experimenter record the subjects’ responses, a method that may have added bias to the data collection. Further studies with large samples and multiple months of data collection are needed to determine any effects of sex and oral contraceptive use on the REA.

5aMU1 – The inner ear as a musical instrument

Brian Connolly – bconnolly1987@gmail.com
Music Department
Logic House
South Campus
Maynooth University
Co. Kildare
Ireland

Popular version of paper 5aMU1, “The inner ear as a musical instrument”
Presented Friday morning, November 6, 2015, 8:30 AM, Grand Ballroom 2
170th ASA meeting Jacksonville
See also: The inner ear as a musical instrument – POMA

(please use headphones for listening to all audio samples)

Did you know that your ears could sing? You may be surprised to hear that they, in fact, have the capacity to be particularly good performers, and recent psychoacoustics research has revealed the ears’ true potential in musical creativity. ‘Psychoacoustics’ is loosely defined as the study of the perception of sound.

Figure 1: The Ear


A good performer can carry out required tasks reliably and without error. In many respects, the very straightforward nature of the ear’s responses to certain sounds makes it a very reliable performer: its behaviour can be predicted, and so it is easily controlled. Within the listening system, the inner ear can behave as a highly effective instrument that creates its own sounds, and many experimental musicians have been using these sounds to turn the listeners’ ears into participating performers in the realization of their music.

One of the most exciting avenues of musical creativity is the psychoacoustic phenomenon known as otoacoustic emissions. These are tones created within the inner ear when it is exposed to certain sounds. One example of these emissions is the ‘difference tone.’ When two clear frequencies enter the ear at, say, 1,000 Hz and 1,200 Hz, the listener hears these two tones, as expected, but the inner ear also creates its own third frequency at 200 Hz, because this is the mathematical difference between the two original tones. The ear literally sends a 200 Hz tone back out through the ear canal, and this sound can be detected by an in-ear microphone, a process that doctors use as an integral part of hearing tests on babies. This means that composers can create certain tones within their work and predict that the listeners’ ears will add this extra dimension to the music upon hearing it. Within certain loudness and frequency ranges, listeners will also be able to feel their ears buzzing in response to these stimulus tones! This makes for a very exciting new layer to contemporary music making and listening.

First listen to this tone. This is very close to the sound your ear will sing back during the second example.

Insert – 200.mp3

Here is the second sample, containing just two tones at 1,000 Hz and 1,200 Hz. See if you can also hear the very low, buzzing difference tone, which is not being sent into your ear; it is being created in your ear and sent back out towards your headphones!

Insert – 1000and1200.mp3
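For the curious, a two-tone stimulus like this one can be generated in a few lines of Python; the parameters and file name below are assumptions for illustration, not the author’s actual audio files.

    # Generate a 5-second stimulus with tones at 1,000 Hz and 1,200 Hz; the 200 Hz
    # difference tone is produced inside the listener's ear, not in this file.
    import numpy as np
    from scipy.io import wavfile

    fs = 44100
    t = np.arange(0, 5.0, 1 / fs)
    stimulus = 0.4 * np.sin(2 * np.pi * 1000 * t) + 0.4 * np.sin(2 * np.pi * 1200 * t)
    wavfile.write("1000and1200.wav", fs, stimulus.astype(np.float32))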

If you could hear the 200 Hz difference tone in the previous example, have a listen to this much more complex demonstration, which will make your ears sing a well-known melody. Try not to focus on the louder impulsive sounds and see if you can hear your ears humming along, performing the tune of Twinkle, Twinkle, Little Star at a much lower volume!

(NB: The difference tones will start after about 4 seconds of impulses)

Insert – Twinkle.mp3

Auditory beating is another phenomenon which has caught the interest of many contemporary composers. In the example below you will hear the following: 400 Hz in your left ear and 405 Hz in your right ear.

First play the sample below with only one earphone in at a time, not both together. You will hear two clear tones when you listen to them separately.

Insert – 400and405beating.mp3

Now see what happens when you place both earphones in simultaneously. You will be unable to hear the two tones separately. Instead, you will hear a fused tone that beats five times per second. This is because each of your ears sends electrical signals to the brain telling it what frequency it is responding to, but these two frequencies are so close together that a perceptual confusion occurs, and a combined tone is perceived that beats at a rate equal to the mathematical difference between the two tones.
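A stereo file like this one is equally easy to sketch in code; again, the parameters and file name are assumptions for illustration rather than the author’s actual material.

    # Generate a stereo file with 400 Hz in the left channel and 405 Hz in the right;
    # over headphones the two fuse into a tone beating about five times per second.
    import numpy as np
    from scipy.io import wavfile

    fs = 44100
    t = np.arange(0, 10.0, 1 / fs)
    left = 0.4 * np.sin(2 * np.pi * 400 * t)
    right = 0.4 * np.sin(2 * np.pi * 405 * t)
    wavfile.write("400and405beating.wav", fs, np.column_stack([left, right]).astype(np.float32))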

Auditory beating becomes particularly interesting in pieces of music written for surround-sound environments, where the listener’s proximity to the various speakers plays a key role; simply turning one’s head in these scenarios can entirely change the colour of the sound, as different layers of beating alter the overall timbre.

So how can all of this be meaningful to composers and listeners alike? The examples shown here are intended to be basic and to provide proofs of concept more than anything else. In the much more complex world of music composition, the scope for employing such material is seemingly endless. Considering the ear as a musical instrument gives the listener the opportunity to engage with sound and music in a more intimate way than ever before.

Brian Connolly’s compositions which explore such concepts in greater detail can be found at www.soundcloud.com/brianconnolly-1

1pABa2 – Could wind turbine noise interfere with Greater Prairie Chicken (Tympanuchus cupido pinnatus) courtship?

Edward J. Walsh – Edward.Walsh@boystown.org
JoAnn McGee – JoAnn.McGee@boystown.org
Boys Town National Research Hospital
555 North 30th St.
Omaha, NE 68131

Cara E. Whalen – carawhalen@gmail.com
Larkin A. Powell – lpowell3@unl.edu
Mary Bomberger Brown – mbrown9@unl.edu
School of Natural Resources
University of Nebraska-Lincoln
Lincoln, NE 68583

Popular version of paper 1pABa2, “Hearing sensitivity in the Greater Prairie Chicken”
Presented Monday afternoon, May 18, 2015
169th ASA Meeting, Pittsburgh

The Sand Hills ecoregion of central Nebraska is distinguished by rolling grass-stabilized sand dunes that rise up gently from the Ogallala aquifer. The aquifer itself is the source of widely scattered shallow lakes and marshes, some permanent and others that come and go with the seasons.

However, the sheer magnificence of this prairie isn’t its only distinguishing feature. Early on frigid, wind-swept, late-winter mornings, a low pitched hum, interrupted by the occasional dawn song of a Western Meadowlark (Sturnella neglecta) and other songbirds inhabiting the region, is virtually impossible to ignore.

Click here to listen to the hum

The hum is the chorus of the Greater Prairie Chicken (Tympanuchus cupido pinnatus), the communal expression of the courtship song of lekking male birds performing an elaborate testosterone-driven, foot-pounding ballet that will decide which males are selected to pass genes to the next generation; the word “lek” is the name of the so-called “booming” or courtship grounds where the birds perform their wooing displays.

While the birds cackle, whine, and whoop to defend territories and attract mates, it is the loud “booming” call, an integral component of the courtship display that attracts the interest of the bioacoustician – and the female prairie chicken.

The “boom” is an utterance that is carried long distances over the rolling grasslands and wetlands by a narrow band of frequencies ranging from roughly 270 to 325 cycles per second (Whalen et al., 2014). It lasts about 1.9 seconds and is repeated frequently throughout the morning courtship ritual. Usually, the display begins with a brief but energetic bout of foot stamping or dancing, which is followed by an audible tail flap that gives way to the “boom” itself.

Watch the video clip below to observe the courtship display

For the more acoustically and technologically inclined, Figure 1 shows a graphic representation of the pressure wave of a “boom,” along with its spectrogram (a visual representation of how the frequency content of the call changes during the course of the bout) and graphs depicting precisely where in the spectral domain the bulk of the acoustic power is carried. The “boom” is clearly dominated by very low frequencies centered on approximately 300 Hz (cycles per second).

FIGURE 1 (file missing): Acoustic Characteristics of the “BOOM”
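For readers who want to inspect a call this way themselves, the sketch below computes and plots a spectrogram with standard Python tools; the recording file name is hypothetical, and this is not the authors’ analysis code.

    # Plot a spectrogram of a "boom" recording to see the narrow band near 300 Hz.
    # Assumes a mono WAV file; the file name is made up for illustration.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy.io import wavfile
    from scipy.signal import spectrogram

    fs, boom = wavfile.read("boom.wav")
    f, t, Sxx = spectrogram(boom.astype(float), fs=fs, nperseg=2048)
    plt.pcolormesh(t, f, 10 * np.log10(Sxx + 1e-12))
    plt.ylim(0, 1000)                      # the call's energy sits near 300 Hz
    plt.xlabel("Time (s)")
    plt.ylabel("Frequency (Hz)")
    plt.show()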

Vocalization is, of course, only one side of the communication equation; knowing what these stunning birds can hear is the other. We are interested in what Greater Prairie Chickens can hear because wind energy developments are encroaching on their habitat, which makes us ask whether noise generated by wind turbines might mask vocal output and complicate communication between “booming” males and attending females.

Step number one in addressing this question is to determine what sounds the birds are capable of hearing – what their active auditory space looks like. The gold standard of hearing tests is behavioral in nature – you know, the ‘raise your hand or press this button if you can hear this sound’ kind of testing. However, this method isn’t very practical in a field setting; you can’t easily ask a Greater Prairie Chicken to raise its hand, or in this case its wing, when it hears the target sound.

To solve this problem, we turn to electrophysiology – to an evoked brain potential that is a measure of the electrical activity produced by the auditory parts of the inner ear and brain in response to sound. The specific test that we settled on is known as the ABR, the auditory brainstem response.

The ABR is a fairly remarkable response that captures much of the peripheral and central auditory pathway in action when short tone bursts are delivered to the animal. Within approximately 5 milliseconds following the presentation of a stimulus, the auditory periphery and brain produce a series of as many as five positive-going, highly reproducible electrical waves. These waves, or voltage peaks, more or less represent the sequential activation of primary auditory centers sweeping from the auditory nerve (the VIIIth cranial nerve), which transmits the responses of the sensory cells of the inner ear rostrally, through auditory brainstem centers toward the auditory cortex.

Greater Prairie Chickens included in this study were captured using nets that were placed on leks in the early morning hours. Captured birds were transported to a storage building that had been reconfigured into a remote auditory physiology lab where ABRs were recorded from birds positioned in a homemade, sound attenuating space – an acoustic wedge-lined wooden box.

FIGURE 2 (file missing): ABR Waveforms

The waveform of the Greater Prairie Chicken ABR closely resembles ABRs recorded from other birds – three prominent positive-going electrical peaks, and two smaller-amplitude waves that follow, are easily identified, especially at higher levels of stimulation. In Figure 2, ABR waveforms recorded from an individual bird in response to 2.8 kHz tone pips are shown in the left panel, and the group averages of all birds studied under the same stimulus conditions are shown in the right panel; the similarity of response waveforms from bird to bird, as indicated by the nearly imperceptible standard errors (shown in gray), testifies to the stability and utility of the tool. As stimulus level is lowered, ABR peaks decrease in amplitude and occur at later time points following stimulus onset.
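The group-average traces in Figure 2 come from straightforward averaging across birds. The sketch below shows the idea with made-up data standing in for the recorded waveforms; it is only an illustration of the computation, not the authors’ processing code.

    # Average ABR waveforms across birds and compute the standard error per time point.
    # 'abr' is a stand-in array of shape (n_birds, n_samples); real data would be loaded here.
    import numpy as np

    rng = np.random.default_rng(0)
    abr = rng.normal(size=(10, 500))
    grand_average = abr.mean(axis=0)
    standard_error = abr.std(axis=0, ddof=1) / np.sqrt(abr.shape[0])
    print(grand_average.shape, standard_error.shape)   # (500,) (500,)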

Since our goal was to determine if Greater Prairie Chickens are sensitive to sounds produced by wind turbines, we generated an audiogram based on level-dependent changes in ABRs representing responses to tone pips spanning much of the bird’s audiometric range (Figure 3). An audiogram is a curve representing the relationship between response threshold (i.e., the lowest stimulus level producing a clear response) and stimulus frequency; in this case, thresholds were averaged across all animals included in the investigation.

FIGURE 3 (file missing): Audiogram and wind turbine noise

As shown in Figure 3, the region of greatest hearing sensitivity is in the 1 to 4 kHz range, and thresholds increase (sensitivity is lost) rapidly at higher stimulus frequencies and more gradually at lower frequencies. Others have shown that ABR threshold values are approximately 30 dB higher than thresholds determined behaviorally in the budgerigar (Melopsittacus undulatus) (Brittan-Powell et al., 2002). So, to answer the question posed in this investigation, ABR threshold values were adjusted to estimate behavioral thresholds, and the resulting sensitivity curve was compared with the acoustic output of a wind turbine farm studied by van den Berg in 2006. The finding is clear: wind turbine noise falls well within the audible space of Greater Prairie Chickens occupying booming grounds in the acoustic footprint of active wind turbines.
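A back-of-the-envelope version of that comparison is sketched below with invented numbers, simply to make the arithmetic explicit: estimated behavioral thresholds are taken as ABR thresholds minus roughly 30 dB, and each is compared with a hypothetical turbine noise level at the same frequency.

    # Illustrative comparison only; the thresholds and noise levels are invented.
    abr_thresholds_db = {500: 55.0, 1000: 45.0, 2000: 40.0, 4000: 50.0}   # dB SPL (invented)
    turbine_noise_db = {500: 42.0, 1000: 35.0, 2000: 28.0, 4000: 20.0}    # dB SPL (invented)

    for freq_hz, abr_db in abr_thresholds_db.items():
        estimated_behavioral = abr_db - 30.0   # correction suggested by Brittan-Powell et al. (2002)
        audible = turbine_noise_db[freq_hz] > estimated_behavioral
        print(f"{freq_hz} Hz: estimated threshold {estimated_behavioral} dB, "
              f"turbine noise {turbine_noise_db[freq_hz]} dB, audible: {audible}")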

While findings reported here indicate that Greater Prairie Chickens are sensitive to at least a portion of wind turbine acoustic output, the next question that we plan to address will be more difficult to answer: Does noise propagated from wind turbines interfere with vocal communication among Greater Prairie Chickens courting one another in the Nebraska Sand Hills? Efforts to answer that question are in the works.

tags: chickens, mating, courtship, hearing, Nebraska, wind turbines

References
Brittan-Powell, E.F., Dooling, R.J. and Gleich, O. (2002). Auditory brainstem responses in adult budgerigars (Melopsittacus undulatus). J. Acoust. Soc. Am. 112:999-1008.
van den Berg, G.P. (2006). The sound of high winds. The effect of atmospheric stability on wind turbine sound and microphone noise. Dissertation, Groningen University, Groningen, The Netherlands.
Whalen, C., Brown, M.B., McGee, J., Powell, L.A., Smith, J.A. and Walsh, E.J. (2014). The acoustic characteristics of greater prairie-chicken vocalizations. J. Acoust. Soc. Am. 136:2073.