Edward J. Walsh – Edward.Walsh@boystown.org
JoAnn McGee – JoAnn.McGee@boystown.org
Boys Town National Research Hospital
555 North 30th St.
Omaha, NE 68131

Cara E. Whalen – carawhalen@gmail.com
Larkin A. Powell – lpowell3@unl.edu
Mary Bomberger Brown – mbrown9@unl.edu
School of Natural Resources
University of Nebraska-Lincoln
Lincoln, NE 68583

Popular version of paper 1pABa2
Presented Monday afternoon, May 18, 2015
169th ASA Meeting, Pittsburgh

The Sand Hills ecoregion of central Nebraska is distinguished by rolling grass-stabilized sand dunes that rise up gently from the Ogallala aquifer. The aquifer itself is the source of widely scattered shallow lakes and marshes, some permanent and others that come and go with the seasons.
However, the sheer magnificence of this prairie isn’t its only distinguishing feature. Early on frigid, wind-swept, late-winter mornings, a low-pitched hum, interrupted by the occasional dawn song of a Western Meadowlark (Sturnella neglecta) and other songbirds inhabiting the region, is virtually impossible to ignore.

The hum is the chorus of the Greater Prairie Chicken (Tympanuchus cupido pinnatus), the communal expression of the courtship song of lekking male birds performing an elaborate testosterone-driven, foot-pounding ballet that will decide which males are selected to pass genes to the next generation. The word “lek” is the name of the so-called “booming” or courtship grounds where the birds perform their wooing displays.
While the birds cackle, whine, and whoop to defend territories and attract mates, it is the loud “booming” call, an integral component of the courtship display, that attracts the interest of the bioacoustician – and the female prairie chicken.

The “boom” is an utterance that is carried long distances over the rolling grasslands and wetlands by a narrow band of frequencies ranging from roughly 270 to 325 cycles per second (Whalen et al., 2014). It lasts about 1.9 seconds and is repeated frequently throughout the morning courtship ritual.
Usually, the display begins with a brief but energetic bout of foot stamping or dancing, which is followed by an audible tail flap that gives way to the “boom” itself.

For the more acoustically and technologically inclined, a graphic representation of the pressure wave of a “boom,” along with its spectrogram (a visual representation showing how the frequency content of the call changes during the course of the bout) and graphs depicting precisely where in the spectral domain the bulk of the acoustic power is carried, is shown in Figure 1. The “boom” is clearly dominated by very low frequencies that are centered on approximately 300 Hz (cycles per second).
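For readers who want to reproduce this kind of analysis, the spectrogram and power spectrum shown in Figure 1 can be approximated with standard signal-processing tools. The sketch below is illustrative only – it uses a synthetic 300 Hz tone as a stand-in for a recorded “boom,” and the sample rate is an assumption, not the one used in the study.

```python
import numpy as np
from scipy import signal

fs = 8000  # sample rate in Hz (assumed for illustration)
t = np.arange(0, 1.9, 1 / fs)  # a "boom" lasts about 1.9 s

# Stand-in for a recorded call: a 300 Hz tone with a little noise.
boom = np.sin(2 * np.pi * 300 * t) + 0.05 * np.random.randn(t.size)

# Spectrogram: how the frequency content evolves over the call.
f_spec, t_spec, Sxx = signal.spectrogram(boom, fs, nperseg=1024)

# Power spectral density: where the acoustic power is concentrated.
f_psd, Pxx = signal.welch(boom, fs, nperseg=2048)
peak_hz = f_psd[np.argmax(Pxx)]
print(f"Dominant frequency: {peak_hz:.0f} Hz")  # near 300 Hz
```

With a real recording in place of the synthetic tone, the same two calls would reproduce the spectrogram and power-spectrum panels of Figure 1.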

Vocalization is, of course, only one side of the communication equation. Knowing what these stunning birds can hear is the other.
We are interested in what Greater Prairie Chickens can hear because wind energy developments are encroaching onto their habitat, a condition that makes us question whether noise generated by wind turbines might have the capacity to mask vocal output and complicate communication between “booming” males and attending females.
Step number one in addressing this question is to determine what sounds the birds are capable of hearing – what their active auditory space looks like. The gold standard of hearing tests is behavioral in nature – you know, the ‘raise your hand or press this button if you can hear this sound’ kind of testing. However, this method isn’t very practical in a field setting; you can’t easily ask a Greater Prairie Chicken to raise its hand, or in this case its wing, when it hears the target sound.
To solve this problem, we turn to electrophysiology – to an evoked brain potential that is a measure of the electrical activity produced by the auditory parts of the inner ear and brain in response to sound. The specific test that we settled on is known as the ABR, the auditory brainstem response.
The ABR is a fairly remarkable response that captures much of the peripheral and central auditory pathway in action when short tone bursts are delivered to the animal. Within approximately 5 milliseconds following the presentation of a stimulus, the auditory periphery and brain produce a series of as many as five positive-going, highly reproducible electrical waves. These waves, or voltage peaks, more or less represent the sequential activation of primary auditory centers sweeping from the auditory nerve (the VIIIth cranial nerve), which transmits the responses of the sensory cells of the inner ear rostrally, through auditory brainstem centers toward the auditory cortex.
Greater Prairie Chickens included in this study were captured using nets that were placed on leks in the early morning hours. Captured birds were transported to a storage building that had been reconfigured into a remote auditory physiology lab, where ABRs were recorded from birds positioned in a homemade, sound-attenuating space – an acoustic wedge-lined wooden box.

The waveform of the Greater Prairie Chicken ABR closely resembles ABRs recorded from other birds – three prominent positive-going electrical peaks, and two smaller amplitude waves that follow, are easily identified, especially at higher levels of stimulation. In Figure 2, ABR waveforms recorded from an individual bird in response to 2.8 kHz tone pips are shown in the left panel, and the group averages of all birds studied under the same stimulus conditions are shown in the right panel; the similarity of response waveforms from bird to bird, as indicated by the nearly imperceptible standard errors (shown in gray), testifies to the stability and utility of the tool. As stimulus level is lowered, ABR peaks decrease in amplitude and occur at later time points following stimulus onset.
Since our goal was to determine if Greater Prairie Chickens are sensitive to sounds produced by wind turbines, we generated an audiogram based on level-dependent changes in ABRs representing responses to tone pips spanning much of the bird’s audiometric range (Figure 3). An audiogram is a curve representing the relationship between response threshold (i.e., the lowest stimulus level producing a clear response) and stimulus frequency; in this case, thresholds were averaged across all animals included in the investigation.

As shown in Figure 3, the region of greatest hearing sensitivity is in the 1 to 4 kHz range, and thresholds increase (sensitivity is lost) rapidly at higher stimulus frequencies and more gradually at lower frequencies. Others have shown that ABR threshold values are approximately 30 dB higher than thresholds determined behaviorally in the budgerigar (Melopsittacus undulatus) (Brittan-Powell et al., 2002). So, to answer the question posed in this investigation, ABR threshold values were adjusted to estimate behavioral thresholds, and the resulting sensitivity curve was compared with the acoustic output of a wind turbine farm studied by van den Berg in 2006. The finding is clear: wind turbine noise falls well within the audible space of Greater Prairie Chickens occupying booming grounds in the acoustic footprint of active wind turbines.
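The threshold adjustment described above is simple arithmetic: subtract the roughly 30 dB ABR-to-behavioral offset reported for the budgerigar and compare the result against the noise level at each frequency. The numbers below are hypothetical, purely to illustrate the bookkeeping – they are not the study’s measured thresholds.

```python
# Illustrative only: hypothetical ABR thresholds (dB SPL) by frequency (Hz).
abr_thresholds = {500: 55, 1000: 40, 2000: 35, 4000: 45, 8000: 70}

ABR_OFFSET_DB = 30  # ABR thresholds run ~30 dB above behavioral ones

# Estimated behavioral sensitivity curve.
behavioral = {f: level - ABR_OFFSET_DB for f, level in abr_thresholds.items()}

# A hypothetical turbine noise level at one frequency; the noise is
# audible if it exceeds the estimated behavioral threshold there.
turbine_level_db, freq = 40, 1000
audible = turbine_level_db > behavioral[freq]
print(audible)
```

The study’s actual comparison works the same way, frequency by frequency, against van den Berg’s measured turbine spectrum.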
While findings reported here indicate that Greater Prairie Chickens are sensitive to at least a portion of wind turbine acoustic output, the next question that we plan to address will be more difficult to answer: Does noise propagated from wind turbines interfere with vocal communication among Greater Prairie Chickens courting one another in the Nebraska Sand Hills? Efforts to answer that question are in the works.

Presentation #1pABa2 “Hearing sensitivity in the Greater Prairie Chicken” by Edward J. Walsh, Cara Whalen, Larkin Powell, Mary B. Brown, and JoAnn McGee will take place on Monday, May 18, 2015, at 1:15 PM in the Rivers room at the Wyndham Grand Pittsburgh Downtown Hotel. The abstract can be found by searching for the presentation number here:

tags: chickens, mating, courtship, hearing, Nebraska, wind turbines

Brittan-Powell, E.F., Dooling, R.J. and Gleich, O. (2002). Auditory brainstem responses in adult budgerigars (Melopsittacus undulatus). J. Acoust. Soc. Am. 112:999-1008.
van den Berg, G.P. (2006). The sound of high winds. The effect of atmospheric stability on wind turbine sound and microphone noise. Dissertation, Groningen University, Groningen, The Netherlands.
Whalen, C., Brown, M.B., McGee, J., Powell, L.A., Smith, J.A. and Walsh, E.J. (2014). The acoustic characteristics of greater prairie-chicken vocalizations. J. Acoust. Soc. Am. 136:2073.

2pED – Sound education for the deaf and hard of hearing – Cameron Vongsawad, Mark Berardi, Kent Gee, Tracianne Neilsen, Jeannette Lawler

Sound education for the deaf and hard of hearing

Cameron Vongsawad – cvongsawad@byu.edu
Mark Berardi – markberardi12@gmail.com
Kent Gee – kentgee@physics.byu.edu
Tracianne Neilsen – tbn@byu.edu
Jeannette Lawler – jeannette_lawler@physics.byu.edu
Department of Physics & Astronomy
Brigham Young University
Provo, Utah 84602

Popular version of paper 2pED, “Development of an acoustics outreach program for the deaf.”
Presented Tuesday Afternoon, May 19, 2015, 1:45 pm, Commonwealth 2
169th ASA Meeting, Pittsburgh

The deaf and hard of hearing have less intuition about sound but are no strangers to the effects of pressure, vibrations, and other basic acoustical principles. Brigham Young University recently expanded its “Sounds to Astound” outreach program (sounds.byu.edu) and developed an acoustics demonstration program for visiting deaf students. The program was designed to help the students connect with a wide variety of acoustical principles through highly visual and kinesthetic demonstrations of sound, as well as by using the students’ primary language, American Sign Language (ASL).

In science education, the “Hear and See” methodology (Beauchamp 2005) has been shown to be an effective teaching tool in assisting students to internalize new concepts. This sensory-focused approach can be applied to a deaf audience in a different way, the “See and Feel” method. In both, whenever possible students participate in demonstrations to experience the physical principle being taught.

In developing the “See and Feel” approach, a fundamental consideration was to select the principles of sound that were easily communicated using words that exist and are commonly used in ASL. For example, the word “pressure” is common, while the word “wave” is uncommon. Additionally, the sign for “wave” is closely associated with a water wave, which could lead to confusion about the nature of sound as a longitudinal wave. In the absence of an ASL sign for “resonance,” the nature of sound was taught by focusing on the signs for “vibration” and “pressure.” Additional vocabulary – e.g., mode, amplitude, node, antinode, and wave propagation – was presented using classifiers (non-lexical visualizations of gestures and hand shapes) and by finger spelling the words (Scheetz 2012).

Two bilingual teaching approaches were tried to make ASL the primary instruction language while also enabling communication among the demonstrators. In the first approach, the presenter used ASL and spoken English simultaneously. In the second approach, the presenter used only ASL and other interpreters provided the spoken English translation. The second approach proved to be more effective for both the audience and the presenters because it allowed the presenter to focus on describing the principles in the native framework of ASL, resulting in a better presentation flow for the deaf students.

In addition to the tabletop demonstrations (illustrated in the figures), the students were also able to feel sound in BYU’s reverberation chamber as a large subwoofer was operated at resonance frequencies of the room. The students were invited to walk around the room to find where the vibrations felt weakest. In doing so, the students mapped the nodal lines of the wave patterns in the room. The participants also enjoyed standing in the corners of the room, where the sound pressure is eight times as strong, and feeling the power of sound vibrations.
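For the curious, both effects in this demonstration follow from the physics of standing waves in a rectangular room. The sketch below computes room-mode frequencies and the corner pressure gain; the room dimensions are assumed for illustration and are not the dimensions of BYU’s actual chamber.

```python
import math
from itertools import product

C = 343.0  # speed of sound in air, m/s
Lx, Ly, Lz = 6.0, 5.0, 4.0  # assumed chamber dimensions (m)

def mode_frequency(nx, ny, nz):
    """Resonance frequency of the rectangular-room mode (nx, ny, nz)."""
    return (C / 2) * math.sqrt((nx / Lx) ** 2 + (ny / Ly) ** 2 + (nz / Lz) ** 2)

# A few of the lowest room modes a subwoofer could excite.
modes = sorted(
    (mode_frequency(nx, ny, nz), (nx, ny, nz))
    for nx, ny, nz in product(range(3), repeat=3)
    if (nx, ny, nz) != (0, 0, 0)
)[:5]
for f, m in modes:
    print(f"mode {m}: {f:.1f} Hz")

# Pressure doubles at each rigid boundary, so a corner (three surfaces)
# sees 2**3 = 8 times the pressure amplitude:
corner_gain_db = 20 * math.log10(8)
print(f"corner gain: {corner_gain_db:.1f} dB")  # about 18 dB
```

The eightfold corner pressure thus corresponds to roughly an 18 dB boost, which is why standing in the corner is so dramatic.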

The experience of sharing acoustics with the deaf and hard of hearing has been remarkable. We have learned a few lessons about what does and doesn’t work well with regard to the ASL communication, visual instruction, and accessibility of the demos to all participants. Clear ASL communication is key to the success of the event. As described above, it is more effective if the main presenter communicates with ASL and someone else, who understands ASL and physics, provides a verbal interpretation for non-ASL volunteers. Having a fair ratio of interpreters to participants provides an individual voice for each person in attendance throughout the event. Another important consideration is that the ASL presenter needs to be visible to all students at all times. Extra thought is required to illuminate the presenter when the demonstrations require low lighting for maximum visual effect.

Because most of the demonstrations traditionally rely on the perception of sound, care must be taken to provide visual instruction about the vibrations for hearing-impaired participants (Lang 1973, 1981). This required the presenters to think creatively about how to modify demos. Dividing students into smaller groups (3-4 students) allows each student to interact with the demonstrations more closely (Vongsawad 2014). This hands-on approach improves the students’ ability to “See & Feel” the principles of sound being illustrated in the demonstrations and to benefit more fully from the event.

While the students were a bit hesitant at first, by the end of the event they were participating more freely, asking questions, and excited about what they had learned. They left with a better understanding of the principles of acoustics and of how sound affects their lives. The primary benefit, however, was providing opportunities for deaf children to see that resources exist at universities for them to succeed in higher education.

We would like to acknowledge support for this work from a National Science Foundation Grant (IIS-1124548) and from the Sorensen Impact Foundation. The visiting students also took part in a research project to develop a technology referred to as “Signglasses” – head-mounted artificial reality displays that could be used to help deaf and hard of hearing students better participate in planetarium shows. We also appreciate the support from the Acoustical Society of America in the development of BYU’s student chapter outreach program, “Sounds to Astound.” This work could not have been completed without the help of the Jean Massieu School of the Deaf in Salt Lake City, Utah.

This video demonstrates the use of ASL as the primary means of communication for students. Communication in their native language improved understanding.

Figure 1: Vibrations on a string were made to appear “frozen” in time by matching the frequency of a strobe light to the frequency of oscillation, which enhanced the ability of students to analyze the wave properties visually.

Figure 2: The Rubens tube is another classic physics and acoustics demonstration, showing resonance in a pipe. As with the vibrations on a string, a standing wave is made visible – this time driven directly by sound. A speaker is attached to one end of a tube filled with propane, and the escaping gas, once lit, burns in flames whose heights trace the pressure variations of the sound wave inside the tube. Here students are able to visualize a variety of sound properties.

Figure 3: Free spectrum analyzer and oscilloscope software was used to visualize the properties of sound broken into its constituent parts. Students were encouraged to make sounds by clapping, snapping, or using a tuning fork or their voice, and were able to see that sounds made in different ways have different features. It was significant for the hearing-impaired students to see that the noises they made looked similar to everyone else’s.

Figure 4: A loudspeaker driven at a frequency of 40 Hz was used to first make a candle flame flicker and then blow out as the loudness was increased to demonstrate the power of sound traveling as a pressure wave in the air.

Figure 5: A surface vibration loudspeaker placed on a table was another effective way for the students to feel sound. Some students placed the loudspeaker on their heads for an even more personal experience with sound.

Figure 6: Pond foggers use high frequency and high amplitude sound to turn water into fog, or cold water vapor. This demonstration gave students the opportunity to see and feel how powerful sound or vibrations can be. They could also put their fingers close to the fogger and feel the vibrations in the water.

Tags: education, deafness, language


Michael S. Beauchamp, “See me, hear me, touch me: Multisensory integration in lateral occipital-temporal cortex,” Cognitive Neuroscience: Current Opinion in Neurobiology 15, 145-153 (2005).

N. A. Scheetz, Deaf Education in the 21st Century: Topics and Trends (Pearson, Boston, 2012) pp. 152-62.

Cameron T. Vongsawad, Tracianne B. Neilsen, and Kent L. Gee, “Development of educational stations for Acoustical Society of America outreach,” Proc. Mtgs. Acoust. 20, 025003 (2014).

Harry G. Lang, “Teaching Physics to the Deaf,” Phys. Teach. 11, 527 (September 1973).

Harry G. Lang, “Acoustics for deaf physics students,” Phys. Teach. 11, 248 (April 1981).

3aSA11 – Hollow vs. Foam-filled racket: Feel-good vibrations – Kritika Vayur, Dr. Daniel A. Russell

Hollow vs. Foam-filled racket: Feel-good vibrations

Kritika Vayur – kuv126@psu.edu
Dr. Daniel A. Russell – dar119@psu.edu

Pennsylvania State University
201 Applied Science Building
State College, PA, 16802

Popular version of paper 3aSA11, “Vibrational analysis of hollow and foam-filled graphite tennis rackets”
Presented Wednesday morning, May 20, 2015, 11:15 AM in room Kings 3
169th ASA Meeting, Pittsburgh

Tennis Rackets and Injuries
The typical modern tennis racket has a light-weight, hollow graphite frame with a large head. Though these rackets are easier to swing, there seems to be an increase in the number of players experiencing injuries commonly known as “tennis elbow”. Recently, even notable professional players such as Rafael Nadal, Victoria Azarenka, and Novak Djokovic have withdrawn from tournaments because of wrist, elbow or shoulder injuries.
A new solid foam-filled graphite racket design claims to reduce the risk of injury. Previous testing has suggested that these foam-filled rackets are less stiff and damp vibrations more than hollow rackets, thus reducing the risk of injury and the shock delivered to the arm of the player [1]. Figure 1 shows cross-sections of the handles of hollow and foam-filled versions of the same model racket.
The preliminary study reported in this paper was an attempt to identify the vibrational characteristics that might explain why foam-filled rackets improve feel and reduce risk of injury.
Figure 1: Cross-section of the handle of a foam-filled racket (left) and a hollow racket (right).
Damping Rates

The first vibrational characteristic we set out to identify was the damping associated with the first few bending and torsional vibrations of the racket frame. A higher damping rate means the unwanted vibration dies away faster and results in a less painful vibration delivered to the hand, wrist, and arm. Previous research on handheld sports equipment (baseball and softball bats and field hockey sticks) has demonstrated that bats and sticks with higher damping feel better and minimize painful sting [2,3,4].

We measured the damping rates of 20 different tennis rackets by suspending each racket from the handle with rubber bands, striking the racket frame in the head region, and measuring the resulting vibration at the handle using an accelerometer. Damping rates were obtained from the frequency response of the racket using a frequency analyzer. We note that suspending the racket from rubber bands is a free boundary condition, but other research has shown that this free boundary condition more closely reproduces the vibrational behavior of a hand-held racket than does a clamped-handle condition [5,6].
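One common way to extract a damping rate from a frequency response measured this way is the half-power (−3 dB) bandwidth method. The sketch below applies it to a synthetic single-resonance response; it illustrates the general technique, not necessarily the exact procedure implemented in the frequency analyzer used in this study.

```python
import numpy as np

def damping_ratio_half_power(freqs, magnitude):
    """Estimate modal damping from one FRF peak via the half-power
    (-3 dB) bandwidth: zeta ~= (f2 - f1) / (2 * f_peak)."""
    i_pk = np.argmax(magnitude)
    half_power = magnitude[i_pk] / np.sqrt(2)
    above = magnitude >= half_power
    f1 = freqs[np.argmax(above)]                          # first point above
    f2 = freqs[len(above) - 1 - np.argmax(above[::-1])]   # last point above
    return (f2 - f1) / (2 * freqs[i_pk])

# Synthetic single-degree-of-freedom resonance for illustration.
f = np.linspace(100, 200, 2000)
f0, zeta_true = 150.0, 0.02
H = 1 / np.sqrt((1 - (f / f0) ** 2) ** 2 + (2 * zeta_true * f / f0) ** 2)

zeta_est = damping_ratio_half_power(f, H)
print(f"estimated damping ratio: {zeta_est:.3f}")  # close to 0.02
```

A wider −3 dB bandwidth around a resonance peak means higher damping, i.e., a vibration that decays faster.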

Measured damping rates for the first bending mode, shown in Fig. 2, indicate no difference between the damping and decay rates for hollow and foam-filled graphite rackets. Similar results were obtained for other bending and torsional modes. This result suggests that the benefit of or preference for foam-filled rackets is not due to a higher damping that could cause unwanted vibrations to decay more quickly.

Figure 2: Damping rates of the first bending mode for 20 rackets, hollow (open circles) and foam-filled (solid squares). A higher damping rate means the vibration will have a lower amplitude and will decay more quickly.

Vibrational Mode Shapes and Frequencies
Experimental modal analysis is a common method to determine how the racket vibrates with various mode shapes at its resonance frequencies [7]. In this experiment, two rackets were tested, a hollow and a foam-filled racket of the same make and model. Both rackets were freely suspended by rubber bands, as shown in Fig. 3. An accelerometer, fixed at one location, measured the vibrational response to a force hammer impact at each of approximately 180 locations around the frame and strings of the racket. The resulting frequency response functions for each impact location were post-processed with modal analysis software to extract vibrational mode shapes and resonance frequencies. An example of the vibrational mode shapes for a hollow graphite tennis racket may be found on Dr. Russell’s website.

Figure 3: Modal analysis set up for a freely suspended racket.

Figure 4 compares the first and third bending modes and the first torsional mode for a hollow and a foam-filled racket. The only difference between the two rackets is that one was hollow and the other was foam-filled. In the figure, the pink and green regions represent motion in opposite directions, and the white regions indicate regions, called nodes, where no vibration occurs. The sweet spot of a tennis racket is often identified as being at the center of the nodal line of the first bending mode shape in the head region [8]. An impact from an incoming ball at this location results in zero vibration at the handle, and therefore a better “feel” for the player. The data in Fig. 4 show that there are very few differences between the mode shapes of the hollow and foam-filled rackets. The frequencies at which the mode shapes for the foam-filled rackets occur are slightly higher than those of the hollow rackets, but the differences in shape are negligible between the two types.

Figure 4: Contour maps representing the out-of-plane vibration amplitude for the first bending (left), first torsional (middle), and third bending (right) modes for a hollow (top) and a foam-filled racket (bottom) of the same make and model.


This preliminary study shows that damping rates for this particular design of foam-filled rackets are not higher than those of hollow rackets. The modal analysis gives a closer, yet non-conclusive, look at the intrinsic properties of the hollow and foam-filled rackets. The benefit of using this racket design is perhaps related instead to the impact shock delivered to the player, but additional testing is needed to verify this conjecture.

Tags: tennis, vibrations, graphite, design
[1] Ferrara, L., & Cohen, A. (2013). A mechanical study on tennis racquets to investigate design factors that contribute to reduced stress and improved vibrational dampening. Procedia Engineering, 60, 397-402.
[2] Russell D.A. (2012). Vibration damping mechanisms for the reduction of sting in baseball bats. In 164th meeting of the Acoustical Society of America, Kansas City, MO, Oct 22-26. Journal of Acoustical Society of America, 132(3) Pt.2, 1893.
[3] Russell, D.A. (2012). Flexural vibration and the perception of sting in hand-held sports implements. In Proceedings of InterNoise 2012, August 19-22, New York City, NY.
[4] Russell, D.A. (2006). Bending modes, damping, and the sensation of sting in baseball bats. In Proceedings 6th IOMAC Conference, 1, 11-16.
[5] Banwell, G.H., Roberts, J.R., & Halkon, B.J. (2014). Understanding the dynamic behavior of a tennis racket under play conditions. Experimental Mechanics, 54, 527-537.
[6] Kotze, J., Mitchell, S.R., & Rothberg, S.J. (2000). The role of the racket in high-speed tennis serves. Sports Engineering, 3, 67-84.
[7] Schwarz, B.J., & Richardson, M.H. (1999). Experimental modal analysis. CSI Reliability Week, 35(1), 1-12.
[8] Cross, R. (2004). Center of percussion of hand-held implements. American Journal of Physics, 72, 622-630.

2aNSa – Soundscapes and human restoration in green urban areas – Irene van Kamp, Elise van Kempen, Hanneke Kruize, Wim Swart

Soundscapes and human restoration in green urban areas
Irene van Kamp, (irene.van.kamp@rivm.nl)
Elise van Kempen,
Hanneke Kruize,
Wim Swart
National Institute for Public Health and the Environment
P.O. Box 1, Postvak 10
Phone +31629555704

Popular version of paper in session 2aNSa, “Soundscapes and human restoration in green urban areas”
Presented Tuesday morning, May 19, 2015, 9:35 AM, Commonwealth 1
169th ASA Meeting, Pittsburgh

Worldwide there is a revival of interest in the positive effects of landscapes, green and blue space, and open countryside on human well-being, quality of life, and health, especially for urban dwellers. However, most studies do not account for the influence of the acoustic environment in these spaces, whether negative or positive. One of the few studies in the field, by Kang and Zhang (2010), identified relaxation, communication, dynamics, and spatiality as the key factors in the evaluation of urban soundscapes. Remarkably, they found that the general public and urban designers value public space very differently: the designers had a much stronger preference for natural sounds and green spaces than the lay observers.

Do we as professionals tend to exaggerate the value of green space, and what characteristics of urban green space are key to health, well-being, and restoration? And what role do the acoustic quality and the accompanying social quality play in this? In his famous studies on livable streets, Donald Appleyard concluded that in streets with heavy traffic, the number of contacts with friends and acquaintances, and the amount of social interaction in general, was much lower. People in busy streets also tended to describe their environment as much smaller than their counterparts in quiet streets did. In other words, acoustic quality affects not only our well-being and behavior but also our sense of territory, social cohesion, and social interactions. And this concerns all of us: citing Appleyard, “nearly everyone in the world lives in a street.”

There is evidence that green or natural areas – wilderness, or urban environments with natural elements – as well as areas with high sound quality can intrinsically provide restoration through time spent there. Even merely knowing that such quiet and green places are available seems to buffer the relationship between stress and health (Van Kamp, Klaeboe, Brown, and Lercher, 2015; in Kang and Schulte-Fortkamp (Eds.), in press).

Recently, a European study was performed on the health effects of access to and use of green areas in four European cities of varying size in Spain, the UK, the Netherlands, and Lithuania.

At the four study centers, people were selected from neighborhoods with varying levels of socioeconomic status and green and blue space. By means of a structured interview, information was gathered about the availability, use, and importance of green space in the immediate environment, as well as the sound quality of favorite green areas used for physical activity, social encounters, and relaxation. Data are also available on perceived mental and physical health and medication use. This allowed us to analyze the association between indicators of green space, restoration, and health, while accounting for perceived soundscapes in more detail.

In general, four mechanisms are assumed to lead from green and tranquil space to health: via physical activity, via social interactions, via relaxation, and finally via reduced levels of traffic-related air and noise pollution. This paper explores the role of sound in the process that leads from access to and use of green space to restoration and health; so far, this aspect has been understudied. There is some indication that certain areas contribute to restoration more than others. Most studies address the restorative effects of natural recreational areas outside the urban environment. The question is whether natural areas within, and in the vicinity of, urban areas contribute to psycho-physiological and mental restoration after stress as well. Does restoration require the absence of urban noise?



Example of an acoustic environment – a New York City Park – with potential restorative outcomes (Photo: A.L. Brown)

Tags: health, soundscapes, people, environment, green, urban

3aSPb5 – Improving Headphone Spatialization: Fixing a problem you’ve learned to accept

Improving Headphone Spatialization: Fixing a problem you’ve learned to accept

Muhammad Haris Usmani – usmani@cmu.edu
Ramón Cepeda Jr. – rcepeda@andrew.cmu.edu
Thomas M. Sullivan – tms@ece.cmu.edu
Bhiksha Raj – bhiksha@cs.cmu.edu
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Popular version of paper 3aSPb5, “Improving headphone spatialization for stereo music”
Presented Wednesday morning, May 20, 2015, 10:15 AM, Brigade room
169th ASA Meeting, Pittsburgh

The days of grabbing a drink, brushing the dust from your favorite record, and playing it in the listening room of the house are long gone. Today, with the portability technology has enabled, almost everybody listens to music on headphones. However, most commercially produced stereo music is mixed and mastered for playback on loudspeakers – this presents a problem for the growing number of headphone listeners. When a legacy stereo mix is played on headphones, all instruments or voices in the piece are placed between the listener’s ears, inside the head. This is not only unnatural and fatiguing for the listener, but also detrimental to the original placement of the instruments in the musical piece. It disturbs the spatialization of the music and makes the sound image appear as three isolated lobes inside the listener’s head [1]; see Figure 1.


Hard-panned instruments separate into the left and right lobes, while instruments placed at center stage are heard in the center of the head. However, because hearing is a dynamic process that adapts and settles with the perceived sound, we have come to accept that headphones sound this way [2].
In order to improve the spatialization of headphones, the listener’s ears must be deceived into thinking that they are listening to the music inside of a listening room. When playing music in a room, the sound travels through the air, reverberates inside the room, and interacts with the listener’s head and torso before reaching the ears [3]. These interactions add the necessary psychoacoustic cues for perception of an externalized stereo soundstage presented in front of the listener. If this listening room is a typical music studio, the soundstage perceived is close to what the artist intended. Our work tries to place the headphone listener into the sound engineer’s seat inside a music studio to improve the spatialization of music. For the sake of compatibility across different headphones, we try to make minimal changes to the mastering equalization curve of the music.
Because there is a trade-off between sound quality and the degree of spatialization that can be presented, we developed three systems that strike this compromise differently, labeled Type-I, Type-II, and Type-0. Type-I focuses on improving spatialization at the cost of some sound quality; Type-II improves spatialization while ensuring that sound quality is not degraded too much; and Type-0 refines conventional listening by making the sound image more homogeneous. Since sound quality is key in music, we will skip over Type-I and focus on the other two systems.
Type-II consists of a head-related transfer function (HRTF) model [4], room reverberation (synthesized reverb [5]), and a spectral correction block. HRTFs embody all of the complex spatialization cues that arise from the relative positions of the listener and the source [6]. In our case, a general HRTF model is configured to place the listener at the “sweet spot” in the studio (right and left speakers placed at an angle of 30° from the listener’s head). The spectral correction attempts to keep the original mastering equalization curve as intact as possible.
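The signal chain described above can be sketched as a binaural renderer: each ear hears both virtual speakers, with the near-ear and far-ear paths modeled by a pair of HRIRs. This is only an illustrative sketch, not the authors' implementation; the impulse responses (`hrir_ipsi`, `hrir_contra`, `reverb_ir`) are placeholders the caller would supply, and a symmetric head is assumed so one HRIR pair serves both ±30° speakers.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_virtual_speakers(left, right, hrir_ipsi, hrir_contra, reverb_ir=None):
    """Binaural rendering of a stereo mix over two virtual speakers at +/-30 deg.

    hrir_ipsi / hrir_contra model the near- and far-ear paths from one
    speaker; all impulse responses here are caller-supplied placeholders.
    """
    n = len(left)
    conv = lambda x, ir: fftconvolve(x, ir)[:n]  # truncate tails to input length
    # Each ear hears both speakers: near (ipsilateral) and far (contralateral).
    ear_l = conv(left, hrir_ipsi) + conv(right, hrir_contra)
    ear_r = conv(right, hrir_ipsi) + conv(left, hrir_contra)
    if reverb_ir is not None:  # add synthesized room reverberation
        ear_l = ear_l + conv(left + right, reverb_ir)
        ear_r = ear_r + conv(left + right, reverb_ir)
    out = np.stack([ear_l, ear_r])
    return out / np.max(np.abs(out))  # peak-normalize to avoid clipping
```

A spectral correction stage, as in the paper, would follow this rendering to restore the original mastering equalization curve.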
Type-0 is made up of a side-content crossfeed block and a spectral correction block. Some headphone amplifiers crossfeed the left and right channels to model the fact that, when listening through loudspeakers, each ear hears both speakers, with a delay on the sound arriving from the farther speaker. A shortcoming of conventional crossfeed is that the delay must be kept small to avoid comb filtering [7]. Side-content crossfeed resolves this by crossfeeding only the content that is unique to each channel, allowing larger delays. In this system, the side content is extracted with a stereo-to-3 upmixer, implemented as a novel extension of Nikunen et al.’s upmixer [8].
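The difference between the two crossfeed schemes can be sketched in a few lines. This is a simplified illustration: the paper extracts side content with a stereo-to-3 upmixer, whereas here a plain mid/side split stands in for it, and the gains and sample delays are invented for demonstration.

```python
import numpy as np

def delay(x, n):
    """Delay signal x by n samples, zero-padded, keeping the original length."""
    return np.concatenate([np.zeros(n), x])[: len(x)]

def conventional_crossfeed(l, r, gain=0.3, d=9):
    # The whole opposite channel is crossfed; d must stay small
    # (~0.2 ms at 44.1 kHz) to limit comb filtering.
    return l + gain * delay(r, d), r + gain * delay(l, d)

def side_content_crossfeed(l, r, gain=0.3, d=30):
    # Only the channel-unique (side) content is crossfed, so a longer,
    # more speaker-like delay can be used without comb filtering the
    # content common to both channels.
    mid = 0.5 * (l + r)
    side_l, side_r = l - mid, r - mid
    out_l = mid + side_l + gain * delay(side_r, d)
    out_r = mid + side_r + gain * delay(side_l, d)
    return out_l, out_r
```

Because the mid (shared) content never meets a delayed copy of itself, centered vocals and bass are untouched while hard-panned material still leaks naturally to the opposite ear.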
These systems were put to the test in a subjective evaluation with 28 participants, all between 18 and 29 years of age. At the beginning of the evaluation, the participants were introduced to the metrics being measured. Since the first part of the evaluation included specific spatial metrics that are difficult for untrained listeners to grasp, we used a collection of descriptions, diagrams, and music excerpts representing each metric to provide in-evaluation training. The results of the first part of the evaluation suggest that this method worked well.
From the results we concluded that Type-II externalized the sounds while performing at a level comparable to the original source on the other metrics, and that Type-0 improved sound quality and comfort at the cost of some stereo width relative to the original source, as we expected. The results also showed strong content-dependence, suggesting that the spatialization settings should be adapted to how the music was produced. Overall, two of the three proposed systems were preferred equally to or more than the legacy stereo mix.

Tags: music, acoustics, design, technology


[1] G-Sonique, “Monitor MSX5 – Headphone monitoring system,” G-Sonique, 2011. [Online]. Available: http://www.g-sonique.com/msx5headphonemonitoring.html.
[2] S. Mushendwa, “Enhancing Headphone Music Sound Quality,” Aalborg University – Institute of Media Technology and Engineering Science, 2009.
[3] Y. G. Kim et al., “An Integrated Approach of 3D Sound Rendering,” in Advances in Multimedia Information Processing – PCM 2010, Springer-Verlag Berlin Heidelberg, vol. II, pp. 682–693, 2010.
[4] D. Rocchesso, “3D with Headphones,” in DAFX: Digital Audio Effects, Chichester, John Wiley & Sons, 2002, pp. 154-157.
[5] P. E. Roos, “Samplicity’s Bricasti M7 Impulse Response Library v1.1,” Samplicity, [Online]. Available: http://www.samplicity.com/bricasti-m7-impulse-responses/.
[6] R. O. Duda, “3-D Audio for HCI,” Department of Electrical Engineering, San Jose State University, 2000. [Online]. Available: http://interface.cipic.ucdavis.edu/sound/tutorial/. [Accessed 15 4 2015].
[7] J. Meier, “A DIY Headphone Amplifier With Natural Crossfeed,” 2000. [Online]. Available: http://headwize.com/?page_id=654.
[8] J. Nikunen, T. Virtanen and M. Vilermo, “Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization,” Journal of the AES, vol. 60, no. 10, pp. 794-806, October 2012.

2aSC – Speech: An eye and ear affair! – Pamela Trudeau-Fisette, Lucie Ménard

Speech: An eye and ear affair!
Pamela Trudeau-Fisette – ptrudeaufisette@gmail.com
Lucie Ménard – menard.lucie@uqam.ca
Université du Québec à Montréal
320 Ste-Catherine E.
Montréal, H3C 3P8

Popular version of poster session 2aSC, “Auditory feedback perturbation of vowel production: A comparative study of congenitally blind speakers and sighted speakers”
Presented Tuesday morning, May 19, 2015, Ballroom 2, 8:00 AM – 12:00 noon
169th ASA Meeting, Pittsburgh
When learning to speak, young infants and toddlers use auditory and visual cues to correctly associate speech movements with specific speech sounds. In doing so, typically developing children compare their own speech with that of their ambient language to build and refine the relationship between what they hear, see, and feel and how to produce it.

In many day-to-day situations, we exploit the multimodal nature of speech: in noisy environments, such as a cocktail party, we look at our interlocutor’s face and use lip reading to recover speech sounds. When speaking clearly, we open our mouths wider to make ourselves more intelligible. Sometimes, just seeing someone’s face is enough to communicate!

What happens in cases of congenital blindness? Although blind speakers learn to produce intelligible speech, they do not speak quite the way sighted speakers do. Because they do not perceive others’ visual cues, blind speakers do not produce visible labial movements to the same extent as their sighted peers.

Production of the French vowel “ou” (similar to the vowel in “cool”) by a sighted adult speaker (left) and a congenitally blind adult speaker (right). The articulatory movements of the lips are clearly more pronounced for the sighted speaker.

Therefore, because one sensory input is lacking, blind speakers put more weight on what they hear (auditory feedback) than sighted speakers do. How does that affect the way blind individuals speak?
To answer this question, we conducted an experiment in which we asked congenitally blind adult speakers and sighted adult speakers to produce multiple repetitions of the French vowel “eu”. While they produced the 130 utterances, we gradually altered their auditory feedback through headphones, without their knowledge, so that they did not hear the exact sound they were producing. Consequently, they had to modify the way they produced the vowel to compensate for the acoustic manipulation, so that they would hear the vowel they were asked to produce (and the one they thought they had been saying all along!).
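A gradual feedback perturbation of this kind is typically run as a baseline phase, a ramp, and a hold at full shift. The sketch below shows such a per-trial schedule; the phase lengths and the maximum shift are illustrative assumptions, since the paper states only that feedback was altered gradually over 130 utterances.

```python
def perturbation_schedule(n_trials=130, baseline=20, ramp=30, max_shift_cents=100.0):
    """Per-trial feedback shift (in cents) for a gradual-perturbation run.

    Phase lengths and the maximum shift are assumptions for illustration.
    """
    shifts = []
    for t in range(n_trials):
        if t < baseline:
            shifts.append(0.0)  # unaltered feedback
        elif t < baseline + ramp:
            # shift grows gradually so speakers do not notice the change
            shifts.append(max_shift_cents * (t - baseline + 1) / ramp)
        else:
            shifts.append(max_shift_cents)  # full shift held until the end
    return shifts
```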
We were interested in whether blind speakers and sighted speakers would react differently to this auditory manipulation. Because blind speakers cannot rely on visual feedback, we hypothesized that they would place more weight on their auditory feedback and therefore compensate for the acoustic manipulation to a greater extent.

To explore this, we examined the acoustic (produced sounds) and articulatory (lip and tongue movements) differences between the two groups at three distinct time points in the experiment.
As predicted, congenitally blind speakers compensated for the altered auditory feedback to a greater extent than their sighted peers. More specifically, although both groups adapted their productions, the blind group compensated more than the control group, as if they integrated the auditory information more strongly. We also found that the two groups used different articulatory strategies in response to the manipulation: blind participants made greater use of their tongue (which is not visible during speech) to compensate. This is not surprising, given that blind speakers do not use their lips (which are visible during speech) as much as their sighted peers do.

Tags: speech, language, learning, vision, blindness