Improving Headphone Spatialization: Fixing a problem you’ve learned to accept
Muhammad Haris Usmani – firstname.lastname@example.org
Ramón Cepeda Jr. – email@example.com
Thomas M. Sullivan – firstname.lastname@example.org
Bhiksha Raj – email@example.com
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213
Popular version of paper 3aSPb5, “Improving headphone spatialization for stereo music”
Presented Wednesday morning, May 20, 2015, 10:15 AM, Brigade room
169th ASA Meeting, Pittsburgh
The days of grabbing a drink, brushing dust from your favorite record and playing it in the listening room of the house are long gone. Today, with the portability technology has enabled, almost everybody listens to music on headphones. However, most commercially produced stereo music is mixed and mastered for playback on loudspeakers, which presents a problem for the growing number of headphone listeners. When a legacy stereo mix is played on headphones, all instruments and voices in the piece are placed between the listener’s ears, inside the head. This is not only unnatural and fatiguing for the listener, but also detrimental to the original placement of the instruments in the piece. It disturbs the spatialization of the music and makes the sound image appear as three isolated lobes inside the listener’s head (see Figure 1).
Hard-panned instruments separate into the left and right lobes, while instruments placed at center stage are heard in the center of the head. However, as hearing is a dynamic process that adapts and settles with the perceived sound, we have come to accept that headphones sound this way.
In order to improve the spatialization of headphones, the listener’s ears must be deceived into thinking that they are listening to the music inside a listening room. When music is played in a room, the sound travels through the air, reverberates inside the room, and interacts with the listener’s head and torso before reaching the ears. These interactions add the psychoacoustic cues necessary for perceiving an externalized stereo soundstage in front of the listener. If the listening room is a typical music studio, the perceived soundstage is close to what the artist intended. Our work tries to place the headphone listener in the sound engineer’s seat inside a music studio to improve the spatialization of music. For the sake of compatibility across different headphones, we try to make minimal changes to the mastering equalization curve of the music.
Since there is a trade-off between sound quality and the degree of spatialization that can be presented, we developed three systems that strike this trade-off differently. We label these Type-I, Type-II, and Type-0. Type-I focuses on improving spatialization at the cost of some sound quality, Type-II improves spatialization while keeping the loss of sound quality small, and Type-0 focuses on refining conventional listening by making the sound image more homogeneous. Since sound quality is key in music, we will skip over Type-I and focus on the other two systems.
Type-II consists of a head-related transfer function (HRTF) model, room reverberation (synthesized reverb), and a spectral correction block. HRTFs embody the complex spatialization cues that arise from the relative positions of the listener and the source. In our case, a general HRTF model is used, configured to place the listener at the “sweet spot” in the studio (right and left speakers placed at an angle of 30° from the listener’s head). The spectral correction attempts to keep the original mastering equalization curve as intact as possible.
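To give a feel for one of the cues an HRTF carries, the sketch below estimates the interaural time difference (the extra time sound needs to reach the far ear) for a speaker at 30°. It is only an illustration and not the HRTF model used in this work: it assumes Woodworth's classic spherical-head approximation, a head radius of 8.75 cm and a speed of sound of 343 m/s, and the function name is hypothetical.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Interaural time difference in seconds for a distant source.

    Uses Woodworth's spherical-head approximation,
    ITD = (a / c) * (theta + sin(theta)), valid for azimuths of 0 to 90 degrees.
    Head radius and speed of sound are assumed typical values.
    """
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Speaker at the 30-degree "sweet spot" angle mentioned in the text.
itd_30 = woodworth_itd(30.0)
print(f"ITD at 30 degrees: {itd_30 * 1e6:.0f} microseconds")
```

Under these assumptions the delay comes out at roughly a quarter of a millisecond, which is the scale of timing difference the ears use to place a loudspeaker off to one side.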
Type-0 is made up of a side-content crossfeed block and a spectral correction block. Some headphone amplifiers allow crossfeed between the left and right channels to model the fact that, when listening to music through loudspeakers, each ear hears both speakers, with the sound from the farther speaker arriving slightly later. A shortcoming of conventional crossfeed is that the delay we can apply is limited (to avoid comb filtering). Side-content crossfeed resolves this by crossfeeding only the content that is unique to each channel, allowing us to use larger delays. In this system, the side content is extracted using a stereo-to-3 upmixer, which is implemented as a novel extension to Nikunen et al.’s upmixer.
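The difference between the two kinds of crossfeed can be sketched in a few lines. This is a toy illustration rather than the system described here: the side content is supplied by hand instead of being extracted by an upmixer, and the function names, gain and delay values are all assumptions.

```python
def delay(signal, n):
    """Delay a list of samples by n samples, keeping the original length."""
    if n <= 0:
        return list(signal)
    return ([0.0] * n + list(signal))[:len(signal)]

def crossfeed(left, right, feed_left, feed_right, delay_samples, gain):
    """Add a delayed, attenuated copy of the opposite side's feed to each ear.

    Conventional crossfeed passes the full channels as the feeds.
    Side-content crossfeed passes only the content unique to each channel.
    """
    from_right = delay(feed_right, delay_samples)
    from_left = delay(feed_left, delay_samples)
    out_l = [l + gain * x for l, x in zip(left, from_right)]
    out_r = [r + gain * x for r, x in zip(right, from_left)]
    return out_l, out_r

def first_comb_notch_hz(delay_seconds):
    """First notch created when a signal is summed with a delayed copy of itself."""
    return 1.0 / (2.0 * delay_seconds)

n = 16
center = [1.0] + [0.0] * (n - 1)                    # click shared by both channels
left_only = [0.0] * 5 + [1.0] + [0.0] * (n - 6)     # click unique to the left channel
right_only = [0.0] * n                              # nothing unique on the right
left = [c + s for c, s in zip(center, left_only)]
right = [c + s for c, s in zip(center, right_only)]

# Conventional: the shared click is echoed into the same ear (comb filtering).
conv_l, conv_r = crossfeed(left, right, left, right, delay_samples=3, gain=0.5)
# Side-content: only the left-only click is fed across; the shared click is untouched.
side_l, side_r = crossfeed(left, right, left_only, right_only, delay_samples=3, gain=0.5)
```

In the conventional case the centered click reaches each ear twice, and summing a signal with its own delayed copy carves notches into its spectrum, starting at the frequency given by `first_comb_notch_hz`. A longer delay pushes that first notch lower, which is why the usable delay is limited. When only the unique content is fed across, the centered click is never summed with a copy of itself, so this constraint does not apply to it.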
These systems were put to the test in a subjective evaluation with 28 participants, all between 18 and 29 years of age. The participants were introduced to the metrics being measured at the beginning of the evaluation. Since the first part of the evaluation included specific spatial metrics that are somewhat complicated for untrained listeners to grasp, we used a collection of descriptions, diagrams, and/or music excerpts representing each metric to provide in-evaluation training for the listeners. The results of the first part of the evaluation suggest that this method worked well.
From the results, we concluded that Type-II externalized the sounds while performing at a level comparable to the original source in the other metrics. Type-0 improved sound quality and comfort at the expense of stereo width when compared to the original source, which is what we expected. We also observed strong content dependence in the results, suggesting that different spatialization settings should be used for music that has been produced differently. Overall, two of the three proposed systems were preferred as often as, or more often than, the legacy stereo mix.
Tags: music, acoustics, design, technology
 G-Sonique, “Monitor MSX5 – Headphone monitoring system,” G-Sonique, 2011. [Online]. Available: http://www.g-sonique.com/msx5headphonemonitoring.html.
 S. Mushendwa, “Enhancing Headphone Music Sound Quality,” Aalborg University – Institute of Media Technology and Engineering Science, 2009.
 C. J. C. H. K. K. Y. J. L. Yong Guk Kim, “An Integrated Approach of 3D Sound Rendering,” Springer-Verlag Berlin Heidelberg, vol. II, no. PCM 2010, p. 682–693, 2010.
 D. Rocchesso, “3D with Headphones,” in DAFX: Digital Audio Effects, Chichester, John Wiley & Sons, 2002, pp. 154-157.
 P. E. Roos, “Samplicity’s Bricasti M7 Impulse Response Library v1.1,” Samplicity, [Online]. Available: http://www.samplicity.com/bricasti-m7-impulse-responses/.
 R. O. Duda, “3-D Audio for HCI,” Department of Electrical Engineering, San Jose State University, 2000. [Online]. Available: http://interface.cipic.ucdavis.edu/sound/tutorial/. [Accessed 15 4 2015].
 J. Meier, “A DIY Headphone Amplifier With Natural Crossfeed,” 2000. [Online]. Available: http://headwize.com/?page_id=654.
 J. Nikunen, T. Virtanen and M. Vilermo, “Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization,” Journal of the AES, vol. 60, no. 10, pp. 794-806, October 2012.
Can a spider “sing”? If so, who might be listening?
Alexander L. Sweger – firstname.lastname@example.org
George W. Uetz – email@example.com
University of Cincinnati
Department of Biological Sciences
2600 Clifton Ave, Cincinnati OH 45221
Popular version of paper 4pAB3, “The potential for acoustic communication in the ‘purring’ wolf spider”
Presented Thursday afternoon, May 21, 2015, 2:40 PM, Rivers room
169th ASA Meeting, Pittsburgh
While we are familiar with a wide variety of animals that use sound to communicate (birds, frogs, crickets, etc.), there are thousands of animal species that use vibration as their primary means of communication. Since sound and vibration are physically very similar, the two are inextricably connected, but biologically they are still somewhat separate modes of communication. Within the field of bioacoustics, we are beginning to fully realize how prevalent vibration is as a mode of animal communication, and how interconnected vibration and sound are for many species.
Wolf spiders are one group that heavily utilizes vibration as a means of communication, and they have very sensitive structures for “listening” to vibrations. However, despite the numerous vibrations that are involved in spider communication, they are not known for creating audible sounds. While a lot of species that use vibration will simultaneously use airborne sound, spiders do not possess structures for hearing sound, and it is generally assumed that they do not use acoustic communication in conjunction with vibration.
The “purring” wolf spider (Gladicosa gulosa) may be a unique exception to this assumption. Males create vibrations when they communicate with potential mates in a manner very similar to other wolf spider species, but unlike other wolf spider species, they also create airborne sounds during this communication. Both the vibrations and the sounds produced by this species are of higher amplitude than those of other wolf spider species, both larger and smaller, meaning this phenomenon is independent of species size. While other acoustically communicating species like crickets and katydids have evolved structures for producing sound, these spiders vibrate structures in their environment (dead leaves) to create sound. Since we know spiders do not possess typical “ears” for hearing these sounds, we are interested in finding out whether females or other males are able to use these sounds in communication. If they do, then this species could be used as an unusual model for the evolution of acoustic communication.
Figure 1: An image of a male “purring” wolf spider, Gladicosa gulosa, and the spectrogram of his accompanying vibration. Listen to a recording of the vibration here, and the accompanying sound here.
Our work has shown that the leaves themselves are vital to the use of acoustic communication in this species. Males can only produce the sounds when they are on a surface that vibrates (like a leaf) and females will only respond to the sounds when they are on a similar surface. When we remove the vibration and only provide the acoustic signal, females still show a significant response and males do not, suggesting that the sounds produced by males may play a part in communicating specifically with females.
So, the next question is: how are females responding to the airborne sound without ears? Despite the relatively low volume of the sounds produced, they can still create a vibration in a very thin surface like a leaf. This creates a complex method of communication: a male makes a vibration in a leaf that creates a sound, which then travels to another leaf and creates a new vibration, which a female can then hear. While relatively “primitive” compared to the highly evolved acoustic communication in birds, frogs, insects, and other species, this unique usage of the environment may create opportunities for studying the evolution of sound as a mode of animal communication.
Brigitte Schulte-Fortkamp – firstname.lastname@example.org
Technical University Berlin
Institute of Fluid Mechanics and Engineering Acoustics
Psychoacoustics and Noise Effects
10587 Berlin, Germany
Popular version of paper 2aNSa1, “Soundscape as a resource to balance the quality of an acoustic environment”
Tuesday morning, May 19, 2015, 8:35 AM, Commonwealth 1
169th ASA Meeting, Pittsburgh Pennsylvania
Soundscape studies investigate and find increasingly better ways to measure and hone the acoustic environment. Soundscape offers the opportunity for multidisciplinary working, bringing together science, medicine, social studies and the arts – combined, crucially, with analysis, advice and feedback from the ‘users of the space’ as the primary ‘experts’ of any environment – to find creative and responsive solutions for protection of living places and to enhance the quality of life.
The Soundscape concept was introduced as a way to rethink the evaluation of “noise” and its effects. The challenge was to consider the limits of acoustic measurements and to account for the cultural dimension of noise.
The recent international standard ISO 12913-1, “Acoustics - Soundscape - Part 1: Definition and conceptual framework” (French title: “Acoustique - Paysage sonore - Partie 1: Définition et cadre conceptuel”), defines soundscape as an “acoustic environment as perceived or experienced and/or understood by a person or people, in context.”
Figure 1: Elements in the perceptual construct of soundscape
Soundscape suggests exploring noise in its complexity and ambivalence, and considering the conditions and purposes of a sound’s production, perception, and evaluation, so that the evaluation of noise and sound is understood holistically.
Discussing the contribution of Soundscape research to community noise research means focusing on the meaning of sounds and their implicit assessments, and recognizing that evaluation through perceptual effects is a key issue.
Using the resources: an example
Soundscape Approach: Public Space Perception and Enhancement Drawing on Experience in Berlin
Figure 2: Soundscape Nauener Platz
The concept for developing the open space relies on the understanding that the people living in the chosen area are the “real” experts when it comes to evaluating this place according to their expectations and experiences there. The intention of scientific research here is to learn about the meaning of the noise with respect to people’s living situation and to implement an adequate procedure to open the “black box” of people’s minds.
Therefore, the aim was to get residents involved through workshops to get access to the different social groups.
Figure 3: Participation and Collaboration
Figure 4: The concept of evaluation
Interdisciplinarity is considered a must in the soundscape approach. In this case, it involved the collaboration of architects, acoustical engineers, environmental health specialists, psychologists, social scientists, and urban developers. The tasks are related to local individual needs and are open to noise-sensitive and other vulnerable groups. The approach is also concerned with cultural aspects and the relevance of natural soundscapes, sometimes referred to as quiet areas, which is obviously related to the highest level of needs.
Figure 5: Soundscape, an interactive approach using the resources
Improving local soundscape quality?
Obviously, these new approaches and methods make it possible to learn enough about the process of perception and evaluation, as they take into account the context, the ambiance, the usual interaction between noise and listener, and the multidimensionality of noise perception.
By contrast, conventional methods often reduce the complexity of reality to controllable variables, which supposedly represent the object under scrutiny. Furthermore, traditional tests frequently neglect the context dependency of human perception; they provide only artificial realities and reduce the complexity of perception to merely predetermined values, which do not completely correspond to perceptual authenticity. However, perception and evaluation depend entirely on the respective influences of the acoustic and non-acoustic modifiers.
From the comments, the group discussions, and the results of the narrative interviews, it was possible to identify why people do or do not prefer other places over the public place. It also became clear how people experience the noise at a distance from the road, and with respect to social life and social control. One of the most important findings here is how people react to low-frequency noise at the public place and how experiences and expectations work together. It became obvious that the most wanted sounds in this area reflect a wish to escape the road traffic noise through natural sounds.
Figure 6: Selected sounds for audio islands
Reshaping the place based on people’s expertise
Relying on the combined evaluation procedures, the place was reshaped by installing a gabion wall along one of the main roads. In addition, audio islands were built that integrate the sounds people would like to enjoy when using the place. While the gabion wall protects against noise around the playground, the newly installed audio islands provide nature sounds as selected by the people involved in the Soundscape approach.
Figure 7: Installation of the sounds
Figure 8: The new place
The process of tuning urban areas with respect to people’s expertise and quality of life is related to the strategy of triangulation, which provides the theoretical frame for solutions such as changing an area. In other words, approaching the field in this holistic manner is generally needed.
An effective and sustainable reduction in the number of people highly annoyed by noise is only possible with further scientific work on methods development and research into noise effects. Noise maps providing further information can help to obtain a deeper understanding of noise reactions and to reliably identify perception-related hot spots. Psychoacoustic maps are particularly interesting in areas where the noise levels are marginally below the noise level limits, and they offer an additional aid to interpretation when identifying required noise abatement measures.
But the expertise of the people involved will provide meaningful information. Soundwalks, a suitable instrument for exploring urban areas through the minds of the “local experts” as the measuring device, open a field of data for triangulation. In combination, these techniques give meaning to the numbers and values of recordings and their analysis, helping us understand the significance of sound and noise as well as the perception of Soundscapes through their resources.
tags: soundscape, acoustics, people, health
If the MPs’ speeches don’t put you to sleep, at least you should be able to understand what they are saying.
Using state-of-the-art audible simulations, a design team of acousticians, architects and sound system designers is working to ensure that speech within the new House of Commons chamber of the Parliament of Canada, now in design, will be intelligible in either French or English.
The new chamber for the House of Commons is being built in a glass-topped atrium in the courtyard of the West Block building on Parliament Hill in Ottawa. The chamber will be the temporary home of the House of Commons while its traditional location in the Center Block building is being renovated and restored.
The skylit atrium in the West Block will be about six times the volume of the existing room, resulting in significant challenges for ensuring that speech will be intelligible.
Figure 1: Existing Chamber of the House of Commons, Parliament of Canada
The existing House chamber is 21 meters (70 feet) long and 16 meters (53 feet) wide, and has seats for the current 308 Members of Parliament (to increase to 338 in 2015) and 580 people in the upper gallery that runs around the second level of the room. Most surfaces are wood, although the floor is carpeted, and there is an adjustable curtain at the rear of the MP seating area on both sides of the room. The ceiling is a painted, stretched linen canvas 14.7 meters (48.5 feet) above the commons floor, resulting in a room volume of approximately 5000 cubic meters.
The new House chamber is being infilled into an existing courtyard that is 44 meters (145 feet) long, 39 meters (129 feet) wide, and 18 meters (59 feet) high. The meeting space itself will retain the same basic footprint as the existing room, including the upper gallery seating, but will be open to the sound reflective glass roof and stone and glass side walls of the courtyard. In the absence of any acoustic treatments, the high level of reverberant sound would make it very difficult to understand speech in the room.
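The quoted dimensions can be checked with a quick calculation. The sketch below treats both spaces as simple rectangular boxes, which is an assumption made only for this rough check (the real rooms are not perfect boxes), and the function name is hypothetical.

```python
def box_volume(length_m, width_m, height_m):
    """Volume in cubic meters of a rectangular box (a rough stand-in for a room)."""
    return length_m * width_m * height_m

# Dimensions as quoted in the text, in meters.
existing_chamber = box_volume(21.0, 16.0, 14.7)
courtyard = box_volume(44.0, 39.0, 18.0)
ratio = courtyard / existing_chamber

print(f"Existing chamber: about {existing_chamber:.0f} cubic meters")
print(f"Courtyard: about {courtyard:.0f} cubic meters")
print(f"Ratio: about {ratio:.1f} times the volume")
```

The box estimate for the existing chamber lands close to the stated 5000 cubic meters, and the courtyard works out to a little over six times that, in line with the "about six times the volume" figure.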
ARCOP / FGM ARCHITECTS
Figure 2: Early Design Rendering of Chamber in West Block
To help Public Works and Government Services Canada (PWGSC) and the House of Commons understand the acoustic differences between the existing House chamber and the one under design, and to assure them that excellent speech intelligibility will be achieved in the new chamber, Acoustic Distinctions (AD), the New York-based acoustic consultant, created computer models of both the new and existing House chambers and performed acoustic tests in the existing chamber. AD also compared the two rooms using sophisticated data analysis and tables of data, and produced graphs and maps of speech intelligibility in each space.
An early design iteration, for example, included significant areas of sound absorptive materials at the sides of the ceiling areas, as well as sound absorptive materials integrated into the branches of the tree-like structure which supports the roof:
Figure 3: Computer Model of Room Finishes
The dark areas of the image show the location of sound absorptive materials, including triangularly-shaped wedges integrated into the structure which supports the roof.
Using the Speech Transmission Index (STI), a standardized measure of speech intelligibility, AD estimated a value of 65% for this design, where a minimum of 75% was needed to ensure excellent intelligibility.
The computer analysis done by Acoustic Distinctions also produced colorful images relating to the degree of speech intelligibility that was to be expected:
Figure 4: Speech Transmission Index, single person speaking, no reinforcement
Talker at lower left; Listener at lower right
Dark blue to black color indicates fair to good intelligibility
While these numerical and graphical tools were useful in understanding acoustic conditions of the new room, in order to make it easier for the client and design team to appreciate the acoustic recommendations made by the consultant, Acoustic Distinctions also produced computer simulations of speech within the new room, enabling the team to hear the way the new room will sound when complete.
This approach, known as audible simulation or auralization, has been used to analyze a variety of room design options, and as the design progresses, new analysis and simulations are produced.
This first audible simulation is made using the room model shown above. The talker is an MP standing near the center of the bright yellow area in the STI map above. The listener is an MP seated in the opposite corner of the room, where the dark blue to black color confirms the STI value of just less than 0.70, corresponding to “good” intelligibility.
Audio file 1: Speech without Sound System. STI 0.68
(CLICK ON ABOVE LINK TO PLAY WAV FILE)
To raise the intelligibility above the 0.75 minimum design goal, we add the sound system, being designed by Engineering Harmonics, to our model. With the sound system operating, the STI value for the above talker/listener pair increases to 0.85. Speech will sound like this:
Audio file 2: Speech with Sound System. STI 0.85
(CLICK ON ABOVE LINK TO PLAY WAV FILE)
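The STI values quoted for the audio files map onto qualitative ratings. The sketch below uses the commonly cited rating bands associated with the STI standard (IEC 60268-16); treating those bands as the ones applied in this project is an assumption, although they agree with the "good" and "excellent" labels used here. The function name is hypothetical.

```python
def sti_rating(sti):
    """Map a Speech Transmission Index value (0 to 1) to a qualitative rating.

    Band edges follow the commonly cited scale: below 0.30 bad, 0.30 to 0.45 poor,
    0.45 to 0.60 fair, 0.60 to 0.75 good, 0.75 and above excellent.
    """
    if not 0.0 <= sti <= 1.0:
        raise ValueError("STI must lie between 0 and 1")
    if sti < 0.30:
        return "bad"
    if sti < 0.45:
        return "poor"
    if sti < 0.60:
        return "fair"
    if sti < 0.75:
        return "good"
    return "excellent"

# STI values quoted in the text for the simulated talker/listener pair.
for label, value in [("without sound system", 0.68), ("with sound system", 0.85)]:
    print(f"{label}: STI {value:.2f} is rated {sti_rating(value)}")
```

On this scale the unreinforced value of 0.68 sits in the "good" band, and adding the sound system moves the same talker and listener into the "excellent" band above the 0.75 design goal.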
While these examples clearly show the benefit of a speech reinforcement system in the Chamber, the design and client team were not satisfied with the extent of sound absorptive materials in the ceiling of the Chamber that were required to achieve excellent intelligibility. An additional goal was expressed: to reduce the total amount of sound absorptive materials in the room, making the structure and skylight more visible and prominent.
Acoustic Distinctions therefore made changes to the model, strategically removing sound absorptive materials from specific ceiling locations, and reconfiguring the absorptive materials within the upper reaches of the structure supporting the roof. Computer models were again developed, and the resulting images showed that with careful design, excellent intelligibility would be achieved with reduced absorption.
Figure 5: Speech Transmission Index, single person speaking, with sound reinforcement
Talker at upper left; Listener at lower right
Bright pink to red color indicates excellent intelligibility
Not surprisingly, we needed to communicate this to the design team and the House of Commons in a way that provided a high level of confidence in the results. We again used audible simulations to demonstrate the results:
Audio file 3: Speech with Sound System, reduced absorption. STI 0.82
(CLICK ON ABOVE LINK TO PLAY WAV FILE)
The rendering below shows the space configuration associated with the latest results:
ARCOP / FGM ARCHITECTS
Figure 6: Rendering, House of Commons, West Block, Parliament Hill
Proposed Design Configuration, showing sound absorptive panels
integrated into laylight and structure supporting roof
Audible Simulation in the Canadian Parliament: The impact of auralization on design decisions for the House of Commons