3aSPb5 – Improving Headphone Spatialization: Fixing a problem you’ve learned to accept

Muhammad Haris Usmani – usmani@cmu.edu
Ramón Cepeda Jr. – rcepeda@andrew.cmu.edu
Thomas M. Sullivan – tms@ece.cmu.edu
Bhiksha Raj – bhiksha@cs.cmu.edu
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Popular version of paper 3aSPb5, “Improving headphone spatialization for stereo music”
Presented Wednesday morning, May 20, 2015, 10:15 AM, Brigade room
169th ASA Meeting, Pittsburgh

The days of grabbing a drink, brushing dust from your favorite record and playing it in the listening room of the house are long gone. Today, with the portability technology has enabled, almost everybody listens to music on headphones. However, most commercially produced stereo music is mixed and mastered for playback on loudspeakers, which presents a problem for the growing number of headphone listeners. When a legacy stereo mix is played on headphones, all instruments or voices in that piece get placed between the listener’s ears, inside their head. This is not only unnatural and fatiguing for the listener, but also detrimental to the original placement of the instruments in that musical piece. It disturbs the spatialization of the music and makes the sound image appear as three isolated lobes inside the listener’s head [1]; see Figure 1.

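Figure 1: Sound image of a legacy stereo mix on headphones, heard as three isolated lobes inside the listener’s head.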

Hard-panned instruments separate into the left and right lobes, while instruments placed at center stage are heard in the center of the head. However, because hearing is a dynamic process that adapts to and settles with the perceived sound, we have come to accept that headphones sound this way [2].

In order to improve the spatialization of headphones, the listener’s ears must be deceived into thinking that they are listening to the music inside of a listening room. When playing music in a room, the sound travels through the air, reverberates inside the room, and interacts with the listener’s head and torso before reaching the ears [3]. These interactions add the necessary psychoacoustic cues for perception of an externalized stereo soundstage presented in front of the listener. If this listening room is a typical music studio, the soundstage perceived is close to what the artist intended. Our work tries to place the headphone listener into the sound engineer’s seat inside a music studio to improve the spatialization of music. For the sake of compatibility across different headphones, we try to make minimal changes to the mastering equalization curve of the music.

Since there is a compromise between sound quality and the spatialization that can be presented, we developed three systems that strike this compromise at different levels. We label these Type-I, Type-II, and Type-0. Type-I focuses on improving spatialization at the cost of some sound quality; Type-II improves spatialization while ensuring that the sound quality is not degraded too much; and Type-0 focuses on refining conventional listening by making the sound image more homogeneous. Since sound quality is key in music, we will skip over Type-I and focus on the other two systems.

Type-II consists of a head-related transfer function (HRTF) model [4], room reverberation (synthesized reverb [5]), and a spectral correction block. HRTFs embody all the complex spatialization cues that exist due to the relative positions of the listener and the source [6]. In our case, a general HRTF model is used, configured to place the listener at the “sweet spot” in the studio (right and left speakers placed at an angle of 30° from the listener’s head). The spectral correction attempts to keep the original mastering equalization curve as intact as possible.
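
One of the cues an HRTF encodes is the interaural time difference (ITD): sound from a loudspeaker off to one side reaches the nearer ear slightly earlier than the other. As a rough illustration only, the sketch below estimates the ITD for a speaker at 30° using Woodworth's spherical-head approximation. The function name, the head radius (8.75 cm), and the speed of sound (343 m/s) are assumptions made for this example, and the formula is a textbook simplification, not the HRTF model used in Type-II.

```python
import math

def woodworth_itd(azimuth_deg, head_radius_m=0.0875, speed_of_sound=343.0):
    """Approximate interaural time difference (seconds) for a far-field source.

    Woodworth's spherical-head formula: ITD = (a / c) * (theta + sin(theta)),
    used here for azimuths between 0 and 90 degrees.
    head_radius_m and speed_of_sound are assumed typical values.
    """
    if not 0.0 <= azimuth_deg <= 90.0:
        raise ValueError("azimuth must be between 0 and 90 degrees")
    theta = math.radians(azimuth_deg)
    return (head_radius_m / speed_of_sound) * (theta + math.sin(theta))

# Speaker at the 30-degree "sweet spot" angle mentioned in the text.
itd_30 = woodworth_itd(30.0)
print(f"ITD at 30 degrees: {itd_30 * 1e6:.0f} microseconds")
```

Under these assumptions the difference comes out at roughly a quarter of a millisecond, which gives a sense of how small the timing cues are that the ear uses to place a source.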

Type-0 is made up of a side-content crossfeed block and a spectral correction block. Some headphone amps allow crossfeed between the left and right channels to model the fact that, when listening to music through loudspeakers, each ear hears the music from both speakers, with a delay applied to the sound originating from the speaker that is farther away. A shortcoming of conventional crossfeed is that the delay we can apply is limited (to avoid comb filtering) [7]. Side-content crossfeed resolves this by crossfeeding only content that is unique to one of the two channels, allowing us to use larger delays. In this system, the side content is extracted using a stereo-to-3 upmixer, implemented as a novel extension to Nikunen et al.’s upmixer [8].
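
To make the comb-filtering limit concrete, here is a minimal sketch of conventional crossfeed, in which each output channel receives a delayed, attenuated copy of the opposite channel. It is not the side-content crossfeed of Type-0, which relies on the upmixer described above. The function names, the gain of 0.5, and the delays of 13 and 44 samples are hypothetical values chosen only for illustration. For content panned to the center (identical in both channels), crossfeed adds the signal to a delayed copy of itself, which produces spectral notches at odd multiples of 1 / (2 × delay).

```python
def crossfeed(left, right, delay_samples, gain):
    """Add a delayed, attenuated copy of each channel to the opposite channel."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    out_left, out_right = [], []
    for i in range(len(left)):
        j = i - delay_samples
        # Delayed sample from each channel (silence before the signal starts).
        fed_from_right = right[j] if j >= 0 else 0.0
        fed_from_left = left[j] if j >= 0 else 0.0
        out_left.append(left[i] + gain * fed_from_right)
        out_right.append(right[i] + gain * fed_from_left)
    return out_left, out_right

def comb_notch_frequencies(delay_samples, sample_rate, count=3):
    """First few notch frequencies (Hz) when a signal is summed with a delayed copy of itself."""
    delay_s = delay_samples / sample_rate
    return [(2 * k + 1) / (2 * delay_s) for k in range(count)]

# Hypothetical short and long delays at a 44.1 kHz sample rate.
for delay in (13, 44):
    notches = comb_notch_frequencies(delay, 44100)
    print(delay, "samples:", [round(f) for f in notches])
```

Longer delays move the first notch lower and pack the notches closer together, which is why a conventional crossfeed design keeps the delay short.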

These systems were put to the test in a subjective evaluation with 28 participants, all between 18 and 29 years of age. The participants were introduced to the metrics being measured at the beginning of the evaluation. Since the first part of the evaluation included specific spatial metrics that are somewhat complicated for untrained listeners to grasp, we used a collection of descriptions, diagrams, and/or music excerpts representing each metric to provide in-evaluation training for the listeners. The results of the first part of the evaluation suggest that this method worked well.

From the results, we were able to conclude that Type-II externalized the sounds while performing at a level comparable to the original source in the other metrics. Type-0 improved sound quality and comfort at the expense of stereo width when compared to the original source, which is what we expected. We also observed strong content dependence in the results, suggesting that different spatialization settings must be used for music that has been produced differently. Overall, two of the three proposed systems in this work are preferred in equal or greater amounts to the legacy stereo mix.

References

[1] G-Sonique, “Monitor MSX5 – Headphone monitoring system,” G-Sonique, 2011. [Online]. Available: http://www.g-sonique.com/msx5headphonemonitoring.html.
[2] S. Mushendwa, “Enhancing Headphone Music Sound Quality,” Aalborg University – Institute of Media Technology and Engineering Science, 2009.
[3] C. J. C. H. K. K. Y. J. L. Yong Guk Kim, “An Integrated Approach of 3D Sound Rendering,” Springer-Verlag Berlin Heidelberg, vol. II, no. PCM 2010, pp. 682–693, 2010.
[4] D. Rocchesso, “3D with Headphones,” in DAFX: Digital Audio Effects, Chichester, John Wiley & Sons, 2002, pp. 154-157.
[5] P. E. Roos, “Samplicity’s Bricasti M7 Impulse Response Library v1.1,” Samplicity, [Online]. Available: http://www.samplicity.com/bricasti-m7-impulse-responses/.
[6] R. O. Duda, “3-D Audio for HCI,” Department of Electrical Engineering, San Jose State University, 2000. [Online]. Available: http://interface.cipic.ucdavis.edu/sound/tutorial/. [Accessed 15 4 2015].
[7] J. Meier, “A DIY Headphone Amplifier With Natural Crossfeed,” 2000. [Online]. Available: http://headwize.com/?page_id=654.
[8] J. Nikunen, T. Virtanen and M. Vilermo, “Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization,” Journal of the AES, vol. 60, no. 10, pp. 794-806, October 2012.

4pAAa2 – Uncanny Acoustics: Phantom Instrument Guides at Ancient Chavín de Huántar, Peru

Miriam Kolar, Ph.D. – mkolar@amherst.edu
AC# 2255, PO Box 5000
Architectural Studies Program & Dept. of Music
Amherst College
Amherst, MA 01002

Popular version of paper 4pAAa2, “Pututus, Resonance and Beats: Acoustic Wave Interference Effects at Ancient Chavín de Huántar, Perú”
Presented Thursday afternoon, October 30, 2014
168th ASA Meeting, Indianapolis
See also: Archaeoacoustics: Re-Sounding Material Culture

Excavated from Pre-Inca archaeological sites high in the Peruvian Andes, giant conch shell horns known as “pututus” have been discovered far from the tropical sea floor these marine snails once inhabited.

Fig. 1a: Excavation of a Chavín pututu at Chavín de Huántar, 2001. Photo by John Rick.

Fig. 1b-c: Chavín pututus: decorated 3,000-year-old conch shell horns from the Andes, on display at the Peruvian National Museum in Chavín de Huántar. Photos by José Luis Cruzado.

At the 3,000-year-old ceremonial center Chavín de Huántar, carvings on massive stone blocks depict humanoid figures holding and perhaps blowing into the weighty shells. A fragmented ceramic orb depicts groups of conches or pututus separated from spiny oysters by rectilinear divisions on its relief-modeled surface. Fossil sea snail shells are paved into the floor of the site’s Circular Plaza.

Fig. 2: Depictions of pututu players on facing stones in the Circular Plaza at Chavín. Photo by José Luis Cruzado & Miriam Kolar.

Pututus are the only known musical or sound-producing instruments from Chavín, whose monumental stone architecture was constructed and used over several centuries during the first millennium B.C.E.

Fig. 3 (VIDEO): Chavín’s monumental stone-and-earthen-mortar architecture towers above plazas and encloses kilometers of labyrinthine corridors, rooms, and canals. Video by José Luis Cruzado and Miriam Kolar, with soundtrack of a Chavín pututu performed by Tito La Rosa in the Museo Nacional Chavín.

How, by whom, and in what cultural contexts were these instruments played at ancient Chavín? What was their significance? How did they sound, and what sonic effects could have been produced between pututus and Chavín’s architecture or landform surroundings? Such questions haunt and intrigue archaeoacousticians, who apply the science of sound to material traces of the ancient past. Acoustic reconstructions of ancient buildings, instruments, and soundscapes can help us connect with our ancestors through experiential analogy. Computer music pioneer Dr. John Chowning and archaeologist Dr. John Rick founded the Chavín de Huántar Archaeological Acoustics Project (https://ccrma.stanford.edu/groups/chavin/) to discover more.

Material traces of past life, such as artifacts of ancient sound-producing instruments and architectural remains, provide data from which to reconstruct ancient sound. Nineteen use-worn Strombus galeatus pututus were unearthed at Chavín in 2001 by Stanford University’s Rick and his teams. Following initial sonic evaluation by Rick and acoustician David Lubman (ASA 2002), a comprehensive assessment of their acoustics and playability was made in 2008 by Dr. Perry Cook and researchers based at Stanford’s Center for Computer Research in Music and Acoustics (CCRMA).

Fig. 4: Dr. Perry Cook performs acoustic measurements of the Chavín pututus. Photo by José Luis Cruzado.

Transforming an empty room at the Peruvian National Museum at Chavín into a musical acoustics lab, we established a sounding-tone range for these specific instruments from about 272 Hz to 340 Hz (frequencies corresponding to a few notes ascending from around Middle C on the piano), and charted their harmonic structure.
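
For readers who want to relate these frequencies to piano notes, the following sketch converts a frequency to the nearest note of the modern equal-tempered scale. It assumes the present-day reference of A4 = 440 Hz purely for orientation; this is not a claim about how the instruments were tuned or heard in antiquity, and the function name is made up for the example.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def nearest_note(freq_hz, a4_hz=440.0):
    """Return (note name, offset in cents) for the equal-tempered note nearest freq_hz."""
    # MIDI note numbers: A4 is 69, and each semitone is a factor of 2**(1/12).
    midi_float = 69 + 12 * math.log2(freq_hz / a4_hz)
    midi = round(midi_float)
    cents = 100 * (midi_float - midi)
    name = NOTE_NAMES[midi % 12] + str(midi // 12 - 1)
    return name, cents

# Lower and upper ends of the sounding-tone range reported in the text.
for freq in (272.0, 340.0):
    name, cents = nearest_note(freq)
    print(f"{freq:.0f} Hz is nearest {name} ({cents:+.0f} cents)")
```

Under these assumptions, 272 Hz falls closest to C#4, just above Middle C (C4, about 262 Hz), and 340 Hz sits almost midway between E4 and F4.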

Fig. 5 (VIDEO): Dr. Perry Cook conducting pututu measurements with Stanford CCRMA team. Video by José Luis Cruzado.

Back at CCRMA, Dr. Jonathan Abel led audio digital signal processing to map their strong directionality, and to track the progression of sound waves through their exponentially spiraling interiors. This data constitutes a digital archive of the shell instrument sonics, and drives computational acoustic models of these so-called Chavín pututus (ASA 2010; Flower World 2012; ICTM 2013).

Where does data meet practice? How could living musicians further inform our study? Cook’s expertise as a player of wind instruments and shells allowed him to evaluate the Chavín pututus’ playability with respect to a variety of other instruments, and produce a range of articulations. Alongside the acoustic measurement sessions, Peruvian master musician Tito La Rosa offered a performative journey, a meditative ritual beginning and ending with the sound of human breath, the source of pututu sounding. This reverent approach took us away from our laboratory perspectives for a moment, and pushed us to consider not only the performative dynamics of voicing the pututus, but their potential for nuanced sonic expression.

Fig. 6 (VIDEO): Tito La Rosa performs one of the Chavín pututus in the Museo Nacional Chavín. Video by Cobi van Tonder.

When Cook and La Rosa played pututus together, we noted the strong acoustic “beats” that result when shell horns’ similar frequencies constructively and destructively interfere, producing an amplitude variation at a much lower frequency. Some nearby listeners described this as a “warbling” or “throbbing” of the tone, and said they thought that the performers were creating this effect through a performance technique (not so; it’s a well-known acoustic wave-interference phenomenon; see Hartmann 1998: 393-396).
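
The beating effect follows from a standard trigonometric identity: the sum of two equal-amplitude tones at frequencies f1 and f2 equals a fast tone at their average frequency multiplied by a slow envelope, and the loudness rises and falls |f1 - f2| times per second. The sketch below checks this numerically. The pair 300 Hz and 304 Hz is a hypothetical example chosen inside the measured sounding range, not a measurement of any particular pair of pututus, and the function names are illustrative.

```python
import math

def beat_frequency(f1_hz, f2_hz):
    """Rate (Hz) at which the loudness of two summed tones rises and falls."""
    return abs(f1_hz - f2_hz)

def two_tone_sum(f1_hz, f2_hz, t):
    """Instantaneous sum of two equal-amplitude sine tones at time t (seconds)."""
    return math.sin(2 * math.pi * f1_hz * t) + math.sin(2 * math.pi * f2_hz * t)

def product_form(f1_hz, f2_hz, t):
    """The same signal written as a slow envelope times a fast carrier."""
    envelope = 2 * math.cos(math.pi * (f1_hz - f2_hz) * t)
    carrier = math.sin(math.pi * (f1_hz + f2_hz) * t)
    return envelope * carrier

# Hypothetical pair of tones inside the measured 272-340 Hz sounding range.
f1, f2 = 300.0, 304.0
print("beats per second:", beat_frequency(f1, f2))

# Both forms agree at arbitrary instants, up to floating-point rounding.
for t in (0.0, 0.0123, 0.2):
    assert abs(two_tone_sum(f1, f2, t) - product_form(f1, f2, t)) < 1e-9
```

The closer the two shell horns are in pitch, the slower the throbbing, which is why small tuning differences between instruments are heard as a slow warble and not as two separate notes.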

Fig. 7 (VIDEO): José Cruzado and Swiss trombonist Michael Flury demonstrate amplitude “beats” between replica pututus in Chavín’s Doble Ménsula Gallery. Video by Miriam Kolar.

If present-day listeners are unaware of an acoustics explanation for a sound effect, how might ancient listeners have understood and attributed such a sound? A pututu player would know that s/he was not articulating this warble, yet would be subject to its strong sensations. How would this visceral experience be interpreted? Might it be experienced as a phantom force?

The observed acoustic beating effect between pututus was so impressive that we sought to reproduce it during our on-site tests of architectural acoustics using replica shell horns. CCRMA Director Dr. Chris Chafe joined us, and he and Rick moved through Chavín’s labyrinthine corridors, blasting and droning pututus in different articulations to identify and excite acoustic resonances in the confined interior “galleries” of the site.

Fig. 8: CCRMA Director Chris Chafe and archaeologist John Rick play replica pututus to test the acoustics of Chavín’s interior galleries. Photos by José Luis Cruzado.

The short reverberation times of Chavín’s interior architecture allow the pututus to be performed as percussive instruments in the galleries (ASA 2008). However, the strong modal resonances of the narrow corridors, alcoves, and rooms also support sustained tonal production, in an acoustically fascinating way. Present-day pututu players have reported the experience of their instruments’ tones being “pulled into tune” with these architectural resonances. This eerie effect is both sonic and sensed, an acoustic experience that is not only heard, but felt through the body, an external force that seemingly influences the way the instrument is played.
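
As a simplified picture of what a modal resonance is, the sketch below computes the axial mode frequencies between two parallel rigid walls, f_n = n × c / (2 × L), where L is the wall spacing and c is the speed of sound. The 1.0 m spacing is a hypothetical value, not a measurement from Chavín, and the speed of sound of 343 m/s is an assumed room-temperature figure. Real galleries have irregular stone surfaces and three-dimensional geometry, so this idealized model only illustrates why narrow spaces can reinforce particular tones.

```python
def axial_mode_frequencies(spacing_m, count=3, speed_of_sound=343.0):
    """Axial mode frequencies (Hz) between two parallel rigid walls.

    f_n = n * c / (2 * L) for n = 1, 2, ..., count.
    """
    if spacing_m <= 0:
        raise ValueError("wall spacing must be positive")
    return [n * speed_of_sound / (2.0 * spacing_m) for n in range(1, count + 1)]

# Hypothetical wall spacing of 1.0 m (illustration only, not site data).
print(axial_mode_frequencies(1.0))
```

Halving the spacing doubles every mode frequency, so the resonances that a player encounters depend strongly on the dimensions of the particular corridor or alcove.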

Fig. 9 (AUDIO MISSING): Resonant compliance: Discussion of phantom tuning effect as Kolar and Cruzado perform synchronizing replica pututus in the Laberintos Gallery at Chavín. Audio by Miriam Kolar.

From an acoustical science perspective, what could be happening? As is well known from musical acoustics research (e.g., Fletcher and Rossing 1998), shell horns are blown-open lip-reed or lip-valve instruments, terminology that refers to the physical dynamics of their sounding. Mechanically speaking, the instrument player’s lips vibrate (or “buzz”) in collaborative resonance with the oscillations produced within the air column of the pututu’s interior, known in instrument lingo as its “bore”. Novice players may have great difficulty producing sound, or immediately generate a strong tone; there’s not one typical tendency, though producing higher, lower, or sustained tones requires greater control.

Experienced pututu players such as Cook and La Rosa can change their lip vibrations to increase the frequency, and therefore raise the perceived pitch, that the shell horn produces. To drop the pitch below the instrument’s natural sounding tone (the fundamental resonant frequency of its bore), the player can insert a hand in the lip opening, or “bell”, of the shell horn. Instrument players also modify intonation by altering the shape of their vocal tracts. This vocal tract modification is produced intuitively, by “feel”, and may involve several different parts of that complex sound-producing system.

Strong architectural acoustic resonance can “couple”, or join, with the air column in the instrument, which is in turn coupled to that of the player’s vocal tract (with the player’s lips opening and closing in between). When the oscillatory frequencies of the player’s lips, those within the air column of his or her vocal tract, the pututu bore resonance, and the corridor resonance are synchronized, the effect can produce a strong sensation of immersion in the acoustic environment for the performer. The pututu is “tuned” to the architecture: both performer and shell horn are acoustically compliant with the architectural resonance.

When a second pututu player joins the first in the resonant architectural location, both players may share the experience of having their instrument tones guided into tune with the space, yet at the same time, sense the synchrony between their instruments. The closer together the shell openings, the more readily their frequencies will synchronize with each other. As Cook has observed, “if players are really close together, the wavefronts can actually get into the shells, and the lips of the players can phase lock.” (Interview between Kolar & Cook 2011: https://ccrma.stanford.edu/groups/chavin/interview_prc.html).

Fig. 10 (VIDEO): Kolar and Cruzado performing resonance-synchronizing replica pututus in the Laberintos Gallery at Chavín. Video by Miriam Kolar.

From the human interpretive perspective, what might pututu players in ancient Chavín have thought about these seemingly phantom instrument guides? A solo pututu performer who sensed the architectural and instrumental acoustic coupling might understand this effect to be externally driven, but how would s/he attribute the phenomenon? Would it be thought of as embodied by the instrument being played, or as an intervention of an otherworldly power, or an effect of some other aspect of the ceremonial context? Pairs or larger groups of performers experiencing the resonant pull might attribute the effect to the skill of a powerful lead player, with or without command of supernatural forces. Such readings are motivated by archaeological interpretations of Chavín as a cult center or religious site where social hierarchy was developing (Rick 2006).

However these eerie sonics might have been understood by people in ancient Chavín, from an acoustics perspective we can theorize and demonstrate complex yet elegant physical dynamics that are reported to produce strong experiential effects. Chavín’s phantom forces, however their causality might be interpreted, guide the sound of its instruments into resonant synchrony with each other and its architecture.

REFERENCES
Chavín de Huántar Archaeological Acoustics Project: https://ccrma.stanford.edu/groups/chavin/

(ASA 2002): Rick, John W., and David Lubman. “Characteristics and Speculations on the Uses of Strombus Trumpets found at the Ancient Peruvian Center Chavín de Huántar”. (Abstract). In Journal of the Acoustical Society of America 112/5, 2366, 2002.

(ASA 2010): Cook, Perry R., Abel, Jonathan S., Kolar, Miriam A., Huang, Patty, Huopaniemi, Jyri, Rick, John W., Chafe, Chris, and Chowning, John M. “Acoustic Analysis of the Chavín Pututus (Strombus galeatus Marine Shell Trumpets)”. (Abstract). Journal of the Acoustical Society of America, Vol. 128, No. 2, 359, 2010.

(Flower World 2012): Kolar, Miriam A., with Rick, John W., Cook, Perry R., and Abel, Jonathan S. “Ancient Pututus Contextualized: Integrative Archaeoacoustics at Chavín de Huántar, Perú”. In Flower World – Music Archaeology of the Americas, Vol. 1. Eds. M. Stöckli and A. Both. Ekho VERLAG, Berlin, 2012.

(ICTM 2013): Kolar, Miriam A. “Acoustics, Architecture, and Instruments in Ancient Chavín de Huántar, Perú: An Integrative, Anthropological Approach to Archaeoacoustics and Music Archaeology”. In Music & Ritual: Bridging Material & Living Cultures. Ed. R. Jiménez Pasalodos. Publications of the ICTM Study Group on Music Archaeology, Vol. 1. Ekho VERLAG, Berlin, 2013.

(Hartmann 1998): Hartmann, William M. Signals, Sound, and Sensation. Springer-Verlag, New York, 1998.

(ASA 2008): Abel, Jonathan S., Rick, John W., Huang, Patty P., Kolar, Miriam A., Smith, Julius O. / Chowning, John. “On the Acoustics of the Underground Galleries of Ancient Chavín de Huántar, Peru”. (Abstract). Journal of the Acoustical Society of America, Vol. 123, No. 3, 605, 2008.

(Fletcher and Rossing 1998): Fletcher, Neville H., and Thomas D. Rossing. The Physics of Musical Instruments. Springer-Verlag, New York, 1998.

Kolar and Cook Interview 2011: https://ccrma.stanford.edu/groups/chavin/interview_prc.html

(Rick 2006): Rick, John W. “Chavín de Huántar: Evidence for an Evolved Shamanism”. In Mesas and Cosmologies in the Central Andes (Douglas Sharon, ed.), 101-112. San Diego Museum Papers 44, San Diego, 2006.