1aSP2 – Propagation effects on acoustic particle velocity sensing

Sandra L. Collier – sandra.l.collier4.civ@mail.mil, Max F. Denis, David A. Ligon, Latasha I. Solomon, John M. Noble, W.C. Kirkpatrick Alberts, II, Leng K. Sim, Christian G. Reiff, Deryck D. James
U.S. Army Research Laboratory
2800 Powder Mill Rd
Adelphi, MD 20783-1138

Madeline M. Erikson
U.S. Military Academy
West Point, NY

Popular version of paper 1aSP2, “Propagation effects on acoustic particle velocity sensing”
Presented Monday morning, 7 May 2018, 9:20-9:40 AM, Greenway H/I
175th ASA Meeting Minneapolis, MN

Left: time series of the recorded particle velocity amplitude for propane cannon shots. Right: corresponding spectrogram. Upper: 100 m; lower: 400 m.

As a sound wave travels through the atmosphere, it may scatter from atmospheric turbulence. Energy is lost from the forward moving wave, and the once smooth wavefront may have tiny ripples in it if there is weak scattering, or large distortions if there is strong scattering. A significant amount of research has studied the effects of atmospheric turbulence on the sound wave’s pressure field. Past studies of the pressure field have found that strong scattering occurs when there are large turbulence fluctuations and/or the propagation range is long, both with respect to wavelength. This scattering regime is referred to as fully saturated. In the unsaturated regime, there is weak scattering and the atmospheric turbulence fluctuations and/or propagation distance are small with respect to the wavelength. The transition between the two regimes is referred to as partially saturated.

Usually, when people think of a sound wave, they think of the pressure field; after all, human ears are sophisticated pressure sensors, and microphones are pressure sensors too. But a sound wave is a mechanical wave described not only by its pressure field, but also by its particle velocity. The objective of our research is to examine the effects of atmospheric turbulence on the particle velocity. Particle velocity sensors (sometimes referred to as vector sensors) for use in air are relatively new, and as such, atmospheric turbulence studies of them have not been conducted before. We do this statistically, as the atmosphere is a random medium. This means that every time a sound wave propagates, there may be a different outcome – a different path, a change in phase, a change in amplitude. The probability distribution function describes the set of possible outcomes.

The cover picture illustrates a typical transient broadband event (propane cannon) recorded 100 m away from the source (upper plots). The time series on the left is the recorded particle velocity versus time. The spectrogram on the right is a visualization of the frequency content and intensity of the wave through time. The sharp vertical lines across all frequencies are the propane cannon shots. We also see other noise sources: a passing airplane (between 0 and 0.5 minutes) and noise from power lines (horizontal lines). The same shots recorded at 400 m are shown in the lower plots. We notice right away that there are numerous additional vertical lines – most probably due to wind noise. Since the sensor is farther away, the amplitude of the sound is reduced, the higher frequencies have attenuated, and the signal-to-noise ratio is lower.

The atmospheric conditions (low wind speeds, warm temperatures) led to convectively driven turbulence described by a von Kármán spectrum. Statistically, we found that the particle velocity had probability distributions similar to previous observations of the pressure field under similar atmospheric conditions: the unsaturated regime is observed at lower frequencies and shorter ranges, and the saturated regime is observed at higher frequencies and longer ranges. In the figure below (left), the unsaturated regime is seen as a tight collection of points, with little variation in phase (angle around the circle) or amplitude (distance from the center). At the beginning of the transition into the partially saturated regime, there are very small amplitude fluctuations and small phase fluctuations, and the set of observations has the shape of a comma (middle). In the saturated regime, there are large variations in both amplitude and phase, and the set of observations appears fully randomized – points everywhere (right).

Scatter plots of the particle velocity for observations over two days (blue – day 1; green – day 2).  From left to right, the scatter plots depict the unsaturated regime, partially saturated regime, and saturated regime.
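
To get a feel for what these scatter plots represent, the short sketch below simulates them: it draws complex field samples whose log-amplitude and phase fluctuations grow from one regime to the next. The fluctuation strengths are values we picked purely for illustration; this is not the analysis code used in the study.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 500  # simulated observations per regime

# Illustrative (log-amplitude std, phase std in radians) pairs for each regime;
# these numbers are invented for visualization, not taken from the measurements.
regimes = {
    "unsaturated": (0.05, 0.1),
    "partially saturated": (0.1, 1.0),
    "saturated": (0.5, 2 * np.pi),
}

fig, axes = plt.subplots(1, 3, figsize=(12, 4))
for ax, (name, (sigma_chi, sigma_phi)) in zip(axes, regimes.items()):
    chi = sigma_chi * rng.standard_normal(n)   # log-amplitude fluctuation
    phi = sigma_phi * rng.standard_normal(n)   # phase fluctuation
    field = np.exp(chi + 1j * phi)             # complex field sample
    ax.scatter(field.real, field.imag, s=5)
    ax.set_title(name)
    ax.set_xlabel("Re")
    ax.set_ylabel("Im")
    ax.set_aspect("equal")
plt.tight_layout()
plt.show()
```

The tight cluster, the comma, and the fully randomized cloud described above emerge directly from how large the phase fluctuations become relative to a full circle.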

The propagation environment has numerous other states that we also need to study to have a more complete picture. It is standard practice to benchmark the performance of different microphones, so as to determine sensor limitations and optimal operating conditions.  Similar studies should be done for vector field sensors once new instrumentation is available.  Vector sensors are of importance to the U.S. Army for the detection, localization, and tracking of potential threats in order to provide situational understanding and potentially life-saving technology to our soldiers. The particle velocity sensor we used was just bigger than a pencil. Including the windscreen, it was about a foot in diameter. Compare that to a microphone array that could be meters in size to accomplish the same thing.


Acknowledgement:
This research was supported in part by an appointment to the U.S. Army Research Laboratory
Research Associateship Program administered by Oak Ridge Associated Universities.

1pSP10 – Design of an Unmanned Aerial Vehicle Based on Acoustic Navigation Algorithm

Yunmeng Gong1 – (476793382@qq.com)
Huping Xu1 – (hupingxu@126.com)
Yu Hen Hu2 – (yuhen.hu@wisc.edu)
1 School of Logistic Engineering
Wuhan University of Technology
Wuhan, Hubei, China 430063
2 Department of Electrical and Computer Engineering
University of Wisconsin – Madison
Madison, WI 53706 USA

Popular version of paper 1pSP10, “Design of an Unmanned Aerial Vehicle Based on Acoustic Navigation Algorithm”
Presented on Monday afternoon, December 4, 2017, 4:05-4:20 PM, Salon D
174th ASA Meeting, New Orleans

Acoustic UAV guidance is an enabling technology for future urban UAV transportation systems. When large numbers of commercial UAVs are tasked to deliver goods and services in a metropolitan area, they will need to be guided to travel in an orderly fashion along aerial corridors above streets. They will also need to land on and take off from designated parking structures and obey “traffic signals” to mitigate potential collisions.

A UAV acoustic guidance system consists of a group of ground stations distributed over the operating region. When a UAV enters the system, its flight path will be guided by a regional air-traffic controller system. The UAV and the controller will communicate over a radio channel using Wi-Fi or 5G cellular Internet-of-Things protocols. The UAV’s position will be estimated from the direction-of-arrival (DoA) angles of narrowband acoustic signals.


Figure 1. UAV acoustic guidance system: (a) passive mode acoustic guidance system; (b) active mode acoustic guidance system

As shown in Figure 1, acoustic UAV guidance can operate in a passive self-guidance mode as well as an active guidance mode. In the passive self-guidance mode, beacons with known 3D positions emit known, distinct narrowband (harmonic) signals. A UAV passively receives these acoustic signals using an on-board microphone phased array and uses the sampled signals to estimate the DoA of each beacon’s harmonic signal. If the UAV is provided with the beacon stations’ 3D coordinates, it can determine its own location and heading to complement those estimated using GPS or inertial guidance systems. The advantage of the passive guidance system is that multiple UAVs can use the same group of beacon stations to estimate their own positions. The technical challenge is that each UAV must carry a bulky acoustic phased array, and the received acoustic signal suffers from strong noise interference due to the engine, propeller/rotor, and wind.

Conversely, in the active guidance mode, the UAV actively emits an omni-directionally transmitted, narrowband acoustic signal at a harmonic frequency designated by the local air-traffic controller. Each beacon station uses its local microphone phased array to estimate the DoA of the UAV’s acoustic signal. The UAV’s location, speed, and heading are then estimated by the local air-traffic controller and transmitted to the UAV. The advantage of the active guidance mode is that the UAV carries a lighter payload, consisting of an amplified speaker and related circuitry. The disadvantage is that each UAV within the region needs to generate a harmonic signal with a distinct center frequency; as the number of UAVs within the region increases, the available acoustic frequencies may be insufficient.

In this paper, we investigate key issues relating to the design and implementation of a passive mode acoustic guidance system. We ask fundamental questions such as: What is the effective range over which acoustic guidance can be applied? What size and configuration should the on-board phased array have? What is an efficient formulation of a direction-of-arrival estimation algorithm so that it can be implemented on the computers on board a UAV?

We conducted an on-the-ground experiment and measured the sound attenuation as a function of distance and harmonic frequency. The result is shown in Figure 2 below.

Figure 2. Sound attenuation in air as a function of distance for different harmonic frequencies
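
For a rough sense of how such curves behave, the sketch below combines simple spherical spreading (20·log10 of the distance ratio) with a frequency-dependent atmospheric absorption term. The absorption coefficients are placeholder values we chose for illustration; the real coefficients depend on temperature and humidity, and the measured curves in Figure 2 are the authoritative result.

```python
import numpy as np

def attenuation_db(r, alpha_db_per_m, r_ref=1.0):
    """Attenuation (dB) relative to a reference distance r_ref:
    spherical spreading plus linear atmospheric absorption."""
    return 20.0 * np.log10(r / r_ref) + alpha_db_per_m * (r - r_ref)

# Placeholder absorption coefficients (dB/m) for a few candidate beacon
# frequencies; actual values depend on temperature and humidity.
alpha = {1000: 0.005, 2000: 0.010, 4000: 0.030}

distances = np.array([10.0, 50.0, 100.0, 200.0])
for f_hz, a in alpha.items():
    print(f"{f_hz} Hz:", np.round(attenuation_db(distances, a), 1), "dB")
```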

Using a commercial UAV (a DJI Phantom model), we conducted experiments to study the frequency spectrum of the sound at different motion states, in order to identify beacon frequencies that are least interfered with by engine sound and noise. An example of the acoustic spectrum during take-off is shown in Figure 3 below.

Figure 3. UAV acoustic noise during take-off

We also developed a simplified direction-of-arrival estimation algorithm that achieves encouraging accuracy when implemented on an STM32F407 microcontroller, which can easily be installed on a UAV.
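
The paper does not spell out the algorithm here, but one lightweight way to estimate the direction of a narrowband beacon, sketched below under our own simplifying assumptions (two microphones with known spacing, a free-field plane wave, no UAV self-noise), is to read the beacon's phase difference between the two sensors and convert it to an angle.

```python
import numpy as np

C = 343.0  # speed of sound in air (m/s), assumed constant

def doa_from_phase(x1, x2, fs, f_beacon, mic_spacing):
    """Estimate the arrival angle (degrees from broadside) of a narrowband
    beacon from the phase difference between two microphone signals."""
    n = len(x1)
    k = int(round(f_beacon * n / fs))        # FFT bin of the beacon tone
    X1 = np.fft.rfft(x1)[k]
    X2 = np.fft.rfft(x2)[k]
    dphi = np.angle(X1 * np.conj(X2))        # phase lag of mic 2 behind mic 1
    tau = dphi / (2 * np.pi * f_beacon)      # time-difference of arrival (s)
    s = np.clip(tau * C / mic_spacing, -1.0, 1.0)
    return np.degrees(np.arcsin(s))

# Synthetic check: a 2 kHz beacon arriving 30 degrees off broadside.
fs, f0, d = 48000, 2000.0, 0.05
theta = np.radians(30)
t = np.arange(4800) / fs                     # 0.1 s, so f0 falls on an exact bin
delay = d * np.sin(theta) / C                # extra travel time to microphone 2
x1 = np.sin(2 * np.pi * f0 * t)
x2 = np.sin(2 * np.pi * f0 * (t - delay))
print(doa_from_phase(x1, x2, fs, f0, d))     # prints approximately 30
```

A real system would need several microphones for a full 3D bearing and would have to suppress rotor noise, which is part of why a simplified formulation and a microcontroller-class processor matter.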

2pSP6 – Directive and focused acoustic wave radiation by tessellated transducers with folded curvatures

Ryan L. Harne*: harne.3@osu.edu
Danielle T. Lynd: lynd.47@osu.edu
Chengzhe Zou: zou.258@osu.edu
Joseph Crump: crump.1@themetroschool.org
201 W 19th Ave., N350 Scott Lab, Department of Mechanical and Aerospace Engineering, The Ohio State University, Columbus, OH 43210, USA
* Corresponding author

Popular version of paper 2pSP6 presented Mon afternoon, 26 June 2017
173rd ASA Meeting, Boston, Massachusetts, USA

Directed or focused acoustic wave energies are central to many applications, ranging from ultrasound imaging to underwater ecosystem monitoring to voice and music projection. The interference patterns necessary to realize such directed or focused waves, guiding the radiated acoustic energy from transducer arrays to locations in space, require close control over the contribution of sound from each transducer source. Recent research has revealed advantages of mechanically reconfiguring acoustic transducer constituents along the folding patterns of an origami-inspired tessellation, as opposed to digitally processing signals sent to each element in a fixed configuration [1] [2] [3].

Video: Origami-inspired acoustic solutions. Credit: Harne/Lynd/Zou/Crump

One such proof-of-concept for a foldable, tessellated array of acoustic transducers is shown in Figure 1. We cut a folding pattern into piezoelectric PVDF (a type of plastic) film, which is then bonded to a polypropylene plastic substrate scored with the same folding pattern. Rather than controlling each constituent of the array individually, as in digital signal processing methods, driving the whole array with a single signal and mechanically reconfiguring it via the folding pattern provides a comparable means to guide the acoustic wave energy.


Figure 1. Folding pattern for the array, where blue are mountain folds and red are valley folds. The laser cut PVDF is bonded to polypropylene to result in the final proof-of-concept tessellated array prototype shown at right. The baffle fixture is needed to maintain the curvature and fixed-edge boundary conditions during experiments. Credit: Harne/Lynd/Zou/Crump

To date, this concept of foldable, tessellated arrays has shown that the transmission of sound in angularly narrow beams, referred to technically as the directionality of far-field wave radiation, can be adapted by orders of magnitude when the array constituents are driven by the same signal, up to a point dictated by the folding of a Miura-ori style of tessellated array.

Our research investigates a new form of adaptive acoustic energy delivery from foldable arrays by studying tessellated transducers that adopt folded curvatures, introducing the opportunity for near-field energy focusing alongside far-field directionality.

For instance, Fig. 1 reveals the curvature of the proof-of-concept array of star-shaped transducer components in the partially folded state. This suggests that the array will focus sound energy to a location near the radius of curvature. The outcomes of these computational and experimental efforts show that foldable, tessellated transducers that curve upon folding offer a straightforward means for the fine, real-time control needed to beam and focus sound to specific points in space.
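
A simple way to see why folded curvature produces focusing is to sum identical point sources placed along a circular arc, all driven by the same signal, and look for where their contributions add in phase. The sketch below is our own illustration of that idea, not the authors' computational model; the frequency, radius, and source count are arbitrary choices.

```python
import numpy as np

f = 2000.0                      # drive frequency (Hz), illustrative
c = 343.0                       # speed of sound (m/s)
k = 2 * np.pi * f / c           # wavenumber

# Point sources on a circular arc of radius R (a stand-in for the curved,
# partially folded tessellated array), all driven with the same unit signal.
R = 0.5                                          # radius of curvature (m)
angles = np.linspace(-0.6, 0.6, 25)              # arc spanning roughly 69 degrees
src = np.column_stack([R - R * np.cos(angles), R * np.sin(angles)])

# Evaluate the summed pressure magnitude along the arc's axis (y = 0).
x_axis = np.linspace(0.1, 1.5, 300)
field_pts = np.column_stack([x_axis, np.zeros_like(x_axis)])
r = np.linalg.norm(field_pts[:, None, :] - src[None, :, :], axis=2)
p = np.sum(np.exp(1j * k * r) / r, axis=1)       # sum of monopole contributions

x_peak = x_axis[np.argmax(np.abs(p))]
print(f"|p| peaks near x = {x_peak:.2f} m; the radius of curvature is {R} m")
```

Because every source on the arc is the same distance R from the center of curvature, the contributions arrive there in phase, which is the focusing behavior the curved array exploits.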

Due to the numerous applications of acoustic wave guiding, these concepts could enhance the versatility and multifunctionality of acoustic arrays by a more straightforward mechanical reconfiguration approach that controls the radiated or received wave field. Alternatively, by strategically integrating with digital signal processing methods, future studies might uncover new synergies of performance capabilities by using actively controlled, origami-inspired acoustic arrays.

References

[1] R.L. Harne, D.T. Lynd, Origami acoustics: using principles of folding structural acoustics for simple and large focusing of sound energy, Smart Materials and Structures 25, 085031 (2016).
[2] D.T. Lynd, R.L. Harne, Strategies to predict radiated sound fields from foldable, Miura-ori-based transducers for acoustic beamfolding, The Journal of the Acoustical Society of America 141, 480-489 (2017).
[3] C. Zou, R.L. Harne, Adaptive acoustic energy delivery to near and far fields using foldable, tessellated star transducers, Smart Materials and Structures 26, 055021 (2017).

2aSP5 – Using Automatic Speech Recognition to Identify Dementia in Early Stages

Roozbeh Sadeghian, J. David Schaffer, and Stephen A. Zahorian
Rsadegh1@binghamton.edu
SUNY at Binghamton
Binghamton, NY

Popular version of paper 2aSP5, “Using automatic speech recognition to identify dementia in early stages”
Presented Tuesday morning, November 3, 2015, 10:15 AM, City Terrace room
170th ASA Meeting, Jacksonville, Fl

The clinical diagnosis of Alzheimer’s disease (AD) and other dementias is very challenging, especially in the early stages. It is widely believed to be underdiagnosed, at least partially because of the lack of a reliable non-invasive diagnostic test. Additionally, recruitment for clinical trials of experimental dementia therapies might be improved with a highly specific test. Although there is much active research into new biomarkers for AD, most of these methods are expensive and/or invasive, such as brain imaging (often with radioactive tracers) or blood and spinal fluid sampling with expensive lab procedures.

There are good indications that dementias can be characterized by several aphasias (defects in the use of speech). This seems plausible since speech production involves many brain regions, and thus a disease that affects particular regions involved in speech processing might leave detectable fingerprints in the speech. Computerized analysis of speech signals and computational linguistics (analysis of word patterns) have progressed to the point where an automatic speech analysis system could be within reach as a tool for detection of dementia. The long-term goal is an inexpensive, short-duration, non-invasive test for dementia; one that can be administered in an office or home by clinicians with minimal training.

If a pilot study (cross-sectional design: only one sample from each subject) indicates that suitable combinations of features derived from a voice sample can strongly indicate disease, then the research will move to a longitudinal design (many samples collected over time) in which sizable cohorts will be followed so that early indicators might be discovered.

A simple procedure for acquiring speech samples is to ask subjects to describe a picture (see Figure 1). Some such samples are available on the web (DementiaBank), but they were collected long ago and the audio quality is often poor. We used 140 of these older samples, but also collected 71 new samples with good-quality audio. Roughly half of the samples had a clinical diagnosis of probable AD, and the others were demographically similar and cognitively normal (NL).


Figure 1. The pictures used for recording samples: (a) the famous “cookie theft” picture and (b) the picture used for the newly recorded samples

One hundred twenty-eight features were automatically extracted from the speech signals, including pauses and pitch variation (indicating emotion); word-use features were extracted from manually prepared transcripts. In addition, we had the results of a popular cognitive test, the Mini-Mental State Exam (MMSE), for all subjects. While widely used as an indicator of cognitive difficulties, the MMSE is not sufficiently diagnostic for dementia by itself. We searched for patterns with and without the MMSE. This gives the possibility of a clinical test that combines speech with the MMSE. Multiple patterns were found using an advanced pattern discovery approach (genetic algorithms with support vector machines). The performances of two example patterns are shown in Figure 2. The training samples (red circles) were used to discover the patterns, so we expect them to perform well. The validation samples (blue) were not used for learning, only to test the discovered patterns. If we declare a subject AD when the test score is > 0.5 (the red line in Figure 2), we can see some errors: in the left panel there is one false positive (an NL case with a high test score, blue triangle) and several false negatives (AD cases with low scores, red circles).


Figure 2. Two discovered diagnostic patterns (left: with MMSE; right: without MMSE). The normal subjects are to the left in each plot (low scores) and the AD subjects to the right (high scores). No perfect pattern has yet been discovered.
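
The full pattern-discovery machinery (a genetic algorithm searching for feature subsets, scored by support vector machines) is more than we can reproduce here, but the core scoring step can be sketched as follows. Everything in this snippet is synthetic and hypothetical: the data are random stand-ins for the 128 speech features plus MMSE, and the "pattern" is a hand-picked feature subset rather than one found by a genetic algorithm.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(42)

# Synthetic stand-in for the feature matrix: 128 speech features plus MMSE.
n_subjects, n_features = 200, 129
X = rng.standard_normal((n_subjects, n_features))
y = rng.integers(0, 2, n_subjects)   # 0 = cognitively normal (NL), 1 = probable AD
X[y == 1, :5] += 1.0                 # give a few features some artificial signal

# In the study a genetic algorithm chose the feature subset; here it is fixed.
pattern = [0, 1, 2, 3, 4]

X_tr, X_va, y_tr, y_va = train_test_split(
    X[:, pattern], y, test_size=0.3, random_state=0)

clf = SVC(probability=True).fit(X_tr, y_tr)

# Score each held-out subject and declare AD when the score exceeds 0.5,
# as with the red decision line in Figure 2.
scores = clf.predict_proba(X_va)[:, 1]
declared_ad = scores > 0.5
print("validation accuracy:", np.mean(declared_ad == (y_va == 1)))
```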

As mentioned above, manually prepared transcripts were used for these results, since automatic speaker-independent speech recognition is very challenging for small highly variable data sets.  To be viable, the test should be completely automatic.  Accordingly, the main emphasis of the research presented at this conference is the design of an automatic speech-to-text system and automatic pause recognizer, taking into account the special features of the type of speech used for this test of dementia.

3aSPb5 – Improving Headphone Spatialization: Fixing a problem you’ve learned to accept

Muhammad Haris Usmani – usmani@cmu.edu
Ramón Cepeda Jr. – rcepeda@andrew.cmu.edu
Thomas M. Sullivan – tms@ece.cmu.edu
Bhiksha Raj – bhiksha@cs.cmu.edu
Carnegie Mellon University
5000 Forbes Avenue
Pittsburgh, PA 15213

Popular version of paper 3aSPb5, “Improving headphone spatialization for stereo music”
Presented Wednesday morning, May 20, 2015, 10:15 AM, Brigade room
169th ASA Meeting, Pittsburgh

The days of grabbing a drink, brushing dust from your favorite record, and playing it in the listening room of the house are long gone. Today, with the portability technology has enabled, almost everybody listens to music on headphones. However, most commercially produced stereo music is mixed and mastered for playback on loudspeakers, and this presents a problem for the growing number of headphone listeners. When a legacy stereo mix is played on headphones, all instruments or voices in the piece are placed between the listener’s ears, inside their head. This is not only unnatural and fatiguing for the listener, but also detrimental to the original placement of the instruments in the musical piece. It disturbs the spatialization of the music and makes the sound image appear as three isolated lobes inside the listener’s head [1], see Figure 1.


Hard-panned instruments separate into the left and right lobes, while instruments placed at center stage are heard in the center of the head. However, because hearing is a dynamic process that adapts and settles with the perceived sound, we have learned to accept that this is how headphones sound [2].

In order to improve the spatialization of headphones, the listener’s ears must be deceived into thinking that they are listening to the music inside of a listening room. When playing music in a room, the sound travels through the air, reverberates inside the room, and interacts with the listener’s head and torso before reaching the ears [3]. These interactions add the necessary psychoacoustic cues for perception of an externalized stereo soundstage presented in front of the listener. If this listening room is a typical music studio, the soundstage perceived is close to what the artist intended. Our work tries to place the headphone listener into the sound engineer’s seat inside a music studio to improve the spatialization of music. For the sake of compatibility across different headphones, we try to make minimal changes to the mastering equalization curve of the music.

Since there is a compromise between sound quality and the spatialization that can be presented, we developed three different systems that present different levels of such compromise. We label these as Type-I, Type-II, and Type-0. Type-I focuses on improving spatialization but at the cost of losing some sound quality, Type-II improves spatialization while taking into account that the sound quality is not degraded too much, and Type-0 focuses on refining conventional listening by making the sound image more homogeneous. Since the sound quality is key in music, we will skip over Type-I and focus on the other two systems.

Type-II consists of a head-related transfer function (HRTF) model [4], room reverberation (synthesized reverb [5]), and a spectral correction block. HRTFs embody all the complex spatialization cues that exist due to the relative positions of the listener and the source [6]. In our case, a general HRTF model is used, configured to place the listener at the “sweet spot” in the studio (right and left speakers placed at an angle of 30° from the listener’s head). The spectral correction attempts to keep the original mastering equalization curve as intact as possible.
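
Conceptually, the Type-II chain boils down to a pair of convolutions per ear: each channel of the mix is rendered through the HRTF of a virtual loudspeaker at ±30°, and some room reverberation is added. The sketch below shows that structure with generic impulse-response files; the file names, reverb level, and final normalization are placeholders of ours, not the specific HRTF model, synthesized reverb, or spectral correction used in the paper.

```python
import numpy as np
import soundfile as sf                    # assumed available for audio file I/O
from scipy.signal import fftconvolve

# Hypothetical impulse responses; file names are placeholders.
hrir_L, _ = sf.read("hrir_minus30deg.wav")   # 2 columns: left speaker -> each ear
hrir_R, _ = sf.read("hrir_plus30deg.wav")    # 2 columns: right speaker -> each ear
room, _ = sf.read("studio_reverb_mono.wav")  # mono room impulse response
mix, fs = sf.read("legacy_stereo_mix.wav")   # the original stereo master

def render_ear(ear):
    """One ear's signal: both virtual speakers convolved with their HRIRs,
    plus a modest amount of room reverberation."""
    dry = (fftconvolve(mix[:, 0], hrir_L[:, ear]) +
           fftconvolve(mix[:, 1], hrir_R[:, ear]))
    wet = fftconvolve(dry, room)[:len(dry)]
    return dry + 0.2 * wet                   # 20% reverb is an illustrative choice

out = np.column_stack([render_ear(0), render_ear(1)])
# A spectral-correction filter would follow here to preserve the mastering EQ.
sf.write("type2_render.wav", out / np.max(np.abs(out)), fs)
```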

Type-0 is made up of a side-content crossfeed block and a spectral correction block. Some headphone amplifiers allow crossfeed between the left and right channels to model the fact that, when listening through loudspeakers, each ear hears both speakers, with a delay attached to the sound originating from the farther speaker. A shortcoming of conventional crossfeed is that the delay we can apply is limited (to avoid comb filtering) [7]. Side-content crossfeed resolves this by only crossfeeding content unique to each channel, allowing us to use larger delays. In this system, the side content is extracted using a stereo-to-3 upmixer, implemented as a novel extension of Nikunen et al.’s upmixer [8].
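
A minimal version of conventional crossfeed looks like the sketch below, with delay and gain values we chose for illustration rather than the paper's settings. The side-content variant applies the same idea only to the content that differs between the channels (roughly the L-R "side" component in a simple mid/side view; the paper extracts it more carefully with the stereo-to-3 upmixer), which is what permits the larger delays.

```python
import numpy as np

def conventional_crossfeed(left, right, fs, delay_ms=0.3, gain=0.5):
    """Feed an attenuated, delayed copy of each channel into the other,
    mimicking the path from the far loudspeaker to each ear.
    The delay and gain are illustrative values, not the paper's settings."""
    d = int(round(delay_ms * 1e-3 * fs))
    pad = np.zeros(d)
    left_delayed = np.concatenate([pad, left])
    right_delayed = np.concatenate([pad, right])
    left_out = np.concatenate([left, pad]) + gain * right_delayed
    right_out = np.concatenate([right, pad]) + gain * left_delayed
    return left_out, right_out

# Example: a tone hard-panned to the left now also appears, delayed and
# attenuated, in the right channel.
fs = 44100
t = np.arange(fs) / fs
left = np.sin(2 * np.pi * 440 * t)
right = np.zeros_like(left)
L, R = conventional_crossfeed(left, right, fs)
```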

These systems were put to the test in a subjective evaluation with 28 participants, all between 18 and 29 years of age. The participants were introduced to the metrics being measured at the beginning of the evaluation. Since the first part of the evaluation included specific spatial metrics that are difficult for untrained listeners to grasp, we used a collection of descriptions, diagrams, and music excerpts representing each metric to provide in-evaluation training. The results of the first part of the evaluation suggest that this method worked well.
We concluded from the results that Type-II externalized the sounds while performing at a level comparable to the original source on the other metrics, and that Type-0 improved sound quality and comfort by compromising stereo width compared to the original source, which is what we expected. There was also strong content dependence in the results, suggesting that a different spatialization setting should be used for music that has been produced differently. Overall, two of the three proposed systems in this work were preferred in equal or greater amounts to the legacy stereo mix.


References

[1] G-Sonique, “Monitor MSX5 – Headphone monitoring system,” G-Sonique, 2011. [Online]. Available: http://www.g-sonique.com/msx5headphonemonitoring.html.
[2] S. Mushendwa, “Enhancing Headphone Music Sound Quality,” Aalborg University – Institute of Media Technology and Engineering Science, 2009.
[3] Yong Guk Kim et al., “An Integrated Approach of 3D Sound Rendering,” Springer-Verlag Berlin Heidelberg, vol. II, PCM 2010, pp. 682–693, 2010.
[4] D. Rocchesso, “3D with Headphones,” in DAFX: Digital Audio Effects, Chichester, John Wiley & Sons, 2002, pp. 154-157.
[5] P. E. Roos, “Samplicity’s Bricasti M7 Impulse Response Library v1.1,” Samplicity, [Online]. Available: http://www.samplicity.com/bricasti-m7-impulse-responses/.
[6] R. O. Duda, “3-D Audio for HCI,” Department of Electrical Engineering, San Jose State University, 2000. [Online]. Available: http://interface.cipic.ucdavis.edu/sound/tutorial/. [Accessed 15 4 2015].
[7] J. Meier, “A DIY Headphone Amplifier With Natural Crossfeed,” 2000. [Online]. Available: http://headwize.com/?page_id=654.
[8] J. Nikunen, T. Virtanen and M. Vilermo, “Multichannel Audio Upmixing by Time-Frequency Filtering Using Non-Negative Tensor Factorization,” Journal of the AES, vol. 60, no. 10, pp. 794-806, October 2012.

4pAAa2 – Uncanny Acoustics: Phantom Instrument Guides at Ancient Chavín de Huántar, Peru

Miriam Kolar, Ph.D. – mkolar@amherst.edu
AC# 2255, PO Box 5000
Architectural Studies Program & Dept. of Music
Amherst College
Amherst, MA 01002

Popular version of paper 4pAAa2, “Pututus, Resonance and Beats: Acoustic Wave Interference Effects at Ancient Chavín de Huántar, Perú”
Presented Thursday afternoon, October 30, 2014
168th ASA Meeting, Indianapolis
See also: Archaeoacoustics: Re-Sounding Material Culture

Excavated from Pre-Inca archaeological sites high in the Peruvian Andes, giant conch shell horns known as “pututus” have been discovered far from the tropical sea floor these marine snails once inhabited.

Fig. 1a: Excavation of a Chavín pututu at Chavín de Huántar, 2001. Photo by John Rick.


Fig. 1 B-C: Chavín pututus: decorated 3,000-year-old conch shell horns from the Andes, on display at the Peruvian National Museum in Chavín de Huántar. Photos by José Luis Cruzado.

At the 3,000-year-old ceremonial center Chavín de Huántar, carvings on massive stone blocks depict humanoid figures holding and perhaps blowing into the weighty shells. A fragmented ceramic orb depicts groups of conches or pututus separated from spiny oysters by rectilinear divisions on its relief-modeled surface. Fossil sea snail shells are paved into the floor of the site’s Circular Plaza.

Fig. 2: Depictions of pututu players on facing stones in the Circular Plaza at Chavín. Photo by José Luis Cruzado & Miriam Kolar.

Pututus are the only known musical or sound-producing instruments from Chavín, whose monumental stone architecture was constructed and used over several centuries during the first millennium B.C.E.

Fig. 3 (VIDEO): Chavín’s monumental stone-and-earthen-mortar architecture towers above plazas and encloses kilometers of labyrinthine corridors, rooms, and canals. Video by José Luis Cruzado and Miriam Kolar, with soundtrack of a Chavín pututu performed by Tito La Rosa in the Museo Nacional Chavín.

How, by whom, and in what cultural contexts were these instruments played at ancient Chavín? What was their significance? How did they sound, and what sonic effects could have been produced between pututus and Chavín’s architecture or landform surroundings? Such questions haunt and intrigue archaeoacousticians, who apply the science of sound to material traces of the ancient past. Acoustic reconstructions of ancient buildings, instruments, and soundscapes can help us connect with our ancestors through experiential analogy. Computer music pioneer Dr. John Chowning and archaeologist Dr. John Rick founded the Chavín de Huántar Archaeological Acoustics Project (https://ccrma.stanford.edu/groups/chavin/) to discover more.

Material traces of past life––such as artifacts of ancient sound-producing instruments and architectural remains––provide data from which to reconstruct ancient sound. Nineteen use-worn Strombus galeatus pututus were unearthed at Chavín in 2001 by Stanford University’s Rick and teams. Following initial sonic evaluation by Rick and acoustician David Lubman (ASA 2002), a comprehensive assessment of their acoustics and playability was made in 2008 by Dr. Perry Cook and researchers based at Stanford’s Center for Computer Research in Music and Acoustics (CCRMA).

Fig. 4: Dr. Perry Cook performs acoustic measurements of the Chavín pututus. Photo by José Luis Cruzado.

Transforming an empty room at the Peruvian National Museum at Chavín into a musical acoustics lab, we established a sounding-tone range for these specific instruments from about 272 Hz to 340 Hz (frequencies corresponding to a few notes ascending from around Middle C on the piano), and charted their harmonic structure.

Fig. 5 (VIDEO): Dr. Perry Cook conducting pututu measurements with Stanford CCRMA team. Video by José Luis Cruzado.

Back at CCRMA, Dr. Jonathan Abel led audio digital signal processing to map their strong directionality, and to track the progression of sound waves through their exponentially spiraling interiors. This data constitutes a digital archive of the shell instrument sonics, and drives computational acoustic models of these so-called Chavín pututus (ASA 2010; Flower World 2012; ICTM 2013).

Where does data meet practice? How could living musicians further inform our study? Cook’s expertise as a winds and shells player allowed him to evaluate the Chavín pututus’ playability with respect to a variety of other instruments, and to produce a range of articulations. Alongside the acoustic measurement sessions, Peruvian master musician Tito La Rosa offered a performative journey, a meditative ritual beginning and ending with the sound of human breath, the source of pututu sounding. This reverent approach took us away from our laboratory perspectives for a moment, and pushed us to consider not only the performative dynamics of voicing the pututus, but also their potential for nuanced sonic expression.

Fig. 6 (VIDEO): Tito La Rosa performs one of the Chavín pututus in the Museo Nacional Chavín. Video by Cobi van Tonder.

When Cook and La Rosa played pututus together, we noted the strong acoustic “beats” that result when shell horns’ similar frequencies constructively and destructively interfere, producing an amplitude variation at a much lower frequency. Some nearby listeners described this as a “warbling” or “throbbing” of the tone, and said they thought that the performers were creating this effect through a performance technique (not so; it’s a well-known acoustic wave-interference phenomenon; see Hartmann 1998: 393-396).

Fig. 7 (VIDEO): José Cruzado and Swiss trombonist Michael Flury demonstrate amplitude “beats” between replica pututus in Chavín’s Doble Ménsula Gallery. Video by Miriam Kolar.
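
The arithmetic behind the warble is simple: two shell horns sounding at nearby frequencies f1 and f2 sum to a tone at their average frequency whose loudness rises and falls |f1 - f2| times per second. The snippet below illustrates this with two tones chosen from within the measured 272-340 Hz range purely for illustration; they are not the pitches of any particular pair of Chavín pututus.

```python
import numpy as np

fs = 44100
t = np.arange(0, 2.0, 1.0 / fs)

f1, f2 = 300.0, 304.0            # two pututu-like tones a few hertz apart
two_horns = np.sin(2 * np.pi * f1 * t) + np.sin(2 * np.pi * f2 * t)

# Trigonometric identity behind the beating:
#   sin(2*pi*f1*t) + sin(2*pi*f2*t)
#     = 2 * cos(pi * (f1 - f2) * t) * sin(pi * (f1 + f2) * t)
# i.e. a 302 Hz tone whose amplitude envelope "warbles" at |f1 - f2| = 4 Hz.
print("beat frequency:", abs(f1 - f2), "Hz")
print("peak amplitude of the summed signal:", round(np.max(np.abs(two_horns)), 2))
```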

If present-day listeners are unaware of an acoustics explanation for a sound effect, how might ancient listeners have understood and attributed such a sound? A pututu player would know that s/he was not articulating this warble, yet would be subject to its strong sensations. How would this visceral experience be interpreted? Might it be experienced as a phantom force?

The observed acoustic beating effect between pututus was so impressive that we sought to reproduce it during our on-site tests of architectural acoustics using replica shell horns. CCRMA Director Dr. Chris Chafe joined us, and he and Rick moved through Chavín’s labyrinthine corridors, blasting and droning pututus in different articulations to identify and excite acoustic resonances in the confined interior “galleries” of the site.


Fig. 8: CCRMA Director Chris Chafe and archaeologist John Rick play replica pututus to test the acoustics of Chavín’s interior galleries. Photos by José Luis Cruzado.

The short reverberation times of Chavín’s interior architecture allow the pututus to be performed as percussive instruments in the galleries (ASA 2008). However, the strong modal resonances of the narrow corridors, alcoves, and rooms also support sustained tonal production, in an acoustically fascinating way. Present-day pututu players have reported the experience of their instruments’ tones being “pulled into tune” with these architectural resonances. This eerie effect is both sonic and sensed, an acoustic experience that is not only heard, but felt through the body, an external force that seemingly influences the way the instrument is played.

Fig. 9 (AUDIO MISSING): Resonant compliance: Discussion of phantom tuning effect as Kolar and Cruzado perform synchronizing replica pututus in the Laberintos Gallery at Chavín. Audio by Miriam Kolar.

From an acoustical science perspective, what could be happening? As is well known from musical acoustics research (e.g., Fletcher and Rossing 1998), shell horns are blown-open lip-reed or lip-valve instruments, terminology that refers to the physical dynamics of their sounding. Mechanically speaking, the instrument player’s lips vibrate (or “buzz”) in collaborative resonance with the oscillations produced within the air column of the pututu’s interior, known in instrument lingo as its “bore”. Novice players may have great difficulty producing sound, or immediately generate a strong tone; there’s not one typical tendency, though producing higher, lower, or sustained tones requires greater control.

Experienced pututu players such as Cook and La Rosa can change their lip vibrations to increase the frequency––and therefore raise the perceived pitch––that the shell horn produces. To drop the pitch below the instrument’s natural sounding tone (the fundamental resonant frequency of its bore), the player can insert a hand in the lip opening, or “bell”, of the shell horn. Instrument players also modify intonation by altering the shape of their vocal tracts. This vocal tract modification is produced intuitively, by “feel”, and may involve several different parts of that complex sound-producing system.

Strong architectural acoustic resonance can “couple”, or join, with the air column in the instrument, which is in turn coupled to the air column of the player’s vocal tract (with the player’s lips opening and closing in between). When the oscillatory frequencies of the player’s lips, those within the air column of his or her vocal tract, the pututu bore resonance, and the corridor resonance are synchronized, the effect can produce a strong sensation of immersion in the acoustic environment for the performer. The pututu is “tuned” to the architecture: both performer and shell horn are acoustically compliant with the architectural resonance.

When a second pututu player joins the first in the resonant architectural location, both players may share the experience of having their instrument tones guided into tune with the space, yet at the same time, sense the synchrony between their instruments. The closer together the shell openings, the more readily their frequencies will synchronize with each other. As Cook has observed, “if players are really close together, the wavefronts can actually get into the shells, and the lips of the players can phase lock.” (Interview between Kolar & Cook 2011: https://ccrma.stanford.edu/groups/chavin/interview_prc.html).

Fig. 10 (VIDEO): Kolar and Cruzado performing resonance-synchronizing replica pututus in the Laberintos Gallery at Chavín. Video by Miriam Kolar.

From the human interpretive perspective, what might pututu players in ancient Chavín have thought about these seemingly phantom instrument guides? A solo pututu performer who sensed the architectural and instrumental acoustic coupling might understand this effect to be externally driven, but how would s/he attribute the phenomenon? Would it be thought of as embodied by the instrument being played, or as an intervention of an otherworldly power, or an effect of some other aspect of the ceremonial context? Pairs or multiple performers experiencing the resonant pull might attribute the effect to the skill of a powerful lead player, with or without command of supernatural forces. Such interpretations are motivated by archaeological interpretations of Chavín as a cult center or religious site where social hierarchy was developing (Rick 2006).

However these eerie sonics might have been understood by people in ancient Chavín, from an acoustics perspective we can theorize and demonstrate complex yet elegant physical dynamics that are reported to produce strong experiential effects. Chavín’s phantom forces––however their causality might be interpreted––guide the sound of its instruments into resonant synchrony with each other and its architecture.

REFERENCES
Chavín de Huántar Archaeological Acoustics Project: https://ccrma.stanford.edu/groups/chavin/

(ASA 2002): Rick, John W., and David Lubman. “Characteristics and Speculations on the Uses of Strombus Trumpets found at the Ancient Peruvian Center Chavín de Huántar”. (Abstract). In Journal of the Acoustical Society of America 112/5, 2366, 2002.

(ASA 2010): Cook, Perry R., Abel, Jonathan S., Kolar, Miriam A., Huang, Patty, Huopaniemi, Jyri, Rick, John W., Chafe, Chris, and Chowning, John M. “Acoustic Analysis of the Chavín Pututus (Strombus galeatus Marine Shell Trumpets)”. (Abstract). Journal of the Acoustical Society of America, Vol. 128, No. 2, 359, 2010.

(Flower World 2012): Kolar, Miriam A., with Rick, John W., Cook, Perry R., and Abel, Jonathan S. “Ancient Pututus Contextualized: Integrative Archaeoacoustics at Chavín de Huántar, Perú”. In Flower World – Music Archaeology of the Americas, Vol. 1. Eds. M. Stöckli and A. Both. Ekho VERLAG, Berlin, 2012.

(ICTM 2013): Kolar, Miriam A. “Acoustics, Architecture, and Instruments in Ancient Chavín de Huántar, Perú: An Integrative, Anthropological Approach to Archaeoacoustics and Music Archaeology”. In Music & Ritual: Bridging Material & Living Cultures. Ed. R. Jiménez Pasalodos. Publications of the ICTM Study Group on Music Archaeology, Vol. 1. Ekho VERLAG, Berlin, 2013.

(Hartmann 1998): Hartmann, William M. Signals, Sound, and Sensation. Springer-Verlag, New York, 1998.

(ASA 2008): Abel, Jonathan S., Rick, John W., Huang, Patty P., Kolar, Miriam A., Smith, Julius O., and Chowning, John. “On the Acoustics of the Underground Galleries of Ancient Chavín de Huántar, Peru”. (Abstract). Journal of the Acoustical Society of America, Vol. 123, No. 3, 605, 2008.

(Fletcher and Rossing 1998): Fletcher, Neville H., and Thomas D. Rossing. The Physics of Musical Instruments. Springer-Verlag, New York, 1998.

Kolar and Cook Interview 2011: https://ccrma.stanford.edu/groups/chavin/interview_prc.html

(Rick 2006): Rick, John W. “Chavín de Huántar: Evidence for an Evolved Shamanism”. In Mesas and Cosmologies in the Central Andes (Douglas Sharon, ed.), 101-112. San Diego Museum Papers 44, San Diego, 2006.