Underwater sound from recreational swimmers, divers, surfers, and kayakers
Christine Erbe – Curtin University, firstname.lastname@example.org
Miles Parsons – Curtin University and Australian Institute of Marine Science, email@example.com
Alec Duncan – Curtin University, A.J.Duncan@curtin.edu.au
Klaus Lucke – Curtin University and JASCO Applied Sciences, Klaus.firstname.lastname@example.org
Alexander Gavrilov – Curtin University, A.Gavrilov@curtin.edu.au
Kim Allen – THHINK Autonomous Systems, email@example.com
Centre for Marine Science & Technology, Curtin University, Bentley, 6102 Western Australia, AUSTRALIA
Popular version of paper 1aAO5
Presented Monday morning, May 7, 2018, 11:10-11:25 a.m., GREENWAY A
175th ASA Meeting, Minneapolis, MN
Video 1: Underwater video and sound recording of different water sports activities.
Underwater sound contains a lot of information about the source that produces it. Ships, for example, have a characteristic sound signature underwater, by which the type of vessel, its speed, and its route can easily be determined. In some cases, individual vessels can be identified by their sound and information about the type of propulsion, operational mode, and load can be deduced and maintenance issues (e.g., relating to the propeller) can be picked out. Similarly, just by listening, we can study marine life from whales to fishes and shrimp; we can track their movements; monitor their behavior; and in the case of some species of dolphins, even say which family and individuals are there. Sound is an important commodity for marine life; marine mammals as well as fishes, for example, communicate through sound, sense their environment, navigate, and forage—all mediated by sound.
Given the important role sound plays in the life functions of marine fauna, the potential interference by man-made noise has received growing interest. Noise may disrupt animal behavior, affect their hearing abilities, mask communication, cause stress, and in extreme cases cause physical and physiological damage that can ultimately be fatal. The research and management focus has, quite sensibly, been on the strongest sources, such as geophysical surveys or coastal and marine construction. Non-motorised activities are expected to be quieter and have hardly been studied.
Within the framework of an underwater acoustic project, we had the opportunity to record ourselves and friends performing a number of recreational water sports activities in a quiet Olympic pool, with all surrounding machinery (including cleaning pumps) switched off [1,2]. Specifically, different people were filmed and acoustically recorded while swimming breaststroke, backstroke, freestyle, and butterfly; snorkeling with and without fins; paddling a surfboard with alternating single or double arms; scuba diving; kayaking; and jumping into the pool. Sound pressure and water particle velocity were measured.
Activities that occurred at the surface, repeatedly pierced it, and hence created bubble clouds were the strongest sound generators. Received levels were 110-131 dB re 1 µPa (10-16,000 Hz) for all of the activities at the closest point of approach (1 m). Levels were lower than the limits found in environmental noise regulations, but were clearly above ambient noise levels recorded off beaches and hence predicted to be audible to marine fauna over tens to hundreds of meters.
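As a rough illustration of how a level measured at 1 m translates into an audible range, the sketch below assumes simple spherical spreading (a loss of 20 log10(r) dB at range r in meters) and a hypothetical ambient level of 90 dB re 1 µPa. Neither the spreading model nor the ambient value comes from the study; real shallow-water propagation and real beach noise will differ.

```python
import math


def detection_range_m(level_at_1m_db, ambient_db, spreading_coeff=20.0):
    """Range (m) at which a level measured at 1 m falls to the ambient level.

    Assumes transmission loss = spreading_coeff * log10(r), with r in meters.
    spreading_coeff = 20 is idealized spherical spreading (an assumption).
    """
    excess_db = level_at_1m_db - ambient_db
    if excess_db <= 0:
        # Already at or below ambient at the 1 m reference distance.
        return 1.0
    return 10 ** (excess_db / spreading_coeff)


# Hypothetical ambient level, chosen only for illustration.
AMBIENT_DB = 90.0

# The quietest and loudest received levels reported at 1 m.
for level_db in (110.0, 131.0):
    r = detection_range_m(level_db, AMBIENT_DB)
    print(f"{level_db:.0f} dB at 1 m -> about {r:.0f} m to reach ambient")
```

Under these assumptions the two ends of the measured range come out at about 10 m and a little over 100 m, which is the same order of magnitude as the "tens to hundreds of meters" quoted above.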
The characterization and quantification of underwater sound from recreational water sports has applicability well beyond environmental management. For example, just by listening to the recordings, it is easy to identify which of the volunteers was in the pool and which activity (including which style of swimming, with or without fins, with single versus double arms, etc.) was performed. The better (i.e., faster and smoother) swimmers were the quieter swimmers. Underwater sound might be a useful tool to assess professional or competitive swimmer performance and could be used for security monitoring of pools.
[1] C. Erbe, M. Parsons, A. J. Duncan, K. Lucke, A. Gavrilov and K. Allen, “Underwater particle motion (acceleration, velocity and displacement) from recreational swimmers, divers, surfers and kayakers,” Acoustics Australia 45, 293-299 (2017). doi: 10.1007/s40857-017-0107-6
[2] C. Erbe, M. Parsons, A. J. Duncan and K. Allen, “Underwater acoustic signatures of recreational swimmers, divers, surfers and kayakers,” Acoustics Australia 44 (2), 333-341 (2016). doi: 10.1007/s40857-016-0062-7
Acoustic Cloaking Using the Principles of Active Noise Cancellation
Jordan Cheer – firstname.lastname@example.org
Institute of Sound and Vibration Research
University of Southampton
Popular version of paper 4pEA7, “Cancellation, reproduction and cloaking using sound field control”
Presented Thursday morning, December 1, 2016
172nd ASA Meeting, Honolulu
Loudspeakers are synonymous with audio reproduction and are widely used to play sounds people want to hear. Loudspeakers have also been used for the opposite purpose, to attenuate noise that people may not want to hear. Active noise cancellation technology is an example of this, which combines loudspeakers, microphones and digital signal processing to adaptively control unwanted noise sources [1].
More recently, the scientific community has focused attention on controlling and manipulating sound fields to acoustically cloak objects, with the aim of rendering objects acoustically invisible. A new class of engineered materials called metamaterials has already demonstrated this ability [2]. However, acoustic cloaking has also been demonstrated using methods based on both sound field reproduction and active noise cancellation [3]. Despite these demonstrations, there has been limited research exploring the physical links between acoustic cloaking, active noise cancellation and sound field reproduction. Therefore, we began exploring these links with the aim of developing active acoustic cloaking systems that build on the advanced knowledge of implementing both audio reproduction and active noise cancellation systems.
Acoustic cloaking attempts to control the sound scattered from a solid object. Using a numerical computer simulation, we therefore investigated the physical limits on active acoustic cloaking in the presence of a rigid scattering sphere. The scattering sphere, shown in Figure 1, was surrounded by an array of sources (loudspeakers) used to control the sound field, shown by the black dots surrounding the sphere in the figure. In the first instance we investigated the effect of the scattering sphere on a simple sound field.
Figure 2 shows a horizontal slice through the simulated sound field without a scattering object. Comparing it with the same slice when the sphere is present, shown in Figure 3, makes the modifications caused by the sphere obvious. Scattering from the sphere distorts the sound field, rendering the sphere acoustically visible.
Figure 1 – The geometry of the rigid scattering sphere and the array of sources, or loudspeakers used to control the sound field (black dots).
Figure 3 – The sound field produced when an acoustic plane wave is incident on the rigid scattering sphere.
To understand the physical limitations on controlling this sound field, and thus implementing an active acoustic cloak, we investigated the ability of the array of loudspeakers surrounding the scattering sphere to achieve acoustic cloaking [4]. In contrast to active noise cancellation, which attempts to cancel the total sound field, we attempted to control only the scattered component of the sound field and thus render the sphere acoustically invisible.
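The difference between cancelling the total field and controlling only the scattered field can be written as two least-squares problems. The notation below is an assumption introduced for illustration and is not taken from the paper: the fields are sampled at a set of sensor positions, and the transfer matrix is assumed to have full column rank (at least as many sensor positions as control sources).

```latex
% Assumed notation (illustrative, not from the article):
%   p_i : incident sound field sampled at the sensor positions
%   p_s : field scattered by the sphere at the same positions
%   q   : complex strengths of the control sources (loudspeakers)
%   G   : transfer matrix from source strengths to sensor pressures
\begin{align}
  \mathbf{p}_{\text{total}} &= \mathbf{p}_i + \mathbf{p}_s + \mathbf{G}\mathbf{q} \\
  J_{\text{cancel}}(\mathbf{q}) &= \lVert \mathbf{p}_i + \mathbf{p}_s + \mathbf{G}\mathbf{q} \rVert^2
    && \text{(active noise cancellation: drive the total field to zero)} \\
  J_{\text{cloak}}(\mathbf{q}) &= \lVert \mathbf{p}_s + \mathbf{G}\mathbf{q} \rVert^2
    && \text{(cloaking: leave } \mathbf{p}_i \text{ untouched)} \\
  \mathbf{q}_{\text{cloak}} &= -\left(\mathbf{G}^{\mathsf{H}}\mathbf{G}\right)^{-1}\mathbf{G}^{\mathsf{H}}\mathbf{p}_s
\end{align}
```

When the cloaking cost is driven close to zero, the controlled field reduces to approximately the incident field alone, which is what an observer would measure if the sphere were absent.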
With active acoustic cloaking, the scattered component is significantly attenuated and the sound field appears undisturbed: the resulting field, shown in Figure 4, is indistinguishable from the object-less simulation of Figure 2.
Figure 4 – The sound field produced when active acoustic cloaking is used to attempt to cancel the sound field scattered by a rigid scattering sphere and thus render the scattering sphere acoustically ‘invisible’.
Our results indicate that active acoustic attenuation can be achieved using an array of loudspeakers surrounding a sphere that would otherwise scatter sound detectably. In this and related work, further investigations showed that active acoustic cloaking is most effective when the loudspeakers are in close proximity to the object being cloaked. This may lead to design concepts involving acoustic sources embedded in objects for acoustic cloaking or control of the scattered sound field.
Future work will attempt to demonstrate the performance of active acoustic cloaking experimentally and overcome significant challenges of not only controlling the scattered sound field, but detecting it using an array of microphones.
[1] P. Nelson and S. J. Elliott, Active Control of Sound, 436 (Academic Press, London) (1992).
[2] L. Zigoneanu, B. I. Popa, and S. A. Cummer, “Three-dimensional broadband omnidirectional acoustic ground cloak,” Nat. Mater., 13 (4), 352-355 (2014).
[3] E. Friot and C. Bordier, “Real-time active suppression of scattered acoustic radiation,” J. Sound Vib., 278, 563-580 (2004).
[4] J. Cheer, “Active control of scattered acoustic fields: Cancellation, reproduction and cloaking,” J. Acoust. Soc. Am., 140 (3), 1502-1512 (2016).
Ever noticed that people sound different on your cell phone than in person, or wondered why? You might already know that the reason is that a cell phone doesn’t transmit all of the sounds that the human voice creates. Specifically, cell phones don’t transmit very low-frequency sounds (below about 300 Hz) or high-frequency sounds (above about 3,400 Hz). The voice can and typically does make sounds at very high frequencies in the “treble” audio range (from about 6,000 Hz up to 20,000 Hz) in the form of vocal overtones and noise from consonants. Your cell phone cuts all of this out, however, leaving it up to your brain to “fill in” if you need it.
Figure 1. A spectrogram showing acoustical energy up to 20,000 Hz (on a logarithmic axis) created by a male human voice. The current cell phone bandwidth (dotted line) only transmits sounds between about 300 and 3400 Hz. High-frequency energy (HFE) above 6000 Hz (solid line) has information potentially useful to the brain when perceiving singing and speech.
What are you missing out on? One way to answer this question is to have individuals listen to only the high frequencies and report what they hear. We can do this using conventional signal processing methods: cut out everything below 6,000 Hz thereby only transmitting sounds above 6,000 Hz to the ear of the listener. When we do this, some listeners only hear chirps and whistles, but most normal-hearing listeners report hearing voices in the high frequencies. Strangely, some voices are very easy to hear out in the high frequencies, while others are quite difficult. The reason for this difference is not yet clear. You might experience this phenomenon if you listen to the following clips of high frequencies from several different voices. (You’ll need a good set of high-fidelity headphones or speakers to ensure you’re getting the high frequencies.)
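The idea of "cutting out everything below 6,000 Hz" can be sketched with a simple digital high-pass filter. The example below is a minimal illustration using only the Python standard library. It assumes a 44.1 kHz sample rate and uses a single second-order high-pass stage with the widely used Audio EQ Cookbook coefficient formulas. The filter actually used in the study is not described here and was very likely much steeper, so this is a stand-in and not the authors' method.

```python
import math


def highpass_biquad_coeffs(cutoff_hz, sample_rate_hz, q=1 / math.sqrt(2)):
    """Second-order high-pass coefficients (Audio EQ Cookbook form), with a0 normalized to 1."""
    w0 = 2 * math.pi * cutoff_hz / sample_rate_hz
    alpha = math.sin(w0) / (2 * q)
    cos_w0 = math.cos(w0)
    a0 = 1 + alpha
    b = [(1 + cos_w0) / 2 / a0, -(1 + cos_w0) / a0, (1 + cos_w0) / 2 / a0]
    a = [1.0, -2 * cos_w0 / a0, (1 - alpha) / a0]
    return b, a


def apply_biquad(samples, b, a):
    """Filter a list of samples with a direct-form I biquad."""
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x0 in samples:
        y0 = b[0] * x0 + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2
        out.append(y0)
        x2, x1 = x1, x0
        y2, y1 = y1, y0
    return out


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def gain_at(freq_hz, cutoff_hz=6000.0, sample_rate_hz=44100.0, n=4410):
    """Measured output/input RMS ratio for a pure tone, skipping the start-up transient."""
    b, a = highpass_biquad_coeffs(cutoff_hz, sample_rate_hz)
    tone = [math.sin(2 * math.pi * freq_hz * i / sample_rate_hz) for i in range(n)]
    filtered = apply_biquad(tone, b, a)
    return rms(filtered[200:]) / rms(tone[200:])


# A 1 kHz tone (inside the telephone band) is strongly attenuated,
# while a 10 kHz tone (in the high-frequency range) passes almost unchanged.
print(f"gain at  1 kHz: {gain_at(1000.0):.3f}")
print(f"gain at 10 kHz: {gain_at(10000.0):.3f}")
```

A single second-order stage rolls off at only 12 dB per octave, so some energy just below 6,000 Hz still leaks through. Cascading several stages, or using a longer FIR filter, would give a sharper cut.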
Until recently, these treble frequencies were only thought to affect some aspects of voice quality or timbre. If you try playing with the treble knob on your sound system you’ll probably notice the change in quality. We now know, however, that it’s more than just quality (see Monson et al., 2014). In fact, the high frequencies carry a surprising amount of information about a vocal sound. For example, could you tell the gender of the voices you heard in the examples? Could you tell whether they were talking or singing? Could you tell what they were saying or singing? (Hint: the words are lyrics to a familiar song.) Most of our listeners could accurately report all of these things, even when we added noise to the recordings.
Figure 2. A frequency spectrum (on a linear axis) showing the energy in the high frequencies combined with speech-shaped low-frequency noise.
What does this all mean? Cell phone and hearing aid technology is now attempting to include transmission of the high frequencies. It is tempting to speculate how inclusion of the high frequencies in cell phones, hearing aids, and even cochlear implants might benefit listeners. Lack of high-frequency information might be why we sometimes experience difficulty understanding someone on our phones, especially when sitting on a noisy bus or at a cocktail party. High frequencies might be of most benefit to children who tend to have better high-frequency hearing than adults. And what about quality? High frequencies certainly play a role in determining voice quality, which means vocalists and sound engineers might want to know the optimal amount of high-frequency energy for the right aesthetic. Some voices naturally produce higher amounts of high-frequency energy, and this might contribute to how well you like that voice. These possibilities give rise to many research questions we hope to pursue in our study of the high frequencies.
Monson, B. B., Hunter, E. J., Lotto, A. J., and Story, B. H. (2014). “The perceptual significance of high-frequency energy in the human voice,” Frontiers in Psychology, 5, 587, doi: 10.3389/fpsyg.2014.00587.
Hearing voices in the high frequencies: What your cell phone isn’t telling you
Brian B. Monson – email@example.com
Department of Pediatric Newborn Medicine
Brigham and Women’s Hospital
Harvard Medical School
75 Francis St
Boston, MA 02115
Popular version of paper 4pAAa12
Presented Thursday afternoon, October 30, 2014
168th ASA Meeting, Indianapolis
For many decades, speech scientists have marveled at the complexity of speech sounds. In English, a relatively simple task of distinguishing “bat” from “pat” can involve as many as 16 different sound cues. Also, English vowels are pronounced so differently across speakers that one person’s “Dan” can sound like another’s “done”. Despite all this, most adult native English speakers are able to understand English speech sounds rapidly, effortlessly, and accurately. In contrast, learning a new language is not an easy task, partly because the characteristics of foreign speech sounds are unfamiliar to us. For instance, Mandarin Chinese is a tonal language, which means that the pitch pattern used to produce each syllable can change the meaning of the word. Therefore, the word “ma” can mean “mother”, “hemp”, “horse”, or “to scold,” depending on whether the word was produced with a flat, rising, dipping, or a falling pitch pattern. It is no surprise that many native English speakers struggle in learning Mandarin Chinese. At the same time, some seem to master these new speech sounds with relative ease. With our research, we seek to discover the neural and genetic bases of this individual variability in language learning success. In this paper, we are focusing on genes that target activity of two distinct neural regions: prefrontal cortex and striatum.
Recent advances in speech science research strongly suggest that for adults, learning speech sounds for the first time is a cognitively challenging task. What this means is that every time you hear a new speech sound, a region of your brain called the prefrontal cortex – the part of the cerebral cortex that sits right under your forehead – must do extra work to extract relevant sound patterns and parse them according to learned rules. Such activity in the prefrontal cortex is driven by dopamine, which is one of the many chemicals that the cells in your brain use to communicate with each other. In general, higher dopamine activity in the prefrontal cortex means better performance in complex and difficult tasks.
Interestingly, there is a well-studied gene called COMT that affects the dopamine activity level in the prefrontal cortex. Everybody has a COMT gene, although with different subtypes. Individuals with a subtype of the COMT gene that promotes dopamine activity perform hard tasks better than do those with other subtypes. In our study, we found that the native English speakers with the dopamine-promoting subtype of the COMT gene (40 out of 169 participants) learned Mandarin Chinese speech sounds better than those with different subtypes. This means that, by assessing your COMT gene profile, you might be able to predict how well you will learn a new language.
However, this is only half the story. While new learners may initially use their prefrontal cortex to discern foreign speech sound contrasts, expert learners are less likely to do so. As with any other skill, speech perception becomes more rapid, effortless, and accurate with practice. At this stage, your brain can bypass all that burdensome cognitive reasoning in the prefrontal cortex. Instead, it can use the striatum – a deep structure within the brain – to directly decode the speech sounds. We find that the striatum is more active for expert learners of new speech sounds. Furthermore, individuals with a subtype of a gene called FOXP2 that promotes flexibility of the striatum to new experiences (31 out of 204 participants) were found to learn Mandarin Chinese speech sounds better than those with other subtypes.
Our research suggests that learning speech sounds in a foreign language involves multiple neural regions, and that genetic variations which affect the activity within those regions lead to better or worse learning. In other words, your genetic framework may be contributing to how well you learn to understand a new language. What we do not know at this point is how these variables interact with other sources of variability, such as prior experience. Previous studies have shown that extensive musical training, for example, can enhance learning speech sounds of a foreign language. We are a long way from cracking the code of how the brain, a highly complex organ, functions. We hope that a neurocognitive genetic approach may help bridge the gap between biology and language.
Han-Gyol Yi – firstname.lastname@example.org
W. Todd Maddox – email@example.com
The University of Texas at Austin
2504A Whitis Ave. (A1100)
Austin, TX 78712
Valerie S. Knopik – firstname.lastname@example.org
Rhode Island Hospital
593 Eddy Street
Providence, RI 02903
John E. McGeary – email@example.com
Providence Veterans Affairs Medical Center
830 Chalkstone Avenue
Providence, RI 02908
Bharath Chandrasekaran – firstname.lastname@example.org
The University of Texas at Austin
2504A Whitis Ave. (A1100)
Austin, TX 78712
Popular version of paper 4aSCb16
Presented Thursday morning, October 30, 2014
168th ASA Meeting, Indianapolis