Shhh! Smart Tech at Work: Zoning in on Target Sounds Amid the Noise

Jingya Yang – jing.ya161@gmail.com

Department of Power Mechanical Engineering, National Tsing Hua University, Hsinchu 300, Taiwan

Popular version of 1aSP2 – Target-Direction Sound Extraction Using a Hybrid DSP/Deep Learning Approach
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me//web/index.php?page=IntHtml&project=ASAFALL24&id=3771518

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


In a noisy world, capturing clear audio from specific directions can be a game-changer. Imagine a system that can zero in on a target sound, even amid background noise. This is the goal of Target-Direction Sound Extraction (TDSE), a process designed to isolate sounds from a particular direction while filtering out unwanted noise.

Our team has developed an innovative TDSE system that combines Digital Signal Processing (DSP) and deep learning. Traditional sound extraction relies on signal processing alone, but it struggles when multiple sounds arrive from different directions or when only a few microphones are available. Deep learning can help, but it sometimes distorts the audio. By integrating DSP-based spatial filtering with a deep neural network (DNN), our system extracts clear target audio with minimal interference, even with a limited number of microphones.

The system relies on spatial filtering techniques like beamforming and blocking. Beamforming serves as a signal estimator, enhancing sounds from the target direction, while blocking acts as a noise estimator, suppressing sounds from the target direction and leaving other unwanted noises intact. Using a deep learning model, our system processes spatial features and sound embeddings (unique characteristics of the target sound), yielding clear, isolated audio. In our tests, this method improved sound quality by 3-9 dB and performed well with different microphone setups, even those not used during training.
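To give a feel for the beamforming and blocking idea, here is a toy two-microphone, delay-and-sum sketch in Python. This is an illustration of the general technique only, not the authors' actual system: the signals, delays, and noise levels are invented for the example.

```python
import numpy as np

def delay_and_sum(mics, delays_samples):
    """Align each microphone channel to the target direction, then average.

    mics: array of shape (n_mics, n_samples)
    delays_samples: per-channel delay (in samples) that aligns the channel
                    with the target direction.
    """
    aligned = np.stack([np.roll(ch, -d) for ch, d in zip(mics, delays_samples)])
    return aligned.mean(axis=0)          # target enhanced (signal estimate)

def blocking(mics, delays_samples):
    """Cancel the target by differencing the aligned channels (noise estimate)."""
    aligned = np.stack([np.roll(ch, -d) for ch, d in zip(mics, delays_samples)])
    return aligned[0] - aligned[1]       # target cancelled, noise remains

# Toy scene: a target sound reaching mic 1 three samples after mic 0,
# plus uncorrelated sensor noise on each channel.
rng = np.random.default_rng(0)
target = rng.standard_normal(1000)
mics = np.stack([target + 0.1 * rng.standard_normal(1000),
                 np.roll(target, 3) + 0.1 * rng.standard_normal(1000)])

signal_est = delay_and_sum(mics, [0, 3])   # beamformer output
noise_est = blocking(mics, [0, 3])         # blocking output
```

In the hybrid system described above, outputs like these two estimates become the spatial features a neural network refines into the final, clean target audio.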

Audio 1 & Audio 2

TDSE could transform various industries, from virtual meetings to entertainment, by enhancing audio clarity in real time. Our system’s design offers flexibility, making it adaptable for real-world applications where clear directional audio is crucial.

This approach is an exciting step toward more robust, adaptive audio processing systems, allowing users to capture target sounds even in challenging environments.

Introducing Project ELLA: Enhancing Early Language and Literacy

Jennell Vick – jvick@chsc.org
Twitter: @DrJVick

Cleveland Hearing and Speech Center
6001 Euclid Avenue Suite 100
Cleveland, OH, 44103
United States

Popular version of 2aSC4 – From intention to understanding and back again: How a simple message of ‘Catch and Pass’ can build language in children
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me/web/index.php?page=IntHtml&project=ASAFALL24&id=3763506

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


Project ELLA (Early Language and Literacy for All) is an exciting new program designed to boost early language and literacy skills in young children. The program uses a simple yet powerful message, “Catch and Pass,” to teach parents, grandparents, daycare teachers and other caregivers the importance of having back-and-forth conversations with children from birth. These interactions help build and strengthen the brain’s language pathways, setting the foundation for lifelong learning.

Developed by the Cleveland Hearing & Speech Center, Project ELLA focuses on helping children in the greater Cleveland area, especially those in under-resourced communities. Community health workers visit neighborhoods to build trust with neighbors, raise awareness about the importance of responsive interactions for language development, and help empower families to put their children on track for later literacy (see Video 1). They also identify children who may need more help through speech and language screenings. For those children, Project ELLA offers free speech-language therapy and caregiver support at Cleveland Hearing & Speech Center.

The success of the project is measured by tracking the number of children and families served, the progress of children in therapy, the knowledge and skills of caregivers and teachers, and the partnerships established in the community (See Fig. 1). Project ELLA is a groundbreaking model that has the potential to transform language and literacy development in Cleveland and beyond.

Early Language and Literacy for All

Sound That Gets Under Your Skin (Literally): Testing Bone Conduction Headphones

Kiersten Reeser – kreeser@ara.com

Applied Research Associates, Inc., 7921 Shaffer Pkwy, Littleton, Colorado, 80127, United States

Twitter: @ARA_News_Events
Instagram: @appliedresearchassociates

Additional authors:
Alexandria Podolski
William Gray
Andrew Brown
Theodore Argo

Popular version of 1pEA3 – Investigating Commercially Available Force Sensors for Bone Conduction Hearing Device Evaluation
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me//web/index.php?page=IntHtml&project=ASAFALL24&id=3771572

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


Bone conduction (BC) headphones produce sound without covering the outer ears, offering an appealing alternative to conventional headphones. While BC technologies have long been used for diagnosing and treating hearing loss, consumer BC devices have become increasingly popular with a variety of claimed benefits, from safety to sound quality. However, objectively measuring BC signals – to guide improvement of device design, for example – presents several unique challenges, beginning with measurement of the BC signal itself.

Airborne audio signals, like those generated by conventional headphones, are measured using microphones; BC signals are generated by vibrating transducers pressed against the head. These vibrations are affected by where and how tightly the BC headphones are positioned on the head, as well as by other factors, including individual anatomy.

BC devices have historically been evaluated using an artificial mastoid (Figure 1 – left), a specialized (and expensive) measurement tool that was designed to simulate key properties of the tissue behind the ear, capturing the output of selected clinical BC devices under carefully controlled measurement conditions. While the artificial mastoid’s design allows for high-precision measurements, it does not account for the variety of shapes and sizes of consumer BC devices. Stakeholders ranging from manufacturers to researchers need a method to measure the effective outputs of consumer BC devices as worn by actual listeners.

Figure 1. The B&K Artificial Mastoid (left) is the standard solution for measuring BC device output. There is a need for a sensor that can be placed between the BC device and the human head for real-life measurements of the device’s output.

 

Our team, made up of collaborators at Applied Research Associates, Inc. (ARA) and the University of Washington, is working to develop a system that can be used across a wide variety of unique anatomy, BC devices, and sensor placement locations (Figure 1 – right). The goal is to use thin/flexible sensors placed directly under BC devices during use to accurately and repeatably measure the coupling of the BC device with the head (static force) and the audio-frequency vibrations produced by the device (dynamic force).

Three low-cost force sensors have been identified, shown in Figure 2, each based on a different underlying technology with the potential to meet the requirements for characterizing BC device output. Preliminary testing revealed that all three can produce static force measurements. However, the detectable frequencies and signal quality of the dynamic force measurements varied with each sensor’s sensing design and circuitry. The design of the Ohmite force-sensing resistor (Figure 2, left) limited the quality of the measured signal. The SingleTact force-sensing capacitor (Figure 2, middle) was incapable of collecting dynamic measurements for audio signals. The Honeywell FSA (Figure 2, right) was limited by its circuitry and could only partially detect the desired frequency ranges.

Figure 2. Three force sensors were evaluated: Ohmite force-sensing resistor (left), SingleTact force-sensing capacitor (middle), and Honeywell FSA (right).
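The static/dynamic distinction can be illustrated with a simple simulation. This Python sketch is not the team's measurement pipeline; the sample rate, force values, and tone frequency are invented. It models a sensor reading as a steady coupling force plus an audio-frequency vibration, then separates the two.

```python
import numpy as np

fs = 8000                          # sample rate in Hz (assumed for illustration)
t = np.arange(fs) / fs             # one second of samples

# Simulated sensor output: a steady coupling force (static) plus an
# audio-frequency vibration from the BC transducer (dynamic).
static_force = 2.0                              # newtons, device pressed on head
vibration = 0.05 * np.sin(2 * np.pi * 500 * t)  # 500 Hz test tone
reading = static_force + vibration

# Static component: the DC level, estimated here as the signal mean.
static_est = reading.mean()

# Dynamic component: what remains after removing the DC level.
dynamic_est = reading - static_est

# Identify the dominant vibration frequency with an FFT.
spectrum = np.abs(np.fft.rfft(dynamic_est))
peak_hz = np.fft.rfftfreq(len(dynamic_est), 1 / fs)[spectrum.argmax()]
```

A real sensor's limits show up in exactly this kind of analysis: a device that reports `static_est` correctly but loses or distorts `dynamic_est` in parts of the audio band cannot fully characterize BC device output.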

 

Further testing and development are necessary to identify whether dynamic force measurements can be improved by utilizing different hardware for data collection or implementing different data analysis techniques. Parallel efforts are focused on streamlining the interface between the BC device and the sensors to improve listener comfort.

Mitigating Train Derailments Through Proactive Condition Monitoring of Rolling Stock

Constantine Tarawneh – constantine.tarawneh@utrgv.edu

University Transportation Center for Railway Safety
University of Texas Rio Grande Valley
Edinburg, Texas, 78539
United States

Popular version of 3aPA4 – Preventing Hot Bearing Derailments via Wireless Onboard Condition Monitoring
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me//web/index.php?page=IntHtml&project=ASAFALL24&id=3770628

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


The 2023 train derailment in East Palestine, OH, brought attention to the limitations of the detectors currently used in the industry. Typically, the health of train bearings is monitored intermittently by wayside temperature detection systems that can be as far as 40 miles apart. Yet catastrophic bearing failure is often sudden and develops rapidly. Current wayside detection systems are reactive in nature and depend on significant temperature increases above ambient, so when these systems are triggered, train operators rarely have enough time to react before a derailment occurs, as happened in East Palestine. Multiple comprehensive studies have shown that the temperature difference between healthy and faulty bearings is not statistically meaningful until the onset of catastrophic failure; temperature alone is therefore an insufficient metric for health monitoring.

Video 1. Vibration and noise emitted from train bearings.

Over the past two decades, we have demonstrated vibration-based solutions for wireless onboard condition monitoring of train components to address this problem. Early stages of bearing failure are reliably detected via vibration and acoustic signatures, as shown in Video 1, which can also be used to determine the severity and location of the failure. This is accomplished in three levels of analysis: Level 1 determines the bearing condition by comparing its vibration levels to a maximum vibration threshold for healthy bearings; Level 2 analyzes the vibration signature to identify the defective component within the bearing; and Level 3 estimates the size of the defect using a developed correlation that relates vibration levels to defect size.
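The three-level logic can be sketched in code. This is a hypothetical illustration only: the threshold, the characteristic defect frequencies, and the linear defect-size correlation below are placeholders, not the published values or the team's actual algorithm.

```python
import numpy as np

def assess_bearing(vib_rms, spectrum_hz, spectrum_amp, healthy_rms_max,
                   defect_freqs, size_slope):
    """Toy three-level bearing screen (all numbers are placeholders)."""
    # Level 1: flag the bearing only if vibration exceeds the healthy threshold.
    if vib_rms <= healthy_rms_max:
        return {"condition": "healthy"}
    # Level 2: match the strongest spectral peak to a component's
    # characteristic defect frequency (outer race, inner race, roller, ...).
    peak_hz = spectrum_hz[np.argmax(spectrum_amp)]
    component = min(defect_freqs, key=lambda c: abs(defect_freqs[c] - peak_hz))
    # Level 3: estimate defect size from a hypothetical linear correlation
    # between vibration level above threshold and spall size.
    size_in = size_slope * (vib_rms - healthy_rms_max)
    return {"condition": "defective", "component": component,
            "defect_size_in": round(size_in, 2)}

# Example: an over-threshold bearing whose spectrum peaks near the
# (invented) outer-race defect frequency.
result = assess_bearing(
    vib_rms=4.0,
    spectrum_hz=np.array([60.0, 87.0, 119.0]),
    spectrum_amp=np.array([0.2, 1.5, 0.3]),
    healthy_rms_max=2.5,
    defect_freqs={"outer race": 88.0, "inner race": 118.0, "roller": 60.0},
    size_slope=0.5)
```

The appeal of this layered structure is that each level only runs when the previous one raises a flag, keeping onboard computation light until a bearing actually needs attention.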

To demonstrate this process, Figure 1 provides the vibration and temperature profiles for two bearings. Examining the vibration profile, the vibration levels within bearing R7 exceed the maximum threshold for healthy bearings 730 hours into the test, thus indicating a defective bearing. At the same time, the operating temperature of that bearing never exceeded the normal operating range, which would suggest that the bearing is healthy. Upon teardown and visual inspection, we found severe damage to the bearing components at the raceways, as pictured in Figure 2. Despite severe damage all around the bearing inner ring (cone), the operating temperature did not indicate any abnormal behavior.

Figure 1: Vibration and temperature profiles of two railroad bearings showcasing how vibration levels within the bearings can indicate the development of defects while operating temperature does not exhibit any abnormal behavior.
Figure 2: Picture of the damage that developed within bearing R7 (refer to Figure 1). Interestingly, the bearing inner ring (cone) had severe damage and a crack that the vibration levels detected but the operating temperature did not.

We believe that vibration-based sensors can provide proactive monitoring of bearing conditions affording rail operators ample time to detect the onset of bearing failure and schedule non-disruptive maintenance. Our work aims to continue to optimize these new methods and help the rail industry deploy these technologies to advance rail safety and efficiency. Moreover, this research program has had an extraordinary transformative impact from the local to the national level by training hundreds of engineers from underrepresented backgrounds and positioning them for success in industry, government, and higher education.

How Pitch, Dynamics, and Vibrato Shape Emotions in Violin Music

Wenyi Song – wsongak@cse.ust.hk
Twitter: @sherrys72539831

Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR

Anh Dung DINH
addinh@connect.ust.hk

Andrew Brian Horner
horner@cse.ust.hk
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR

Popular version of 1aMU2 – The emotional characteristics of the violin with different pitches, dynamics, and vibrato levels
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me//web/index.php?page=IntHtml&project=ASAFALL24&id=3767557

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


Music has a unique way of moving us emotionally, but have you ever wondered how individual sounds shape these feelings?

In our study, we looked at how different features of violin notes, like pitch (how high or low a note sounds), dynamics (how loud or soft it is played), and vibrato (a rapid, slight wavering of the pitch), combine to create emotional responses. While previous research often focuses on each feature in isolation, we explored how they interact, revealing how the violin’s sounds evoke specific emotions.

To conduct this study, we used single-note recordings from the violin at different pitches, two levels of dynamics (loud and soft), and two vibrato settings (no vibrato and high vibrato). We invited participants to listen to these sounds and rate their emotional responses using a scale of emotional positivity (valence) and intensity (arousal). Participants also selected which emotions they felt from a list of 16 emotions, such as joyful, nervous, relaxed, or agitated.
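The core of the analysis behind such a listening test is straightforward: average the valence and arousal ratings within each condition and compare across conditions. The sketch below uses invented ratings on a hypothetical 1-9 scale purely to illustrate the bookkeeping; it is not our actual data or analysis code.

```python
import numpy as np

# Hypothetical data: each row is one listener's (valence, arousal) rating
# of a single note; conditions are (dynamics, vibrato) pairs.
ratings = {
    ("loud", "high vibrato"): np.array([[4, 8], [3, 7], [4, 9]]),
    ("soft", "no vibrato"):   np.array([[7, 3], [6, 2], [7, 4]]),
}

# Average valence and arousal per condition.
means = {cond: r.mean(axis=0) for cond, r in ratings.items()}

for (dyn, vib), (val, aro) in means.items():
    print(f"{dyn}, {vib}: valence={val:.2f}, arousal={aro:.2f}")
```

With real data, the same per-condition means (plus the counts of the 16 emotion labels) are what plots like Figures 1 and 2 summarize.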

Audio 1. Violin single-note sample used in the experiment (middle C, loud dynamics, no vibrato).

Audio 2. Violin single-note sample used in the experiment (middle C, soft dynamics, no vibrato).

Audio 3. Violin single-note sample used in the experiment (middle C, loud dynamics, high vibrato).

Audio 4. Violin single-note sample used in the experiment (middle C, soft dynamics, high vibrato).

Our findings reveal that each element plays a unique role in shaping emotions. As shown in Figure 1, higher pitches and strong vibrato generally raised emotional intensity, creating feelings of excitement or tension. Lower pitches were more likely to evoke sadness or calmness, while loud dynamics made emotions feel more intense. Surprisingly, sounds without vibrato were linked to calmer emotions, while vibrato added energy and excitement, especially for emotions like anger or fear. Figure 2 illustrates how strong vibrato enhances emotions like anger and sadness, while the absence of vibrato correlates with calmer feelings.

Figure 1. Average valence and arousal ratings for different levels of pitch, dynamics, and vibrato. Higher pitches and strong vibrato increase arousal, while soft dynamics and no vibrato are linked to higher valence, highlighting pitch as the most influential factor.

 

Figure 2. Average ratings across the 16 emotions for different levels of pitch, dynamics, and vibrato. Strong vibrato enhances angry and sad emotions, while no vibrato supports calm emotions; higher pitches increase arousal for angry emotions, and brighter tones evoke calm and happy emotions.

Our research provides insights for musicians, composers, and even music therapists, helping them understand how to use the violin’s features to evoke specific emotions. With this knowledge, violinists can fine-tune their performance to match the emotional impact they aim to create, and composers can carefully select sounds that resonate with listeners’ emotional expectations.

Understanding rapid fluid flow from the passage of a sound wave

James Friend – jfriend@ucsd.edu

Medically Advanced Devices Laboratory, Department of Mechanical and Aerospace Engineering, University of California San Diego, La Jolla, CA, 92093, United States

Popular version of 1pPA6 – Acoustic Streaming
Presented at the 187th ASA Meeting
Read the abstract at https://eppro01.ativ.me//web/index.php?page=Session&project=ASAFALL24&id=3770639

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–


Acoustic streaming is the flow of fluid driven by the interaction of sound waves with that fluid. Traditionally, this effect was viewed as slow and steady, but recent research shows it can cause fluids to flow rapidly and usefully. To understand how this mechanism works, the researchers devised an entirely new approach to the problem, spatiotemporally separating the acoustics from the fluid flow and providing a closed-form solution, a first. This phenomenon has applications in areas like medical diagnostics, biosensing, and microfluidics, where precise fluid manipulation is needed, and the analysis techniques may prove useful in fields from particle physics to geoengineering.