How Online Meetings Change Your Voice—and How We Measure It

Akira Takeuchi – takeuchi.akira@studio-infinity.co.jp

Instagram: @akira_reference_
Studio Infinity
Tokyo, Minato-ku, 107-0061
Japan

Additional Authors
Yixuan Huang, Miki Morinaga, Satoshi Tsuboya, Yuto Hosoya, and Sungyoung Kim

Popular version of 1pCA5 – Evaluating speech quality for automatic transcription in videoconferencing
Presented at the 189th ASA Meeting
Read the abstract at https://eppro02.ativ.me/appinfo.php?page=Session&project=ASAASJ25&id=3983372&server=eppro02.ativ.me

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

Ghosts in Online Meetings: Why Clear Voices Sometimes Get Lost
Have you ever noticed that voices suddenly sound unclear during an online meeting—even though the speaker believes they are speaking clearly? You may find yourself straining to listen, missing words, or misunderstanding what was said. These problems are surprisingly common and can be difficult to fix on the spot, especially when meeting participants are not familiar with the technical details of videoconference systems.

We study this hidden problem by developing a machine learning–based system that can evaluate speech quality without interrupting the meeting. Our goal is to detect sound problems automatically, before they become frustrating for listeners.

AI Transcription vs. Human Listening
Humans are remarkably good at understanding speech, even when parts of it are missing or covered by noise. When a word is unclear, listeners often guess the meaning from context and still understand the overall message.

Automatic speech transcription, which is now widely used to record and summarize meetings, works very differently. AI systems analyze sound exactly as it is received. If speech is distorted, masked by noise, or partially missing, transcription accuracy drops sharply.

We turn this weakness into a strength. By measuring how much transcription quality degrades, we use AI transcription accuracy as an indicator of speech quality. In other words, if the transcription struggles, listeners are likely struggling too.
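
For readers who want to experiment with this idea, below is a minimal sketch of using transcription accuracy as a quality proxy. It assumes the openai-whisper and jiwer Python packages (our choice for illustration, not necessarily the tools used in the study), with placeholder file names; the word error rate (WER) between a known transcript and the AI transcription serves as the degradation score.

```python
# Minimal sketch: word error rate (WER) of an ASR system as a speech-quality proxy.
# Assumes the `openai-whisper` and `jiwer` packages; file names are placeholders,
# and this illustrates the idea rather than the authors' actual pipeline.
import whisper
import jiwer

model = whisper.load_model("base")

reference = "the quarterly report is due on friday"          # known transcript
hypothesis = model.transcribe("degraded_speech.wav")["text"]  # ASR on degraded audio

# Higher WER on the degraded recording suggests lower speech quality:
# if the transcription struggles, human listeners likely struggle too.
wer = jiwer.wer(reference, hypothesis.lower())
print(f"word error rate: {wer:.2f}")
```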

Causes of sound deterioration
Sound deterioration during online meetings can be grouped into four main causes (Figure 1):

  • Speech factors
    • How and what the speaker says, such as speaking speed or clarity.
  • Acoustic factors
    • Background noise or room reverberation that affects sound before it reaches the microphone.
  • System factors
    • Problems with microphones, cables, or audio hardware quality.
  • Communication factors
    • Network issues that occur after sound is converted into digital data, such as data compression or packet loss.

Our research focuses on communication factors, which are especially important in videoconference systems and differ from those in traditional phone calls.

Figure 1. Causes of sound deterioration

Packet loss simulation
Online meetings send sound over the internet in small pieces called packets. Sometimes, these packets are lost during transmission, causing brief gaps or distortions in the sound. We use the SILK audio codec, a common system for converting speech into a format suitable for network transmission.

To study this effect, we intentionally simulate packet loss and create artificially degraded speech. This allows us to generate large amounts of training data and teach machine learning models what poor communication quality sounds like.
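
As a rough illustration of this data-generation step, the sketch below zeroes out randomly chosen 20 ms frames of a waveform to mimic lost packets. It uses numpy and soundfile with placeholder file names; a real pipeline would drop packets inside the codec (e.g., SILK) and let its loss concealment run, so treat this as a simplified stand-in.

```python
# Simplified packet-loss simulation: silence randomly chosen 20 ms frames.
# A real pipeline would drop packets inside the codec (e.g., SILK); zeroing
# raw audio is a crude but illustrative stand-in.
import numpy as np
import soundfile as sf

audio, fs = sf.read("clean_speech.wav")     # placeholder input file
frame = int(0.020 * fs)                     # 20 ms "packets"
n_frames = len(audio) // frame

rng = np.random.default_rng(0)
lost = rng.random(n_frames) < 0.10          # 10% packet-loss rate (assumed)

degraded = audio.copy()
for i in np.flatnonzero(lost):
    degraded[i * frame:(i + 1) * frame] = 0.0

sf.write("packet_loss_speech.wav", degraded, fs)
```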

Figures 2 and 3 compare a clean speech signal with a packet-loss-simulated version, showing how missing data changes the sound structure.

Figure 2. Spectrogram of clean speech (click image to listen)

Figure 3. Spectrogram of packet loss simulated speech (click image to listen)

Why This Matters
As online meetings become a permanent part of work and education, unnoticed sound degradation can silently reduce communication quality. By automatically detecting these problems, our approach helps make virtual meetings clearer, fairer, and less tiring—so no one’s voice turns into a “ghost” in the meeting.

More details can be found on our R&D webpage.

Manduca sexta Caterpillars Hear Using Hairs

Sara Aghazadeh – saghaza1@binghamton.edu
Instagram: @saraaghazadeh1016
Department of Mechanical Engineering
Binghamton University (SUNY)
Binghamton, NY, USA

Aishwarya Sriram – asriram@binghamton.edu
Instagram: @sriram.aishwarya
Department of Biological Sciences
Binghamton University (SUNY)
Binghamton, NY, USA

Prof. Carol Miles – cmiles@binghamton.edu
Department of Biological Sciences
Binghamton University (SUNY)
Binghamton, NY, USA

Prof. Ronald Miles – miles@binghamton.edu
Department of Mechanical Engineering
Binghamton University (SUNY)
Binghamton, NY, USA

Popular version of 4pABb3 – The ears of Manduca sexta caterpillars
Presented at the 189th ASA Meeting
Read the abstract at https://eppro02.ativ.me/appinfo.php?page=Session&project=ASAASJ25&id=3982723&server=eppro02.ativ.me

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

The aim of this research is to explore how insects perceive vibration and sound, and ultimately to mimic these biological strategies to advance the technology of MEMS microphones. Some insects have tympanal membranes for detecting sound pressure. These include, for example, katydids and crickets, which have tympanal membranes on their forelegs, and the fly Ormia ochracea, which has paired tympanal organs on its prothorax. Most insects and spiders that can hear use non-tympanal sensors instead, such as the long hairs on bee and mosquito antennae and the slit sensilla of spiders. We study Manduca sexta, the tobacco hornworm caterpillar, a common garden pest that devours tobacco plants and can also be found eating the tomato plants in your vegetable garden. While it does respond to sound, it is not clear whether it hears by detecting airborne sound pressure with a tympanal membrane, by sensing acoustic particle velocity with sensory hairs, or by detecting sound-induced vibration of the substrate.

In this study, the caterpillars’ behavioral responses to sound were examined using two different frequencies: a 150 Hz tone and a 2000 Hz tone. Previous tuning-curve experiments have found strong behavioral responses at 150 Hz. By measuring the sound-induced motion of a thoracic hair using laser vibrometry, we have observed a natural resonance of the hair at 2000 Hz. While insect hairs are not normally expected to be effective sound detectors at such high frequencies, this observation motivated us to look for behavioral responses at 2000 Hz as well.
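
For context, stimuli like these are simple pure tones. The sketch below generates 150 Hz and 2000 Hz test tones with numpy and soundfile; it is our illustration only, and the study's actual stimulus calibration and playback hardware are not reproduced here.

```python
# Generate pure tones at the two stimulus frequencies (150 Hz and 2000 Hz).
# Illustrative only; calibration and playback details from the actual
# experiments are not reproduced here.
import numpy as np
import soundfile as sf

fs = 44100
t = np.arange(0, 1.0, 1 / fs)                    # 1-second tones
for f in (150, 2000):
    tone = 0.5 * np.sin(2 * np.pi * f * t)
    # 10 ms raised-cosine ramps avoid audible clicks at onset/offset
    n = int(0.010 * fs)
    ramp = 0.5 * (1 - np.cos(np.pi * np.arange(n) / n))
    tone[:n] *= ramp
    tone[-n:] *= ramp[::-1]
    sf.write(f"tone_{f}Hz.wav", tone, fs)
```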

We monitored the caterpillars’ behavioral responses to vibrations of the surface they were standing on and to airborne sound, while recording the amplitude of the surface vibration. The results revealed that the caterpillars were 10-100 times more responsive to airborne sound than to sound-induced vibration of the surface detected through their feet; this confirms that they perceive airborne sound. Our results show that they can hear airborne sound at a low frequency of 150 Hz and at a high frequency of 2000 Hz.

We investigated whether certain identifiable thoracic and abdominal hairs enable the caterpillars to hear these specific frequencies through a series of experiments conducted before and after removing the hairs. Please watch the video.

Comparing each caterpillar’s behavioral responses before and after removal of the hairs showed a greatly reduced ability to detect sounds once the hairs were gone. This indicates that M. sexta caterpillars use specific hairs located on their abdomen and thorax to detect airborne sounds at 150 Hz and 2000 Hz, providing evidence of non-tympanal sound detection in these caterpillars at these specific frequencies.

Listening to ultrasonic signals reveals the mechanical behavior of next-generation batteries

Simón Montoya-Bedoya – simonmontoyabedoya@gmail.com
Bluesky: @simontoyabe.bsky.social
Instagram: @simontoyabe
Walker Department of Mechanical Engineering, The University of Texas at Austin, Austin, Texas, 78712-1591, United States

Prof. Michael R. Haberman (Walker Department of Mechanical Engineering, The University of Texas at Austin)

Other contributors to the research:
Donal P. Finegan (National Laboratory of the Rockies, Golden, CO, US)
Hadi Khani (Texas Materials Institute, The University of Texas at Austin)
Ofodike Ezekoye (Walker Department of Mechanical Engineering, The University of Texas at Austin)

Popular version of 2aPAb4 – Non-destructive ultrasonic monitoring of next-generation lithium-ion batteries
Presented at the 189th ASA Meeting
Read the abstract at https://eppro02.ativ.me/appinfo.php?page=Session&project=ASAASJ25&id=3977608&server=eppro02.ativ.me

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

Have you noticed how heavily our society depends on batteries? Batteries are used everywhere, from powering your phone to electrifying transportation and storing energy to smooth out the intermittent nature of renewable sources like wind and solar. This growing demand for lithium-ion batteries (LIBs) has led to the exploration of new technologies with improved attributes such as safer operation or longer lifetime. For example, silicon solid-state batteries (Si-SSBs) are promising because silicon as an anode material offers a much higher specific capacity (~3500 mAh/g) than the graphite (~300 mAh/g) used in conventional LIBs. They are also potentially safer to operate because they use a solid electrolyte rather than the flammable liquid electrolyte found in conventional LIBs.

However, Si-SSBs come with their own challenges that stem from eliminating the liquid electrolyte, primarily the need to maintain reliable interfacial contact between all the solid layers so that lithium ions can move. Si-SSBs are therefore more brittle and more prone to contact loss and fracture.

Another challenge in studying the intricate mechanical changes that arise from the electrochemical processes in a battery is that we are “blind” to them; in other words, we cannot see inside batteries while they are operating. That is why, just as a doctor uses ultrasound to monitor a beating heart, we can use ultrasonic waves to monitor batteries without opening them, as represented by the cartoon in Fig. 1.

The key to understanding what changes within a battery is knowing how the movement of lithium ions alters its mechanical properties. When lithium ions migrate during charging and discharging, they cause swelling, internal stresses, and sometimes fracture within the battery structure. These mechanical changes can significantly affect the propagation of ultrasonic waves through the material. This is especially true for the silicon anode, where silicon forms alloys with the lithium ions rather than the lithium ions becoming embedded in the molecular structure, as occurs in conventional batteries; these electrochemical changes lead to large volumetric and mechanical changes. SSBs are thus a compelling technology to explore with ultrasound, using observables in the ultrasonic signals such as shifts in the time of flight (TOF) of the wave through the battery, or changes in how the sound is absorbed or scattered. These “acoustic fingerprints” can potentially give us more insight into degradation in these next-generation (“next-gen”) batteries and therefore help improve the technology for more widespread use in commercial products.
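
To make the TOF idea concrete, the sketch below estimates the time-of-flight shift between a pristine and an aged through-transmission waveform by cross-correlation. The signals, sample rate, and pulse shape are illustrative placeholders, not data from our experimental setup.

```python
# Estimate the shift in ultrasonic time of flight (TOF) between a pristine and
# an aged through-transmission waveform via cross-correlation. Signals and the
# sample rate are placeholders, not data from the actual experiments.
import numpy as np

def tof_shift(pristine, aged, fs):
    """Return the delay (seconds) of `aged` relative to `pristine`."""
    corr = np.correlate(aged, pristine, mode="full")
    lag = np.argmax(corr) - (len(pristine) - 1)
    return lag / fs

fs = 100e6                                    # 100 MHz digitizer (assumed)
t = np.arange(2048) / fs
pulse = np.sin(2*np.pi*5e6*t) * np.exp(-((t - 3e-6) / 0.4e-6)**2)
aged = np.roll(pulse, 25)                     # toy example: 0.25 us extra delay

print(f"TOF shift: {tof_shift(pulse, aged, fs)*1e6:.2f} us")   # -> 0.25 us
```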

Figure 1. Analogy of the usage of ultrasonic waves for battery diagnostics, similar to how a doctor would use ultrasonics to monitor heart health. [Image generated with AI using Google NanoBanana Pro]

We aim to extend the use of ultrasonic testing methods for next-gen batteries and investigate opportunities and challenges associated with evaluating this new technology. In this work, we investigated both contact-based and immersion ultrasonic testing to monitor changes in the mechanical properties of Si-SSBs under cycle-induced aging.

In general, our experiments showed an overall stiffness reduction with aging, as indicated by the increase in ultrasonic wave TOF (see Fig. 2a). Further, we observed an overall reduction of transmitted energy with increased cycling. These two findings may reflect the accumulation of damage at layer interfaces, such as the creation of solid-gas interfaces and/or debonding between layers. Finally, ultrasonic imaging using immersion testing provided information on how damage is distributed in space and evolves as these next-gen batteries age (see Fig. 2b).
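
The link between TOF and stiffness in such plots is straightforward: for a layer of fixed thickness and density, the wave speed is thickness divided by TOF, and the effective elastic modulus scales with the square of the wave speed. A small sketch with illustrative numbers:

```python
# Relative stiffness from TOF: c = d / TOF and M = rho * c**2, so for fixed
# thickness and density, M/M0 = (TOF0 / TOF)**2. Numbers are illustrative.
tof_0 = 0.520e-6   # pristine time of flight (s)
tof_n = 0.545e-6   # time of flight after cycling (s)

relative_stiffness = (tof_0 / tof_n) ** 2
print(f"normalized stiffness: {relative_stiffness:.3f}")   # ~0.91, i.e. ~9% softer
```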

By refining these techniques to evaluate next-gen battery technologies, we will develop more sensitive methods to determine when something is wrong before it’s too late. In a world increasingly dependent on safe and reliable energy storage, the ability to “listen” to batteries might be precisely what we need to power the clean energy revolution.

Figure 2. Evolution of cell stiffness during aging. a) Stiffness of the SSB, normalized to its initial value, plotted against discharge capacity for both charged (blue) and discharged (red) states. With representative ultrasonic images from transmitted signals at two states of the SSB: b.1) pristine before cycling, and b.2) after 40 cycles of aging. We observed a significant reduction in transmission in the middle region of the SSB. Warmer colors indicate higher transmission, and dashed outlines mark the active cell region.

Fifteen Years of Research on Active Noise Control Systems for Partially Open Windows

Delf Sachau – sachau@hsu-hh.de

Professur für Mechatronik, Helmut-Schmidt-Universität, Hamburg, Hamburg, 22043, Germany

Dr.-Ing Tim Karl
Professur für Mechatronik
Helmut-Schmidt-Universität
Hamburg

Popular version of 3pAA10 – Fifteen Years of Research on Active Noise Control Systems for Partially Open Windows: A Summary of Key Findings
Presented at the 189th ASA Meeting
Read the abstract at https://eppro02.ativ.me/appinfo.php?page=Session&project=ASAASJ25&id=3979397&server=eppro02.ativ.me

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

Motivation
In many cities, people want to keep their windows open to allow fresh air into their homes. However, especially in busy urban areas, open windows also let in unwanted noise from traffic, trains, aircraft, and general city activity. Constant exposure to this noise is not just annoying, it can affect sleep, concentration, and even long-term health. To address this problem, researchers at the Helmut Schmidt University in Hamburg have spent the past fifteen years developing systems that can reduce noise coming through partially open windows while still allowing natural ventilation.

Passive Absorbers
The approach combines two methods: passive noise reduction and active noise control (ANC). Passive noise reduction involves using materials that naturally absorb or block sound, such as foam-like acoustic panels or special seals. These materials are very good at reducing high-frequency noise but are less effective for deeper, low-pitched sounds like engines or traffic rumble.

Active Noise Control
This is where active noise control comes in. ANC works in a way similar to noise-cancelling headphones. Small loudspeakers placed near the window play “anti-noise sound waves” that are shaped to cancel out incoming noise. When the incoming noise and the anti-noise meet, they interfere with each other and reduce the amount of sound that reaches inside the room. To make this happen, microphones are used to measure the sound, while computer algorithms constantly adjust the sound from the speakers to keep the cancellation effective.

Figure 1: Internoise 2020, J. Hanselka, D. Sachau, Converting an Active Noise Blocker for a Tilted Window from Feedforward Control into a Feedback System

Algorithm
The team worked on improving the computer algorithms that run the ANC system. These algorithms need to react quickly to changing noise, remain stable, and avoid using too much computing power. Therefore, different real-time controller platforms were evaluated, including DSP and FPGA technology.
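
For the curious, the snippet below is a minimal time-domain FxLMS loop in Python, the textbook form of the algorithm family named in Figure 2. The primary and secondary paths, filter length, and step size are toy values for illustration; deployed systems run this kind of loop in real time on DSP or FPGA hardware, often in the frequency domain.

```python
# Minimal time-domain FxLMS sketch. Paths and constants are toy values; this
# is an illustration of the algorithm family, not the deployed implementation.
import numpy as np

rng = np.random.default_rng(0)
n = 20000
x = rng.standard_normal(n)            # reference: incoming noise near the window
p = np.array([0.0, 0.6, 0.3, 0.1])    # hypothetical primary path (noise -> room)
s = np.array([0.5, 0.25])             # hypothetical secondary path (speaker -> mic)
s_hat = s.copy()                      # assume a perfect secondary-path estimate

d = np.convolve(x, p)[:n]             # noise arriving at the error microphone
xs = np.convolve(x, s_hat)[:n]        # "filtered-x": reference through s_hat
L, mu = 16, 0.005
w = np.zeros(L)                       # adaptive anti-noise filter taps
xbuf, fxbuf = np.zeros(L), np.zeros(L)
ybuf = np.zeros(len(s))
e = np.zeros(n)

for k in range(n):
    xbuf = np.roll(xbuf, 1); xbuf[0] = x[k]
    y = w @ xbuf                      # anti-noise sample sent to the loudspeaker
    ybuf = np.roll(ybuf, 1); ybuf[0] = y
    e[k] = d[k] + s @ ybuf            # residual sound at the error microphone
    fxbuf = np.roll(fxbuf, 1); fxbuf[0] = xs[k]
    w -= mu * e[k] * fxbuf            # FxLMS weight update

print("residual power, first vs last 2000 samples:",
      np.mean(e[:2000]**2), np.mean(e[-2000:]**2))
```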

Figure 2: ISMA 2014, D. Sachau, S. Jukkert, Real-time implementation of the frequency-domain FxLMS algorithm without block delay for adaptive noise blocker

Simulation
However, using ANC at an open window is much more complicated than inside headphones. The sound field near an open window is irregular and constantly changing because of airflow, reflections, and outdoor conditions. The research team therefore studied how sound moves through small openings of different shapes and sizes. One important discovery is that the depth of the opening relative to the wavelength of the sound plays an enormous role in how much noise gets through. This knowledge helps guide how the ANC system should be designed and placed.
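
As a back-of-the-envelope check on that scaling, the wavelength of sound is the speed of sound divided by frequency, so low-frequency traffic rumble is meters long while high-pitched noise is only centimeters:

```python
# Wavelength vs. frequency: low-frequency noise has wavelengths far larger than
# a typical window opening depth, which shapes how much sound gets through.
c = 343.0                                   # speed of sound in air, m/s
for f in (100, 500, 2000):                  # Hz
    print(f"{f:>5} Hz -> wavelength {c / f:.2f} m")
```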

Figure 3: Internoise 2020, M. Sandner, D. Sachau, Influence of parameters of small gaps regarding sound transmission and ANC-performance-a numerical simulation

Position Optimization
Another major research effort focused on the best positions for microphones and speakers. Their placement determines how well the noise can be cancelled. The researchers found that placing the speaker near the center of the opening often provides the most even noise reduction throughout the room. Meanwhile, microphone placement is very important for stability, because the microphone input is what guides the control system in real time.

Figure 4: DAGA 2025, T. Karl, D. Sachau, Numerical position optimization approach for sensor and actuator placement in an active noise cancelling system

Conclusion
Overall, the research shows that a combination of passive materials and active noise control is the best approach. Passive elements reduce the parts of the noise that are hard to cancel electronically, while ANC handles the deep, low-frequency noise that humans find especially disturbing. Together, these methods make it possible to keep windows open for fresh air, without letting in the city.

Breaking the Skull Barrier: “Listening” to Ultrasound Therapy Inside the Brain

Pradosh Pritam Dash – ppdash@gatech.edu

Instagram: @pra.dosh.dash
George W. Woodruff School of Mechanical Engineering
Georgia Institute of Technology
Atlanta, GA, 30318
United States

Costas D. Arvanitis
Georgia Institute of Technology and Emory University

Popular version of 3pBAa7 – Breaking the Skull Barrier: Parametric Array Enable Non-Invasive Monitoring of Transcranial Focused Ultrasound
Presented at the 189th ASA Meeting
Read the abstract at https://eppro02.ativ.me/web/index.php?page=Session&project=ASAASJ25&id=3982986&nohistory&nohistory=true

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

The Challenge of Treating the Brain
Focused Ultrasound (FUS) is a revolutionary, incision-free technology that promises to treat brain disorders, such as tumors and Parkinson’s disease. It works by concentrating high-frequency sound waves to a precise point deep within the brain, much like a magnifying glass focuses sunlight. However, this promising therapy faces a major obstacle: the human skull. The skull is a thick, bony barrier that scrambles, reflects, and weakens these high-frequency waves. This makes it incredibly difficult for doctors to monitor the treatment in real-time and confirm that the energy is actually reaching the intended target. This uncertainty limits the safety and effectiveness of FUS brain therapies.

Figure 1: a – Conceptual illustration of the technique. A transmitter (bottom) sends high-frequency (1 MHz) therapeutic ultrasound waves through the skull. Where these waves interact at the focus, they generate a 50 kHz low-frequency “parametric array” signal that easily passes through the skull to a receiver (top). The HASPA framework uses this detected signal to map the therapy. b – The reconstructed (first-order) 1 MHz high-frequency and 100 kHz low-frequency parametric fields using the HASPA framework, with 3, 6, and 9 dB contours.

An Acoustic “Trick” to Overcome the Barrier
Researchers at Georgia Tech and Emory University have developed an approach that exploits a nonlinear acoustic “trick” known as the parametric array effect. When two high-frequency ultrasound beams (around 1 MHz) used for therapy meet at the target inside the brain, they interact nonlinearly and mix. This interaction generates a brand-new sound wave at a much lower difference frequency (around 50-100 kHz).

Think of it this way: High-frequency sounds, like a faint whistle, are easily blocked by a thick wall (the skull). However, low-frequency sounds, like the thumping bass from a neighbor’s stereo, travel through walls easily. In this new approach, the therapeutic “whistles” create a localized “bass” beat exactly where the treatment is happening. This low-frequency signal acts as a messenger, traveling cleanly back out through the skull to be detected by external sensors.
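
A quick numerical demo of this mixing, with a simple quadratic nonlinearity standing in for nonlinear propagation in tissue (all frequencies and constants here are illustrative):

```python
# Nonlinear mixing demo: two tones near 1 MHz pass through a quadratic
# nonlinearity (a crude stand-in for nonlinear propagation in tissue) and
# produce a difference-frequency tone at 50 kHz that survives the skull.
import numpy as np

fs = 20e6                                  # 20 MHz sample rate
t = np.arange(0, 2e-3, 1/fs)               # 2 ms of signal
f1, f2 = 1.00e6, 1.05e6                    # illustrative therapy-beam frequencies
p = np.sin(2*np.pi*f1*t) + np.sin(2*np.pi*f2*t)

q = p**2                                   # quadratic nonlinearity -> sum & difference tones
spec = np.abs(np.fft.rfft(q))
freqs = np.fft.rfftfreq(len(q), 1/fs)

band = (freqs > 1e3) & (freqs < 5e5)       # inspect below 500 kHz, excluding DC
peak = freqs[band][np.argmax(spec[band])]
print(f"low-frequency peak: {peak/1e3:.0f} kHz")   # -> 50 kHz
```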

Decoding the Message: The HASPA Framework
The challenge is translating this low-frequency message back into a high-resolution picture of the high-frequency treatment zone inside the brain.

To achieve this, the team developed a novel computational framework called HASPA (Heterogeneous Angular Spectrum Parametric Array) and an associated inverse algorithm (iHASPA).

iHASPA analyzes the low-frequency signal measured outside the skull and mathematically reconstructs a map of the original therapy beams deep inside the brain. Crucially, the framework accounts for the complex ways sound travels through the specific properties of the patient’s skull and brain tissue, correcting for distortions.
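
For intuition, angular spectrum methods of this family propagate a measured pressure field plane by plane using FFTs. The sketch below shows the textbook uniform-medium step for a 1-D line of measurements; HASPA generalizes this to heterogeneous media like skull and brain tissue, an extension not captured here. All numbers in the example are illustrative.

```python
# Textbook homogeneous angular-spectrum step: propagate a 1-D line of complex
# pressure a distance dz through a uniform medium. HASPA generalizes this to
# heterogeneous media (skull, brain); that extension is not shown here.
import numpy as np

def angular_spectrum_step(p0, dx, f, c, dz):
    """Propagate complex pressure line p0 (sampled every dx meters) by dz."""
    k = 2 * np.pi * f / c                          # wavenumber in the medium
    kx = 2 * np.pi * np.fft.fftfreq(len(p0), dx)   # transverse wavenumbers
    kz = np.sqrt((k**2 - kx**2).astype(complex))   # imaginary -> evanescent decay
    return np.fft.ifft(np.fft.fft(p0) * np.exp(1j * kz * dz))

# Example: a 50 kHz line source apodized by a Gaussian aperture, stepped 10 cm
# through water-like tissue (c ~ 1500 m/s).
x = np.linspace(-0.05, 0.05, 512)
p0 = np.exp(-(x / 0.01)**2).astype(complex)
p1 = angular_spectrum_step(p0, x[1] - x[0], 50e3, 1500.0, 0.10)
print(f"peak |p| after 10 cm: {np.abs(p1).max():.3f}")
```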

Impact and Future
By leveraging this nonlinear acoustic effect, the HASPA framework allows us to “see” through the skull using sound. This new technique enables real-time, non-invasive monitoring of ultrasound beams inside the brain, paving the way for safer, more precise, and more effective focused ultrasound therapies for debilitating neurological disorders.

Hearing where it counts: Toward better directional hearing during earplug and earmuff use

Andrew Brown – andrewdb@uw.edu

University of Washington, Department of Speech and Hearing Sciences, Seattle, WA, 98105, United States

Additional authors: DJ Audet Jr, Aoi A. Hunsaker, Mallory Butler, Carol Sammeth, Alexandria Podolski, Theodore F. Argo, David A. Anderson, Nathaniel T. Greene

Popular version of 2pNSa4 – Two-dimensional sound localization during hearing protector use in a large sample of human listeners
Presented at the 189th ASA Meeting
Read the abstract at https://eppro02.ativ.me//web/index.php?page=Session&project=ASAASJ25&id=3982069

–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–

In noisy professions – from manufacturing to the military – hearing protection and perception are often at odds. The sense of hearing normally enables listeners to detect and locate sounds arriving from any direction – an especially valuable ability in settings with low visibility (darkness, fog, smoke), visual clutter, or in which important sound sources may be outside the field of vision altogether, whether off in the distance or “right behind you!” However, when noisy settings demand the use of hearing protectors (usually earplugs or earmuffs), the ability to determine sound direction is reduced. Hearing protectors lower the level of transmitted sound – their designed purpose – but they also change the quality of the transmitted sound, disrupting the subtle bits of acoustic information the brain relies on to determine sound direction. This means listeners may confuse forward and rearward sounds, or struggle to locate sounds overhead. The trade-off between protection and perception can contribute to disuse of hearing protectors in critical settings where situational awareness and personal safety may be acutely valued above long-term hearing health.

Methods to evaluate hearing protector impacts have varied widely across previous studies; hearing protectors come in many shapes and sizes, and directional hearing ability varies across people even before hearing protectors enter the picture. Here, in an effort to identify key factors that mediate hearing protector impacts, we measured directional hearing during hearing protector use in a large sample of listeners across two different sites (130 subjects enrolled study-wide). Listeners were asked to orient to sounds that varied in horizontal and vertical location while wearing a variety of commercially available hearing protector styles, with orientation accuracy measured using wireless sensors.
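
As a small illustration of how such orientation accuracy can be scored, the helper below computes the great-circle angle between a target direction and a listener's response, given azimuth and elevation in degrees. It is an illustrative metric, not necessarily the exact analysis used in the study.

```python
# Great-circle angular error between a target direction and a listener's
# orientation response (azimuth/elevation in degrees). An illustrative scoring
# metric, not necessarily the exact analysis used in the study.
import numpy as np

def angular_error_deg(az_t, el_t, az_r, el_r):
    a1, e1, a2, e2 = np.radians([az_t, el_t, az_r, el_r])
    cos_angle = (np.sin(e1) * np.sin(e2)
                 + np.cos(e1) * np.cos(e2) * np.cos(a1 - a2))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# A classic front-back confusion: target straight ahead, response directly behind.
print(angular_error_deg(0, 0, 180, 0))   # -> 180.0
```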

All hearing protectors reduced directional hearing ability, but variation across devices pointed to key variables that may impact performance – and may be captured using relatively simple acoustic measurements. This work is part of an effort to develop metrics beyond the industry-standard “Noise Reduction Rating” that consumers and hearing conservation professionals alike might use to select job-appropriate hearing protectors, and that hearing protection manufacturers might leverage to design and build better devices.

This work was funded by the US Department of Defense Joint Warfighter Medical Research Program.