MELVILLE, N.Y., Nov. 19, 2024 – Alzheimer’s disease affects more than 50 million people worldwide, often devastating both the individuals who have it and their families and loved ones. It has no known cure, and the slow, progressive nature of the disease makes early diagnosis difficult.
Researchers from École de Technologie Supérieure and Dartmouth College are investigating the use of earpiece microphones to spot early signs of Alzheimer’s. Miriam Boutros will present their work on Tuesday, Nov. 19, at 4:15 p.m. ET, as part of the virtual 187th Meeting of the Acoustical Society of America, running Nov. 18-22, 2024.
People with Alzheimer’s exhibit a loss of motor control along with cognitive decline. One of the earliest signs of this decline can be spotted in rapid eye movements known as saccades. In Alzheimer’s patients, these quick shifts of the eyes are often slower, less accurate, or delayed compared to those in healthy individuals.
The researchers will track abnormal saccades, an early sign of Alzheimer’s, using both eye-tracking technology and in-ear hearables. Credit: Boutros et al.
“Eye movements are fascinating since they are some of the most rapid and precise movements in the human body, thus they rely on both excellent motor skills and cognitive functioning,” said researcher Arian Shamei.
Detecting and analyzing saccades directly requires a patient to be monitored by eye-tracking equipment, which is not easily accessible for most people. Boutros and her colleagues are exploring an alternative method using a more ubiquitous and less intrusive technology: earpiece microphones. This research is led by Rachel Bouserhal at the Research in Hearing Health and Assistive Devices (RHAD) Laboratory at École de Technologie Supérieure and Chris Niemczak at the Geisel School of Medicine at Dartmouth.
“We are using a device called a hearable,” said Boutros. “It is an earpiece with in-ear microphones that captures physiological signals from the body. Our goal is to develop health-monitoring algorithms for hearables, capable of continuous, long-term monitoring and early disease detection.”
Eye movements, including saccades, cause eardrum vibrations that can be picked up by sensitive microphones located within the ear. The researchers are conducting experiments with volunteers, giving them both hearables and conventional eye trackers. Their goal is to identify signals corresponding to saccades, and to differentiate between healthy signals and others that are indicative of neurological disorders like Alzheimer’s.
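For readers curious what the signal-processing side of this might look like, here is a minimal, purely illustrative sketch (not the team’s algorithm): it band-pass filters an in-ear microphone recording, flags transient peaks in the signal envelope as candidate saccade events, and checks how many eye-tracker-labeled saccades have a matching in-ear event. The frequency band, detection threshold, and timing tolerance are assumptions made for the example.

```python
# Illustrative sketch only: candidate saccade-related transients in an in-ear
# microphone signal, compared against eye-tracker saccade onsets. The band
# limits, threshold, and matching tolerance are assumptions, not values
# reported by the researchers.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert, find_peaks

def detect_ear_events(signal, fs, band=(5.0, 40.0), thresh_sd=4.0):
    """Return sample indices of transient peaks in a low-frequency band."""
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, signal)
    envelope = np.abs(hilbert(filtered))                    # amplitude envelope
    threshold = envelope.mean() + thresh_sd * envelope.std()
    peaks, _ = find_peaks(envelope, height=threshold, distance=int(0.1 * fs))
    return peaks

def saccade_hit_rate(event_samples, fs, saccade_onsets_s, tol_s=0.05):
    """Fraction of eye-tracker saccades with an in-ear event within +/- tol_s."""
    if len(saccade_onsets_s) == 0:
        return float("nan")
    event_times = np.asarray(event_samples) / fs
    hits = [np.any(np.abs(event_times - t) <= tol_s) for t in saccade_onsets_s]
    return float(np.mean(hits))
```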
They hope their research will one day lead to devices capable of noninvasive, continuous monitoring for Alzheimer’s and other neurological diseases.
“While the current project is focused on long-term monitoring of Alzheimer’s disease, eventually, we would like to tackle other diseases and be able to differentiate between them based on symptoms that can be tracked through in-ear signals,” said Shamei.
ASA PRESS ROOM In the coming weeks, ASA’s Press Room will be updated with newsworthy stories and the press conference schedule at https://acoustics.org/asa-press-room/.
LAY LANGUAGE PAPERS ASA will also share dozens of lay language papers about topics covered at the conference. Lay language papers are summaries (300-500 words) of presentations written by scientists for a general audience. They will be accompanied by photos, audio, and video. Learn more at https://acoustics.org/lay-language-papers/.
PRESS REGISTRATION ASA will grant free registration to credentialed and professional freelance journalists. If you are a reporter and would like to attend the virtual meeting and/or press conferences, contact AIP Media Services at media@aip.org. For urgent requests, AIP staff can also help with setting up interviews and obtaining images, sound clips, or background information.
ABOUT THE ACOUSTICAL SOCIETY OF AMERICA The Acoustical Society of America is the premier international scientific society in acoustics devoted to the science and technology of sound. Its 7,000 members worldwide represent a broad spectrum of the study of acoustics. ASA publications include The Journal of the Acoustical Society of America (the world’s leading journal on acoustics), JASA Express Letters, Proceedings of Meetings on Acoustics, Acoustics Today magazine, books, and standards on acoustics. The society also holds two major scientific meetings each year. See https://acousticalsociety.org/.
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Have you ever listened to a song and later been surprised to hear the artist speak with a different accent than the one you heard in the song? Take country singer Keith Urban’s song “What About Me” for instance; when listening, you might assume that he has a Southern American (US) English accent. However, in his interviews, he speaks with an Australian English accent. So why did you think he sounded Southern?
Research suggests that specific accents or dialects are associated with musical genres [2], that singers adjust their accents based on genre [4], and that foreign accents are more difficult to recognize in songs than in speech [5]. However, when listeners perceive an accent in a song, it is unclear which type of information they rely on: the acoustic speech information or information about the musical genre. Our previous research investigated this question for Country and Reggae music and found that genre recognition may play a larger role in dialect perception than the actual sound of the voice [9].
Our current study explores American Blues and Folk music, genres that allow for easier separation of vocals from instrumentals and more refined stimulus manipulation. Blues is strongly associated with African American English [3], while Folk can be associated with a variety of dialects (British, American, etc.) [1]. Participants listened to manipulated clips of sung and “spoken” lines taken from songs in both genres, which were transcribed for participants (see Figure 1). AI applications were used to remove instrumentals from both sung and spoken clips, while “spoken” clips also underwent rhythm and pitch normalization so that they sounded spoken rather than sung. After hearing each sung or spoken line, participants were asked to identify the dialect they heard from six options [7, 8] (see Figure 2).
Figure 1: Participant view of a transcript from a Folk song clip.
Figure 2: Participant view of six dialect options after hearing a clip.
Participants were much more confident and accurate in categorizing accents for clips in the Sung condition, regardless of genre. The proportion of uncertainty (“Not Sure” responses) was higher in the Spoken condition and consistent across genres (see “D” in Figure 3), suggesting that participants were more certain of the dialect when musical cues were present. Dialect categories followed genre expectations, as can be seen from the increase in identification of African American English for Blues in the Sung condition (see “A”). Removing uncertainty by adding genre cues did not increase the likelihood of “Irish English” or “British English” being chosen for Blues, though it did for Folk (see “B” and “C” in Figure 3), in line with genre-based expectations.
Figure 3: Participant dialect responses.
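As a purely illustrative aside, the response proportions summarized in Figure 3 could be tabulated with a few lines of code. The file name and column names below (genre, condition, response) are hypothetical placeholders, not the study’s actual data format.

```python
# Illustrative sketch: tabulate dialect responses by genre and condition,
# as summarized in Figure 3. File and column names are hypothetical.
import pandas as pd

trials = pd.read_csv("dialect_responses.csv")   # one row per participant response

# Proportion of each dialect response within every genre x condition cell
proportions = (
    trials.groupby(["genre", "condition"])["response"]
          .value_counts(normalize=True)
          .rename("proportion")
          .reset_index()
)

# For example, the share of "Not Sure" responses in the Spoken condition per genre
print(proportions.query("response == 'Not Sure' and condition == 'Spoken'"))
```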
These findings enhance our understanding of the relationship between musical genre and accent. Returning to the example of Keith Urban, the singer’s stylistic accent change may not be the only culprit behind our impression of a Southern drawl. Rather, we may have assumed we were listening to a musician with a Southern American English accent as soon as we heard the first banjo-like twang or tuned into iHeartCountry Radio. When we listen to a song and perceive a singer’s accent, we are not only listening to the sounds of their speech; we are also shaping our perception from our expectations of dialect based on the musical genre.
References:
Carrigan, Henry L. (2004). [Review of the book The NPR Curious Listener’s Guide to American Folk Music, by Kip Lornell]. Library Journal, 129(19), 63.
De Timmerman, Romeo, et al. (2024). The globalization of local indexicalities through music: African‐American English and the blues. Journal of Sociolinguistics, 28(1), 3–25. https://doi.org/10.1111/josl.12616.
Gibson, A. M. (2019). Sociophonetics of popular music: insights from corpus analysis and speech perception experiments [Doctoral dissertation, University of Canterbury]. http://dx.doi.org/10.26021/4007.
Mageau, M., Mekik, C., Sokalski, A., & Toivonen, I. (2019). Detecting foreign accents in song. Phonetica, 76(6), 429–447. https://doi.org/10.1159/000500187.
RStudio. (2020). RStudio: Integrated Development for R. RStudio, PBC, Boston, MA. http://www.rstudio.com/.
Stoet, G. (2010). PsyToolkit – A software package for programming psychological experiments using Linux. Behavior Research Methods, 42(4), 1096-1104.
Stoet, G. (2017). PsyToolkit: A novel web-based method for running online questionnaires and reaction-time experiments. Teaching of Psychology, 44(1), 24-31.
Walter, M., Bengtson, G., Maitinsky, M., Islam, M. J., & Gick, B. (2023). Dialect perception in song versus speech. The Journal of the Acoustical Society of America, 154(4_supplement), A161. https://doi.org/10.1121/10.0023131.
OTTAWA, Ontario, May 16, 2024 – Oscar Wilde once said that sarcasm was the lowest form of wit, but the highest form of intelligence. Perhaps that is due to how difficult it is to use and understand. Sarcasm is notoriously tricky to convey through text — even in person, it can be easily misinterpreted. The subtle changes in tone that convey sarcasm often confuse computer algorithms as well, limiting virtual assistants and content analysis tools.
Xiyuan Gao, Shekhar Nayak, and Matt Coler of the Speech Technology Lab at the University of Groningen, Campus Fryslân developed a multimodal algorithm for improved sarcasm detection that examines multiple aspects of audio recordings for increased accuracy. Gao will present their work Thursday, May 16, at 10:35 a.m. EDT as part of a joint meeting of the Acoustical Society of America and the Canadian Acoustical Association, running May 13-17 at the Shaw Centre located in downtown Ottawa, Ontario, Canada.
Using text recognition, incorporating emoticons, and introducing audio analysis, researchers designed a robust system for detecting sarcasm in human speech. Image credit: This image was created with the assistance of DALL•E 3.
Traditional sarcasm detection algorithms often rely on a single parameter to produce their results, which is a major reason they fall short. Gao, Nayak, and Coler instead used two complementary approaches — sentiment analysis using text and emotion recognition using audio — for a more complete picture.
“We extracted acoustic parameters such as pitch, speaking rate, and energy from speech, then used Automatic Speech Recognition to transcribe the speech into text for sentiment analysis,” said Gao. “Next, we assigned emoticons to each speech segment, reflecting its emotional content. By integrating these multimodal cues into a machine learning algorithm, our approach leverages the combined strengths of auditory and textual information along with emoticons for a comprehensive analysis.”
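To make the idea of multimodal fusion concrete, here is a minimal illustrative sketch rather than the authors’ implementation: it assumes per-utterance acoustic features, a transcript sentiment score, and an emoticon label have already been extracted, concatenates them into a single feature vector, and trains a standard classifier. The emoticon set, feature layout, and random stand-in data are assumptions made so the example runs end to end.

```python
# Illustrative multimodal fusion for sarcasm detection (not the authors' code).
# Assumes acoustic features, a sentiment score, and an emoticon label per
# utterance have already been extracted upstream.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

EMOTICONS = [":)", ":(", ":|", ":D", ":/"]            # assumed emoticon label set
EMO_INDEX = {e: i for i, e in enumerate(EMOTICONS)}

def fuse_features(acoustic, sentiment, emoticon):
    """Concatenate acoustic features, a sentiment score, and a one-hot emoticon."""
    emo_vec = np.zeros(len(EMOTICONS))
    emo_vec[EMO_INDEX[emoticon]] = 1.0
    return np.concatenate([acoustic, [sentiment], emo_vec])

# Toy stand-in data: in practice the acoustic features come from the recording,
# the sentiment score from an ASR transcript, and the emoticon from the
# emotion-recognition step described above.
rng = np.random.default_rng(0)
n = 200
acoustic = rng.normal(size=(n, 3))                    # pitch, speaking rate, energy
sentiment = rng.uniform(-1, 1, size=n)                # transcript sentiment score
emoticons = rng.choice(EMOTICONS, size=n)
labels = rng.integers(0, 2, size=n)                   # 1 = sarcastic, 0 = literal

X = np.vstack([fuse_features(acoustic[i], sentiment[i], emoticons[i])
               for i in range(n)])
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy:", cross_val_score(clf, X, labels, cv=5).mean())
```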
The team is optimistic about the performance of their algorithm, but they are already looking for ways to improve it further.
“There are a range of expressions and gestures people use to highlight sarcastic elements in speech,” said Gao. “These need to be better integrated into our project. In addition, we would like to include more languages and adopt developing sarcasm recognition techniques.”
This approach can be used for more than identifying a dry wit. The researchers highlight that this technique can be widely applied in many fields.
“The development of sarcasm recognition technology can benefit other research domains using sentiment analysis and emotion recognition,” said Gao. “Traditionally, sentiment analysis mainly focuses on text and is developed for applications such as online hate speech detection and customer opinion mining. Emotion recognition based on speech can be applied to AI-assisted health care. Sarcasm recognition technology that applies a multimodal approach is insightful to these research domains.”
PRESS REGISTRATION ASA will grant free registration to credentialed and professional freelance journalists. If you are a reporter and would like to attend the in-person meeting or virtual press conferences, contact AIP Media Services at media@aip.org. For urgent requests, AIP staff can also help with setting up interviews and obtaining images, sound clips, or background information.
ABOUT THE CANADIAN ACOUSTICAL ASSOCIATION/ASSOCIATION CANADIENNE D’ACOUSTIQUE The Canadian Acoustical Association/Association canadienne d’acoustique (CAA) is a professional, interdisciplinary organization that:
fosters communication among people working in all areas of acoustics in Canada
promotes the growth and practical application of knowledge in acoustics
encourages education, research, protection of the environment, and employment in acoustics
is an umbrella organization through which general issues in education, employment and research can be addressed at a national and multidisciplinary level
The CAA is a member society of the International Institute of Noise Control Engineering (I-INCE) and the International Commission for Acoustics (ICA), and is an affiliate society of the International Institute of Acoustics and Vibration (IIAV). Visit https://caa-aca.ca/.
M. Fernanda Alonso Arteche – maria.alonsoarteche@mail.mcgill.ca
Instagram: @laneurotransmisora
School of Communication Science and Disorders, McGill University, Center for Research on Brain, Language, and Music (CRBLM), Montreal, QC, H3A 0G4, Canada
Instagram: @babylabmcgill
Popular version of 2pSCa – Implicit and explicit responses to infant sounds: a cross-sectional study among parents and non-parents
Presented at the 186th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0027179
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Imagine hearing a baby coo and instantly feeling a surge of positivity. Surprisingly, how we react to the simple sounds of a baby speaking might depend on whether we are women or men, and whether we are parents. Our lab’s research delves into this phenomenon, revealing intriguing differences in how adults perceive baby vocalizations, with a particular focus on mothers, fathers, and non-parents.
Using a method that measures reaction time to sounds, we compared adults’ responses to vowel sounds produced by a baby and by an adult, as well as meows produced by a cat and by a kitten. We found that women, including mothers, tend to respond positively only to baby speech sounds. On the other hand, men, especially fathers, showed a more neutral reaction to all sounds. This suggests that the way we process human speech sounds, particularly those of infants, may vary significantly between genders. While previous studies report that both men and women generally show a positive response to baby faces, our findings indicate that their speech sounds might affect us differently.
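As an illustrative aside, a comparison like this boils down to averaging reaction times within each participant group and sound type. The sketch below shows one way that summary could be computed; the file and column names are hypothetical, not the lab’s actual data.

```python
# Illustrative sketch: mean reaction time by participant group and sound type
# (baby vowel, adult vowel, kitten meow, cat meow). File and column names
# are hypothetical placeholders.
import pandas as pd

rt = pd.read_csv("reaction_times.csv")    # one row per trial

summary = (
    rt.groupby(["group", "sound_type"])["reaction_time_ms"]
      .agg(["mean", "sem"])                # mean and standard error per cell
      .round(1)
)
print(summary)   # e.g., compare mothers, fathers, and non-parents on baby vowels
```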
Moreover, mothers rated babies and their sounds highly, expressing a strong liking for babies, their cuteness, and the cuteness of their sounds. Fathers, although less responsive in the reaction task, still gave high ratings for their liking of babies, their cuteness, and the appeal of their sounds. This contrast between implicit (subconscious) reactions and explicit (conscious) opinions highlights an interesting complexity in parental instincts and perceptions. Implicit measures, such as those used in our study, tap into automatic and unconscious responses that individuals might not be fully aware of or may not express when asked directly. These methods offer a more direct window into underlying feelings that might otherwise be obscured by social expectations or personal biases.
This research builds on earlier studies conducted in our lab, where we found that infants prefer to listen to the vocalizations of other infants, a factor that might be important for their development. We wanted to see if adults, especially parents, show similar patterns because their reactions may also play a role in how they interact with and nurture children. Since adults are the primary caregivers, understanding these natural inclinations could be key to supporting children’s development more effectively.
The implications of this study are not just academic; they touch on everyday experiences of families and can influence how we think about communication within families. Understanding these differences is a step towards appreciating the diverse ways people connect with and respond to the youngest members of our society.
Northwestern University, Communication Sciences & Disorders, Evanston, IL, 60208, United States
Jeff Crukley – University of Toronto; McMaster University
Emily Lundberg – University of Colorado, Boulder
James M. Kates – University of Colorado, Boulder
Kathryn Arehart – University of Colorado, Boulder
Pamela Souza – Northwestern University
Popular version of 3aPP1 – Modeling the relationship between listener factors and signal modification: A pooled analysis spanning a decade
Presented at the 186th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0027317
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Imagine yourself in a busy restaurant, trying to focus on a conversation. Often, even with hearing aids, the background noise can make it challenging to understand every word. While some listeners manage to follow the conversations rather easily, others find it hard to follow along, despite having their hearing aids adjusted.
Studies show that cognitive abilities (and not just how well we hear) can affect how well we understand speech in noisy places. Individuals with weaker cognitive abilities struggle more in these situations. Unfortunately, current clinical approaches to hearing aid treatment have not yet been tailored to these individuals. The standard approach to setting up hearing aids is to make speech sounds louder or more audible. However, a downside is that hearing aid settings that make speech more audible, or that attempt to remove background noise, can unintentionally modify other important cues, such as fluctuations in the intensity of the sound, that are necessary for understanding speech. Consequently, some listeners who depend on these cues may be at a disadvantage. Our investigations have focused on understanding why listeners with hearing aids experience these noisy environments differently and on developing an evidence-based method for adjusting hearing aids to each person’s individual abilities.
To address this, we pooled data from 73 individuals across four different published studies from our group over the last decade. In these studies, listeners with hearing loss were asked to repeat sentences that were mixed with background chatter (like at a restaurant or a social gathering). The signals were processed through hearing aids that were adjusted in various ways, changing how they handle loudness and background noise. We measured how these adjustments to the noisy speech affected listeners’ ability to understand the sentences. Each of these studies also used a measurement, called signal fidelity, that captures how the hearing aids and background noise together alter the speech sounds heard by the listener.
Figure 1. Effect of individual cognitive abilities (working memory) on word recognition as signal fidelity changes.
Our findings reveal that listeners generally understand speech better when the background noise is less intrusive, and the hearing aids do not alter the speech cues too much. But there’s more to it: how well a person’s brain collects and manipulates speech information (their working memory), their age, and the severity of their hearing loss all play a role in how well they understand speech in noisy situations. Specifically, those with lower working memory tend to have more difficulty understanding speech when it is obscured by noise or altered by the hearing aid (Figure 1). So, improving the listening environment by reducing the background noise and/or choosing milder settings on the hearing aids could benefit these individuals.
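For readers who want a concrete picture of this kind of pooled analysis, the sketch below fits a mixed-effects regression relating word recognition to signal fidelity, working memory, age, and hearing loss, with a random intercept for each listener. The data file and column names are hypothetical, and this is not the authors’ analysis code.

```python
# Illustrative sketch of a pooled mixed-effects analysis (hypothetical data
# layout; not the authors' code). Each row is one listener x processing condition.
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("pooled_listener_data.csv")

model = smf.mixedlm(
    "word_recognition ~ signal_fidelity * working_memory + age + hearing_loss",
    data,
    groups=data["listener_id"],            # random intercept per listener
)
result = model.fit()
print(result.summary())
```

The interaction term reflects the idea in Figure 1 that the benefit of higher signal fidelity can depend on a listener’s working memory.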
In summary, our study indicates that a tailored approach that considers each person’s cognitive abilities could lead to better communication, especially in noisier situations. Clinically, the measurement of signal fidelity may be a useful tool to help make these decisions. This could mean the difference between straining to hear and enjoying a good conversation over dinner with family.
Emma Holmes – emma.holmes@ucl.ac.uk
X (Twitter): @Emma_Holmes_90
University College London (UCL), Department of Speech, Hearing and Phonetic Sciences, London, Greater London, WC1N 1PF, United Kingdom
Popular version of 4aPP4 – How does voice familiarity affect speech intelligibility?
Presented at the 186th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0027437
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
It’s much easier to understand what others are saying if you’re listening to a close friend or family member, compared to a stranger. If you practice listening to the voices of people you’ve never met before, you might become better at understanding them too.
Many people struggle to understand what others are saying in noisy restaurants or cafés. This can become much more challenging as people get older. It’s often one of the first changes that people notice in their hearing. Yet, research shows that these situations are much easier if people are listening to someone they know very well.
In our research, we ask people to visit the lab with a friend or partner. We record their voices while they read sentences aloud. We then invite the volunteers back for a listening test. During the test, they hear sentences and click words on a screen to show what they heard. This is made more difficult by playing a second sentence at the same time, which the volunteers are told to ignore. This is like having a conversation when there are other people talking around you. Our volunteers listen to many sentences over the course of the experiment. Sometimes, the sentence is one recorded from their friend or partner. Other times, it’s one recorded from someone they’ve never met. Our studies have shown that people are best at understanding the sentences spoken by their friend or partner.
In one study, we manipulated the sentence recordings to change the sound of the voices. The voices still sounded natural, yet volunteers could no longer recognize them as their friend or partner. We found that participants were still better at understanding the sentences, even though they didn’t recognize the voice.
In other studies, we’ve investigated how people learn to become familiar with new voices. Each volunteer learns the names of three new people. They’ve never met these people, but we play them lots of recordings of their voices. This is like when you listen to a new podcast or radio show. We’ve found that listeners become very good at understanding these new talkers. In other words, we can train people to become familiar with new voices.
In new work that hasn’t yet been published, we found that voice familiarization training benefits both older and younger people. So, it may help older people who find it very difficult to listen in noisy places. Many environments contain background noise—from office parties to hospitals and train stations. Ultimately, we hope that we can familiarize people with voices they hear in their daily lives, to make it easier to listen in noisy places.