3aSC4 – Effects of two-talker child speech on novel word learning in preschool-age children

Tina M. Grieco-Calub, tina_griecocalub@rush.edu
Rush University Medical Center
Rush NeuroBehavioral Center

Popular version of 3aSC4 – Effects of two-talker child speech on novel word learning in preschool-age children
Presented Wednesday morning, May 25, 2022
182nd ASA Meeting in Denver, Colorado

One of the most important tasks for children during preschool and kindergarten is building vocabulary knowledge. This vocabulary is the foundation upon which later academic knowledge and reading skills are built. Children acquire new words through exposure to speech from other people, including their parents, teachers, and friends. However, this exposure does not occur in a vacuum. Rather, these interactions often occur in situations where there are other competing sounds, including other people talking or environmental noise. Think back to a time when you tried to have a conversation with someone in a busy restaurant with multiple other conversations happening around you. It can be difficult to focus on the conversation of interest and ignore the other conversations in noisy settings.

Now, think about how a preschool- or kindergarten-aged child might navigate a similar situation, such as a noisy classroom. This child has less mature language and cognitive skills than you do and therefore has a harder time ignoring those irrelevant conversations to process what the teacher says. Children in classrooms must also do two things at once: understand the words they already know and learn new words. Children who have a hard time ignoring background noise can have a particularly hard time building essential vocabulary knowledge in classroom settings.

In this study, we are testing the extent to which background speech, like what might occur in a preschool classroom, influences word learning in preschool- and kindergarten-aged children. We are testing children’s ability to learn and remember unfamiliar words either in quiet or in a noise condition in which two other children are talking in the background. In the noise condition, the teacher’s voice is slightly louder than the background talkers, much as a child would experience in a classroom. During the word learning task, children are first shown unfamiliar objects and asked to repeat their names (e.g., This is a topin. You say topin; see attached movie clip). Children then receive training on the objects and their names. After training, children are asked to name each object. Children’s performance is quantified by how close their production of the object’s name is to the actual name. For example, a child might call the “topin” a “dobin”. Preliminary results suggest that children in quiet and in noise are fairly accurate at repeating the unfamiliar words: they can focus on the teacher’s speech and repeat all the sounds of the word immediately, regardless of condition. Children can also learn the words in both quiet and noise. However, children’s spoken productions of the words are less accurate when they are trained in noise than in quiet. These findings tentatively suggest that when there is background noise, children need more training to learn the precise sounds of words. We will address this issue in future iterations of this study.
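As a rough illustration of what “how close the production is to the actual name” could mean, closeness between a target word and a child’s production can be approximated with an edit distance over the sounds of the word. This is only a sketch of the general idea, not the study’s actual scoring procedure, and the `accuracy` scoring rule here is a hypothetical simplification:

```python
def edit_distance(target, produced):
    """Minimum number of sound substitutions, insertions, or deletions
    needed to turn the target form into the produced form."""
    m, n = len(target), len(produced)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if target[i - 1] == produced[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution
    return d[m][n]

def accuracy(target, produced):
    """Share of the target recovered: 1.0 means an exact match."""
    return 1 - edit_distance(target, produced) / max(len(target), len(produced))

# "dobin" differs from "topin" in two sounds (t->d, p->b),
# so 3 of the 5 sounds are produced correctly.
print(accuracy("topin", "dobin"))
```

In practice such scoring would be done over phonemes transcribed by trained listeners rather than letters, but the intuition is the same: more of the word’s sounds recovered means a higher score.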

Stuttering Starts at Speech Initiation, Not Due to Impaired Motor Skills

Theory suggests anomalies in the brain’s initiation circuit cause stuttering

Media Contact:
Larry Frum
AIP Media

SEATTLE, November 30, 2021 — About one in 20 people go through a period of stuttering during childhood. Until the latter half of the 20th century, stuttering was believed to be a psychological problem stemming from lack of effort or from trauma.

However, techniques in neuroimaging are leading to a much better understanding of brain function during speech and how stuttering arises. Frank Guenther, from Boston University, will present his findings on the origins of stuttering at the 181st Meeting of the Acoustical Society of America, which runs from Nov. 29 to Dec. 3, at the Hyatt Regency Seattle. The talk, “A neurocomputational view of developmental stuttering,” will take place Tuesday, Nov. 30 at 2:15 p.m. Eastern U.S.

Guenther compares speech to a jukebox that plays CDs. The jukebox has two circuits: one that chooses a CD and one that plays the CD.

Inside the brain, this corresponds to one circuit initiating the desired speech in the basal ganglia, while another circuit coordinates the muscles needed to generate the speech. Stuttering stems from the initiation of speech, so only the first of the two circuits is impaired.

“In stuttering, the CDs themselves are fine, but the mechanism for choosing them is impaired,” said Guenther.

This theory matches behavioral observations of stuttering. People will often speak words fluently later in a sentence, even if the same words cause stuttering at the beginning of a sentence.

Guenther and his team created computational models of how the speech initiation circuit performs in a nonstuttering individual. Because Parkinson’s disease also affects the initiation circuit, they can compare these models directly to data taken from the basal ganglia during deep brain stimulation surgery in patients with the disease.

“This gives us a fighting chance of finding the specific problems underlying stuttering and addressing them with highly targeted drugs or technological treatments that have minimal unwanted side effects,” said Guenther.

Main meeting website: https://acousticalsociety.org/asa-meetings/
Technical program: https://eventpilotadmin.com/web/planner.php?id=ASASPRING22
Press Room: https://acoustics.org/world-wide-press-room/

In the coming weeks, ASA’s Worldwide Press Room will be updated with additional tips on dozens of newsworthy stories and with lay language papers, which are 300 to 500 word summaries of presentations written by scientists for a general audience and accompanied by photos, audio and video. You can visit the site during the meeting at https://acoustics.org/world-wide-press-room/.

We will grant free registration to credentialed journalists and professional freelance journalists. If you are a reporter and would like to attend, contact AIP Media Services at media@aip.org. For urgent requests, staff at media@aip.org can also help with setting up interviews and obtaining images, sound clips, or background information.

The Acoustical Society of America (ASA) is the premier international scientific society in acoustics devoted to the science and technology of sound. Its 7,000 members worldwide represent a broad spectrum of the study of acoustics. ASA publications include The Journal of the Acoustical Society of America (the world’s leading journal on acoustics), JASA Express Letters, Proceedings of Meetings on Acoustics, Acoustics Today magazine, books, and standards on acoustics. The society also holds two major scientific meetings each year. See https://acousticalsociety.org/.

4aAA10 – Acoustic Effects of Face Masks on Speech: Impulse Response Measurements Between Two Head and Torso Simulators

Victoria Anderson – vranderson@unomaha.edu
Lily Wang – lilywang@unl.edu
Chris Stecker – cstecker@spatialhearing.org
University of Nebraska–Lincoln, Omaha campus
1110 S 67th Street
Omaha, Nebraska

Popular version of 4aAA10 – Acoustic effects of face masks on speech: Impulse response measurements between two binaural mannikins
Presented Thursday morning, December 2nd, 2021
181st ASA Meeting

Due to the COVID-19 pandemic, masks that cover both the mouth and nose have been widely used to reduce the spread of illness. While they are effective at preventing the transmission of COVID-19, they have also had a noticeable impact on communication. Many people find it difficult to understand a speaker who is wearing a mask. Masks affect the sound level and direction of speech and, if they are opaque, can block visual cues that help in understanding speech. Many studies have explored the effect face masks have on understanding speech. The purpose of this project was to begin assembling a database of the effects that common face masks have on impulse responses from one head and torso simulator (HATS) to another. An impulse response measures how sound radiates out from a source and travels through a space. The resulting impulse response data can be used by researchers to simulate masked verbal communication scenarios.

To isolate how the masks affect the impulse response, all measurements were taken in an anechoic chamber so that no reverberant sound would be included in the measurement. One HATS was placed in the middle of the chamber as the source, and another HATS was placed at varying positions to act as the receiver. The mouth of the source HATS was covered with various face masks: paper, cloth, N95, nano, and face shield. These were put on individually and in combination with a face shield to cover a wider range of masked conditions that would reasonably occur in real life. The receiver HATS took measurements at 90° and 45° from the source, at distances of 6’ and 8’. A sine sweep, which is a signal that changes frequency over a set amount of time, was played to determine the impulse response of each masked condition at every location.

The receiver HATS measured the impulse response in both the right and left ears, and the software used to produce the sine sweep was also used to analyze and store the measurement data. These data will be available for use in simulated communication scenarios to better portray how sound behaves in a space when coming from a masked speaker.
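The sweep-based measurement idea can be sketched in a few lines: play a known sweep through the path you want to characterize, record the result, and deconvolve the sweep back out to recover the impulse response. The sketch below uses a synthetic stand-in for the mouth-to-ear path (a single attenuated delay) rather than any data from this study, and the regularized spectral division is one common approach among several:

```python
import numpy as np

fs = 48000                       # sample rate (Hz)
dur = 2.0                        # sweep length (s)
t = np.arange(int(fs * dur)) / fs

# Linear sine sweep from 20 Hz to 20 kHz: instantaneous frequency rises over time.
f0, f1 = 20.0, 20000.0
sweep = np.sin(2 * np.pi * (f0 * t + (f1 - f0) / (2 * dur) * t**2))

# Synthetic stand-in for the source-to-receiver path: one attenuated arrival
# 40 samples after the sound leaves the source.
true_ir = np.zeros(256)
true_ir[40] = 0.5

# What the receiver "records": the sweep filtered by the path.
recorded = np.convolve(sweep, true_ir)

# Deconvolve: divide the recorded spectrum by the sweep spectrum,
# with a small regularization term to avoid dividing by near-zero bins.
n = len(recorded)
S = np.fft.rfft(sweep, n)
R = np.fft.rfft(recorded)
H = R * np.conj(S) / (np.abs(S) ** 2 + 1e-6)
est_ir = np.fft.irfft(H, n)[:256]

print(np.argmax(np.abs(est_ir)))  # sample index of the recovered arrival
```

In a real measurement, the recorded signal comes from the receiver HATS microphones instead of `np.convolve`, and the recovered impulse response then captures the combined effect of the mask, the distance, and the angle between the two simulators.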



3aSC7 – Human Beatboxing: A Vocal Exploration

Alexis Dehais-Underdown – alexis-dehais-underdown@sorbonne-nouvelle.fr
Paul Vignes – vignes.paul@gmail.com
Lise Crevier-Buchman – lise.buchman1@gmail.com
Didier Demolin – didier.demolin@sorbonne-nouvelle.fr
Université Sorbonne-Nouvelle
13, rue de Santeuil
75005, Paris, FRANCE

Popular version of 3aSC7 – Human beatboxing: Physiological aspects of drum imitation
Presented Wednesday morning, December 1st, 2021
181st ASA Meeting, Seattle, Washington

We are interested in exploring the potential of the human vocal tract by understanding beatboxing production. Human Beatboxing (HBB) is a musical technique that uses the vocal tract to imitate musical instruments. Similar to languages like French or English, HBB relies on combining smaller units into larger ones. Unlike linguistic systems, however, HBB carries no meaning: while we speak to be understood, beatboxers do not perform to be understood. Speech production obeys linguistic constraints that ensure efficient communication; for example, each language has a finite number of vowels and consonants. This is not the case for HBB, because beatboxers use a larger inventory of sounds. We hypothesize that beatboxers acquire a more accurate and extensive knowledge of the physical capacities of the vocal tract, which allows them to use this larger inventory of sounds.

Acquisition of laryngoscopic data (left) and acoustic & aerodynamic data (right)

We used three techniques with five professional beatboxers: (1) aerodynamic recordings, (2) laryngoscopic recordings, and (3) acoustic recordings. Aerodynamic data give information about the pressure and airflow changes that result from articulatory movements. Laryngoscopic images show the different anatomical laryngeal structures and their role in beatboxing production. Acoustic data allow us to investigate the sound characteristics in terms of frequency and amplitude. We extracted nine basic beatboxing sounds from our database: the classic kick drum and its humming variant, the closed hi-hat and its humming variant, the inward k-snare and its humming variant, the cough snare, and the lips roll and its humming variant. Humming is a beatboxing strategy that allows simultaneous and independent articulation in the mouth and melodic voice production in the larynx. Some sounds are illustrated here:

The preliminary results are very interesting. While speech is mainly produced on an egressive airflow from the lungs (i.e., the exhalation phase of breathing), HBB is not. We found a wide range of mechanisms for producing basic sounds. Mechanisms were described by where the airflow is set in motion (i.e., lungs, larynx, mouth) and by which direction the airflow goes (i.e., into or out of the vocal tract). The sounds show different combinations of airflow location and direction:
• buccal egressive (humming classic kick and closed hi-hat) and ingressive (humming k-snare and lips roll),
• pulmonic egressive (cough snare) and ingressive (classic inward k-snare and lips roll),
• laryngeal egressive (classic kick drum and closed hi-hat) and ingressive (classic k-snare and inward classic kick drum).

The same sound may be produced differently by different beatboxers yet sound perceptually similar. HBB displays high pressure values, which suggests that these mechanisms are more powerful than those of speech in quiet conversation.

In the absence of linguistic constraints, artists exploit the capacities of the vocal tract more freely. This raises several questions: how do they reorganize respiratory activity, how do they coordinate sounds together, and how do beatboxers avoid lesions or damage to the vocal tract structures? Our research project will produce further analysis of the description and coordination of beatboxing sounds at different speed rates, based on MRI, laryngoscopic, aerodynamic, and acoustic data.


See also: Alexis Dehais-Underdown, Paul Vignes, Lise Crevier-Buchman, and Didier Demolin, “In and out: production mechanisms in Human Beatboxing”, Proc. Mtgs. Acoust. 45, 060005 (2021). https://doi.org/10.1121/2.0001543

2aSC1 – Testing Invisible Participants: Conducting Behavioural Science Online During the Pandemic

Prof Jennifer Rodd
Department of Experimental Psychology, University College London

Popular version of paper 2aSC1 Collecting experimental data online: How to maintain data quality when you can’t see your participants
Presented at the 180th ASA meeting

In early 2020 many researchers across the world had to close up their labs and head home to help prevent further spread of coronavirus.

If this pandemic had arrived a few years earlier, these restrictions on testing human volunteers in person would have resulted in a near-complete shutdown of behavioural research. Fortunately, the last 10 years have seen rapid advances in the software needed to conduct behavioural research online (e.g., Gorilla, jsPsych), and researchers now have access to well-regulated pools of paid participants (e.g., Prolific). This allowed the many researchers who had already switched to online data collection to continue collecting data throughout the pandemic. In addition, many lab-based researchers who may have been sceptical about online data collection made the switch to online experiments over the last year. Jo Evershed (Founder and CEO of Gorilla Experiment Builder) reports that the number of participants who completed a task online using Gorilla nearly tripled between the first quarter of 2020 and the same period in 2021.

But this rapid shift to online research is not without problems. Many researchers have well-founded concerns about the lack of experimental control that arises when we cannot directly observe our participants.

Based on 8 years of running behavioural research online, I encourage researchers to embrace online research, but argue that we must carefully adapt our research protocols to maintain high data quality.

I present a general framework for conducting online research. It requires researchers to explicitly specify how moving data collection online might negatively impact their data and undermine their theoretical conclusions, for example:

  • Where are participants doing the experiment? Somewhere noisy or distracting? Will this make data noisy or introduce systematic bias?


  • What equipment are participants using? Slow internet connection? Small screen? Headphones or speakers? How might this impact results?


  • Are participants who they say they are? Why might they lie about their age or language background? Does this matter?


  • Can participants cheat on your task? By writing things down as they go, or looking up information on the internet?


I encourage researchers to take a ‘worst case’ approach and assume that some of the data they collect will inevitably be of poor quality. The onus is on us to carefully build in experiment-specific safeguards to ensure that poor-quality data can be reliably identified and excluded from our analyses. Sometimes this can be achieved by pre-specifying performance criteria on existing tasks, but often it involves creating new tasks that provide critical information about our participants and their behaviour. These additional steps must be taken prior to data collection and can be time-consuming, but they are vital to maintaining the credibility of data obtained using online methods.
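The idea of pre-specified exclusion criteria can be sketched very simply: decide the thresholds before any data arrive, then apply them mechanically. The participant records, field names, and threshold values below are entirely hypothetical, chosen only to illustrate the pattern:

```python
# Hypothetical participant records: a catch-trial accuracy score checks
# attention, and a headphone-screening score checks the listening setup.
participants = [
    {"id": "p01", "catch_accuracy": 0.95, "headphone_check": 5},
    {"id": "p02", "catch_accuracy": 0.60, "headphone_check": 6},  # inattentive
    {"id": "p03", "catch_accuracy": 1.00, "headphone_check": 2},  # likely on speakers
]

# Exclusion thresholds fixed *before* data collection begins.
MIN_CATCH = 0.90
MIN_HEADPHONE = 4

# Apply the pre-registered criteria mechanically, with no post-hoc judgement.
kept = [p for p in participants
        if p["catch_accuracy"] >= MIN_CATCH
        and p["headphone_check"] >= MIN_HEADPHONE]

print([p["id"] for p in kept])  # only p01 passes both pre-specified checks
```

Because the criteria are fixed in advance, excluding p02 and p03 cannot be influenced by how their data would have affected the results, which is what keeps the exclusion step credible.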

1aSC2 – The McGurk Illusion

Kristin J. Van Engen – kvanengen@wustl.edu
Washington University in St. Louis
1 Brookings Dr.
Saint Louis, MO 63130

Popular version of paper 1aSC2 The McGurk illusion
Presented Tuesday morning, June 8, 2021
180th ASA Meeting, Acoustics in Focus

In 1976, Harry McGurk and John MacDonald published their now-famous article, “Hearing Lips and Seeing Voices.” The study was a remarkable demonstration of how what we see affects what we hear: when the audio for the syllable “ba” was presented to listeners with the video of a face saying “ga”, listeners consistently reported hearing “da”.

That original paper has been cited approximately 7500 times to date, and in the subsequent 45 years, the “McGurk effect” has been used in countless studies of audiovisual processing in humans. It is typically assumed that people who are more susceptible to the illusion are also better at integrating auditory and visual information. This assumption has led to the use of susceptibility to the McGurk illusion as a measure of an individual’s ability to process audiovisual speech.

However, when it comes to understanding real-world multisensory speech perception, there are several reasons to think that McGurk-style stimuli are poorly suited to the task. Most problematic is the fact that McGurk stimuli rely on audiovisual incongruence that never occurs in real-life audiovisual speech perception. Furthermore, recent studies show that susceptibility to the effect does not actually correlate with performance on audiovisual speech perception tasks such as understanding sentences in noisy conditions. This presentation reviews these issues, arguing that, while the McGurk effect is a fascinating illusion, it is the wrong tool for understanding the combined use of auditory and visual information during speech perception.