ASA Lay Language Papers
163rd Acoustical Society of America Meeting


Speech in Noise and Ease of Language Understanding:
When and how working memory capacity plays a role


Jerker Rönnberg --  jerker.ronnberg@liu.se
Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Sweden
Department of Behavioural Sciences and Learning, Linköping University, Sweden

Patrik Sörqvist
Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Sweden
Department of Building, Energy and Environmental Engineering, University of Gävle, Gävle, Sweden

Örjan Dahlström
Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Sweden
Department of Behavioural Sciences and Learning, Linköping University, Sweden

Mary Rudner
Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Sweden
Department of Behavioural Sciences and Learning, Linköping University, Sweden

Ingrid Johnsrude
Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Sweden
Department of Psychology, Queen's University, Kingston, Ontario, Canada

Stefan Stenfelt
Linnaeus Centre HEAD, Swedish Institute for Disability Research, Linköping University, Sweden
Department of Clinical and Experimental Medicine, Linköping University, Sweden

Popular Version of Paper 2pPP11
Presented Tuesday afternoon, May 15, 2012
163rd ASA Meeting, Hong Kong

Introduction

When you listen to a colleague at a cocktail party while others are talking in the background, is it easy for you to understand what your colleague is saying? Some people have no problem following and understanding speech heard in noise, whereas others have a severely limited ability to do so. What is the basis of those individual differences? Here, we argue that part of the answer has to do with something called working memory: a memory system that allows on-line storage, selective attention and rehearsal of information. Working memory is crucial for listening, for speaking and for communication in general (Baddeley, 2012).

Figure 1. A cocktail party at the poster session of the First International Conference on Cognitive Hearing Science for Communication, June 2011, Linköping, Sweden.

Aim

This paper is about the role of working memory capacity in speech understanding under challenging listening conditions. The theoretical model that has driven most of the research reported in this paper is called the Ease of Language Understanding (ELU) model (Ronnberg, 2003; Ronnberg et al., 2008). The ELU model is part of a larger scientific endeavor called cognitive hearing science (Ronnberg, Rudner & Lunner, 2011a).

The model

The model assumes that speech understanding runs smoothly (with ease) as long as the listener can extract sufficient phonological information from the input signal. If the phonological input (e.g. auditory and visual information about phonemes, syllables, onset and rhyme, articulatory features) can be matched with the listener's phonological representation in long-term memory, then the lexical meaning can be "unlocked" automatically and rapidly. Speed is essential to keep up with a dialogue.

However, when this rapid matching process is blocked by a noisy background, the listener's hearing loss, inappropriate signal processing in the hearing aid, or poor phonological representations in long-term memory (Ronnberg et al., 2011b), a mismatch will occur. As a result, the flow of information processing is disrupted, and at this point (after 200-300 ms) the system calls on working memory resources to help resolve alternative interpretations of what was said, using the present context and semantic cues. Working memory is the mental workbench for this kind of guesswork, which is typically more elaborate and slower, taking seconds rather than milliseconds. The capacity to multi-task, in the sense of drawing inferences from what has been perceptually picked up from the spoken input and combining that with what can be retrieved from long-term memory via contextual/semantic cues, is a crucial component of working memory capacity (Baddeley, 2000).

The test

It is important to note that the working memory tests that best tap into this capacity are those that involve a dual task (so-called complex-span tasks). For example, in the reading span test, the participant reads sentences under time pressure and makes a semantic judgement (within a maximum of 2 seconds) as to whether each sentence is absurd or not (Daneman & Carpenter, 1980; Ronnberg et al., 1989). Then, after a set of short sentences has been presented and read, the participant is asked to recall either the first or the last word of each sentence, in the correct presentation order. The presentation sequence could be as follows: Sentence 1: The train sang a song (Judgement 1); Sentence 2: The captain saw a boat (Judgement 2); Sentence 3: The flask drank water (Judgement 3). Recall the last words: song, boat, water. The participant obviously has to manage several things at the same time: semantic judgements, intermediate storage of the sentences (without knowing whether the first or the last words will be tested), and then retrieval. Typically, people manage a score of three (as above). With, for example, six sentences, very few people succeed in full. We argue that the dual- or multi-task ability measured by complex-span tasks is important for hearing-impaired people, because it allows them to compensate for loss of signal by orchestrating the interplay between distorted perceptual input, long-term memory and contextual cues.
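
To make the structure of the reading span task concrete, the three-sentence example can be written as a small Python sketch. This is only an illustration: the names ReadingSpanTrial, target_word and score_recall are hypothetical, and the strict serial scoring rule (a word counts only when it is recalled in its presentation position) is an assumption made for the example, not necessarily the scoring used in the cited studies.

```python
from dataclasses import dataclass


@dataclass
class ReadingSpanTrial:
    sentence: str
    is_absurd: bool  # correct answer to the semantic judgement


def target_word(sentence: str, position: str) -> str:
    """Return the first or last word of a sentence, lower-cased."""
    words = sentence.rstrip(".").split()
    word = words[0] if position == "first" else words[-1]
    return word.lower()


def score_recall(trials, position, recalled):
    """Count words recalled in the correct serial position (assumed rule)."""
    targets = [target_word(t.sentence, position) for t in trials]
    return sum(
        1 for target, answer in zip(targets, recalled)
        if target == answer.lower()
    )


# The three-sentence set from the text.
trials = [
    ReadingSpanTrial("The train sang a song", is_absurd=True),
    ReadingSpanTrial("The captain saw a boat", is_absurd=False),
    ReadingSpanTrial("The flask drank water", is_absurd=True),
]

# The participant learns only at recall time which word position is tested.
print(score_recall(trials, "last", ["song", "boat", "water"]))  # 3
print(score_recall(trials, "last", ["song", "water", "boat"]))  # 1
```

The second call shows why order matters under this assumed rule: all three words are remembered, but only one is in its correct position.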

Background data

We have collected data that reveal a strong dependence on working memory capacity when hearing-impaired listeners have to perceive and understand speech in noise (Foo et al., 2007; Rudner et al., 2009). This dependence also varies with the type of noise. Processing speech in fluctuating noise, with fast compression of the signal in the hearing aid, is even more dependent on working memory capacity than processing speech in stationary noise with slow compression (Lunner & Sundewall-Thoren, 2007; Ronnberg et al., 2010). A typical explanation that has been put forward is that people who can "listen in the gaps", or who are relatively better at taking advantage of "glimpses" between noise bursts, especially with fast compression in the hearing aid, are endowed with a high working memory capacity.

Recent data

The presentation will also explore the boundaries of the effects of working memory capacity on on-line processing under attention-demanding or degraded signal conditions, both for immediate perception and understanding and for memory of target materials. We will first examine the effects of working memory capacity and working memory load on very early brain stem responses, i.e. processing at the level of the auditory nerve, shortly after the output from the hearing organ (the cochlea). Second, recent data reveal that another level of the auditory system in the brain, the cortical level, is also modulated by working memory capacity, especially at intermediate levels of speech intelligibility (Dahlstrom et al., 2012). Third, and finally, in a recent publication (Sorqvist & Ronnberg, 2012), we demonstrated that a special version of a dual working memory task, the size comparison span, is particularly capable of predicting what you remember of heard sentences masked by speech from another speaker.

Size comparison span involves comparing the size of objects or animals, responding with a yes or no, and then focusing on a to-be-recalled word. The sequence could be: Question 1: Is a zebra larger than a mouse? Yes/no; then encoding of a to-be-recalled word: lion; Question 2: Is a dog larger than a cat? Yes/no; then encoding of a to-be-recalled word: sheep; and so on, up to a certain list length. The to-be-recalled words must then be recalled in the correct serial order. A possibly important feature of this test is that the participant must resolve the confusion (or semantic interference) between the comparison words and the to-be-recalled words, so that only the target words are recalled, by inhibiting the semantic interference. We argue that this inhibition ability is involved when you are resolving the confusion between the target speech and competing speech while listening to a friend at the cocktail party, and will, at least in part, determine how much you will remember from that conversation (Sorqvist & Ronnberg, 2012).
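
The size comparison span sequence described above can also be sketched in Python. Again, this is a hypothetical illustration rather than the published procedure: SizeComparisonItem and score_list are invented names, and counting recalled comparison words as "intrusions" is an assumption introduced here to make the idea of semantic interference visible.

```python
from dataclasses import dataclass


@dataclass
class SizeComparisonItem:
    first: str             # "Is a <first> larger than a <second>?"
    second: str
    first_is_larger: bool  # correct yes/no answer
    target: str            # to-be-recalled word shown after the answer


def score_list(items, recalled):
    """Return (targets recalled in correct serial position, intrusions).

    An intrusion is a recalled word that was one of the comparison words
    rather than a target, i.e. a failure to inhibit semantic interference.
    """
    targets = [item.target.lower() for item in items]
    comparison_words = {
        word.lower() for item in items for word in (item.first, item.second)
    }
    answers = [word.lower() for word in recalled]
    correct = sum(1 for t, a in zip(targets, answers) if t == a)
    intrusions = sum(1 for a in answers if a in comparison_words)
    return correct, intrusions


# The two-item sequence from the text.
items = [
    SizeComparisonItem("zebra", "mouse", first_is_larger=True, target="lion"),
    SizeComparisonItem("dog", "cat", first_is_larger=True, target="sheep"),
]

print(score_list(items, ["lion", "sheep"]))  # (2, 0)
print(score_list(items, ["lion", "cat"]))    # (1, 1)
```

In the second call the participant reports "cat", a comparison word from the same semantic category as the targets, which is the kind of confusion the test is designed to provoke.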

Summary statement

The presentation covers very different situations in which working memory capacity plays a role in selective attention, speech understanding, and memory for attended materials. Theoretical understanding of these effects is key to the development of the whole area of cognitive hearing science (Ronnberg et al., 2011b). Research is under way to evaluate whether training of working memory will improve speech understanding in noise.

References

-Baddeley, A.D. (2000). The episodic buffer: a new component of working memory? Trends in Cognitive Sciences, 4, 417-423.
-Baddeley, A.D. (2012). Working Memory: Theories, Models, and Controversies. Annual Review of Psychology, 63, 1-29.
-Daneman, M., & Carpenter, P.A. (1980). Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 19, 450-466.
-Dahlstrom, O., Johnsrude, I.S., Rudner, M., Stenfelt, S., & Ronnberg, J. (2012). Individual differences in working memory capacity modulate frontal cortical activity while listening to speech in noise. Nineteenth Annual Cognitive Neuroscience Society Meeting, March 31 - April 3, 2012, Chicago, Illinois.
-Foo, C., Rudner, M., Ronnberg, J., & Lunner, T. (2007). Recognition of speech in noise with new hearing instrument compression release settings requires explicit cognitive storage and processing capacity. Journal of the American Academy of Audiology, 18, 553-566.
-Lunner, T., & Sundewall-Thoren, E. (2007). Interactions between cognition, compression, and listening conditions: effects on speech-in-noise performance in a two-channel hearing aid. Journal of the American Academy of Audiology, 18, 539-552.
-Rudner, M., Foo, C., Ronnberg, J., & Lunner, T. (2009). Cognition and aided speech recognition in noise: specific role for cognitive factors following nine-week experience with adjusted compression settings in hearing aids. Scandinavian Journal of Psychology, 50, 405-418.
-Ronnberg, J. (2003). Cognition in the hearing impaired and deaf as a bridge between signal and dialogue: A framework and a model. International Journal of Audiology, 42, S68-S76.
-Ronnberg, J., Arlinger, S., Lyxell, B., & Kinnefors, C. (1989). Visual evoked potentials: Relation to adult speechreading and cognitive function. Journal of Speech and Hearing Research, 32, 725-735.
-Ronnberg, J., Danielsson, H., Rudner, M., Arlinger, S., Sternang, O., Wahlin, A., & Nilsson, L-G. (2011b). Hearing loss is negatively related to episodic and semantic long-term memory but not to short-term memory. Journal of Speech, Language, and Hearing Research, 54, 705-726.
-Ronnberg, J., Rudner, M., Foo, C., & Lunner, T. (2008). Cognition counts: A working memory system for ease of language understanding (ELU). International Journal of Audiology, 47, S171-S177.
-Ronnberg, J., Rudner, M., & Lunner, T. (2011a). Cognitive hearing science: the legacy of Stuart Gatehouse. Trends in Amplification, 15(3), 140-148.
-Ronnberg, J., Rudner, M., Lunner, T., & Zekveld, A.A. (2010). When cognition kicks in: Working memory and speech understanding in noise. Noise and Health, 12(49), 263-269.
-Sorqvist, P., & Ronnberg, J. (2012). Episodic long-term memory of spoken discourse masked by speech: What role for working memory capacity? Journal of Speech, Language, and Hearing Research, 55, 210-218.