2aNSa1 – Soundscape will tune an acoustic environment through people's minds

Brigitte Schulte-Fortkamp – b.schulte-fortkamp@tu-berlin.de
Technical University Berlin
Institute of Fluid Mechanics and Engineering Acoustics
Psychoacoustics and Noise Effects
Einsteinufer 25
10587 Berlin, Germany

Popular version of paper 2aNSa1, “Soundscape as a resource to balance the quality of an acoustic environment”
Tuesday morning, May 19, 2015, 8:35 AM, Commonwealth 1
169th ASA Meeting, Pittsburgh, Pennsylvania

Preface
Soundscape studies investigate and find increasingly better ways to measure and hone the acoustic environment. Soundscape offers the opportunity for multidisciplinary working, bringing together science, medicine, social studies and the arts – combined, crucially, with analysis, advice and feedback from the ‘users of the space’ as the primary ‘experts’ of any environment – to find creative and responsive solutions for protection of living places and to enhance the quality of life.

The Soundscape concept was introduced as a framework for rethinking the evaluation of "noise" and its effects. The challenge was to acknowledge the limits of acoustic measurements and to account for the cultural dimension of sound.

The recent international standard ISO 12913-1 (Acoustics — Soundscape — Part 1: Definition and conceptual framework) defines soundscape as an "acoustic environment as perceived or experienced and/or understood by a person or people, in context."

Soundscape
Figure 1 — Elements in the perceptual construct of soundscape

Soundscape suggests exploring noise in its complexity and ambivalence, and approaching sound by considering the conditions and purposes of its production, perception, and evaluation, so that the evaluation of noise and sound is understood holistically.

Discussing the contribution of Soundscape research to community noise research means focusing on the meaning of sounds and their implicit assessments, and recognizing that evaluation through perceptual effects is a key issue.

Using the resources – an example
Soundscape Approach to Public Space Perception and Enhancement, Drawing on Experience in Berlin
Figure 2 – Soundscape Nauener Platz

The concept for developing the open space relies on the understanding that the people living in the chosen area are the "real" experts when it comes to evaluating this place, according to their expectations and experiences there. The intention of scientific research here is to learn about the meaning of the noise with respect to people's living situation and to implement adequate procedures to open the "black box" of people's minds.

Therefore, the aim was to involve residents through workshops in order to reach the different social groups.
Figure 3 – Participation and Collaboration
Figure 4 – The concept of evaluation

Interdisciplinarity is considered a must in the soundscape approach. In this case it meant collaboration among architects, acoustic engineers, environmental health specialists, psychologists, social scientists, and urban planners. The tasks are related to local individual needs and are open to noise-sensitive and other vulnerable groups. The approach is also concerned with cultural aspects and with the relevance of natural soundscapes – sometimes referred to as quiet areas – which relate to the highest level of needs.
Figure 5 – Soundscape – an interactive approach using the resources

Improving local soundscape quality?
These new approaches and methods make it possible to learn about the process of perception and evaluation, because they take into account context, ambiance, the everyday interaction between noise and listener, and the multidimensionality of noise perception.

By contrast, conventional methods often reduce the complexity of reality to controllable variables that supposedly represent the object under scrutiny. Traditional tests frequently neglect the context-dependency of human perception; they provide only artificial realities and reduce the complexity of perception to predetermined values that do not fully correspond with perceptual authenticity. Yet perception and evaluation depend entirely on the respective acoustic and non-acoustic modifiers.

From the comments, the group discussions, and the results of the narrative interviews, it could be determined why people prefer some spots of the public place over others. It also became clear how people experience the noise at different distances from the road, and how social life and social control shape that experience. One of the most important findings is how people react to low-frequency noise at the public place, and how experiences and expectations work together. Evidently, the most wanted sounds in this area reflect the wish to escape road traffic noise through natural sounds.
Figure 6 – Selected sounds for audio islands

Reshaping the place based on people’s expertise
Relying on the combined evaluation procedures, the place was reshaped: a gabion wall was installed along one of the main roads, and audio islands were built that integrate the sounds people would like to enjoy when using the place. While the gabion wall shields the playground from noise, the newly installed audio islands provide nature sounds selected by the people involved in the Soundscape approach.
Figure 7 – Installation of the sounds

Conclusions
Figure 8 – The new place

Tuning urban areas with respect to people's expertise and quality of life relies on the strategy of triangulation, which provides the theoretical frame for understanding, for example, the change in an area. In other words, approaching the field in this holistic manner is generally needed.

An effective and sustainable reduction in the number of people highly annoyed by noise is only possible with further scientific work on methods development and on noise effects. Noise maps providing further information can help build a deeper understanding of noise reactions and can reliably identify perception-related hot spots. Psychoacoustic maps are particularly interesting in areas where noise levels are marginally below the limits, and they offer additional help in identifying which noise abatement measures are required.

But the expertise of the people involved provides meaningful information. Soundwalks, an eligible instrument for exploring urban areas with the minds of the "local experts" as the measuring device, open up a field of data for triangulation. In combination, these techniques give meaning to the numbers and values of recordings and their analysis, so that the significance of sound and noise, and the perception of Soundscapes through their resources, can be understood.

REFERENCES
J. Kang, B. Schulte-Fortkamp (editors), "Soundscape and the Built Environment," CRC Press / Taylor & Francis Group, in print.
B. Schulte-Fortkamp, J. Kang (editors), Special Issue on Soundscape, J. Acoust. Soc. Am., 2012.
R. M. Schafer, "The Soundscape: Our Sonic Environment and the Tuning of the World," Destiny Books, Rochester, Vermont, 1977.
B. Hollstein, "Qualitative approaches to social reality: the search for meaning," in J. Scott and P. J. Carrington (editors), Sage Handbook of Social Network Analysis, Sage, London/New Delhi, 2012.
R. M. Schafer, "The Book of Noise," Price Milburn Co., Lee, Wellington, NZ, 1973.
B. Truax (editor), "Handbook for Acoustic Ecology," A.R.C. Publication, Vancouver, 1978.
K. Hiramatsu, "Soundscape: The Concept and Its Significance in Acoustics," Proc. ICA, Kyoto, 2004.
A. Fiebig, B. Schulte-Fortkamp, K. Genuit, "New options for the determination of environmental noise quality," Proc. INTER-NOISE 2006, Honolulu, HI, 2006.
P. Lercher, B. Schulte-Fortkamp, "Soundscape and community noise annoyance in the context of environmental impact assessments," Proc. INTER-NOISE 2003, 2815-2824, 2003.
B. Schulte-Fortkamp, D. Dubois (editors), Acta Acustica united with Acustica, Special Issue on Recent Advances in Soundscape Research, Vol. 92(6), 2006.
R. Klaboe et al., "Änderungen in der Klang- und Stadtlandschaft nach Änderung von Straßenverkehrsstraßen im Stadtteil Oslo-Ost" [Changes in the sound- and cityscape after road changes in the Oslo-East district], Fortschritte der Akustik, Oldenburg, 2000.

3aBA5 – Fabricating Blood Vessels with Ultrasound

Diane Dalecki, Ph.D.
Eric S. Comeau, M.S.
Denise C. Hocking, Ph.D.
Rochester Center for Biomedical Ultrasound
University of Rochester
Rochester, NY 14627

Popular version of paper 3aBA5, “Applications of acoustic radiation force for microvascular tissue engineering”
Presented Wednesday morning May 20, 9:25 AM, in room Kings 2
169th ASA Meeting, Pittsburgh

Tissue engineering is the field of science dedicated to fabricating artificial tissues and organs that can be made available for patients in need of organ transplantation or tissue reconstructive surgery. Tissue engineers have successfully fabricated relatively thin tissues, such as skin substitutes, that can receive nutrients and oxygen by simple diffusion. However, recreating larger and/or more complex tissues and organs will require developing methods to fabricate functional microvascular networks to bring nutrients to all areas of the tissue for survival.

In the laboratories of Diane Dalecki, Ph.D. and Denise C. Hocking, Ph.D., research is underway to develop new ultrasound technologies to control and enhance the fabrication of artificial tissues [1]. Ultrasound fields are sound fields at frequencies higher than humans can hear (i.e., > 20 kHz). Dalecki and Hocking have developed a technology that uses a particular type of ultrasound field, called an ultrasound standing wave field, as a tool to non-invasively engineer complex spatial patterns of cells [2] and fabricate microvessel networks [3,4] within artificial tissue constructs.

When a solution of collagen and cells is exposed to an ultrasound standing wave field, the forces associated with the field lead to the alignment of the cells into planar bands (Figure 1). The distance between the bands of cells is controlled by the ultrasound frequency, and the density of cells within each band is controlled by the intensity of the sound field. The collagen polymerizes into a solid gel during the ultrasound exposure, thereby maintaining the spatial organization of the cells after the ultrasound is turned off. More complex patterning can be achieved by use of more than one ultrasound transducer.

Figure 1. Acoustic patterning of microparticles (dark bands) using an ultrasound standing wave field. Distance between planar bands is 750 µm. Scale bar = 100 µm.
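The band spacing seen in Figure 1 follows from the geometry of a standing wave: cells collect at pressure nodes separated by half the acoustic wavelength. A rough sanity check of that relationship (a sketch, not the authors' code; the 1 MHz frequency and the ~1540 m/s speed of sound assumed for the collagen solution are illustrative values, not taken from the paper):

```python
# Cells in an ultrasound standing wave collect in planar bands spaced
# half a wavelength apart: spacing = c / (2 * f).
# The speed of sound (~1540 m/s, typical of water/soft tissue) and the
# 1 MHz frequency below are illustrative assumptions, not paper values.

def band_spacing_m(frequency_hz, speed_of_sound_m_s=1540.0):
    """Distance between adjacent planar cell bands (half wavelength), in meters."""
    return speed_of_sound_m_s / (2.0 * frequency_hz)

spacing_um = band_spacing_m(1.0e6) * 1e6
print(f"{spacing_um:.0f} um")  # → 770 um, the same order as the 750 um in Figure 1
```

Because the spacing scales inversely with frequency, choosing the transducer frequency directly sets the distance between cell bands, as the text describes.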

An exciting application of this technology involves the fabrication of microvascular networks within artificial tissue constructs. Specifically, acoustic patterning of endothelial cells into planar bands within collagen hydrogels leads to the rapid development of microvessel networks throughout the entire volume of the hydrogel. Interestingly, the structure of the resulting microvessel network can be controlled by the choice of ultrasound exposure parameters. As shown in Figure 2, ultrasound standing wave fields can be employed to fabricate microvessel networks with different physiologically relevant morphologies, including capillary-like networks (left panel), aligned non-branching vessels (center panel), or aligned vessels with hierarchically branching microvessels (right panel). Ultrasound fields provide an ideal technology for microvascular engineering; the technology is rapid and noninvasive, can be broadly applied to many types of cells and hydrogels, and can be adapted to commercial fabrication processes.


Figure 2. Ultrasound-fabricated microvessel networks within collagen hydrogels. The ultrasound pressure amplitude used for initial patterning determines the final microvessel morphology, which can resemble tortuous capillary-like networks (left panel), aligned non-branching vessels (center panel), or aligned vessels with hierarchically branching microvessels (right panel). Scale bars = 100 µm.

To learn more about this research, please view this informative video (https://www.youtube.com/watch?v=ZL-cx21SGn4).

References:

[1] Dalecki D, Hocking DC. Ultrasound technologies for biomaterials fabrication and imaging. Annals of Biomedical Engineering 43:747-761; 2015.

[2] Garvin KA, Hocking DC, Dalecki D. Controlling the spatial organization of cells and extracellular matrix proteins in engineered tissues using ultrasound standing wave fields. Ultrasound Med. Biol. 36:1919-1932; 2010.

[3] Garvin KA, Dalecki D, Hocking DC. Vascularization of three-dimensional collagen hydrogels using ultrasound standing wave fields. Ultrasound Med. Biol. 37:1853-1864; 2011.

[4] Garvin KA, Dalecki D, Youssefhussien M, Helguera M, Hocking DC. Spatial patterning of endothelial cells and vascular network formation using ultrasound standing wave fields. J. Acoust. Soc. Am. 134:1483-1490; 2013.

5aMU1 – Understanding timbral effects of multi-resonator/generator systems of wind instruments in the context of western and non-western music

Popular version of poster 5aMU1
Presented Friday morning, May 22, 2015, 8:35 AM – 8:55 AM, Kings 4
169th ASA Meeting, Pittsburgh

This paper investigates the relationship between wind instruments and the rooms they are performed in. A musical instrument is typically characterized as a system consisting of a tone generator coupled to a resonator. A saxophone, for example, has a reed as its tone generator and a conically shaped resonator whose effective length can be changed with keys to produce different musical notes. Often neglected is the fact that every wind instrument has a second resonator coupled to the tone generator: the player's vocal cavity. We use our vocal cavity every day when we speak, forming characteristic formants, local enhancements in the frequency spectrum that shape vowels. This is achieved by varying the diameter of the vocal tract at specific positions along its axis. In contrast to the resonator of a wind instrument, the vocal tract is fixed in length by the distance between the vocal cords and the lips. Consequently, the vocal tract cannot be used to change the fundamental frequency over a larger melodic range; for our voice, the change in frequency is controlled by the tension of the vocal cords. The instrument's resonator, on the other hand, is not an adequate device for controlling the timbre (harmonic spectrum) of an instrument, because it can only be varied in length, not in width. Therefore, the player's adjustment of the vocal tract is necessary to control the timbre of the instrument. While some instruments possess additional mechanisms to control timbre, e.g., the embouchure, which controls the tone generator directly using the lip muscles, others like the recorder rely on changes in the wind supply provided by the lungs and on changes of the vocal tract. The role of the vocal tract has not been addressed systematically in the literature and in learning guides, for two obvious reasons. Firstly, there is no known systematic approach for quantifying the internal body movements that shape the vocal tract.
Each performer has to figure out the best vocal tract configurations intuitively. For the resonator system, the changes are described through musical notes, and where multiple ways exist to produce the same note, additional signs show how to finger it (e.g., by specifying a key combination). Secondly, in Western classical music culture, vocal tract adjustments predominantly have a corrective function: they balance out the harmonic spectrum to make the instrument sound as even as possible across the register.


PVC-Didgeridoo adapter for soprano saxophone

In non-Western cultures, the role of the oral cavity can be much more important in conveying musical meaning. The didgeridoo, for example, has a fixed resonator with no finger holes, and consequently it can only produce a single-pitched drone. The musical parameter space is instead defined by modulating the overtone spectrum above that tone, by changing the vocal tract dimensions and by creating vocal sounds on top of the buzzing lips at the didgeridoo's edge. Mouthpieces of Western brass instruments have a cup behind the rim with a very narrow opening to the resonator, the throat. The didgeridoo has no cup; its rim is simply the edge of the resonator, finished with a ring of beeswax. While the narrow throat of a Western mouthpiece mutes additional sounds produced with the voice, didgeridoos are open from end to end and carry the voice much better.

The room a musical instrument is performed in acts as a third resonator, which also affects the timbre of the instrument. In our case, the room was simulated using a computer model with early reflections and late reverberation.

Tone generators for soprano saxophone, from left to right: Chinese bawu, soprano saxophone, bassoon reed, cornetto.

In general, it is difficult to assess the effect of a mouthpiece and a resonator individually, because both vary across instruments. The trumpet, for example, has a narrow cylindrical bore with a brass mouthpiece; the saxophone has a wide conical bore with a reed-based mouthpiece. To mitigate this effect, several tone generators were adapted for a soprano saxophone, including a brass mouthpiece from a cornetto, a bassoon reed, and a didgeridoo adapter made from a 140 cm folded PVC pipe that can also be attached to the saxophone. It turns out that exchanging tone generators changes the timbre of the saxophone significantly. The cornetto mouthpiece gives the instrument a much mellower tone. Like the baroque cornetto, the instrument then sounds better in a bright room with plenty of high frequencies, while the saxophone is at home in a 19th-century concert hall with a steeper roll-off at high frequencies.

Cardiovascular Effects of Noise on Man

Wolfgang Babisch – wolfgang.babisch@t-online.de
Himbeersteig 37
14129 Berlin, Germany

Presented Tuesday afternoon, May 19, 2015
169th ASA Meeting, Pittsburgh

Sound penetrates our life everywhere. It is an essential component of our social life. We need it for communication, orientation and as a warning signal. The auditory system continuously analyzes acoustic information, including unwanted and disturbing sound, which is filtered and interpreted by different cortical (conscious perception and processing) and sub-cortical brain structures (non-conscious perception and processing). The terms "sound" and "noise" are often used synonymously. Sound becomes noise when it causes adverse health effects such as annoyance, sleep disturbance, cognitive impairment, and mental or physiological disorders, including hearing loss and cardiovascular disorders. The evidence is increasing that ambient noise levels below hearing-damaging intensities are associated with the occurrence of metabolic disorders (type 2 diabetes), high blood pressure (hypertension), coronary heart disease (including myocardial infarction), and stroke. Environmental noise from transportation sources, including road, rail and air traffic, is increasingly recognized as a significant public health issue.

Systematic research on the non-auditory physiological effects of noise has been carried out for a long time starting in the post war period of the last century. The reasoning that long-term exposure to environmental noise causes cardiovascular health effects is based on the following experimental and empirical findings:

  • Short-term laboratory studies carried out on humans have shown that exposure to noise affects the autonomic nervous system and the endocrine system. Heart rate, blood pressure, cardiac output, blood flow in peripheral blood vessels and stress hormones (including epinephrine, norepinephrine and cortisol) are affected. At moderate environmental noise levels such acute reactions are found particularly when the noise interferes with activities of the individuals (e.g. concentration, communication, relaxation).
  • Noise-induced instantaneous autonomic responses do not only occur in waking hours, but also in sleeping subjects even when they report not being disturbed by the noise.
  • The responses do not adapt on a long-term basis. Subjects who had lived for several years in a noisy environment still respond to acute noise stimuli.
  • The long-term effects of chronic noise exposure have been studied in animals at high noise levels showing manifest vascular changes (thickening of vascular walls) and alterations in the heart muscle (increases of connective tissue) that indicate an increased aging of the heart and a higher risk of cardiovascular mortality.
  • Long-term effects of chronic noise exposure in humans have been studied in workers exposed to high noise levels in the occupational environment showing higher rates of hypertension and ischemic heart diseases in exposed subjects compared with less exposed subjects.

These findings make it plausible to deduce that similar long-term effects of chronic noise exposure may also occur at comparably moderate or low environmental noise levels. It is important to note that non-auditory noise effects do not follow the toxicological principle of dosage. This means that it is not simply the accumulated total sound energy that causes the adverse effects. Instead, the individual situation and the disturbed activity need to be taken into account (time-activity patterns). It may very well be that an average sound pressure level of 85 decibels (dB) at work causes less of an effect than 65 dB at home when carrying out mental tasks or relaxing after a stressful day, or 50 dB when asleep. This makes a substantial difference compared with many other environmental exposures where the accumulated dose is the hazardous factor, e.g. air pollution ("dealing with decibels is not like summing up micrograms as we do for chemical exposures").
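The quote above alludes to the logarithmic nature of the decibel scale: sound levels combine by energy, not by simple addition. A small sketch (with illustrative numbers only, not taken from any study in the text) computes the energy-equivalent continuous level Leq over periods spent at different levels:

```python
import math

def leq_db(periods):
    """Energy-equivalent continuous sound level over (level_dB, hours) periods.
    Levels are averaged on an energy basis: Leq = 10*log10(mean of 10^(L/10))."""
    total_hours = sum(hours for _, hours in periods)
    mean_energy = sum(hours * 10.0 ** (level / 10.0) for level, hours in periods) / total_hours
    return 10.0 * math.log10(mean_energy)

# Illustrative day: 8 h at 85 dB (work) plus 16 h at 65 dB (home).
print(round(leq_db([(85.0, 8), (65.0, 16)]), 1))  # → 80.3
```

The 24-hour average of about 80 dB is dominated almost entirely by the loud work period; twice as long at 65 dB barely moves it. This is the sense in which decibels do not sum like micrograms, and why, as the text argues, the disturbed activity matters as much as the accumulated energy.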

The general stress theory is the rationale and biological model for the non-auditory physiological effects of noise on man. According to the general stress concept, repeated temporal changes in biological responses disturb the biorhythm and cause permanent dysregulation, resulting in physiological and metabolic imbalance and disturbed homeostasis of the organism, leading to chronic diseases in the long run. In principle, a variety of body functions may be affected, including, for example, the cardiovascular system, the gastrointestinal system, and the immune system. Noise research has focused on cardiovascular health outcomes because cardiovascular diseases have a high prevalence in the general population. Noise-induced cardiovascular effects may therefore be relevant for public health, and they provide a strong argument for noise abatement policies within the global context of adverse health effects due to community noise, including annoyance and sleep disturbance.

Figure 1 shows a simplified reaction scheme used in epidemiological noise research. It simplifies the cause-effect chain i.e.: sound > disturbance > stress response > (biological) risk factors > disease. Noise affects the organism either directly through nervous interactions of the acoustic nerve with other regions of the central nervous system, or indirectly through the emotional and the cognitive perception of sound. The objective noise exposure (sound level) and the subjective noise exposure (annoyance) may both be predictors in the relationship between noise and health endpoints. The direct, non-conscious, pathway may be predominant in sleeping subjects.

The body of epidemiological studies on the association between transportation noise (mainly road traffic and aircraft noise) and cardiovascular diseases (hypertension, coronary heart disease, stroke) has grown considerably in recent years. Most of the studies suggest a continuous increase in risk with increasing noise level. Exposure modifiers such as long years of residence and the location of rooms (facing the street) have been associated with a stronger risk, supporting the causal interpretation of the findings. The question is no longer whether environmental noise causes cardiovascular disorders; the question is rather to what extent (the slope of the exposure-response curve) and above which threshold (the empirical onset of the exposure-response curve, i.e. the reference level). Different noise sources differ in their characteristics with respect to the maximum noise level, the time course including the number of events, the rise time of a single event, the frequency spectrum, the tonality and the informational content. In principle, different exposure-response curves must therefore be considered for different noise sources. This not only applies to noise annoyance, where aircraft noise is found to be more annoying than road traffic noise and railway noise (at the same average noise level), but may, in principle, also be true for the physiological effects of noise.

So-called meta-analyses have been carried out, pooling the results of relevant studies on the same associations to derive common exposure-response relationships that can be used for quantitative risk assessment. Figure 2 shows pooled exposure-response relationships for the associations between road traffic noise and hypertension (24 studies, weighted pooled reference level 50 dB), road traffic noise and coronary heart disease (14 studies, weighted pooled reference level 52 dB), aircraft noise and hypertension (5 studies, weighted pooled reference level 49 dB), and aircraft noise and coronary heart disease (3 studies, weighted pooled reference level 48 dB). Different noise indicators were converted to the 24-hour day(+0 dB)-evening(+5 dB)-night(+10 dB)-weighted annual A-weighted equivalent continuous sound pressure level Lden, which is commonly used for noise mapping in Europe and elsewhere and refers to the most exposed façade of the buildings. The curves suggest increases in risk (hypertension, coronary heart disease) of between 5 and 10 percent per 10 dB increase of the noise indicator Lden, starting at noise levels around 50 dB. This corresponds to night noise levels Lnight approximately 10 dB lower, i.e. around 40 dB. According to the graphs, subjects who live in areas where the ambient average noise level Lden exceeds 65 dB run an approximately 15-25 percent higher risk of cardiovascular diseases than subjects who live in comparably quiet areas. With respect to high blood pressure, the risk tends to be larger for aircraft noise than for road traffic noise, which may have to do with the fact that people do not have access to a quiet side when the noise comes from above. However, the number of aircraft noise studies is much smaller than the number of road traffic noise studies, and more research is needed in this field. Nevertheless, the available data provide information for taking action.
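The Lden indicator mentioned above splits the 24-hour day into 12 day hours, 4 evening hours (+5 dB penalty) and 8 night hours (+10 dB penalty), energy-averaged on the decibel scale. A minimal sketch of that computation (the façade levels in the example are illustrative, not from the pooled studies):

```python
import math

def lden_db(l_day, l_evening, l_night):
    """Day-evening-night level: 12 h day (+0 dB), 4 h evening (+5 dB) and
    8 h night (+10 dB) penalties, energy-averaged over 24 h."""
    energy = (12 * 10.0 ** (l_day / 10.0)
              + 4 * 10.0 ** ((l_evening + 5.0) / 10.0)
              + 8 * 10.0 ** ((l_night + 10.0) / 10.0))
    return 10.0 * math.log10(energy / 24.0)

# Illustrative facade levels (not from the pooled studies):
print(round(lden_db(60.0, 57.0, 50.0), 1))  # → 60.4
```

Because of the +10 dB penalty, nighttime exposure weighs heavily: a constant 50 dB around the clock already yields an Lden of roughly 56 dB, which is why Lden values sit about 10 dB above the corresponding Lnight.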

The decision on critical noise levels and "accepted" public health risks within a social and economic context is not a scientific one but a political one. Expert groups had concluded that average A-weighted road traffic noise levels at the façades of houses exceeding 65 dB during the day and 55 dB during the night were to be considered detrimental to health. New studies that were able to assess the noise level in more detail at the lower end of the exposure range (e.g. including secondary roads) tended to find lower threshold values for the onset of the increase in risk than earlier studies where area-wide noise data were not available (e.g. only the primary road network). Based on current knowledge of the cardiovascular health effects of environmental noise, it seems justified to refine the recommendations towards lower critical noise levels, particularly with respect to exposure during the night. Sleep is an important modulator of cardiovascular function. Some studies showed stronger associations of cardiovascular outcomes with exposure during the night than with exposure during the day. Noise-disturbed sleep must therefore be considered a particular potential pathway for the development of cardiovascular disorders.

The WHO (World Health Organization) Regional Office for Europe is currently developing a new set of guidelines (“WHO Environmental Noise Guidelines for the European Region”) to provide suitable scientific evidence and recommendations for policy makers of the Member States in the European Region. The activity can be viewed as an initiative to update the WHO Community Noise Guidelines from 1999 where cardiovascular effects of environmental noise were not explicitly considered in the recommendations. This may change in the new version of the document.


Figure 1. Noise reaction model according to Babisch (2014) [Babisch, W. (2014). Updated exposure-response relationship between road traffic noise and coronary heart diseases: A meta-analysis. Noise Health 16 (68): 1-9.]

Figure 2. Exposure-response relationships of the associations between transportation noise and cardiovascular health outcomes. Data taken from:

  • Babisch, W. and I. van Kamp (2009). Exposure-response relationship of the association between aircraft noise and the risk of hypertension. Noise Health 11 (44): 149-156.
  • van Kempen, E. and W. Babisch (2012). The quantitative relationship between road traffic noise and hypertension: a meta-analysis. Journal of Hypertension 30(6): 1075-1086.
  • Babisch, W. (2014). Updated exposure-response relationship between road traffic noise and coronary heart diseases: A meta-analysis. Noise Health 16 (68): 1-9.
  • Vienneau, D., C. Schindler, et al. (2015). The relationship between transportation noise exposure and ischemic heart disease: A meta analysis. Environmental Research 138: 372-380.

Note: Study-specific reference values were pooled after conversion to Lden using the derived meta-analysis weights of each study (according to Vienneau et al. (2015)).

Abbreviations: Road = road traffic noise, Air = aircraft noise, Hyp = hypertension, CHD = coronary heart disease

2pSC14 – Improving the Accuracy of Automatic Detection of Emotions From Speech

Reza Asadi and Harriet Fell

Popular version of poster 2pSC14 “Improving the accuracy of speech emotion recognition using acoustic landmarks and Teager energy operator features.”
Presented Tuesday afternoon, May 19, 2015, 1:00 pm – 5:00 pm, Ballroom 2
169th ASA Meeting, Pittsburgh

“You know, I can feel the fear that you carry around and I wish there was… something I could do to help you let go of it because if you could, I don’t think you’d feel so alone anymore.”
— Samantha, a computer operating system in the movie “Her”

Introduction
Computers that can recognize human emotions could react appropriately to a user's needs and provide more human-like interactions. Emotion recognition could also be used as a diagnostic tool for medical purposes, in onboard car systems that keep the driver alert when stress is detected, in similar systems in aircraft cockpits, and in electronic tutoring and interaction with virtual agents or robots. But is it really possible for computers to detect the emotions of their users?

During the past fifteen years, computer and speech scientists have worked on the automatic detection of emotion in speech. In order to interpret emotions from speech, the machine gathers acoustic information in the form of sound signals, extracts related information from the signals, and finds patterns which relate the acoustic information to the emotional state of the speaker. In this study, new combinations of acoustic feature sets were used to improve the performance of emotion recognition from speech. A comparison of feature sets for detecting different emotions is also provided.

Methodology
Three sets of acoustic features were selected for this study: Mel-Frequency Cepstral Coefficients, Teager Energy Operator features and Landmark features.

Mel-Frequency Cepstral Coefficients:
To produce vocal sounds, the vocal cords vibrate and produce periodic pulses, which result in the glottal wave. The vocal tract, starting at the vocal cords and ending at the mouth and nose, acts as a filter on the glottal wave. The cepstrum is a signal-analysis tool that is useful for separating the source from the filter in acoustic waves. Since the vocal tract acts as a filter on the glottal wave, we can use the cepstrum to extract information related only to the vocal tract.

The mel scale is a perceptual scale of pitches judged by listeners to be equal in distance from one another. Using mel frequencies in cepstral analysis approximates the human auditory system’s response more closely than using linearly spaced frequency bands. If we map the spectral energies of the original speech wave onto the mel scale and then perform cepstral analysis, we get Mel-Frequency Cepstral Coefficients (MFCCs). Previous studies have used MFCCs for speaker and speech recognition, and they have also been used to detect emotions.
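The two ingredients above can be sketched in a few lines of NumPy. This is a simplified illustration, not the full MFCC pipeline used in the study (which also involves mel filterbanks and a discrete cosine transform); the function names and the synthetic test frame are our own.

```python
import numpy as np

def hz_to_mel(f):
    # Standard mel-scale mapping: 1000 Hz corresponds to ~1000 mel.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def real_cepstrum(frame):
    # Cepstrum: inverse FFT of the log magnitude spectrum.
    # Source (pitch) and filter (vocal tract) separate along its axis.
    spectrum = np.fft.rfft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-10)
    return np.fft.irfft(log_mag)

# A crude glottal-like source: a 100 Hz pulse train, windowed
sr = 8000
t = np.arange(0, 0.064, 1.0 / sr)
frame = np.sign(np.sin(2 * np.pi * 100 * t))
ceps = real_cepstrum(frame * np.hamming(len(frame)))

print(round(hz_to_mel(1000.0), 1))  # 1000.0
```

Low-index cepstral coefficients describe the smooth spectral envelope (the vocal-tract filter), which is why they are the part kept for recognition tasks.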

Teager Energy Operator features:
Another approach to modeling speech production is to focus on the pattern of airflow in the vocal tract. While a person speaks in an emotional state such as panic or anger, physiological changes like muscle tension alter the airflow pattern, and this can be used to detect stress in speech. Because the airflow is difficult to model mathematically, Teager proposed the Teager Energy Operator (TEO), which computes the energy of the vortex-flow interaction at each instant in time. Previous studies show that TEO-related features contain information that can be used to determine stress in speech.
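The discrete TEO itself is remarkably simple. A minimal sketch, with a synthetic tone standing in for real speech: for a pure sinusoid A·sin(ωn), the operator’s output is exactly A²·sin²(ω), so it tracks both amplitude and frequency at once.

```python
import numpy as np

def teager_energy(x):
    # Discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]
    return x[1:-1] ** 2 - x[:-2] * x[2:]

# For a pure sinusoid A*sin(omega*n), the TEO output is constant
# at A^2 * sin(omega)^2, by the identity sin(a-b)sin(a+b) = sin^2 a - sin^2 b.
n = np.arange(1000)
omega = 0.1
tone = 2.0 * np.sin(omega * n)
psi = teager_energy(tone)
print(np.allclose(psi, 4.0 * np.sin(omega) ** 2))  # True
```

In stress-detection work, TEO-based features are typically computed per frequency band and summarized per frame; the single-tone check above only verifies the operator itself.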

Acoustic landmarks:
Acoustic landmarks are locations in the speech signal where important and easily perceptible speech properties are changing rapidly. Previous studies show that the number of landmarks in each syllable might reflect the underlying cognitive, mental, emotional, and developmental states of the speaker.
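Full landmark detection examines several frequency bands, but the core idea of finding abrupt change can be illustrated with a toy detector. This is only a crude proxy for real landmark algorithms; the function, its parameters, and the silence-then-noise test signal are illustrative assumptions.

```python
import numpy as np

def landmark_candidates(x, sr, frame_ms=10, jump_db=9.0):
    # Toy landmark detector: compute frame log-energy, then mark frames
    # where the energy jumps abruptly between consecutive frames.
    hop = int(sr * frame_ms / 1000)
    frames = x[: len(x) // hop * hop].reshape(-1, hop)
    log_e = 10 * np.log10(np.sum(frames ** 2, axis=1) + 1e-12)
    jumps = np.abs(np.diff(log_e))
    return np.where(jumps > jump_db)[0] * hop / sr  # landmark times in seconds

# Silence, then a burst of noise, then silence:
# expect one onset and one offset landmark.
sr = 16000
rng = np.random.default_rng(1)
x = np.concatenate([np.zeros(sr // 2),
                    rng.normal(0, 0.3, sr // 2),
                    np.zeros(sr // 2)])
times = landmark_candidates(x, sr)
print(len(times))  # 2: near 0.5 s (onset) and 1.0 s (offset)
```

Real landmark detectors distinguish landmark types (glottal, burst, sonorant) rather than just marking energy jumps, which is what makes per-syllable landmark counts informative.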

Figure 1 – Spectrogram (top) and acoustic landmarks (bottom) detected in a neutral speech sample

Sound File 1 – A speech sample with neutral emotion


Figure 2 – Spectrogram (top) and acoustic landmarks (bottom) detected in an anger speech sample

Sound File 2 – A speech sample with anger emotion


Classification:
The data used in this study came from the Linguistic Data Consortium’s Emotional Prosody Speech and Transcripts corpus. In this database, four actresses and three actors, all in their mid-20s, read a series of semantically neutral utterances (four-syllable dates and numbers) in fourteen emotional states. A description of each emotional state was given to the participants to help them articulate it in the proper emotional context. The acoustic features described previously were extracted from the speech samples in this database and used to train and test Support Vector Machine (SVM) classifiers with the goal of detecting emotions from speech. The target emotions included anger, fear, disgust, sadness, joy, and neutral.
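The train-and-test pipeline can be sketched with scikit-learn. The feature values below are synthetic stand-ins (the real corpus is licensed), and the feature dimensions, cluster shapes, and SVM settings are our assumptions, not the study’s actual configuration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
emotions = ["anger", "fear", "disgust", "sadness", "joy", "neutral"]

# Synthetic stand-in data: 60 utterances x 20 acoustic features per emotion,
# each emotion drawn from its own shifted Gaussian cluster.
X = np.vstack([rng.normal(loc=i, scale=0.5, size=(60, 20))
               for i in range(len(emotions))])
y = np.repeat(emotions, 60)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Feature scaling matters for SVMs; then fit an RBF-kernel classifier.
scaler = StandardScaler().fit(X_train)
clf = SVC(kernel="rbf", C=1.0).fit(scaler.transform(X_train), y_train)
accuracy = clf.score(scaler.transform(X_test), y_test)
print(accuracy > 0.9)  # True on this easily separable toy data
```

With real MFCC, TEO, and landmark features the classes overlap far more than these toy clusters, which is why the feature combinations studied in the paper matter.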

Results
The results of this study show an average detection accuracy of approximately 91% across these six emotions, about 9 percentage points better than a previous study conducted at CMU on the same data set.

Specifically, TEO features improved the detection of anger and fear, while landmark features improved the detection of sadness and joy. The classifier had its highest accuracy, 92%, in detecting anger and its lowest, 87%, in detecting joy.

On Bleats, in the Year of the Sheep

David G. Browning, 139 Old North Road, Kingston, RI 02881 decibeldb@aol.com

Peter M. Scheifele, Dept. of Communication Science, Univ. of Cincinnati, Cincinnati, OH 45267

A bleat is usually defined as the cry of a sheep or goat, but these are just two voices in a large worldwide animal chorus that we are only starting to understand.

A bleat is a simple, short burst of sound composed of harmonic tones. It is easily voiced by young or small animals, who make up the majority of bleaters. From deer to polar bears, and muskoxen to sea lions, the bleats of the young have enough character to allow easy detection, and possibly identification, by concerned mothers. As these animals mature, their voices usually shift lower, longer, and louder, and a vocabulary of other vocalizations develops.

For some notable exceptions, however, this is not the case. Sheep and goats, for example, retain bleating as their principal vocalization through adulthood; hence bleating is usually associated with them. Their bleats have been the most studied and show a characteristic varietal structure, as well as at least a limited ability to support maternal recognition of specific individuals.

As another example, at least four small varieties of toad, such as the Australian Bleating Toad and, in America, the Eastern Narrow-Mouthed Toad, are strong bleaters throughout their lives. Bleats provide them a signature signal that carries in the night and is easily repeatable and sustainable. But why these four amphibians? Our lack of an answer speaks to our still-limited knowledge of the vast field of animal communication.

Perhaps most interestingly, the Giant Panda retains bleating while developing a complex mix of other vocalizations. In the visually challenging environment of a dense bamboo thicket, they probably must retain every possible vocal tool to communicate. Researchers link their bleating to male size and female age.

In summary, bleating is an important aspect of youth for many animals; for some it is the principal vocalization for life; and for a few, a retained tool among many.