Andrew Brian Horner horner@cse.ust.hk
Department of Computer Science and Engineering, The Hong Kong University of Science and Technology, Hong Kong SAR
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Music has a unique way of moving us emotionally, but have you ever wondered how individual sounds shape these feelings?
In our study, we looked at how different features of violin notes—like pitch (the height of the notes), dynamics (the loudness of the sounds), and vibrato (how the note vibrates)—combine to create emotional responses. While previous research often focuses on each feature in isolation, we explored how they interact, revealing how the violin’s sounds evoke specific emotions.
To conduct this study, we used single-note recordings from the violin at different pitches, two levels of dynamics (loud and soft), and two vibrato settings (no vibrato and high vibrato). We invited participants to listen to these sounds and rate their emotional responses using a scale of emotional positivity (valence) and intensity (arousal). Participants also selected which emotions they felt from a list of 16 emotions, such as joyful, nervous, relaxed, or agitated.
Audio 1. The experiment used a violin single-note sample (middle C pitch + loud dynamics + no vibrato).
Audio 2. The experiment used a violin single-note sample (middle C pitch + soft dynamics + no vibrato).
Audio 3. The experiment used a violin single-note sample (middle C pitch + loud dynamics + high vibrato).
Audio 4. The experiment used a violin single-note sample (middle C pitch + soft dynamics + high vibrato).
Our findings reveal that each element plays a unique role in shaping emotions. As shown in Figure 1, higher pitches and strong vibrato generally raised emotional intensity, creating feelings of excitement or tension. Lower pitches were more likely to evoke sadness or calmness, while loud dynamics made emotions feel more intense. Surprisingly, sounds without vibrato were linked to calmer emotions, while vibrato added energy and excitement, especially for emotions like anger or fear. Figure 2 illustrates how strong vibrato enhances emotions like anger and sadness, while the absence of vibrato correlates with calmer feelings.
Figure 1. Pitch, Dynamics, and Vibrato average ratings on valence-arousal with different levels. It shows that higher pitches and strong vibrato increase arousal, while soft dynamics and no vibrato are linked to higher valence, highlighting pitch as the most influential factor.
Figure 2. Pitch, Dynamics, and Vibrato average ratings on 16 emotions. It shows that strong vibrato enhances angry and sad emotions, while no vibrato supports calm emotions; higher pitches increase arousal for angry emotions, and brighter tones evoke calm and happy emotions.
Our research provides insights for musicians, composers, and even music therapists, helping them understand how to use the violin’s features to evoke specific emotions. With this knowledge, violinists can fine-tune their performance to match the emotional impact they aim to create, and composers can carefully select sounds that resonate with listeners’ emotional expectations.
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Have you ever listened to a song and later been surprised to hear the artist speak with a different accent than the one you heard in the song? Take country singer Keith Urban’s song “What About Me” for instance; when listening, you might assume that he has a Southern American (US) English accent. However, in his interviews, he speaks with an Australian English accent. So why did you think he sounded Southern?
Research suggests that specific accents or dialects are associated with musical genres [2], that singers adjust their accents based on genre [4], and that foreign accents are more difficult to recognize in songs than in speech [5]. However, when listeners perceive an accent in a song, it is unclear which type of information they rely on: the acoustic speech information or information about the musical genre. Our previous research investigated this question for Country and Reggae music and found that genre recognition may play a larger role in dialect perception than the actual sound of the voice [9].
Our current study explores American Blues and Folk music, genres that allow for easier separation of vocals from instrumentals, with more refined stimuli manipulation. Blues is strongly associated with African American English [3], while Folk can be associated with a variety of (British, American, etc.) dialects [1]. Participants listened to manipulated clips of sung and “spoken” lines taken from songs in both genres, which were transcribed for participants (see Figure 1). AI applications were used to remove instrumentals for both sung and spoken clips, while “spoken” clips also underwent rhythm and pitch normalization so that they sounded like spoken rather than sung speech. After hearing each sung or spoken line, participants were asked to identify the dialect they heard from six options [7, 8] (see Figure 2).
Figure 1: Participant view of a transcript from a Folk song clip.
Figure 2: Participant view of six dialect options after hearing a clip.
Participants were much more confident and accurate in categorizing accents for clips in the Sung condition, regardless of genre. The proportion of uncertainty (“Not Sure” responses) in the Spoken condition was consistent across genres (see “D” in Figure 3), suggesting that participants were more certain of dialect when musical cues were present. Dialect categories followed genre expectations, as can be seen from the increase in identifying African American English for Blues in the Sung condition (see “A”). Removing uncertainty by adding genre cues did not increase the likelihood of “Irish English” or “British English” being chosen for Blues, though it did for Folk (see “B” and “C” in Figure 3), in line with genre-based expectations.
Figure 3: Participant dialect responses.
These findings enhance our understanding of the relationship between musical genre and accent. Referring again to the example of Keith Urban, the singer’s stylistic accent change may not be the only culprit for our interpretation of a Southern drawl. Rather, we may have assumed we were listening to a musician with a Southern American English accent when we heard the first banjo-like twang or tuned into iHeartCountry Radio. When we listen to a song and perceive a singer’s accent, we are not only listening to the sounds of their speech, but are also shaping our perception through our genre-based expectations of dialect.
References:
Carrigan, J., Henry L. (2004). Lornell, Kip. The NPR Curious Listener’s Guide to American Folk Music. Library Journal (1976), 129(19), 63.
De Timmerman, Romeo, et al. (2024). The globalization of local indexicalities through music: African‐American English and the blues. Journal of Sociolinguistics, 28(1), 3–25. https://doi.org/10.1111/josl.12616.
Gibson, A. M. (2019). Sociophonetics of popular music: insights from corpus analysis and speech perception experiments [Doctoral dissertation, University of Canterbury]. http://dx.doi.org/10.26021/4007.
Mageau, M., Mekik, C., Sokalski, A., & Toivonen, I. (2019). Detecting foreign accents in song. Phonetica, 76(6), 429–447. https://doi.org/10.1159/000500187.
RStudio. (2020). RStudio: Integrated Development for R. RStudio, PBC, Boston, MA. http://www.rstudio.com/.
Stoet, G. (2010). PsyToolkit – A software package for programming psychological experiments using Linux. Behavior Research Methods, 42(4), 1096-1104.
Stoet, G. (2017). PsyToolkit: A novel web-based method for running online questionnaires and reaction-time experiments. Teaching of Psychology, 44(1), 24-31.
Walter, M., Bengtson, G., Maitinsky, M., Islam, M. J., & Gick, B. (2023). Dialect perception in song versus speech. The Journal of the Acoustical Society of America, 154(4_supplement), A161. https://doi.org/10.1121/10.0023131.
Andrew Brian Horner horner@cse.ust.hk
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Hong Kong SAR
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Music speaks to us across cultures, but can the instruments we choose shape our emotions in different ways?
This study compares the emotional responses evoked by two similar yet culturally distinct string instruments: the Chinese erhu and the Western violin. Both are bowed string instruments, but they have distinct sounds and cultural roles that could lead listeners to experience different emotions. Our research focuses on whether these instruments, along with variations in performance and listener familiarity, influence emotional intensity in unique ways.
Western violin performance example: violinist Ray Chen playing ‘Mendelssohn Violin Concerto in E minor, Op. 64’
Chinese erhu performance example: erhu player Guo Gan playing the Chinese piece ‘Horse Racing’ (feat. Pianist Lang Lang)
To explore these questions, we conducted three online listening experiments. Participants were asked to listen to a series of short musical pieces performed on both the erhu and violin. They then rated each piece using two emotional measures: specific emotion categories (such as happy, sad, calm, and agitated) and emotional positivity and intensity.
Our results show clear emotional differences between the instruments. The violin often evokes positive, energetic emotions, which may be due to its bright tone and dynamic range. By contrast, the erhu tends to evoke sadness, possibly because of its softer timbre and its traditional association with melancholy in Chinese music.
Interestingly, familiarity with the instrument played a significant role in listeners’ emotional responses. Those who were more familiar with the violin rated the pieces as more emotionally intense, suggesting that cultural background and previous exposure shape how we emotionally connect with music. However, our analysis also found that different performances of the same piece generally did not change emotional ratings, emphasizing that the instrument itself is a major factor in shaping our emotional experience.
These findings open new paths for understanding how cultural context and personal experiences influence our emotional reactions to music. The distinct emotional qualities of the erhu and violin reveal how musical instruments can evoke different emotional responses, even when playing the same piece.
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
Imagine being in a voice lesson, and as you try to hit a high note, your voice coach says, “suppress your tongue” or “pretend your tongue doesn’t exist!” What does this mean, and why do singers do this?
One vocal technique used by professional singers is to sing in different vocal registers. Generally, a man’s natural speaking voice and the voice people use to sing lower notes are called the chest voice: you can feel a vibration in your chest if you place your hand over it as you vocalize. When moving to higher notes, singers shift to their head voice, where vibrations feel stronger in the head. But what role does the tongue play in this transition? Do all singers, including amateurs, naturally adjust their tongue when switching registers, or is this adjustment a learned skill?
Figure 1: Approximate location of feeling/sensation for chest and head voice.
We are interested in vowels and the pitch range during the passaggio, which is the shift or transition point between different vocal registers. The voice is very unstable and prone to audible cracking during the passaggio, and singers are trained to navigate it smoothly. We also know that different vowels are produced in different locations in the mouth and possess different qualities. One way that singers successfully navigate the passaggio is by altering the vowel through slight adjustments to tongue shape. To study this, we utilized ultrasound imaging to monitor the position and shape of the tongue while participants with varying levels of vocal training sang vowels across their pitch range, similar to a vocal warm-up.
Video 1: Example of ultrasound recording
The results indicated that, in head voice, the tongue is generally positioned higher in the mouth than in chest voice. Unsurprisingly, this difference is more pronounced for certain vowels than for others.
Figure 2: Tongue position in chest and head voice for front and back vowel groups. Overlapping shades indicate that there is virtually no difference.
Singers’ tongues are also shaped by training. Recall the voice coach’s advice to lower your jaw and tongue while singing—this technique is employed to create more space in the mouth to enhance resonance and vocal projection. Indeed, trained singers generally have a lower overall tongue position.
As professional singers’ transitions between registers sound more seamless, we speculated that trained singers would exhibit smaller differences in tongue position between registers than untrained singers, who have less developed tongue control. In fact, it turns out that the opposite is true: the tongue behaves differently in chest voice and head voice, but only for individuals with vocal training.
Figure 3: Tongue position in chest and head voice for singers with different levels of training.
In summary, our research suggests that tongue adjustments for register shifts may be a learned technique. The manner in which singers adjust their tongues for different vowels and vocal registers could be an essential component in achieving a seamless transition between registers, as well as in the effective use of various vocal qualities. Understanding the interactions among vowels, registers, and the tongue provides insight into the mechanisms of human vocal production and voice pedagogy.
Department of Performing Arts, American University, Washington, DC, 20016, United States
Braxton Boren, Department of Performing Arts, American University
X (twitter): @bbboren
Popular version of 2pAAa12 – Acoustics of two Hindu temples in southern India
Presented at the 186th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0027050
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
What is the history behind the sonic experiences of millions of devotees of one of the oldest religions in the world?
Hindu temple worship dates back over 1,500 years. Vedic scriptures from the 5th century C.E. describe the rules for temple construction. Sound is a key component of Hindu worship and, consequently, of its temples. Acoustically important aspects include the striking of bells and gongs, the blowing of conch shells, and the chanting of the Vedas. The bells, gongs, and conch shells all have specific fundamental frequencies and unique sonic characteristics, while the chanting is specifically stylized to include phonetic characteristics such as pitch, duration, emphasis, and uniformity. This prominence of the frequency domain in the soundscape makes Hindu worship unique. In this study, we analyzed the acoustic characteristics of two UNESCO heritage temples in Southern India.
Figure 1: Virupaksha temple, Pattadakal
The Virupaksha temple in Pattadakal, built around 745 C.E., is part of one of the largest and most ancient temple complexes in India.¹ We performed a thorough analysis of the space, taking sine sweep measurements from 36 different source-receiver positions. The mid-frequency reverberation time (the time it takes for the sound level to decay by 60 dB) was found to be 2.1 s, and the clarity index for music, C80, was -0.9 dB. The clarity index compares the sound energy arriving in the first 80 milliseconds with the energy arriving later; it tells us how balanced the space is and how well complex passages of music can be heard. A reverberation time of 2.1 s is comparable to that of a modern concert hall, and a C80 of -0.9 dB means that the space is also very good for complex music. The music performed would be a combination of vocal and instrumental South Indian music, with a melodic framework akin to the melodic modes of Western classical music, set to different time signatures and played at tempi ranging from very slow (40-50 beats per minute) to very fast (200+ beats per minute).
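For readers curious how these two numbers are obtained from a measured room response, the following is a minimal sketch using their standard definitions. It is not the processing used in this study, which is not described here. It assumes a simple synthetic impulse response (a pure exponential decay) in place of real temple measurements, and all function names are hypothetical. Reverberation time is estimated by backward (Schroeder) integration of the squared response and a straight-line fit between -5 dB and -25 dB, extrapolated to a 60 dB decay; C80 is the ratio of energy before and after 80 ms, in decibels.

```python
import math

def synthetic_impulse_response(t60, fs, duration):
    """Hypothetical stand-in for a measurement: amplitude falls 60 dB in t60 seconds."""
    return [10 ** (-3.0 * n / (fs * t60)) for n in range(int(duration * fs))]

def schroeder_decay_db(ir):
    """Energy remaining after each sample, in dB relative to the total energy."""
    energy = [x * x for x in ir]
    total = sum(energy)
    remaining = total
    curve = []
    for e in energy:
        if remaining > 0:
            curve.append(10.0 * math.log10(remaining / total))
        else:
            curve.append(float("-inf"))
        remaining -= e
    return curve

def reverberation_time(ir, fs, start_db=-5.0, end_db=-25.0):
    """Fit a line to the decay curve between start_db and end_db, extrapolate to 60 dB."""
    curve = schroeder_decay_db(ir)
    pts = [(i / fs, lvl) for i, lvl in enumerate(curve) if end_db <= lvl <= start_db]
    mean_t = sum(t for t, _ in pts) / len(pts)
    mean_l = sum(l for _, l in pts) / len(pts)
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in pts)
             / sum((t - mean_t) ** 2 for t, _ in pts))  # dB per second, negative
    return -60.0 / slope

def clarity_c80(ir, fs):
    """Early (0-80 ms) to late (after 80 ms) energy ratio in dB."""
    split = int(round(0.080 * fs))
    early = sum(x * x for x in ir[:split])
    late = sum(x * x for x in ir[split:])
    return 10.0 * math.log10(early / late)

fs = 1000  # assumed sample rate in Hz, kept low so the sketch runs quickly
ir = synthetic_impulse_response(t60=2.1, fs=fs, duration=3.0)
print(f"Estimated reverberation time: {reverberation_time(ir, fs):.2f} s")
print(f"C80: {clarity_c80(ir, fs):.2f} dB")
```

A real measurement would first be filtered into octave bands and averaged over the source-receiver positions; an idealized exponential decay like this one will not reproduce the measured C80, because real rooms have strong early reflections.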
Figure 2: The sine sweep measurement process in progress at the Virupaksha temple, Pattadakal
The second site was the 15th-century Vijaya Vittala temple in Hampi, which is another major tourist attraction. Here the poet, composer, and father of South Indian classical music, Purandara Dasa, spent many years creating compositions in praise of the deity. He is known to have created thousands of compositions in many complex melodic modes.
Measurements at this site spanned 29 source-receiver positions, with the mid-frequency reverberation time being 2.5 s and the clarity index for music, C80, being -1.7 dB. These values also fall in the ideal range for complex music to be interpreted clearly. Based on these findings, we conclude that the Vijaya Vittala temple provided the optimum acoustical conditions for the performance and appreciation of Purandara Dasa’s compositions and South Indian classical music more broadly.
Other standard room acoustic metrics have been calculated and analyzed from the temples’ sound decay curves. We will use these data to build wave-based computer simulations, further analyze the resonant modes in the temples, and study the sonic characteristics of the bells, gongs, and conch shells, in order to understand the relationship between the worship ceremony and the architecture of the temples. We also plan to auralize compositions of Purandara Dasa to recreate his experience in the Vijaya Vittala temple 500 years ago.
1 Alongside the ritualistic sounds discussed earlier, music performance holds a vital place in Hindu worship. The Virupaksha temple, in particular, has a rich history of fulfilling this role, as evidenced by inscriptions detailing grants given to temple musicians by the local queen.