Department of Performing Arts, American University, Washington, DC, 20016, United States
Braxton Boren, Department of Performing Arts, American University
X (twitter): @bbboren
Popular version of 2pAAa12 – Acoustics of two Hindu temples in southern India
Presented at the 186th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0027050
–The research described in this Acoustics Lay Language Paper may not have yet been peer reviewed–
What is the history behind the sonic experiences of millions of devotees of one of the oldest religions in the world?
Hindu temple worship dates back over 1,500 years, and Vedic scriptures from the 5th century C.E. describe the rules for temple construction. Sound is a key component of Hindu worship and, consequently, of its temples. Acoustically important elements include the striking of bells and gongs, the blowing of conch shells, and the chanting of the Vedas. The bells, gongs, and conch shells each have specific fundamental frequencies and distinctive sonic characteristics, while the chanting is deliberately stylized, with prescribed phonetic features such as pitch, duration, emphasis, and uniformity. This prominence of the frequency-domain soundscape makes Hindu worship unique. In this study, we analyzed the acoustic characteristics of two UNESCO World Heritage temples in southern India.
Figure 1: Virupaksha temple, Pattadakal
The Virupaksha temple in Pattadakal, built around 745 C.E., is part of one of the largest and oldest temple complexes in India.1 We performed a thorough acoustic survey of the space, taking sine-sweep measurements at 36 different source-receiver positions. The mid-frequency reverberation time (the time it takes sound to decay by 60 dB) was 2.1 s, and the clarity index for music, C80, was -0.9 dB. The clarity index tells us how well the space balances reverberance against the ability to hear complex passages of music distinctly. A reverberation time of 2.1 s is similar to that of a modern concert hall, and a C80 of -0.9 dB indicates the space also suits complex music. The music performed there would have been a combination of vocal and instrumental South Indian music, with melodic frameworks akin to the melodic modes of Western classical music, set to different time signatures and played at tempi ranging from very slow (40-50 beats per minute) to very fast (over 200 beats per minute).
Figure 2: The sine sweep measurement process in progress at the Virupaksha temple, Pattadakal
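For readers curious how such numbers are obtained, the sketch below shows how the two metrics quoted above are commonly computed from a room impulse response (obtained in practice by deconvolving the recorded sine sweeps). This is the generic textbook procedure, not our measurement pipeline: the impulse response here is a synthetic decaying-noise stand-in, and the octave-band filtering needed for a true mid-frequency average is omitted for brevity.

```python
# Sketch only: reverberation time (T60, extrapolated from a T30 fit) and the
# music clarity index C80 from a room impulse response. Assumes the direct
# sound arrives at sample 0 and that band filtering is done elsewhere.
import numpy as np

def schroeder_decay_db(ir):
    """Backward-integrated energy decay curve in dB (Schroeder integration)."""
    energy = np.cumsum(ir[::-1] ** 2)[::-1]
    return 10.0 * np.log10(energy / energy.max())

def t60_from_t30(ir, fs):
    """T60 extrapolated from the -5 dB to -35 dB slope of the decay curve."""
    edc = schroeder_decay_db(ir)
    t = np.arange(len(edc)) / fs
    fit = (edc <= -5.0) & (edc >= -35.0)          # T30 evaluation range
    slope, _ = np.polyfit(t[fit], edc[fit], 1)    # decay rate in dB per second
    return -60.0 / slope

def c80(ir, fs):
    """Clarity index: early (0-80 ms) vs. late energy ratio, in dB."""
    k = int(0.080 * fs)
    early = np.sum(ir[:k] ** 2)
    late = np.sum(ir[k:] ** 2)
    return 10.0 * np.log10(early / late)

# Example with synthetic exponentially decaying noise standing in for an IR:
fs = 48000
t = np.arange(int(2.5 * fs)) / fs
ir = np.random.randn(len(t)) * np.exp(-3.0 * np.log(10) * t / 2.1)  # ~2.1 s T60
print(f"T60 ~ {t60_from_t30(ir, fs):.2f} s, C80 ~ {c80(ir, fs):.1f} dB")
```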
The second site was the 15th-century Vijaya Vittala temple in Hampi, another major tourist attraction. Here the poet and composer Purandara Dasa, regarded as the father of South Indian classical music, spent many years creating compositions in praise of the deity. He is said to have created thousands of compositions in many complex melodic modes.
Measurements at this site spanned 29 source-receiver positions, with a mid-frequency reverberation time of 2.5 s and a clarity index for music, C80, of -1.7 dB. These values also fall in the ideal range for complex music to be heard clearly. Based on these findings, we conclude that the Vijaya Vittala temple provided excellent acoustical conditions for the performance and appreciation of Purandara Dasa's compositions and of South Indian classical music more broadly.
We have also calculated and analyzed other standard room-acoustic metrics from the temples' sound decay curves. We will use these data to build wave-based computer simulations, further analyze the resonant modes of the temples, and study the sonic characteristics of the bells, gongs, and conch shells in order to understand the relationship between the worship ceremony and the architecture of the temples. We also plan to auralize compositions of Purandara Dasa to recreate his experience in the Vijaya Vittala temple 500 years ago.
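In its simplest form, auralization reduces to convolving a dry (anechoic) recording with a measured impulse response of the space. The sketch below shows only that step; the file names are placeholders, and the soundfile and scipy packages are assumed.

```python
# Hypothetical sketch: basic auralization by convolving a dry recording with a
# measured impulse response. File names are placeholders, not real files.
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

dry, fs = sf.read("dry_vocal_take.wav")          # anechoic source recording
ir, fs_ir = sf.read("vittala_temple_ir.wav")     # measured impulse response
assert fs == fs_ir, "resample one of the files if the rates differ"
if dry.ndim > 1:
    dry = dry.mean(axis=1)                        # fold to mono for simplicity
if ir.ndim > 1:
    ir = ir.mean(axis=1)

wet = fftconvolve(dry, ir)                        # place the source 'in' the room
wet /= np.max(np.abs(wet))                        # normalise to avoid clipping
sf.write("auralized_in_temple.wav", wet, fs)
```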
1 Alongside the ritualistic sounds discussed earlier, music performance holds a vital place in Hindu worship. The Virupaksha temple, in particular, has a rich history of fulfilling this role, as evidenced by inscriptions detailing grants given to temple musicians by the local queen.
Department of Music Acoustics, University of Music and Performing Arts Vienna, Vienna, 1030, Austria
Alex Hofmann, Department of Music Acoustics, University of Music and Performing Arts Vienna
Popular version of 5aMU6 – Two-dimensional playability maps for single-reed woodwind instruments
Presented at the 185th ASA Meeting
Read the abstract at https://doi.org/10.1121/10.0023675
Please keep in mind that the research described in this Lay Language Paper may not have yet been peer reviewed.
Musicians show incredible flexibility when generating sounds with their instruments. Nevertheless, some control parameters need to stay within certain limits for this to occur. Take, for example, a clarinet player: using too much or too little blowing pressure results in no sound at all, so the required pressure (which depends on the note being played and other instrument properties) must stay within a certain range. One way to study these limits is to generate 'playability diagrams'. Such diagrams have commonly been used to analyze bowed-string instruments, but may also be informative for wind instruments, as suggested by Woodhouse at the 2023 Stockholm Music Acoustics Conference. Following this direction, playability maps can highlight the playable regions of a musical instrument as certain control parameters are varied, and may eventually support performers in choosing their equipment.
One way to fill in these diagrams is via physical modeling simulations, which predict the sound generated while some control parameters are slowly varied. Figure 1 shows such an example, where a playability region is obtained while varying the blowing pressure and the stiffness of the clarinet reed. (Strictly, the parameter on the y-axis is the effective stiffness per unit area of the reed, that is, the reed stiffness once it has been mounted on the mouthpiece with the musician's lip in contact with it.) Black regions indicate 'playable' parameter combinations, whereas white regions indicate combinations where no sound is produced.
Figure 1: Pressure-stiffness playability map. The black regions correspond to parameter combinations that generate sound.
One observation is that, when players wish to play with a larger blowing pressure (resulting in louder sounds), they should use stiffer reeds. As the plot indicates, for a reed with a stiffness per area of 0.6 Pa/m (a soft reed) it is not possible to generate a note with a blowing pressure above 2750 Pa. With a harder reed (say, a stiffness of 1 Pa/m) one can play at larger blowing pressures, but in that case it is impossible to play with a pressure below 3200 Pa. Varying other control parameters could highlight similar effects for other instrument properties. For instance, playability maps for different mouthpiece geometries could be obtained, which would be valuable information for musicians and instrument makers alike.
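The sweep behind such a map can be illustrated with a toy model. The sketch below is not the model used to produce Figure 1: it couples a quasi-static single-reed nonlinearity to an idealized delay-line bore with a lumped reflection coefficient, runs this toy simulation over a grid of blowing pressures and stiffness-per-area values, and flags a grid point as 'playable' when the simulated mouthpiece pressure still oscillates at the end of the run. All physical constants are generic assumed values (on a different scale from the axis values quoted above), and points near the oscillation threshold may be misclassified by the crude amplitude test.

```python
# Toy playability-map sweep with a simplified single-reed model. Illustrative
# only: quasi-static reed nonlinearity + delay-line bore with a frequency-
# independent reflection coefficient. All constants are assumed values.
import numpy as np

FS = 44100            # sample rate (Hz)
F0 = 147.0            # target note (Hz), sets the delay-line length
DUR = 0.3             # simulated time per grid point (s)
R_LOSS = 0.95         # lumped reflection coefficient (<1 adds losses)
RHO = 1.2             # air density (kg/m^3)
ZC = 2.3e6            # bore characteristic impedance (Pa*s/m^3), assumed
W = 1.2e-2            # effective reed-channel width (m), assumed
H0 = 4.0e-4           # reed opening at rest (m), assumed

def reed_flow(p, gamma, zeta):
    """Dimensionless flow through the reed channel for bore pressure p."""
    dp = gamma - p                               # pressure drop across the reed
    opening = np.clip(1.0 - dp, 0.0, None)       # reed closes when dp >= 1
    return zeta * opening * np.sign(dp) * np.sqrt(np.abs(dp))

def simulate_grid(p_mouth, k_a):
    """Return True where a (blowing pressure, stiffness/area) pair 'plays'."""
    PM, KA = np.meshgrid(p_mouth, k_a, indexing="ij")
    p_close = KA * H0                            # static reed-closing pressure
    gamma = PM / p_close                         # dimensionless blowing pressure
    zeta = ZC * W * H0 * np.sqrt(2.0 / (RHO * p_close))

    D = int(round(FS / (2.0 * F0)))              # round-trip delay in samples
    hist = np.zeros((D,) + PM.shape)             # circular buffer of outgoing waves
    n_steps = int(DUR * FS)
    tail = np.zeros((int(0.05 * FS),) + PM.shape)

    for n in range(n_steps):
        y = -R_LOSS * hist[n % D]                # incoming wave from the bore
        # solve u = reed_flow(u + 2y) by bisection, vectorised over the grid
        lo = np.full(PM.shape, -10.0)
        hi = np.full(PM.shape, 10.0)
        for _ in range(30):
            mid = 0.5 * (lo + hi)
            too_low = mid - reed_flow(mid + 2.0 * y, gamma, zeta) < 0.0
            lo = np.where(too_low, mid, lo)
            hi = np.where(too_low, hi, mid)
        u = 0.5 * (lo + hi)
        p = u + 2.0 * y                          # bore pressure at the mouthpiece
        hist[n % D] = 0.5 * (p + u)              # new outgoing wave
        if n >= n_steps - tail.shape[0]:
            tail[n - (n_steps - tail.shape[0])] = p

    return tail.std(axis=0) > 1e-2               # sustained oscillation => playable

p_mouth = np.linspace(500.0, 6000.0, 25)         # blowing pressure (Pa)
k_a = np.linspace(2.0e6, 1.2e7, 25)              # stiffness per area (Pa/m)
playable = simulate_grid(p_mouth, k_a)
```

Plotting the resulting boolean array, for example with matplotlib's imshow, gives a black-and-white map in the same spirit as Figure 1.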
Modern music can be inaccessible to those with hearing loss; sound mixing tweaks could make a difference.
Listeners with hearing loss can struggle to make out vocals and certain frequencies in modern music. Credit: Aravindan Joseph Benjamin
WASHINGTON, August 22, 2023 – Millions of people around the world experience some form of hearing loss, resulting in negative impacts to their health and quality of life. Treatments exist in the form of hearing aids and cochlear implants, but these assistive devices cannot replace the full functionality of human hearing and remain inaccessible for most people. Auditory experiences, such as speech and music…
Text-to-Audio Models Make Music from Scratch #ASA183
Much like machine learning can create images from text, it can also generate sounds.
Media Contact: Ashley Piccone, AIP Media, 301-209-3090, media@aip.org
NASHVILLE, Tenn., Dec. 7, 2022 – Type a few words into a text-to-image model, and you’ll end up with a weirdly accurate, completely unique picture. While this tool is fun to play with, it also opens up avenues of creative application and exploration and provides workflow-enhancing tools for visual artists and animators. For musicians, sound designers, and other audio professionals, a text-to-audio model would do the same.
The algorithm transforms a text prompt into audio. Credit: Zach Evans
As part of the 183rd Meeting of the Acoustical Society of America, Zach Evans, of Stability AI, will present progress toward this end in his talk, “Musical audio samples generated from joint text embeddings.” The presentation will take place on Dec. 7 at 10:45 a.m. Eastern U.S. in the Rail Yard room, as part of the meeting running Dec. 5-9 at the Grand Hyatt Nashville Hotel.
“Text-to-image models use deep neural networks to generate original, novel images based on learned semantic correlations with text captions,” said Evans. “When trained on a large and varied dataset of captioned images, they can be used to create almost any image that can be described, as well as modify images supplied by the user.”
A text-to-audio model would be able to do the same, but with music as the end result. Among other applications, it could be used to create sound effects for video games or samples for music production.
But training these deep learning models is more difficult than training their image counterparts.
“One of the main difficulties with training a text-to-audio model is finding a large enough dataset of text-aligned audio to train on,” said Evans. “Outside of speech data, research datasets available for text-aligned audio tend to be much smaller than those available for text-aligned images.”
Evans and his team, including Belmont University’s Dr. Scott Hawley, have shown early success in generating coherent and relevant music and sound from text. They employed data compression methods to generate the audio with reduced training time and improved output quality.
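To make the "compress first, then generate" idea concrete, here is a heavily simplified sketch in PyTorch. It is not Stability AI's architecture, and every module name, layer size, and shape below is invented for illustration: an autoencoder shrinks raw audio into a short latent sequence, and a text-conditioned network produces those latents rather than raw samples, which is where the savings in training time come from.

```python
# Illustrative sketch only, not Stability AI's model: generate compressed audio
# latents from a text embedding instead of generating raw samples directly.
import torch
import torch.nn as nn

class AudioAutoencoder(nn.Module):
    """Compresses mono audio ~64x along time into a small latent sequence."""
    def __init__(self, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, 64, 16, stride=8, padding=4), nn.GELU(),
            nn.Conv1d(64, latent_dim, 16, stride=8, padding=4),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(latent_dim, 64, 16, stride=8, padding=4), nn.GELU(),
            nn.ConvTranspose1d(64, 1, 16, stride=8, padding=4),
        )

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

class TextConditionedLatentGenerator(nn.Module):
    """Maps a text embedding to a latent audio sequence (the cheap part to train)."""
    def __init__(self, text_dim=512, latent_dim=32, latent_len=1024):
        super().__init__()
        self.proj = nn.Linear(text_dim, latent_dim * latent_len)
        self.latent_shape = (latent_dim, latent_len)

    def forward(self, text_emb):
        z = self.proj(text_emb)
        return z.view(-1, *self.latent_shape)

# Rough shape check with random stand-ins for audio and a text embedding:
audio = torch.randn(1, 1, 65536)          # ~1.5 s of 44.1 kHz mono audio
text_emb = torch.randn(1, 512)            # e.g. from a joint text-audio encoder
ae = AudioAutoencoder()
recon, latents = ae(audio)
gen = TextConditionedLatentGenerator(latent_len=latents.shape[-1])
fake_latents = gen(text_emb)
fake_audio = ae.decoder(fake_latents)     # decode generated latents back to audio
```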
The researchers plan to expand to larger datasets and release their model as an open-source option for other researchers, developers, and audio professionals to use and improve.
ASA PRESS ROOM In the coming weeks, ASA’s Press Room will be updated with newsworthy stories and the press conference schedule at https://acoustics.org/asa-press-room/.
LAY LANGUAGE PAPERS ASA will also share dozens of lay language papers about topics covered at the conference. Lay language papers are 300 to 500 word summaries of presentations written by scientists for a general audience. They will be accompanied by photos, audio, and video. Learn more at https://acoustics.org/lay-language-papers/.
PRESS REGISTRATION ASA will grant free registration to credentialed and professional freelance journalists. If you are a reporter and would like to attend the meeting or virtual press conferences, contact AIP Media Services at media@aip.org. For urgent requests, AIP staff can also help with setting up interviews and obtaining images, sound clips, or background information.
ABOUT THE ACOUSTICAL SOCIETY OF AMERICA The Acoustical Society of America (ASA) is the premier international scientific society in acoustics devoted to the science and technology of sound. Its 7,000 members worldwide represent a broad spectrum of the study of acoustics. ASA publications include The Journal of the Acoustical Society of America (the world’s leading journal on acoustics), JASA Express Letters, Proceedings of Meetings on Acoustics, Acoustics Today magazine, books, and standards on acoustics. The society also holds two major scientific meetings each year. See https://acousticalsociety.org/.
Queen Mary University of London, Mile End Road, London, England, E1 4NS, United Kingdom
Popular version of 3aSP1 – Artificial intelligence in music production: controversy and opportunity, presented at the 183rd ASA Meeting.
Music production
In music production, one typically has many sources. They each need to be heard simultaneously, but they may all have been created in different ways, in different environments, and with different attributes. The mix should keep all sources distinct yet blend them into a clean whole. Achieving this is labour intensive and requires a professional engineer. Modern production systems help, but they are incredibly complex and still require manual manipulation. As technology has grown, it has become more functional but not simpler for the user.
Intelligent music production
Intelligent systems could analyse all the incoming signals and determine how they should be modified and combined. This has the potential to revolutionise music production, in effect putting a robot sound engineer inside every recording device, mixing console or audio workstation. Could this be achieved? This question gets to the heart of what is art and what is science, what is the role of the music producer and why we prefer one mix over another.
Figure 1 Caption: The architecture of an automatic mixing system. [Image courtesy of the author]
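As a toy illustration of the analyse-decide-process chain in Figure 1 (this is not the pictured system, and the single mixing rule and all numbers below are assumptions), the sketch analyses each incoming track, derives a gain from a simple loudness rule, and sums the processed tracks into a mix:

```python
# Toy analyse -> decide -> process chain for automatic mixing. The
# 'intelligence' is one deliberately simple rule: match every track's RMS
# level, then give the lead vocal a few dB of extra level.
import numpy as np

def rms_db(x):
    """Analysis stage: RMS level of a track in dBFS."""
    return 20.0 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)

def decide_gains(tracks, target_db=-23.0, vocal_boost_db=3.0):
    """Decision stage: per-track gains (in dB) from a simple loudness rule."""
    gains = {}
    for name, audio in tracks.items():
        gain = target_db - rms_db(audio)            # bring track to target level
        if name == "vocal":
            gain += vocal_boost_db                  # keep the vocal on top
        gains[name] = gain
    return gains

def apply_and_sum(tracks, gains):
    """Processing stage: apply the gains and sum to a mix."""
    mix = sum(audio * 10.0 ** (gains[name] / 20.0) for name, audio in tracks.items())
    return mix / max(1e-9, np.max(np.abs(mix)))     # normalise to avoid clipping

# Example with random noise standing in for recorded stems:
fs = 44100
tracks = {name: np.random.randn(fs * 5) * amp
          for name, amp in [("vocal", 0.1), ("guitar", 0.4),
                            ("bass", 0.6), ("drums", 0.8)]}
mix = apply_and_sum(tracks, decide_gains(tracks))
```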
Perception of mixing
But we understand little about how we perceive audio mixes. Almost all studies have been restricted to lab conditions, like measuring the perceived level of a tone in the presence of background noise. This tells us very little about real-world cases. It doesn't tell us how well one can hear the lead vocal when guitar, bass and drums are playing.
Best practices
And we don't know why one production sounds dull while another makes you laugh and cry, even though both are of the same piece of music and both were made by competent sound engineers. So we needed to establish what good production is, how to translate it into rules, and how to exploit it within algorithms. We needed to step back and explore more fundamental questions, filling gaps in our understanding of production and perception.
Knowledge engineering
We used an approach that incorporated one of the earliest machine learning methods, knowledge engineering. It's so old school that it has gone out of fashion. It assumes experts have already figured things out; they are experts, after all. So let's capture best practices as a set of rules and processes. But this is no easy task. Most sound engineers can't articulate what they did. Ask a famous producer what he or she did on a hit song and you often get an answer like "I turned the knob up to 11 to make it sound phat." How do you turn that into a mathematical equation? Or worse, they say it was magic and can't be put into words.
We systematically tested all the assumptions about best practices and supplemented them with listening tests that helped us understand how people perceive complex sound mixtures. We also curated multitrack audio, with detailed information about how it was recorded, multiple mixes and evaluations of those mixes.
This enabled us to develop intelligent systems that automate much of the music production process.
Video Caption: An automatic mixing system based on a technology we developed.
Transformational impact
I gave a talk about this once in a room that had panel windows all around. These talks are usually half full. But this time it was packed, and I could see faces outside pressed up against the windows. They all wanted to find out about this idea of automatic mixing. It’s a unique opportunity for academic research to have transformational impact on an entire industry. It addresses the fact that music production technologies are often not fit for purpose. Intelligent systems open up new opportunities. Amateur musicians can create high quality mixes of their content, small venues can put on live events without needing a professional engineer, time and preparation for soundchecks could be drastically reduced, and large venues and broadcasters could significantly cut manpower costs.
Taking away creativity
It's controversial. We entered an automatic mix in a student recording competition as a sort of Turing test. Technically we cheated, because the mixes were supposed to be made by students, not by an 'artificial intelligence' (AI) created by a student. Afterwards I asked the judges what they thought of the mix. The first two were surprised and curious when I told them how it was done. The third judge offered useful comments while he thought it was a student mix. But when I told him it was an 'automatic mix', he suddenly switched and said it was rubbish and that he could tell all along.
Mixing is a creative process where stylistic decisions are made. Is this taking away creativity? Is it taking away jobs? Such questions come up time and time again with new technologies, going back to the 19th-century protests of the Luddites, textile workers who feared that the time spent on their skills and craft would be wasted as machines replaced their role in industry.
Not about replacing sound engineers
These are valid concerns, but it's important to see other perspectives. A tremendous amount of music production work is technical, and audio quality would be improved by addressing those technical problems. As the graffiti artist Banksy said, "All artists are willing to suffer for their work. But why are so few prepared to learn to draw?"
Creativity still requires technical skills. To achieve something wonderful when mixing music, you first have to achieve something pretty good and address issues with masking, microphone placement, level balancing and so on.
Video Caption: Time offset (comb filtering) correction, a technical problem in music production solved by an intelligent system.
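The kind of correction shown in the video can be sketched generically (this is a standard cross-correlation approach, not necessarily the method in our system): estimate the lag between two microphones capturing the same source, then delay the earlier signal so the two line up before they are summed.

```python
# Generic sketch of time-offset (comb filtering) correction between two
# microphones picking up the same source: find the lag that maximises the
# cross-correlation, then delay the earlier signal by that amount.
import numpy as np
from scipy.signal import correlate, correlation_lags

def align(close_mic, far_mic, fs):
    """Return the two signals time-aligned, plus the estimated offset in ms."""
    corr = correlate(far_mic, close_mic, mode="full")
    lags = correlation_lags(len(far_mic), len(close_mic), mode="full")
    lag = lags[np.argmax(corr)]          # samples by which far_mic trails close_mic
    if lag > 0:                          # delay the close mic so the two line up
        close_mic = np.concatenate([np.zeros(lag), close_mic])[: len(far_mic)]
    elif lag < 0:                        # or delay the far mic if it leads
        far_mic = np.concatenate([np.zeros(-lag), far_mic])[: len(close_mic)]
    return close_mic, far_mic, 1000.0 * lag / fs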
The real benefit is not replacing sound engineers. It's dealing with all those situations where a talented engineer is not available: the band practicing in the garage, the small restaurant venue that does not provide any sound support, or game audio, where dozens of sounds need to be mixed and there is no miniature sound engineer living inside the games console.