Music in Our Ears - What Makes a Piano Sound Like a Piano?
Mounya Elhilali - mounya@jhu.edu
Kailash Patil - kailash@jhu.edu
Department of Electrical and Computer Engineering
Johns Hopkins University
3400 N Charles Street
Barton Hall, Rm 105
Baltimore, MD 21218

Popular version of paper 4aPP11
Presented Thursday morning, April 22, 2010
159th ASA Meeting, Baltimore, MD

Music is a complex acoustic experience that we often take for granted. Whether
sitting in a symphony hall or enjoying a melody over earphones, we have no
difficulty identifying the instruments playing, following various beats and
melodies, or simply distinguishing a flute from an oboe. Our brains rely on a
number of sound attributes to analyze the music in our ears. These attributes
can be very simple, like loudness. They can also be very complex, like the
identity of the instrument or the source, formally called timbre. For
example, the timbre of percussive instruments like a drum or a marimba has an
almost noise-like quality that makes them readily distinguishable from wind
instruments like an oboe or a flute. What makes an oboe an oboe, and not a drum
or a flute, is a complex combination of sound attributes that we still don't
fully understand and that has been the subject of research efforts for many
decades. So, what is a good way to define the timbre of musical instruments?

In this work, we used our knowledge of how sounds are processed in the auditory
system of our brains to help us define a good space for representing the timbre
of musical instruments. When sounds enter our ears, they travel through various
brain structures that extract a number of informative attributes of these
sounds, encompassing both the tonal components (or frequency components) of
sound, as well as the temporal dynamics (how sound elements change over time).
This analysis is done by a complex network of neurons, each analyzing one
aspect of the incoming sound.

Using this inspiration, we built a mathematical model that mimics this array of
biological neurons. This model takes any sound, analyzes it through these
computer-simulated neurons, and produces a rich, high-dimensional
representation that explicitly captures the information we need to extract
from the sound (its frequency components, its temporal dynamics, and how
frequency and time components combine). Using this new way of representing
sounds, we can then see whether all piano sounds tend to activate the same
group of neurons, while all violin sounds tend to activate another set of
neurons, and so on. We therefore attempt to cluster musical instruments in this
new space. Our results show that this model gives an almost perfect
classification of musical instruments, with an accuracy of 98% (i.e.,
misclassifying only 2% of cases), by clustering sounds from the same instrument
with each other. We can therefore use this model to improve our understanding
of the neural mechanisms for timbre perception and to get a better grasp of
what allows our brains to appreciate the diversity of the musical world. In the
long term, work on similar computer models can lead to better technologies to
help hearing-impaired listeners experience the joys of music.
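
For readers who like to tinker, the general idea described above (analyze a
sound by its frequency content and its changes over time, then group sounds
that produce similar patterns) can be sketched in a few lines of Python. This
is only a toy illustration under loud assumptions, not the model from the
paper: a 2-D Fourier transform of a log spectrogram stands in for the array of
simulated neurons, two synthetic sounds (a steady harmonic "tone" and a
decaying noise "burst") stand in for real instrument recordings, and a
nearest-centroid rule stands in for the classifier. All function names and
parameter values here are hypothetical.

```python
import numpy as np

def spectrogram(signal, frame_len=512, hop=256):
    """Log-magnitude short-time Fourier spectrogram, shape (frequency, time)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)).T)

def modulation_features(signal, n_spectral=8, n_temporal=8):
    """Crude stand-in for the neural array: energy at slow spectral and
    temporal modulations of the spectrogram (assumption, not the paper's model)."""
    mod = np.abs(np.fft.fft2(spectrogram(signal)))
    feat = mod[:n_spectral, :n_temporal].ravel()  # keep low modulation bins
    return feat / np.linalg.norm(feat)            # unit length, level-invariant

def toy_note(kind, rng, sr=8000, dur=0.5):
    """Hypothetical synthetic 'instruments': a steady tone or a noise burst."""
    t = np.arange(int(sr * dur)) / sr
    if kind == "tone":
        f0 = rng.uniform(200, 400)  # random pitch, four harmonics
        return sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 5))
    return rng.standard_normal(t.size) * np.exp(-20 * t)  # decaying noise

def run_demo(seed=0, n_train=10, n_test=10):
    """Nearest-centroid classification of the toy sounds; returns accuracy."""
    rng = np.random.default_rng(seed)
    kinds = ["tone", "burst"]
    # One centroid (average feature vector) per sound class.
    centroids = {
        k: np.mean([modulation_features(toy_note(k, rng))
                    for _ in range(n_train)], axis=0)
        for k in kinds
    }
    correct = 0
    for k in kinds:
        for _ in range(n_test):
            feat = modulation_features(toy_note(k, rng))
            guess = min(kinds, key=lambda c: np.linalg.norm(feat - centroids[c]))
            correct += guess == k
    return correct / (len(kinds) * n_test)

print(f"toy accuracy: {run_demo():.2f}")
```

The two toy sounds differ in exactly the ways the text describes: the tone has
structure along frequency but barely changes over time, while the burst is
spread across frequency but changes quickly over time. Telling apart real
instruments is far harder, and the 98% figure above refers to the authors'
model, not to this sketch.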