Université Grenoble Alpes (UGA)
11 rue des Mathématiques
38402 Saint Martin d’Hères (GRENOBLE)
Popular version of paper 3pAB4 “Automatic fish sounds classification”
Presented Wednesday afternoon, May 25, 2016, 2:15 in Salon I
171st ASA Meeting, Salt Lake City
In the current context of global warming and environmental concern, we need tools to evaluate and monitor the evolution of our environment. The evolution of animal populations is of special concern in order to prevent changes of behaviour under environmental stress and to preserve biodiversity. Monitoring animal populations, however, can be a complex and costly task. Experts can either (1) monitor animal populations directly in the field, or (2) use sensors to gather data in the field (audio or video recordings, trackers, etc.) and then process those data to retrieve knowledge about the animal population. In both cases the issue is the same: experts are needed and can only process a limited quantity of data.
An alternative idea would be to keep using the field sensors but to build software tools in order to automatically process the data, thereby allowing monitoring animal populations on larger geographic areas and for extensive time periods.
The work we present is about automatically monitoring fish populations using audio recordings. Sounds have a better propagation underwater: by recording sounds under the sea we can gather loads of information about the environment and animal species it shelters. Here is an example of such recordings:
Legend: Raw recording of fish sounds, August 2014, Corsica, France.
Regarding fish populations, we distinguish four types of sounds that we call (1) Impulsions, (2) Roars, (3) Drums and (4) Quacks. We can hear them in the previous recording, but here are some extracts with isolated examples:
Legend: Filtered recording of fish sounds to hear a Roar between 5 s and 13 s and Drums from 22 s to 29 s and from 42 s to 49 s.
Legend: Filtered recording of fish sounds to hear Quacks and Impulsions. Both sounds are quite short (<0.5s) and are heard all along the recording.
However, to make a computer automatically classify a fish sound into one of those four groups is a very complex task. A simple or intuitive task for humans is often extremely complex for a computer, and vice versa. This is because humans and computers process information in different ways. For instance, a computer is very successful at solving complex calculations and at performing repetitive tasks, but it is very difficult to make a computer recognize a car in a picture. Humans, however, tend to struggle with complex calculations but can very easily recognise objects in images. How do you explain to a computer ‘this is a car’? It has four wheels. But then, how do you know this is a wheel? Well, it has a circular shape. Oh, so this ball is a wheel, isn’t it?
This easy task for a human is very complex for a machine. Scientists found a solution to make a computer understand what we call ‘high-level concepts’ (recognising objects in pictures, understanding speech, etc.). They designed algorithms called Machine Learning. The idea is to give a computer a lot of examples of each concept we want to teach it. For instance, to make a computer recognise a car in a picture, we feed it with many pictures of cars so that it can learn what a car is, and with many pictures without cars so that it can learn what a car is not. Many companies such as Facebook, Google, or Apple use those algorithms for face recognition, speech understanding, individualised advertisement, etc. It works very well.
In our work, we use the same techniques to teach a computer to recognize and automatically classify fish sounds. Once those sounds have been classified, we can study how they evolve and see if fish populations behave differently from place to place, or if their behaviours change with time. It is also possible to study their density and see if their numbers vary through time.
This work is of particular interest since, to our knowledge, we present the first tool to automatically classify fish sounds. One of the main challenges is to make a sound understandable by a computer, that is, to find and extract relevant information in the acoustic signal. Doing so makes it easier for the computer to understand similarities and differences between signals and, at the end of the day, to predict to which group a sound belongs.
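The idea of extracting relevant information and then predicting a group can be illustrated with a deliberately small sketch. It is not the method of the paper: the two features (duration and a rough dominant frequency from zero crossings), the nearest-centroid rule, the sample rate, and the synthetic "recordings" are all assumptions chosen for illustration, and only two of the four sound types are mimicked, with invented durations and frequencies.

```python
import math

SAMPLE_RATE = 4000  # samples per second (assumed for this toy example)


def synth_burst(freq_hz, duration_s, total_s=1.0):
    """Synthetic stand-in for a recording: a sine burst followed by silence."""
    n_total = round(total_s * SAMPLE_RATE)
    n_active = round(duration_s * SAMPLE_RATE)
    return [
        math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) if i < n_active else 0.0
        for i in range(n_total)
    ]


def extract_features(signal, threshold=0.05):
    """Reduce a signal to (duration in s, rough dominant frequency in kHz)."""
    active = [i for i, x in enumerate(signal) if abs(x) > threshold]
    if not active:
        return (0.0, 0.0)
    duration = (active[-1] - active[0] + 1) / SAMPLE_RATE
    # Count sign changes; a pure tone crosses zero twice per cycle.
    crossings, last_sign = 0, 0
    for x in signal:
        sign = (x > 0) - (x < 0)
        if sign and last_sign and sign != last_sign:
            crossings += 1
        if sign:
            last_sign = sign
    return (duration, crossings / (2 * duration) / 1000.0)


def train_centroids(examples):
    """Average the feature vectors of each labelled group."""
    sums = {}
    for label, signal in examples:
        d, f = extract_features(signal)
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += d
        total[1] += f
        total[2] += 1
    return {label: (s[0] / s[2], s[1] / s[2]) for label, s in sums.items()}


def classify(signal, centroids):
    """Return the label whose centroid is closest to the signal's features."""
    feats = extract_features(signal)
    return min(centroids, key=lambda label: math.dist(feats, centroids[label]))


# Invented training examples: short, higher-pitched bursts vs long, low ones.
examples = [
    ("Impulsion", synth_burst(800, 0.05)),
    ("Impulsion", synth_burst(900, 0.04)),
    ("Impulsion", synth_burst(750, 0.06)),
    ("Roar", synth_burst(150, 0.8)),
    ("Roar", synth_burst(120, 0.9)),
    ("Roar", synth_burst(200, 0.7)),
]
centroids = train_centroids(examples)

print(classify(synth_burst(700, 0.08), centroids))
print(classify(synth_burst(180, 0.7), centroids))
```

Real recordings are noisy and overlapping, so a practical system would need richer features and many labelled examples per group; the sketch only shows the two steps named above, feature extraction followed by prediction.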
University of Utah
201 Presidents Cir
Salt Lake City, UT
Popular version of paper 4pMU4 “How well can a human mimic the sound of a trumpet?”
Presented Thursday May 26, 2:00 pm, Solitude room
171st ASA Meeting Salt Lake City
Man-made musical instruments are sometimes designed or played to mimic the human voice, and likewise vocalists try to mimic the sounds of man-made instruments. If flutes and strings accompany a singer, a “brassy” voice is likely to produce mismatches in timbre (tone color or sound quality). Likewise, a “fluty” voice may not be ideal for a brass accompaniment. Thus, singers are looking for ways to color their voice with variable timbre.
Acoustically, brass instruments are close cousins of the human voice. It was discovered prehistorically that sending sound over long distances (to locate, be located, or warn of danger) is made easier when a vibrating sound source is connected to a horn. It is not known which came first – blowing hollow animal horns or sea shells with pursed and vibrating lips, or cupping the hands to extend the airway for vocalization. In both cases, however, airflow-induced vibration of soft tissue (vocal folds or lips) is enhanced by a tube that resonates the frequencies and radiates them (sends them out) to the listener.
Around 1840, theatrical singing by males went through a revolution. Men wanted to portray more masculinity and raw emotion with vocal timbre. “Do di Petto”, which is Italian for “C in chest voice”, was introduced by operatic tenor Gilbert Duprez in 1837 and soon became a phenomenon. A heroic voice in opera took on more of a brass-like quality than a flute-like quality. Similarly, in the early to mid-twentieth century (1920-1950), female singers were driven by the desire to sing with a richer timbre, one that matched brass and percussion instruments rather than strings or flutes. Ethel Merman became an icon in this revolution. This led to the theatre belt sound produced by females today, which has much in common with a trumpet sound.
Fig.1. Mouth opening to head-size ratio for Ethel Merman and corresponding frequency spectrum for the sound “aw” with a fundamental frequency fo (pitch) at 547 Hz and a second harmonic frequency 2 fo at 1094 Hz.
The length of an uncoiled trumpet horn is about 2 meters (including the full length of the valves), whereas the length of a human airway above the glottis (the space between the vocal cords) is only about 17 cm (Fig. 2). The vibrating lips and the vibrating vocal cords can produce similar pitch ranges, but the resonators have vastly different natural frequencies due to the more than 10:1 ratio in airway length. So, we ask, how can the voice produce a brass-like timbre in a “call” or “belt”?
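To get a feel for how far apart the natural frequencies of the two resonators are, here is a rough sketch that treats each airway as a uniform tube closed at one end and open at the other, whose resonances are odd multiples of c/(4L). This is a textbook idealization, not a model from the paper: a real trumpet's bell and mouthpiece shift its resonances considerably, and the speed of sound used here (350 m/s) is an assumed round value.

```python
SPEED_OF_SOUND = 350.0  # m/s, assumed round value for warm air


def closed_open_resonances(length_m, count=3, c=SPEED_OF_SOUND):
    """Resonances (Hz) of an idealized uniform tube closed at one end."""
    return [(2 * n - 1) * c / (4.0 * length_m) for n in range(1, count + 1)]


vocal_tract = closed_open_resonances(0.17)  # about 17 cm airway
long_tube = closed_open_resonances(2.0)     # about 2 m of tubing

print("17 cm tube:", [round(f) for f in vocal_tract])
print("2 m tube:  ", [round(f) for f in long_tube])
print("lowest-resonance ratio:", round(vocal_tract[0] / long_tube[0], 1))
```

Under these assumptions the lowest resonance of the short tube sits more than ten times higher than that of the long one, which mirrors the length ratio quoted above.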
One structural similarity between the human instrument and the brass instrument is the shape of the airway directly above the glottis, a short and narrow tube formed by the epiglottis. It corresponds to the mouthpiece of brass instruments. This mouthpiece plays a major role in shaping the sound quality. A second structural similarity is created when a singer uses a wide mouth opening, simulating the bell of the trumpet. With these two structural similarities, the spectrum of tones produced by the two instruments can be quite similar, despite the huge difference in the overall length of the instrument.
Fig 2. Human airway and trumpet (not drawn to scale).
Acoustically, the call or belt-like quality is achieved by strengthening the second harmonic frequency 2fo in relation to the fundamental frequency fo. In the human instrument, this can be done by choosing a bright vowel like /æ/ that puts an airway resonance near the second harmonic. The fundamental frequency will then have significantly less energy than the second harmonic.
Why does that resonance adjustment produce a brass-like timbre? To understand this, we first recognize that, in brass-instrument playing, the tones produced by the lips are entrained (synchronized) to the resonance frequencies of the tube. Thus, the tones heard from the trumpet are the resonance tones. These resonance tones form a harmonic series, but the fundamental tone in this series is missing. It is known as the pedal tone. Thus, by design, the trumpet has a strong second harmonic frequency with a missing fundamental frequency.
Perceptually, an imaginary fundamental frequency may be produced by our auditory system when a series of higher harmonics (equally spaced overtones) is heard. Thus, the fundamental (pedal tone) may be perceptually present to some degree, but the highly dominant second harmonic determines the note that is played.
In belting and loud calling, the fundamental is not eliminated, but suppressed relative to the second harmonic. The timbre of belt is related to the timbre of a trumpet due to this lack of energy in the fundamental frequency. There is a limit, however, in how high the pitch can be raised with this timbre. As pitch goes up, the first resonance of the airway has to be raised higher and higher to maintain the strong second harmonic. This requires ever more mouth opening, literally creating a trumpet bell (Fig. 3).
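The arithmetic behind this limit is simple enough to sketch. The snippet below computes the second harmonic 2fo that the first airway resonance must sit near for the pitches quoted in this article, and compares it with an assumed neutral first resonance of about 500 Hz (a common textbook figure for an unshaped 17 cm airway, used here only as a hypothetical reference point).

```python
NEUTRAL_F1_HZ = 500.0  # assumed first resonance of a neutral airway shape


def second_harmonic(fo_hz):
    """Frequency of the second harmonic for a given fundamental."""
    return 2.0 * fo_hz


for label, fo in [("Fig. 1", 547.0), ("Fig. 3", 545.0), ("D5", 587.0)]:
    target = second_harmonic(fo)
    factor = target / NEUTRAL_F1_HZ
    print(f"{label}: fo = {fo:.0f} Hz, 2fo = {target:.0f} Hz, "
          f"{factor:.2f} x the assumed neutral resonance")
```

Each step up in pitch pushes the target resonance further above the neutral value, which is why the mouth opening has to keep growing.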
Fig 3. Mouth opening to head-size ratio for Idina Menzel and corresponding frequency spectrum for a belt sound with a fundamental frequency (pitch) at 545 Hz.
Note the strong second harmonic frequency 2fo in the spectrum of frequencies produced by Idina Menzel, a current musical theatre singer.
One final comment about the perceived pitch of a belt sound is in order. Pitch perception is not only related to the fundamental frequency, but the entire spectrum of frequencies. The strong second harmonic influences pitch perception. The belt timbre on a D5 (587 Hz) results in a higher pitch perception for most people than a classical soprano sound on the same note. This adds to the excitement of the sound.
Acoustics and Perception of Charisma in Bilingual English-Spanish 2016 United States Presidential Election Candidates
Rosario Signorello – firstname.lastname@example.org
Department of Head and Neck Surgery
31-20 Rehab Center,
Los Angeles, CA 90095-1794
Phone: +1 (323) 703-9549
Popular version of paper 1pSC26 “Acoustics and Perception of Charisma in Bilingual English-Spanish 2016 United States Presidential Election Candidates”
Presented at the 171st Meeting on Monday May 23, 1:00 pm – 5:00 pm, Salon F, Salt Lake Marriott Downtown at City Creek Hotel, Salt Lake City, Utah
Charisma is the set of leadership characteristics, such as vision, emotions, and dominance, used by leaders to share beliefs, persuade listeners and achieve goals. Politicians use voice to convey charisma and appeal to voters to gain social positions of power. “Charismatic voice” refers to the ensemble of vocal acoustic patterns used by speakers to convey personality traits and arouse specific emotional states in listeners. The ability to manipulate charismatic voice results from speakers’ universal and learned strategies to use specific vocal parameters (such as vocal pitch, loudness, phonation types, pauses, pitch contours, etc.) to convey their biological features and their social image (see Ohala, 1984; Signorello, 2014a, 2014b; Puts et al., 2007). Listeners’ perception of the physical, psychological and social characteristics of the leader is influenced by universal ways to emotionally respond to vocalizations (see Ohala, 1984; Signorello, 2014a, 2014b) combined with specific, culturally-mediated habits to manifest emotional response in public (Matsumoto, 1990; Signorello, 2014a).
Politicians manipulate vocal acoustic patterns (adapting them to the culture, language, social status, educational background and the gender of the voters) to convey specific types of leadership fulfilling everyone’s expectation of what charisma is. But what happens to leaders’ voices when they use different languages to address voters? This study investigates speeches of bilingual politicians to find out the vocal acoustic differences of leaders speaking in different languages. It also investigates how the acoustical differences in different languages can influence listeners’ perception of type of leadership and the emotional state aroused by leaders’ voices.
We selected vocal samples from two bilingual American-English/American-Spanish politicians who participated in the 2016 United States presidential primaries: Jeb Bush and Marco Rubio. We chose words with similar vocal characteristics in terms of average vocal pitch, vocal pitch range, and loudness range. We asked listeners to rate the type of charismatic leadership perceived and to assess the emotional states aroused by those voices. We finally asked participants how the different vocal patterns would affect their voting preference.
Preliminary statistical analyses show that English words like “terrorism” (voice sample 1) and “security” (voice sample 2), characterized by mid vocal pitch frequencies, wide vocal pitch ranges, and wide loudness ranges, convey an intimidating, arrogant, selfish, aggressive, witty, overbearing, lazy, dishonest, and dull type of charismatic leadership. Listeners from different language and cultural backgrounds also reported these vocal stimuli triggered emotional states like contempt, annoyance, discomfort, irritation, anxiety, anger, boredom, disappointment, and disgust. The listeners who were interviewed considered themselves politically liberal and they responded that they would probably vote for a politician with the vocal characteristics listed above.
Results also show that Spanish words like “terrorismo” (voice sample 3) and “ilegal” (voice sample 4), characterized by an average of mid-low vocal pitch frequencies, mid vocal pitch ranges, and narrow loudness ranges, convey a personable, relatable, kind, caring, humble, enthusiastic, witty, stubborn, extroverted, understanding, but also weak and insecure type of charismatic leadership. Listeners from different language and cultural backgrounds also reported these vocal stimuli triggered emotional states like happiness, amusement, relief, and enjoyment. The listeners who were interviewed considered themselves politically liberal and they responded that they would probably vote for a politician with the vocal characteristics listed above.
Voice is a very dynamic non-verbal behavior used by politicians to persuade the audience and manipulate voting preference. The results of this study show how acoustic differences in voice convey different types of leadership and arouse different emotional states in listeners. The voice samples studied show how speakers Jeb Bush and Marco Rubio adapt their vocal delivery to audiences of different backgrounds. The two politicians voluntarily manipulate their voice parameters while speaking in order to appear as if they were endowed with different leadership qualities. The vocal pattern used in English conveys the threatening and dark side of their charisma, inducing the arousal of negative emotions, which triggers a positive voting preference in listeners. The vocal pattern used in Spanish conveys the charming and caring side of their charisma, inducing the arousal of positive emotions, which triggers a negative voting preference in listeners.
The manipulation of voice arouses emotional states that will induce voters to consider a certain type of leadership as more appealing. Experiencing emotions helps voters to assess the effectiveness of a political leader. If the emotional arousal matches voters’ expectation of how a charismatic leader should make them feel, then voters will help the charismatic speaker to become their leader.
Signorello, R. (2014a). La Voix Charismatique : Aspects Psychologiques et Caractéristiques Acoustiques. PhD Thesis. Université de Grenoble, Grenoble, France and Università degli Studi Roma Tre, Rome, Italy.
Signorello, R. (2014b). The biological function of fundamental frequency in leaders’ charismatic voices. The Journal of the Acoustical Society of America 136 (4), 2295-2295.
Ohala, J. (1984). An ethological perspective on common cross-language utilization of F0 of voice. Phonetica, 41(1):1–16.
Puts, D. A., Hodges, C. R., Cárdenas, R. A. and Gaulin, S. J. C. (2007). Men’s voices as dominance signals: vocal fundamental and formant frequencies influence dominance attributions among men. Evolution and Human Behavior, 28(5):340–344.
Department of Bioengineering
University of Utah
36 S. Wasatch Dr., Room 3100
Salt Lake City, Utah 84112
Popular version of paper 3aBA1, “Ultrasound-mediated drug targeting to tumors: Revision of paradigms through intravital imaging”
Presented Wednesday morning, May 25, 2016, 8:15 AM in Salon H
171st ASA Meeting, Salt Lake City
More than a century ago, Nobel Prize laureate Paul Ehrlich formulated the idea of a “magic bullet”. This is a virtual drug that hits its target while bypassing healthy tissues. No field of medicine could benefit more from the development of a “magic bullet” than cancer chemotherapy, which is complicated by severe side effects. For decades, the prospects of developing “magic bullets” remained elusive. During the last decade, progress in nanomedicine has enabled tumor-targeted delivery of anticancer drugs via their encapsulation in tiny carriers called nanoparticles. Nanoparticle tumor targeting is based on the “Achilles’ heel” of cancerous tumors – their poorly organized and leaky microvasculature. Due to their size, nanoparticles cannot penetrate through the tight vasculature of healthy tissue. In contrast, nanoparticles penetrate through the leaky tumor microvasculature, thus providing for localized accumulation in tumor tissue. After drug-loaded nanoparticles have accumulated in the tumor, the drug should be released from the carrier to allow penetration into its site of action (usually located in the cell cytoplasm or nucleus). A local release of an encapsulated drug may be triggered by tumor-directed ultrasound; application of ultrasound has additional benefits: ultrasound enhances nanoparticle penetration through blood vessel walls (extravasation) as well as drug uptake (internalization) by tumor cells.
For decades, ultrasound has been used only as an imaging modality; the development of microbubbles as ultrasound contrast agents in the early 2000s has revolutionized imaging. Recently, microbubbles have attracted attention as drug carriers and enhancers of drug and gene delivery. Microbubbles could have been ideal carriers for the ultrasound-mediated delivery of anticancer drugs. Unfortunately, their micron-scale size does not allow effective extravasation from the tumor microvasculature into tumor tissue. In Dr. Rapoport’s lab, this problem has been solved by the development of nanoscale microbubble precursors, namely drug-loaded nanodroplets that convert into microbubbles under the action of ultrasound [1-6]. Nanodroplets comprise a liquid core formed by a perfluorocarbon compound and a two-layered drug-containing polymeric shell (Figure 1. Schematic representation of a drug-loaded nanodroplet). An aqueous dispersion of nanodroplets is called a nanoemulsion.
A suggested mechanism of therapeutic action of drug-loaded perfluorocarbon nanoemulsions is discussed below [3, 5, 6]. The nanoscale size of the droplets (ca. 250 nm) provides for their extravasation into tumor tissue while bypassing normal tissues, which is the basis of tumor targeting. Upon nanodroplet tumor accumulation, tumor-directed ultrasound triggers nanodroplet conversion into microbubbles, which in turn triggers release of the nanodroplet-encapsulated drug. This is because in the process of the droplet-to-bubble conversion, particle volume increases about a hundred-fold, with a related decrease in shell thickness. Microbubbles oscillate in the ultrasound field, resulting in the drug being “ripped” off the thin microbubble shell (Figure 2. Schematic representation of the mechanism of drug release from perfluorocarbon nanodroplets triggered by ultrasound-induced droplet-to-bubble conversion; PFC – perfluorocarbon). In addition, oscillating microbubbles enhance internalization of the released drug by tumor cells.
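The link between a hundred-fold volume increase and a thinner shell can be checked with a few lines of geometry. The sketch below assumes a spherical particle, reads the quoted 250 nm as a diameter, and assumes that the amount of shell material stays constant while it is spread over the larger surface; these are illustrative assumptions, not measurements from the study.

```python
def expansion_factors(volume_ratio):
    """Return (radius ratio, surface-area ratio) for a sphere whose
    volume grows by volume_ratio."""
    radius_ratio = volume_ratio ** (1.0 / 3.0)
    area_ratio = radius_ratio ** 2
    return radius_ratio, area_ratio


DROPLET_DIAMETER_NM = 250.0  # "ca. 250 nm", assumed here to be a diameter

radius_ratio, area_ratio = expansion_factors(100.0)
bubble_diameter_nm = DROPLET_DIAMETER_NM * radius_ratio

print(f"diameter grows by a factor of {radius_ratio:.2f}")
print(f"bubble diameter is roughly {bubble_diameter_nm:.0f} nm")
# With a fixed amount of shell material, thickness scales inversely with area.
print(f"shell becomes roughly {area_ratio:.1f} times thinner")
```

Under these assumptions a 250 nm droplet becomes a bubble a little over one micron across, and its shell thins by a factor of about twenty, which is consistent with the picture of a drug sitting on a thin, oscillating shell.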
This tumor treatment modality has been tested in mice bearing breast, ovarian, or pancreatic cancerous tumors and has been proved very effective. Dramatic tumor regression and sometimes complete resolution was observed when optimal nanodroplet composition and ultrasound parameters were applied
(Figure 3. A – Photographs of a mouse bearing a subcutaneously grown breast cancer tumor xenograft treated by four systemic injections of the nanodroplet-encapsulated anticancer drug paclitaxel (PTX) at a dose of 40 mg/kg as PTX. B – Photographs of a mouse bearing two ovarian carcinoma tumors (a) – immediately before and (b) – three weeks after the end of treatment; mouse was treated by four systemic injections of the nanodroplet-encapsulated PTX at a dose of 20 mg/kg as PTX; only the right tumor was sonicated. C – Photographs (a, c) and fluorescence images (b, d) of a mouse bearing fluorescent pancreatic tumor taken before (a, b) and three weeks after the one-time treatment with PTX-loaded nanodroplets at a dose of 40 mg/kg as PTX (c,d). The tumor was completely resolved and never recurred) [3, 4, 6].
In the current presentation, the proposed mechanism of therapeutic action of drug-loaded, ultrasound-activated perfluorocarbon nanoemulsions has been tested using intravital laser fluorescence microscopy performed in collaboration with Dr. Brian O’Neill (then with Houston Methodist Research Institute, Houston, Texas). Fluorescently labeled nanocarrier particles (or a fluorescently labeled drug) were systemically injected through the tail vein into anesthetized live mice bearing subcutaneously grown pancreatic tumors. Nanocarrier and drug arrival and extravasation in the region of interest (i.e. normal or tumor tissue) were quantitatively monitored. Various drug nanocarriers in the following size hierarchy were tested: individual polymeric molecules; tiny micelles formed by self-assembly of these molecules; nanodroplets formed from micelles. The results obtained confirmed the mechanism discussed above.
As expected, dramatic differences in the extravasation rates of nanoparticles were observed.
The extravasation of individual polymer molecules was extremely fast even in the normal (thigh muscle) tissue; in contrast, the extravasation of nanodroplets into the normal tissue was very slow (Figure 4. A – Bright field image of the adipose and thigh muscle tissue. B,C – extravasation of individual molecules (B – 0 min; C – 10 min after injection); vasculature lost fluorescence while tissue fluorescence increased. D,E – extravasation of nanodroplets; blood vessel fluorescence was retained for an hour of observation (D – 30 min; E – 60 min after injection)).
Nanodroplet extravasation into the tumor tissue was substantially faster than that into the normal tissue thus providing for effective nanodroplet tumor targeting.
Tumor-directed ultrasound significantly enhanced extravasation and tumor accumulation of both micelles and nanodroplets (Figure 5. Effect of ultrasound on extravasation: fluorescence of blood vessels dropped while that of the tumor tissue increased after ultrasound). Also note the very irregular tumor microvasculature, in contrast to that of the normal tissue shown in Figure 4.
The ultrasound effect on nanodroplets was 3-fold stronger than that on micelles, thus making nanodroplets better drug carriers for ultrasound-mediated drug delivery.
On the negative side, some premature drug release into the circulation, preceding tumor accumulation, was observed. This suggests directions for further improvement of nanoemulsion formulations.
University of Virginia
Box 800759, Health System
Charlottesville, VA 22908
Popular version of paper 3aBA
Presented Wednesday morning, May 25, 2016
171st ASA Meeting, Salt Lake City
Parkinson’s disease is characterized by the degeneration of nerve cells in the brain, often leading to poor balance, difficulties with walking, muscle pain and rigidity, tremors and involuntary movements, dementia, and memory loss. Fortunately, new gene therapy approaches for treating the root cause of the problem (i.e. neural cell degeneration) are beginning to show some pre-clinical success. These approaches involve introducing genes for neurotrophic factors [i.e. glial derived neurotrophic factor (GDNF)] into well-defined regions of the brain that are affected by neurodegeneration. Once the gene is introduced into the neural cells, the hope is that they will begin to manufacture the neurotrophic protein which, in turn, will halt neural degeneration. However, as currently implemented, there are significant weaknesses to such approaches. Foremost, the genes must be delivered by direct injection through a needle and/or infection-prone indwelling catheters. In addition to being highly invasive procedures, these direct injection approaches are unlikely to yield a homogeneous distribution of the gene in the target region of the brain. Indeed, due to both the fact that human gene therapy is in its genesis and that these current gene delivery procedures are highly invasive, only patients with very advanced disease will be considered candidates for treatment at first. This is unfortunate because, ideally, patients should be treated before significant degeneration occurs. Thus, the ultimate goal is to develop a new and minimally-invasive gene delivery approach for Parkinson’s that would incur minimal risk to the patient and therefore be safe enough to apply to healthy “early-stage” patients who are just beginning to exhibit symptoms.
Our proposed approach entails delivering non-viral neurotrophic gene-bearing nanocarriers to specific regions of the brain following their intravenous injection into the bloodstream. To achieve this, two physical barriers to gene delivery must be overcome. The first is the barrier offered by brain tissue itself, the so-called brain-tissue barrier (BTB). Our collaborators at Johns Hopkins University have developed a new technology that allows nanoparticles to diffuse easily through the BTB. These so-called “brain-penetrating nanoparticles” exhibit uniform, long-lasting, and effective delivery. The second barrier to delivery is the blood-brain barrier (BBB), the essentially impenetrable membrane created by brain capillaries that separates the bloodstream from brain tissue. The Price lab at the University of Virginia has been studying how the BBB may be opened in a site selective manner for targeted drug and gene delivery. In essence, they have shown that applying focused ultrasound energy to the brain after the injection of micron-sized gas bubbles (FDA approved for other applications) can open the BBB (Figure 1). Of particular importance to this project, the Price group has demonstrated that opening the BBB with this targeted technology permits the delivery of brain-penetrating nanoparticles (fabricated in the Hanes lab) from the bloodstream to the tissue. The nanoparticles are transported by diffusion and convection to the brain and distribute evenly throughout, yielding homogeneous delivery without an invasive transcranial injection.
Advancing this concept to the clinic as a treatment for Parkinson’s disease will require testing the efficacy of the approach in a small animal model of neurodegeneration. Here, we first delivered non-viral reporter gene nanoparticles to rat brain using focused ultrasound, resulting in robust dose-dependent gene expression, only in the region exposed to ultrasound, through day 28. We also measured a transfection efficiency (i.e. the percentage of cells expressing the delivered gene) at > 40%. Toxicity was not evident. We then tested whether the approach had therapeutic potential for treating Parkinson’s disease by delivering neurotrophic (GDNF) gene nanoparticles to the striatum of Parkinson’s rats. MR images of BBB opening with focused ultrasound in the striatum are shown in Figure 2. For treated rats, motor impairment tests (apomorphine-induced rotation and cylinder) revealed significant improvement and dopaminergic neuron density was fully restored in key brain structures (i.e. striatum and substantia nigra pars compacta). We conclude that image-guided nanoparticle delivery with focused ultrasound is a safe and non-invasive strategy for brain transfection that has potential to be translated into a non-invasive clinical treatment for Parkinson’s disease.
Figure 1. Transcranial focused ultrasound achieves non-invasive, safe, repeated and targeted blood-brain barrier disruption, leading to improved drug or gene delivery.
Figure 2. Left: MR image showing structure of the striatum (outlined in yellow), which is the brain region targeted for treatment. Right: MR image of the striatum after treatment with focused ultrasound. The 4 bright spots show where the BBB has been opened in the striatum, allowing for the delivery of gene nanoparticles.
14431 Ventura Blvd #200
Sherman Oaks, CA 91423
Popular version of paper 2aMU4
Presented Tuesday morning, May 24, 2016
There exist a number of ways the human vocal folds can vibrate which create unique sounds used in singing. The two most common vibrational patterns of the vocal folds are commonly called “chest voice” and “head voice”, with chest voice sounding like speaking or yelling and head voice sounding more flute-like or like screaming on high pitches. In the operatic singing tradition, men sing primarily in chest voice while women sing primarily in their head voice. However, in rock singing, men often emit high screams using their head voice while female rock singers use almost exclusively their chest voice for high notes.
Vocal fold vibrational pattern differences are only a part of the story though, since the shaping of the throat, mouth and nose (the vocal tract) plays a large part in the perception of the final sound. That means that head voice can be made to “sound” like chest voice on high screams using vocal tract shaping, and only the most experienced listener can determine if the vocal register used was chest or head voice.
Using spectrographic analysis, differences and similarities between operatic and rock singers can be seen. One similarity between the two is the heightened output of a resonance commonly called “ring”. This resonance, when amplified by vocal tract shaping, creates a piercing sound that’s perceived by the listener as extremely loud. The amplified ring harmonics can be seen in the 3,000 Hz band in both the male opera sample and in rock singing samples:
MALE OPERA – HIGH B (B4…494 Hz) CHEST VOICE
MALE ROCK – HIGH E (E5…659 Hz) CHEST VOICE
MALE ROCK – HIGH G (G5…784 Hz) HEAD VOICE
Though each of these three male singers exhibits a unique frequency signature, each of them, whether singing in chest or head voice, uses the amplified ring strategy in the 3,000 Hz range to amplify his sound and create excitement.
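As a rough illustration of where the ring band falls relative to each sung note, the sketch below computes the three pitches from the standard equal-temperament formula (A4 = 440 Hz, MIDI note numbering) and finds which harmonic of each note lies closest to 3,000 Hz. The function names are illustrative, and the choice of exactly 3,000 Hz as the centre of the band is an assumption taken from the rounded figure in the text.

```python
RING_HZ = 3000.0  # assumed centre of the "ring" band


def note_frequency(midi_note, a4_hz=440.0):
    """Equal-temperament frequency for a MIDI note number (A4 = 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12.0)


def nearest_harmonic(fundamental_hz, target_hz=RING_HZ):
    """Return (harmonic number, its frequency) closest to target_hz."""
    n = max(1, round(target_hz / fundamental_hz))
    return n, n * fundamental_hz


# MIDI numbers for the three notes named in the samples above.
notes = {"B4": 71, "E5": 76, "G5": 79}

for name, midi in notes.items():
    fo = note_frequency(midi)
    n, freq = nearest_harmonic(fo)
    print(f"{name}: fo = {fo:.0f} Hz, harmonic {n} at {freq:.0f} Hz")
```

The higher the sung note, the fewer harmonics lie below the ring band, so a single harmonic near 3,000 Hz carries more of the effect.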