5aSCb17 – Pronunciation differences: Gender and ethnicity in Southern English – Wendy Herd, Devan Torrence, and Joy Cariño

 

Wendy Herd – wherd@english.msstate.edu

Devan Torrence – dct74@msstate.edu

Joy Carino – carinoj16@themsms.org

 

Linguistics Research Laboratory
English Department
Mississippi State University
Mississippi State, MS 39762

 

Popular version of paper 5aSCb17, “Prevoicing differences in Southern English: Gender and ethnicity effects”

Presented Friday morning, May 27, 10:05 – 12:00 in Salon F

171st ASA Meeting, Salt Lake City

 

We often notice differences in pronunciation between ourselves and other speakers. More noticeable differences, like the Southern drawl or the New York City pronunciation yuge instead of huge, are even used overtly when we guess where a given speaker is from. Our speech also varies in more subtle ways.

If you hold your hand in front of your mouth when saying tot and dot aloud, you will be able to feel a difference in the onset of vocal fold vibration. Tot begins with a sound that lacks vocal fold vibration, so a large rush of air can be felt on the hand at the beginning of the word. No such rush of air can be felt at the beginning of dot because it begins with a sound with vocal fold vibration. A similar difference can be felt when comparing [p] of pot to [b] of bot and [k] of cot to [ɡ] of got. This difference between [t] and [d] is very noticeable, but the timing of our vocal fold vibration also varies each time we pronounce a different version of [t] or [d].

Our study is particularly focused, not on the large difference between sounds like [t] and [d], but on how speakers produce the smaller differences between different [d] pronunciations. For example, an English [d] might be pronounced with no vocal fold vibration before the [d] as shown in Figure 1(a) or with vocal fold vibration before the [d] as shown in Figure 1(b). As can be heard in the accompanying sound files, the difference between these two [d] pronunciations is less noticeable for English speakers than the difference between [t] and [d].

Figure 1. Spectrogram of (a) dot with no vocal fold vibration before [d] and (b) dot with vocal fold vibration before [d]. (Only the first half of dot is shown.)

We compared the pronunciations of 40 native speakers of English from Mississippi to see if some speakers were more likely to vibrate their vocal folds before [b, d, ɡ] rather than shortly after those sounds. These speakers included equal numbers of African American participants (10 women, 10 men) and Caucasian American participants (10 women, 10 men).

 

Previous research found that men were more likely to vibrate their vocal folds before [b, d, ɡ] than women, but we found no such gender differences [1]. Men and women from Mississippi employed vocal fold vibration similarly. Instead, we found a clear effect of ethnicity. African American participants produced vocal fold vibration before initial [b, d, ɡ] 87% of the time while Caucasian American participants produced vocal fold vibration before these sounds just 37% of the time. This striking difference, which can be seen in Figure 2, is consistent with a previous smaller study that found ethnicity effects in vocal fold vibration among young adults from Florida [1, 2]. It is also consistent with descriptions of regional variation in vocal fold vibration [3].

 

Figure 2. Percentage of pronunciations produced with vocal fold vibration before [b, d, ɡ] displayed by ethnicity and gender.

The results suggest that these pronunciation differences are due to dialect variation. African American speakers from Mississippi appear to systematically use vocal fold vibration before [b, d, ɡ] to differentiate them from [p, t, k], but the Caucasian American speakers are using the cue differently and less frequently. Future research in the perception of these sounds could shed light on how speakers of different dialects vary in the way they interpret this cue. For example, if African American speakers are using this cue to differentiate [d] from [t], but Caucasian American speakers are using the same cue to add emphasis or to convey emotion, it is possible that listeners sometimes use these cues to (mis)interpret the speech of others without ever realizing it. We are currently attempting to replicate these results in other regions.

Each accompanying sound file contains two repetitions of the same word. The first repetition does not include fold vibration before the initial sound, and the second repetition does include vocal fold vibration before the initial sound.
 

  1. Ryalls, J., Zipprer, A., & Baldauff, P. (1997). A preliminary investigation of the effects of gender and race on voice onset time. Journal of Speech Language and Hearing, 40(3), 642-645.
  2. Ryalls, J., Simon, M., & Thomason, J. (2004). Voice onset time production in older Caucasian- and African-Americans. Journal of Multilingual Communication Disorders, 2(1), 61-67.
  3. Jacewicz, E., Fox, R.A., & Lyle, S. (2009). Variation in stop consonant voicing in two regional varieties of American English. Language Variation and Change, 39(3), 313-334.

 

 

Share This