Popular version of paper 2aSC4
Presented Tuesday afternoon, December 2, 1997
134th ASA Meeting, San Diego, CA
Embargoed until December 2, 1997
In every language there are tight restrictions on how sounds can be
combined to make words. English, for example, has the sounds /p/ and /f/,
but "pfropf" is not a possible word of English. This is not because it is
unpronounceable -- German speakers can pronounce it (it means "stopper") --
but because native English speakers reject *any* word which begins or ends
with the consonant cluster /pf/.

Without having to think about it consciously, people seem to know which
sequences their native language allows and which ones it forbids -- the
"phonotactics" of their language. The aim of this experiment is to find
out *how* they know.

There are two main views on this question. One, popular among linguists,
is that knowing a language involves unconsciously knowing a set of rules --
in this case, phonotactic rules that forbid particular sound sequences like
/pf/. Another, originating with the school of "connectionist" psychology,
holds that there are no such rules; English speakers find /pfropf/
objectionable simply because it is too different from all of the thousands
of words in their vocabulary, none of which contains /pf/. In this theory,
phonotactics emerges from the statistics of English vocabulary, organized
as a neural network (McClelland and Elman's TRACE model).

In order to test the competing theories, we exploit an effect discovered
by Massaro and Cohen in 1983: listeners seem to use phonotactics in
deciding how to interpret an acoustically vague sound.

It is possible, for instance, to digitally synthesize sounds that are
acoustically between /r/ and /l/, ranging from very /r/-like, to middling,
to very /l/-like, and to insert these ambiguous sounds into words. Massaro
and Cohen presented English speakers with syllables containing such sounds,
and asked them to say which ones sounded more like /r/ and which more like
/l/. They found that the boundary between "more like /r/" and "more like
/l/" depended on what came before the ambiguous sound: After /t/, the
boundary was very close to /l/, while after /s/, it was very close to /r/.
After /t/, /l/ is forbidden. Similarly, after /s/, /r/ is not
permitted. People's judgments of the ambiguous sounds were distributed
like this:

(1) The Massaro-Cohen experiment

/tl/-/tr/:  /l/ -- more like /l/ -- | -------------- more like /r/ -------------- /r/
/sl/-/sr/:  /l/ -------------- more like /l/ -------------- | -- more like /r/ -- /r/

(The listener heard a sound sequence consisting of /t/
or /s/ plus either a pure /l/, a pure /r/, or an acoustically ambiguous
sound somewhere between /l/ and /r/. The horizontal axis shows the
gradation between /l/ and /r/ sounds. The extreme left endpoint represents
a pure /l/ and the extreme right represents a pure /r/. In between
are synthesized sounds that lie in between /l/ and /r/. The ones
closer to /l/ are more acoustically similar to /l/ and the ones closer
to /r/ are more acoustically similar to /r/. However, since /tl/
is a forbidden sequence in American English, listeners tended to hear /tr/
sounds, even when the ambiguous sound was acoustically closer to /l/.
The character "|" marks the boundary between when listeners heard /tl/
and when they heard /tr/. Similarly, listeners heard /sl/ more frequently
than /sr/, even when the synthesized sound was acoustically more similar
to /r/.)

The English-speaking listeners heard more of the interval as /r/ after
/t/ (/tl/ is illegal, while /tr/ is okay), and more of the interval as
/l/ after /s/ (/sl/ is legal, but /sr/ is not). The /l/-/r/ boundary is
apparently shifted by the phonotactics of English: listeners hear the
illegal sound only when the acoustic evidence is overwhelming. In other
words, the mechanisms of speech perception seem to be taking into account
the *plausibility* of an /l/ or /r/ in the given context, and demanding
stronger acoustic evidence before assenting to the less plausible
hypothesis.

According to the rule-based theory, plausibility is determined by consulting
a phonotactic rule. According to the connectionist theory, plausibility is
determined by looking at real words that are similar to the context. The
connectionist theory therefore predicts that the strength of the
phonotactic effect (i.e., the size of the boundary shift) should depend on
the composition of the set of similar words: a context that is similar to
many words, or to a few very common words, should produce a large boundary
shift, while one that is similar to very few or very rare words should
produce a small shift. On the other hand, the rule-based theory expects
equal shifts in all contexts, since the rule is an impartial ban on *all*
occurrences of the illegal sequence, irrespective of context.

The experiment worked like this: We synthesized English nonsense words
ending in closed syllables like "grihdj" (rhymes with "fridge"), "greedj",
"krihdj", and "kreedj", with stress on the final syllable, then lopped off
the final consonant to make nonsense words ending in the open syllables
"grih", "gree", "krih", and "kree." We took pairs of these words that
differed only in the vowel, and synthesized words with acoustically
ambiguous vowels that were in between the endpoints, to produce four
scales that gradually changed from one word to the other in five steps.
Example:

(2) Typical stimulus family

CLOSED/g  pulgrihdj ---- 1 ---- 2 ---- 3 ---- 4 ---- 5 ---- pulgreedj
CLOSED/k  pulkrihdj ---- 1 ---- 2 ---- 3 ---- 4 ---- 5 ---- pulkreedj
OPEN/g    pulgrih   ---- 1 ---- 2 ---- 3 ---- 4 ---- 5 ---- pulgree
OPEN/k    pulkrih   ---- 1 ---- 2 ---- 3 ---- 4 ---- 5 ---- pulkree

[CLOSED indicates that the vowel sound of interest occurs
in the middle of a word, and OPEN indicates that it occurs at the
end of a word. 1, 2, 3, 4, 5 represent words with ambiguous vowel
sounds lying acoustically in between the vowel sounds at the two endpoints.
For example, in the top line, we took the two nonsense words "pulgrihdj"
(vowel sound "ih") and "pulgreedj" (vowel sound "ee") and constructed 5
words whose vowel sounds lie in between the "ih" and "ee."]

English-speaking listeners heard a series of trials that went like this:
First they heard one endpoint, then they heard an ambiguous word, then
they heard the other endpoint. They were asked to say which endpoint the
ambiguous word was closer to. From their responses, we could determine
the location of the boundary between "ih" and "ee" in each of the four
contexts.

Now, American English does not tolerate "ih" at the end of a word. There
are no words ending in that vowel, and you can't make up new ones. (For
example: The letter i in "delicatessen" is pronounced "ih", but when the
word is shortened to "deli" it has to be pronounced "ee", since words
cannot end in "ih".) This means that "ih" is legal in the CLOSED contexts,
but not in the OPEN ones. The Massaro-Cohen effect should therefore move
the "ih"-"ee" boundary towards "ih" in the OPEN contexts as compared to
the CLOSED ones. This is in fact what we found -- the boundary was about a
half-step closer to "ih" in the OPEN contexts:

(3) Actual results from experiment

CLOSED/g  pulgrihdj ---- 1 ---- 2 ---- 3|---- 4 ---- 5 ---- pulgreedj
CLOSED/k  pulkrihdj ---- 1 ---- 2 ---- 3|---- 4 ---- 5 ---- pulkreedj
OPEN/g    pulgrih   ---- 1 ---- 2 --|- 3 ---- 4 ---- 5 ---- pulgree
OPEN/k    pulkrih   ---- 1 ---- 2 --|- 3 ---- 4 ---- 5 ---- pulkree

(Just as in (1), the "|" symbol divides the sounds that
were heard as "more like 'ih'" from those heard as "more like 'ee'".
It marks the location of the sound that would be heard one way 50% of the
time and the other way the other 50%.)

The location of the boundary did not depend on whether the word contained
a g or a k. This is important, because the syllable "gree" is extremely
common at the end of a word, while the others are all very rare. The set
of words that is similar to "pulgree" is much larger and more common than
the set of words that is similar to "pulkree" (out of every million words
you read or hear, about 450 end in "gree", while only 10 end in "kree").
Therefore, the connectionist theory predicts a larger boundary shift for
the g words than for the k words -- something more like this:

(4) Predictions of the statistical theory

CLOSED/g  pulgrihdj ---- 1 ---- 2 ---- 3|---- 4 ---- 5 ---- pulgreedj
CLOSED/k  pulkrihdj ---- 1 ---- 2 ---- 3|---- 4 ---- 5 ---- pulkreedj
OPEN/g    pulgrih   ---- 1 ---- 2|---- 3 ---- 4 ---- 5 ---- pulgree
OPEN/k    pulkrih   ---- 1 ---- 2 --|- 3 ---- 4 ---- 5 ---- pulkree

Since this isn't what we got, the actual results match the predictions
of the rule-based theory -- phonotactics applies uniformly and impartially
in all contexts.

This experiment therefore supports the traditional linguistic view that
knowing a language involves knowing certain rules or constraints for
manipulating symbols, and constitutes a problem for connectionist theories
that would eliminate symbol manipulation from cognition.

(Work supported by Public Health Service Grant 5 T32 HD 07327 to the
author, and by NIH/NIDCD Grant 5 R29 DC 01708 to John Kingston).
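
Illustrative note: the "|" in (3) marks the point on each scale where a
word would be heard as "ee" half the time. The sketch below shows one
simple way such a 50% point could be located, by linear interpolation
between neighbouring steps. It is only an illustration under stated
assumptions and not the analysis used in the experiment: the function name
boundary_50 is hypothetical, and the response proportions are invented
numbers chosen to resemble a half-step shift, not data from the study.

```python
from typing import Sequence


def boundary_50(steps: Sequence[float], p_ee: Sequence[float]) -> float:
    """Locate where the proportion of "ee" responses first crosses 0.5.

    steps: positions on the scale (1..5 for the ambiguous words).
    p_ee:  proportion of "ee" responses at each position.
    Uses linear interpolation between the two steps around the crossing.
    """
    if len(steps) != len(p_ee):
        raise ValueError("steps and p_ee must have the same length")
    points = list(zip(steps, p_ee))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < 0.5 <= y1:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    raise ValueError("responses never cross 50%")


# Invented proportions, for illustration only (NOT the experiment's data).
steps = [1, 2, 3, 4, 5]
p_ee_closed = [0.05, 0.20, 0.45, 0.85, 0.95]  # e.g. a CLOSED scale
p_ee_open = [0.10, 0.30, 0.70, 0.90, 0.98]    # e.g. an OPEN scale

closed_boundary = boundary_50(steps, p_ee_closed)
open_boundary = boundary_50(steps, p_ee_open)

# A positive shift means the OPEN boundary lies closer to "ih".
shift = closed_boundary - open_boundary
print(f"closed={closed_boundary:.3f} open={open_boundary:.3f} shift={shift:.3f}")
```

Under the rule-based theory, a shift computed this way should come out
about the same for the g scales and the k scales; under the statistical
theory, it should be larger for the g scales.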