Telling the future: How speakers time word preparation and articulation
Zenzi M. Griffin
Georgia Institute of Technology
Speakers must coordinate the processing of ideas, words, and movements
over time, but they have a great deal of flexibility in how they do so. They can
start utterances with minimal preparation of their words, or only after preparing
and buffering all of their words in phonological or motor codes [1].
What information can speakers use to control the timing of speech and
preparation? Two experiments demonstrate that speakers can use a correlate
of word length to estimate how much time is available for word preparation
during speech, and so minimize the preparation that precedes speech. They
thereby minimize word buffering while maintaining fluency.
In Experiment 1, 20 speakers were asked to name 32 object pairs without
pausing between names (e.g., "scarf pipe"). Articulating "scarf"
takes less than 600 ms, but preparing "pipe" can take ~900
ms [2]. If a speaker said "scarf" as soon as it was ready, roughly 300 ms
of silence would elapse before "pipe" was ready to follow. To avoid that
pause, "scarf" must be buffered while "pipe" is prepared.
When monosyllabic and multisyllabic object names like "scarf" and
"skeleton" are matched on other dimensions, they take the
same amount of time to prepare in mixed-length lists [3,4]. "Scarf
pipe" and "skeleton pipe" should therefore take the same amount
of time to prepare, and speech onset latencies should differ only if speakers
consider the length of the first name in timing their speech. They did: speakers
began saying "skeleton" earlier than "scarf." Speakers
gazed at the long- and short-named objects equally before speaking,
but gazed at second objects more before saying short names. In contrast,
with long first words, they gazed at second objects more during speech.
Adding words that require little preparation should provide more time
to prepare second names while speaking. In Experiment 2, when speakers
said "next to" between names, latencies for long and short
names were equal. Speech began significantly earlier than when no words
intervened. When nothing intervened, these speakers replicated Experiment
1's results. Similar timing occurs in speakers' gazes while describing
scenes [5] and in sequences of arm movements [6]. These results suggest
that people are sensitive to the amount of time it takes to prepare
and perform an action. When speakers choose to, they can use this information
to minimize advance preparation and buffering of words while speaking
fluently. These findings have implications for how theories of language
production model the coordination of word preparation with articulation.
References
[1] Wheeldon, L., & Lahiri, A. (1997). Prosodic units in speech
production. Journal of Memory and Language, 37, 356-381.
[2] Snodgrass, J. G., & Yuditsky, T. (1996). Naming times for the
Snodgrass and Vanderwart pictures. Behavior Research Methods, Instruments,
& Computers, 28, 516-536.
[3] Bachoud-Levi, A.-C., Dupoux, E., Cohen, L., & Mehler, J. (1998).
Where is the length effect? A cross-linguistic study of speech production.
Journal of Memory and Language, 39, 331-346.
[4] Meyer, A. S., Roelofs, A., & Levelt, W. J. M. (in press). Word
length effects in object naming: The role of a response criterion. Journal
of Memory and Language.
[5] Griffin, Z. M., & Bock, K. (2000). What the eyes say about speaking.
Psychological Science, 11, 274-279.
[6] Ketelaars, M. A. C., Garry, M. I., & Franks, I. M. (1997). On-line
programming of simple movement sequences. Human Movement Science, 16,
461-483.