Lexical statistics, predictability, and eye movements.

Scott McDonald and Richard Shillcock

Department of Psychology
University of Edinburgh
Edinburgh, Scotland

Although it is well-established that contextual predictability can affect eye movement behaviour during reading, the source and temporal locus of such effects are not yet clear. Several studies have shown that manipulations of 'high-level' contextual constraint (i.e. requiring the integration of the meanings of individual words in the context) primarily influence 'late' processing measures, such as second pass reading time and regression probability (e.g. Calvo & Meseguer, 2002). However, the evidence for effects on 'early' measures such as initial fixation duration is mixed (e.g. Binder et al., 1999; Rayner & Well, 1996), and there has been little attempt to distinguish 'high-level' from 'low-level' predictability (but cf. MacDonald, 1993).

In two eye-tracking studies, we investigated the potential influence on early processing of a simple low-level source of predictability: the transitional probability of a pair of words (i.e. P(word2|word1)) computed from the British National Corpus.
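As an illustration of the measure only, the following minimal sketch estimates P(word2|word1) from adjacent word-pair counts. The function name `transitional_probabilities` and the toy token list are assumptions made for this example; they are not the corpus, tokenisation, or counting procedure used in the experiments.

```python
from collections import Counter


def transitional_probabilities(tokens):
    """Return a function estimating P(word2 | word1) from a list of tokens.

    Hypothetical helper for illustration; no smoothing is applied.
    """
    # Count each word in the positions where it can start a word pair
    # (every token except the last one).
    first_counts = Counter(tokens[:-1])
    # Count every adjacent (word1, word2) pair.
    pair_counts = Counter(zip(tokens, tokens[1:]))

    def prob(word1, word2):
        # An unseen word1 gives no basis for an estimate; return 0.0.
        if first_counts[word1] == 0:
            return 0.0
        return pair_counts[(word1, word2)] / first_counts[word1]

    return prob


# Toy data (assumed), standing in for a large corpus.
toy_tokens = (
    "we avoid confusion and they avoid confusion but you avoid discovery"
).split()

p = transitional_probabilities(toy_tokens)
print(p("avoid", "confusion"))
print(p("avoid", "discovery"))
```

In this toy sample "avoid" is followed by "confusion" twice and by "discovery" once, so the first estimate is higher than the second. A real corpus study would also need decisions about lemmatisation and sparse counts, which this sketch leaves out.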

For Experiment 1, we constructed sentence pairs (see [1]) where the length and frequency of the noun immediately following the verb were closely controlled on a pairwise basis, and the neutral prior context was held constant. Only the transitional probability of the verb-noun pair was varied (e.g. the target noun "confusion" [1a] is more predictable than "discovery" [1b], given the verb "avoid"). Importantly, the high- and low-probability conditions were matched for their rated plausibility by a separate group of subjects.

1a. One way to avoid confusion is to make the changes during vacation.
b. One way to avoid discovery is to make the changes during vacation.

First-pass eye movement data revealed an early effect of low-level predictability on the target noun: first fixation durations were significantly shorter for the high-transitional probability condition. Surprisingly, there was no influence on skipping rate, suggesting that it is the 'when' and not the 'where' of eye movement control that was affected. A time-course analysis showed that predictability effects emerged at approximately 150 ms (frequency distributions diverged at this point), comparable to the 150-175 ms reported for the emergence of frequency effects (Vitu et al., 2001).

These results were confirmed in Experiment 2, where we manipulated the transitional probabilities of verb+closed class word pairs:

2a. Today she is to preside over morning coffee while her boss attends the meetings.
b. Today she is to preside from morning coffee to noon over two public meetings.

Our experiments have demonstrated an early influence of low-level predictability on eye movements, not easily attributable to high-level processes concerned with the construction of integrated semantic representations or elaborative inferencing. We suggest that the processor is able to rapidly draw upon statistical information about word contingencies in order to predict the identity of upcoming words. As both word frequency and transitional probability give rise to early effects, we are currently exploring the viability of a probabilistic model incorporating both, where frequency is viewed as the a priori probability of the word occurring.

References

Binder, K. S., Pollatsek, A. & Rayner, K. (1999). Extraction of information to the left of the fixated word in reading. Journal of Experimental Psychology: Human Perception and Performance, 25, 1162-1172.

Calvo, M. G. & Meseguer, E. (2002). Eye movements and processing stages in reading: Relative contributions of visual, lexical, and contextual factors. The Spanish Journal of Psychology, 5, 1138-7416.

MacDonald, M. (1993). The interaction of lexical and syntactic ambiguity. Journal of Memory & Language, 32, 692-715.

Rayner, K. & Well, A. D. (1996). Effects of contextual constraint on eye movements in reading: A further examination. Psychonomic Bulletin & Review, 3, 504-509.

Vitu, F., McConkie, G., Kerr, P. & O'Regan, J. K. (2001). Fixation location effects on fixation durations during reading: an inverted optimal viewing position effect. Vision Research, 41, 3513-3533.