Current advances in SWIFT


Cognitive Systems Research 7 (2006) 23–33 www.elsevier.com/locate/cogsys

Action editor: Erik D. Reichle

E.M. Richter *, R. Engbert, R. Kliegl

Department of Psychology, University of Potsdam, P.O. Box 601553, 14415 Potsdam, Germany

Received 12 November 2004; accepted 1 July 2005
Available online 6 September 2005

Abstract

Models of eye movement control are very useful for gaining insights into the intricate connections of different cognitive and oculomotor subsystems involved in reading. The SWIFT model (Engbert, Longtin, & Kliegl (2002). Vision Research, 42, 621–636) proposed a unified mechanism to account for all types of eye movement patterns that might be observed in reading behavior. The model is based on the notion of spatially distributed, or parallel, processing of words in a sentence. We present a refined version of SWIFT introducing a letter-based approach that proposes a processing gradient in the shape of a smooth function. We show that SWIFT extends its capabilities by accounting for distributions of landing positions.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Reading; Eye Movements; Mathematical Modeling

* Corresponding author. Tel.: +49 331 977 2127. E-mail addresses: [email protected] (E.M. Richter), engbert@rz.uni-potsdam.de (R. Engbert), [email protected] (R. Kliegl). URLs: http://www.psych.uni-potsdam.de/people/richter/index-e.html (E.M. Richter), http://www.agnld.uni-potsdam.de/~ralf (R. Engbert), http://www.psych.uni-potsdam.de/people/kliegl/index-e.html (R. Kliegl).
1389-0417/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.cogsys.2005.07.003

1. Introduction

As an everyday task, reading allows us to investigate highly diversified and complex behavior within a supposedly simple and well-defined array of stimuli. Reading is particularly suitable for a paradigmatic inquiry into the interrelationships between basic physiological and cognitive mechanisms. It is indicative of how perceptual, cognitive and motor systems interact to achieve a shared goal. With text comprehension as the "final product," we are interested in eye movements as a mediating and measurable behavioral correlate of the reading process.

During reading the eyes move in jumps, or saccades. For Latin script these are usually oriented rightward, but they can also be leftward, or regressive, in order to bring certain words into focus. The eyes fixate words for varying lengths of time. We consider these movement patterns as the product of visuomotor and cognitive systems controlling the reading process. The measurement, analysis, and modeling of eye movements are among the most fruitful attempts to understand how visual information is processed and utilized to guide behavior (Findlay & Gilchrist, 2003). The measurement of fixation times on words or parts of sentences is now considered an essential tool for studying reading (Liversedge & Findlay, 2000; Rayner, 1998), and the amount of such data has increased enormously in the recent past.

Our central goal is to uncover how these data patterns connect with underlying processes. This can hardly be accomplished without quantitative mathematical modeling. Individual aspects of eye movement behavior can certainly be described qualitatively, but an approach that tries to integrate all findings requires a unified computational model. With the quantity of data available today, such a model needs to perform on many levels.

In the family of primary oculomotor models (POC), low-level information is utilized to reproduce the respective data patterns (McConkie, Kerr, & Dyre, 1994; Reilly &

O'Regan, 1998; Suppes, 1990, 1994; Yang & McConkie, 2001). Higher-level factors have only a modulating function. The so-called cognitive models, on the other hand, assume that eye movements are driven primarily by lexical processing and can be categorized according to how they conceptualize the allocation of visual attention. Some make the assumption that lexical access, i.e. word recognition, is strongly coupled to sequential shifts of attention (SAS) from word to word, which ensures that the meanings of the words become available in the order of their appearance in the text (Morrison, 1984; Reichle, Pollatsek, Fisher, & Rayner, 1998; Reichle, Rayner, & Pollatsek, 2003). Other models assume guidance by attentional gradients (GAG; Engbert, Longtin, & Kliegl, 2002; Legge, Klitz, & Tjan, 1997; Reilly & Radach, 2003), assigning processing capacities to more than one word in view under physiologically plausible constraints. For a thorough review see Reichle et al. (2003).

A model that claims to describe the reading process accurately needs to integrate oculomotor as well as lexical factors. E-Z Reader is certainly the most advanced SAS-based model and meets this requirement in many respects by addressing such phenomena as landing site distributions (cf. Reichle, Rayner, & Pollatsek, 1999), refixation behavior, word frequency effects, etc. Note that the SAS models' high-level assumption of a lexical control loop can be replaced without loss of accuracy with autonomous saccade generation, a low-level oculomotor mechanism, as demonstrated by Engbert and Kliegl (2001).

SWIFT¹ (Engbert et al., 2002), a member of the GAG model family, proposes parallel processing within a window spanning four words, thereby rejecting the strong assumption of sequential attention shifts. In its former version (Engbert et al., 2002), SWIFT already promoted the notion of a processing gradient by assigning a larger processing rate to the foveal word than to the parafoveal words. In spite of its simplicity, this turned out to be a viable alternative to sequential attention shifts.

We will proceed with an overview of our current considerations regarding the refinement of the SWIFT model. We aim to enable it to integrate a larger number of findings (cf. Radach, Kennedy, & Rayner, 2004) than was possible with the former version. SWIFT is compatible with the general framework of saccade generation by Findlay and Walker (1999) as well as the theory of movement preparation by Erlhagen and Schöner (2002). Following the principle of minimal modeling, we attempted to involve as few assumptions as possible. All assumptions are to be physiologically and psychologically plausible. They must not be overly specific to reading, in order to allow for an extension of the model to other domains such as visual search. The model is designed to provide one general mechanism that is able to explain all types of eye movements in reading, such as forward and regressive saccades, refixations, and word skippings.

In the following sections, we discuss modeling goals, current advances in SWIFT with an overview of changes in the formalism, and numerical simulations and evaluations of the model's performance with respect to the modeling goals.

1 SWIFT is the acronym of Saccade generation With Inhibition by Foveal Targets.

2 We computed predictabilities from incremental guesses obtained from 83 subjects for each word in our sentence corpus.

2. Modeling objectives

Eye movements have been found to be affected by language processing, particularly when reading multisentence texts (e.g. Frazier, Pacht, & Rayner, 1999; Frazier & Rayner, 1990). Yet it appears that in the situation of reading single sentences, eye-movement control can be explained without a sophisticated model of language processing (Reichle et al., 2003). Instead, straightforward rules for word recognition suffice to explain most of the variance of experimental results. We will focus on experimental phenomena that we tried to reproduce with our model, and on methods to evaluate the model's performance.

2.1. Empirical requirements

Word length, printed word frequency, and predictability (defined as the probability of guessing a word from reading the preceding words of the sentence) are word properties that have proven useful for predicting fixation durations and other experimental phenomena. These word properties are therefore commonly used as independent variables in the modeling tradition despite their substantial correlations. Unlike frequency, which can be determined from a large text corpus (Baayen, Piepenbrock, & Rijn, 1993), predictability depends on the context of any particular sentence. It has to be determined experimentally, employing, e.g., the incremental cloze task.² Typical effects are that longer words are likely to be inspected in more and longer fixations, whereas more frequent and more predictable words are skipped more often.

Quantitative dependent variables are spatial (probabilities of different types of saccades) or temporal (fixation durations) in nature, or some combination of these factors. The following dependent measures are typically reported in the literature.

2.1.1. Fixation durations

Inspection times have been used as a central tool to examine information processing in reading. We employ three separate, non-overlapping populations of fixations related to individual words, i.e. single fixation duration (SF), and first and second fixation duration (F1, F2) for refixated words. These measures are computed only for first-pass reading, meaning that we consider only fixations that are located on the rightmost word of the current (word-based) fixation sequence. In addition, we compute the total reading time (TT), or the sum of all inspection times regardless of the fixation sequence. Since computational models should account for these fixation duration distributions, the deviation of simulated data from empirical data is one source for the computation of our goodness-of-fit measure.

2.1.2. Fixation probabilities

Again, based on first-pass reading, the probabilities of skipping a word (P0), fixating it twice (P2), and fixating it three or more times (P3+) are calculated. The probability of a single fixation (P1) is redundant, since it can be computed from the other probabilities. Since SWIFT also accounts for regressions, we extend this set of measures with the interword regression probability (PR), i.e. the probability of a word being the target of a regressive saccade. Note that this is a measure relating to second-pass reading, like TT.

2.1.3. Effects of word length and frequency

The dependent measures are commonly summarized as functions of word length as well as classes of logarithmic word frequency. We use these functions as "benchmark" data in the sense that we base our fitting coefficient mainly on these curves.

2.1.4. Landing positions

Both random and systematic errors of the oculomotor system influence reading behavior (McConkie, Kerr, Reddix, & Zola, 1988). As a consequence, we observe a systematic deviation of mean landing positions from an assumed optimal viewing position. We are able to conduct all of the above-mentioned analyses on one large data set (Kliegl, Grabner, Rolfs, & Engbert, 2004).
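The fixation probabilities of Section 2.1.2 can be illustrated with a minimal Python sketch. It uses the word-based fixation sequence {1, 3, 4, 5, 5, 7, 8, 7} from the simulation example in Section 4.2.1. The function names, the sentence length of eight words, and the simplified first-pass rule (a fixation counts as first pass if it lands on the rightmost word fixated so far) are assumptions made for illustration, not the authors' analysis code.

```python
from collections import Counter

def first_pass_counts(fixated_words, n_words):
    """Count first-pass fixations per word (words are numbered from 1).

    Assumption: a fixation belongs to first-pass reading if it lands on the
    rightmost word fixated so far; all other fixations are second pass.
    """
    counts = Counter()
    rightmost = 0
    for word in fixated_words:
        if word >= rightmost:
            counts[word] += 1
            rightmost = word
    return [counts[w] for w in range(1, n_words + 1)]

def fixation_probabilities(counts):
    """Return P0, P1, P2 and P3+ from per-word first-pass fixation counts."""
    n = len(counts)
    p0 = sum(c == 0 for c in counts) / n
    p2 = sum(c == 2 for c in counts) / n
    p3_plus = sum(c >= 3 for c in counts) / n
    p1 = 1.0 - p0 - p2 - p3_plus  # redundant: follows from the other three
    return {"P0": p0, "P1": p1, "P2": p2, "P3+": p3_plus}

# Fixation sequence quoted in Section 4.2.1; a sentence of 8 words is assumed.
sequence = [1, 3, 4, 5, 5, 7, 8, 7]
counts = first_pass_counts(sequence, n_words=8)
probs = fixation_probabilities(counts)
```

In this toy case words 2 and 6 receive no first-pass fixation, word 5 receives two, and the final fixation on word 7 is excluded because it follows a fixation on word 8. The actual analysis additionally excludes the first and last word of each sentence and aggregates over many sentences and subjects.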

3. Outline of the model

Dynamic field theory (Erlhagen & Schöner, 2002) proposes a function of space and time, representing an activation field, that is distributed over a number of potential movement targets. To describe the evolution of that field, we rely on concepts from the theory of nonlinear dynamic systems (Engbert, Longtin, & Kliegl, 2004). Moreover, different subsystems (e.g. vision, memory, cognition, oculomotion) are allowed to crosstalk continuously. These attributes taken together render the theory well suited to research on eye movement control in reading. Thus, the concept of a movement-planning field from the dynamic field theory of movement preparation motivated the activation field for word targeting in SWIFT. However, we did not adopt the explicit mathematical formalism of the dynamic field theory, since it proposes a framework for movement planning without explicit reference to lexical processing demands. Hence, the formalism was simplified in order to allow for a straightforward incorporation of word recognition to account for lexical effects on eye movements in reading.³

As in the former version of SWIFT (Engbert et al., 2002), we use a rather simple, one-dimensional activation field. However, this already implies spatially distributed processing, i.e. several words can be processed in parallel. A model of eye movement control in reading should be plausible in light of the accumulated neurophysiological knowledge about saccade generation. SWIFT can be seen as a special case of the very general model proposed by Findlay and Walker (1999). As such, SWIFT is theoretically generalizable, e.g. to a two-dimensional, visual-search-like task (see Trukenbrod & Engbert, in preparation).

3.1. Theoretical assumptions

Only one core assumption (Principle V) has been added to the former version of SWIFT (Engbert et al., 2002). Yet, we concisely repeat the others here as well. Thus the core assumptions of SWIFT are as follows.

• Principle I: Spatially distributed processing of an activation field. Due to the dynamic-field approach (Erlhagen & Schöner, 2002), the model produces all types of saccades in the course of a competition among words with different activations. The dynamic field of lexical activations evolves as several words are processed in parallel.
• Principle II: Separate pathways for saccade timing and target selection. Neurophysiological findings suggest a distinction between temporal and spatial aspects of saccade generation (Findlay & Walker, 1999). SWIFT integrates these two aspects at different stages of the saccade programming scheme (see Principle IV).
• Principle III: Random saccade generation with inhibition by foveal targets. Autonomous generation of saccade programs alone would lead to random fixation durations. Therefore, the autonomous timer is modulated by a foveal inhibition process, which is able to decelerate the reading rate for longer inspection times on difficult words.
• Principle IV: Two-stage saccade programming. Saccade programming is understood as a two-stage process, motivated by Becker and Jürgens' (1979) findings on the double-step paradigm. A preparatory labile stage is followed by a non-labile stage during which active programs can no longer be cancelled.
• Principle V: Systematic and random errors in saccade lengths. Following McConkie et al. (1988), we introduce systematic as well as random oculomotor errors.

An illustration of the model's architecture is given in Fig. 1. The next section addresses our translation of the principles into mathematical terms, which allows for an implementation of the model on a computer to generate artificial data by simulations that can be compared to actual reading data.

3 Prospectively, dynamic field theory's concept of an interaction between local excitation and global inhibition might also prove useful to contribute to a coherent account of eye movements in reading.

Fig. 1. Outline of SWIFT. Saccade programming is guided by separate temporal and spatial pathways. An autonomous random timer triggers new saccade programs and is itself susceptible to foveal inhibition. The set of lexical activations of all words in a sentence evolves dynamically and can be conceived of as a saliency map.

3.2. Mathematical description

The details from the original version of SWIFT (Engbert et al., 2002) are summarized briefly, whereas the new additions are explained in more detail.

3.2.1. Dynamic field of activation

At any point in time, activation values $a_n(t)$ are assigned to each of $N_w$ words in a sentence. The dynamic field $\{a_n(t)\}$ of activations is a one-dimensional array that establishes a saliency map for target selection. The actual length of the sentence need not be known in advance. The activation value of a word expresses the current extent of processing in the course of its identification. It increases in a preprocessing stage until a given maximum is reached and decreases in the subsequent lexical completion stage. Upcoming saccade targets are determined by a simple transformation of the activation field into a discrete probability distribution.

3.2.2. Word difficulty

As in the former version of the model (Engbert et al., 2002; see also Engbert & Kliegl, 2001; Reichle et al., 1998), we assume that word difficulty $L_n$ limits the maximum lexical activation for a given word $n$. We assume that word difficulty is determined by printed word frequency $f_n$ and predictability $p_n$ of word $n$ as follows:

$$L_n = \underbrace{(\alpha - \beta \log f_n)}_{\text{frequency factor}} \; \underbrace{(1 - \theta p_n)}_{\text{predictability factor}}, \tag{1}$$

with the free parameters $\alpha$ as the maximum of the frequency term for low-frequency words, $\beta$ as a measure of the frequency effect, and $\theta$ as a measure of the effect of predictability on lexical processing time. The first factor addresses the strong dependency of visually based lexical retrieval on word frequency. The predictability factor takes into account the effects of context on lexical difficulty. The parameter $\theta$, whose range was set between 0 and 1, allows for the attenuation of the impact of predictability. Note that Eq. (1) implies that visual processing and word prediction are independent factors, which of course is an idealization.

3.2.3. Lexical processing rate

This part constitutes a major refinement of SWIFT. The notion of a processing gradient has been broken down to the letter level. The lexical processing rate, denoted by $\lambda(\epsilon) > 0$, is a function of the physical distance of a letter from the current fixation position, the eccentricity $\epsilon$. The fixation position at time $t$ is denoted by $k(t)$ and can attain real values between 0 and the number $z$ of all characters, spaces and punctuation marks of a given sentence. For the purpose of word-based analyses, a fixation on a space is counted as a fixation on the adjacent word to the right. The eccentricity of letter $j$ of word $n$ with respect to the current fixation position is given by

$$\epsilon_{nj}(t) = x_{nj} - k(t), \tag{2}$$

where $x_{nj}$ is the position of letter $j$ of word $n$ in sentence coordinates. We postulate that processing speed is mainly limited by visual acuity (see Legge, Hooven, Klitz, Mansfield, & Tjan, 2002), which is a function of eccentricity. Furthermore, we hypothesize a global attentional allocation to the default reading direction (e.g. Rayner, 1998), which leads us to a first approximation of the gradient of processing rate as an asymmetric Gaussian function. Thus, the lexical processing rate for a given eccentricity $\epsilon$ is computed as

$$\lambda(\epsilon) = \lambda_0 \exp\left(-\frac{\epsilon^2}{2\sigma^2}\right) \quad \text{with} \quad \sigma = \begin{cases} \sigma_L, & \text{if } \epsilon < 0, \\ \sigma_R, & \text{if } \epsilon \ge 0, \end{cases} \tag{3}$$

where $\sigma_L$ and $\sigma_R$ characterize the width of the processing span to the left and to the right, respectively (Fig. 2). To obtain a proper density function, the normalization constant has to attain the value

$$\lambda_0 = \sqrt{\frac{2}{\pi}} \, \frac{1}{\sigma_R + \sigma_L}. \tag{4}$$

We assign a processing rate to a word by averaging the processing rates of its letters, i.e.

$$\lambda_n(t) = \frac{1}{M_n} \sum_{j=1}^{M_n} \lambda(\epsilon_{nj}(t)), \tag{5}$$

where $M_n$ is the length of word $n$ in letters.

3.2.4. Temporal evolution of the field of activations

As described above, word identification is modeled as a two-stage process (see also Engbert & Kliegl, 2001; Reichle

et al., 1998). Concomitant lexical activation rises from 0 to $L_n$ within time $t_p$, i.e. from unawareness of the word to completed preprocessing. In the course of the subsequent lexical completion, activation falls back to 0, indicating that the word has been completely processed. The only new aspect here is that we allow for a global decay of activation, which could reasonably be tied to a memory leakage that affects the entire saliency map. The evolution of the dynamically changing field of activations is described by a system of ordinary differential equations (ODE)

$$\frac{da_n(t)}{dt} = \begin{cases} +f \, \lambda_n(t) - \omega, & \text{if } t < t_p \text{ (preprocessing)}, \\ -\lambda_n(t) - \omega, & \text{if } t \ge t_p \text{ (lexical completion)}, \end{cases} \tag{6}$$

with free parameters $f > 1$ as a preprocessing factor, i.e. build-up is faster than decrease, and $\omega$ as the strength of a global decay process. Here $\lambda_n(t)$ is the word's processing rate from Eq. (5). Preprocessing is conceived of as a preliminary stage of identification with the main purpose of entering a word into the set of possible saccade targets as soon as it first appears in the attentional window. Its function within the model is that of a mechanism, and therefore it is not of equal rank to lexical processing. As will be demonstrated later, it is about 90 times faster than the lexical-completion branch of the ODE.

Fig. 2. The gradient. Lexical processing rate (vertical axis) as a function of eccentricity (horizontal axis, centered on 0) is modeled as a normalized asymmetric Gaussian function with free parameters $\sigma_L$ and $\sigma_R$.

3.2.5. Saccade target selection

As mentioned earlier, the rationale behind saccade target selection is quite straightforward. We understand the selection process as being probabilistic and competitive. The model assigns a selection probability to each word by a straightforward transformation of its lexical activation. Thus, the probability $p(n, t)$ to select word $n$ as a saccade target at time $t$ is given by its relative lexical activation

$$p(n, t) = \frac{a_n(t)}{\sum_{j=1}^{N_w} a_j(t)}. \tag{7}$$

This assumption yields a mechanism which lies somewhere between a completely random selection and a winner-takes-all target selection.
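The processing gradient of Eqs. (3)–(5) and the target selection rule of Eq. (7) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, word indices are 0-based, and the numerical values in the usage lines are arbitrary.

```python
import math
import random

def processing_rate(eccentricity, sigma_left, sigma_right):
    """Asymmetric Gaussian processing rate, Eqs. (3) and (4)."""
    lambda_0 = math.sqrt(2.0 / math.pi) / (sigma_right + sigma_left)
    sigma = sigma_left if eccentricity < 0 else sigma_right
    return lambda_0 * math.exp(-eccentricity ** 2 / (2.0 * sigma ** 2))

def word_rate(letter_positions, fixation_position, sigma_left, sigma_right):
    """Mean processing rate over the letters of one word, Eq. (5)."""
    rates = [processing_rate(x - fixation_position, sigma_left, sigma_right)
             for x in letter_positions]
    return sum(rates) / len(rates)

def target_probabilities(activations):
    """Relative lexical activations, Eq. (7)."""
    total = sum(activations)
    if total <= 0:
        raise ValueError("at least one word must have positive activation")
    return [a / total for a in activations]

def select_target(activations, rng=random):
    """Draw a 0-based word index with the probabilities of Eq. (7)."""
    weights = target_probabilities(activations)
    return rng.choices(range(len(activations)), weights=weights, k=1)[0]

# Arbitrary illustrative values (not parameter estimates from the paper).
rate = word_rate([10, 11, 12, 13], fixation_position=9.5,
                 sigma_left=2.0, sigma_right=4.0)
probs = target_probabilities([0.0, 2.0, 6.0, 2.0])
target = select_target([0.0, 2.0, 6.0, 2.0])
```

Because selection is a weighted random draw, the most active word is the most likely target but not a guaranteed one, which is the intermediate position between random and winner-takes-all selection described above.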

3.2.6. Random timing and foveal inhibition

Saccade timing is defined as a stochastic process that can be modulated by foveal processing demands. The interval between two trigger signals that initiate new saccade programs is a $\gamma$-distributed random variable with mean $t_{sac}$ and a standard deviation of $\frac{1}{3} t_{sac}$ (which corresponds to a gamma distribution with shape parameter 9). The free parameter $t_{sac}$ reflects a reader's individual mean reading rate as well as the difficulty of the text at hand. The duration of a fixation on a word is modulated by the extent of foveal activation. Let $t_i$ be the time of initiation of the saccade program for saccade $i$. The command to initiate the program for saccade $i + 1$ will occur after an interval $\Delta t_{i+1}$, which is drawn from the above-mentioned $\gamma$-distribution. We assume that this interval can be prolonged by a foveal inhibition process. Thus, the next saccade program $i + 1$ will be triggered if

$$t > t_i + \Delta t_{i+1} + h \, a_k(t), \tag{8}$$

where the free parameter $h$ denotes the strength of the inhibitory process, $a_k(t)$ is the activation of the foveal word, and $t$ is the time elapsed since the start of the current fixation. By way of analytical approximation (similar to the method employed by Kliegl & Engbert, 2003) it can be shown that the maximum inhibition time $T$ is limited even for arbitrarily high values of $h$ and is given by

$$T \xrightarrow{\, h \to \infty \,} \frac{\alpha}{\lambda_{\min}} = \frac{\alpha M_n}{\lambda_0} \left( 1 + \sum_{j=1}^{M_n - 1} \exp\left( -\frac{j^2}{2\sigma_L^2} \right) \right)^{-1}, \tag{9}$$

where $\lambda_{\min}$ denotes the processing rate assigned to the foveal word with the length $M_n$ in the worst case of a fixation to the very right of the word, Eq. (5). The maximum inhibition time increases almost linearly with word length, and it will turn out that it may actually attain a value of at most 65 ms for the longest word in our sentence corpus (described in Kliegl et al., 2004).

3.2.7. Saccade programming

Our thinking here stems from findings obtained with the double-step paradigm in saccade generation by Becker and Jürgens (1979). In the context of reading, similar ideas were proposed first by Reichle et al. (1998; see also Engbert & Kliegl, 2001). After the triggering of a saccade program, an assumed two-stage process is initiated which consists of a labile and a non-labile stage. The labile stage takes an average of $\tau_{lab}$, throughout which it is susceptible to cancelation. It is followed by the non-labile stage with a selected saccade target, Eq. (7). The mean duration of the non-labile stage is referred to as $\tau_{nl}$. The execution itself takes an average of $\tau_{ex}$, throughout which preprocessing pauses due to saccadic blindness but lexical completion continues. Finally, the gaze position is updated and a new saccade program might be

triggered. For all model runs the three $\tau$-values were fixed at 150, 50, and 25 ms, respectively. A temporal scheme of saccade programming is illustrated in Fig. 3.

Fig. 3. Saccade programming in SWIFT. The triggering of a saccade program is governed by the temporal stream, whereas target selection at the beginning of the non-labile stage is guided by the saliency map.

3.2.8. Oculomotor errors

Our assumptions about oculomotor errors are derived from work by McConkie et al. (1988). Accordingly, the saccadic system aims at the optimal viewing positions within words, defined as word centers in SWIFT. However, the actual landing positions are shifted due to systematic errors and scattered due to random errors. Let $L$ denote the intended saccade length to the center of the current target word. The actual saccade length $\ell$ can be computed as

$$\ell = L + \ell_{SRE} + \ell_G \tag{10}$$

with $\ell_{SRE}$ as the so-called systematic saccade range error and $\ell_G$ as a Gaussian-distributed random error with zero as its mean. The systematic error is assumed to arise from the system's hypothesized preference for saccades with length $L_0$. This property has the advantage of effective automation with the drawback of a limited adaptivity. Thus, the system undershoots for $L > L_0$ and overshoots for $L < L_0$. A linear approximation of the systematic error is

$$\ell_{SRE} = \delta_{SRE} \, (L_0 - L), \tag{11}$$

where $\delta_{SRE}$ specifies the strength of the saccade range error (McConkie et al., 1988). Since motor error tends to increase with motor amplitude, we assume, again as a linear approximation, that the spread of the random error component can be described by

$$\sigma_{\ell_G} = \delta_0 + \delta_1 L. \tag{12}$$

All parameters of Eqs. (11) and (12) were estimated from our data and held constant for all model runs. The parameters for the systematic error component were estimated separately for the different saccade types (Table 1).

Table 1
Estimates for oculomotor error parameters

Parameter         Forward saccade   Forward refixation   Regressive refixation   Regressive saccade
$\delta_{SRE}$    0.41              0.49                 0.5                     0.15
$L_0$             5.4               5.7                  4.3                     10.0
$\delta_0$        0.870 (one value for all saccade types)
$\delta_1$        0.084 (one value for all saccade types)

4. Simulation results

In this section, we will clarify the rationale of our parameter fitting, present parameter estimates, and finally discuss the model's performance in relation to experimental data.

4.1. Simulations and model parameters

All experimental phenomena that form the basis of our analyses are derived from the Potsdam corpus (Kliegl et al., 2004), a reference data set obtained from 230 participants. For the simulation input, we use the known independent measures of the corpus (1138 words from 144 sentences). The first and last words of each sentence are excluded from the statistical analysis. Each simulation is carried out for 500 virtual subjects. The temporal evolution of the ODEs (Eq. 6) is discretized in steps of 2 ms using the Euler method. SWIFT produces gaze trajectories that can be treated like actual data. We compute summary statistics for the dependent measures (i.e. mean fixation durations and probabilities) as functions of classes of word lengths and word frequencies, and the distributions of the durations, as described in Section 2. We are then able to calculate $\chi^2$-type statistics reflecting the overall deviation of the simulated data from our experimental data.

Optimization⁴ was carried out employing a genetic algorithm similar to the procedure described in Engbert et al. (2002). Table 2 lists the set of parameters to which our optimization procedure converged, together with respective estimates of error based on the 50 best performing parameter sets. Since the best performing 50 parameter sets fit the experimental data similarly well, we assume that the variances of the parameters in this collection are plausible measures of inter-individual parameter variance. Thus, the parameter variances (i.e. parameter sensitivities) were fed back into the model in order to simulate inter-individual differences, avoiding the general problem of insufficient variances of model outcomes.
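The oculomotor error model of Eqs. (10)–(12) can be sketched as follows. The function names are hypothetical, and the sketch assumes that the intended length $L$ is given as a positive amplitude in letter spaces. The numbers in the usage lines are the forward-saccade estimates from Table 1.

```python
import random

def systematic_error(intended, delta_sre, preferred):
    """Saccade range error, Eq. (11): pulls lengths towards `preferred` (L_0)."""
    return delta_sre * (preferred - intended)

def actual_saccade_length(intended, delta_sre, preferred, delta_0, delta_1,
                          rng=random):
    """Actual saccade length, Eq. (10), in letter spaces.

    Assumption: `intended` (L) is a positive amplitude, so that the spread
    of the random error in Eq. (12) stays positive.
    """
    spread = delta_0 + delta_1 * intended       # Eq. (12)
    noise = rng.gauss(0.0, spread)              # zero-mean Gaussian error
    return intended + systematic_error(intended, delta_sre, preferred) + noise

# Forward-saccade estimates from Table 1; the intended lengths are arbitrary.
undershoot = systematic_error(8.0, delta_sre=0.41, preferred=5.4)   # negative
overshoot = systematic_error(3.0, delta_sre=0.41, preferred=5.4)    # positive
landing = actual_saccade_length(8.0, 0.41, 5.4, 0.870, 0.084,
                                rng=random.Random(0))
```

Averaged over many draws, a saccade intended to span 8 letters lands short of its target, while one intended to span 3 letters lands beyond it, in line with the undershoot and overshoot described for $L > L_0$ and $L < L_0$.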

4 Note that in contrast to linear models, where a different choice of one parameter value may often be compensated by a change in another parameter, we can hope for a unique solution. We also conducted an additional independent run of the genetic algorithm to verify the resulting estimates.
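The Euler discretization of Eq. (6) with a 2 ms step, as mentioned in Section 4.1, can be sketched for a single word as follows. This is a simplified illustration, not the authors' simulation code. The function name is hypothetical, the processing rate is held constant (the eye does not move), the switch from preprocessing to lexical completion is assumed to happen when activation first reaches $L_n$, and all numerical values are arbitrary.

```python
def euler_activation(rate, difficulty, f, omega, dt=2.0, t_max=2000.0):
    """Integrate Eq. (6) for one word with the explicit Euler method.

    Assumptions (not from the source): `rate` is constant over time,
    preprocessing ends when activation first reaches `difficulty` (L_n),
    and activation is clipped at zero once the word is fully processed.
    """
    a, t, preprocessing = 0.0, 0.0, True
    trace = [(t, a)]
    while t < t_max:
        if preprocessing:
            a += dt * (f * rate - omega)             # build-up branch
            if a >= difficulty:
                a, preprocessing = difficulty, False
        else:
            a = max(0.0, a + dt * (-rate - omega))   # lexical completion
        t += dt
        trace.append((t, a))
        if not preprocessing and a == 0.0:
            break                                    # word completely processed
    return trace

# Arbitrary illustrative values (not the estimates reported in Table 2).
trace = euler_activation(rate=0.01, difficulty=1.0, f=10.0, omega=0.001)
peak_time = max(trace, key=lambda point: point[1])[0]
end_time = trace[-1][0]
```

With these values the activation rises quickly to its maximum and then decays over a much longer interval, which mirrors the role of the preprocessing factor $f > 1$.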

Table 2
Estimates of free model parameters

Parameter                 Symbol        Value    Error   Min   Max
Lexical parameters
  Frequency, intercept    $\alpha$      48.1     0.88    10    100
  Frequency, slope        $\beta$       0.177    0.08    0     5
  Predictability weight   $\theta$      0.62     0.04    0     1
Visual processing
  Visual span, right      $\sigma_R$    4.55     0.07    1     5
  Visual span, left       $\sigma_L$    0.05     0.01    0     6
  Preprocessing factor    $f$           90.7     9.82    1     120
  Global decay            $\omega$      0.026    0.00    0     0.1
Saccade timing
  Random timing [ms]      $t_{sac}$     193.4    1.89    100   300
  Inhibition factor       $h$           2.23     0.24    0     100

4.2. Model performance

We open this section with a simulation example to illustrate some of the intrinsic behavior of SWIFT. Then, we present the above-mentioned summary statistics for 500 virtual subjects in comparison to our experimental data. Finally, we discuss data concerning initial landing positions and refixation probabilities.

4.2.1. Simulation example

Fig. 4 depicts a single simulated reading trajectory; the word-based fixation sequence is {1, 3, 4, 5, 5, 7, 8, 7}, meaning that the model's eye first landed on word 1, then on word 3, and so forth. The time course of the set of activations $\{a_n(t)\}$ is plotted in gray with fixation positions $k(t)$ in black. Some phenomena will now be described briefly. As can be seen in Fig. 4, word 2 and word 6 are skipped, indicated by the fact that the fixation bars are never located within the respective word boundaries. The first skipping occurs due to oculomotor error, since the skipped word was actually chosen as saccade target and missed, whereas word 6 was completely processed parafoveally. We get a refixation on word 5, although word 6 had higher activation at the time of target selection. This may happen due to the probabilistic nature of target selection. As stated earlier, SWIFT is able to produce regressions. Word 7 is regressed to since it was not completely processed during the two previous fixations. According to SWIFT, regressions might

Fig. 4. Simulation example. The set of lexical activations is plotted as gray polygons and the trajectory of ﬁxation positions is represented by the black line. The lengthy vertical bars represent the ﬁxation durations: uppermost black areas denote compounds of the random terms involved in saccade timing; gray and white stretches denote labile and non-labile stages of saccade programming, respectively; the lowest black areas are times of saccade executions. Small black rings denote intended saccade targets.

also be mislocated refixations arising from oculomotor errors. In the depicted example there are never more than four words activated simultaneously. Larger simulations show that simultaneous activation of more than five words occurs less than 5% of the time. As can be seen in Fig. 5, most of the time three to four words are processed in parallel. Processing of just one word at a time, i.e. serial processing, usually occurs towards the end of sentences, since the rest of the sentence has been processed thoroughly by then. Neither in our data nor in our simulations did we find an acceleration, i.e. shorter fixations, towards the end of sentences.

[Figure 5: stacked areas for 1, 2, 3, 4, 5, and >5 words; x-axis: relative sentence inspection time; y-axis: cumulative probability of parallel processing.]

Fig. 5. Amount of parallel processing in SWIFT. Depicted are the cumulative probabilities for the numbers of words processed in parallel (1, 2, 3, 4, 5, and more than 5) at any given time in the course of sentence reading. We simulated 200 virtual subjects reading all sentences from our corpus. We then rescaled the inspection time for each simulation in order to compute the probabilities. The extent of each area reflects the overall probability of processing the respective number of words.

4.2.2. Summary statistics

The effects of word length and word frequency on the dependent variables for experimental data and SWIFT simulations can be compared in Fig. 6. We summarized data for five logarithmic frequency classes (1–10, 11–100, 101–1000, 1001–10,000, and >10,000 per million) and 11 word length classes (2–11 and >11 letters). Distributions of fixation durations of simulated data are compared to those of experimental data in Fig. 7. Simulated durations are in good agreement with experimental data, apart from their narrower distributions, particularly for first-fixation (FF) and second-fixation (F2) durations.
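The class scheme used for the summary statistics can be illustrated with a small sketch. The code below is not the authors' analysis code; the function names and the input format, pairs of (word property, fixation duration in ms), are assumptions made for illustration. Frequencies at or below 10 per million are all placed in the first class, which is also an assumption for values below 1.

```python
from collections import defaultdict


def frequency_class(freq_per_million):
    """Map a word frequency (per million) to one of five logarithmic classes.

    Classes follow the text: 1-10 -> 1, 11-100 -> 2, 101-1000 -> 3,
    1001-10,000 -> 4, >10,000 -> 5. Values <= 10 all fall into class 1
    (an assumption for frequencies below 1 per million).
    """
    for cls, upper in enumerate((10, 100, 1000, 10000), start=1):
        if freq_per_million <= upper:
            return cls
    return 5


def length_class(n_letters):
    """Map a word length to one of 11 classes: 2..11 letters, and 12 for >11."""
    if n_letters < 2:
        raise ValueError("word lengths below 2 letters are not covered")
    return min(n_letters, 12)


def mean_by_class(observations, classify):
    """Average durations per class.

    observations: iterable of (word_property, duration_ms) pairs (hypothetical format).
    classify: function mapping the word property to a class label.
    """
    sums = defaultdict(lambda: [0.0, 0])
    for value, duration in observations:
        entry = sums[classify(value)]
        entry[0] += duration
        entry[1] += 1
    return {cls: total / n for cls, (total, n) in sorted(sums.items())}


if __name__ == "__main__":
    # Toy data: (frequency per million, fixation duration in ms)
    toy = [(5, 200.0), (8, 220.0), (500, 180.0)]
    print(mean_by_class(toy, frequency_class))
```

The same `mean_by_class` helper can be reused with `length_class` to obtain the word-length curves, so experimental and simulated data pass through an identical binning step before being compared.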

[Figure 6: four panels (a–d); y-axes: mean fixation duration [ms] (a, c) and relative frequency (b, d); x-axes: word frequency class (a, b) and word length class (c, d); curves: first, second, single, total (a, c) and skipping, two, three+, regression (b, d); experiment vs. simulation.]

Fig. 6. Summary statistics for measured (dotted lines) as well as simulated (solid lines) data. Depicted are effects of word frequency on (a) mean fixation durations and (b) relative frequencies of saccadic events. Also depicted are effects of word lengths on (c) fixation durations and (d) relative frequencies of saccadic events.

[Figure 7: four panels (first fixation, second fixation, single fixation, total reading time); x-axis: fixation duration [ms]; y-axis: relative frequency; experiment vs. simulation.]

Fig. 7. Distributions of simulated and experimental fixation durations (F1, F2, SF and TT).

[Figure 8: columns: 4-letter, 6-letter, and 8-letter words; rows: launch sites –1, –3, –5, –7; x-axis: initial landing position (letters); y-axis: % of fixations; experiment vs. simulation.]

Fig. 8. Distributions of initial landing positions as a function of word length and launch site. The launch site of a saccade is defined as the distance from the launch position to the space before the word it is directed towards. For clarity, only word lengths of 4, 6, and 8 letters and launch sites of 1, 3, 5, and 7 are depicted.

[Figure 9: two panels (Simulation, Experiment); x-axis: center-based initial landing position; y-axis: relative frequency of refixations; curves for word lengths 4, 5, 6, 7, and 8.]

Fig. 9. Relative frequency of refixations as functions of initial landing position for different word lengths for experimental and simulated data.

4.2.3. Initial landing positions

SWIFT, in its former version (Engbert et al., 2002), was able to reproduce the summary statistics equally well. The results depicted in Fig. 8 show that SWIFT with the additional assumption of the saccade range error, Eq. (11), yields a rather close fit to the experimentally observed effects concerning the distributions of initial landing positions as a function of word length and launch site. In both the model and the experimental data, landing positions are (a) more scattered for longer words and longer preceding saccades and (b) shifted towards the end of short words and towards the beginning of words after longer preceding saccades.
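The two qualitative effects on landing positions can be illustrated with a toy simulation. The sketch below does not reproduce Eq. (11); it assumes a generic linear range error in the spirit of McConkie et al. (1988), and all function names and parameter values (`preferred_amplitude`, `range_slope`, `sd_base`, `sd_slope`) are hypothetical choices for illustration only. Launch sites are given as positive distances in letters.

```python
import random
import statistics


def landing_position(launch_site, word_length, rng,
                     preferred_amplitude=7.0, range_slope=0.4,
                     sd_base=1.0, sd_slope=0.15):
    """Sample one landing position (in letters, 0 = space before the word).

    launch_site: distance in letters from the launch position to the
    space before the target word. All default parameters are assumed
    values, not estimates from the paper.
    """
    target = (word_length + 1) / 2.0          # word center as intended target
    intended = launch_site + target           # intended saccade amplitude
    # Systematic range error: short saccades overshoot, long ones undershoot.
    systematic = range_slope * (preferred_amplitude - intended)
    # Random error: scatter grows with the intended amplitude.
    sd = sd_base + sd_slope * intended
    return target + systematic + rng.gauss(0.0, sd)


def simulate(launch_site, word_length, n=5000, seed=1):
    """Draw n landing positions with a fixed seed for reproducibility."""
    rng = random.Random(seed)
    return [landing_position(launch_site, word_length, rng) for _ in range(n)]


if __name__ == "__main__":
    for launch in (1, 3, 5, 7):
        sample = simulate(launch, word_length=6)
        print(launch, round(statistics.mean(sample), 2),
              round(statistics.stdev(sample), 2))
```

Under these assumptions, the mean landing position moves towards the word beginning and the scatter increases as the launch site moves away, which is the qualitative pattern described for Fig. 8. Because the intended amplitude includes the distance to the word center, longer words also receive more scatter.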

4.2.4. Refixation probability

The probability of a refixation as a function of initial landing position (cf. McConkie, Kerr, Reddix, Zola, & Jacobs, 1989) provides us with information on the optimal fixation position, i.e. the position which allows the most effective lexical processing. The more effective lexical processing is, the smaller the need for refixations. Experimental data yielded clear U-shaped curves with minima slightly shifted to the left of word centers. A comparison to simulated curves of relative frequencies of refixations is depicted in Fig. 9. The curves for the shorter words are at least qualitatively in accordance with real data, i.e. we find a dip to the left of word centers. For longer words the dip is less pronounced or absent. In addition, the simulated data show a much stronger influence of word length on refixation probabilities to the left of word centers. Thus SWIFT fails to accurately reproduce these data, a shortcoming that must be addressed in its future development.

5. Discussion

The translation of SWIFT to the level of letters was successful in many respects:
(a) SWIFT still accounts for the same range of phenomena of gaze sequences employing one general mechanism. In addition, more physiological plausibility is gained through replacing the discrete (word-based) gradient by a continuous gradient that takes into account different word lengths.
(b) SWIFT is capable of quantitatively reproducing the influence of word characteristics on inspection times and probabilities of the above-mentioned fixational events equally well.
(c) Moreover, SWIFT increased its explanatory power, since it accounts for landing position effects due to the introduction of the additional principle of the saccade range error.

Yet, we have to acknowledge one shortcoming of SWIFT in its current state. Even though we were able to reproduce some qualitative aspects of the refixation behavior, the experimental data by and large disagree with our simulations on a quantitative level. To account for the refixation probabilities in a future version of SWIFT, it might be feasible to further assume that large saccade errors induce immediate corrective saccades, thus interrupting the autonomous saccade generation. An immediately following saccade would increase the probability of a refixation, because it is more probable that the fixated word wins the target selection again, as there is less time to change the saliency map. Since fixations on word margins are more likely to arise from saccades with larger errors, immediate corrective saccades after such large errors would yield the desired effect of an increased refixation probability for initial fixations on word margins relative to initial fixations on word centers.

The refinement of the processing gradient does not increase the number of parameters. Additional parameters for the saccade range error were fixed and hence add no further degrees of freedom to the model. One psychologically plausible parameter, namely memory decay, has been introduced. Together with the parameter θ, which attenuates the influence of predictability on word difficulty (cf. Reichle et al., 2003, Eq. (2)), SWIFT's degrees of freedom have increased by two. On the other hand, we saved one τ-parameter related to the programming of saccades and fixed two of the other previously free τ-parameters. Finally, no additional states had to be added to the model, thus preserving its straightforward architecture.
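The refixation curves of Fig. 9 are tabulated per word length and center-based initial landing position. A minimal sketch of that tabulation follows; it is not the authors' code, and the record format (word length, letter position of the initial fixation, number of first-pass fixations on the word) as well as the function names are assumptions made for illustration.

```python
from collections import defaultdict


def center_based_position(landing_letter, word_length):
    """Landing position relative to the word center.

    Letters are numbered 1..word_length, so the center is (word_length + 1) / 2.
    Negative values lie to the left of the center.
    """
    return landing_letter - (word_length + 1) / 2.0


def refixation_curve(visits):
    """Relative frequency of refixations per (word_length, center-based position).

    visits: iterable of (word_length, initial_landing_letter, n_fixations)
    tuples, one per first-pass visit to a word (hypothetical format).
    A visit counts as a refixation case when n_fixations >= 2.
    """
    counts = defaultdict(lambda: [0, 0])   # key -> [visits, visits with refixation]
    for word_length, landing_letter, n_fixations in visits:
        key = (word_length, center_based_position(landing_letter, word_length))
        counts[key][0] += 1
        if n_fixations >= 2:
            counts[key][1] += 1
    return {key: refix / total for key, (total, refix) in counts.items()}


if __name__ == "__main__":
    # Toy visits to 5-letter words: margins refixated more often than the center.
    toy = [(5, 1, 2), (5, 1, 1), (5, 3, 1), (5, 3, 1), (5, 5, 2)]
    for key, p in sorted(refixation_curve(toy).items()):
        print(key, p)
```

Applying the same function to experimental and simulated fixation records gives directly comparable curves, so a U-shape and the location of its minimum can be checked for each word length separately.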

References

Baayen, R. H., Piepenbrock, R., & Rijn, H. (1993). The CELEX lexical database (Release 1) [CD-ROM]. University of Pennsylvania, Philadelphia, PA: Linguistic Data Consortium.
Becker, W., & Jürgens, R. (1979). An analysis of the saccadic system by means of double step stimuli. Vision Research, 19, 1967–1983.
Engbert, R., & Kliegl, R. (2001). Mathematical models of eye movements in reading: a possible role for autonomous saccades. Biological Cybernetics, 85, 77–87.
Engbert, R., Longtin, A., & Kliegl, R. (2002). A dynamical model of saccade generation in reading based on spatially distributed lexical processing. Vision Research, 42, 621–636.
Engbert, R., Longtin, A., & Kliegl, R. (2004). Complexity of eye movements in reading. International Journal of Bifurcation and Chaos, 14, 493–503.
Erlhagen, W., & Schöner, G. (2002). Dynamic field theory of movements preparation. Psychological Review, 109, 545–572.
Findlay, J. M., & Gilchrist, I. D. (2003). Active vision. The psychology of looking and seeing. Oxford: Oxford University Press.
Findlay, J. M., & Walker, R. (1999). A model of saccade generation based on parallel processing and competitive inhibition. Behavioral and Brain Sciences, 22, 661–721.
Frazier, L., Pacht, J. M., & Rayner, K. (1999). Taking on semantic commitments II: collective versus distributive readings. Journal of Memory and Language, 29, 181–200.
Frazier, L., & Rayner, K. (1990). Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language, 29, 181–200.
Kliegl, R., & Engbert, R. (2003). SWIFT explorations. In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movements (pp. 103–117). Oxford: Elsevier.
Kliegl, R., Grabner, E., Rolfs, M., & Engbert, R. (2004). Length, frequency, and predictability effects of words on eye movements in reading. European Journal of Cognitive Psychology, 16, 262–284.
Legge, G. E., Hooven, T. A., Klitz, T. S., Mansfield, J. S., & Tjan, B. S. (2002). Mr. Chips 2002: new insights from an ideal-observer model of reading. Vision Research, 42, 2219–2234.
Legge, G. E., Klitz, T. S., & Tjan, B. S. (1997). Mr. Chips: an ideal-observer model of reading. Psychological Review, 104, 524–553.
Liversedge, S. P., & Findlay, J. M. (2000). Saccadic eye movements and cognition. Trends in Cognitive Science, 4, 6–14.
McConkie, G. W., Kerr, P. W., & Dyre, B. P. (1994). What are 'normal' eye movements during reading: toward a mathematical description. In J. Ygge & G. Lennerstrand (Eds.), Eye movements in reading (pp. 315–327). Elsevier: Oxford.
McConkie, G. W., Kerr, P. W., Reddix, M. D., & Zola, D. (1988). Eye movement control during reading: I. The location of initial fixations on words. Vision Research, 28, 1107–1118.
McConkie, G. W., Kerr, P. W., Reddix, M. D., Zola, D., & Jacobs, A. M. (1989). Eye movement control during reading: II. Frequency of refixating a word. Perception and Psychophysics, 46, 245–253.
Morrison, R. E. (1984). Manipulation of stimulus onset delay in reading: evidence for parallel programming of saccades. Journal of Experimental Psychology: Human Perception and Performance, 10, 667–682.
Radach, R., Kennedy, A., & Rayner, K. (2004). Eye movements and information processing during reading [Special issue]. European Journal of Cognitive Psychology, 16, 1–352.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Reichle, E. D., Pollatsek, A., Fisher, D. L., & Rayner, K. (1998). Toward a model of eye movement control in reading. Psychological Review, 105, 125–157.
Reichle, E. D., Rayner, K., & Pollatsek, A. (1999). Eye movements control in reading: accounting for initial fixation locations and refixations within the E–Z Reader model. Vision Research, 39, 4403–4411.
Reichle, E. D., Rayner, K., & Pollatsek, A. (2003). The E–Z Reader model of eye movement control in reading: comparisons to other models. Behavioral and Brain Sciences, 26, 446–526.
Reilly, R., & O'Regan, J. K. (1998). Eye movement control in reading: A simulation of some word-targeting strategies. Vision Research, 38, 303–317.
Reilly, R. G., & Radach, R. (2003). Foundations of an interactive activation model of eye movement control in reading. In J. Hyönä, R. Radach, & H. Deubel (Eds.), The mind's eye: Cognitive and applied aspects of eye movements (pp. 429–455). Amsterdam: Elsevier.
Suppes, P. (1990). Eye movement models for arithmetic and reading performance. In E. Kowler (Ed.), Eye movements and their role in visual and cognitive processes. Amsterdam: Elsevier.
Suppes, P. (1994). Stochastic models of reading. In J. Ygge & G. Lennerstrand (Eds.), Eye movements in reading. Oxford: Pergamon Press.
Trukenbrod, H. A., & Engbert, R. (in preparation). Eye movements in visual search: experiment and computational modeling.
Yang, S.-N., & McConkie, G. W. (2001). Eye movements during reading: A theory of saccade initiation time. Vision Research, 41, 3567–3585.
