Computers in Human Behavior 26 (2010) 1318–1326
Solving problems: How can guidance concerning task-relevancy be provided? Martin Groen *, Jan Noyes Department of Experimental Psychology, University of Bristol, UK
Article info
Article history: Available online 27 April 2010
Keywords: Problem solving support; Eye tracking; Signal detection analysis; Task relevance
Abstract
The analysis of eye movements of people working on problem solving tasks has enabled a more thorough understanding than would have been possible with a traditional analysis of cognitive behavior. Recent studies report that influencing ‘where we look’ can affect task performance. However, some of the studies that reported these results have shortcomings: first, it is unclear whether the reported effects are the result of ‘attention guidance’ or an effect of highlighting display elements alone; second, the selection of the highlighted display elements was based on subjective methods which could have introduced bias. In the study reported here, two experiments are described that attempt to address these shortcomings. Experiment 1 investigates the relative contribution of each display element to successful task realization and does so with an objective analysis method, namely signal detection analysis. Experiment 2 examines whether any performance effects of highlighting are due to foregrounding intrinsic task-relevant aspects or whether they are a result of the act of highlighting in itself. Results show that the chosen objective method is effective and that highlighting the display element thus identified improves task performance significantly. These findings are not an effect of the highlighting per se and thus indicate that the highlighted element is conveying task-relevant information. These findings improve on previous results as the objective selection and analysis methods reduce potential bias and provide a more reliable input to the design and provision of computer-based problem solving support. © 2010 Elsevier Ltd. All rights reserved.
* Corresponding author. Address: Department of Experimental Psychology, University of Bristol, 12a Priory Road, Bristol BS8 1TU, UK. Tel.: +44 (0)117 331 7808; fax: +44 (0)117 928 8588.
doi:10.1016/j.chb.2010.04.004
1. Introduction
Analysis of eye movements of people working on problem solving tasks has enabled a more thorough understanding of the specific task steps than would have been possible with a traditional analysis of cognitive behavior, such as solution time and accuracy. For example, a significant number of studies have provided evidence that top-down task demands affect eye movements (e.g., Castelhano, Mack, & Henderson, 2009; Freksa & Bertel, 2007; Holm & Mantyla, 2007; Just & Carpenter, 1976, 1985; Knoblich, Ohlsson, & Raney, 2001; Land & Hayhoe, 2001; Ratwani, Trafton, & Boehm-Davis, 2008; Rayner, 1978). This influence has been found to be applicable to a wide range of tasks, such as mental rotation problems (Just & Carpenter, 1985), geometric reasoning (Epelboim & Suppes, 1997), mechanical devices (Hegarty, 1992) and making cups of tea (Land, Mennie, & Rusted, 1999). Recent studies indicate that not only can task demands influence eye movements, but eye movements can affect task performance too (Grant & Spivey, 2003; Thomas & Lleras, 2007;
Velichkovsky, 1995). This bidirectional relationship implies that careful manipulation of the stimulus could influence task performance, which provides opportunities to support humans in realizing task objectives by guiding their eye movements in the desired direction. A number of studies have provided evidence for this bottom-up influence of eye movements on performance in a range of tasks. Grant and Spivey (2003) reported a study in which participants were requested to solve an insight problem (Duncker’s, 1945, radiation problem) by using a diagrammatic representation of the problem. In Grant and Spivey’s study, the eye movements of participants were analyzed with regard to which elements of the diagram the successful participants looked. The element that the participants looked at longest (i.e., the skin region) was assumed to be the crucial element to solve the task. This suggestion was examined by Grant and Spivey in a follow-up experiment where they found that highlighting this crucial area led to a significant improvement in task performance. Thomas and Lleras (2007) focused on a particular aspect of Grant and Spivey’s (2003) study, namely, the nature of the eye movement patterns on and around the crucial area. They examined whether it would be possible to induce participants to ‘‘move their eyes in a way that embodied the problem’s solution” (Thomas & Lleras, 2007, p. 664). Invoking the desired eye movements was attempted by giving participants a tracking task whilst they were
working on Duncker’s radiation problem. The tracking task consisted of the identification of a letter among digits presented at different locations in the problem diagram. In one condition, these locations embodied the solution to the task (i.e., multiple crossings over the skin area in different locations representing the multiple lasers converging on the tumor) whereas in other conditions this solution was not embodied: no skin crossings, skin crossing in one location, and free viewing. Their results showed that the participants in the embodied solution condition performed significantly better than the participants in the other conditions. Note that participants were not explicitly shown the solution to the task, the manipulation was merely directed at inducing eye movements that embodied the solution; this was enough to facilitate improved task performance. In a different task domain, similar results to Grant and Spivey’s were reported: in Hagemann, Strauss, and Canal-Bruland’s (2006) study, participants were asked to watch films of people hitting a badminton shuttle and then guess where the shuttle would land. Cueing the processing of information to the body parts involved in the action with a transparent red patch led to a significant improvement of participants guessing correctly on a post test, compared to controls. However, on a retention test administered about a week later this improvement seemed to have disappeared. An issue with Hagemann, Strauss, and Canal-Bruland’s study is that these results are based on unbalanced sample sizes, giving doubts about the interpretation of the results. Three studies showed that individuals’ task performance improved when their eye movements were directed toward empirically determined desired display elements. In cooperative tasks, the eye movements of the ‘expert’ partner could directly be superimposed on the display of the ‘novice’ partner to improve task performance. 
This would make it possible to provide problem solving support without the need to determine beforehand what the desired display elements are. Velichkovsky (1995) reported a study in which two participants worked in pairs to solve three jigsaw puzzles presented on separate computer displays, one for each participant. One of the participants in each pair was an expert who assisted the novice partner to complete the puzzles. In one of the conditions, the gaze position of the expert was superimposed on the display of the novice in an effort to guide the eye movements of the novice to the same location as the expert. The effect of the display of expert gaze position improved the task performance of the novices significantly. In brief, there is increasing support that influencing the eye movements of problem solvers whilst they are conducting the task is helpful in the achievement of task objectives across a range of tasks. However, Van Gog, Jarodzka, Scheiter, Gerjets, and Paas (2009) found contradictory results. They reported a study in which participants were first presented an example of a correct solution to Frog Leap, a Tower of Hanoi-type problem, after which they attempted to solve the problem. The influence on eye movements was implemented by superimposing the pattern of eye movements of one expert solving the problem on the example video. So, in this study the participants were required to remember the eye movements of the expert in the example in order to be of help when solving the problem. The results showed that this form of asynchronous guidance was not effective. A major issue with the studies of Velichkovsky (1995) and Van Gog et al. (2009) is that it is unclear whether the observed effects are a result of an effective, or ineffective, influence on the eye movements or whether they could be a mere effect of highlighting. 
More importantly, because this alternative explanation cannot be excluded, it might have resulted in wrong or extraneous elements being conveyed as task-relevant, leading to detrimental effects on task performance. That is, it could be that wrong or irrelevant display elements were looked at by the ‘‘expert”, leading to
inappropriate guidance and an unnecessary burden on the problem solver. So, it is important to establish the potential task-relevant role of a display element before it is highlighted. There are some additional issues with the studies of Van Gog et al. (2009), Grant and Spivey (2003) and Velichkovsky (1995). In the studies of Van Gog et al. and Velichkovsky, the eye movements that were used as the basis for task performance manipulation were based on the successful performance of, respectively, one and four person(s). This could be problematic as is typical with eye movements, not all movements made are relevant to the task in hand (Findlay & Brown, 2006; Kapoula, Robinson, & Hain, 1986) which could have burdened participants’ memorization or task performance unnecessarily. There could also be an idiosyncratic pattern in the eye movements of the selected expert that resulted in confusion. In addition, in the study of Van Gog et al. the guidance was not supplied whilst the participants were conducting the task, which could have confounded the results too as not only the effect of attention guidance is measured, but also the memorization of eye movements from the example video. Moreover, the particular task (Frog Leap) in Van Gog et al. can only be accomplished in one particular way, which may be too restrictive compared with the majority of tasks encountered in daily life. In Grant and Spivey’s (2003) study, analysts judged whether eye fixations were on the regions of interest, limiting the analysis to the first and the last 30 s of each session. Analyzing the data in this way could lead to two problems. One, potentially important eye movements could be excluded inadvertently when the analysis is limited to the first and last 30 s. Two, even though in Grant and Spivey’s study judges were blind to the solution times and success of participants, measurement bias could occur. In summary, it is unclear from the studies of Van Gog et al. 
(2009) and Velichkovsky (1995) whether the established effects are the result of highlighting display elements that are intrinsically task-relevant or the result of the highlighting itself. In this study, therefore, the effect of guiding eye movements to empirically determined task-relevant elements of the display is contrasted with guiding eye movements to elements that are low in task-relevance, in an effort to avoid confounding the effects of mere highlighting with flagging task-relevant information. Second, the established effects in Grant and Spivey (2003), Van Gog et al. and Velichkovsky could have been more pronounced if a more objective method of selecting task-relevant stimulus elements had been employed. The current study intends to improve on the previous studies, as the adopted selection method, signal detection analysis (see Green & Swets, 1966), is less prone to measurement bias and avoids the inclusion of potentially superfluous or detrimental eye movement guidance. In the first experiment, the relative contribution of each display element to the successful resolution of the tasks is analyzed using an objective method. In the second experiment, the effect of highlighting the display element with the highest contribution to a successful resolution on task performance is investigated.
2. Experiment 1 In this experiment, people were asked to look at four diagrams of physics problems and answer the questions posed on the displays. The objectives were to collect and analyze the eye movements of participants while solving the task and to identify the relative contribution of each stimulus element to the successful accomplishment of the task. All eye fixations on the display were considered, in an effort to prevent excluding potentially important eye movements. In addition, the relative contribution of display elements to task establishment is analyzed by conducting a signal detection analysis of fixations on each region of interest. That is, in
Grant and Spivey’s (2003) work it was assumed that the crucial diagram element was that element that was looked at the longest, based on human judgment. In the studies of Van Gog et al. (2009) and Velichkovsky (1995) it was assumed that all eye movements were relevant. So, in contrast to these three studies, in the present experiment the crucial diagram region is calculated using signal detection analysis, which constitutes a more objective measure.
2.1. Method
2.1.1. Participants
Eighteen undergraduate psychology students (14 females, 4 males) participated in the study for course credit. None had taken college-level physics or engineering programs and all had normal or corrected-to-normal vision. Their ages ranged between 18 and 24 years (M = 19, SD = 1.4).
2.1.2. Materials and apparatus
The diagrams were based on those used by Yoon and Narayanan (2004). They were chosen as they have been shown to be useful to infer cognitive processing in previous research (Hegarty, 1992; Narayanan, Suwa, & Motoda, 1994; Yoon & Narayanan, 2004). The four diagrams are cross-sectional drawings of simple mechanical devices in black and white, with an accompanying question and possible answers. The area of the diagrams subtended approximately 34° × 27° of visual angle. Diagrams were presented on a 21-in. Cathode Ray Tube (CRT) monitor with a resolution of 1280 × 1024 pixels. An EyeLink II video-based head-mounted eye tracker (SR Research Ltd., Mississauga, Ontario, Canada) with a temporal resolution of 500 Hz and a spatial resolution of 0.3° tracked the movements of both eyes. Eye fixations and saccades were recorded. An eye movement was classified as a fixation when the distance travelled was less than 0.2° and its velocity lower than 30°/s, or when the distance travelled was less than 0.2° and the acceleration was less than 8000°/s².
2.1.3. Procedure
Participants were tested individually by the same experimenter in a laboratory with controlled lighting.
First, they read an information sheet and completed consent forms. Participants were seated approximately 57 cm away from the display monitor with their heads stabilized on a chinrest. The eye tracker headband was placed on the participant’s head and was calibrated before the tasks began by having the participant follow a fixation point displayed on a nine-point grid. After the calibration was successfully validated, the participants were asked to fixate on a point that appeared in the middle of the computer screen. Then, the first of a sequence of four counterbalanced tasks was presented on the display computer. Each task display consisted of one image, with text on the right-hand side and a diagram on the left-hand side. The right-hand side consisted of a one-sentence question, the possible answers ‘1. Yes’ or ‘2. No’ (they were always the same across all four diagrams) and an instruction. The left-hand side contained one of the four diagrams (see Appendix A). The participants were given as much time as needed to come up with a solution; when ready to answer, the participant knocked on the table. The experimenter then stopped the eye tracker and the participant gave the answer verbally. Before each new task, a drift correction was performed to realign the eye tracker to any changed parameters relating to eye movement measurement. Finally, participants were thanked and debriefed.
2.1.4. Data analysis
Eye movements were analyzed as to whether the fixations were within the boundaries of predefined regions of interest (ROIs) and
the fixation duration within these ROIs. The ROIs covered all display elements containing either drawing or text. As the ROIs differ in size in all four diagrams, this can bias the number of fixations each ROI receives: large ROIs could receive more fixations on account of the larger sections of the display they cover, whereas smaller ROIs could receive fewer fixations. To address this confound, the ROIs were described by an elliptic Gaussian distribution, where the mean of the Gaussian is the centre of the ROI and the standard deviations are the length and width of the ROI divided by two. All the Gaussian ROIs were normalized to sum to one. Then, the position of each fixation with respect to the mean of each ROI was calculated. The further a fixation is removed from the mean of a given ROI, the further down the tail of that ROI’s Gaussian distribution the fixation will be and, thus, the smaller its value. The nearer a fixation is to the mean of a given ROI, the larger this value will be. Each fixation was attributed to the ROI for which this value was largest, that is, to the ROI whose mean was nearest. In this way, an objective decision about whether fixations are on or off a ROI could be made whilst taking size differences into account at the same time. See Fig. 1 for an example of Gaussian distributions superimposed as grey patches over each ROI on one of the diagrams used in this study. Dark grey patches represent an approximate leptokurtic distribution, light patches an approximate platykurtic distribution, and variations of grey in between represent an approximate mesokurtic distribution. The patches give an indication of the fixation frequency on the particular elements of the display the grey patch is superimposed on (i.e., comparable to a heat map). To analyze the extent to which each ROI has contributed to the successful performance in each task, a signal detection analysis was performed.
In this way, a candidate hierarchy of task-relevance can be assembled. The group level analysis indicated that both the ROI fixation frequency and the ROI fixation time were skewed; hence, the non-parametric variant, a′, was used (Donaldson, 1992) to assess the discriminability of each ROI in each diagram. Using this discriminability score, the contribution of each ROI to the successful performance on each task can be established. Due to a possible confound in ROI fixation time, that is, participants could just be staring at the display and not actively processing anything, it was decided to base the signal detection analysis on the ROI fixation frequency data. The discriminability score was established as follows. First, all fixations were divided into those fixations that were involved in a successful resolution of the task and those that were made during an unsuccessful outcome. In a second step, the “hit” rate and the “false alarm” rates were determined. The “hit” rate (H) is defined as the proportion of fixations that contributed to a successful task resolution relative to the total number of fixations on the display. Similarly, the “false alarm” rate (FA) is determined as the proportion of fixations that contributed to an unsuccessful task resolution relative to the total number of fixations on the display. In a third step, the discriminability metric, a′, was calculated. Formula (1) is adopted from Donaldson (1992):
$$a' = \frac{1}{2} + \frac{(H - \mathit{FA})(1 + H - \mathit{FA})}{4H(1 - \mathit{FA})} \qquad (1)$$
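As an illustration of the analysis steps in Section 2.1.4, the following minimal Python sketch attributes a fixation to the ROI whose Gaussian gives it the largest value and then evaluates Formula (1). It is not the authors' code: the `ROI` class, the function names and the pixel coordinates are hypothetical, the Gaussians are assumed to be axis-aligned and normalized so that each integrates to one, and the branch for H < FA is the usual mirrored form of the formula, which the text does not state.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ROI:
    """Hypothetical region of interest: centre and full width/height in pixels."""
    name: str
    cx: float
    cy: float
    width: float
    height: float


def gaussian_value(roi, x, y):
    """Value at (x, y) of an axis-aligned elliptic Gaussian centred on the ROI.

    Assumption: the standard deviations are half the ROI width and height,
    and the density is normalized to integrate to one.
    """
    sx, sy = roi.width / 2.0, roi.height / 2.0
    z = ((x - roi.cx) / sx) ** 2 + ((y - roi.cy) / sy) ** 2
    return math.exp(-0.5 * z) / (2.0 * math.pi * sx * sy)


def attribute_fixation(rois, x, y):
    """Return the name of the ROI whose Gaussian value at the fixation is largest."""
    return max(rois, key=lambda roi: gaussian_value(roi, x, y)).name


def a_prime(hit_rate, fa_rate):
    """Non-parametric discriminability from a hit rate H and false alarm rate FA."""
    h, fa = hit_rate, fa_rate
    if not (0.0 <= h <= 1.0 and 0.0 <= fa <= 1.0):
        raise ValueError("rates must lie between 0 and 1")
    if h == fa:
        return 0.5  # no discrimination; also avoids division by zero
    if h > fa:
        # Formula (1) as given in the text
        return 0.5 + ((h - fa) * (1.0 + h - fa)) / (4.0 * h * (1.0 - fa))
    # Mirrored form for H < FA (assumption, not stated in the source)
    return 0.5 - ((fa - h) * (1.0 + fa - h)) / (4.0 * fa * (1.0 - h))


# Hypothetical ROIs and rates, for illustration only
rois = [ROI("Spring", 100, 100, 40, 40), ROI("Piston", 300, 100, 80, 40)]
print(attribute_fixation(rois, 110, 105))
print(round(a_prime(0.8, 0.2), 3))
```

How H and FA are assembled from the fixation counts per ROI is left to the caller here, because the text leaves some room for interpretation on the denominators.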
Fig. 1. Display of a task with the Gaussian distributions superimposed over the ROIs.
2.2. Results
Seventy-eight percent of the participants solved Diagram 1 successfully, 72% Diagram 2, 22% Diagram 3 and 89% Diagram 4.
2.2.1. Differences in eye movement pattern between successful and unsuccessful problem solvers
As the data appeared to be skewed, a log transform was applied to the raw data before further analysis. The successful participants fixated more often (M = 1.88, SD = .27) on the ROIs than the unsuccessful ones (M = 1.54, SD = .25), t(32) = 3.83, p < .001. In addition, the successful participants also fixated longer (M = 4.62, SD = .22) on the ROIs than those in the unsuccessful group (M = 4.38, SD = .24), t(32) = 2.98, p < .05. Fixation durations were calculated by adding the total gaze duration on the ROIs.
2.2.2. Analysis of task-relevance of each ROI
The ROI with the highest discriminability was HingeB for Diagram 1, for Diagram 2 it was the Spring, for Diagram 3 it was HoleC and for Diagram 4 it was the Ceiling. See Table 1 for a candidate hierarchy of relevance for each task and Appendix A for the location of the display elements. These discriminability scores enabled an objective decision with respect to the selection of ROIs to manipulate in order to examine the effects of highlighting them on eye movements and task performance in Experiment 2. Note that the text element of the diagrams is ignored in the data analysis as the focus here was on the graphic elements of the diagrams, not the text.
Table 1. Discriminability of ROIs in Diagrams 1–4 (descending).
Diagram 1       Diagram 2       Diagram 3       Diagram 4
HingeB   1.14   Spring   0.87   HoleC    1.40   Ceiling  1.19
HingeC   0.92   HoleB    0.77   Spring   1.30   PulleyB  1.15
HingeA   0.78   HoleA    0.56   HoleA    1.13   String3  0.99
Piston   0.76   Piston   0.52   HoleB    0.87   String2  0.96
Ceiling  0.74   Arrow    0.48   Piston   0.70   Person   0.89
Arrow    0.70                   Arrow    0.69   PulleyC  0.79
HandleA  0.33                                   String4  0.77
                                                PulleyA  0.77
                                                String1  0.74
                                                ObjectA  0.72
2.3. Discussion
Participants who were successful in solving the tasks showed different eye movement characteristics, as measured by the fixation duration and number of fixations on the ROIs. The successful participants not only looked longer at the ROIs, they also returned more often to the ROIs than the unsuccessful participants. It seems, therefore, that different eye movement behavior is exhibited by the successful participants. In other words, successful participants attended either to other display elements or in a different order than unsuccessful participants. The difference in the looking pattern between the two performance groups is exemplified by the signal detection analysis, which showed that some ROIs were more discriminable than others in each diagram. By employing signal detection analysis, it was possible to select the potentially task-relevant display elements objectively and ignore extraneous display elements. This could indicate that the best discriminable ROIs can be considered essential task-relevant elements. This suggestion was tested in Experiment 2.
3. Experiment 2
The objective of Experiment 2 was to investigate the role of the best discriminable ROIs in the performance on the task. The influence of highlighting this candidate task-relevant element was contrasted with non-relevant display regions by highlighting elements with low calculated discriminability. This is to examine whether the eye movement guidance effect is a result of the highlighting per se or whether the element is likely to be task-relevant because of some intrinsic reason, for example, a conceptual relationship between the highlighted element and the task objective. It should be noted that this was unclear from the results of Van Gog et al. (2009) and Velichkovsky (1995). In contrast to Van Gog et al. and Velichkovsky, this design also allows irrelevant eye movements to be ignored, which may avoid placing an unnecessary cognitive load on the problem solver. In addition, the eye movements are induced to go to display elements of which it has been established, in Experiment 1, that they contributed to successful task performance.
The expectation is that highlighting the crucial ROI will improve performance on the task. This condition was called the Best Discriminability Cue. Two additional conditions were included in the experiment: one condition was included to study the possibility
that it is not the highlighting alone which led to a performance improvement but a task-relevant intrinsic characteristic of the stimulus element. This was called the Low Discriminability Cue (which refers to the display element that scored lowest in the signal detection analysis). The other condition, called the Best Four Discriminability Cues, was included to examine whether only the ROI with the best discriminability score has specific importance with regard to successfully carrying out the task or that the other ROIs with high discriminability scores contributed to this result too. By employing this design, it will be possible to discern effects of highlighting task-relevant elements from the effect of highlighting display elements alone. 3.1. Method 3.1.1. Participants Ninety-eight undergraduate psychology students (86 females, 12 males) participated in the study for course credit. None had taken college-level physics or engineering programs and all had normal or corrected-to-normal vision. Their age ranged between 18 and 40 years (M = 20, SD = 3.1). No participant had taken part in Experiment 1. Assignment to one of four conditions was counterbalanced. 3.1.2. Materials and apparatus The stimuli were created from the static stimuli that were used in Experiment 1. Three additional versions of the four diagrams
were created in which one or more of the ROIs were highlighted by using a contour effect. Applying this effect to the contours of a ROI led to it pulsing slightly. The static diagrams were jpeg files, while the diagrams with one or more highlighted ROIs were animated gif files. All diagrams were displayed in a Media Player control embedded in a MATLAB program that also recorded the responses of the participants and parameters of the assigned condition. The stimuli were presented full screen on a 17-in. Liquid Crystal Display (LCD) monitor at 1280 × 1024 resolution. The animated gifs played at a rate of 3 frames per second and the animation was looped until the participant pressed the “Click to answer question and continue” button, after which a window appeared where the participant had the opportunity to answer by clicking a “Yes” or a “No” button. In Fig. 2, the highlighted ROIs are illustrated for one of the displays that was used in each condition of the experiment. The presentation sequence of the diagrams in all conditions was counterbalanced. In the Original condition, the static diagram of Experiment 1 was used, see panel 2A in Fig. 2. Participants in the Best Discriminability Cue condition viewed animated gifs of the four diagrams comprising two frames. The first frame was the static diagram and in the second frame the edges of the ROI that had the best discriminability score in Experiment 1 were pulsing slightly by using a contour effect, see Fig. 2, panel 2B. Participants in the Low Discriminability Cue condition viewed animated gifs of the four diagrams which were similar to those displayed in the Best Discriminability Cue condition, except that now the display element with the lowest discriminability score was highlighted in the same way as in the Best Discriminability Cue condition, see Fig. 2, panel 2C. Participants in the Best Four Discriminability Cues condition viewed animated gifs of the four diagrams which again were similar to the ones in the Best Discriminability Cue condition, except that in this condition two additional frames were displayed. The second frame was the same as in the Best Discriminability Cue condition; the third and fourth frame highlighted the ROIs with the second and third highest discriminability score.
Fig. 2. Diagram with examples of the stimuli used in Experiment 2. (A) A stimulus with no highlighting that was used in the Original condition, (B) shows one with the ROI with the best discriminability score highlighted, (C) shows one with the ROI with the lowest discriminability score cued, and (D) shows a stimulus with the best four ROIs cued.
3.1.3. Procedure
Participants in all four conditions were seated in front of a desktop computer display. First, they read an information sheet about the experiment and what would happen with their data. They then completed a consent form. After this, the first diagram was presented on the display. When the participant was ready to respond, an answer button could be clicked and then a window opened in which an answer could be provided by clicking on “Yes” or “No”. This was repeated until the participant had answered all four diagrams. At the end of the experiment, participants were thanked and debriefed.
3.2. Results
Correct results were scored in the same way as in Experiment 1, and the successful and unsuccessful groups were created based on the results of each of the four tasks. Sixty-three percent of the participants in the Original condition successfully solved the tasks, 75% in the Best Discriminability Cue condition, 66% in the Low Discriminability Cue condition and 69% in the Best Four Discriminability Cue condition, see Table 2. The error rate of participants in the Original condition in Experiment 2 (i.e., 37%) was close to the error rate in Experiment 1 (i.e., 35%), which also used a static diagram. Highlighting the ROI with the highest discriminability had a significant effect on the performance in the Best Discriminability Cue condition, χ²(1, N = 49) = 4.06, p < .05.
This performance improvement was not a result of the highlighting alone, as can be seen in the performance in the Low Discriminability Cue condition, χ²(1, N = 50) = 0.20, p > .50. In addition, when this essential ROI was highlighted in combination with the lower scoring ROIs, the performance improvement dropped below significance levels, χ²(1, N = 49) = 1.12, p > .10, as shown in the performance in the Best Four Discriminability Cue condition. Prior knowledge could have had an influence on the performance on the tasks. The participants were asked at what level they studied physics at secondary school. Eighty-seven participants (89%) had studied physics until aged 16 years and 11 (11%) until 18 years. This difference in prior knowledge had no effect on task performance with respect to the number of errors, t(96) = 0.71, p = .48. The mean number of errors for participants in the former group was 1.3 (SD = .94) and 1.1 (SD = .70) for the participants with 2 years more physics study.
Table 2. Error and success frequencies and number of participants by condition across experiments (frequencies in parentheses).
Condition                        Unsuccessful   Successful   n
Original (Experiment 1)          35% (25)       65% (47)     18
Original (Experiment 2)          37% (37)       63% (63)     25
Best Discriminability Cue        25% (24)       75% (72)     24
Low Discriminability Cue         34% (34)       66% (66)     25
Best Four Discriminability Cue   31% (30)       69% (66)     24
3.3. Discussion
The results show that influencing the eye movement allocation of problem solvers by highlighting the best discriminable ROI led to a marked improvement in task performance. In addition, this performance improvement was due to the essential role of this particular ROI in solving each diagram successfully; otherwise the performance in the Best Four Discriminability Cue condition would be significantly better than in the control group. The significant improvement was also not an effect of the highlighting alone; otherwise the performance in the Low Discriminability Cue condition would be significantly better than that in the control condition, Original. It appears to be possible to improve task performance in this type of physics problem by highlighting the ROIs with the highest discriminability (a′) score in the display that are relevant to realizing the objectives of each task.
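As a reading aid for Table 2, the sketch below recomputes the percentages from the frequencies given in parentheses. It rests on an inference rather than on a statement in the text: that the frequencies count individual task outcomes, so that each participant contributes four outcomes (one per diagram). The names `TABLE_2`, `TASKS_PER_PARTICIPANT` and `percentages` are hypothetical.

```python
# Hypothetical reading aid for Table 2, not part of the original analysis.
# Assumption: the parenthesised frequencies count task outcomes,
# with four outcomes (one per diagram) for every participant.
TASKS_PER_PARTICIPANT = 4

# condition: (participants, unsuccessful outcomes, successful outcomes)
TABLE_2 = {
    "Original (Experiment 1)": (18, 25, 47),
    "Original (Experiment 2)": (25, 37, 63),
    "Best Discriminability Cue": (24, 24, 72),
    "Low Discriminability Cue": (25, 34, 66),
    "Best Four Discriminability Cue": (24, 30, 66),
}


def percentages(n_participants, unsuccessful, successful):
    """Return (unsuccessful %, successful %) rounded to whole percentages."""
    total = unsuccessful + successful
    if total != n_participants * TASKS_PER_PARTICIPANT:
        raise ValueError("frequencies do not add up to four outcomes per participant")
    return round(100 * unsuccessful / total), round(100 * successful / total)


for condition, row in TABLE_2.items():
    print(condition, percentages(*row))
```

Under this assumption every row is internally consistent: the two frequencies sum to four times the number of participants, and the rounded percentages agree with those printed in the table.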
4. General discussion

The results of Experiment 1 show that the looking pattern of successful participants differed significantly from that of unsuccessful participants. Apparently, the allocation of eye movements over the different display elements was instrumental in achieving the task objectives. This suggestion is supported by the positive effect on task performance of highlighting the ROI with the highest discriminability in each task, as found in Experiment 2. Moreover, applying a signal detection analysis to the relative contribution of each display element enabled an objective and transparent selection of crucial display elements. The important role of these selected display elements was underlined by the results of Experiment 2: highlighting this particular element improved task performance, whereas highlighting other elements did not. With this experimental design, we have been able to improve on the selection process for task-relevant display elements, and the results confirm that the highlighted display element was correctly identified, as it was the sole element that was instrumental in improving task performance.

These findings suggest that influencing the allocation of eye movements toward task-relevant elements improves task performance, as is apparent from the significant differences between the conditions in Experiment 2. Moreover, influencing task performance in this way would not have been possible if the eye fixation pattern over the display had not been related to task performance. This in turn suggests a relationship between eye movements and task performance (see Toet, 2006). These findings also corroborate earlier findings: it appears to be possible to influence eye movements and improve task performance by cueing a task-relevant part of the diagram (Grant & Spivey, 2003; Hagemann et al., 2006; Thomas & Lleras, 2007; Velichkovsky, 1995).
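To illustrate how a discriminability score can rank display elements, the sketch below computes the commonly used nonparametric A′ statistic from a hit rate and a false alarm rate. It is an assumption-laden illustration, not the authors' procedure: this section does not restate how hits and false alarms were defined per ROI, so the sketch simply assumes that a "hit" is a successful solver attending to the ROI and a "false alarm" is an unsuccessful solver attending to it. The function names and the example rates are hypothetical.

```python
def a_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Nonparametric discriminability A' for one ROI.

    A' = 0.5 means no discrimination between successful and unsuccessful
    solvers; values toward 1.0 mean the ROI is attended mostly by
    successful solvers.
    """
    h, f = hit_rate, false_alarm_rate
    if not (0.0 <= h <= 1.0 and 0.0 <= f <= 1.0):
        raise ValueError("rates must lie in [0, 1]")
    if h == f:
        # Chance level; also avoids a zero denominator when h == f == 0 or 1.
        return 0.5
    if h > f:
        return 0.5 + ((h - f) * (1 + h - f)) / (4 * h * (1 - f))
    # Below-chance case: mirror image of the formula above.
    return 0.5 - ((f - h) * (1 + f - h)) / (4 * f * (1 - h))


def rank_rois(rates):
    """Sort ROI names by A', highest first.

    rates: dict mapping ROI name -> (hit_rate, false_alarm_rate).
    """
    return sorted(rates, key=lambda roi: a_prime(*rates[roi]), reverse=True)


# Hypothetical rates for illustration only; these are not data from the study.
example_rates = {
    "roi_a": (0.8, 0.2),
    "roi_b": (0.6, 0.5),
    "roi_c": (0.4, 0.4),
}

if __name__ == "__main__":
    for roi in rank_rois(example_rates):
        print(roi, round(a_prime(*example_rates[roi]), 3))
```

In a design like the one described here, the top-ranked ROI would be the candidate for the single-element cue, and a low-ranked ROI would serve as the control for the effect of highlighting alone.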
These results contribute to our understanding of the nature of the relationship between crucial task-relevant information and successful task performance. From Velichkovsky’s (1995) study and the work of Van Gog et al. (2009) it was not possible to determine whether their results were an effect of the guidance toward task-relevant information or merely an effect of highlighting itself, with no special task-relevant relationship implied. In the present study, we have controlled for this and have thus been able to define more precisely the relationship between the task-relevant information markers, the information marked and successful task performance. In addition, the selection of task-relevant information has been achieved with an objective method: signal detection analysis. This has been effective and is therefore an improvement on previous methods (i.e., relying on human judgment, as adopted by Grant and Spivey (2003)), because adopting this method reduces the potential influence of measurement and sample bias. Moreover, only the most discriminable elements were used as the basis for relevancy guidance (as recommended by Wood, Cox, and Cheng (2006)), thereby avoiding overloading the problem solver with
potentially extraneous or superfluous support, as may have been the case in Van Gog et al. (2009) and Velichkovsky (1995).

A third contribution is that in this study, all eye fixations were considered as candidate task-relevant information markers in order to avoid excluding potentially important fixations. In addition, due to the ballistic nature of eye movements, the eye quite often overshoots the intended target (13% of the time according to Kapoula et al. (1986)). To avoid the challenge of having to filter out an unknown 13% of the data, all eye fixations were considered. Moreover, taking the relative size of the regions of interest into account when analyzing eye fixations, here by superimposing elliptic Gaussians on the ROIs, enables a more accurate assessment of their contribution to successful task performance. This approach addresses the effects of ballistic errors of over- and undershoot, differences in the area covered by regions of interest, and potential measurement biases.

Although the effect of prior knowledge was not significant in this study, caution is needed before concluding that
prior knowledge does not affect task performance. To assess the effect of prior knowledge in these tasks, it would most likely be better to compare the performance of physics or engineering students with that of students from other disciplines. In addition, the number of participants in each cell was unbalanced, which makes it questionable to draw conclusions.

Research into the relationship between eye movements and successful task achievement yields opportunities with respect to the design of materials for learning and instruction, computer-supported information search, and computer-based tasks in general. As these results show, it is possible to identify the crucial task-relevant information with objective methods and then evaluate its effectiveness in guiding problem solvers to those display elements that were useful for others in accomplishing task goals. It is foreseeable that new instructional material could first be checked by a relatively small sample of people to investigate which elements of the material are task-relevant, similar to the method that was adopted in this study. Then, in a next phase, visual cues derived from the previous step could be added to the material, highlighting
these task-relevant elements, enabling more people to do the required tasks accurately and swiftly. This could prove to be a very useful development in the various computer-supported teaching environments that are currently available.

Acknowledgments

The authors acknowledge funding and support for this work from GCHQ in Cheltenham, UK. We thank Dr. Casimir Ludwig and Dr. Filipe Cristino for their thoughtful and helpful comments on an earlier version of this paper, and Filipe for his help setting up the research. Thanks to Dr. N. Hari Narayanan for making the stimuli available.

Appendix A

See Figs. 3–6.

Fig. 3. Task 1 with predefined ROIs superimposed.
Fig. 4. Task 2 with predefined ROIs superimposed.
Fig. 5. Task 3 with predefined ROIs superimposed.
Fig. 6. Task 4 with predefined ROIs superimposed.

References

Castelhano, M. S., Mack, M. L., & Henderson, J. M. (2009). Viewing task influences eye movement control during active scene perception. Journal of Vision, 9(3), 1–15.
Donaldson, W. (1992). Measuring recognition memory. Journal of Experimental Psychology: General, 121(3), 275–277.
Duncker, K. (1945). On problem-solving. Psychological Monographs, 58(5). Whole No. 270.
Epelboim, J., & Suppes, P. (1997). Eye movements during geometrical problem solving. Paper presented at the proceedings of the 19th annual conference of the Cognitive Science Society, Stanford University.
Findlay, J. M., & Brown, V. (2006). Eye scanning of multi-element displays: II. Saccade planning. Vision Research, 46(1–2), 216–227.
Freksa, C., & Bertel, S. (2007). Eye movements and smart technology. Computers in Biology and Medicine, 37(7), 983–988.
Grant, E. R., & Spivey, M. J. (2003). Eye movements and problem solving: Guiding attention guides thought. Psychological Science, 14(5), 462–466. doi:10.1111/1467-9280.02454.
Green, D. M., & Swets, J. A. (1966). Signal detection theory and psychophysics. New York: John Wiley & Sons.
Hagemann, N., Strauss, B., & Canal-Bruland, R. (2006). Training perceptual skill by orienting visual attention. Journal of Sport & Exercise Psychology, 28(2), 143–158.
Hegarty, M. (1992). Mental animation: Inferring motion from static displays of mechanical systems. Journal of Experimental Psychology: Learning, Memory, & Cognition, 18(5), 1084–1102.
Holm, L., & Mantyla, T. (2007). Memory for scenes: Refixations reflect retrieval. Memory & Cognition, 35(7), 1664–1674.
Just, M. A., & Carpenter, P. A. (1976). Eye fixations and cognitive processes. Cognitive Psychology, 8(4), 441–480.
Just, M. A., & Carpenter, P. A. (1985). Cognitive coordinate systems: Accounts of mental rotation and individual differences in spatial ability. Psychological Review, 92(2), 137–172.
Kapoula, Z. A., Robinson, D. A., & Hain, T. C. (1986). Motion of the eye immediately after a saccade. Experimental Brain Research, 61(2), 386–394.
Knoblich, G., Ohlsson, S., & Raney, G. E. (2001). An eye movement study of insight problem solving. Memory & Cognition, 29(7), 1000–1009.
Land, M., & Hayhoe, M. (2001). In what ways do eye movements contribute to everyday activities? Vision Research, 41(25–26), 3559–3565.
Land, M., Mennie, N., & Rusted, J. (1999). The roles of vision and eye movements in the control of activities of daily living. Perception, 28(11), 1311–1328.
Narayanan, N. H., Suwa, M., & Motoda, H. (1994). A study of diagrammatic reasoning from verbal and gestural data. In A. Ram & K. Eiselt (Eds.), Proceedings of the 16th annual conference of the Cognitive Science Society (pp. 652–657). Hillsdale, NJ: Lawrence Erlbaum.
Ratwani, R. M., Trafton, J. G., & Boehm-Davis, D. A. (2008). Thinking graphically: Connecting vision and cognition during graph comprehension. Journal of Experimental Psychology: Applied, 14(1), 36–49.
Rayner, K. (1978). Eye-movements in reading and information-processing. Psychological Bulletin, 85(3), 618–660.
Thomas, L. E., & Lleras, A. (2007). Moving eyes and moving thought: On the spatial compatibility between eye movements and cognition. Psychonomic Bulletin & Review, 14(4), 663–668.
Toet, A. (2006). Gaze directed displays as an enabling technology for attention aware systems. Computers in Human Behavior, 22(4), 615–647.
Van Gog, T., Jarodzka, H., Scheiter, K., Gerjets, P., & Paas, F. (2009). Attention guidance during example study via the model’s eye movements. Computers in Human Behavior, 25(3), 785–791.
Velichkovsky, B. M. (1995). Communicating attention: Gaze position transfer in cooperative problem solving. Pragmatics and Cognition, 3(2), 199–224.
Wood, S., Cox, R., & Cheng, P. (2006). Attention design: Eight issues to consider. Computers in Human Behavior, 22(4), 588–602.
Yoon, D., & Narayanan, N. H. (2004). Predictors of success in diagrammatic problem solving. In A. Blackwell, K. Marriott, & A. Shimojima (Eds.), Proceedings of diagrammatic representation and inference: Third international conference, diagrams 2004, Cambridge, UK, March 22–24, 2004, Vol. 2980 (pp. 301–315). Berlin: Springer.