Eye tracking for skills assessment and training: a systematic review


Journal of Surgical Research xxx (2014) 1-10


Available online at www.sciencedirect.com

ScienceDirect journal homepage: www.JournalofSurgicalResearch.com

Research review

Tony Tien, BSc (a); Philip H. Pucher, MRCS (a); Mikael H. Sodergren, PhD MRCS (a,b,*); Kumuthan Sriskandarajah, MRCS (a,b); Guang-Zhong Yang, PhD (b); and Ara Darzi, FACS, FRCS (a,b)

(a) Department of Surgery and Cancer, St Mary’s Hospital, Imperial College London, UK
(b) Hamlyn Centre for Robotic Surgery, Imperial College London, UK

Article info

Article history:
Received 26 January 2014
Received in revised form 26 January 2014
Accepted 16 April 2014
Available online xxx

Keywords:
Eye tracking
Education
Training
Learning

Abstract

Background: The development of quantitative objective tools is critical to the assessment of surgeon skill. Eye tracking is a novel tool that has been proposed as a source of suitable metrics for this task. The aim of this study was to review current evidence for the use of eye tracking in training and assessment.

Methods: A systematic literature review was conducted in line with PRISMA guidelines. A search of the EMBASE, OVID MEDLINE, Maternity and Infant Care, PsycINFO, and Transport databases was conducted up to March 2013. Studies describing the use of eye tracking in the execution, training, or assessment of a task, or for skill acquisition, were included in the review.

Results: The initial search returned 12,051 results. Twenty-four studies were included in the final qualitative synthesis. Sixteen studies were based on eye tracking in assessment and eight studies were on eye tracking in training. These demonstrated feasibility and validity in the use of eye tracking metrics and gaze tracking to differentiate between subjects of varying skill levels. Several training methods using gaze training and pattern recognition were also described.

Conclusions: Current literature demonstrates the ability of eye tracking to provide reliable quantitative data as an objective assessment tool, with potential applications to surgical training to improve performance. Eye tracking remains a promising area of research with the possibility of future implementation into surgical skill assessment.

© 2014 Elsevier Inc. All rights reserved.

* Corresponding author. Department of Biosurgery and Surgical Technology, 10th Floor QEQM Building, St. Mary’s Hospital, South Wharf Road, London, W2 1NY, United Kingdom. Tel.: +44 (0) 203 312 6666; fax: +44 (0) 203 312 6309. E-mail address: [email protected] (M.H. Sodergren).
0022-4804/$ - see front matter © 2014 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jss.2014.04.032

1. Introduction

The development of valid, reliable, and objective methods of skills assessment is central to modern surgical training. Increased awareness of iatrogenic injury and error has heightened the need for surgeons to demonstrate proficiency [1] and achieve competency despite the shortening of training time available to trainees with the advent of working time directives [2,3]. Whereas in the past surgical assessment relied on an apprenticeship model of informal skills acquisition and progression, numerous tools are now available to the surgical trainer. Numerous rating scales have been developed and validated with the aim of quantifying surgical performance, such as the Operative Performance Rating System, which returns a summative performance score from combined Likert scale ratings across a number of behavioral or procedural domains [4]. The Objective Structured Assessment of Technical Skill incorporates a global rating scale with a checklist to assess surgical performance [5]. However, the design of many of these scoring systems has potential drawbacks in reliability, with effective assessment dependent on the availability and presence of a reviewer trained in the assessment methodology [6].

The development of objective and independent systems remains the ultimate goal for surgical assessment. Through the use of objective metrics such as path length or number of movements to define surgical skill, this has, in part, been achieved in laparoscopic surgery [7]. However, this remains largely limited to the training setting, recording computer-based metrics from virtual reality simulators [8].

Eye tracking has been proposed as a potential assessment tool not limited by some of the restrictions of laparoscopic metric measurement. The use of camera technology to analyze eye motion is a well-established concept, dating back to 1950, when the use of picture cameras to study the gaze behavior of pilots was first described [9]. Since then, new techniques have been developed, documenting eye movement using stationary cameras or cameras integrated into otherwise standard eyeglasses. These record the corneal reflection of infrared lighting to track pupil position, mapping the subject’s focus of attention on video recordings of the subject’s field of view (gaze) [10]. In addition to tracking gaze, this has enabled the measurement of various eye metrics, including fixation frequency and dwell time (used as a surrogate measure of perceived stimulus importance) [9,11], as well as pupil dilation, a marker of subject effort and concentration [12,13]. It has been proposed that differences in these metrics between subjects of varying skill levels may allow their use as markers of ability [11,14-23].

Beyond a method of assessment, eye tracking has been proposed for other training uses, such as a visually guided control interface, particularly within the operating room, where sterility (and therefore contact-free interfaces) must be maintained [24]. It may also be used to address some of the unique challenges presented by continuing advances in surgical technology. Visual orientation can present a major problem in laparoscopic surgery; the analysis and identification of efficient orientation strategies through eye tracking have been demonstrated as one potential way to address this [25].

Despite such broad potential application, research in this area has been limited and disparate to date. Therefore, the aim of this article was to review and consolidate the current literature describing the evidence base for the use of eye tracking in training and assessment.

2. Methods

A systematic review was conducted in line with PRISMA guidelines [26]. A search of the EMBASE, OVID MEDLINE, Maternity and Infant Care, PsycINFO, and Transport databases was conducted up to March 2013. The following search terms were used: (eye tracking OR gaze) AND (education OR training OR learning OR skill acquisition). After deduplication, results were first screened for relevant titles and abstracts. Full-text versions of candidate studies were then retrieved and considered for final inclusion according to agreed selection criteria. In addition, reference lists were hand searched for other relevant articles that may have been missed. Both literature search and data extraction were undertaken by two independent reviewers (T.T. and P.P.). Any disagreement was resolved by consensus.
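The Boolean search string and the deduplication step can be illustrated with a minimal sketch. This is not the procedure used by the reviewers: the bibliographic databases apply their own query syntax and field matching, and the record structure (a dictionary with a `title` key), the function names, and the title-based duplicate rule are all assumptions made for illustration.

```python
import re

# Terms taken from the search string reported in the Methods.
INTERVENTION_TERMS = ("eye tracking", "gaze")
CONTEXT_TERMS = ("education", "training", "learning", "skill acquisition")


def matches_query(text):
    """Return True if text satisfies
    (eye tracking OR gaze) AND (education OR training OR learning OR skill acquisition).

    Plain substring matching is used, so "gaze" would also match "gazelle";
    a real database search would not behave this way.
    """
    lowered = text.lower()
    has_intervention = any(term in lowered for term in INTERVENTION_TERMS)
    has_context = any(term in lowered for term in CONTEXT_TERMS)
    return has_intervention and has_context


def normalise_title(title):
    """Lower-case a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Keep the first record seen for each normalised title (hypothetical rule)."""
    seen = set()
    unique = []
    for record in records:
        key = normalise_title(record["title"])
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Invented example records, not taken from the review.
records = [
    {"title": "Gaze training in laparoscopic surgery"},
    {"title": "Gaze Training in Laparoscopic Surgery."},
    {"title": "Eye tracking of pilots"},
]
unique_records = deduplicate(records)
candidates = [r for r in unique_records if matches_query(r["title"])]
```

In this toy example the second record is dropped as a duplicate of the first, and the third is excluded because it contains no education, training, learning, or skill acquisition term.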

2.1. Selection criteria

Studies were included if they used an eye-tracking device in the execution, training, or assessment of a task, or in skill acquisition during task completion.

2.2. Quality analysis

The quality of included studies was assessed using the Jadad score [27] for randomized trials and the Newcastle-Ottawa Scale (NOS) [28] for cohort studies. The Jadad scale assigns or deducts points over several categories based on the quality of randomization, blinding, and outcomes reporting, for a total score of 1-5. The NOS assigns a score of 0-9 based on the methodological quality of a study’s cohort selection, comparability, appropriate exposure, and analysis of outcome. To allow comparison of study quality across different study types, a summary score of “poor” (Jadad 1-2 and NOS 0-5), “moderate” (Jadad 3 and NOS 6-7), or “good” (Jadad 4-5 and NOS 8-9) was assigned.
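The mapping from scale scores to the summary categories can be written down directly. The following is a minimal sketch using only the thresholds stated above; the function name and the string labels for the two scales are assumptions made for illustration.

```python
def summary_quality(scale, score):
    """Map a Jadad or NOS score to "poor", "moderate", or "good".

    Thresholds follow the ranges given in the text:
    Jadad 1-2 / NOS 0-5 -> poor, Jadad 3 / NOS 6-7 -> moderate,
    Jadad 4-5 / NOS 8-9 -> good.
    """
    if scale == "jadad":
        if not 1 <= score <= 5:
            raise ValueError("Jadad score must be between 1 and 5")
        if score <= 2:
            return "poor"
        if score == 3:
            return "moderate"
        return "good"
    if scale == "nos":
        if not 0 <= score <= 9:
            raise ValueError("NOS score must be between 0 and 9")
        if score <= 5:
            return "poor"
        if score <= 7:
            return "moderate"
        return "good"
    raise ValueError("scale must be 'jadad' or 'nos'")
```

For example, `summary_quality("jadad", 3)` and `summary_quality("nos", 6)` both fall in the "moderate" band, which is what allows randomized trials and cohort studies to be compared on one summary scale.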

3. Results

The initial search returned 12,051 results, which were reduced to 7360 after elimination of duplicates. Thirty-six full-text publications were retrieved for analysis, with 24 studies included in the final qualitative synthesis (Figure). Of the 24 studies in this review, 17 fit the requirements for quality analysis. These articles were mostly moderate in quality, with 16 of 17 classed as moderate and the remaining article classed as poor. Studies were divided into two domains of evidence and considered separately: (1) those describing use of eye tracking for assessment, including validation of assessment metrics, and (2) use of eye tracking for training.
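The number of records leaving the review at each stage follows from the reported counts by subtraction. The sketch below takes 24 as the number of included studies, as stated in the abstract, and assumes that no records entered the flow from hand searching of reference lists, which the text does not quantify; the function and key names are illustrative.

```python
def prisma_flow(identified, after_dedup, full_text, included):
    """Derive per-stage exclusions from the counts reported at each stage."""
    counts = [identified, after_dedup, full_text, included]
    # Each stage can only keep or reduce the number of records.
    if any(later > earlier for earlier, later in zip(counts, counts[1:])):
        raise ValueError("each stage must not exceed the previous one")
    return {
        "duplicates_removed": identified - after_dedup,
        "excluded_on_title_abstract": after_dedup - full_text,
        "excluded_on_full_text": full_text - included,
        "included": included,
    }


flow = prisma_flow(identified=12051, after_dedup=7360, full_text=36, included=24)
```

Under these assumptions, 4691 duplicates were removed, 7324 records were excluded on title and abstract, and 12 were excluded after full-text review.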

3.1. Eye tracking as an assessment tool

Sixteen studies reported the use of eye tracking as an assessment tool across multiple disciplines, including surgery (4), medical specialties (6), nursing (2), and nonmedical applications (4).


Figure - PRISMA chart of search strategy. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)

3.2. Surgical assessment

The assessment of gaze and attention and their relationship with expertise was the most common use of eye tracking. In surgery, Khan et al. [14] recorded the eye gaze of two expert surgeons carrying out a laparoscopic cholecystectomy, replaying it to 16 experts and 20 junior residents while recording their eye gaze. Experts watching the video demonstrated significantly greater overlap with the reference (expert) subjects, overlapping 55% of the time, compared with 43.8% for junior residents. A similar principle was assessed by Wilson et al. [15,17] in two studies, in which expert surgeons demonstrated significantly greater fixation on relevant anatomic targets in laparoscopic procedures.

Similar findings were reported for studies in nonsurgical specialties. Schulz et al. [29] studied 15 anesthetists in a simulator to assess eye behavior during a critical incident, noting differences in gaze, attention, and task completion between experts and novices. Eye tracking has also been used in radiology, where Manning et al. [20] studied radiologists, radiographers, and radiography students assessing pulmonary nodules in a series of radiographs. With increased expertise and training, radiologists required fewer fixations to identify pathology, compared with combined radiographers and novices (85 versus 91 versus 105, P = 0.017). Similar results were reported by Krupinski et al. [19] in the assessment of pathologists, pathology residents, and novices (medical students) reviewing specimens, with pathologists spending less time fixating on the target (4.471 versus 7.148 versus 11.861 s, respectively). Additional studies with similar methodologies in ophthalmological, endoscopic, and nursing contexts were in agreement with these findings [11,18,30].

Tracking of gaze and attention has also been used to identify different gaze patterns, which might be related to or acquired through experience and expertise. Marquard et al. [31] studied the differences in gaze behavior of 20 nurses giving drugs to patients in a simulated setting. Nurses who noticed an identification error scanned information across artifacts, processed more steps in a similar time, fixated primarily on patients’ charts, and had more predictable eye fixation sequences. Nurses who failed to identify errors spent longer in off-topic conversation and had random eye fixation sequences. Tiersma et al. [32] showed two cervical intraepithelial neoplasia slides to five pathologists and asked them for a diagnosis 45 s later. Two different scanning patterns were distinguished, termed by the authors a “scanning style” and a “selective style.” However, the interpretation and diagnosis varied between the observers. Outside of medical applications, similarly different scanning patterns have been noted in the context of aviation [33,34], driving [35], and other tasks [36].

Beyond gaze and gaze patterns, Richstone et al. [16] applied advanced algorithmic methods to the multitude of pupillometric data available from eye tracking. Through the application of


Table 1 - Summary of included studies.

Author

Year

Assessment surgery Khan [14] 2012

Origin

Type

Canada

Cohort

2011

UK

Cohort

Richstone [16]

2010

USA

Cohort

Wilson [17]

2010

UK

Cohort

Germany

Exploratory

Medical specialties Schulz [29] 2011

Device

Subjects (n)

Study quality

Tobii X50 eye tracker

36 doctors

Moderate

Applied Science Laboratories Mobile Eye EyeLink II eye tracking system

25 surgeons

Moderate

21 surgeons

Moderate

Applied Science Laboratories Mobile Eye gaze

14 surgeons

Moderate

Induce GA in full-scale simulator, increased workload by randomized critical incident in second/third session. Shown eight glaucomatous optic disc images and asked to diagnose.

EyeSeeCam

15 anesthetists

T120 Tobii technology

Moderate

Watch three videos of colonoscopy withdrawals and detect adenomas. Subjects read 20 virtual slide images of breast core biopsy Detect significant pulmonary nodules in 120 PA CXR.

Applied Science Laboratories Mobile Eye Eye tracker SU4000

30 glaucoma subspecialists and ophthalmology trainees 11 endoscopists Nine pathologists and medical students 21 radiologists, radiographers, and radiography students Five pathologists

Moderate

Expert surgeon carried out lap. Cholecystectomy with eye gaze recorded. Video shown to experts and postgrads with eye gaze recorded. Locate balls within a jelly mass and place them in an endobag on a laparoscopic surgical simulator. Eye gaze recorded expert and novice surgeons during live transperitoneal lap. Renal surgery and simulated lap. Task. Subjects trained to laparoscopically touch colored flashing balls using the tip of the same colored instrument within set time.

O’Neill [18]

2011

Australia

Cohort

Almansa [30]

2011

USA

Cohort

Krupinski [19]

2006

USA

Cohort

Manning [20]

2005

UK

Cohort

Tiersma [32]

2003

Netherlands

Cohort

Two CIN slides shown to pathologists for answer within 45 s, questionnaire þ interview for info on steps to diagnosis.

EyeCatcher

Nursing Marquard [31]

2011

USA

Cohort

Applied Science Laboratories Mobile Eye

20 nurses

Koh [11]

2011

Singapore

Cohort

Nurses administer meds to three patients in simulated clinical setting. One patient has an ID error. Eye gaze from nurses during caesarean section analyzed.

Applied Science Laboratories Mobile Eye Tetherless

20 scrub nurses

Nonmedical Merwe [34]

2012

Netherlands

Cohort

Applied Science Laboratories 6000

12 air pilots

Borowski [23]

2010

Israel

Cohort

Applied Science Laboratories remote model 504

61 drivers

Moderate

Bowling [36]

2008

USA

Cohort

ISCAN eye tracker mounted within virtual reality V8 head mounted display

Six students from Clemson Uni

Moderate

Simulations of system malfunction from a fuel leak run and teams of captain þfirst officer investigate. Experienced drivers had hazard perception test, inexperienced had training (1) AAHPT active, (2) AAHPT instructional, (3) AAHPT hybrid, (4) control Aircraft inspection task of simulated aircraft cargo bay to search for damage. Variables: DI or GI, un/paced (time constrained)

Applied Science Laboratories 504 remote optics system

Moderate

Moderate


Wilson [15]

Task

Sartar [33]

USA

Cohort

Fly 1 h scenario with events on a simulator (three scenarios).

Applied Science Laboratories 4000

22 experienced airline pilots

2012

UK

RCT

Poor

2011

UK

RCT

Applied Science Laboratories Mobile eye gaze Applied Science Laboratories Mobile Eye gaze

27 novices

Wilson [38]

30 medical trainees

Moderate

Sodergren [25]

2011

UK

RCT

2011

UK

Cohort

30 final-year med students 21 surgeons

Moderate

Sodergren [39]

Chetwood [24]

2009

UK

Cohort

50 learning trials to laparoscopically move six balls placed on stems into a cup. Subjects trained to laparoscopically touch colored flashing balls using the tip of the same colored instrument within set time. Subjects shown 12 pictures of lap. Cholecystectomy and asked for orientation. Surgeons were shown eight images during NOTES and had to answer three questions based on orientation. Supervisor eye gaze projected onto screen during lap. Task. Subjects had to identify 10 different objects in an environment of various shape, size, and color.

Tobii Technologies AB

28 with various experience/English proficiency

Moderate

Nonmedical Moore [40]

2012

UK

Cohort

Take part in blocks of 40 golf putts

2011

UK

Cohort

Vine [42]

2011

UK

Cohort

Each subject took 10 penalty shots with the goalkeeper present during their training. Performed 360 basketball free throws and then 120 under conditions to manipulate level of anxiety experienced.

40 undergraduate students 20 soccer Penalty takers 20 undergraduate Students

Moderate

Wood [41]

Applied Science Laboratories Mobile Eye tracker Applied Science Laboratories Mobile Eye tracker Applied Sciences Laboratories Mobile Eye tracker

Training surgery Vine [37]

ET 1750, Tobii Tobii ET 1750


Moderate Moderate

AAHPT = Act and Anticipate Hazard Perception Training; DI = detailed inspection; GI = general inspection; NOTES = natural orifice transluminal endoscopic surgery; RCT, ---; CIN, ---; ID, ---.


2007


Table 2 - Summary of study results.

Assessment: surgery

Khan [14]
Outcomes: Eye gaze overlap when watching the video.
Cohort exposure (n): Experts (16) and juniors (20).
Results: Experts watching the video overlapped 55% and juniors 43.8% with the eye gaze of the expert carrying out the surgery.

Wilson [15]
Outcomes: Percentage of time fixating on important locations.
Cohort exposure (n): Experienced (10) and novice (15) surgeons.
Results: Experts spent more time fixating on the target (jelly, ball, or endobag) than novice surgeons (P < 0.005).

Richstone [16]
Outcomes: Accuracy of LDA + NNA in distinguishing expert/novice surgeons.
Cohort exposure (n): Simulator: expert (1), nonexperts (9). Live surgery: experts (2), nonexperts (9). (Experts: >300 laparoscopic cases.)
Results: Simulated surgery: differentiated expert and novice with 91.9% and 92.9% accuracy. Live surgery: expert 81.0% and novice 90.7% accuracy.

Wilson [17]
Outcomes: Accuracy, completion time, economy of movement, fixation frequency.
Cohort exposure (n): Expert surgeons (8) and novice surgeons (6).
Results: Expert surgeons spent more time fixating on the target location than novices (P < 0.001).

Assessment: medical specialties

Schulz [29]
Outcomes: Spatial distribution of gaze.
Results: Experienced anesthetists increased time to manual tasks (21%-25%) during critical incidents, whereas less experienced decreased time (20%-14%).

O’Neill [18]
Outcomes: Time taken to interpret disc image, gaze behavior and fixation points, and correlation with diagnostic accuracy.
Cohort exposure (n): Glaucoma subspecialists (7) and ophthalmology trainees (23).
Results: Trainees spent more time looking at the images than subspecialists (21.3 versus 16.6 s, P < 0.01). Experts spent a larger proportion of total time examining AOIs than trainees (28.9% versus 13.5%, P < 0.05).

Almansa [30]
Outcomes: Total gaze time in each segment.
Results: Adenoma detection rate was associated with increased percentage of central gaze time (P = 0.024). Endoscopists with fewer years of practice had lower percentage of central gaze time.

Krupinski [19]
Outcomes: Fixation time on self-selected locations, saccades, x + y coordinates of areas zoomed in for higher ROI magnifications.
Cohort exposure (n): Med students (3), pathology residents (3), pathologists (3).
Results: Pathologists spent less time scanning compared with pathology residents and med students (4.471 versus 7.148 versus 11.861 s). Unlike the other groups, pathologists dwelled on three locations subsequently chosen for zooming, which were frequently outside foveal vision.

Manning [20]
Outcomes: Saccadic amplitude per image, coverage of image area, fixations per image, duration of film scrutiny.
Cohort exposure (n): Experienced radiologists (8), radiographers (5), novice radiography students (8).
Results: Radiologists and trained radiographers better than the two lower experience groups (P = 0.0492). Untrained radiographers and novices had more fixations per film than trained experts. Radiographers reduced fixation number per film after training. Distance between fixations (saccades) larger for radiologists than other groups. Radiologists and trained radiographers less time per film.

Tiersma [32]
Outcomes: Scanning pattern.
Results: Scanning patterns distinguished: scanning style and selective style. Interpretation and diagnosis varied between the observers.

Assessment: nursing

Marquard [31]
Outcomes: Average time to complete one process step, engagement in off-topic discussions, ratio of eye fixations across artifacts.
Results: Error-identifying nurses: scan information across artifacts, more process steps in similar time, mainly fixate on patients’ chart, and have predictable eye fixation sequences. Nonerror-identifying nurses: increased duration in off-topic conversation and had random eye fixation sequences.

Koh [11]
Outcomes: Dwell time.
Cohort exposure (n): Novice nurses (10), expert nurses (10).
Results: Experienced nurses take less time on final count and had fewer interruptions. Novice nurses switched attention between AOIs more.

Assessment: nonmedical

Merwe [34]
Outcomes: Fixation rates + dwell time on displays and scanning entropy.
Results: Preperiod: fixation + dwell time mainly on NAV display + PFD. Postperiod: ECAM display was primary focus. Higher fixation rate on ECAM resulted in finding the malfunction quicker.

Borowski [35]
Outcomes: Fixation + cumulative fixation duration.
Cohort exposure (n): Experienced drivers (21), inexperienced drivers (40).
Results: AAHPT hybrid + instructional reported more pedestrian-related events. AAHPT hybrid fixated pedestrians more than experienced. AAHPT hybrid + active had longer fixation duration than experienced.

Bowling [36]
Outcomes: Percentage of defects detected, mean search time for minor/major/critical defects/total defects, fixations.
Cohort exposure (n): Order of scenarios and order of detailed inspection or general inspection were random.
Results: Higher inspection accuracy in DI compared with GI + unpaced compared with paced. Mean search time decreased in paced versus unpaced. More fixations in DI versus GI + unpaced versus paced. DI had lower mean fixation duration and covered more area than GI.

Sartar [33]
Outcomes: Fixation frequency.
Results: Pilots monitor basic flight parameters more than visual indications of the automation configuration. Fixation frequency to PFD was 31%, map display 25%, and outside world 3%.

Training: surgery

Vine [37]
Outcomes: Completion time, accuracy (number of balls knocked off).
Cohort exposure (n): Gaze training (14), discovery learning group (13).
Results: Completion time of gaze trained was faster than discovery-trained groups (60.3 versus 67.2). Gaze trained also higher accuracy and target locking than discovery trained.

Wilson [38]
Outcomes: Completion time, total path length, tone counting performance (multitasking).
Cohort exposure (n): Gaze training (10), movement training (10), and discovery learning (10).
Results: Gaze trained group had faster completion (P = 0.005) and more locking fixation (P = 0.015) than movement trained group. Gaze trained also had better multitasking than both movement trained and discovery learning groups.

Sodergren [25]
Outcomes: Number of correct answers, time for answer, gaze dwell time on each ROI, and subject fixation sequences on ROI in each image.
Cohort exposure (n): Orientation training (15), control (15).
Results: Orientation-trained group gave more correct answers (75.6 versus 56.1, P = 0.019) and took longer to give an answer (24.0 versus 19.8 s, P = 0.010) than control group.

Sodergren [39]
Outcomes: Time taken to establish orientation and fixation sequences on organs and structures.
Results: High performance subjects had fewer fixations (P < 0.006) and lower normalized dwell time (P < 0.005) per area of interest.

Chetwood [24]
Outcomes: Completion time, gaze latency, gaze convergence, number of errors.
Cohort exposure (n): English as first language (14), English not as first language (14) (did not state how many in verbal, eye, and both).
Results: Completion time reduced in eye + VerbalEye versus verbal-trained groups (2.74 + 3.42 versus 6.05). Gaze latency sig. reduced in eye versus verbal. Errors only in verbal group. Gaze convergence reduced with eye and VerbalEye.

Training: nonmedical

Moore [40]
Outcomes: Mean radial error and percentage of putts.
Cohort exposure (n): QE-trained (20) and technical trained (20).
Results: QE-trained group had lower mean radial error than technical trained group (P < 0.001).

Wood [41]
Outcomes: Shooting accuracy and success rate.
Cohort exposure (n): QE-trained (10) and control (10).
Results: QE-trained group shot further away from the goalkeeper after week 2 (P < 0.05). Placebo group had more shots saved (14%) than QE group (7%) (P < 0.005).

Vine [42]
Outcomes: Free throw percentage.
Cohort exposure (n): QE-trained (10) and control (10).
Results: QE-trained group performed better at heightened levels of cognitive anxiety (66% accuracy versus 46%, P < 0.001).

AAHPT = act and anticipate hazard perception training; DI = detailed inspection; ECAM = electronic centralized aircraft monitoring; GI = general inspection; LDA = linear discriminant analysis; NNA = nonlinear neural network analyses; PFD = primary flight display; QE = quiet eye; ROI, ---; AOI, ---; NAV, ---.


linear discriminant and nonlinear neural network analyses to both simulated and live laparoscopic procedures, the authors reported that their system was able to accurately distinguish novice and expert surgeons with 91.9% and 92.9% accuracy, respectively, in simulated surgery and 81.0% and 90.7%, respectively, in live surgery.
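To make the idea of a linear discriminant concrete, the following is a minimal sketch of a two-class Fisher discriminant on two eye metrics, with accuracy reported separately for each class. It is not the system described by Richstone et al. [16], whose features and implementation are not given here; the two features (fixations per task, proportion of time fixating on the target), the data values, and all function names are invented for illustration.

```python
from statistics import mean


def fit_lda_2d(class0, class1):
    """Fit a two-class Fisher linear discriminant on 2-D feature vectors.

    Returns (w, threshold); a sample x is assigned to class 1 when
    w[0] * x[0] + w[1] * x[1] > threshold.
    """
    def centroid(rows):
        return (mean(r[0] for r in rows), mean(r[1] for r in rows))

    def scatter(rows, mu):
        # Entries of the 2x2 scatter matrix around the class mean.
        sxx = sum((r[0] - mu[0]) ** 2 for r in rows)
        sxy = sum((r[0] - mu[0]) * (r[1] - mu[1]) for r in rows)
        syy = sum((r[1] - mu[1]) ** 2 for r in rows)
        return sxx, sxy, syy

    mu0, mu1 = centroid(class0), centroid(class1)
    a0, b0, d0 = scatter(class0, mu0)
    a1, b1, d1 = scatter(class1, mu1)
    # Pooled within-class scatter matrix [[a, b], [b, d]].
    a, b, d = a0 + a1, b0 + b1, d0 + d1
    det = a * d - b * b
    if det == 0:
        raise ValueError("within-class scatter matrix is singular")
    dx, dy = mu1[0] - mu0[0], mu1[1] - mu0[1]
    # w = S_w^{-1} (mu1 - mu0), using the closed-form 2x2 inverse.
    w = ((d * dx - b * dy) / det, (-b * dx + a * dy) / det)
    # Decision threshold: projection of the midpoint between class means.
    mid = ((mu0[0] + mu1[0]) / 2, (mu0[1] + mu1[1]) / 2)
    threshold = w[0] * mid[0] + w[1] * mid[1]
    return w, threshold


def predict(w, threshold, x):
    """Return 1 (expert) or 0 (novice) for one feature vector."""
    return 1 if w[0] * x[0] + w[1] * x[1] > threshold else 0


def per_class_accuracy(w, threshold, class0, class1):
    """Fraction of each class that is assigned its own label."""
    acc0 = sum(predict(w, threshold, x) == 0 for x in class0) / len(class0)
    acc1 = sum(predict(w, threshold, x) == 1 for x in class1) / len(class1)
    return acc0, acc1


# Invented data: (fixations per task, proportion of time on target).
novices = [(105, 0.40), (98, 0.45), (110, 0.38), (102, 0.42)]
experts = [(85, 0.60), (80, 0.66), (88, 0.58), (83, 0.62)]

w, threshold = fit_lda_2d(novices, experts)
novice_acc, expert_acc = per_class_accuracy(w, threshold, novices, experts)
```

Reporting accuracy per class, as in the figures quoted above, shows whether a classifier errs more often on novices or on experts, which a single pooled accuracy would hide. In practice accuracy would be estimated on held-out subjects rather than on the data used for fitting.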

3.3. Eye tracking as a training tool

Eight articles studied eye tracking as a training tool in the fields of surgery (5) and nonmedical applications (3). By training subjects to focus their gaze on critical fixation points or to follow benchmark gaze patterns recorded from experts, a number of studies have been able to demonstrate improved task performance. Vine et al. [37] had 27 novice participants carry out 50 learning trials of laparoscopically moving six balls placed on stems of different heights into a cup. Subjects were either left to train independently (discovery learning) or were trained to focus gaze and attention on key points to aid in the task (gaze trained). The gaze-trained group had a significantly shorter completion time (34 versus 40 s, P < 0.05) and made fewer mistakes (24 versus 36, P < 0.05) than the discovery learning group. The gaze-trained group also outperformed other groups in a similar study by Wilson et al. [38]. A randomized trial by Sodergren et al. [25] taught orientation strategies to 15 medical students, with another 15 serving as controls. They were shown 12 images of a laparoscopic cholecystectomy and asked for the orientation. The orientation-trained group gave more correct answers (75.6 versus 56.1, P = 0.019) and took longer to give an answer (24.0 versus 19.8 s, P = 0.010) than the control group. In a further study by the same group [39], surgeons were given eight images of natural orifice transluminal endoscopic surgery and were asked to answer three questions on orientation. Subjects who excelled had fewer fixations (P < 0.006) and lower normalized dwell time (P < 0.005) per area of interest.

The potential of eye tracking as a control device in teaching was reported by Chetwood et al. [24], who projected the eye gaze of a supervisor onto a screen during a simulated laparoscopic task in which subjects had to identify 10 different objects. Subjects were guided by verbal cues, by the supervisor’s eye gaze, or by both. Completion time was significantly reduced in the eye gaze and combined verbal and eye gaze groups compared with verbal cues alone (2.74 and 3.42 versus 6.05 s, P < 0.05). Errors occurred only in the verbal cue group.

Eye tracking has been explored with similar success in sport, particularly with the use of “quiet eye” training, in which the performer holds a final fixation on a defined point for a brief pause (the “quiet eye”) before carrying out the relevant action [40–42]. Moore et al. [40], Wood and Wilson [41], and Vine and Wilson [42] assessed this in golf putting, penalty shooting, and basketball free throws, respectively, reporting that quiet eye-trained groups performed better than their control counterparts in each case.
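To make the area-of-interest metrics mentioned above more concrete, the following minimal Python sketch shows one way fixation count and normalized dwell time per area of interest might be computed from a list of fixations. It is an illustration only and not the method of any reviewed study: the `Fixation` and `AreaOfInterest` types, the rectangular areas, the example coordinates, and the definition of normalized dwell time as dwell time in the area divided by total fixation time are all assumptions made here.

```python
from dataclasses import dataclass


@dataclass
class Fixation:
    x: float         # horizontal gaze position in pixels (hypothetical units)
    y: float         # vertical gaze position in pixels
    duration: float  # fixation duration in seconds


@dataclass
class AreaOfInterest:
    # Hypothetical rectangular area of interest in screen coordinates.
    name: str
    left: float
    top: float
    right: float
    bottom: float

    def contains(self, f: Fixation) -> bool:
        return self.left <= f.x <= self.right and self.top <= f.y <= self.bottom


def aoi_metrics(fixations, aois):
    """Return {area name: (fixation count, normalized dwell time)}.

    Assumption: normalized dwell time is the summed fixation duration inside
    the area divided by the summed duration of all fixations.
    """
    total = sum(f.duration for f in fixations)
    metrics = {}
    for aoi in aois:
        inside = [f for f in fixations if aoi.contains(f)]
        dwell = sum(f.duration for f in inside)
        metrics[aoi.name] = (len(inside), dwell / total if total > 0 else 0.0)
    return metrics


# Made-up example data, for illustration only.
fixations = [
    Fixation(100, 100, 0.5),
    Fixation(120, 110, 0.25),
    Fixation(400, 300, 0.25),
]
aois = [
    AreaOfInterest("landmark", 50, 50, 200, 200),
    AreaOfInterest("periphery", 350, 250, 500, 400),
]

print(aoi_metrics(fixations, aois))
```

With such a table, metrics for different subjects viewing the same image could be compared area by area; how areas are drawn and how dwell time is normalized would need to follow the protocol of the study in question.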

4. Discussion

This systematic review summarizes current evidence for the use of eye tracking in assessment and training. The included studies demonstrate that, although current eye tracking research has adopted varying methods, there is increasing evidence supporting the use of eye tracking as an educational tool. The visual nature of surgery and the need to identify anatomical landmarks make it an ideally suited field for such technology.

The studies presented here identify several assessment methods that might be considered for future research, development, and validation. The overlap of gaze and fixation locations, as surrogate markers for the subject’s perceived areas of importance, was assessed by several studies [11,14,15,17,30,33,34,39]. Knowledge and identification of key landmarks are critical for anatomical orientation and the avoidance of vital structures in surgery. It may therefore be assumed that, with greater expertise, these vital areas will occupy a greater proportion of a surgeon’s focus and that this will be reflected in the corresponding eye tracking metrics. The implications of this for training and performance improvement were also explored, with uniformly positive results [25,37,38]. Identifying the areas on which experts concentrate their gaze and fixations allows these to be demonstrated to novices who might otherwise lack such understanding, which may accelerate learning and skill acquisition. This builds on much earlier gray literature research conducted in the 1970s by Vickers [43], who demonstrated the significance of gaze training in sports, improving a basketball team’s percentage of successful free throws purely by emphasizing a focal point for gaze before throwing. Since then, other studies have shown, with equally positive results, that “quiet eye”-trained sports players perform better, a finding that can be incorporated into the training of athletes [40–42].

Fixation patterns are important in the rapid analysis and recognition of information and represent a second point for assessment.
This was described in two studies [29,32] outside of surgery; here, too, clear lessons can be learned. The assessment of focus and susceptibility to distraction is a potential use with clear applicability to the operating room; Marquard et al. [31] have already demonstrated the relationship of these factors to an increased risk of error. Based on such research, a systematic, structured approach to the execution of tasks can be developed. Training in a standardized visual search pattern may lead to greater accuracy and efficiency, with less susceptibility to distraction. This was seen to be useful in surgery, where the gaze behavior of experts during laparoscopic tasks was analyzed and teaching the resulting target-locking strategy to novices improved their performance [37,38]. The same applies to orientation strategies during laparoscopic cholecystectomy [25].

It must be acknowledged that this review considered a broad range of professions, applications, and uses of eye tracking, and that the included studies were of moderate quality. This reflects the relatively nascent state of eye tracking research. Although the concept itself is decades old, only recently has technology allowed the miniaturization and portability of devices and consideration of their application to complex areas such as surgery. Despite these limitations, the evidence unanimously suggests that eye tracking is useful for assessment, with promising early validation data. Compared with global rating scales and motion analysis, which are both well validated, eye tracking is still in the very early stages of research for use in the assessment of surgical skills. Its advantages, however, include the portability of current eye tracking technology, which allows it to be used in all surgical contexts from the laboratory to the operating room, and its ability to produce quantitative, objective data with minimal setup or involvement from experts or technicians.

On the basis of currently available evidence, eye tracking remains a promising area of research, with clear applications to assessment and training in surgery. It represents an unobtrusive, objective measurement tool that provides reliable, quantitative data. Although research to date has focused on laparoscopic surgical assessment, the application of eye tracking is not limited by procedure type or surgical approach. Further research is required, including the implementation of eye tracking in robustly designed surgical trials, to assess surgeon ability and to evaluate eye tracking as a training tool for improving performance. Nevertheless, it remains a promising new addition to the surgeon assessor’s arsenal.

Uncited tables

Tables 1 and 2.

Acknowledgment

Authors’ contributions: P.P., M.S., G.Y., and A.D. contributed to conception; T.T. collected the data. T.T. and M.S. interpreted the article. P.P. and M.S. designed the article. T.T. analyzed the article. T.T. and P.P. wrote the article. M.S., K.S., G.Y., and A.D. made the critical appraisal of the article. All authors have read and approved the final manuscript.

Disclosure

The authors have no conflicts of interest or financial ties to disclose.

References

[1] Kohn LT, Corrigan JM, Donaldson MS. To err is human: building a safer health system. National Academies Press; 2000.
[2] Pickersgill T. The European Working Time Directive for doctors in training: we will need more doctors and better organisation to comply with the law. BMJ 2001;323:1266.
[3] Sen S, Kranzler HR, Didwania AK, et al. Effects of the 2011 duty hour reforms on interns and their patients: a prospective longitudinal cohort study. JAMA Intern Med 2013;173:657.
[4] Larson JL, Williams RG, Ketchum J, Boehler ML, Dunnington GL. Feasibility, reliability and validity of an operative performance rating system for evaluating surgery residents. Surgery 2005;138:640.
[5] Martin J, Regehr G, Reznick R, MacRae H, Murnaghan J, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg 1997;84:273.
[6] Sharma B, Mishra A, Aggarwal R, Grantcharov TP. Nontechnical skills assessment in surgery. Surg Oncol 2011;20:169.
[7] Eriksen J, Grantcharov T. Objective assessment of laparoscopic skills using a virtual reality stimulator. Surg Endosc 2005;19:1216.
[8] Moorthy K, Munz Y, Sarker SK, Darzi A. Objective assessment of technical skills in surgery. BMJ 2003;327:1032.
[9] Fitts PM, Jones RE, Milton JL. Eye movements of aircraft pilots during instrument-landing approaches. Aeronaut Eng Rev 1950;9:24.
[10] Duchowski AT. Eye tracking methodology: theory and practice. Springer; 2007.
[11] Koh RY, Park T, Wickens CD, Ong LT, Chia SN. Differences in attentional strategies by novice and experienced operating theatre scrub nurses. J Exp Psychol Appl 2011;17:233.
[12] Thomas LE, Lleras A. Moving eyes and moving thought: on the spatial compatibility between eye movements and cognition. Psychon Bull Rev 2007;14:663.
[13] Thomas LE, Lleras A. Covert shifts of attention function as an implicit aid to insight. Cognition 2009;111:168.
[14] Khan RS, Tien G, Atkins MS, Zheng B, Panton ON, et al. Analysis of eye gaze: do novice surgeons look at the same location as expert surgeons during a laparoscopic operation? Surg Endosc 2012;26:3536.
[15] Wilson MR, McGrath JS, Vine SJ, Brewer J, Defriend D, et al. Perceptual impairment and psychomotor control in virtual laparoscopic surgery. Surg Endosc 2011;25:2268.
[16] Richstone L, Schwartz MJ, Seideman C, Cadeddu J, Marshall S, et al. Eye metrics as an objective assessment of surgical skill. Ann Surg 2010;252:177.
[17] Wilson M, McGrath J, Vine S, Brewer J, Defriend D, et al. Psychomotor control in a virtual laparoscopic surgery training environment: gaze control parameters differentiate novices from experts. Surg Endosc 2010;24:2458.
[18] O’Neill EC, Kong YXG, Connell PP, Haymes SA, Coote MA, et al. Gaze behavior among experts and trainees during optic disc examination: does how we look affect what we see? Invest Ophthalmol Vis Sci 2011;52:3976.
[19] Krupinski EA, Tillack AA, Richter L, Henderson JT, Bhattacharyya AK, et al. Eye-movement study and human performance using telepathology virtual slides: implications for medical education and differences with experience. Hum Pathol 2006;37:1543.
[20] Manning D, Ethell S, Donovan T, Crawford T. How do radiologists do it? The influence of experience and training on searching for chest nodules. Radiography 2006;12:134.
[21] Miyata H, Minagawa-Kawai Y, Watanabe S, Sasaki T, Ueda K. Reading speed, comprehension and eye movements while reading Japanese novels: evidence from untrained readers and cases of speed-reading trainees. PLoS One 2012;7:e36091.
[22] Di Stasi LL, Contreras D, Cándido A, Cañas JJ, Catena A. Behavioral and eye-movement measures to track improvements in driving skills of vulnerable road users: first-time motorcycle riders. Transp Res F 2011;14:26.
[23] Borowsky A, Oron-Gilad T, Meir A, Parmet Y. Drivers’ perception of vulnerable road users: a hazard perception approach. Accid Anal Prev 2012;44:160.
[24] Chetwood AS, Kwok KW, Sun LW, Mylonas GP, Clark J, et al. Collaborative eye tracking: a potential training tool in laparoscopic surgery. Surg Endosc 2012;26:2003.
[25] Sodergren MH, Orihuela-Espina F, Froghi F, Clark J, Teare J, et al. Value of orientation training in laparoscopic cholecystectomy. Br J Surg 2011;98:1437.
[26] Moher D, Liberati A, Tetzlaff J, Altman DG. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med 2009;151:264.
[27] Jadad AR, Moore RA, Carroll D, Jenkinson C, Reynolds DJ, et al. Assessing the quality of reports of randomized clinical trials: is blinding necessary? Control Clin Trials 1996;17:1.
[28] Wells G, Shea B, O’Connell D, Peterson J, Welch V, et al. The Newcastle-Ottawa Scale (NOS) for assessing the quality of nonrandomised studies in meta-analyses. 3rd Symposium on Systematic Reviews: Beyond the Basics; 2000. p. 3–5.
[29] Schulz CM, Schneider E, Fritz L, Vockeroth J, Hapfelmeier A, et al. Visual attention of anaesthetists during simulated critical incidents. Br J Anaesth 2011;106:807.
[30] Almansa C, Shahid MW, Heckman MG, Preissler S, Wallace MB. Association between visual gaze patterns and adenoma detection rate during colonoscopy: a preliminary investigation. Am J Gastroenterol 2011;106:1070.
[31] Marquard JL, Henneman PL, He Z, Jo J, Fisher DL, et al. Nurses’ behaviors and visual scanning patterns may reduce patient identification errors. J Exp Psychol Appl 2011;17:247.
[32] Tiersma ESM, Peters AA, Mooij HA, Fleuren GJ. Visualising scanning patterns of pathologists in the grading of cervical intraepithelial neoplasia. J Clin Pathol 2003;56:677.
[33] Sarter NB, Mumaw RJ, Wickens CD. Pilots’ monitoring strategies and performance on automated flight decks: an empirical study combining behavioral and eye-tracking data. Hum Fact 2007;49:347.
[34] van de Merwe K, van Dijk H, Zon R. Eye movements as an indicator of situation awareness in a flight simulator experiment. Int J Av Psych 2012;22:78.
[35] Borowsky A, Shinar D, Oron-Gilad T. Age, skill, and hazard perception in driving. Accid Anal Prev 2010;42:1240.
[36] Bowling SR, Khasawneh MT, Kaewkuekool S, Jiang X, Gramopadhye AK. Evaluating the effects of virtual training in an aircraft maintenance task. Int J Aviation Psych 2008;18:104.
[37] Vine SJ, Masters RS, McGrath JS, Bright E, Wilson MR. Cheating experience: guiding novices to adopt the gaze strategies of experts expedites the learning of technical laparoscopic skills. Surgery 2012;152:32.
[38] Wilson MR, Vine SJ, Bright E, Masters RS, Defriend D, et al. Gaze training enhances laparoscopic technical skill acquisition and multi-tasking performance: a randomized, controlled study. Surg Endosc 2011;25:3731.
[39] Sodergren MH, Orihuela-Espina F, Mountney P, Clark J, Teare J, et al. Orientation strategies in natural orifice translumenal endoscopic surgery. Ann Surg 2011;254:257.
[40] Moore LJ, Vine SJ, Cooke A, Ring C, Wilson MR. Quiet eye training expedites motor learning and aids performance under heightened anxiety: the roles of response programming and external attention. Psychophysiol 2012;49:1005.
[41] Wood G, Wilson MR. Quiet-eye training for soccer penalty kicks. Cogn Process 2011;12:257.
[42] Vine SJ, Wilson MR. The influence of quiet eye training and pressure on attention and visuo-motor control. Acta Psychol (Amst) 2011;136:340.
[43] Vickers JN. Advances in coupling perception and action: the quiet eye as a bidirectional link between gaze, attention, and action. Prog Brain Res 2009;174:279.
