Review

Objective assessment of technical skill

Helen M. MacRae

University of Toronto Surgical Skills Centre at Mount Sinai Hospital, Department of Surgery, University of Toronto, Toronto, Canada

Article info

Article history: Received 22 October 2010; Accepted 3 November 2010

Keywords: Objective assessment; Technical skill; Trainees

Abstract

Objective assessment of technical skill is an important component of skills training: trainees require that deficiencies are clearly and objectively identified if a model of deliberate practice with feedback on skill acquisition is to be employed. There are several types of reliable and valid assessments for technical skill currently available.

© 2010 Royal College of Surgeons of Edinburgh (Scottish charity number SC005317) and Royal College of Surgeons in Ireland. Published by Elsevier Ltd. All rights reserved.

Types of assessment

These include expert based systems, proficiency based models of progression and computer based systems.

Expert based systems

The Objective Structured Assessment of Technical Skills (OSATS) was first reported by Martin et al. at the University of Toronto. OSATS is a multistation, bell-ringer examination, with subjects performing a portion of an operative procedure at each station. OSATS uses bench models, although some investigators have used it as part of a workplace based assessment, in the operating room. Expert surgeons serve as examiners and, while directly observing performance, complete task specific checklists as well as a global rating scale for each station. The checklist items are scored on a binary scale (done correctly, or incorrectly performed/not done), and vary in number depending on the complexity of the task. The global rating scale consists of 7 components of operative skill generic to most operations, such as economy of motion or knowledge of procedure. Each component is marked on a 5-point Likert-type scale, with anchoring descriptors at 1, 3 and 5. OSATS has been validated as a measure of technical skill by comparing the performance of residents at different stages of training, demonstrating evidence of construct validity. Concurrent validity has been shown by comparing OSATS ratings with ratings of the final product of the technical performance, as well as by comparing faculty rankings of residents' technical abilities with their OSATS scores. In all of these analyses, global ratings have performed better than checklists, showing better reliability and greater validity. Thus, the global ratings are used primarily for assessment, with the checklist being used to help anchor the examiner and to provide feedback to trainees. Although OSATS was developed primarily as a bench model examination, it has also been used in the operating room. Datta et al. compared performance on a bench model OSATS examination of saphenofemoral junction dissection with live operative performance, and demonstrated similar ratings for the two settings, with evidence of construct validity.
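As an illustration of how the two parts of an OSATS-style station score fit together, the following sketch (in Python, purely for exposition) records a binary checklist and a 7-component global rating and aggregates each; the checklist items and most of the component names are hypothetical placeholders, and the aggregation shown is only one plausible convention, not the published instrument.

# Hypothetical sketch of OSATS-style scoring; item names and aggregation are
# illustrative assumptions, not the published instrument.

CHECKLIST = {                       # assumed task specific items, scored done/not done
    "identifies correct tissue plane": True,
    "maintains adequate traction": False,
    "ties secure square knots": True,
}

GLOBAL_COMPONENTS = [               # 7 generic components, each rated 1-5
    "economy of motion",            # named in the text
    "knowledge of procedure",       # named in the text
    "respect for tissue",           # remaining five names are placeholders
    "instrument handling",
    "flow of operation",
    "use of assistants",
    "overall performance",
]

def checklist_score(checklist):
    """Proportion of checklist items performed correctly (0 to 1)."""
    return sum(checklist.values()) / len(checklist)

def global_rating_score(ratings):
    """Sum of the seven component ratings (possible range 7 to 35)."""
    assert all(1 <= ratings[c] <= 5 for c in GLOBAL_COMPONENTS)
    return sum(ratings[c] for c in GLOBAL_COMPONENTS)

example_ratings = {c: 4 for c in GLOBAL_COMPONENTS}   # a single examiner's ratings
print(checklist_score(CHECKLIST), global_rating_score(example_ratings))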


Tools have been developed for specific types of technical skills with properties similar to OSATS. These include the Global Assessment of Gastrointestinal Endoscopic Skills (GAGES) and the Global Operative Assessment of Laparoscopic Skills (GOALS). GAGES is a 5-item assessment tool, with slightly different items for upper and lower gastrointestinal endoscopy. This is also an expert based system, with experts observing performance and completing the 5-item scale. GAGES has been shown to be reliable, with inter-rater reliability of over 0.9 for most iterations, and evidence of validity has also been demonstrated. GOALS consists of a 5-item global rating scale, with items specific to laparoscopic skills. GOALS has also been shown to be reliable, with evidence of validity. The major disadvantage of these tools is that they rely on expert examiners, a valuable and not always readily available resource. As well, for formative feedback, they are not necessarily structured in such a way as to give trainees the type of specific, detailed information they might need to systematically improve their technical skills.

Bench model proficiency based tools

In general, valid metrics must be established for proficiency based tools. This is true for both computer based assessments, which will be discussed in the next section, and bench models. The methodology is first to have experts deconstruct the important aspects of the task, and then to develop models that replicate these essential aspects. A group of experts is then asked to perform the task, and their performance is documented, usually as a combination of time to completion and precision or error scores. The expert scores are then used to develop performance metrics, setting levels achievable by trainees. MISTELS (the McGill Inanimate System for Training and Evaluation of Laparoscopic Skills) is one of the best validated systems that relies on a specific set of metrics to score performance. In this system, the manual skills felt to be essential to the performance of laparoscopic surgery were identified by a group of experienced surgeons. Exercises were then developed, initially in an endotrainer box. The five tasks developed were peg transfer, cutting/dissecting, placement of a ligating loop, and suturing with intracorporeal or extracorporeal knot tying. Each task is scored using a combination of a time score and precision, with penalties assigned for errors. Fried et al. developed and have extensively validated the system, showing excellent reliability, with inter-rater reliability of 0.998, test-retest reliability of 0.892 and internal consistency of 0.86. MISTELS was the basis for the technical skill component of the Fundamentals of Laparoscopic Surgery (FLS) program. In this program, proficiency standards for technical skill performance must be met. These types of proficiency standards set very specific targets for learners, and can act as a formative feedback system, with learners able to assess their progress towards the identified target. The disadvantages are that each technical skill requires a somewhat laborious process to develop a proficiency level: true experts must be identified, and metrics assessed for reliability and validity. As well, there is a risk of setting standards at a level that is too difficult for learners to achieve, leading to discouragement. Finally, these measures tend to be applicable to discrete tasks, such as laparoscopic knot tying or peg transfer, rather than to an entire procedure, such as cholecystectomy.
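To make the idea of a time-plus-precision score concrete, here is a minimal sketch of how such a task score and proficiency check could be computed; the cutoff time, error penalty and pass level below are invented for the example and are not the MISTELS or FLS metrics.

# Hypothetical proficiency based task score combining time and errors.
# All constants are assumed example values, not published MISTELS/FLS metrics.

CUTOFF_SECONDS = 300.0     # assumed maximum time allowed for the task
ERROR_PENALTY = 10.0       # assumed penalty, in seconds, per precision error
PASS_SCORE = 70.0          # assumed proficiency level derived from expert performance

def task_score(time_seconds, errors):
    """Score from 0 to 100; faster, more precise performances score higher."""
    penalised_time = time_seconds + errors * ERROR_PENALTY
    remaining = max(CUTOFF_SECONDS - penalised_time, 0.0)
    return 100.0 * remaining / CUTOFF_SECONDS

def meets_proficiency(time_seconds, errors):
    return task_score(time_seconds, errors) >= PASS_SCORE

print(task_score(120, 1), meets_proficiency(120, 1))   # roughly 56.7, False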

Computer based systems

These systems rely on computer programs to track aspects of movement felt to be important. The ICSAD (Imperial College Surgical Assessment Device) and ADEPT (Advanced Dundee Endoscopic Psychomotor Trainer) are dexterity analysis systems. Virtual reality systems, such as the LapSim system (Surgical Science, Göteborg, Sweden) or the MIST-VR system (Mentice, San Diego, CA), incorporate performance metrics for which expert-derived proficiency levels can be developed. ICSAD comprises an electronic tracking system (Isotrak II, Polhemus, United States), which has an electromagnetic field generator and tracking devices, and software developed at Imperial College that converts the data generated by hand movements into dexterity measures. These include the speed of hand movements, the distance travelled, and the time taken to complete a task. Studies have demonstrated the construct validity of the ICSAD, with differences shown between expert and less expert surgeons. The advantage of the ICSAD system is that it is widely applicable, and thus can be used for most tasks in the skills lab. The major disadvantage is that the data can be difficult to interpret; thus, it is not as useful as some other systems for real time feedback and information on skill based errors. As well, the ICSAD is difficult to use in the operating room or around devices that may cause interference with the electromagnetic field.
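As a sketch of what converting raw tracking data into dexterity measures involves, the fragment below derives the three measures mentioned above (time taken, distance travelled and hand speed) from a list of timestamped hand positions; the sampling format is an assumption for illustration and is not the Imperial College software.

# Hypothetical derivation of ICSAD-style dexterity measures from hand tracking data.
# Input: (time_s, x_cm, y_cm, z_cm) samples for one hand; the format is assumed.
import math

def dexterity_measures(samples):
    """Return time taken, distance travelled and mean hand speed."""
    time_taken = samples[-1][0] - samples[0][0]
    distance = sum(math.dist(a[1:], b[1:]) for a, b in zip(samples, samples[1:]))
    mean_speed = distance / time_taken if time_taken > 0 else 0.0
    return time_taken, distance, mean_speed

# Example: three position samples over two seconds
print(dexterity_measures([(0.0, 0, 0, 0), (1.0, 10, 0, 0), (2.0, 10, 5, 0)]))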

Virtual reality

In general, V-R systems generate output data, which may include path length, time to completion, number of movements and instrument errors. Criterion levels for performance assessment are usually set much like those for the bench model proficiency based tools, using the means of expert performance to develop a benchmark standard. The main advantage of V-R systems is that the metrics are available in real time, allowing immediate feedback on skill based errors. They do not require the presence of expert examiners, although experts are generally required to develop the performance metrics. The disadvantage of these systems is that the metrics, such as path length, are not always easy for learners to interpret and remediate. Also, some of the metrics may be chosen based on what is easy to record, as opposed to what is clinically relevant. Virtual reality systems are currently developed and validated for only a limited number of laparoscopic and endoscopic tasks, and thus are not widely applicable for evaluating a broad range of procedures.
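As a purely illustrative example of deriving such a benchmark from expert performance, the sketch below sets a criterion for a lower-is-better metric (such as path length) from the expert mean plus a tolerance; the tolerance of two standard deviations and the data are assumptions, not a published standard-setting method.

# Illustrative criterion setting from expert performance for a lower-is-better metric.
# The tolerance k and the data are assumed values, not a published method.
from statistics import mean, stdev

def criterion_level(expert_values, k=2.0):
    """Pass level: expert mean plus k standard deviations."""
    return mean(expert_values) + k * stdev(expert_values)

expert_path_lengths_cm = [410, 455, 380, 430, 402]    # invented example data
print(f"pass if path length <= {criterion_level(expert_path_lengths_cm):.0f} cm")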

Summary

As each of the above types of tools has advantages and disadvantages, training programs should likely use a combination of tools, tailoring their use to the educational need. With restrictions on work hours, the increasing focus on error prevention, and research demonstrating the efficacy of ex vivo skills training, much of the early learning curve for technical skills training is moving outside of the operating room and into the skills centre. Objective, reliable and valid evaluations are a vital component of a skills curriculum that will transfer to the clinical setting.

Conflict of interest

None declared.
