ORIGINAL REPORTS
Validation of a Novel Venous Duplex Ultrasound Objective Structured Assessment of Technical Skills for the Assessment of Venous Reflux

Usman Jaffer, PhD, Pasha Normahani, MBBS, Kimberly Lackenby, Mohammed Aslam, PhD, and Nigel J. Standfield, MD

Department of Vascular Surgery, Imperial College School of Medicine, Hammersmith Hospital, Du Cane Road, London, United Kingdom

OBJECTIVES: Duplex ultrasound measurement of reflux time is central to the diagnosis of venous incompetence. We have developed an assessment tool for Duplex measurement of venous reflux for both simulator and patient-based training.

METHODS: A novel assessment tool, Venous Duplex Ultrasound Assessment of Technical Skills (V-DUOSATS), was developed. A modified V-DUOSATS was used for simulator training. Participants of varying skill level were invited to view an instructional video and were allowed ample time to familiarize themselves with the Duplex equipment. Attempts made by the participants were recorded and independently assessed by 3 expert assessors and 5 novice assessors using the modified V-DUOSATS. “Global” assessment was also done by expert assessors on a 4-point Likert scale. Content, construct, and concurrent validities as well as reliability were evaluated.

RESULTS: Content and construct validity as well as reliability were demonstrated. Receiver operating characteristic analysis established that cut points of 19/22 and 21/30 were most appropriate for simulator and patient-based assessment, respectively.

DISCUSSION: We have validated a novel assessment tool for Duplex venous reflux measurement. Further work is required to establish transference validity of simulator training to improve skill in scanning patients.

CONCLUSIONS: We have developed and validated V-DUOSATS for simulator training. (J Surg Educ 72:754-760. © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved.)

KEY WORDS: ultrasound, duplex, reflux, training, vascular, venous

COMPETENCIES: Practice-Based Learning and Improvement, Medical Knowledge, Systems-Based Practice

U.J. is a director for Axiom Medical Ltd.
Correspondence: Inquiries to Usman Jaffer, BSc (Hons), MSc (Surgery), MSc (Ultrasound), PhD, PGCE, FRCS, Department of Vascular Surgery, Imperial College School of Medicine, Hammersmith Hospital, Du Cane Road, London W12 0HS, UK; fax: (208) 383-2083; e-mail: [email protected]
Journal of Surgical Education © 2015 Association of Program Directors in Surgery. Published by Elsevier Inc. All rights reserved. 1931-7204/$30.00 http://dx.doi.org/10.1016/j.jsurg.2015.02.004

INTRODUCTION
Measurement of venous reflux time is central to the diagnosis of venous insufficiency. Venous flow toward the feet is suggestive of reflux, and the duration of this flow is known as the reflux time. A reflux time of >0.5 (or 1.0) second has been used to diagnose the presence of reflux, although a variable “cutoff” based on location has been suggested.1

In vascular surgery, Duplex ultrasound (DUS) imaging has been primarily performed by vascular scientists who have had lengthy apprenticeship-style training in vascular ultrasonography. Surgeons are becoming increasingly involved, as reflected in the new UK vascular surgery training curriculum2 as well as in training content from the USA.3 However, changes in clinical practice should be made with caution, and assessment and accreditation of skills should be undertaken to ensure quality.

Objective structured assessment of technical skills (OSATS) for surgical tasks is a generic tool that combines checklist and global rating scales to assess trainees performing simulated technical tasks. We have previously validated a procedure-specific OSATS for Duplex detection of arterial stenosis4 and demonstrated skill enhancement with a simulator training package.5 The Venous Duplex Ultrasound Objective Structured Assessment of Technical Skills (V-DUOSATS) tool was developed for this purpose by 2 practitioners who routinely practice and train in these skills (M.A. and U.J.). This study aimed to establish construct validity and concurrent validity as well as reliability of V-DUOSATS.
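As a small illustration of the reflux-time criterion described in the introduction, the following Python sketch flags reflux when a measured reflux time exceeds a chosen cutoff. The function name `has_reflux` and the default cutoff of 0.5 second are illustrative assumptions; the text cites both 0.5 and 1.0 second, and a location-dependent cutoff has also been suggested.

```python
def has_reflux(reflux_time_s: float, cutoff_s: float = 0.5) -> bool:
    """Return True when the measured reflux time exceeds the cutoff.

    Both arguments are in seconds. The default cutoff is an assumption
    chosen for illustration; pass cutoff_s=1.0 for the alternative value.
    """
    if reflux_time_s < 0:
        raise ValueError("reflux time cannot be negative")
    if cutoff_s <= 0:
        raise ValueError("cutoff must be positive")
    # Strictly greater than the cutoff, matching a ">0.5 second" criterion.
    return reflux_time_s > cutoff_s
```

A site-specific cutoff could be supplied by the caller through `cutoff_s` rather than being fixed in the function.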
METHODS
In total, 24 participants were recruited, and all consented to participate in the study. Medical students, vascular scientists, and vascular surgical trainees from Imperial College London NHS Healthcare Trust were included; ethical approval was not required. Demographic data including previous ultrasound (US), DUS, and reflux time assessment experience were recorded.

All participants were shown a standard instructional video prepared at our institution that incorporated training in equipment settings and the technique of venous reflux assessment.6 Sections of the video corresponded to the domains of V-DUOSATS. After viewing the video, each participant was given 10 minutes to familiarize himself or herself with the equipment provided. Questions regarding the functioning of the Duplex machine (Mindray M7, Shenzhen, China) and simulator were answered.

A high-fidelity pulsatile-flow simulator (Axiom Medical Ltd, London) was used to create pathological venous reflux waveforms beginning from a simulated calf compression (Fig. 1). The venous reflux time was varied by adjusting graphical control points on a linked tablet computer. Participants were then asked, in the presence of an assessor, to measure the reflux time. The reflux time was randomly selected and changed between subjects; participants were blinded. No restriction was placed on the number of measurements; however, a single value was taken from each participant for the purpose of analysis. Video recordings of each attempt were made for later analysis.
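Because each participant contributes a single measured value against a known simulator setting, the error of a measurement can be expressed as a percentage of the set reflux time. The sketch below is one plausible way to compute it; the article does not state the exact formula, so the use of the absolute (unsigned) error and the function name `percentage_error` are assumptions.

```python
def percentage_error(measured_s: float, set_s: float) -> float:
    """Absolute error of a measured reflux time as a percentage of the set value.

    measured_s: reflux time reported by the participant, in seconds.
    set_s: reflux time configured on the simulator, in seconds.
    Assumption: the unsigned error is used, so over- and underestimates
    of the same size give the same result.
    """
    if set_s <= 0:
        raise ValueError("the set reflux time must be positive")
    return abs(measured_s - set_s) / set_s * 100.0
```

For example, a measurement of 1.8 seconds against a set reflux time of 2.0 seconds corresponds to an error of about 10%.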
Following the completion of the study, video recordings were evaluated by 3 blinded specialists in vascular US (>4 years’ practical experience) who were not involved in data collection. For these simulator-based assessments, a modified V-DUOSATS was used; this did not include the patient positioning and reporting fields (scoring guidance detailed in Fig. 2). A “global assessment” rating was also made on a 4-point Likert scale (level 1 representing “unable to perform the procedure,” level 2 “able to perform the procedure with prompting,” level 3 “able to perform the procedure with minimum prompting,” and level 4 “competent to perform the procedure unsupervised”). In addition, 5 novice assessors independently assessed the video recordings of participant attempts. None of the 5 novice assessors had previous US experience or had been involved in the earlier phase of the study. They were shown the instructional video and given a brief didactic tutorial on how to use V-DUOSATS.

FIGURE 1. Photograph of the Duplex display of a refluxing venous waveform, set using the tablet and generated by the simulator.

FIGURE 2. V-DUOSATS marking scheme indicating domains for the assessment on patients as well as simulation. CFV, common femoral vein; LSV, long saphenous vein; TGC, time gain compensation.

Journal of Surgical Education Volume 72/Number 4 July/August 2015

Analysis
Participants were classified into 4 groups on the basis of experience with diagnostic US (1, none or only theoretical knowledge of diagnostic US; 2, performed 1 to 20 cases; 3, performed 21 to 50 cases; and 4, had performed more than 50 cases), DUS (1, none or up to 10 cases performed; 2, performed 11 to 20 cases; 3, performed 21 to 50 cases; and 4, had performed more than 50 cases), and Duplex assessment of venous reflux (1, none or up to 10 cases performed; 2, performed 11 to 20 cases; 3, performed 21 to 50 cases; and 4, had performed more than 50 cases).

Interobserver reliability between the expert assessors as well as between expert and novice assessors was determined using Cronbach’s alpha (α). Nonparametric tests were used in all the analyses. Spearman’s rank (R) was used to correlate continuous variables. The Kruskal-Wallis test was used to identify differences between the subgroups tested. Construct validity was assessed by comparing V-DUOSATS scores of expert, intermediate, and novice participants. Concurrent validity was assessed by comparing percentage error in reflux time assessment and global assessment to V-DUOSATS. SPSS 20 (IBM Corporation) was used in the statistical analysis. A p value < 0.05 was considered statistically significant.
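To make the reliability statistic concrete, the following is a minimal Python sketch of Cronbach’s alpha with assessors treated as the “items” and participants as the observations. It uses the standard formula α = k/(k − 1) × (1 − Σ var(rater) / var(total)); the function name and the example ratings are hypothetical and are not the study data, which were analyzed in SPSS.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a list of per-assessor score lists.

    ratings[r][p] is the score given by assessor r to participant p.
    Sample variances (n - 1 denominator) are used throughout.
    """
    k = len(ratings)
    if k < 2:
        raise ValueError("at least two assessors are required")
    n = len(ratings[0])
    if n < 2 or any(len(r) != n for r in ratings):
        raise ValueError("each assessor must score the same 2+ participants")

    # Sum of the variances of each assessor's scores.
    sum_rater_variances = sum(variance(r) for r in ratings)
    # Variance of the per-participant totals across assessors.
    totals = [sum(scores) for scores in zip(*ratings)]
    total_variance = variance(totals)
    if total_variance == 0:
        raise ValueError("total scores have zero variance")

    return k / (k - 1) * (1 - sum_rater_variances / total_variance)


# Hypothetical scores from two assessors for four participants.
example = [[1, 2, 3, 4], [2, 1, 4, 3]]
alpha = cronbach_alpha(example)  # 0.75 for this toy data
```

Values closer to 1 indicate that assessors rank participants consistently; identical score lists from every assessor give exactly 1.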
RESULTS
Demographic Data
There were 24 participants in total, of whom 15 were men and 9 were women. Demographic data are given in Table 1.

TABLE 1. Demographics and Previous Experience Level of Participants

Total number: 24
Age: 28 (23-32)
Sex: 9 female and 15 male

Number of participants by experience level
Experience level           0    1    2    3    4
US experience              8    2    2    2   10
DUS experience            10    3    2    3    6
Reflux time experience    12    1    2    3    6

US experience: (0: none, 1: theoretical, 2: 1-20 cases, 3: >20 cases, and 4: >50 cases). DUS experience: (0: none, 1: 1-9 cases, 2: 1-20 cases, 3: >20 cases, and 4: >50 cases). Reflux time experience: (0: none, 1: 1-9 cases, 2: 10-20 cases, 3: >20 cases, and 4: >50 cases).

Construct Validity
Participants were categorized according to previous US experience. The average scores for both V-DUOSATS and “global” assessment were used for all analyses. There were statistically significant differences in both V-DUOSATS scores (p ≤ 0.001) and “global” assessment (p ≤ 0.001) across the 4 experience groups. Further analysis according to previous DUS and reflux time experience also identified significant differences in both V-DUOSATS score (p ≤ 0.001 and <0.001, respectively; Fig. 3A) and “global” assessment (p ≤ 0.001 and <0.001, respectively; Fig. 3B).

Individual V-DUOSATS domains were also analyzed using US, DUS, and reflux time experience to establish the importance of individual domains (Table 2). Statistically significant fields were image optimization in B mode, evaluation of reflux in color, and reflux time assessment using spectral Doppler.

Concurrent Validity
Statistically significant correlation was found between V-DUOSATS scores and “global” assessment scores (Spearman’s rank correlation coefficient: R = 0.737, p ≤ 0.001; Fig. 4A). Also, the V-DUOSATS score correlated with percentage error in reflux time estimation (R = 0.467, p = 0.019; Fig. 4B). The “global” assessment score did not correlate with percentage error in reflux time estimation (R = 0.367, p = 0.08).

On multivariate analysis, previous Duplex experience (adjusted R² = 0.595, p = 0.049) and reflux time experience (adjusted R² = 0.62, p = 0.04) were found to have a statistically significant effect on V-DUOSATS scores. US experience, age, and sex had no significant effect.
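The rank correlations reported in this section can be illustrated with a short Python sketch of Spearman’s coefficient, computed as the Pearson correlation of average ranks so that tied scores are handled. The helper names and the toy data are hypothetical; this is not the SPSS procedure used in the study and it does not compute p values.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman_r(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length of at least 2")
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical scores and global ratings for five participants.
r = spearman_r([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])  # 0.8 for this toy data
```

A coefficient near +1 or −1 indicates a strong monotonic relationship, and the sign shows its direction.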
Reliability
Interrater reliability for V-DUOSATS and “global” assessment scores was high for the 3 expert assessors (Cronbach’s α = 0.805 and 0.816, respectively). There was also good interrater reliability between the novice assessors (Cronbach’s α = 0.853). There was good correlation between the V-DUOSATS scores given by expert and novice assessors (R = 0.509, p = 0.011; Fig. 5A). Bland-Altman analysis did not demonstrate any systematic bias in assessments undertaken by novice and expert assessors (Fig. 5B).

Sensitivity and Specificity
A receiver operating characteristic curve (Fig. 6) was plotted for the sensitivity and specificity of the test using the scores of those experienced in Duplex assessment of venous reflux (participants who had performed >20 cases) to determine the cut point in defining competence for this procedure. The area under the curve was 0.88, p = 0.071. The specificity and sensitivity for various cut points are shown in Table 3. It was decided that the cut point should have a high specificity (i.e., less experienced operators were unlikely to be considered competent) while maintaining sensitivity (more experienced operators are likely to achieve competence in this test). The cut point for competence was set at 19 (of a possible 22).
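The trade-off described above can be sketched in Python by computing sensitivity and specificity at a candidate cut point. Here a score at or above the cut point counts as “competent,” and experienced operators are the positive class; the function name, the pass rule, and the example scores are assumptions for illustration and are not the study data.

```python
def sensitivity_specificity(scores, experienced, cut_point):
    """Return (sensitivity, specificity) for a pass mark of cut_point.

    scores: V-DUOSATS-style totals, one per participant.
    experienced: booleans marking the experienced (positive) group.
    Assumption: a score >= cut_point is classed as competent.
    """
    tp = fn = tn = fp = 0
    for score, is_experienced in zip(scores, experienced):
        passed = score >= cut_point
        if is_experienced and passed:
            tp += 1          # experienced and classed competent
        elif is_experienced:
            fn += 1          # experienced but classed not competent
        elif passed:
            fp += 1          # less experienced but classed competent
        else:
            tn += 1          # less experienced and classed not competent
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both groups must contain at least one participant")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical scores (out of 22) for five less experienced and three
# experienced participants.
scores = [12, 15, 17, 18, 20, 19, 21, 22]
experienced = [False, False, False, False, False, True, True, True]

for cut in (17, 19, 21):
    sens, spec = sensitivity_specificity(scores, experienced, cut)
    print(f"cut point {cut}: sensitivity {sens:.3f}, specificity {spec:.3f}")
```

Raising the cut point in this toy data increases specificity at the cost of sensitivity, which is the balance the chosen pass mark is meant to strike.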
DISCUSSION
Current methods of assessing venous DUS skills include examination of logbooks and non–criteria-based observations that lack content validity. These methods also do not readily lend themselves to highlighting areas for improvement in formative assessment.
FIGURE 3. Box and whisker plot of experience (Reflux: reflux assessment experience) vs (A) V-DUOSATS score and (B) “global” rating score.

TABLE 2. Construct Validity for Individual Domains of V-DUOSATS and Also for “Global” Rating and Percentage Error in Reflux Estimation

Domain                           US Experience   DUS Experience   Reflux Time Experience
                                 (p Value)       (p Value)        (p Value)
1                                0.49            0.68             0.98
2                                0.08            0.21             0.38
3                                <0.001*         <0.001*          <0.001*
4                                0.008*          0.008*           0.022*
5                                <0.001*         <0.001*          <0.001*
6                                <0.001*         <0.001*          <0.001*
Total                            <0.001*         <0.001*          <0.001*
“Global” rating                  <0.001*         <0.001*          <0.001*
% Error in reflux estimation     0.009*          0.006*           0.084

Kruskal-Wallis test was used to identify differences between previous US, DUS, and reflux time experience subgroups. *p < 0.05.
The design of V-DUOSATS was informed by content analysis following the principles formulated by Gagne.7 Domains are patient positioning, transducer selection, US coupling gel usage, image optimization in B mode, acquisition of saphenofemoral junction (SFJ) image, evaluation of reflux in color, reflux time assessment using spectral Doppler, and reporting. The scoring for tasks was set to reflect progression from lower order concepts, through to rule learning, and finally synthesis of rule learning principles into higher order problem solving (Fig. 2). More complex educational objectives were given proportionally greater weight in the overall score to reflect this.

With the introduction of Duplex ultrasonography into the vascular surgery curriculum, it is imperative that formative assessment is performed and is “high yield,” thus ensuring learning is rapid. Concomitantly, it is important that a formal credentialing process is in place using a validated assessment tool. The V-DUOSATS has been developed with this in mind to provide objective formative feedback of technical skills in training.

Both V-DUOSATS and the “global” assessment were able to differentiate between participants according to their previous experience of US, DUS, and reflux assessment. This suggests good construct validity. Importantly, multivariate analysis demonstrated that DUS and reflux assessment experience were the only significant variables contributing to the V-DUOSATS score, suggesting that V-DUOSATS is a sensitive procedure-specific assessment tool, thus satisfying content validity. The V-DUOSATS score correlated with percentage error in reflux time estimation (R = 0.467, p = 0.019), indicating concurrent validity with an objective “end product” style assessment. “Global” assessment did not correlate with percentage error in reflux time estimation and is also limited by its unstructured and unsystematic nature.

FIGURE 4. Scatter graphs demonstrating (A) correlation of “global” score with V-DUOSATS score and (B) negative correlation of V-DUOSATS score with percentage error in reflux time assessment.

The V-DUOSATS domains cover a wide range of skills related to the task. The domains that seemed to differentiate best between experience groups were image optimization in B mode, evaluation of reflux in color, and reflux time assessment using spectral Doppler. Close attention should be paid to these domains when using the V-DUOSATS in formative assessment.

Using V-DUOSATS, interrater reliability among the experienced assessors was high, comparing favorably with other objective structured technical skills assessments.8-10 Interestingly, interrater reliability among novice assessors, with minimal training and no previous US experience, was also high and comparable to that of the expert group of assessors. This indicates that V-DUOSATS assessments need not necessarily be performed by highly trained assessors, whose availability is relatively limited.

FIGURE 5. (A) Scatter graph showing correlation and (B) Bland-Altman graph showing agreement between novice and expert assessor scores for V-DUOSATS.

ROC data suggested that a score of 19 (of a possible 22) has high sensitivity and specificity for consideration of competence for simulated assessment. We believe that it is reasonable to increase the pass mark to 24 (out of a possible 30) for the “full” V-DUOSATS scoring to be used in patient assessment. Further work is required for validation of V-DUOSATS on real patients. Investigators in a number of technical skills simulation fields have reported skills transference from bench top simulators to a human cadaver model11 as well as operating theater environments.12,13 Further work is required to assess skills transference to scanning on patients.
TABLE 3. Specificity and Sensitivity for Various Cut Points in V-DUOSATS Score

Cut Point    True Positive Rate    False Positive Rate
             (Sensitivity)         (1 - Specificity)
12.0000      1.000                 1.000
14.5000      1.000                 0.833
16.3333      1.000                 0.667
17.8333      1.000                 0.500
19.1667      1.000                 0.167
20.3333      0.333                 0.167
21.5000      0.333                 0.000
22.6667      0.000                 0.000

FIGURE 6. Receiver operating characteristic curve for the sensitivity and specificity of the test using the scores of those who had performed more than 20 cases of Duplex assessment of venous reflux.
CONCLUSION
We have developed a novel OSATS for duplex assessment of venous reflux. The modified V-DUOSATS assessment tool for simulation was validated.

REFERENCES
1. Labropoulos N, Tiongson J, Pryor L, et al. Definition of venous reflux in lower-extremity veins. J Vasc Surg. 2003;38:793-798.
2. General Medical Council. Vascular surgery curriculum. London: General Medical Council; 2013.
3. Zimmet SE, Min RJ, Comerota AJ, et al. Core content for training in venous and lymphatic medicine. Phlebology. 2014;29:587-593.
4. Jaffer U, Singh P, Pandey VA, Aslam M, Standfield NJ. Validation of a novel duplex ultrasound objective structured assessment of technical skills (DUOSATS) for arterial stenosis detection. Heart Lung Vessels. 2014;6:92-104.
5. Jaffer U, Normahani P, Singh P, Aslam M, Standfield NJ. The effect of a simulation training package on skill acquisition for duplex arterial stenosis detection. J Surg Educ. 2015;72(2):310-315.
6. Jaffer U, Aslam M. Teaching venous duplex. Axiom Medical Ltd; 2014.
7. Gagne R. The Conditions of Learning. New York: Holt Rinehart and Winston; 1965.
8. Martin JA, Regehr G, Reznick R, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg. 1997;84:273-278.
9. Moorthy K, Munz Y, Adams S, Pandey V, Darzi A. A human factors analysis of technical and team skills among surgical trainees during procedural simulations in a simulated operating theatre. Ann Surg. 2005;242:631-639.
10. Pandey VA, Wolfe JH, Black SA, Cairols M, Liapis CD, Bergqvist D. Self-assessment of technical skill in surgery: the need for expert feedback. Ann R Coll Surg Engl. 2008;90:286-290.
11. Anastakis DJ, Regehr G, Reznick RK, et al. Assessment of technical skills transfer from the bench training model to the human model. Am J Surg. 1999;177:167-170.
12. Beard JD, Jolly BC, Newble DI, Thomas WE, Donnelly J, Southgate LJ. Assessing the technical skills of surgical trainees. Br J Surg. 2005;92:778-782.
13. Datta V, Bann S, Beard J, Mandalia M, Darzi A. Comparison of bench test evaluations of surgical skill with live operating performance assessments. J Am Coll Surg. 2004;199:603-606.