ORIGINAL ARTICLE: EDUCATION

Development and Pilot Testing of an Assessment Tool for Performance of Invasive Mediastinal Staging

Simon R. Turner, MD, MEd, Basil S. Nasir, MD, Hollis Lai, PhD, Kazuhiro Yasufuku, MD, Colin Schieman, MD, Brian E. Louie, MD, and Eric L. R. Bédard, MD, MSc


Department of Thoracic Surgery, University of Alberta, Edmonton, Alberta, Canada; Department of Thoracic Surgery, Université de Montréal, Montréal, Québec, Canada; Faculty of Medicine and Dentistry, University of Alberta, Edmonton, Alberta, Canada; Department of Thoracic Surgery, University of Toronto, Toronto, Ontario, Canada; Department of Thoracic Surgery, University of Calgary, Calgary, Alberta, Canada; and Department of Thoracic Surgery, Swedish Cancer Institute and Medical Center, Seattle, Washington

Background. To develop and evaluate a surgical trainee competency assessment instrument for invasive mediastinal staging, including cervical mediastinoscopy and endobronchial ultrasound (EBUS), a comprehensive instrument, the Thoracic Competency Assessment Tool-Invasive Staging (TCAT-IS), was developed using expert review and simulated and clinical pilot testing.

Methods. Validity and reliability evidence were collected, and item analysis was performed. Initially, a 27-item instrument was developed, which underwent expert review with members of the Canadian Association of Thoracic Surgeons (n = 86) in 2014 to 2015 (response rate, 57%). TCAT-IS was refined to 29 items in 4 competency areas: preoperative, general operative, mediastinoscopy, and EBUS. Further refinements were made based on simulated use. The final version was then used to assess competency of 5 thoracic trainees performing invasive mediastinal staging in live patients.

Results. Participants were assessed during 20 mediastinoscopy and 8 EBUS procedures, with 47 total assessments completed. Reliability (Cronbach's alpha = 0.94), interrater reliability (k = 0.80), and correlation with an established global competency scale (k = 0.75) were high. The most difficult items were "set up and adjust EBUS equipment" and "identify vascular anatomy (EBUS)." Feedback questionnaires from trainees (response rate, 80%) and surgeons (response rate, 100%) were consistently positive regarding user friendliness, utility as an assessment tool, and educational benefit. Participants believed the tool "facilitated communicating feedback to the trainee with specific areas to work on."

Conclusions. TCAT-IS is an effective tool for assessing competence in invasive staging and may enhance instruction. This initial test establishes early validity and reliability evidence, supporting the use of TCAT-IS in providing structured, specific, formative assessments of competency.

(Ann Thorac Surg 2019;108:590-6)
© 2019 by The Society of Thoracic Surgeons. Published by Elsevier Inc.

The Supplemental Material can be viewed in the online version of this article [https://doi.org/10.1016/j.athoracsur.2019.03.050] on http://www.annalsthoracicsurgery.org.

Accepted for publication Mar 18, 2019. Presented at the American Association for Thoracic Surgery International Thoracic Surgical Oncology Summit, New York, NY, Oct 2018.

Address correspondence to Dr Turner, University of Alberta, 416 Community Services Centre, 10240 Kingsway Ave, Edmonton, AB, Canada T5H 3V9; email: [email protected].

Competency-based medical education has emerged as a major paradigm in the assessment of medical trainees.1,2 In surgery particularly, the need to determine and document the ability of trainees to independently perform both common and complex operations is paramount. In Canada, the Competence by Design initiative has mandated the creation of specialty-specific milestones to guide assessment of key competencies, aligning training with practice needs.3 Similar initiatives exist in the United States, via the Accreditation Council for Graduate Medical Education, and in Europe, via the Bologna Declaration.4,5

Invasive mediastinal staging for lung cancer is a fundamental ability of thoracic surgeons. Ensuring competence to perform a safe, effective, and efficient staging procedure is of utmost importance. Our group is developing a suite of Thoracic Competency Assessment Tools (TCATs) for surgical trainees performing the core operations in thoracic surgery. The development and validation of a TCAT for anatomic lung resection for cancer (TCAT-ARC) has previously been presented.6 Invasive staging (IS) was selected as the second procedure for TCAT development. Although endobronchial ultrasound (EBUS) and cervical mediastinoscopy are technically very different procedures, there are multiple common aspects, such as case selection, understanding of lung cancer staging principles, and knowledge of mediastinal anatomy, that are fundamental to both and allow the potential for 1 instrument to assess competence in either procedure. The objective of this study was to establish preliminary validity and reliability evidence for TCAT-IS and to assess user experiences in a clinical setting.


Patients and Methods

A comprehensive competency assessment instrument was developed in a multistep process involving item development, expert review for item refinement, and pilot testing in simulated and clinical environments. The tool was designed to apply to the 2 most common methods of mediastinal staging: cervical mediastinoscopy and EBUS. The scale is named the Thoracic Competency Assessment Tool-Invasive Staging (TCAT-IS). Ethical approval was granted by the University of British Columbia Behavioural Research Ethics Board.

Item Development

An initial version of TCAT-IS was developed by 2 thoracic surgeons and a thoracic surgery trainee using a process of logical analysis. The psychometric domain of the instrument was defined as the complete set of steps that must be completed in the conduct of any safe, oncologically sound, invasive mediastinal staging procedure. The goal was to create a list fully representative of and relevant to this domain. A set of 27 steps (or items) was generated and grouped into 3 areas of competency: general, mediastinoscopy, and EBUS.

Expert Review

The initial version of TCAT-IS was distributed to all members of the Canadian Association of Thoracic Surgeons (n = 86) as an online questionnaire in 2014 to 2015. Respondents were asked whether each item should be included in a competency assessment tool for invasive staging procedures, indicating agreement on a 5-point Likert scale (1 = strongly disagree, 5 = strongly agree). Respondents were invited to comment on any items and were asked to provide any items they believed were missing. Mean responses for each item were calculated, and each respondent's total deviation from the mean was determined.7-9 A predetermined threshold for consensus was set at a mean of more than 4.5 of 5 and a median of 5 of 5.

Pilot Testing: EBUS Simulation

The refined instrument was then pilot tested in 2 phases. The first pilot test occurred at the July 2015 University of Toronto Interventional Thoracic Surgery training course, an introductory "boot camp" for thoracic surgery trainees from across Canada.10 As a part of this course, trainees are given instruction in EBUS technique and perform the procedure using a mannequin simulator. Each trainee performed several simulated EBUS procedures under the supervision of a thoracic surgeon, who assessed performance using TCAT-IS. The assessment process was observed by 1 study author to note how the instrument functioned in a simulated clinical environment and to identify any unexpected deficiencies so that items could be refined. A postencounter feedback questionnaire regarding their experiences was distributed to the trainees and assessors at the boot camp.

Pilot Testing: Clinical

The second pilot test was conducted at 3 North American institutions (University of British Columbia, University of Western Ontario, and Swedish Cancer Institute and Medical Center). Trainees were voluntarily recruited to participate in a 6-month study, during which time they were asked to have their performance of mediastinal staging procedures assessed and self-assessed using the TCAT-IS instrument and the Objective Structured Assessment of Technical Skills (OSATS) global competency score at least twice a month. Procedures were performed with real patients as a routine part of the trainee's clinical experience and under direct supervision by the attending surgeon. Both video and direct-vision cervical mediastinoscopies were permitted. Participants were encouraged to have the assessments and self-assessments completed on paper as soon as possible after the case. Participating trainees and surgeons were administered a postencounter questionnaire.

Statistical Analysis

Descriptive analysis was performed on the results of the expert review, postencounter questionnaires, and pilot test data. Results of the clinical pilot test were further analyzed using item analysis and interrater reliability. For item analysis, data were pooled and dichotomized (1-4 vs 5/5). Item difficulty was calculated as 1 minus the proportion of scores of 5 of 5, and discrimination was calculated using the point biserial correlation. Pearson's statistic was used for correlations. Internal consistency (also known as reliability), which measures how well a test assesses a single psychometric construct, was calculated using Cronbach's alpha. Because most procedures were mediastinoscopies, Cronbach's alpha was calculated by excluding EBUS procedures, thereby eliminating "missing" data from that section of the instrument. Any remaining missing cells (20 cells over 33 assessments) were replaced with the overall test mean for that assessment. Microsoft Excel (Microsoft, Redmond, WA) and the Statistical Package for the Social Sciences (SPSS version 24; IBM, Armonk, NY) were used.
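As a rough illustration of the item analysis just described (not the authors' actual Excel/SPSS workflow), the sketch below computes item difficulty, a point-biserial discrimination index, and Cronbach's alpha from a matrix of 1-to-5 scores. The function name, the rest-of-test handling in the discrimination calculation, and the demo data are illustrative assumptions; the additional steps reported in the paper (excluding EBUS items and mean-imputing the few missing cells) are omitted for brevity.

```python
import numpy as np

def item_analysis(scores):
    """Rows are assessments, columns are items, each score on the 1-5 TCAT-IS scale.

    Difficulty = 1 - proportion of 5/5 scores for the item.
    Discrimination = point-biserial correlation between the dichotomized item
    (5/5 vs 1-4) and the total score on the remaining items.
    """
    scores = np.asarray(scores, dtype=float)
    mastered = (scores == 5).astype(float)            # dichotomize: 5/5 vs 1-4
    difficulty = 1.0 - mastered.mean(axis=0)          # harder items get fewer 5/5 scores

    discrimination = []
    for j in range(scores.shape[1]):
        rest = np.delete(scores, j, axis=1).sum(axis=1)  # total score excluding item j
        x = mastered[:, j]
        p = x.mean()
        if p in (0.0, 1.0) or rest.std() == 0:           # skip degenerate items
            discrimination.append(np.nan)
            continue
        r_pb = (rest[x == 1].mean() - rest[x == 0].mean()) / rest.std() * np.sqrt(p * (1 - p))
        discrimination.append(r_pb)

    # Cronbach's alpha on the raw 1-5 scores (internal consistency)
    k = scores.shape[1]
    alpha = k / (k - 1) * (1 - scores.var(axis=0, ddof=1).sum()
                           / scores.sum(axis=1).var(ddof=1))
    return difficulty, np.array(discrimination), alpha

# Tiny fabricated example: 6 assessments x 3 items
demo = [[5, 5, 5], [4, 4, 5], [5, 5, 5], [3, 4, 4], [5, 5, 5], [4, 5, 4]]
diff, disc, alpha = item_analysis(demo)
print(diff, disc, round(alpha, 2))
```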

Results

Expert Review

There were 49 respondents (or judges) for the initial expert review questionnaire (response rate, 56.9%). Respondents were from every province in Canada except Newfoundland and Prince Edward Island but most commonly from the 2 largest provinces, Ontario (18, 37.5%) and Quebec (7, 14.6%). Additionally, 2 were from the United States and 1 from Australia. Most had been in practice more than 10 years (<5 years, n = 8 [16.3%]; 5-10 years, n = 11 [22.5%]; 11-20 years, n = 14 [28.6%]; 21-30 years, n = 11 [22.5%]; >30 years, n = 5 [10.2%]). Respondents performed an average of 82 invasive staging procedures per year. There was a variety of staging procedure preferences (only mediastinoscopy, n = 19 [38.8%]; mostly mediastinoscopy, n = 8 [16.3%]; equal mediastinoscopy/EBUS, n = 10 [20.4%]; mostly EBUS, n = 12 [24.5%]). Most reported having trainees with them for at least half of their invasive staging procedures (all/almost all, n = 17 [35.4%]; most, n = 6 [12.5%]; about half, n = 10 [20.8%]; less than half, n = 7 [14.6%]; none/almost none, n = 8 [16.7%]).

The mean Likert ratings of the initial 27 items are shown in Table 1. Only 2 items had overall mean ratings less than 4 of 5. Each judge's deviation from the mean (JDM; equal to the sum of the differences between a judge's rating and the mean rating from all judges for each of the 27 items) was calculated. The judges with the highest JDM were removed 1 by 1, and JDM scores were recalculated for the remaining judges in an iterative fashion until all judges had a JDM < 2. This left 22 judges with the fewest outlier answers, who therefore were the most representative of the average consensus of the group.7-9 The mean Likert ratings for each item, using only the retained judges, are also shown in Table 1. All items had final mean ratings greater than 4 of 5, and all but 7 items had final mean ratings of more than 4.9 of 5. Only 2 items did not meet the predetermined threshold for consensus of mean greater than 4.5 of 5 and median of 5 of 5 ("Demonstrates efficiency of motion" and "Makes appropriate use of assistants"). Both were drawn from previous work on TCAT-ARC. In addition, 1 other item was drawn from the expert review of TCAT-ARC, "Provides assistance to another operator." It was believed that it was important to include these 3 items to maintain consistency between all TCATs, regardless of procedure. One item was suggested by several experts and was judged to be important for inclusion: "Identifies and samples all appropriate nodal stations." Other items suggested by judges for inclusion were judged to be either redundant (eg, suggested item: "Assesses whether or not an adequate sample has been collected"; existing item: "Adequately assesses gross tissue sample quality") or not relevant to the objectives of the instrument (eg, "Describes the processing of specimens"). The final TCAT-IS instrument therefore has 29 items. Additional rounds of review, as would be performed in a modified Delphi procedure, were not conducted because the initial round resulted in agreement at the established threshold.
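The iterative judge-deviation (JDM) trimming described above lends itself to a short routine. The sketch below is illustrative only: the paper states the stopping rule (all remaining judges have JDM < 2) but not an explicit formula, so the use of absolute differences, the function name, and the toy ratings matrix are assumptions.

```python
import numpy as np

def retain_consensus_judges(ratings, jdm_threshold=2.0):
    """Iteratively drop the judge most deviant from the group mean.

    ratings: 2-D array, rows = judges, columns = items (Likert 1-5).
    JDM for a judge = sum over items of |judge's rating - mean rating of the
    currently retained judges| (absolute differences assumed).  Judges are
    removed one at a time, and means and JDMs are recalculated, until every
    remaining judge has JDM below the threshold.  Returns retained indices.
    """
    ratings = np.asarray(ratings, dtype=float)
    retained = list(range(ratings.shape[0]))
    while len(retained) > 1:
        sub = ratings[retained]
        item_means = sub.mean(axis=0)
        jdm = np.abs(sub - item_means).sum(axis=1)
        worst = int(np.argmax(jdm))
        if jdm[worst] < jdm_threshold:
            break                       # consensus reached: all JDM < threshold
        retained.pop(worst)             # drop the most deviant judge and recalculate
    return retained

# Toy example: 5 judges x 4 items; the last judge is a clear outlier
ratings = [[5, 5, 4, 5], [5, 4, 5, 5], [4, 5, 5, 5], [5, 5, 5, 4], [1, 2, 1, 2]]
print(retain_consensus_judges(ratings))   # the outlier judge (index 4) is removed
```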

Pilot Testing: EBUS Simulation

Ten boot camp trainees completed 2 simulated EBUS procedures each and had their performance assessed using TCAT-IS by 1 of 3 surgeons. The feedback questionnaire was completed by 9 trainees (response rate, 90%) and 3 surgeons (response rate, 100%). The median score for each questionnaire item was 5 of 5 from trainees and 4 of 5 from surgeons (Figure 1). Minor adjustments to item wording and organization were made based on observations of the instrument in the simulated environment. The final TCAT-IS items as used in the clinical pilot test are provided in the Supplemental Material.


Table 1. Expert Review Individual Item Ratings

Section | Item | Brief Descriptor | Initial Mean Item Rating (Median) | Mean Item Rating After Removal of Judges With High Deviation (Median)
Preoperative | 1 | Patient selection | 4.85 (5) | 4.91 (5)
 | 2 | Risks and benefits | 4.87 (5) | 4.95 (5)
 | 3 | Interpret imaging | 4.83 (5) | 5 (5)
General | 4 | Positioning | 4.70 (5) | 5 (5)
 | 5 | Time out | 4.15 (5) | 4.68 (5)
 | 6 | Node sampling | 4.96 (5) | 5 (5)
 | 7 | Efficiency of motion | 3.79 (4) | 4.18 (4)
 | 8 | Use of assistants | 3.98 (4) | 4.41 (4.5)
 | 9 | Communication | 4.21 (4) | 4.73 (5)
 | 10 | Sample handling | 4.53 (5) | 4.86 (5)
 | 11 | Sample assessment | 4.09 (4) | 4.57 (5)
 | 12 | Ensure hemostasis | 4.79 (5) | 5 (5)
Mediastinoscopy | 13 | Prep and drape | 4.36 (5) | 4.68 (5)
 | 14 | Incision | 4.66 (5) | 4.95 (5)
 | 15 | Tissue handling | 4.34 (5) | 4.90 (5)
 | 16 | Pretracheal plane | 4.79 (5) | 5 (5)
 | 17 | Handle mediastinoscope | 4.52 (5) | 4.95 (5)
 | 18 | Vascular anatomy | 4.96 (5) | 5 (5)
 | 19 | Nodal anatomy | 4.96 (5) | 5 (5)
 | 20 | Control major bleeding | 4.83 (5) | 5 (5)
EBUS | 21 | Set up equipment | 4.70 (5) | 4.89 (5)
 | 22 | Bronchoscopy | 4.62 (5) | 4.95 (5)
 | 23 | Insert EBUS scope | 4.68 (5) | 5 (5)
 | 24 | Vascular anatomy | 4.87 (5) | 5 (5)
 | 25 | Nodal anatomy | 4.96 (5) | 5 (5)
 | 26 | Needle insertion | 4.87 (5) | 5 (5)
 | 27 | Control major bleeding | 4.74 (5) | 5 (5)

EBUS, endobronchial ultrasound.

Pilot Testing: Clinical

Five trainees participated in the clinical pilot test, 2 in their first year of thoracic surgery training (after completing general surgery), 2 in their second year, and 1 in his or her third year. Forty-seven assessments were performed (27 by surgeons and 20 self-assessments by trainees) during 28 invasive staging procedures (20 mediastinoscopies and 8 EBUS) (Figure 2). Cronbach's alpha was 0.94, indicating high internal consistency of the instrument. Interrater reliability between surgeon assessments and trainee self-assessments was 0.80, indicating validity and reliability. Correlation of TCAT-IS with OSATS was 0.75, indicating external validity.


Figure 1. Simulated pilot test feedback questionnaire results by trainees and surgeons.

Figure 2. Relationship of trainee performance with months in training as a measure of external validity.

Results of the item analysis are shown in Table 2. The most difficult items were "Able to properly set up and adjust EBUS equipment," "Correctly identifies vascular anatomy and other structures at risk for injury using ultrasound," and "Makes appropriate use of assistants." The most discriminatory items (items that were most useful at differentiating between a trainee with high vs low overall performance) were "Provides appropriate assistance to another operator," "Properly develops the pre-tracheal plane," and "Makes appropriate use of assistants."

The feedback questionnaire (Figure 3) was completed by 4 trainees (response rate, 80%) and 10 surgeons (response rate, 100%). Responses were favorable, with median agreement to all positive statements at least 4 of 5 and median agreement to all negative statements no more than 2 of 5, with the exception of the trainees' responses to "This tool is too time consuming" (median, 3/5). Trainees stated they would want to be assessed using the instrument monthly (2/5) or every few months (3/5), whereas surgeons stated they would want to assess their trainees with the instrument with each procedure (1/10), weekly (2/10), monthly (6/10), or every few months (1/10). Surgeons commented that the instrument "facilitated communicating feedback to the trainee with specific areas to work on" and that the impact of the instrument on the trainee was "positive, (and) forces discussion about important components of preparation and performance."

Comment

The utility of an educational assessment can be judged using the framework proposed by van der Vleuten,11 composed of reliability, validity, educational impact, acceptability, and cost-effectiveness. This study provides evidence for TCAT-IS in each of these domains, with the exception of cost, supporting a role for the instrument in the assessment of thoracic surgery trainees' performance of invasive staging procedures, both EBUS and mediastinoscopy. Expert review was provided by surgeons from a range of locations, invasive staging practices, years of experience, and exposure to trainees, although most had significant experience and involvement with trainees. Respondents reflected previously published data on the invasive staging practices of Canadian surgeons.12 Validity was shown by scores on TCAT-IS that correlated well with an established global competency assessment scale (OSATS) and by acceptable interrater reliability between trainees and assessors. Reliability, measured as Cronbach's alpha, was high, indicating that individual items correlated well with each other and demonstrating that the instrument as a whole assesses an internally consistent, unitary concept of overall competence to perform invasive staging. By including items that address patient and procedure selection, imaging interpretation, and communication in addition to procedural skills, TCAT-IS addresses a broad range of skills required to perform these procedures expertly.

Table 2. Item Analysis for Each Item in the Thoracic Competency Assessment Tool-Invasive Staging Instrument

Section | Item | Brief Descriptor | Discrimination (Point Biserial) | Difficulty (1 − Proportion of Scores of 5/5)
Preoperative | 1 | Patient selection | 0.11 | 0.04
 | 2 | Risks and benefits | 0.14 | 0.06
 | 3 | Interpret imaging | 0.45 | 0.04
General | 4 | Positioning | 0.10 | 0.04
 | 5 | Time out | 0.12 | 0.02
 | 6 | Node sampling | 0.42 | 0.07
 | 7 | Node stations | 0.29 | 0.12
 | 8 | Efficiency of motion | 0.29 | 0.14
 | 9 | Use of assistants | 0.65 | 0.15
 | 10 | Provide assistance | 0.74 | 0.14
 | 11 | Communication | 0.35 | 0.04
 | 12 | Sample handling | 0.40 | 0.05
 | 13 | Sample assessment | 0.56 | 0.05
 | 14 | Ensure hemostasis | 0.53 | 0.03
Mediastinoscopy | 15 | Prep and drape | 0.57 | 0.02
 | 16 | Incision | 0.57 | 0.02
 | 17 | Tissue handling | 0.57 | 0.02
 | 18 | Pre-tracheal plane | 0.74 | 0.03
 | 19 | Handle mediastinoscope | 0.68 | 0.08
 | 20 | Vascular anatomy | 0.56 | 0.05
 | 21 | Nodal anatomy | 0.44 | 0.07
 | 22 | Control major bleeding | 0.42 | 0.09
EBUS | 23 | Set up equipment | 0.17 | 0.2
 | 24 | Bronchoscopy | 0.63 | 0.05
 | 25 | Insert EBUS scope | NA | NA
 | 26 | Vascular anatomy | −0.37 | 0.18
 | 27 | Nodal anatomy | 0.08 | 0.13
 | 28 | Needle insertion | 0.14 | 0.1
 | 29 | Control major bleeding | NA | NA

EBUS, endobronchial ultrasound; NA, not available (insufficient data).

Item analysis provides further evidence of validity, because the performance characteristics of individual items fit well with what would be expected given our general understanding of invasive staging procedures. For example, the most difficult items addressed the ability to set up and adjust EBUS equipment and to properly identify vascular anatomy. Trainees are often not the ones responsible for setting up the EBUS equipment, so it makes sense that they struggle when asked to do so during an assessment. Similarly, trainees may initially learn to identify nodal stations by bronchoscopic landmarks without learning the correct vascular landmarks on ultrasound and may therefore find it difficult to identify vascular anatomy when asked. Interestingly, identification of vascular anatomy during mediastinoscopy was 1 of the easier items, as this may be more intuitive to trainees because it is directly visualized rather than seen via ultrasound or bronchoscopic landmarks. The use of assistants and providing assistance to another operator were 2 of the most discriminatory items, and these skills are often believed to differentiate between high and low performing trainees.

It is difficult, however, to draw conclusions from the item analysis given the small sample size. Because of the limited number of participants, it was also impossible to conduct meaningful analyses of individual trainees' performance over time or correlation of performance with experience or quality outcomes such as adequacy of lymph node specimen, procedure duration, or complications, all of which would add to the available validity evidence. However, there was initial evidence of an upward learning curve over time. In the future we hope to investigate the use of TCAT-IS with a larger number of trainees over a significant training period to address some of these deficiencies with the current study and establish more definitive evidence of the ability of this tool to measure competence.

Compared with our strategy of developing unique instruments for specific procedures in thoracic surgery, numerous generic global competency rating scales have been validated. One example is the OSATS.13 This 7-item instrument is well established for measuring overall surgical competence, independent of the procedure performed. A key limitation of such global scales is their inability to provide procedure-specific, fine-grained feedback to trainees and educators about which aspects of an operation have been mastered and which need improvement. Thus, although useful as a holistic assessment, OSATS and tools like it provide little specific and relevant feedback for learners to focus their skill improvement efforts. Our larger study on TCAT-ARC demonstrated that it was most useful for providing specific formative feedback,6 and this is likely also true of TCAT-IS. Of note, there were no instructions given to participants that TCAT-IS should be used to foster formative feedback to the trainee. That this appeared to occur spontaneously and organically is a further indicator of the instrument's utility in this regard.

Figure 3. Clinical pilot test feedback questionnaire results by trainees and surgeons.

The Accreditation Council for Graduate Medical Education lists "performs uncomplicated EBUS or mediastinoscopy" as a level 3 milestone in thoracic surgery.4 Because of the limited sample size in this pilot study and the relatively lower number of EBUS procedures captured, it would not be appropriate based on this study to use TCAT-IS alone to make high-stakes summative decisions about invasive staging competency. TCAT-IS is intended to provide a more accurate understanding of the degree to which a trainee has attained competence in these procedures and to help break down the areas in which a trainee excels and which areas she or he still needs to develop. However, the overall score on an instrument like TCAT-IS is a poor measure of overall competence because it is insensitive to a single, potentially disastrous error. Also, an average item score is not a useful measure of overall trainee competence, because some items clearly deserve more weight than others and also because a mean score does not reflect when a trainee was unable to complete a case and the surgeon took over. Mean overall scores were high in this study, at more than 4 of 5 for nearly all assessments. This may be partially accounted for by trainees only attempting, or being allowed to attempt, those steps of the procedure with which they were already comfortable, while other steps received a score of not applicable/not performed and did not factor into the mean overall score. Mean scores may also have been high if trainees selected particular procedures to be assessed where they believed they had performed particularly well. This could be addressed by making surgeons or program directors responsible for selecting which procedure is to be assessed or by setting a predetermined frequency of assessments. It is important, therefore, when using instruments like TCAT-IS to focus on individual item scores rather than the overall score, assessing progress over time on each step and ensuring that each step is attempted and mastered. For this reason we did not attempt to create an overall threshold score or a "passing grade." Although this study provides some early validity evidence for TCAT-IS as a competency assessment, further study is needed to confirm whether scores on TCAT-IS truly measure competence or whether the instrument's utility lies more in facilitating feedback. However, given the crucial role of direct, specific, and actionable feedback in attaining competence, we would argue that this role on its own is important enough to support the use of TCAT-IS in an educational program.

Identification of procedure-specific competencies is an emerging field in thoracic surgery. Ferguson and Bennett14 recently published the results of their Delphi process to identify the essential components of a thoracoscopic right upper lobectomy. Davoudi and colleagues15 have developed the EBUS Skills and Tasks Assessment Tool (EBUS-STAT), which consists of 10 items testing both EBUS technical skills and an image interpretation test for pulmonologists. The 7 technical skills in EBUS-STAT are each addressed to a degree in TCAT-IS, as is interpretation of relevant imaging. However, no previous tool exists that assesses competence in mediastinoscopy or invasive staging in general (EBUS and/or mediastinoscopy).

Importantly, feedback from both trainees and surgeons, during both the simulated and clinical pilot tests of TCAT-IS, was consistently positive in terms of both educational impact and user friendliness (acceptability). A web-based version of the instrument is currently under development, which should help with uptake and further improve user friendliness. Both groups believed TCAT-IS provided useful educational information to both the learner and the surgeon who was teaching and assessing them. In this way, tools like TCAT-IS may be most useful for formative assessment (specific, actionable feedback given throughout training about which steps have been mastered and which need further refinement and why), which has an important role in shaping training and improving competence over time. Comments from surgeons support the utility of TCAT-IS in providing a framework on which to structure detailed, procedure-specific feedback to trainees, which can be crucial in their skill development. Together with other tools, such as global competency scales and traditional gestalt assessments, TCAT-IS has the potential to form an important part of a multifaceted approach to assessing the performance of invasive staging.

The authors wish to acknowledge each of the other participating surgeons at the study sites for making this study possible (Swedish Cancer Institute and Medical Center: Drs Aye, Costas, Farivar, Gilbert, and Vallières; University of British Columbia: Dr McGuire; University of Western Ontario: Drs Fréchette, Fortin, and Malthaner) as well as the mentorship of the Thoracic Education Cooperative Group.

References

1. ten Cate O. Competency-based postgraduate medical education: past, present and future. GMS J Med Educ. 2017;34:Doc69.
2. Frank JR, Mungroo R, Ahmad Y, Wang M, De Rossi S, Horsley T. Toward a definition of competency-based education in medicine: a systematic review of published definitions. Med Teach. 2010;32:631-637.
3. Harris K, Frank J, eds. Competence by Design: Reshaping Canadian Medical Education. Ottawa, Canada: The Royal College of Physicians and Surgeons of Canada; 2014.
4. Nasca TJ, Philibert I, Brigham T, Flynn TC. The next GME accreditation system—rationale and benefits. N Engl J Med. 2012;366:1051-1056.
5. Cumming A, Ross M. The Tuning Project for Medicine—learning outcomes for undergraduate medical education in Europe. Med Teach. 2007;29:636-641.
6. Turner SR, Lai H, Nasir BS, et al. Clinical validation of a competency assessment scale for anatomic lung resection. Paper presented at: the American Association for Thoracic Surgery; May 2018; San Diego, CA.
7. Keeney S, Hasson F, McKenna H. The Delphi Technique in Nursing and Health Research. 1st ed. Chichester, UK: Wiley-Blackwell; 2011.
8. Scott EA, Black N. When does consensus exist in expert panels? J Pub Health Med. 1991;13:35-39.
9. Shapley L, Grofman B. Optimizing group judgemental accuracy in the presence of interdependencies. Pub Choice. 1984;43:329-343.
10. Schieman C, Ujie H, Donahoe L, et al. Developing a national, simulation-based, surgical skills bootcamp in general thoracic surgery. J Surg Educ. 2018;75:1106-1112.
11. van der Vleuten CP. The assessment of professional competence: developments, research and practical implications. Adv Health Sci Ed. 1996;1:41-67.
12. Turner SR, Seyednejad N, Nasir BS. Patterns of practice in mediastinal lymph node staging for non-small cell lung cancer in Canada. Ann Thorac Surg. 2018;106:428-434.
13. Martin JA, Regehr G, Reznick R, et al. Objective structured assessment of technical skill (OSATS) for surgical residents. Br J Surg. 1997;84:273-278.
14. Ferguson MK, Bennett C. Identification of essential components of thoracoscopic lobectomy and targets for simulation. Ann Thorac Surg. 2017;103:1322-1329.
15. Davoudi M, Colt HG, Osann KE, Lamb CR, Mullon JJ. Endobronchial ultrasound skills and tasks assessment tool. Assessing the validity evidence for a test of endobronchial ultrasound-guided transbronchial needle aspiration operator skill. Am J Respir Crit Care Med. 2012;186:773-779.