Does the NBME Surgery Shelf exam constitute a “double jeopardy” of USMLE Step 1 performance?

Accepted Manuscript

Michael S. Ryan, Jorie M. Colbert-Getz, Salem N. Glenn, Joel D. Browning, Rahul J. Anand

PII: S0002-9610(16)30978-3

DOI: 10.1016/j.amjsurg.2016.11.045

Reference: AJS 12205

To appear in: The American Journal of Surgery

Received Date: 7 March 2016

Revised Date: 21 October 2016

Accepted Date: 29 November 2016

Please cite this article as: Ryan MS, Colbert-Getz JM, Glenn SN, Browning JD, Anand RJ, Does the NBME Surgery Shelf exam constitute a “double jeopardy” of USMLE Step 1 performance?, The American Journal of Surgery (2017), doi: 10.1016/j.amjsurg.2016.11.045. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Michael S Ryan MD, MEHP (a), Jorie M Colbert-Getz MS, PhD (b), Salem N Glenn BS (a), Joel D Browning BS (a), Rahul J Anand MD (a)

a. Virginia Commonwealth University School of Medicine, 1201 East Marshall Street, Richmond, Virginia 23298

b. University of Utah School of Medicine, 30 N 1900 E, Salt Lake City, Utah 84132

Corresponding Author: Rahul J Anand MD FACS [email protected]

Background:

Scores from the NBME Subject Examination in Surgery (Surgery Shelf) positively correlate with United States Medical Licensing Examination Step 1 (Step 1). Based on this relationship, the authors evaluated the predictive value of Step 1 on the Surgery Shelf.

Methods:

Step 1 standard scores were substituted for Surgery Shelf standard scores for 395 students in 2012-2014 at one medical school. Linear regression was used to determine how well Step 1 scores predicted Surgery Shelf scores. Percent match between original (with Shelf) and modified (with Step 1) clerkship grades was computed.

Results:

Step 1 scores significantly predicted Surgery Shelf scores, R² = 0.42, P<0.001. For every point increase in Step 1, a Surgery Shelf score increased by 0.30 points. Seventy-seven percent of original grades matched the modified grades.

Conclusion:

Replacing Surgery Shelf scores with Step 1 scores did not have an effect on the majority of final clerkship grades. This observation raises concern over use of Surgery Shelf scores as a measure of knowledge obtained during the Surgery clerkship.

Keywords: Surgery Clerkship, Surgery Shelf Exam, USMLE Step 1 exam

Funding:

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Introduction

Most Surgery Clerkships across the United States use the National Board of Medical Examiners (NBME) Subject Examination in Surgery (henceforth referred to as Surgery Shelf) as a component of final grade determination. Institutions vary in the weight afforded to the Surgery Shelf for final grade calculations, but it commonly accounts for at least a quarter of the final grade.[1] The Surgery Shelf appeals to clerkship directors because of both its ease in administration and its value as a nationally developed measure of knowledge acquisition.[2] In contrast to other assessment methods such as global rating forms for clinical performance, the Surgery Shelf provides a more objective measure of clerkship-based performance.[3]

The United States Medical Licensing Examination (USMLE) Step 1 has been shown to correlate positively with Shelf exams in Surgery,[4,5] Internal Medicine,[5] Pediatrics,[5] Psychiatry,[5] Obstetrics and Gynecology,[5,6] and Family Medicine.[5,7] This correlation has several implications for the Surgery clerkship. First, it suggests that scores on USMLE Step 1 may be helpful in identifying students at risk for failing the Surgery Shelf and/or receiving a poor grade in the clerkship. Second, it has raised concern that the Surgery Shelf may signify a clerk’s general test-taking capabilities rather than his/her knowledge related to surgery,[8] thus serving as “double jeopardy” for students who perform poorly in other standardized tests.

The purpose of this study was to assess the value of USMLE Step 1 in predicting performance on the Surgery Shelf and, consequently, final clerkship grades. We compared performance on the Surgery Shelf to USMLE Step 1 scores and then replaced Surgery Shelf scores with scores from USMLE Step 1 to determine the impact the substitution would have on Surgery clerkship grades. We expected that USMLE Step 1 scores would be predictive of scores on the Surgery Shelf and that final clerkship grades would be unaffected if we substituted USMLE Step 1 scores for Surgery Shelf scores.

Materials and Methods:

This was a retrospective study. We collected USMLE Step 1 and USMLE Step 2 Clinical Knowledge (CK) scores, Surgery Shelf scores, and Surgery Clerkship grades for all medical students who completed the Surgery Clerkship from 2012 to 2014 at the Virginia Commonwealth University School of Medicine (VCU-SOM).

The Surgery Clerkship at VCU-SOM is an 8-week experience in which students spend 4 weeks on a General Surgery service (Gastrointestinal and Bariatric Surgery, Trauma and Emergency Surgery, Surgical Oncology, Transplantation Surgery, Pediatric Surgery, or Veterans Affairs Hospital) and two rotations of 2 weeks each on a surgical subspecialty service (Cardiothoracic Surgery, Neurosurgery, Ophthalmology, Orthopedic Surgery, Otolaryngology, Plastic Surgery, Urology, Community Rotation, and Vascular Surgery). Final grades on the clerkship are based on preceptors’ ratings of clinical performance (30% from General Surgery and 15% from each subspecialty experience), Surgery Shelf performance (30%), completion of online assignments (10%), and completion of other non-graded components such as written History and Physicals (Pass/Fail).

To compute a final grade, VCU-SOM converts each assessment score into a T-score. T-scores are standard scores that have a mean of 50 and a standard deviation of 10 and allow for comparison between assessment types that use different scales (e.g. 0-100%, 1-4). At VCU-SOM, thresholds for Honors, High Pass, and Pass are suggested using norm-referenced T-scores from the previous academic cycle. The threshold for Pass is identified by subtracting 2 standard deviations from the mean T-score from the previous academic year. The threshold for High Pass includes all T-score values above the mean but below the top 15%. All scores in the top 15% meet the threshold for Honors. Final grades are assigned by a grading committee, which takes into account the suggested clerkship grade as well as comments from preceptors. For the purpose of this study, the suggested T-score grade was termed the “original” grade.
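The grade computation described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the VCU-SOM implementation: the function and key names are hypothetical, the population standard deviation is assumed, the top-15% cut is a simple rank-based rule, and the handling of scores that fall exactly on a threshold is a guess, since the text does not specify any of these details.

```python
from statistics import mean, pstdev

# Component weights as stated in the text; the key names are hypothetical.
WEIGHTS = {
    "general_surgery": 0.30,
    "subspecialty_1": 0.15,
    "subspecialty_2": 0.15,
    "shelf": 0.30,
    "online": 0.10,
}


def t_scores(raw_scores):
    """Rescale raw scores to T-scores (mean 50, SD 10).

    Assumption: the population SD is used; the text does not say which.
    """
    m, sd = mean(raw_scores), pstdev(raw_scores)
    return [50 + 10 * (x - m) / sd for x in raw_scores]


def composite(component_t):
    """Weighted sum of one student's component T-scores."""
    return sum(WEIGHTS[name] * component_t[name] for name in WEIGHTS)


def thresholds(previous_composites):
    """Suggested cut-offs derived from the previous cycle's composites.

    Returns (pass_cut, mean_cut, honors_cut). The rank-based top-15%
    cut is an assumption.
    """
    m, sd = mean(previous_composites), pstdev(previous_composites)
    ranked = sorted(previous_composites)
    honors_cut = ranked[len(ranked) * 85 // 100]
    return m - 2 * sd, m, honors_cut


def suggested_grade(score, pass_cut, mean_cut, honors_cut):
    """Map a composite T-score to a suggested grade (boundary rules assumed)."""
    if score < pass_cut:
        return "Fail"
    if score >= honors_cut:
        return "Honors"
    if score > mean_cut:
        return "High Pass"
    return "Pass"
```

Under this sketch, swapping one component for another (for example, a Step 1 T-score in place of the `shelf` entry) only changes the dictionary passed to `composite`; the thresholds stay the same.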

Each student’s Surgery Shelf T-score was theoretically replaced with their USMLE Step 1 T-score in the final grade computation to determine a “modified” grade using the same thresholds for Honors, High Pass, and Pass. The frequency of each grade type was computed. Mean scores and standard deviations for USMLE Step 1 and the Surgery Shelf were also computed. Linear regression was employed to determine how well raw USMLE Step 1 scores predicted raw Surgery Shelf scores. The squared correlation coefficient (R²) between raw USMLE Step 1 scores and raw Surgery Shelf scores was computed to determine the effect size of the regression equation. We also used linear regression to determine how well raw Surgery Shelf scores predicted USMLE Step 2 CK scores.
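For readers less familiar with the analysis, the simple linear regression and squared correlation coefficient named above can be sketched as follows. This is an ordinary least-squares illustration with hypothetical function names; the study's own statistical software and settings are not stated in the text, and the sample values in the comments are made up, not study data.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(x, y):
    """Share of the variance in y explained by the fitted line."""
    intercept, slope = fit_line(x, y)
    my = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up illustration only: five Step 1 scores and five Shelf scores.
step1 = [200, 210, 220, 230, 240]
shelf = [66, 70, 73, 75, 81]
intercept, slope = fit_line(step1, shelf)
print(round(slope, 2), round(r_squared(step1, shelf), 2))
```

For a single predictor, the value returned by `r_squared` equals the square of the Pearson correlation between the two score lists.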

Results:

Three hundred ninety-five medical students completed the Surgery Clerkship from 2012 to 2014. Descriptive demographic information for these students is provided in Table 1. Figure 1 illustrates the median Surgery Shelf scores by rotation over the two academic years studied. During the 2012-2013 academic year, median scores improved with each rotation; however, this did not occur in the 2013-2014 academic year.

Figure 2 provides a comparison of original and modified final Surgery clerkship grades. There were no failures among either the original or the modified grades. Original and modified grades matched for 77% (304) of the students. Of the 91 unmatched cases, 46% (42) were modified grades that were lower than original grades (e.g. HP instead of H) and 54% (49) were modified grades that were higher than original grades (e.g. H instead of HP). Ninety-five percent (86) of the grade changes were a move up or down of one grade (e.g. H to HP), with only 5% (5) of the changes going up or down two grades (e.g. H to P).
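The comparison of original and modified grades can be tallied with a few lines of Python. The sketch below is illustrative only: the function names are hypothetical and the five-student input is invented to show the mechanics, while the final three lines recompute shares from the counts reported above.

```python
from collections import Counter

# Ordering of the grade labels used in the text (P < HP < H).
RANK = {"P": 0, "HP": 1, "H": 2}


def compare_grades(original, modified):
    """Tally matches and the direction and size of grade changes."""
    tally = Counter()
    for orig, mod in zip(original, modified):
        step = RANK[mod] - RANK[orig]
        if step == 0:
            tally["matched"] += 1
        else:
            tally["higher" if step > 0 else "lower"] += 1
            tally[f"{abs(step)}_step"] += 1
    return tally


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# Hypothetical five-student example (not study data).
tally = compare_grades(["H", "HP", "P", "HP", "H"], ["H", "H", "P", "P", "P"])
print(dict(tally))

# Shares recomputed from the counts reported in the text.
print(percent(304, 395))  # 77
print(percent(42, 91))    # 46
print(percent(49, 91))    # 54
```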

Students averaged 227 (SD = 18.67) on USMLE Step 1 and a scaled score of 75 (SD = 9) on the Surgery Shelf. Linear regression analysis showed that USMLE Step 1 scores significantly predicted Surgery Shelf scores, R² = 0.42, P<0.001. The R² value indicated that USMLE Step 1 scores accounted for 42% of the variance in Shelf scores. Figure 3 provides USMLE Step 1 scores by Shelf scores with the regression line. The regression equation (Y = 62.22 + 0.30x) was interpreted with 185 as the baseline score on Step 1, so that x is the Step 1 score minus 185. Thus, a Step 1 score of 185 would predict a Shelf score of 62, and for every 1-point increase in Step 1 the Shelf score would increase by 0.30 points.
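As a worked illustration of the regression equation above, the short Python sketch below evaluates it at two Step 1 scores. It assumes that x is the Step 1 score minus 185, which is how the baseline wording and the Figure 3 caption ("Step 1 Scores Minus 185") read; the constant and function names are illustrative.

```python
# Coefficients reported in the Results; the names are illustrative.
BASELINE_STEP1 = 185   # Step 1 score treated as x = 0
INTERCEPT = 62.22      # predicted Shelf score at the baseline
SLOPE = 0.30           # Shelf points per Step 1 point


def predicted_shelf(step1_score):
    """Predicted Surgery Shelf score for a given Step 1 score."""
    return INTERCEPT + SLOPE * (step1_score - BASELINE_STEP1)


print(f"{predicted_shelf(185):.2f}")  # 62.22
print(f"{predicted_shelf(227):.2f}")  # 74.82
```

A Step 1 score of 227, the cohort mean, gives a predicted Shelf score of about 75, which agrees with the reported mean Shelf score. This is expected, because a least-squares line passes through the means of both variables.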

Step 2 CK scores were unavailable for 5 students. Linear regression analysis showed that Shelf scores significantly predicted Step 2 CK scores, R² = 0.44, P<0.001. The regression equation (Y = 206.69 + 1.32x) was interpreted with 50 as the baseline score on the Shelf, so that x is the Shelf score minus 50. Thus, a Shelf score of 50 would predict a Step 2 CK score of 207, and for every 1-point increase in the Shelf score the Step 2 CK score would increase by 1.32 points.

Discussion:

The current study demonstrated that performance on the Surgery Shelf examination could be predicted from USMLE Step 1 performance and that Shelf scores shared a large percentage of variance with USMLE Step 1 scores. Additionally, theoretically replacing Surgery Shelf scores with USMLE Step 1 scores did not have an effect on a majority of final Surgery clerkship grades. These results further the concerns expressed by others regarding the use of the Surgery Shelf as a measure of knowledge acquired during the Surgery clerkship.

Surgery clerkship grades are commonly determined by a constellation of faculty and/or resident evaluations of a student’s clinical performance, the Surgery Shelf, and in some settings, oral examinations, objective structured clinical examinations, encounter notes, or other assignments. Numerous studies have illustrated poor concordance between clinical evaluations and performance on the Surgery Shelf, leading to the conclusion that clinical evaluations do not effectively measure medical knowledge.[3,9-11] However, the assumption is that the Surgery Shelf does, in fact, measure knowledge acquisition during the clerkship.

In the recent text from the Alliance for Clinical Education (ACE), a section is devoted to the utility of the NBME subject examinations in assessing knowledge acquisition.[2] As the authors describe, Shelf exams “reflect cumulative knowledge, including knowledge acquired from basic science and prior clinical experiences” (p. 187). This statement is supported by substantial literature, which reflects the impact of clerkship timing on performance on the subject exams. Specific to the Surgery Shelf, several authors have made the same observations.[12-14] For example, Gerhardt and colleagues compared surgical clerkship students on “slow” and “busy” clinical services and found that characteristics of clinical rotations were less influential on the Surgery Shelf scores than time of year.[14]

The NBME also acknowledges the notion that Shelf examinations may reflect cumulative knowledge. While they highlight that these examinations are intended to reflect learning in the context of the respective course or clerkship, they go on to share that students’ scores also relate to their overall progression throughout the course of medical school.[15]

The acknowledgements from ACE, the NBME, and our own observations are somewhat problematic for both clerkship directors and medical students. The NBME establishes suggested thresholds for “Honors” and passing using common standard-setting methods from a panel of experts. However, the use of these guidelines and the weight afforded to the Surgery Shelf are at the discretion of the local clerkship and/or School of Medicine. At our institution the Surgery Shelf counts for 30% of the final grade, but the weighting is variable across the country, with some institutions reportedly using the Shelf for up to 70% of the final grade.[1] At this time there is no consistency in Surgery Clerkship grading systems across medical schools.[16]

For medical students, there are additional concerns. Outside of professionalism, performance on USMLE Step 1 and “Honors” in the Surgical clerkship are considered the two most important factors in selecting applicants to interview for surgical residency.[17] While these are considered in the larger view of the applicant, one may posit that a student can offset poor performance on one with an outstanding performance on the other. However, if “Honors” on the Surgery clerkship is heavily influenced by the Surgery Shelf exam, and performance on this exam is influenced by previous performance, such as USMLE Step 1 performance, this raises concern for inherent “double jeopardy.”

Though our data do raise concern over the use of Surgery Shelf scores in determining surgical knowledge obtained during the clerkship, one must also consider the consequences of removing the Surgery Shelf as a component of the final grade. Rockney and colleagues assessed the impact of dropping the Pediatric Shelf examination from the clerkship at a single institution and found that students performed worse on USMLE Step 2 CK as a result.[18] Similarly, the Shelf has also been used effectively to identify students who are at risk for failure of USMLE Step 2.[19] The results of the current study suggest that performance on the USMLE Step 2 CK could be predicted from Surgery Shelf scores and that USMLE Step 2 CK scores shared a large percentage of variance with Shelf scores. Therefore, the Surgery Shelf can be viewed, at a minimum, as a method of preparing for USMLE Step 2 CK in a low(er) stakes setting, depending on how much weight it carries in final grade computations of the Surgery clerkship.

In addition to concerns expressed over potential for worse outcomes on USMLE Step 2 CK, one must consider the alternative methods of assessment available. Measures such as global rating forms for clinical performance are fraught with tendencies for evaluators to confuse attitudes such as motivation with fund of knowledge, skills in the operating room, or potential as a future surgeon.[3] Newer assessment tools such as virtual patient cases may be worth considering in the future.[20] However, regardless of the assessment method chosen, there are always trade-offs, which is why multiple assessment tools and multiple observations of performance are needed in clerkships to determine final grades. As the NBME itself states, “the results of the subject exams should not be viewed as the beginning and end of evaluation.”[15]

There are various limitations to this study. First, we used data from a single institution in which the Surgery Shelf counts for almost a third of the grade. The impact of replacing Surgery Shelf scores with USMLE Step 1 scores on final grades would presumably vary depending on how much weight the Shelf contributes to the grade across institutions. Further investigation with multiple medical schools is needed to determine if “double jeopardy” is a national issue or just a local issue. Second, while approximately three quarters of modified grades “matched” the original grades, we did not find a match for one quarter of final grades. Our study was not designed to evaluate data such as career interest, study habits, or demographic variables which may have influenced a better or worse performance on the Surgery Shelf exam than expected based on USMLE Step 1 scores.

Conclusions

Performance on the Surgery Shelf examination can be predicted using USMLE Step 1 scores. This result questions the use of the Surgery Shelf as a specific examination for knowledge acquired during the Surgery Clerkship and may have significant implications for Surgery clerkship directors.

References:

1. National Board of Medical Examiners. Characteristics of clinical clerkships. http://www.nbme.org/PDF/SubjectExams/Clerkship Survey Summary.pdf. Accessed January 1 2016.

2. Sisson T, Grum C. Clerkship examinations. In: Pangaro LN and McGaghie WC, eds. Alliance for Clinical Education: Handbook on Medical Student Evaluation and Assessment. North Syracuse: Gegensatz Press, 2015: 177-190.

3. Awad SS, Liscum KR, Aoki N et al. Does the subjective evaluation of Medical Student Surgical Knowledge Correlate with Written and Oral Exam Performance? J Surg Research 2002;104:36-39.

4. Kozar RA, Kao LS, Miller CC et al. Preclinical Predictors of Surgery NBME Exam Performance. J Surg Research 2007;140:204-7.

5. Zahn CM, Saguil A, Artino AR Jr, et al. Correlation of National Board of Medical Examiners scores with United States Medical Licensing Examination Step 1 and Step 2 scores. Acad Med. 2012;87:1348-1354.

6. Ogunyemi D, De Taylor-Harris S. NBME obstetrics and gynecology clerkship final examination scores: Predictive value of standardized tests and demographic factors. J Reprod Med. 2004;49:978-998.

7. Myles T, Galvez-Myles R. USMLE Step 1 and 2 scores correlate with family medicine clinical and examination scores. Fam Med. 2003;35:510-513.

8. Hermanson B, Firpo M, Cochran A et al. Does the National Board of Medical Examiners’ Surgery Subtest level the playing field? Am J Surg. 2004;188:520-1.

9. Goldstein SD, Lindeman B, Colbert-Getz J et al. Faculty and resident evaluations of medical students on a surgery clerkship correlate poorly with standardized exam scores. Am J Surg. 2014;207:231-5.

10. Lawrence PF, Nelson EW, Cockayne TW. Assessment of medical student fund of knowledge in surgery. Surgery. 1985;97:745-9.

11. Farrell TM, Kohn GP, Owen SM, et al. Low correlation between subjective and objective measures of knowledge on surgery clerkships. J Am Coll Surg. 2010;210:680-5.

12. Ripkey DR, Case SM, Swanson DB. Predicting performances on the NBME surgery subject test and USMLE Step 2: The effects of surgery clerkship timing and length. Acad Med. 1997;72:S31-33.

13. Baciewicz FA Jr, Arent L, Weaver M, et al. Influence of clerkship structure and timing on individual student performance. Am J Surg. 1990;159:265-8.

14. Gerhardt JD, Filipi CJ, Watson P, et al. Are long hours and hard work detrimental to end-clerkship examination scores? Am J Surg. 1999;177:132-135.

15. National Board of Medical Examiners. Subject Exams. http://www.nbme.org/schools/Subject-Exams/index.html. Accessed January 1 2016.

16. Ravelli C, Wolfson P. What is the “Ideal” Grading System for the Junior Surgery Clerkship? Am J Surg 1999;177:140-44.

17. National Resident Matching Program. http://www.nrmp.org/wpcontent/uploads/2014/09/PD-Survey-Report-2014.pdf. Accessed January 1 2016.

18. Rockney RM, Allister RG. Dropping the Shelf examination: does it affect student performance on the United States Medical Licensure Examination Step 2? Ambul Pediatr 2005;5:240-43.

19. Ripkey DR, Case SM, Swanson DB. Identifying students at risk for poor performance on USMLE Step 2. Acad Med 1999;74:45-48.

20. Yang RL, Hashimoto DA, Predina JD et al. The Virtual-Patient Pilot: Testing a New Tool for Undergraduate Surgical Education and Assessment. J Surg Education; 70:394-400.

Table 1: Demographic Information for 395 Surgery Clerkship Students

# Males                          214 (54%)
# Females                        181 (46%)
Age Range                        25 - 42 years
Average Age                      28 years
# Caucasian                      194 (49%)
# Asian                          100 (25%)
# Underrepresented Minorities    24 (6%)

Figure 1: Surgery Shelf Examination Scores by Rotation Number

See separate file

Figure 2: Percentage of students receiving modified grades of “Honors,” “High Pass,” and “Pass” compared to their original grades in the Surgery Clerkship

See separate file

Figure 3: Step 1 Scores Minus 185 by Surgery Shelf Examination Scores for 395 Students

[Figure 1 plot: Shelf Exam Raw Score (y-axis, 66 to 82) by Rotation Number (x-axis, 1 to 6); legend: 2012-2013, 2013-2014]

[Figure 2 bar chart: Modified Grade (n receiving) (y-axis, 0 to 200) by Original Grade (x-axis: Honors, High Pass, Pass); legend: Honors, High Pass, Pass]

[Figure 2 data table: counts by original and modified grade (Honors, High Pass, Pass): 128, 23, 3, 17, 5, 34, 20, 22, 167]

[Figure 2 pie charts by Original Grades (Honors, High Pass, Pass); slice labels: Honors 85%, High Pass 12%, Pass 3%; High Pass 44%, Honors 30%, Pass 26%; Pass 87%, High Pass 11%, Honors 2%]