The American Journal of Surgery (2016) 211, 1153-1157
Surgical Education
Accuracy and content of medical student midclerkship self-evaluations

Madeline B. Torres, M.D.,1 Amalia Cochran, M.D.*

Department of Surgery, University of Utah School of Medicine, 30 North 1900 East, SOM 3B110, Salt Lake City, UT 84132, USA
KEYWORDS: Medical student education; Self-evaluation; Knowledge acquisition; Cognitive skills
Abstract

BACKGROUND: Midclerkship self-evaluations (MCSEs) require students to reflect on their knowledge, skills, and behaviors. We hypothesized that MCSEs would be consistent with supervisor midpoint evaluations during a surgical clerkship.

METHODS: MCSEs of 153 students who completed our surgery clerkship over 2 academic years were compared with supervisor midclerkship evaluations. The quantitative domains of the MCSE and supervisor evaluation were compared for accuracy. Identified areas of strength and weakness were evaluated for thematic consistency.

RESULTS: Student MCSE scoring was accurate across the evaluated domains most of the time; when students were inaccurate, they tended to underrate themselves. Students and supervisors most often identified cognitive skills as areas for improvement, and noncognitive skills predominated as student strengths.

CONCLUSIONS: Medical students can accurately identify their strengths and weaknesses in the context of an MCSE. Based on these findings, knowledge acquisition and application by medical students in the clinical setting should be emphasized in undergraduate medical education.

© 2016 Elsevier Inc. All rights reserved.
There were no relevant financial relationships or any sources of support in the form of grants, equipment, or drugs. Work presented at the 2015 Association for Surgical Education Meeting, Seattle, WA, April 22-25.

* Corresponding author. Tel.: +1-801-581-7508; fax: +1-801-585-6005. E-mail address: [email protected]

1 Present address: Penn State Milton S. Hershey Medical Center, Department of Surgery, General Surgery Residency Program, 500 University Drive, MC H159, Hershey, PA 17033.

Manuscript received July 7, 2015; revised manuscript September 20, 2015

0002-9610/$ - see front matter © 2016 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.amjsurg.2015.11.030

Medical education is constantly evolving, with growing weight on self-directed learning and "flipping the classroom." Because of this emphasis, it has become more important that medical students develop accurate self-evaluation skills. Previous studies have demonstrated the inaccuracy of medical student and health professional self-evaluations, resulting in a belief that health professions students are unable to accurately assess their own strengths and weaknesses in order to improve on them.1,2

Midclerkship self-evaluations (MCSEs) require students to reflect on their knowledge, skills, and behaviors. Self-evaluation activities give students the opportunity to practice their self-evaluation skills, which are vital for life-long learning in medicine, in a context that emphasizes formative feedback from their supervisors. In addition, midpoint formative feedback is required by the Liaison Committee on Medical Education, and the American Board of Medical Specialties stresses self-evaluation as a key component of medical education and maintenance of certification.3,4

Our study had 2 specific areas of inquiry. First, we hypothesized that medical student self-assessments would align with those of their supervisors during a surgical clerkship. Second, we wanted to identify which clinical skills medical students and their supervisors most commonly identified as strong or weak.
Figure 1 Taxonomy of skills used to describe student performance in the surgical clerkship. Cognitive: knowledge base/recall, application/interpretation, critique, problem solving, data gathering. Noncognitive: reliability/responsibility, maturity, communication skills, honesty/integrity, respect for patients, chemical/mood disorder. Technical: surgical skills.
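For readers who wish to reuse this taxonomy computationally, a minimal sketch of it as a Python mapping follows. The category and subcategory names are taken from Figure 1; the data structure and variable name are our own illustration, not part of the study instrument.

```python
# Hypothetical encoding of the Figure 1 skill taxonomy; names follow the
# figure, but the structure itself is illustrative only.
TAXONOMY = {
    "cognitive": [
        "knowledge base/recall",
        "application/interpretation",
        "critique",
        "problem solving",
        "data gathering",
    ],
    "noncognitive": [
        "reliability/responsibility",
        "maturity",
        "communication skills",
        "honesty/integrity",
        "respect for patients",
        "chemical/mood disorder",
    ],
    "technical": ["surgical skills"],
}
```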
Methods

Participants

Participants included 153 3rd-year medical students at the University of Utah School of Medicine who completed the junior surgery clerkship in the 2012 and 2013 academic years. The University of Utah Institutional Review Board declared the study exempt.
Study design and analysis

We asked students to complete the self-assessment portion of their MCSE before meeting with a supervisor of their choice, who could be a senior resident or a faculty member. The MCSE form included space for a student to self-assess and an additional space for evaluation by their supervisor, allowing direct comparison of student and supervisor data from deidentified forms. The midclerkship evaluation form used a 4-point rating scale (1 = unacceptable, 4 = competent/advanced). The following 5 domains were present on the form: medical knowledge, progress notes, timeliness, initiative, and professionalism. Students submitted 2 MCSE forms during their 6-week surgery clerkship.

We compared the domains of the MCSE and supervisor evaluations for accuracy of self-assessment. Chi-square statistics were calculated using Stata 14 (StataCorp, College Station, TX).

Each MCSE also required a free-text statement, from both student and supervisor, of one thing the student does well and one area for improvement. If 2 or more strengths or areas for improvement were documented, only the first 2 comments were analyzed. Identified areas of strength and weakness were examined for thematic consistency between students and their supervisors. Themes were separated into cognitive, noncognitive, and technical skills taxonomies that were further subcategorized based on previously described schema5,6 (Fig. 1). Thematic consistencies were identified and evaluated independently by 2 of the authors, with 99% agreement. A third, independent rater resolved discrepancies.
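As a hedged illustration of the quantitative comparison described above, the sketch below classifies each student/supervisor rating pair as underrate, match, or overrate and tabulates the result by domain. The paper reports using Stata 14; this Python version, the column names, and the toy data are our assumptions, not the authors' code.

```python
# A minimal sketch of the accuracy comparison, assuming each record pairs
# a student self-rating with a supervisor rating on the 4-point scale.
# pandas/scipy and the column names are assumptions; the study used Stata 14.
import pandas as pd
from scipy.stats import chi2_contingency

def classify(student: int, supervisor: int) -> str:
    """Label a rating pair as underrate, match, or overrate."""
    if student < supervisor:
        return "underrate"
    if student > supervisor:
        return "overrate"
    return "match"

# Toy stand-in for the deidentified MCSE forms (hypothetical values).
df = pd.DataFrame({
    "domain":     ["medical knowledge", "medical knowledge", "timeliness", "timeliness"],
    "student":    [2, 3, 4, 3],
    "supervisor": [3, 3, 4, 2],
})
df["accuracy"] = [classify(s, v) for s, v in zip(df["student"], df["supervisor"])]

# Contingency table of accuracy category by domain, and a chi-square test of
# whether the accuracy distribution differs across domains.
table = pd.crosstab(df["domain"], df["accuracy"])
chi2, p, dof, expected = chi2_contingency(table)
print(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```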
Results

A total of 153 MCSEs were analyzed for this 2-year period. Accuracy of student self-evaluation vs supervisor evaluation is shown in Table 1. Students were most likely both to underrate and to overrate themselves in medical knowledge (43% underrate, 12% overrate). Student self-rating was most accurate for professional demeanor (66%) and timeliness (66%). Concordance between student self-rating and supervisor rating exceeded 50% for all domains except medical knowledge (45%).

In the evaluation of the free-text statements, student perceptions of clinical strengths corresponded with their evaluators' at the major-category level 85.6% of the time; areas for improvement similarly aligned 72.5% of the time between students and evaluators (Fig. 2). Both students and supervisors were most likely to identify noncognitive skills as areas of strength (93.1%), followed by cognitive skills (5.4%) and technical skills (1.5%). Cognitive skills were the most commonly identified areas for improvement (58%), followed by noncognitive skills (38.4%) and technical skills (3.6%; Fig. 3).

Fig. 4 summarizes the distribution of major domains and subcategories in which student and supervisor free-text comments matched in describing student strengths. Reliability/responsibility was the most commonly matched student strength, with no other subcategory reaching double digits.
Table 1  Accuracy of student self-evaluation vs supervisor evaluation

Domain                            Underrate   Match       Overrate
Medical knowledge (n = 152)       66 (43%)    68 (45%)    18 (12%)
Progress notes (n = 145)          54 (37%)    78 (54%)    13 (9%)
Timeliness (n = 152)              40 (26%)    101 (66%)   11 (7%)
Initiative (n = 152)              50 (33%)    88 (58%)    14 (9%)
Professional demeanor (n = 152)   33 (22%)    101 (66%)   18 (12%)
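As a quick sanity check (our addition, not part of the original analysis), the row percentages in Table 1 can be reproduced from the raw counts:

```python
# Reproduce the Table 1 row percentages from the raw counts
# (counts transcribed from the table above).
counts = {
    "medical knowledge":     (66, 68, 18),   # (underrate, match, overrate)
    "progress notes":        (54, 78, 13),
    "timeliness":            (40, 101, 11),
    "initiative":            (50, 88, 14),
    "professional demeanor": (33, 101, 18),
}
for domain, (under, match, over) in counts.items():
    n = under + match + over
    print(f"{domain}: n = {n}, underrate {under / n:.0%}, "
          f"match {match / n:.0%}, overrate {over / n:.0%}")
```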
Figure 2 (A) Matching of free-text comments, strengths. (B) Matching of free-text comments, areas for improvement.
Fig. 5 provides the same summary for areas for improvement, showing that the matched areas for improvement were more broadly distributed than were the strengths. Although knowledge base and recall (cognitive) was the most common area for improvement, application/interpretation (cognitive) and communication skills (noncognitive) were also commonly cited. Of note, the N for the major domains in Figs. 4 and 5 is the total number of comments in that category, whereas the n for the subcategories includes only those in which student and supervisor matched.
Comments

Quantitative data

Our results indicate that 3rd-year medical students are reasonably skilled at self-assessing their knowledge, skills, and behavior when anchored against their supervisors during the surgical clerkship. The MCSE match rates for the quantitative domains were above 50%, with the exception of medical knowledge (45%). Students were most likely to match their supervisors' assessments of timeliness and professional demeanor, both of which are core noncognitive behaviors that should be present in all adult learners. The lower percentage of matching ratings for medical knowledge
comes as no surprise given the similar rate of underrating of medical knowledge (43%). Our findings differ from those described in Gordon's review of the validity and accuracy of self-assessment in health professions training, which reported consistent underrating by health professions students on self-evaluations.1

We observed that very few students overrated their performance, with medical knowledge and professional demeanor being the most commonly overrated categories. Most of the students who overrated their abilities matched in all other domains, meaning that their misperception of their knowledge and behavior was usually isolated to that single domain. As would be expected, students who overrated their professional demeanor also failed to identify this as one of their areas for improvement in the free-text section of the form.

Students underrated themselves less than 50% of the time in the quantitative domains. Our students were most likely to underrate their skills in medical knowledge (43%), consistent with previous reports of students underrating both skills and performance.7–11
Qualitative data

In analyzing the free-text areas of the form, we found that the identified major domains of strength and areas for improvement had match rates above 50%, again suggesting that students are generally accurate assessors of their skills and knowledge.

Figure 3 (A) Matching domains in free-text comments, strengths (n = 131). (B) Matching domains in free-text comments, areas for improvement (n = 111).
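As a consistency check (our inference, not stated explicitly in the paper), the n values in Figure 3 reproduce the match rates quoted in the Results when divided by the 153 analyzed MCSEs:

```python
# 131 matched strength comments and 111 matched improvement comments
# out of 153 MCSEs reproduce the quoted 85.6% and 72.5% match rates.
total = 153
print(f"strengths matched: {131 / total:.1%}")     # -> 85.6%
print(f"improvements matched: {111 / total:.1%}")  # -> 72.5%
```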
Figure 4 Matching free-text categories and domains for identified areas of strength; all values are n. Noncognitive strengths (122): reliability/responsibility 80, communication skills 5, respect for patients 2, maturity 0, critique 0, honesty/integrity 0, chemical/mood disorder 0. Cognitive strengths (7): knowledge base/recall 3, data gathering 2, application/interpretation 0, problem solving 0. Technical skills strengths (2): technical skills 2.
Figure 5 Matching free-text categories and subcategories for identified areas for improvement; all values are n. Cognitive areas for improvement (65): knowledge base/recall 32, application/interpretation 18, data gathering 2, problem solving 0. Noncognitive areas for improvement (43): communication skills 20, reliability/responsibility 6, maturity 2, critique 0, honesty/integrity 0, respect for patients 0, chemical/mood disorder 0. Technical skills areas for improvement (4): technical skills 4.

Looking in more detail at subcategories within the matching free-text areas, we found that students most commonly listed noncognitive skills as areas of strength. Kaiser and Bauer12 presented similar findings, describing very strong agreement on student self-evaluations of interactional and physical examination skills, which would fit under our umbrella of noncognitive skills. This finding is also supported by our quantitative data, in which students underrate themselves the most in medical knowledge, a cognitive skill, and match the most in timeliness and professional demeanor, which are noncognitive skills. We believe this finding is likely a result of how easy it is to comment on noncognitive skills
(eg, reliability and responsibility, communication skills, and patient care) compared with cognitive skills; cognitive skills demand more extensive and regular interaction with students to adequately evaluate their knowledge base and their ability to apply and interpret data.

Cognitive skills were the most commonly identified major area for improvement. The most commonly listed cognitive skill for improvement was knowledge base and recall, mirroring the underrating of knowledge in our quantitative data. Communication skills were commonly listed as a noncognitive area for improvement, similar to findings describing pharmacy students who consistently underrated their own communication skills compared with their evaluators.9 The identification of cognitive skills, particularly knowledge base, as a consistent deficit is not surprising, particularly because the basic surgical clerkship is a time for acquisition of core surgical knowledge by learners.

Our study does have some limitations. First and foremost, its single-center nature may introduce bias: the admissions committee at our institution may select for a particular "type" of student who is more likely to excel in one major domain than another, and student performance may be affected by curricular emphases during the first 2 years of medical school. Because of this potential for selection bias, our results may not be generalizable to other medical schools and other clerkships. Another limitation is that gender information was lost during the data blinding process, so certain gender-based differences in self-assessment may not have been adequately captured. A further possible limitation is our use of supervisor "expert" assessment of medical student knowledge as the reference standard rather than National Board of Medical Examiners subject examination scores. However, prior work has shown that clerkship evaluations of medical knowledge correlate well with National Board of Medical Examiners subject examination scores, so we do not believe this is a major shortcoming.13 Finally, we did not compare the midpoint evaluations with end-of-clerkship evaluations, primarily because the feedback from supervisors is considered entirely formative.

In spite of these limitations, we believe that our examination of the domains in which medical students perceive, and are perceived as having, both strengths and weaknesses provides important information for improving foundational undergraduate medical education and for targeting learning activities during the surgical clerkship.
Conclusions

Data suggest that self-assessments do not correlate with competence,14 but we strongly believe MCSEs have a place in medical education. There is value in conducting MCSEs in conjunction with midclerkship formative feedback for all clerkships, as they provide students an opportunity to evaluate their performance and an opportunity to receive and
manage feedback. External feedback has also been demonstrated to improve the validity and accuracy of self-assessments,1,8 so pairing these activities provides the most meaningful growth opportunities for medical students. In addition, with the incorporation of practice-based learning and improvement as both an ACGME core competency and a component of maintenance of certification in all specialties, acquisition of self-assessment skills is imperative. When combined with a program that explicitly encourages reflection and development in this area, MCSEs allow students to hone their self-assessment skills, providing them with a skill set necessary for success as medical professionals.
Acknowledgments The authors thank Ms. Dellene Stonehocker for administrative assistance with data acquisition.
References

1. Gordon MJ. A review of the validity and accuracy of self-assessments in health professions training. Acad Med 1991;66:762.
2. Davis DA, Mazmanian PE, Fordis M, et al. Accuracy of physician self-assessment compared with observed measures of competence: a systematic review. JAMA 2006;296:1094.
3. American Board of Medical Specialties. Maintenance of Certification Competencies and Criteria. Available at: http://www.abms.org/maintenance_of_certification/MOCcompetencies.aspx. Accessed May 18, 2015.
4. Liaison Committee on Medical Education. Data Collection Instrument. Available at: http://www.lcme.org/survey-connect-dci-download.htm. Accessed May 31, 2015.
5. Buckwalter JA, Schumacher R, Albright JP, et al. Use of an educational taxonomy for evaluation of cognitive performance. J Med Educ 1981;56:115.
6. Phelan S, Obenshain SS, Galey WR. Evaluation of the noncognitive professional traits of medical students. Acad Med 1993;68:799.
7. Papinczak T, Young L, Groves M, et al. An analysis of peer, self, and tutor assessment in problem-based learning tutorials. Med Teach 2007;29:e122–32.
8. Edwards RK, Kellner KR, Sistrom M, et al. Medical student self-assessment of performance on an obstetrics and gynecology clerkship. Am J Obstet Gynecol 2003;188:1078–82.
9. Lundquist LM, Shogbon AO, Momary KM, et al. A comparison of students' self-assessments with faculty evaluation of their communication skills. Am J Pharm Educ 2013;77:72.
10. Isenberg GA, Roy V, Veloski J, et al. Evaluation of the validity of medical students' self-assessments of proficiency in clinical simulations. J Surg Res 2015;193:554–9.
11. Lind D, Rekkas S, Bui V, et al. Competency-based student self-assessment on a surgery rotation. J Surg Res 2002;105:31–4.
12. Kaiser S, Bauer JJ. Checklist self-evaluation in a standardized patient exercise. Am J Surg 1995;169:418.
13. Reid C, Kim D, Mandel J, et al. Correlating surgical clerkship evaluations with performance on the National Board of Medical Examiners examination. J Surg Res 2014;190:29–35.
14. Barnsley L, Lyon PM, Ralston SJ, et al. Clinical skills in junior medical officers: a comparison of self-reported confidence and observed competence. Med Educ 2004;38:358.