ORIGINAL REPORTS
Establishing Learning Curves for Surgical Residents Using Cumulative Summation (CUSUM) Analysis
CPT Amy Young, MC, USA,* LTC Joseph P. Miller, MC, USA,† and COL Kenneth Azarow, MC, USA*
Departments of *Surgery and †Anesthesia/Operative Services, Madigan Army Medical Center, Tacoma, Washington
BACKGROUND: The assessment of technical proficiency is of paramount importance in the training of surgical residents. Technical proficiency is underrepresented in the ACGME outcomes project: proficiency skills comprise less than 5% of all assessments that evaluate residents. In this study, we use Cumulative Summation Analysis (CUSUM) as a visual, objective analytic tool to determine performance accuracy and establish learning curves for PGY-1s in surgery.
METHODS: From April 2001 to May 2002, 11 surgical residents completed a 1-month anesthesia rotation. Each resident was asked to complete a preoperative airway assessment followed by endotracheal intubation with induction of anesthesia. Airway assessment was performed independently by a resident and a licensed anesthesiologist or certified anesthetist with the modified Mallampati Score. Data were sequentially collected and plotted for summated successes and failures.
RESULTS: The average intern required approximately 19 intubation attempts to complete the learning curve experience. There was no learning curve for airway assessment.
CONCLUSIONS: The CUSUM analysis is an effective objective tool to define learning curves for technical skills. It provides vital information for surgical programs that place residents in positions to manage airways, and it offers limitless potential for defining the learning curves for technical skills. (Curr Surg 62:330-334. Published by Elsevier Inc. on behalf of the Association of Program Directors in Surgery.)
KEY WORDS: cumulative summation analysis, CUSUM, intubation, learning curve, GME, surgical education
Correspondence: Inquiries to Kenneth S. Azarow, MD, Department of Surgery, Madigan Army Medical Center, MCHJ-SGY, Tacoma WA 98431; fax: (253) 968-0232; e-mail:
[email protected] The opinions and assertions contained herein are the private views of the authors and are not to be construed as the official policy or position of the United States Government, the Department of Defense, or the Department of the Army.
BACKGROUND The evaluation of technical proficiency is a difficult and complex task. Historically, evaluating surgical residents has been primarily based on subjective criteria. These criteria have been challenged by the ACGME, a challenge that resulted in the ACGME’s outcomes project. Tests, such as the American Board of Surgery In Training and Basic Science Examination along with oral examinations commonly assess factual knowledge and reasoning ability, even within the structure of the outcomes-based project. However, for technical proficiency, no objective measuring stick exists. Control charts are statistical tools originally developed by engineers to test the efficiency of machinery and mechanical systems. Cumulative Summation Analysis (CUSUM) is a type of control chart that has recently gained acceptance in the medical field.1 The basic premise of the analysis is to plot the sequential difference of a set of measured values and to define a target level for those values.2 Thus, the analysis can determine the overall proficiency at achieving success of a given task. By doing so, it is believed that tighter control over deviation from a standard can be achieved. In addition, if we postulate that all subjects have the same baseline experience, we can use the changing success rates as a learning curve until a steady success rate is achieved. To achieve an assumption of equal baseline experience, interns or first-year residents (PGY-1s) become a natural source of a standardized pool of participants. Finally, as all results are summated, the data from all participants can develop a learning curve for the average of all participants. Interpretations can be made for a population rather than for each person alone, which provides the potential application for curricula development, rotational experiences, mentor evaluations, and credentialing experiences. 
The aim of this study is to identify whether CUSUM can establish a defined number of procedural attempts necessary for the average resident to achieve proficiency of a particular task. The skills of (1) airway assessment and (2) endotracheal intubation were chosen as the model for this analysis.
CURRENT SURGERY • Published by Elsevier Inc. on behalf of the Association of Program Directors in Surgery
0149-7944/05/$30.00 doi:10.1016/j.cursur.2004.09.016
METHODS From April 2001 to May 2002, 11 surgical PGY-1s at Madigan Army Medical Center (Ft Lewis, Washington) completed a 1-month anesthesia rotation. Each resident received the same prestudy training, including ATLS and a didactic session by anesthesia on the first day of the rotation. Thus, this study makes the assumption that all participating interns start with the same baseline intubation experience. All PGY-1s had completed ACLS before beginning the academic year, had completed ATLS during the month before this anesthesia rotation, and underwent a half-day of didactic instruction on intubation before the initiation of their measured experience. To standardize the patient population, eligible patients were those aged 18 years and older, ASA class I or II, and undergoing elective surgery under general anesthesia. In addition, any patient whose airway assessment received a modified Mallampati score of 3 or 4 was excluded from the intubation part of the study (see below). For both tasks (assessment and intubation), target success rates were set at 95%. These success rates were determined via consensus of the anesthesia department at our institution. Airway assessment was determined with the modified Mallampati score (score corresponds to grade for this study).3 In a supine position, with neck flexed and not extended or neutral, the patient protrudes his/her tongue while phonating. By simply looking in the pharynx, the PGY-1 determines the visibility of pharyngeal structures and assigns a score based on exposure of the glottis.
Grade I = glottis could be fully exposed.
Grade II = glottis could be partially exposed (anterior commissure not visualized).
Grade III = glottis could not be exposed (corniculate cartilages only visible).
Grade IV = glottis including corniculate cartilages could not be exposed.
The score assigned by the PGY-1 was then recorded and compared with an independent assessment by staff (anesthesiologist or chief nurse anesthetist).
A “successful” attempt was one in which the PGY-1’s score was concordant with that of the anesthesia staff. As long as the score was graded as 1 or 2, the patient was then moved into the operating suite and placed under general anesthesia. The intern then made an attempt at endotracheal intubation under direct supervision of a credentialed anesthesia provider. A “successful” endotracheal intubation was defined as one in which the endotracheal tube is in place with cuff inflated within 30 seconds of direct laryngoscopy and confirmed by end tidal CO2 of 30 mm Hg for 3 breaths. The CUSUM equation is defined as $S_n = \sum_{i=1}^{n} (X_i - X_0)$, where $S_n$ is the cumulative sum after $n$ attempts, $X_i$ is the score of an individual attempt, and $X_0$ is the predetermined failure rate inherent for the procedure. $X_i$ is assigned a score of 0 for a success and 1 for a failure. As a premise, every procedure has an inherent failure rate. The target success rate for intubation was set at 95% by our anesthesia department. Thus, the inherent failure rate was defined at 5% or 0.05. The score, after each attempt, was sequentially added to the cumulative score and plotted on a graph. Graphs were then analyzed on the basis of their slope. A positive, up-going slope indicates a series of failures, whereas a negative, down-going slope indicates a series of successes.1,2 The portion of the curve where the maximal change in slope begins to decrease signifies the end of the initial learning process (ie, “getting off the learning curve”). This study was approved by the Institutional Review Board and Human Use Committees at Madigan Army Medical Center.
CURRENT SURGERY • Volume 62/Number 3 • May/June 2005
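The scoring rule described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' software: the function name `cusum`, the parameter `failure_rate`, and the sequence of outcomes are hypothetical. It assumes only what the text states, namely that each attempt scores 0 for a success and 1 for a failure, and that the acceptable failure rate of 0.05 is subtracted from every attempt before summing.

```python
from itertools import accumulate


def cusum(outcomes, failure_rate=0.05):
    """Return the running CUSUM for one resident.

    outcomes: 0 (success) or 1 (failure) for each attempt, in order.
    failure_rate: acceptable failure rate X0 (0.05 for a 95% target).
    """
    # Each attempt contributes (X_i - X0); accumulate keeps the running total.
    return list(accumulate(x - failure_rate for x in outcomes))


# Hypothetical intern: fails attempts 1, 2, and 4, then succeeds.
attempts = [1, 1, 0, 1, 0, 0, 0, 0]
curve = cusum(attempts)
print([round(value, 2) for value in curve])
```

Under this rule each failure raises the curve by 0.95 and each success lowers it by 0.05, so a run of successes appears as a gently falling line, and a resident performing exactly at the 95% target would have a flat curve on average.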
RESULTS Figure 1 demonstrates individual intern successes plotted versus attempts. Thus, a CUSUM analysis is done for each resident. Note the wide variability in the shapes and slopes of these curves. When these data are summated (ie, all first attempts are summed, then added to the sum of all second attempts, and so on for subsequent attempts), Table I is formed. This table demonstrates the cumulative summation data for all residents at each of the tested skills. The cumulative summated data are demonstrated in Figs. 2 and 3. These curves represent the success or failure of the total group. Thus, the point at which the change of slope decreases indicates the number of procedures necessary for the average PGY-1 to get off of the learning curve. For endotracheal intubation, this point was estimated at 19 attempts. For airway assessment, the slope never changed; thus we deduce that no learning curve exists for this skill. Thus, despite a positive slope (ie, a success rate that falls below the defined acceptable rate), residents were as good as they were going to get at the outset for this standardized patient.
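The pooling step described above (sum all first attempts, then all second attempts, and accumulate) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: the function names `pooled_cusum` and `mean_slope` and the example data are hypothetical, and residents with fewer attempts are assumed simply to drop out of later sums. With the Methods scoring, a per-attempt sum of 6.45, the first value tabulated in Table I, is what 7 failures among 11 residents would produce (7 − 11 × 0.05).

```python
from itertools import accumulate, zip_longest


def pooled_cusum(per_resident, failure_rate=0.05):
    """Pool several residents' outcomes into one group CUSUM.

    per_resident: list of outcome lists (0 = success, 1 = failure);
    the lists may differ in length.
    Returns (per_attempt_sums, cumulative_sums).
    """
    sums = []
    for column in zip_longest(*per_resident):
        # Residents who made fewer attempts show up as None; skip them.
        present = [x for x in column if x is not None]
        sums.append(sum(x - failure_rate for x in present))
    return sums, list(accumulate(sums))


def mean_slope(cumulative, first, last):
    """Average change per attempt between attempts first and last (1-indexed)."""
    return (cumulative[last - 1] - cumulative[first - 1]) / (last - first)


# Hypothetical group of 11: seven residents fail attempts 1 and 2, four succeed.
group = [[1, 1]] * 7 + [[0, 0]] * 4
sums, cumulative = pooled_cusum(group)
print([round(s, 2) for s in sums], [round(c, 2) for c in cumulative])
```

Comparing `mean_slope` over an early window with a later window is one simple way to put a number on the visual judgment of where the pooled curve flattens; the choice of windows is left to the reader and is not specified in the study.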
DISCUSSION Over the last 15 years, the CUSUM method has been used by physicians in various medical disciplines to assess trends and proficiencies. It has assessed antimicrobial treatment in neutropenic patients by plotting temperature curves.4 Schlup et al analyzed the technical proficiency of a single endoscopist in performing ERCP. A 90% target success rate was achieved for selective cannulation after 100 procedures and 120 interventions.5 McCarter et al examined the learning curves for FAST examinations performed by 5 trauma surgeons, both as individuals and as an institution. Graphs were plotted for 3 different success rates, namely, 85%, 90%, and 95%. These subjects achieved accuracy from the outset, that is, without a learning curve.6 The following 2 studies are the only studies applying CUSUM to assess surgical residents. Van Rij et al looked at 17 surgical residents. They documented that 25 operations were needed before acceptable speed was reached in performing appendectomies, open cholecystectomies, and inguinal hernia repairs.7 Molloy et al studied proficiency at intraoperative cholangiography performed during laparoscopic cholecystectomies. A 95% success rate was achieved after 46 cases. Twenty-four and 16 cases were required to attain success rates of 90% and 85%, respectively.8 Learning curves were constructed by de Oliveira Filho for
[Figure 1 panels: individual CUSUM plots; y-axis CUSUM, x-axis attempts.]
FIGURE 1. Cumulative summation of successes and failures versus attempts for each intern enrolled.
anesthesia residents in basic anesthesia procedures, including endotracheal intubation.9 That study set a lower success rate of 80% but held to a more stringent set of criteria by not allowing repeat attempts at intubation. In our study, we not only allowed for repeated attempts at intubation but also used a standardized pool of patients selected for less complex airways. Establishing a learning curve for residents to accomplish the skill of endotracheal intubation has accomplished 3 important endpoints. First, it has allowed us to define the minimum training experience necessary to succeed at this skill. Armed with the knowledge that at least 19 attempts are necessary before nearing proficiency, we have increased the caseload volume to accomplish this on a consistent basis. We feel this information is also useful when arranging the order of clinical rotations. As both trauma and critical care rotations require intubation skills, we have scheduled them to follow the anesthesia month. Second, we now have an objective method for measuring technical ability. With the establishment of an average learning curve, future residents can be objectively evaluated based on their ability to follow this curve. Finally, this method will allow for potential early intervention when skills are lower than average. If a given resident shows a persistently steep positive slope, and can be identified early in the process, extra mentorship could be provided. This point is demonstrated by intern #8 in Fig. 1. This first-year resident was the only person not able to establish a trend at which the learning process could be identified. Retraining or a change in the training process for this task would be indicated for this person. The dynamic nature of this analysis can allow for early intervention and success, which avoids a declaration of failure at the end of the training period. Regarding the skill of airway assessment, residents displayed great variability in individual proficiency and as a group failed to achieve the target success rate of 95%. Some residents mastered the skill at the first attempt, whereas others displayed a continuous pattern of successes and failures. The conclusion that we must draw is that this skill requires no learning to accomplish and that degree of success or failure is fixed rather than alterable based on experience. This conclusion is based on the assumption that airway assessment is a measurable skill with defined parameters. An alternative conclusion for the observed pattern is that the criteria necessary to perform a CUSUM analysis were not met and thus the conclusion is not correct. Factors that would make the CUSUM invalid would be inadequate prestudy training such that the residents did not have
TABLE I. Cumulative Summation Scores for All Residents at Each Attempt at Intubation and Airway Assessment
           Intubation          Assessment
Attempt    SUM      CUSUM      SUM      CUSUM
1          6.45     6.45       0.45     0.45
2          6.45     12.90      0.45     0.90
3          2.45     15.35      0.45     1.35
4          2.45     17.80      0.45     1.80
5          1.45     19.25      0.55     2.35
6          4.45     23.70      0.45     2.80
7          1.45     25.15      -0.55    2.25
8          2.45     27.60      0.45     2.70
9          1.45     29.05      0.45     3.10
10         4.45     33.50      2.55     5.70
11         -0.55    32.95      -0.45    5.25
12         1.45     34.40      -0.55    4.70
13         -0.55    33.85      -0.55    4.15
14         1.45     35.30      1.45     5.60
15         0.45     35.75      0.55     6.15
16         0.45     36.20      1.45     7.60
17         1.45     37.65      0.45     8.05
18         3.45     41.10      0.45     8.50
19         1.45     42.55      1.45     9.95
20         0.45     43.00      1.65     11.60
21         2.50     45.50      0.50     12.10
22         0.55     46.05      0.55     12.65
23         -0.45    45.60      -0.45    12.20
24         -0.45    45.15      1.55     13.75
25         0.60     45.75      -0.30    13.45
26         -0.40    45.35      0.60     14.05
27         -0.30    45.05      -0.30    13.75
FIGURE 3. Cumulative summation (y-axis) for all residents plotted against the number of attempts (x-axis) and airway assessment. As the slope was relatively constant, no learning curve was demonstrated.
equal skill levels at the onset. Perhaps a more extensive initial lecture on the modified Mallampati Score would be helpful. In addition, a standard quiz at completion of prestudy training could be administered to establish each resident’s cognitive understanding of this skill. Another explanation is that this evaluation is a subjective judgment rather than a measured technical skill. Thus, it would not be subject to the same assumptions and objective determinations of success and failure. This study has established CUSUM analysis as an objective method of evaluating the technical proficiency of first-year surgical residents at our institution. We anticipate using it in multiple other procedures and the overall evaluation of our residents at all levels. We have already begun to alter schedule rotations, completely change rotational curricula, and initiate mentoring programs based on this evaluation tool.
FIGURE 2. Cumulative summation (y-axis) for all residents plotted against the number of attempts (x-axis) at endotracheal intubation. The point at which the average intern comes off of the learning curve is approximately 19.
REFERENCES
1. Ravin L. The CUSUM score. A tool for evaluation of clinical competence. Ugeskrift For Laeger. 2001;163:3644-3648.
2. Goldsmith ODaP. Statistical Methods in Research and Production. London: Longman; 1976.
3. Constantikes J. Predicting difficult tracheal intubation using a modified Mallampati sign: a pilot study report. CRNA. 1993;4:16-20.
4. Kinsey SE, Giles FJ, Holton J. Cusum plotting of temperature curves for assessing antimicrobial treatment in neutropenic patients. BMJ. 1989;299:775-776.
5. Schlup MM, Williams SM, Barbezat GO. ERCP: a review of technical competency and workload in a small unit. Gastrointest Endosc. 1997;46:48-52.
6. McCarter F, Luchette FA, Molloy M, et al. Institutional learning curves for focused abdominal ultrasound for trauma, cumulative summation analysis. Ann Surg. 2000;231:689-700.
7. Van Rij AM, McDonald JR, Pettigrew R, Petterill M, Reddy C, Wright J. Cusum as an aid to early assessment of the surgical trainee. Brit J Surg. 1995;82:1500-1503.
8. Molloy M, Bower RH, Hasselgren P, Dalton B. Cholangiography during laparoscopic cholecystectomy. Cumulative summation analysis of an institutional learning curve. J Gastrointest Surg. 1999;3:185-188.
9. de Oliveira Filho GR. The construction of learning curves for basic skills in anesthetic procedures: an application for the cumulative summation analysis. Anesth Analg. 2002;95:411-416.