Systematic review of coaching to enhance surgeons' operative performance

ARTICLE IN PRESS

Hyeyoun Min, MD,a Dianali Rivera Morales, MS,b Dennis Orgill, MD, PhD,b,c Douglas S. Smink, MD,b,c,d and Steven Yule, PhD,b,c,d Seattle, WA; and Boston, MA

Background. There is increasing attention on the coaching of surgeons and trainees to improve performance but no comprehensive review on this topic. The purpose of this review is to summarize the quantity and the quality of studies involving surgical coaching methods and their effectiveness.

Methods. We performed a systematic literature search through PubMed and PsycINFO by using predefined inclusion criteria. Evidence for main outcome categories was evaluated with the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) system and the Medical Education Research Study Quality Instrument (MERSQI).

Results. Of a total of 3,063 articles, 23 met our inclusion criteria: 4 randomized controlled trials and 19 observational studies. We categorized the articles into 4 groups on the basis of the outcome studied: perception, attitude, and opinion; technical skills; nontechnical skills; and performance measures. Overall strength of evidence for each outcome group was as follows: perception, attitude, and opinion (GRADE: Very Low; MERSQI: 10); technical skills (randomized controlled trials: High, 13.1; observational studies: Very Low, 11.5); nontechnical skills (Very Low, 12.4); and performance measures (Very Low, 13.6). Simulation was the most used setting for coaching; more than half of the studies deployed an experienced surgeon as a coach and showed that coaching was effective.

Conclusion. Surgical coaching interventions have a positive impact on learners’ perception and attitudes, their technical and nontechnical skills, and performance measures. Evidence of impact on patient outcomes was limited, and the quality of research studies was variable. Despite this, our systematic review of different coaching interventions will benefit future coaching strategies and implementation to enhance operative performance. (Surgery 2015;j:j-j.)
From the Department of Surgery,a University of Washington Medicine, Seattle, WA; Center for Surgery and Public Health,b Brigham & Women’s Hospital, Boston, MA; Harvard Medical School,c Boston, MA; and STRATUS Center for Medical Simulation,d Brigham & Women’s Hospital, Boston, MA

Funded by the Harvard Milton Fund. Accepted for publication March 14, 2015. Reprint requests: Steven Yule, PhD, STRATUS Center for Medical Simulation, Brigham & Women’s Hospital, 75 Francis Street, Neville House, Boston, MA 02115. E-mail: syule@partners.org. 0039-6060/$ - see front matter © 2015 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.surg.2015.03.007

COACHING IS A WELL-ACCEPTED EDUCATIONAL METHOD in fields such as sports, music, and business. Although there recently has been a great interest in coaching applied to surgery, the need for surgical coaching, in our experience, has been far from universally accepted. The purpose of this paper is to critically review the best papers in surgical coaching, to identify the level of evidence supporting surgical coaching, and to describe areas of investigation in which the level of evidence could be improved. With increased focus on improving quality of care in surgery, surgical coaching may provide a viable mechanism for both technical and nontechnical skill (NTS) improvements in surgery. For decades, surgical training has followed the apprenticeship model. Considered a type of coach, a surgeon–teacher taught and assessed an individual trainee’s surgical skills. However, this method has been criticized because the traditional assessment of surgical skills is commonly associated with competency determination based on inadequate metrics.1,2 Moreover, the traditional apprenticeship model focuses on technical skills at an individual level without much emphasis on NTS. This is now changing as a number of groups, including the Accreditation Council for Graduate Medical Education, the American College of Surgeons,
and the Surgical Council on Resident Education, have included NTS as part of the required core competencies3 and training curricula for surgeons. This inclusion is based on increasing evidence suggesting that team training interventions in NTS reduce communication failures4 and yield measurable decreases in surgical morbidity and mortality.5,6 Apart from teaching trainees, there is currently a lack of coaching for practicing surgeons and thus potential benefit in assessing performance improvement.7,8 Hu et al9 showed that video-based coaching was valuable for surgeons at all stages of their career. Coaching as a method of enhancing performance is not a new phenomenon and is in fact commonly encountered in many other professional fields such as sports, music, education, and business.8,10 Regardless of the level of expertise of the person being coached, some experts argue that coaching in surgery is necessary because surgeons require deliberate practice to master tasks.11 A critical component of achieving this mastery is constructive feedback provided by an expert coach to mediate self-directed development.12 Coaches may behave differently depending on whom they are coaching. For example, a coach may act more as a partner and a collaborator for practicing surgeons and more as a teacher and an instructor for trainees. Tailoring the style of coaching would allow trainees a smoother transition into independent practice and allow practicing surgeons to reach and/or maintain expertise.7 As professional surgical societies begin to recognize the need for surgical coaching at all levels, we will need to expand beyond the traditional apprenticeship paradigm to fit today’s surgical culture and needs. We found an increasing number of studies seeking to present different coaching methods in surgery but found no published systematic review of the coaching methods employed. 
The purpose of this review is to summarize the quantity and the quality of studies that implement coaching methods to enhance surgical performance in both surgeons and surgical trainees. We sought to determine the main outcomes and strength of evidence for each intervention in order to provide a reference for the development of impactful coaching strategies to improve valued skills and enhance safe practices in the operating room.

METHODS

Search strategy. A systematic literature search was conducted using the databases PubMed (1809
to 11/18/2013; records are selective from 1809 to 1965 and comprehensive from 1966 to present) and PsycINFO (1597 to 11/18/2013; comprehensive coverage starts from the 1880s). Search terms “coaching,” “mentoring,” “debriefing,” “non-technical skills,” “leadership,” “decision making,” “situation awareness,” “learning,” “communication,” “teamwork,” “technique,” “technical skills,” “performance,” “review,” and “improvement” were linked with the medical subject heading “surgery” using the Boolean operator AND. At the initial search stage, no restrictions were applied, so as to retrieve a comprehensive set of articles. In addition to these database searches, a search by hand for articles on coaching was conducted based on the references from recent articles and the contents pages of specific journals. Two authors (H.M. and D.R.M.) identified the relevant articles for full-text review by reviewing the titles and abstracts and reaching a mutual consensus.

Definition of coaching. For the purpose of the literature search and data extraction, we defined coaching as “a form of inquiry-based learning characterized by collaboration between individuals or groups and more accomplished peers.”13

Inclusion/exclusion criteria. Studies were included in the review if they involved coaching of surgeons and/or surgical trainees in the operating room or simulated operating room. Only original articles published in the English language in peer-reviewed journals were included. We included original research, review, or commentary articles. Studies were excluded if they deviated overtly from the study topic or if the study group included only nonsurgical health professionals, had no surgical intervention, had no measured outcome, or had the sole outcome of knowledge or participant satisfaction.

Data extraction and synthesis. Three authors (H.M., D.R.M., S.Y.) 
independently reviewed the full texts of the relevant articles in a systematic fashion by using a predetermined data extraction form developed for this review. The fields on the form included author, year, country, target group, study design, study format, study content, learner assessment, coach assessment, outcome, main findings, coaching target, and intervention timing (data available upon request).

Data analysis and grading of evidence. Three of the authors (H.M., D.R.M., S.Y.) independently assessed the quality of the extracted studies by using 2 different modes of evaluation: the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system14,15 and the Medical Education Research Study Quality Instrument
(MERSQI).16 GRADE is an approach to evaluating the quality of evidence and strength of recommendation from research studies based on critiquing the likelihood of bias, inconsistency of results, and indirectness of evidence. Using GRADE, we are able to categorize the evidence provided by thematic groups of studies as high, moderate, low, or very low. The MERSQI is a 10-item tool specifically developed to evaluate educational studies; it assesses the study design type, sampling, type of data, validity of the evaluation instrument, data analysis, and outcomes. The MERSQI score is graded out of a total of 18 points. For studies that had Not Applicable items, we adjusted the total score based on the maximum points available. These are not perfect metric systems but are the best available to systematically compare articles with differing study designs using standard criteria. The degree to which the original authors of the manuscripts reviewed here described the reliability and validity of their metrics was taken into account when calculating MERSQI and GRADE scores. Greater scores generally indicate that a study is more reliable and valid, but these scores cannot be taken as absolute indicators of the validity or reliability of research results because the grading focuses on the method rather than the outcome of the studies. For example, novel clinical, educational, or translational research may make substantial scientific contributions and add to knowledge but will almost always score lower than replication studies that employ randomized controlled designs. Any discrepancies in grading were resolved by group consensus: all discrepancies were reviewed by the 3 assessors, and each point was discussed against GRADE and MERSQI criteria. Any particular score in dispute was adjusted after the authors reviewed the article together and consulted the definition of the particular scoring rubric. 
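The adjustment for Not Applicable items can be illustrated with a short calculation. The sketch below assumes that the adjustment expresses the points earned as a fraction of the maximum points available for the applicable items and rescales that fraction to the 18-point scale; this rule, the function name `adjusted_mersqi`, and the example values are illustrative assumptions and are not taken from any of the reviewed studies.

```python
MERSQI_MAX = 18.0  # total points on the full MERSQI scale


def adjusted_mersqi(points_earned: float, points_available: float) -> float:
    """Rescale a MERSQI score to the 18-point scale.

    points_available is the maximum score attainable once Not Applicable
    items are excluded (assumed rescaling rule; hypothetical helper).
    """
    if not 0 < points_available <= MERSQI_MAX:
        raise ValueError("points_available must be in (0, 18]")
    if not 0 <= points_earned <= points_available:
        raise ValueError("points_earned must be in [0, points_available]")
    return points_earned / points_available * MERSQI_MAX


# Hypothetical study: 10 points earned, with 3 points' worth of items Not Applicable.
print(round(adjusted_mersqi(10, 15), 1))  # 12.0
```

Under this assumption, a study is neither rewarded nor penalized for items that could not apply to its design, which keeps scores comparable across the differing study designs in the review.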
Common disagreements resolved with this method concerned response rates (it was possible to calculate these from data presented in tables, changing the classification from “not reported” to “reported”) and complexity of analysis (agreeing that any paper with inferential statistics, a P value, or a confidence interval be classified as “beyond descriptive analysis”). This led us to adopt stricter definitions for MERSQI categories to guide our systematic analysis.

RESULTS

A flow diagram of the search results is illustrated in Fig. The initial search yielded a total of 3,063 articles. After duplicates and non-English articles
were removed, 2,306 articles remained. Once the aforementioned inclusion/exclusion criteria were applied, 58 articles were selected for full-text review. After this, a total of 23 articles were selected for analysis.6,9,17-37

Design of studies and main outcome themes. Table I presents an overview of the 23 articles reviewed (ie, target group, total n, study design, coaching target, risk of bias, comments). Four studies22,26,27,32 were randomized controlled trials (RCTs) that compared the intervention group with a control group. Eighteen studies6,17-21,23-25,28-31,33-37 were observational studies using a single-group pre-post design, and one study was a descriptive study analyzing audio-recorded video coaching.9 The different types of reported outcomes were categorized largely based on Kirkpatrick’s model of training evaluation.38 This model is used widely but is relatively novel in surgery. It is a flexible framework that can be applied to almost any training intervention, and its strength is in differentiating between 4 distinct hierarchical levels of outcomes that often are complementary: reactions (affective responses, attitudes, and perceptions of training), learning (whether the training resulted in an increase in knowledge or skills), behavior (whether participants change their behavior in the workplace as a result of training), and results (whether the training has affected process or outcomes). Many articles in this review concurrently addressed technical skills and behavioral outcomes, commonly termed nontechnical skills (NTS). In response, we adapted the Kirkpatrick model to include this terminology. The final outcome categories were therefore (1) perception, attitudes, and opinions; (2) technical skills; (3) nontechnical skills; and (4) performance measures. These groups were analyzed subsequently using the GRADE system in Table II. Six studies9,20,25,28,29,31 reported on more than one outcome and were included in more than one outcome summary. 
Because many of the observational studies yielded a Very Low GRADE score in each outcome cohort shown in Table II, we used the MERSQI score to better differentiate the quality of evidence of these studies (see Table I). The mean MERSQI score for the articles we reviewed was 12.16, greater than the available benchmark of 9.83 from 12 journals.39 We also extracted information on setting, coach, duration, follow-up, and effectiveness of the coaching intervention (Appendix). Eleven studies used simulation either on its own or in combination with another setting to conduct the coaching intervention.19,20,22,27,29-32,34,35,37 Thirteen of the studies deployed an experienced surgeon as coach either on his or her own or in combination with a nonsurgeon.9,17-19,24-26,29,32-34,36,37 Only 4 studies conducted a follow-up after the coaching intervention.6,25,27,32 Overall, coaching was considered to be effective in 15 of the 23 studies.6,17-19,22-25,27-29,32,35-37

Perception, attitudes, and opinions. The first level of the Kirkpatrick model of training evaluation consists of the learner’s reaction to the intervention. Seven studies included participants’ perception of and attitude toward coaching interventions and participants’ opinions on their confidence and stress level after being coached.9,17,25,29,31,34,37 On the basis of the GRADE evaluation, this outcome category received a Very Low strength of evidence and a mean MERSQI score of 10, the lowest mean of the 4 outcome categories. The main factors driving this relatively low score were selection bias (voluntary enrollment), nonblinded observers, and self-assessed outcome measures. Most of the studies had additional measured outcomes described in subsequent categories.9,17,25,29,31 Two studies included learner’s confidence37 or trainee’s perception of the mentoring quality34 as their sole study outcome.

Fig. Search and exclusion algorithm in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) format.

Table I. Overview of studies in the systematic review and MERSQI score

| Author | Target group | Total n | Study design | Coaching target | Risk of bias | Comments | MERSQI |
|---|---|---|---|---|---|---|---|
| Ahmed et al17 | Surgeons and surgical residents | 52 (22 trainers, 30 trainees) | Pre-post, no control group | Uncertain | High | Observers not blinded; single-institution study | 12 |
| Birch et al18 | Surgeons | 7 | Pre-post, no control group | Individual | High | Possible bias from the differences in the baseline characteristics of the patients and other factors that were not accounted for | 13.8 |
| Briet et al19 | Gynecologists | 11 | Pre-post, no control group | Individual | Uncertain | Observers not blinded; observation conducted by trainers; use of a validated assessment tool | 14 |
| Capella et al20 | Trauma team personnel | 6 surgeons, 28 surgery residents, 80 nurses | Pre-post, no control group | Team | High | Observers not blinded; convenience sampling of the resuscitation events; variation in the operating room team members before and after the study; Hawthorne effect possible | 13.5 |
| Catchpole et al21 | Surgical teams | Three different sites | Pre-post, no control group | Team | High | Different length of training at each site; different specialty cases per each site; observers not blinded; variation in the operating room team members before and after the study; possible selection bias | 14 |
| Cole et al22 | Medical students | 17 | Randomized controlled trial | Individual | Low | Small sample size; use of validated assessment tool | 14.5 |
| Culig et al23 | Management team | Unspecified | Pre-post, has control group | Team | High | Possible bias from unaccounted confounders; no statistical analysis done for the baseline demographics to demonstrate lack of significant difference | 11.4 |
| Goldman et al24 | Surgical residents | 9 (3 in 3 teams) | Pre-post, self-selected control group | Team | High | Observers not blinded; groups were self-selected | 9.5 |
| Halverson et al25 | Surgical personnel | 1,150 | Pre-post, no control group | Team | High | Observers not blinded; self-assessment of participants | 11.5 |
| Hamad et al26 | Surgical residents | 24 | Nonrandomized 2-group study* | Individual | Uncertain | Lack of randomization; small sample size; observers not blinded; observation conducted by trainers | 12 |
| Hu et al9 | Surgeons | 4 | Observational/descriptive | Individual | High | Small sample size; no numerical outcome | 10 |
| Kirkpatrick27 | Surgical residents | 17 | Randomized controlled trial | Team | Low | Small sample size; coaching effect captured in the outcome was intraprocedural feedback but not debriefing | 11.5 |
| McCulloch et al28 | Surgical personnel | 54 | Pre-post, no control group | Team | High | Observers not blinded; variation in the operating room team members before and after the study | 15 |
| Mueller et al29 | Medical students | 22 | Pre-post, no control group | Team | High | Selection bias (voluntary enrollment); observers not blinded; self-assessment of participants for stress and confidence | 10 |
| Neily et al6 | VHA Facilities | 74 (facilities) | Pre-post, has control group | Team | Uncertain | Researchers blind to the study hypothesis; control group nonrandomly selected; possible bias from unaccounted factors that could reduce operative mortality despite propensity score matching | 15.6 |
| Nurok et al30 | Surgical personnel | 50 | Pre-post, no control group | Team | High | Observers not blinded; few baseline operating room information significantly different for pre-post interventions | 14 |
| Peckler et al31 | Surgical interns | 41 | Pre-post, no control group | Team | High | Observers not blinded; not an outcome-based study | 9.5 |
| Porte et al32 | Medical students | 45 | Randomized controlled trial | Individual | Low | High interrater correlation; use of a validated assessment tool | 14.5 |
| Schlachta et al33 | Surgeons | 2 | Pre-post, no control group | Individual | High | Possible bias from unaccounted confounders | 13.2 |
| Sereno et al34 | Surgeons | 40 | Pre-post, no control group | Individual | High | Self-assessment of participants | 7 |
| Subramonian et al35 | Medical students | 13 | Pre-post, no control group | Uncertain | High | Observers not blinded; uncertain who the coach was, what the coaching involved. Set up as a noninferiority study where the main question is the learning curve for 2 different procedures that both have the same style o | 10.5 |
| Weaver et al36 | Surgeons | 3 | Pre-post, no control group | Individual | High | Possible bias from unaccounted confounders; set up similar to a noninferiority design | 12.6 |
| Zimmerman et al37 | Surgical residents | 36 | Pre-post, no control group | Team | High | Self-assessment of participants | 10 |

*To facilitate analysis, the nonrandomized 2-group study was categorized along with the other randomized controlled studies. MERSQI, Medical Education Research Study Quality Instrument.

Table II. Strength of body of evidence using GRADE and MERSQI, organized by main outcomes

| Outcome | Number of studies | Risk of bias | Inconsistency | Indirectness | Publication bias | Imprecision | Strength of body (GRADE) | MERSQI score |
|---|---|---|---|---|---|---|---|---|
| Perception, attitudes and opinions | 7 Observational9,17,25,29,31,34,37 | Serious | Not serious | Not serious | Not detected | Not serious | Very low | 10 |
| Technical skills | 4 RCTs22,26,27,32 | Not serious | Not serious | Not serious | Not detected | Not serious | High | 13.1 |
| Technical skills | 6 Observational9,19,24,28,29,35 | Serious | Not serious | Not serious | Not detected | Serious | Very low | 11.5 |
| Nontechnical skills | 8 Observational17,20,21,25,28-31 | Serious | Inconclusive | Not serious | Not detected | Not serious | Very low | 12.4 |
| Performance measures | 7 Observational6,18,20,23,28,33,36 | Serious | Not serious | Not serious | Not detected | Not serious | Very low | 13.6 |

GRADE, Grading of Recommendations Assessment, Development and Evaluation; MERSQI, Medical Education Research Study Quality Instrument; RCTs, randomized controlled trials.

The setting of the coaching interventions varied between simulation29,31,34,37 and real-time or videotaped procedures in the operating room.9,17,28 Six studies specified the coach credentials as an expert surgeon.9,17,25,29,34,37 In technical skill coaching sessions, the reaction was positive: trainees reported increased confidence and comfort with laparoscopic procedures,37 and trainers also found intraoperative coaching and debriefing valuable.9 In settings involving NTS, participants found debriefing to be useful17 and reported increased confidence and decreased stress after the coaching sessions.29

Technical skills. Ten studies reported outcomes related to technical skills: 4 RCTs22,26,27,32 and 6 observational studies.9,19,24,28,29,35 Outcomes measured included technical errors, suture/knot tying time and skills, number of operative movements, path length, field exposure, progression of the operation, and inadequate motion,19,22,24,26-29,32,35 with 2 studies using a validated assessment tool.19,28 The 4 RCTs yielded a High strength of evidence according to GRADE, with a mean MERSQI score of 13.1, whereas the 6 observational studies yielded a Very Low grading and a lower mean MERSQI score of 11.5. The setting of these studies varied between simulations,19,22,27,29,32,35 real-time operating rooms,19,28 and video-review sessions.9,24,26 Five studies9,19,22,26,32 focused on individual coaching, and 4 studies focused on team coaching.24,27-29 Interventions varied in duration, ranging from one-time sessions to repeated sessions over weeks. Six of the 10 studies directly stated the credentials of the coach.9,19,24,26,29,32 One particular study designed for NTS training resulted in a decrease in technical error as well.28 Three of the RCTs22,27,32 demonstrated an overall positive performance improvement in the coached individuals, whereas the fourth, with a focus on the impact of debriefing after intraoperative cases, failed to demonstrate any significant difference in knot-tying time, minor errors, and anastomotic time.26 Five observational studies also showed a general improvement in technical skills after the intervention.19,24,28,29,35 Two studies attempted to assess sustainability of the improved skills by conducting follow-up evaluation, with varying results.27,32 The study with a 1-month follow-up32 showed sustained improvement of skills on delayed performance testing, whereas the study with a 4-month follow-up showed no significant difference between the coached and noncoached groups.27

Nontechnical skills. NTS relevant for surgeons refer to a range of behavioral items, such as situation awareness, decision making, communication, teamwork, and leadership.40 Eight studies focused on evaluating the impact of coaching on NTS improvement17,20,21,25,28-31 and were appraised to have a Very Low strength of evidence according to GRADE but a relatively greater mean MERSQI score of 12.4. 
All studies were team-level interventions, and half of these studies took place in a simulation setting.20,29-31 The other half took place in operating rooms during real operations.17,21,25,28 This outcome category included several studies that employed nonsurgeons as coaches.20,21,25,28,30 Only 2 studies collected follow-up data, which were collected 3–6 months after the implementation period.25,30 Five of the eight studies demonstrated that the coaching interventions were successful, as they were associated with greater nontechnical skills assessment scores and surgical situation awareness,28 improved debriefing quality within the team,17 improved NTS and performance under stress,29 and significant improvement of team performance measured by the Trauma Team Performance Observation Tool.20 In contrast, 3 other studies showed inconclusive results.21,30,31 Catchpole
et al21 attempted to measure the effect of aviation-style team training on 3 different surgical subspecialties focusing on NTS and found that 1 site demonstrated an improvement after the coaching intervention, whereas the other 2 sites showed either no change or a worsening effect. Nurok et al30 showed a significant improvement of communication and team skills immediately after the coaching intervention that was no longer significant when reassessed at a 3-month follow-up.

Performance measures. Seven studies with coaching interventions collected patient outcomes, such as mortality, morbidity, complications, hospital duration of stay, wound complications, and conversion rates, which were collectively grouped as Performance Measures in our review.6,18,20,23,28,33,36 These studies demonstrated a Very Low strength of evidence according to GRADE with a mean MERSQI score of 13.6, greater than that of the RCTs included under Technical Skills. Most of the studies involved coaching sessions in real-time operating rooms with some classroom sessions.6,18,23,28,33,36 The duration of the coaching intervention varied from 1-day sessions6,20 up to repeated sessions over several months to 2 years.18,23,28 Coaches included expert surgeons coaching real-time operations,18,33,36 a team of a nonsurgeon and a surgeon observing trauma team performance,20 and a non-health care professional coach during an NTS training session.28 Five of the studies demonstrated improvement in patient outcomes after coaching interventions,6,18,23,28,36 whereas the remaining two had inconclusive results.20,33

DISCUSSION

Surgical coaching supports deliberate practice and skill acquisition at all levels of training and practice. As this concept continues to receive attention,7,8 it is necessary to reflect on the new paradigm of surgical coaching as it pertains to the current surgical culture and needs. 
Therefore, we systematically reviewed coaching interventions in surgery to help evaluate and summarize key learning concepts that are currently available. Although variable in their quality based on GRADE and MERSQI scores, the studies reviewed provide promise that coaching can improve surgeon performance in a number of realms. The majority of these studies were field trials aimed at real-world applications; as a result some facets of research design such as strict experimental control and blinding of observers were not possible. Given the variable study designs and outcomes, the studies were appraised after being grouped into four themes based on the Kirkpatrick model.
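The grouping of studies into outcome themes and the resulting mean MERSQI scores can be checked with a short calculation. The sketch below is illustrative only: it assumes the per-study scores as transcribed from Table I and the group memberships as listed in Table II, keyed by reference number, and the variable names are hypothetical.

```python
from statistics import mean

# MERSQI scores by reference number, as transcribed from Table I (assumed reading).
mersqi = {
    6: 15.6, 9: 10, 17: 12, 18: 13.8, 19: 14, 20: 13.5, 21: 14, 22: 14.5,
    23: 11.4, 24: 9.5, 25: 11.5, 26: 12, 27: 11.5, 28: 15, 29: 10, 30: 14,
    31: 9.5, 32: 14.5, 33: 13.2, 34: 7, 35: 10.5, 36: 12.6, 37: 10,
}

# Outcome groups by reference number, as listed in Table II.
# A study that reports several outcomes appears in several groups.
groups = {
    "Perception, attitudes, and opinions": [9, 17, 25, 29, 31, 34, 37],
    "Technical skills (RCTs)": [22, 26, 27, 32],
    "Technical skills (observational)": [9, 19, 24, 28, 29, 35],
    "Nontechnical skills": [17, 20, 21, 25, 28, 29, 30, 31],
    "Performance measures": [6, 18, 20, 23, 28, 33, 36],
}

# Mean MERSQI score per outcome group and across all reviewed studies.
group_means = {
    name: mean(mersqi[ref] for ref in refs) for name, refs in groups.items()
}
overall_mean = mean(mersqi.values())

for name, value in group_means.items():
    print(f"{name}: {value:.1f}")
print(f"All {len(mersqi)} studies: {overall_mean:.2f}")
```

Because some studies contribute to more than one theme, the group means are not independent of one another, which is worth keeping in mind when comparing them.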

Perception, attitudes, and opinions. Despite being the weakest evidence of an effective educational intervention, learners’ perception, attitudes, and opinions are valuable as they allow greater insight into the feasibility of different coaching methods. Our review demonstrated that the papers in this category had a Very Low strength of evidence according to GRADE, which was largely because of the increased risk of bias resulting from a lack of control groups and the nature of the outcomes requiring self-assessments. One study focusing on technical skills received industry funding; therefore, we were not able to rule out publication bias.37 This outcome group scored low on evidence quality according to GRADE because of the lack of RCTs, but the studies had other redeeming features in terms of innovative study design and potential impact. This exemplifies why we decided to further analyze our selected articles using the MERSQI approach. The MERSQI score allowed us to more finely differentiate the quality of the papers. Of the 7 studies, those assessing additional outcomes such as skills and behavior in addition to perception and attitude received greater MERSQI scores.4,9,17,29,31 Ahmed et al17 received a MERSQI score of 12, the greatest MERSQI score in this category and the 8th greatest MERSQI score of all 23 articles. The relatively high MERSQI score can be attributed to its additional outcome measurement of NTS implicit in debriefing, as well as its use of a validated assessment tool and strong interrater reliability. This study demonstrated that evidence-based interventions shown to improve debriefing translate to the real operating room setting and that objective scores correlated with learners’ perception of debriefing. Furthermore, the study by Halverson et al,25 with a MERSQI score of 11.5, presented both trainees’ perception of teamwork and behavior outcomes. 
In their study, surgeons had the greatest perception of teamwork rating at baseline compared with nurses and anesthesiologists, which supports previous studies demonstrating the differences in perceptions of team-based sessions.41,42 Such findings have important implications for future interventions, as many of the coaching interventions designate surgeons as “the leader” in operating rooms. At the lower end of the spectrum of MERSQI scores was the study by Sereno et al,34 with a MERSQI score of 7, which examined trainee perception of remote robotic telementoring versus onsite mentoring. The study carried a serious risk of bias because of the lack of a control group and self-assessment by participants, with an inconclusive finding that there was no
significant difference in participants’ perceptions between groups. While being cognizant of the risk of serious bias in these studies, we can extract that learner perception, attitudes, and/or opinions generally are positive towards coaching interventions and may be particularly useful in obtaining insight into individual interactions involving team-based coaching or skills, which is particularly relevant in coaching interventions associated with NTS. Attitudes and perceptions may also be beneficial in understanding the context of inconsistent team-based findings. We conclude that these subjective outcomes are recommended as a supplement to study designs measuring concrete skills or behaviors but not as the sole outcome in a given study.

Technical skills. Technical skills remain essential for surgeons in the fast-evolving surgical field. Implementation of new technology for practicing surgeons and the long-term changes in legal and ethical factors in health care leave residents potentially less prepared for practice than their predecessors.43 Therefore, effective learning of technical skills is necessary. Table I indicates whether studies focused on surgical residents or practicing surgeons. The 4 randomized controlled studies used control groups and presented a strong quality of evidence reflected in their GRADE and MERSQI scores. Despite small sample sizes, some studies were exemplary in that they were able to randomize the participants.22,27,32 Some even took into consideration, in the pre-intervention analysis, the effect that possible biases, such as interest in the specific field of the performed operation, could have on the measured outcome.27 This is in contrast to the observational studies in this category that used a pre-post study design.19,24,28,29,35 Risk of bias was high in these studies, as there was no way to decipher possible testing and/or instrumentation threats without a control group. 
In other words, a pre-post design does not adequately answer whether improved technical skills result from the intervention or from other factors. Other possible biases include observations being conducted by the trainers/coaches,19 nonblinded observers,24,29 and selection bias due to voluntary enrollment of the participants.29 The use of a simulation setting was common in the reviewed studies22,27,32 because it provides a safe practice environment. In this setting, measurable outcomes in technical skills that seemed particularly useful and sensitive to coaching interventions were the quality of performance and error rates,22 which other studies have suggested to be the most valuable metric for evaluating skills acquisition during simulation.44 In the intraoperative setting, cases were videotaped for later discussion during debriefing.26 Previous evidence has shown videos to be valuable in allowing self-reflection45 and delivery of customized feedback9,46,47 in a convenient, timely fashion. The current evidence allows us to conclude that coaching interventions comprising lecture, concurrent feedback, and debriefing lead to significant improvement in technical skills,27 whereas debriefing on its own may not be sufficient.26 With respect to the credentials of a coach, computerized external feedback cannot replace an experienced surgeon, whose feedback contributes to greater rates of skill retention.32 The most effective timing of a coaching session is noted to be either concurrent with22,27 or immediately after a surgical procedure.26,27,32 Notably, coaching interventions seem to affect error rates22 rather than generic skills such as economy of movement.

Nontechnical skills. Approximately 10% of hospital patients are unintentionally harmed by modern health care,48,49 where gaps in effective communication and teamwork,50-52 as well as organizational complexity, keep patients particularly vulnerable. Of these, surgical patients are at particularly high risk,49 because the work environment is perceived to be stressful53 and the odds of communication and awareness failures leading to patient harm are high.54 These NTS are critical to the success of operations and the safety of patients. To develop these skills, coaching and simulation-based training have been shown to be useful in other performance-based industries.55 We reviewed 8 articles, of which 6 demonstrated a positive impact of coaching interventions on NTS.
The reviewed studies had a low GRADE rating because of the serious risk of bias arising from the lack of control groups, nonblinded observers in most cases, and variation in the makeup of operating room team members in the pre-post observational design. However, this outcome category had a relatively high mean MERSQI score compared with the other observational studies, so the overall quality of evidence was good. The greater scores are in part attributable to the use of various validated assessment tools, such as the Oxford Non-Technical Skills (NOTECHS) scoring system and the 360-degree evaluation of 5 nontechnical criteria. Both surgeon and nonsurgeon coaches were used to coach NTS with positive, albeit short-term, results.

We were unable to determine the long-term impact of coaching in several of the studies in this category. In Catchpole et al,21 the disparate results seem partly related to the enthusiasm of the senior professional leadership at each of the 3 sites, as well as to a limitation of the study design: the pre- and postintervention groups differed and included members who had not undergone the coaching intervention. Furthermore, data on sustainable behavior changes after intervention were lacking. Most studies did not assess long-term retention, and the one that did showed a nonsignificant difference in NTS 3 months after intervention.30 It is also unclear whether assessing behavioral intention instead of actual change is particularly helpful in assessing coaching interventions.31 We note that longitudinal follow-up studies of clinicians’ behavior are extremely difficult to orchestrate in real-world settings, so these initial studies provide a valuable base on which to build. Even though the strength of evidence for NTS improvement after coaching interventions was low, we observed that, for NTS, nonsurgeons seem to be effective coaches who can be used either as an alternative to or in addition to a surgeon coach, and that in the majority of cases coaching had a positive impact on NTS. However, the long-term effect of NTS coaching remains unclear.

Performance measures. For all interventions, we strive to achieve improved patient or health care outcomes, which the Kirkpatrick model names as the gold standard. Seven of the studies included in this review reported such outcomes, which we grouped under Performance Measures.6,18,20,23,28,33,36 It is interesting to note that these papers received markedly different ratings from the 2 grading criteria used: Very Low strength of evidence per GRADE, but the highest mean MERSQI score, greater even than that of the RCTs previously described in the Technical Skills section.
The low GRADE rating is largely due to the serious risk of bias: the 7 papers simply report associations between final outcomes and the coaching intervention and do not take into consideration other serious confounders, such as patient characteristics and numerous preoperative, intraoperative, and postoperative risk factors. This flaw, however, is not adequately detected by the MERSQI grading criteria, under which studies with patient/health care outcomes receive up to 3 points, substantially inflating the total MERSQI score for these studies despite the serious risk of bias. That said, a broad conclusion can be drawn from these studies. Coaching is a safe and effective way to teach and mentor other surgeons or surgical trainees to improve learning and intraoperative performance. Direct impact on patient care and outcomes has not yet been robustly demonstrated, but we observed that in most studies outcomes did improve after the coaching intervention.

Future implications. This systematic review has revealed evidence of varying quality that suggests the usefulness of coaching interventions for surgeons and surgical trainees. Most studies have shown a positive impact of coaching interventions on learners’ perceptions and attitudes, technical skills, NTS, and performance measures; a few, however, revealed inconsistent findings. These findings justify expansion of surgical coaching programs by surgical educators, albeit with a research focus to advance our knowledge of the impact of coaching on surgical performance. These interventions will likely be directed at surgical trainees as well as practicing surgeons. Surgeons in practice have the greatest potential for benefit, as they currently receive little instruction on technical skills and NTS once in surgical practice. Ensuring that feedback is constructive and objective is important, as some may feel threatened by being subject to more scrutiny even after completing a long surgical training program. Future research should address the gaps identified in the current evidence by striving to conduct more RCTs, as most of the current studies used a pre-post design. If resources are limited, providing a nonrandomized control group would help address possible time-related effects on the outcomes, an inherent flaw of the pre-post study design. Furthermore, our review has revealed a paucity of studies that assess skill retention at a delayed time point after the coaching intervention.
With improved study designs that offer stronger evidence, we will obtain more concrete answers with which to target the most impactful coaching interventions for surgeons. Specific questions pertaining to the ideal content of coaching, coach credentials, individual versus team coaching, and timing of coaching have been raised in this review and will need to be addressed further when more evidence is available. A new framework to support systematic statewide coaching to help surgeons improve technical skills and NTS was recently developed and provides a potential foundation for future coaching interventions.56 Evaluation of these coaching techniques in a rapid implementation trial to generate data on feasibility and effectiveness is a crucial next step.

REFERENCES
1. Sachdeva AK. Acquiring skills in new procedures and technology: the challenge and the opportunity. Arch Surg 2005;140:387-9.
2. Bass BL, Polk HC, Jones RS, Townsend CM, Whittemore AD, Pellegrini CA, et al. Surgical privileging and credentialing: a report of a discussion and study group of the American Surgical Association. J Am Coll Surg 2009;209:396-404.
3. American College of Surgeons ACS/APDS surgical skills curriculum for residents, phase 3: team-based skills. Available from: http://www.facs.org/education/surgicalskills.html.
4. Halverson AL, Casey JT, Andersson J, Anderson K, Park C, Rademaker AW, et al. Communication failure in the operating room. Surgery 2011;149:305-10.
5. Young-Xu Y, Neily J, Mills PD, Carney BT, West P, Berger DH, et al. Association between implementation of a medical team training program and surgical morbidity. Arch Surg 2011;146:1368-73.
6. Neily J, Mills PD, Young-Xu Y, Carney BT, West P, Berger DH, et al. Association between implementation of a medical team training program and surgical mortality. JAMA 2010;304:1693-700.
7. Greenberg C. Surgical coaching: An idea whose time has come. ACS Surgery News [Internet]. Parsippany, NJ: Frontline Medical Communications Inc.; 2012. Available from: http://www.acssurgerynews.com/opinions/editorials/singlearticle/surgical-coaching-an-idea-whose-time-has-come/b2e6e8e538fd87e5414c4ff9cb4f74a5.html. Accessed May 17, 2014.
8. Gawande A. Personal Best: Top athletes and singers have coaches. Should you? New York, NY: The New Yorker, Conde Nast; October 3, 2011.
9. Hu YY, Peyre SE, Arriaga AF, Osteen RT, Corso KA, Weiser TG, et al. Postgame analysis: using video-based coaching for continuous professional development. J Am Coll Surg 2012;214:115-24.
10. Bush RN. Making Our Schools More Effective: Proceedings of Three State Conferences. San Francisco, CA: Far West Laboratory for Educational Research and Development; 1984.
11. Reznick RK, MacRae H. Teaching surgical skills---changes in the wind. N Engl J Med 2006;355:2664-9.
12. Ericsson KA. Deliberate practice and acquisition of expert performance: a general overview. Acad Emerg Med 2008;15:988-94.
13. Poglinco SM, Bach AJ, Hovde K, Rosenblum S, Saunders M, Supovitz JA. The Heart of the Matter: The Coaching Model in America’s Choice Schools. Philadelphia, PA: Consortium for Policy Research in Education [Internet]; 2003. Available from: http://files.eric.ed.gov/fulltext/ED498335.pdf.
14. Guyatt GH, Oxman AD, Kunz R, Vist GE, Falck-Ytter Y, Schunemann HJ, et al. What is “quality of evidence” and why is it important to clinicians? BMJ 2008;336:995-8.
15. Guyatt GH, Oxman AD, Vist GE, Kunz R, Falck-Ytter Y, Alonso-Coello P, et al. GRADE: an emerging consensus on rating quality of evidence and strength of recommendations. BMJ 2008;336:924-6.
16. Reed DA, Cook DA, Beckman TJ, Levine RB, Kern DE, Wright SM. Association between funding and quality of published medical education research. JAMA 2007;298:1002-9.
17. Ahmed M, Arora S, Russ S, Darzi A, Vincent C, Sevdalis N. Operation debrief: a SHARP improvement in performance feedback in the operating room. Ann Surg 2013;258:958-63.

18. Birch DW, Asiri AH, de Gara CJ. The impact of a formal mentoring program for minimally invasive surgery on surgeon practice and patient outcomes. Am J Surg 2007;193:589-91; discussion 91-2.
19. Briet JM, Mourits MJ, Kenkhuis MJ, van der Zee AG, de Bock GH, Arts HJ. Implementing an advanced laparoscopic procedure by monitoring with a visiting surgeon. J Minim Invasive Gynecol 2010;17:771-8.
20. Capella J, Smith S, Philp A, Putnam T, Gilbert C, Fry W, et al. Teamwork training improves the clinical care of trauma patients. J Surg Educ 2010;67:439-43.
21. Catchpole KR, Dale TJ, Hirst DG, Smith JP, Giddings TA. A multicenter trial of aviation-style training for surgical teams. J Patient Saf 2010;6:180-6.
22. Cole SJ, Mackenzie H, Ha J, Hanna GB, Miskovic D. Randomized controlled trial on the effect of coaching in simulated laparoscopic training. Surg Endosc 2014;28:979-86.
23. Culig MH, Kunkle RF, Frndak DC, Grunden N, Maher TD Jr, Magovern GJ Jr. Improving patient care in cardiac surgery using Toyota production system based methodology. Ann Thorac Surg 2011;91:394-9.
24. Goldman LI, Maier WP, Rosemond GP, Saltzman SW, Cramer LM. Teaching surgical technique by the critical review of videotaped performance---the surgical instant replay. Surgery 1969;66:237-41.
25. Halverson AL, Andersson JL, Anderson K, Lombardo J, Park CS, Rademaker AW, et al. Surgical team training: the Northwestern Memorial Hospital experience. Arch Surg 2009;144:107-12.
26. Hamad GG, Brown MT, Clavijo-Alvarez JA. Postoperative video debriefing reduces technical errors in laparoscopic surgery. Am J Surg 2007;194:110-4.
27. Kirkpatrick JS. A comparison C1-C2 transarticular screw placement after self-education and mentored education of orthopaedic residents. J Spinal Disord Tech 2012;25:E155-60.
28. McCulloch P, Mishra A, Handa A, Dale T, Hirst G, Catchpole K. The effects of aviation-style non-technical skills training on technical performance and outcome in the operating theatre. Qual Saf Health Care 2009;18:109-15.
29. Mueller G, Hunt B, Wall V, Rush R Jr, Molof A, Schoeff J, et al. Intensive skills week for military medical students increases technical proficiency, confidence, and skills to minimize negative stress. J Spec Oper Med 2012;12:45-53.
30. Nurok M, Lipsitz S, Satwicz P, Kelly A, Frankel A. A novel method for reproducibly measuring the effects of interventions to improve emotional climate, indices of team skills and communication, and threat to patient outcome in a high-volume thoracic surgery center. Arch Surg 2010;145:489-95.
31. Peckler B, Prewett MS, Campbell T, Brannick M. Teamwork in the trauma room evaluation of a multimodal team training program. J Emerg Trauma Shock 2012;5:23-7.
32. Porte MC, Xeroulis G, Reznick RK, Dubrowski A. Verbal feedback from an expert is more effective than self-accessed feedback about motion efficiency in learning new surgical skills. Am J Surg 2007;193:105-10.
33. Schlachta CM, Sorsdahl AK, Lefebvre KL, McCune ML, Jayaraman S. A model for longitudinal mentoring and telementoring of laparoscopic colon surgery. Surg Endosc 2009;23:1634-8.
34. Sereno S, Mutter D, Dallemagne B, Smith CD, Marescaux J. Telementoring for minimally invasive surgical training by wireless robot. Surg Innov 2007;14:184-91.

35. Subramonian K, DeSylva S, Bishai P, Thompson P, Muir G. Acquiring surgical skills: a comparative study of open versus laparoscopic surgery. Eur Urol 2004;45:346-51; author reply 51.
36. Weaver FA, Hood DB, Shah H, Alexander J, Katz S, Rowe V, et al. Current guidelines produce competent endovascular surgeons. J Vasc Surg 2006;43:992-8; discussion 8.
37. Zimmerman H, Latifi R, Dehdashti B, Ong E, Jie T, Galvani C, et al. Intensive laparoscopic training course for surgical residents: program description, initial results, and requirements. Surg Endosc 2011;25:3636-41.
38. Kirkpatrick DL. Techniques for evaluating training. Train Dev J 1979;33:78-92.
39. Reed DA, Beckman TJ, Wright SM. An assessment of the methodologic quality of medical education research studies published in The American Journal of Surgery. Am J Surg 2009;198:442-4.
40. Yule S, Flin R, Paterson-Brown S, Maran N. Non-technical skills for surgeons in the operating room: a review of the literature. Surgery 2006;139:140-9.
41. Sexton JB, Thomas EJ, Helmreich RL. Error, stress, and teamwork in medicine and aviation: cross sectional surveys. BMJ 2000;320:745-9.
42. Makary MA, Sexton JB, Freischlag JA, Holzmueller CG, Millman EA, Rowen L, et al. Operating room teamwork among physicians and nurses: teamwork in the eye of the beholder. J Am Coll Surg 2006;202:746-52.
43. Mattar SG, Alseidi AA, Jones DB, Jeyarajah DR, Swanstrom LL, Aye RW, et al. General surgery residency inadequately prepares trainees for fellowship: results of a survey of fellowship program directors. Ann Surg 2013;258:440-9.
44. Gallagher AG, Ritter EM, Champion H, Higgins G, Fried MP, Moses G, et al. Virtual reality simulation for the operating room: proficiency-based training as a paradigm shift in surgical skills training. Ann Surg 2005;241:364-72.

45. Rex DK, Hewett DG, Raghavendra M, Chalasani N. The impact of videorecording on the quality of colonoscopy performance: a pilot study. Am J Gastroenterol 2010;105:2312-7.
46. Hoyt DB, Shackford SR, Fridland PH, Mackersie RC, Hansbrough JF, Wachtel TL, et al. Video recording trauma resuscitations: an effective teaching technique. J Trauma 1988;28:435-40.
47. Nakada SY, Hedican SP, Bishoff JT, Shichman SJ, Wolf JS Jr. Expert videotape analysis and critiquing benefit laparoscopic skills training of urologists. JSLS 2004;8:183-6.
48. Brennan TA, Leape LL. Adverse events, negligence in hospitalized patients: results from the Harvard Medical Practice Study. Perspect Healthc Risk Manage 1991;11:2-8.
49. Vincent C, Neale G, Woloshynowych M. Adverse events in British hospitals: preliminary retrospective record review. BMJ 2001;322:517-9.
50. Kohn L. To err is human: an interview with the Institute of Medicine’s Linda Kohn. Jt Comm J Qual Improv 2000;26:227-34.
51. Mazzocco K, Petitti DB, Fong KT, Bonacum D, Brookey J, Graham S, et al. Surgical team behaviors and patient outcomes. Am J Surg 2009;197:678-85.
52. Wolff AM, Bourke J. Reducing medical errors: a practical guide. Med J Aust 2000;173:247-51.
53. Rosenstein AH, O’Daniel M. Impact and implications of disruptive behavior in the perioperative arena. J Am Coll Surg 2006;203:96-105.
54. Greenberg CC, Regenbogen SE, Studdert DM, Lipsitz SR, Rogers SO, Zinner MJ, et al. Patterns of communication breakdowns resulting in injury to surgical patients. J Am Coll Surg 2007;204:533-40.
55. Hackman JR, Wageman RA. A theory of team coaching. Acad Manage Rev 2005;30:269-87.
56. Greenberg CC, Ghousseini HN, Pavuluri Quamme SR, Beasley HL, Wiegmann DA. Surgical coaching for individual performance improvement. Ann Surg 2015;261:32-4.

Appendix. Main findings

Author, yr: Ahmed et al,17 2013
Setting: Real-time OR
Coach: More-experienced surgeon
Time of training: 10 min
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Quality of educational debriefing assessed using OSAD tool; (2) Trainees’ assessment of debriefing assessed using a survey; and (3) User satisfaction survey of usefulness, comprehensiveness, feasibility, overall satisfaction
Main findings: (1) Significant improvement of OSAD scores postintervention compared with preintervention, and number of educational objectives set increased postintervention; (2) Trainees’ assessment of debriefing were favorable; and (3) Users reported high levels of satisfaction

Author, yr: Birch et al,18 2007
Setting: Real-time OR
Coach: More-experienced surgeon
Time of training: 53 mentored cases in 1 year
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Outcomes: adoption rate, conversion rate, operative time, intraoperative/postoperative complications, mortality, length of stay; and (2) number of MIS cases completed
Main findings: (1) Mentorship program increased adoption rate (number of cases completed), decreased the number of conversions to open surgery (nonsignificant), significantly reduced intraoperative complications and number of colorectal resections. Postoperative complications were unchanged; and (2) After mentoring the number of cases completed increased from 35 to 102

Author, yr: Briet et al,19 2010
Setting: Simulation/real-time OR
Coach: More-experienced surgeon
Time of training: 1 workshop (no mention of workshop length)/as many OR coaching sessions as needed for trainee to achieve a passing score 2 times
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Gynecologist competency at performing laparoscopic hysterectomies measured using OSATS; (2) Complications detected intraoperatively or up to 6 months postoperatively; and (3) Blood loss, operative time, conversion rate
Main findings: (1) 9 of 11 participants had passing OSATS grade; (2) Complication rates comparable during and after learning curve; and (3) No significant differences in blood loss, operating time or conversion rate

Author, yr: Capella et al,20 2010
Setting: Classroom/simulation
Coach: Nonsurgeon (PhD and RN)/surgeon
Time of training: 2-hour didactic session/2-hour simulation session
Follow-up: No
Was coaching effective overall? Inconclusive (team performance improved but no impact on patient outcomes)
Outcomes assessed: (1) Participant performance measured by The Trauma Team Performance Observation Tool; and (2) patient outcomes (mortality, complications, hospital length of stay, intensive care unit LOS, time from arrival to Focused Assessment Sonography in Trauma examination, time from arrival to CT scanner, time from arrival to intubation, time from arrival to operating room, and time from arrival to departure from the emergency department) before and after training
Main findings: (1) Team performance improved significantly across all domains of teamwork; and (2) Training reduced the times from arrival to CT scanner, endotracheal intubation, and operating room significantly; however, the remaining patient outcomes did not show a significant change between pre- and posttraining

Author, yr: Catchpole et al,21 2010
Setting: Classroom/real-time OR
Coach: Independent
Time of training: 2 days initial followed by 1 week and extended to 8 sessions per site
Follow-up: No
Was coaching effective overall? Inconclusive (variation of site NOTECHS scores)
Outcomes assessed: (1) Frequency of pre-list briefing; (2) frequency of preincision time-out; (3) frequency of postlist/case debriefing; (4) intraoperative teamwork measured using NOTECHS; and (5) operative time
Main findings: (1, 2, 3) There were significantly more briefings, debriefings and time-outs after intervention; (4) NOTECHS scores showed no change in site 1, significantly improved in site 2 and significantly deteriorated in site 3 after intervention; and (5) operative duration was longer after intervention

Author, yr: Cole et al,22 2013
Setting: Simulation
Coach: Not described
Time of training: Coaching provided for first 9 cases
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Surgical quality measured by competency assessment tool; (2) knowledge test of anatomy, procedural steps, instruments, common errors; and (3) performance metrics (path length, operating time, number of movements, number of errors)
Main findings: (1) Significant improvement in intervention group at operations 1, 5, and 10; (2) Intervention group scored greater on knowledge than control group after 5th and 10th procedure; and (3) Operating time was shorter in control group but number of errors significantly greater. No difference in path length or number of movements between groups

Author, yr: Culig et al,23 2011
Setting: Classroom/real-time OR
Coach: Not described
Time of training: Weekly training for 24 mo
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Observed outcomes of isolated CABG surgery; (2) costs of complications of CABG surgery; and (3) patient satisfaction after operational excellence methodology training
Main findings: Operational excellence training resulted in (1) low complication rates; (2) cost savings; and (3) high patient satisfaction

Author, yr: Goldman et al,24 1969
Setting: Case review session
Coach: More-experienced surgeon
Time of training: 1 review session
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Exposure; (2) motion; and (3) progression of the operation
Main findings: (1) Exposure time for Team A (video review and feedback) and Team B (video review) fell. Team C (no intervention) showed an increase in exposure time; (2) Inappropriate motions for Team A and Team B were reduced after intervention whereas Team C showed an increase; and (3) Intervention did not have an impact on the progression of operation

Author, yr: Halverson et al,25 2009
Setting: Classroom/real-time OR
Coach: More-experienced surgeon and nonsurgeon
Time of training: 4-hour class/intraoperative coaching for 2 wk
Follow-up: Yes, repeated observation period 6–8 mo after implementation period
Was coaching effective overall? Yes
Outcomes assessed: (1) Hospital metrics: adverse events, antibiotic use, OR efficiency (operation start time, turnover time); (2) Teamwork attitude surveys; and (3) Direct intraoperative observation using a checklist pre- and postintervention
Main findings: (1) No change in antibiotic use or turnover time, adverse events data not stated; (2) Significantly improved perceptions of teamwork in 14/19 survey items; and (3) Frequency of preoperative briefings and compliance with time-out items improved after intervention

Author, yr: Hamad et al,26 2007
Setting: Case review session
Coach: More-experienced surgeon
Time of training: After surgical procedure
Follow-up: No
Was coaching effective overall? No (noncoached group started off poorer but both groups improved over the course of the trial); Yes (adverse events from technical errors)
Outcomes assessed: (1) Frequency of minor technical errors; (2) frequency of adverse event; (3) knot-tying time; and (4) anastomosis time
Main findings: (1) Technical errors significantly higher in nondebriefed group over but no difference between groups over trial; (2) a significant difference between groups in adverse events from technical errors; (3) no significant difference between groups in knot-tying time over 4 wk of trial; and (4) no significant difference between groups in anastomosis time over 4 wk of trial

Author, yr: Hu et al,9 2012
Setting: Case review session
Coach: More-experienced surgeon
Time of training: 1-h blocks
Follow-up: No
Was coaching effective overall? Not studied
Outcomes assessed: (1) Coaching techniques; (2) frequency of discussion topics; and (3) participant reactions
Main findings: (1) Techniques observed to be either surgeon-driven (asking pointed questions, narrating video) or coach-driven (asking questions to prompt reflection, suggesting alternative approaches, framing in terms of performance); (2) majority of discussion topics were on operative technique (positioning, assistants, retractors, incision, exposure, progression), less discussion on teaching residents; and (3) all surgeons involved agreed that the sessions were valuable

Author, yr: Kirkpatrick et al,27 2012
Setting: Simulation
Coach: Not described
Time of training: One lecture, concurrent feedback, debriefing
Follow-up: Yes, test repeated 4 mo after coaching to evaluate skill retention
Was coaching effective overall? Yes
Outcomes assessed: (1) Knowledge: written test on transarticular screw placement; and (2) errors in technique of screw placement
Main findings: (1) Greatly improved knowledge in mentored learning group; and (2) errors in screw placement technique significantly lower in mentored learning group

Author, yr: McCulloch et al,28 2009
Setting: Classroom/real-time OR
Coach: Independent
Time of training: 9 h followed by twice-weekly visits from trainers for 3 mo
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Nontechnical skills measured using NOTECHS; (2) technical errors measured using OCHRA; (3) procedural errors measured using NOPEs; (4) safety climate, measured by safety attitudes questionnaire; and (5) Process measures: length of stay, operating time, return to OR, unplanned admission to intensive care unit, critical incident reports, complications within 12 wk of surgery
Main findings: (1) NOTECHS significantly improved after intervention, surgical situation awareness and overall NOTECHS score associated with technical error rate (higher NOTECHS scores associated with fewer errors); (2) technical errors significantly reduced; (3) procedural errors significantly decreased; (4) safety climate significantly improved for team scale but not others; and (5) no difference in duration of stay or operating time

Author, yr: Mueller et al,29 2012
Setting: Simulation
Coach: More-experienced surgeon and peer surgeon
Time of training: 1 continuous week
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Self-reported stress and confidence; (2) technical skills evaluated by observer; (3) nontechnical skills measured by 360-degree evaluation; (4) surgical instrument identification measured by knowledge test; and (5) pathophysiology and patient management assessed by multiple-choice test
Main findings: (1) Increased confidence, decreased stress; (2) modest improvement in nontechnical skills; (3) significant improvement in technical skills; (4) statistically significant increase in instrument knowledge; and (5) modest but statistically significant increase in pathophysiology

Author, yr: Neily et al,6 2010
Setting: Classroom
Coach: Not described
Time of training: 1 d
Follow-up: Yes; quarterly follow-up interviews with coach
Was coaching effective overall? Yes
Outcomes assessed: (1) Change in mortality of patients; and (2) reported improvements after implementation of training program
Main findings: (1) Trained facilities experienced a significant decrease in observed mortality after training; and (2) Facilities reported an improvement in communication among their OR staff, OR staff awareness, and OR teamwork

Author, yr: Nurok et al,30 2010
Setting: Classroom/simulation
Coach: Nonsurgeon
Time of training: Two 90-min sessions
Follow-up: Yes
Was coaching effective overall? Inconclusive (communication and teamwork scores reverted to baseline after 3 mo)
Outcomes assessed: Impact of standardized intervention on (1) emotional climate (surgical environment, anesthetic environment, circulating environment); (2) team skills; and (3) threat to patient outcome
Main findings: (1) Surgical environment trended toward more frequently engaged after intervention; (2) Communication and team skills score significantly improved immediately after intervention but after 3 mo returned to a level not statistically significantly different from preintervention level; and (3) Threat to outcome score improved significantly from preintervention and remained significantly improved after 3 mo

Author, yr

Setting

Coach

Time of training

Simulation

Not described

1 Day

Porte et al,32 2007

Simulation

More-experienced surgeon

18 practice trials

Schlachta et al,33 2009

Real-time OR

More-experienced surgeon

20 cases

No

Yes, administered a delayed post-test

No

Was coaching effective overall? Inconclusive (analysis of pre-post scores show no improvement within teams)

Yes

Inconclusive

Outcomes assessed

Main findings

(1) Behavioral intentions measured using a SJT on trauma teamwork; and (2) Trainee reactions measured after training

(1) SJT scores did not improve from pre- to postintervention in group 1, but significant improvement was found in group 2; and (2) participant reactions were positive, but not related to behavioral intentions (1, 2) All groups showed improvement from pretest to post-test but only the expert feedback group showed sustained improvement of skill on delayed performance testing

Participant performance assessed by (1) expert analysis using Global Rating Scale (evaluates respect for tissue, time and motion, instrument handling, flow of operation, overall performance); and (2) computer analysis (1) Procedure outcomes (conversion rate, duration of surgery, duration of hospital stay, wound complications) recorded by Canadian Advanced Endoscopic Surgery Registry practice audit software

(continued)

Surgery j 2015

(1) No conversions to open surgery in either group, mentored cases took longer but resulted in shorter duration of stay. Wound complication greater in nonmentored group but not statistically significant. Longitudinal mentoring and telemonitoring of laparoscopic colon surgery for cancer is feasible and may serve as a model for safe technology transfer to the community.

ARTICLE IN PRESS

Peckler et al,31 2012

Follow-up

Author, yr: Sereno et al,34 2007
Setting: Simulation
Coach: More-experienced surgeon
Time of training: Few (unclear) procedures (per participant)
Follow-up: No
Was coaching effective overall? Inconclusive (feasibility analysis)
Outcomes assessed: (1) Survey of trainees’ perception of mentoring quality; and (2) survey of trainees’ global perception of quality of robot (sound, image, mobility, interference)
Main findings: (1) Robot mentoring was well received, but no significant difference in the different mentoring sessions (active, passive and remote). Onsite mentoring statistically superior to robotic mentoring in first coaching session; and (2) Global performance of robot not statistically significant between groups

Author, yr: Subramonian et al,35 2004
Setting: Simulation
Coach: Not described
Time of training: 4 hours per week for 12 wk
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Open and laparoscopic skills of participating medical students; (2) Career intention: surgery or not; and (3) Participant perception about learning different surgical techniques
Main findings: (1, 2) No significant difference in overall skills between open and laparoscopic procedures. Also no significant difference in skill between participants who wish to become surgeons vs those who do not and participants who play video games vs those who do not; and (3) Participants perceived that learning laparoscopy was more difficult than open surgery. However, both open surgery and laparoscopy were easier than they thought prior to coaching

Author, yr: Weaver et al,36 2006
Setting: Classroom/real-time OR
Coach: More-experienced surgeon
Time of training: 100 diagnostic angiograms and 50 peripheral interventions
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Number of PEPs performed; and (2) Major patient complications and deaths that occurred as a result of a PEP
Main findings: (1) Number of PEPs performed increased between 2000 and 2004; and (2) There were no significant differences in the frequency of major complications and deaths between surgeons

Author, yr: Zimmerman et al,37 2011
Setting: Simulation
Coach: More-experienced surgeon
Time of training: 5 d
Follow-up: No
Was coaching effective overall? Yes
Outcomes assessed: (1) Reactions: confidence and comfort of residents with nine different laparoscopic surgical procedures
Main findings: (1) Residents felt more comfortable handling laparoscopic instruments and positioning trocars. They had greater confidence when performing laparoscopic surgery

CABG, Coronary artery bypass grafting; CT, computed tomography; OR, operating room; NOPEs, non-operative procedural errors; NOTECHS, Oxford Non-Technical Skills Scale; OCHRA, Observational Clinical Human Reliability Assessment; OSAD, Objective Structured Assessment of Debriefing; OSATS, Objective structured assessment of technical skill; PEP, percutaneous endovascular procedure; SJT, situational judgment test.
