Validation of a Novel Cognitive Simulator for Orbital Floor Reconstruction

Renata Khelemsky, DDS, MD, Resident; Brianna Hill, BA, Medical Student; Daniel Buchbinder, DMD, MD, Chairman

PII: S0278-2391(16)31208-3

DOI: 10.1016/j.joms.2016.11.027

Reference: YJOMS 57560

To appear in: Journal of Oral and Maxillofacial Surgery

Received Date: 30 August 2016
Revised Date: 20 November 2016
Accepted Date: 24 November 2016

Please cite this article as: Khelemsky R, Hill B, Buchbinder D, Validation of a Novel Cognitive Simulator for Orbital Floor Reconstruction, Journal of Oral and Maxillofacial Surgery (2017), doi: 10.1016/j.joms.2016.11.027.


Title: Validation of a Novel Cognitive Simulator for Orbital Floor Reconstruction

Authors: Renata Khelemsky, DDS, MD£; Brianna Hill, BA¥; Daniel Buchbinder, DMD, MD€

£ Resident, Department of Otolaryngology-Head and Neck Surgery, Division of Oral & Maxillofacial Surgery, Mount Sinai Beth Israel and Icahn School of Medicine at Mount Sinai, New York, NY.

¥ Medical Student, The George Washington University School of Medicine and Health Sciences.

€ Chairman, Department of Otolaryngology-Head and Neck Surgery, Division of Oral & Maxillofacial Surgery, Mount Sinai Beth Israel and Icahn School of Medicine at Mount Sinai, New York, NY.

Corresponding author:
Renata Khelemsky
10 Union Square E, Suite 5B
New York, NY 10003
Fax: (212) 844-6975
Email: [email protected]

Conflict of interest: None

Financial Disclosures: None

Acknowledgements: The authors would like to acknowledge Ali Bahsoun, Jean Nehme, and Andre Chow of Touch Surgery™ for their help with data acquisition and unwavering cooperation in the methodology of this study, as well as Jennifer Reingle Gonzalez, PhD, for her chief statistical support in analyzing the data.


Abstract

Purpose: The increasing focus on patient safety in current medical practice has promoted the development of surgical simulation technology in the form of virtual reality (VR) training designed largely to improve technical skills and, less so, non-technical aspects of surgery such as decision-making and material knowledge. The present study investigates the validity of a novel cognitive VR simulator called Touch Surgery™ for a core maxillofacial surgical procedure: orbital floor reconstruction (OFR).

Methods: A cross-sectional study was carried out on two groups of participants with differing experience levels. Novice graduate dental students and expert surgeons were recruited from a local dental school and academic residency programs, respectively. All participants completed the OFR module on Touch Surgery™. The primary outcome variable was simulator performance score. Post-module questionnaires rating specific aspects of the simulation experience were completed by both groups and served as the secondary outcome variables. The age and sex of participants were considered additional predictor variables. From these data, conclusions were made regarding three types of validity (face, content, and construct) for the Touch Surgery™ simulator. Dependent samples t-tests were used to explore the consistency in simulation performance scores across Phases 1 and 2 by experience level. Two multivariate ordinary least squares regression models were fit to estimate the relationship between experience and Phase 1 and 2 scores.

Results: A total of 39 novices and 10 experts naïve to Touch Surgery™ were recruited for the study. Experts outperformed novices on both phases of the OFR module (p<0.001), which provided the measure of construct validation. Responses to the questionnaire items used to assess face validity were favorable from both groups. Positive questionnaire responses were also recorded from experts alone on items assessing the content validity for the module. Participant age and sex were not significant predictors of performance scores.

Conclusion: Construct, content, and face validity were observed for the OFR module on a novel cognitive simulator, Touch Surgery™. OFR simulation on the smart device platform could therefore serve as a useful cognitive training and assessment tool in maxillofacial surgery residency programs.

Introduction

Traditionally, a surgical procedure like orbital floor reconstruction (OFR) would be learned by textbook reading and repeated observation, followed by supervised practice in a surgical residency program.1 However, changes in medical economics and residency standards over the past decade have focused on reduced resident work hours, shortened hospital stays, outpatient procedures, and minimally invasive techniques.2,3 While oral and maxillofacial surgery (OMS) training does not fall under the mandates of the Accreditation Council for Graduate Medical Education (ACGME), the initial 80-hour workweek conditions set forth by the 1989 Bell Commission were extended to OMS residents in New York State.4 A significant portion of OMS training also takes place alongside other surgical fields in a hospital setting, which exposes OMS to the same pool of training issues. The relatively recent ACGME initiatives to protect resident wellbeing and patient safety led to objective improvements in clinical outcomes, leaving OMS residencies free to adopt restricted work hours, whether for consistency with other surgical departments or for individual reasons.4-6 In response to this climate of increasing safety, conventional training methods in surgical residencies have also been affected.7 For a surgical trainee, the increasing emphasis on patient safety may be directly at odds with the shrinking number of opportunities to observe and practice surgery.2,8,9 Moreover, competency-based evaluations and standardized assessments of residents have increased, adding to the demand for more standardized practice opportunities.10

Orbital floor fractures frequently result from sports injuries, traffic accidents, falls, and interpersonal violence.11 Given the complex anatomy of the bony orbit and its role in the support and function of the globe, repair of this facial injury demands keen surgical ability.12,13 Fractures of the orbital floor can result in bony defects that require reconstruction rather than reduction and fixation, placing both the function and aesthetic appearance of the eye at risk. It is therefore essential that repair be undertaken by highly skilled and knowledgeable surgeons to decrease the risk for complications such as globe malposition, diplopia, corneal injury, retrobulbar hematoma, and blindness.13,14

Surgical simulation as an educational tool has somewhat abated the concern surrounding the shortage of experience-based opportunities for surgical trainees. That experience is a prerequisite of expertise is well known; however, the "right" type of experience, aimed at improving performance, can potentially be fostered with virtual reality (VR) simulation.15 VR simulation combines several positive educational components of experience, such as repetition, formative feedback, and immediate review of outcomes, in a safe, controlled, and self-directed setting that is ideal for adult learning.16-18 VR simulation has been implemented in other fields, such as aviation, as a tool to expedite mastery without risking safety.1 Surgical simulation has focused on improving technical skills with the idea that the early learning curve can be traversed on a virtual trainer rather than a live patient. Indeed, a randomized double-blinded study demonstrated reduced error rates during laparoscopic surgery for trainees who learned the procedure with simulation compared with those who used traditional methods.19 Still, cognitive qualities such as automation of problem-solving processes and a rich professional memory are key to surgical excellence, and thus represent an important first phase of skill development.10,20

For the trainee learning from an expert surgeon, it can be a challenge to learn each minute step involved in a procedure, partly because experts rely on unconscious, automated knowledge to recall and perform during surgery and tend not to verbalize it.21 To accumulate hands-on practice outside the operating room, maxillofacial residents have resorted to cadaver labs and benchtop bone model exercises, which are difficult to access, expensive, and lead to highly variable experiences.22 More advanced simulation technologies are no longer novel to the practice of craniomaxillofacial surgery and have already been demonstrated with tools like virtual preoperative planning, haptics-enabled temporal bone drilling, and high-fidelity bimanual neurosurgery simulators.23-25 Moreover, adult learners benefit from self-directed engagement with a task,19 such as that provided by VR simulation, wherein residents engage in deliberate practice prior to encountering a live patient. Simulation research has consequently focused on technical training, with evidence of improved clinical performance following simulator exposure.26-29 Simulation-based training focusing on the cognitive decision-making aspects of surgery, however, is relatively new and largely underutilized within head and neck surgical curricula.20,21

Touch Surgery™ is a novel, readily available, free application (Kinosis Limited, London, UK) designed for smart devices (i.e., smartphones and tablets) and intended to teach surgical procedures by utilizing cognitive task analysis (CTA) theory, which holds that a performable task consists of a series of basic cognitive operations. This theory is framed around understanding how an experienced operator executes complex tasks during surgery. Such tasks require both controlled (conscious) and automated (unconscious) knowledge and belong to a three-part system wherein skill-based, rule-based, and knowledge-based behaviors create the sum of intraoperative decision-making.21 The Touch Surgery™ platform is a hybrid of CTA and VR simulation whereby trainees interact with 3D animation to rehearse a given surgery in a stepwise fashion, gaining access to the necessary tools, anatomical landmarks, and pitfalls of the chosen surgical procedure.

Surgical education research demands rigorous validation testing and methodologies for any novel assessment tool in order to examine its effectiveness in simulation-based training.30 The most commonly encountered measures are face and content validity: the former describes the apparent appropriateness of the tool, and the latter, assessed by a group of experts, describes whether the makeup of the tool measures what it is supposed to measure. A third benchmark of validation research is construct validity, defined as the extent to which a tool can measure variations in performance between subjects having different levels of a defined construct.30,31 Previous validation studies for Touch Surgery™ have been conducted in orthopedic surgery.32,33

The purpose of this study was to perform a validation study of the OFR module on the Touch Surgery™ platform and to assess the utility of this novel cognitive simulator for residency curricula applications such as training and assessment. The specific aims of the study were to measure the objective Touch Surgery™ performance metrics in a group of novice and expert participants with respect to their experience level in performing an OFR procedure. This enabled a measure of construct validity. Additional aims of the study were to collect subjective questionnaire responses regarding the acceptability of the module from both groups, and the accuracy of the module from the experts alone. This enabled a measure of face and content validity, respectively. The investigators hypothesize that this novel cognitive trainer will demonstrate the three forms of validity and thus have a purposeful role in residency curriculum development and competency-based training. This is the first validation study for Touch Surgery™ in the scope of maxillofacial surgery.

Methods

Study Design & Sample

To address the research purpose, the investigators designed and implemented a cross-sectional study. The study population was composed of novice graduate dental students from Columbia University College of Dental Medicine and local expert surgeons in the greater New York City region. Both groups of participants were recruited by voluntary enrollment through email invitations in June 2016. To be included in the study sample, dental students were required to have completed a comprehensive head and neck anatomy course. Students were excluded from the novice group if they had previously observed or participated in an OFR, had prior exposure to Touch Surgery™, or did not complete both Phases 1 and 2 of the OFR module. To be considered an expert in this study, surgeons were required to perform OFR surgery independently. Exclusion criteria for expert participants included having prior exposure to Touch Surgery™ and failure to complete both Phases 1 and 2 of the OFR module. None of the study participants from either group were involved in the development or authorship of the module.

Variables

The primary predictor variable in this study is experience level, as defined in the inclusion criteria for novice and expert participants. The primary outcome variable is simulator performance score, calculated out of a maximum score of 100. Secondary outcome variables were made up of answers to post-module questionnaire items dealing with the face validity (for both groups) and content validity (for experts only) of the simulator. Additional variables that may have affected the outcomes of interest, and were thus also considered in this study, included the age and sex of participants.

Data Collection Methods

Enrollment and Location – Due to the educational nature of the study and the de-identification of all enrolled subjects, an exemption was obtained from the Institutional Review Board of the Icahn School of Medicine at Mount Sinai. Novices were recruited by voluntary open enrollment from email invitations sent to the Columbia University College of Dental Medicine's second-, third-, and fourth-year classes. Experts were recruited by similar email invitations from local New York City training programs in both OMS and otolaryngology departments. A description of the study purpose and design was provided; however, due to our defined exclusion criteria, the name of the software application was not disclosed. A dedicated testing site consisting of a quiet room at the graduate school's library was reserved for the novice group, and makeup sessions took place in the same library under the same conditions. Experts who enrolled were asked to arrange a time and quiet location of their choice in order to complete the module. A research investigator observed all novices and experts during the time of their participation to ensure basic compliance and study integrity.

Touch Surgery Simulator – On the day of the study, participants downloaded the free Touch Surgery™ app using their own smart devices, which included Apple iPads and iPhones (Apple, Cupertino, CA, USA) and Android phones. Participants randomly selected a paper card with login information from a box of pre-generated email addresses with which to register their software. The investigators provided chargers and a backup Apple iPad for anyone whose device malfunctioned during the study. Free wireless access was provided for participants. The OFR module was authored by Dr. Khelemsky and Dr. Buchbinder of Mount Sinai Beth Israel, NY, and developed by Touch Surgery™ Labs in June 2015.

The app can be used in two modes: Learn and Test. Subjects were not exposed to the Learn mode during this study (Figures 1a and 2a). The Test mode measures the user's knowledge of a procedure through a series of multiple choice questions (Figure 1b) and intermittent virtual prompts to drag a virtual instrument to the correct anatomical space using a swipe motion (Figure 2b), both of which occur during a continuous 3D animation of the surgical procedure. In the Test mode, the user must choose the single appropriate choice or swipe before advancing to the next procedural step and before obtaining a final performance score.

Both Phase 1 and Phase 2 of the OFR module were completed back to back in Test mode. Phase 1 of the OFR module depicted the sequence of steps required to prep and drape a patient, followed by exposure of the orbital floor fracture via a transconjunctival approach with lateral canthotomy. Phase 2 of the module related to the insertion and fixation of a preformed titanium orbital implant and closure of the soft tissue incisions.

Questionnaires – All participants were required to complete pre-test and post-test questionnaires. The pre-test questionnaire was used to screen participants for inclusion and exclusion criteria. Gender, age, and experience level (graduate year for novices, and surgical experience in 5-year ranges for experts) were optional items. Post-test questionnaires collected data about the acceptability and realism of the simulator from both groups of participants by rating statements on a 5-point Likert scale in the categories of visual appearance, utility for training, interest to learn more procedures, and interest for use in training programs. Participants in the expert group were given additional questionnaire items that assessed categories of anatomic accuracy, surgical sequence, and instrumentation for the depicted procedure (Figure 3).

Data Collection – Each participant underwent a rigorous de-identification process such that data were securely recorded in real time after logging in with the randomly selected pre-generated email addresses. The Touch Surgery™ server was able only to identify which users were novices and which were experts based on the login email addresses, while all remaining subject data remained completely de-identified.

Participants were evaluated based on the number of points they earned while completing the module. An individual score for both Phase 1 and Phase 2 of the module was calculated by the software based on correct multiple-choice answers and correct swipe choices. Once the correct answer choice or swipe was selected, the module progressed to the next authored step. The module did not support alternative approaches or modification of steps during the virtual simulation. A single point was gained with each correct choice, while an incorrect choice caused a failure to accumulate points, leading to a lower cumulative score. The data were stored on the secure Touch Surgery™ server as two performance scores, one each for Phases 1 and 2.
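To make the scoring rule concrete, the following is a minimal Python sketch of the behavior described above, not the app's actual implementation: each authored step is worth one point, an incorrect first attempt forfeits that step's point, and the phase score is reported out of a maximum of 100. All names (Step, run_test_mode, the sample prompts) are hypothetical.

    from dataclasses import dataclass

    @dataclass
    class Step:
        prompt: str    # multiple-choice question or swipe target
        correct: str   # the single accepted answer for this step

    def run_test_mode(steps: list[Step], answers: dict[str, list[str]]) -> float:
        """Score a linear sequence of steps, as described for Test mode.

        The user cannot advance until the correct choice is made; only a
        correct first attempt earns the step's point (assumed behavior).
        """
        points = 0
        for step in steps:
            attempts = answers[step.prompt]   # user's attempts, in order
            if attempts and attempts[0] == step.correct:
                points += 1                   # first attempt correct: +1
            # a wrong attempt simply fails to accumulate the point; the
            # module still advances once the correct choice is selected
        return 100.0 * points / len(steps)    # score out of a maximum of 100

    # Example: two steps, one answered correctly on the first try -> 50.0
    steps = [Step("Incision?", "transconjunctival"), Step("Implant?", "titanium")]
    answers = {"Incision?": ["transconjunctival"],
               "Implant?": ["silicone", "titanium"]}
    print(run_test_mode(steps, answers))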

Data Analysis

A Shapiro-Wilk test was used to assess normality in the score distribution and continuous covariates. Chi-squared and t-tests were used to examine differences between novices and experts on categorical and continuous measures, respectively. Pearson's r was used to assess the magnitude of the correlation between simulation performance scores by module phase and subject age (the only two continuous covariates measured among all participants). Dependent samples t-tests were used to explore the consistency in simulation performance scores across Phases 1 and 2 by experience level. Bivariate and multivariate ordinary least squares (OLS) regression models were fit to estimate the relationship between experience and Phase 1 and 2 scores, controlling for age and sex.
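For concreteness, the multivariate models just described can be written as follows; this is a sketch implied by the text, assuming 0/1 indicator coding for experience and sex:

\[ \mathrm{Score}_{p,i} = \beta_0 + \beta_1\,\mathrm{Expert}_i + \beta_2\,\mathrm{Age}_i + \beta_3\,\mathrm{Male}_i + \varepsilon_i, \qquad p \in \{1, 2\}, \]

where \(\beta_1\) corresponds to the adjusted expert-novice difference reported in Table V, and the bivariate (unadjusted) models of Table IV retain a single predictor at a time.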

To examine whether content and face validation measures were equal across novices and experts, the non-parametric, two-sample Kolmogorov-Smirnov test was used. Because these measures included multiple Likert scales, the distributions were not normal and medians are therefore presented. For all models, item-level listwise deletion was used because data were missing completely at random. Alpha was set a priori at .05.
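As an illustration only (not the authors' analysis code), the pipeline described above could be reproduced in Python with SciPy and statsmodels; the file and column names (ofr_scores.csv, score_p1, score_p2, age, expert, male, visual_appearance) are hypothetical.

    import pandas as pd
    from scipy import stats
    import statsmodels.formula.api as smf

    # Hypothetical per-participant dataset: one row per subject
    df = pd.read_csv("ofr_scores.csv")

    # Shapiro-Wilk test for normality of a score distribution
    print(stats.shapiro(df["score_p1"].dropna()))

    # Pearson correlation between age and Phase 1 score
    sub = df.dropna(subset=["age", "score_p1"])
    print(stats.pearsonr(sub["age"], sub["score_p1"]))

    # Dependent (paired) samples t-test: Phase 1 vs. Phase 2 within novices
    novices = df[df["expert"] == 0].dropna(subset=["score_p1", "score_p2"])
    print(stats.ttest_rel(novices["score_p1"], novices["score_p2"]))

    # Multivariate OLS; the formula API drops rows with missing values,
    # mirroring the item-level listwise deletion described above
    fit = smf.ols("score_p1 ~ expert + age + male", data=df).fit()
    print(fit.summary())

    # Two-sample Kolmogorov-Smirnov test on one Likert questionnaire item
    print(stats.ks_2samp(df.loc[df["expert"] == 1, "visual_appearance"].dropna(),
                         df.loc[df["expert"] == 0, "visual_appearance"].dropna()))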

Results

Table I summarizes the demographics of the study population. The study initially recruited 42 novices and 10 experts. Data for three novices were collected but later excluded because these participants reported observing an OFR. The final sample was composed of 39 novices, of which 18 (47%) were male, 20 (53%) were female, and 1 chose not to identify. The mean age of novice participants was 24.8 ± 2.5 years; 12 (30%) were second-year, 13 (33%) were third-year, and 14 (36%) were fourth-year dental students. Due to a technical server error in data collection, performance scores were not recorded from four participants in the novice group.

The second group consisted of 10 experts, of which 9 (90%) were male and 1 (10%) was female. The mean age of experts was 46.6 ± 9.8 years. Compared with novices, experts were significantly older (p<0.001) and more frequently male (p=0.02). The majority of experts (80%) had greater than 10 years of independent surgical experience, while 1 expert (10%) had 1-5 years and 1 expert (10%) had 6-10 years of experience.

Table II displays the bivariate evaluation of the relationship between age, sex, and performance scores across study phases. Age was positively correlated with performance scores at both time points (r = 0.72 for Phase 1 and r = 0.63 for Phase 2; p<0.001), indicating that older participants had higher performance scores. No sex differences in mean performance scores were identified in either phase.

Table III depicts the relationship between experience and performance scores per phase of the module. Overall, no significant differences in performance scores across Phases 1 and 2 emerged within either experience level. The mean performance score for novices was slightly higher in Phase 1 (mean: 44) than in Phase 2 (mean: 42); however, this difference did not reach statistical significance (p=0.397). Among experts, scores for Phase 2 were marginally higher than Phase 1 (mean 78 vs. 76, respectively). This difference was not statistically significant (p=0.709).

Table IV includes results of bivariate OLS regression models examining the unadjusted relationships between experience, sex, age, and performance score. In both phases, experts had significantly higher performance scores compared with novices. In Phase 1, experts scored 32.21 units (95% CI 25.40-39.01) higher on average than novices. In Phase 2, experts scored 35.71 units (95% CI 25.46-45.95) higher than novices on average. These differences by experience level were statistically significant (p<0.001). In both Phases 1 and 2, age was positively associated with performance scores. Specifically, each unit increase in age was associated with a 1.13-unit (95% CI 0.79-1.47) increase in Phase 1 performance score and a 1.23-unit (95% CI 0.76-1.70) increase in Phase 2 performance score. Male participants also scored significantly higher on average than female participants across both phases. In Phase 1, males scored 10.87 units (95% CI 1.39-20.45) higher than females (p=0.026); in Phase 2, males scored 16.20 units (95% CI 4.27-28.13) higher than females (p=0.009).

Table V includes results of two OLS regression models examining the relationship between experience and performance score, controlling for sex and age. In both phases, experts had significantly higher performance scores compared with novices. In Phase 1, experts scored 32.91 units (95% CI 16.87-48.95) higher on average than novices, controlling for age and sex. In Phase 2, experts scored 33.85 units (95% CI 10.48-57.22) higher than novices on average, controlling for age and sex. These differences by experience level were statistically significant. Notably, sex and age were not significantly associated with performance scores in the multivariate models.

Table VI summarizes secondary outcomes data from questionnaires, calculated as median values on a 5-point Likert scale. Questionnaire responses to the four items that rated face validity were equally favorable from both groups; no significant differences emerged between novices and experts. For each item rated by experts and novices, median Likert scale values were at or above 4 from experts and 5 from novices. Experts rated the questionnaire items related to content validity favorably as well, with median values of 4 for surgical instrumentation, 3.5 for surgical sequence, and 4 for anatomic accuracy.

Discussion

The purpose of the present study is to perform a validation study for a common maxillofacial surgical procedure on a novel cognitive VR simulator and to assess its utility as a training and assessment tool in the context of a maxillofacial training curriculum. Three types of validity were under investigation (construct, face, and content), and in order to provide objective measures of each one, primary and secondary outcome variables were defined as simulator performance score and numerical questionnaire answers, respectively. The investigators hypothesized that the experience level of the user would correlate with simulator performance scores such that increased experience leads to higher performance scores. This was indeed the key finding of the study, as experts significantly outperformed novices on both phases of the module. Since construct validity pertains to the ability of a simulator to reliably measure a difference in expertise between two groups,27 the present findings designate construct validation to the Touch Surgery™ simulator.

The investigators also found that both the bivariate and multivariate regression models revealed a positive association between experience and performance score across both phases. In the bivariate models, sex and age were significantly associated with performance scores. However, the multivariate results suggest that experience was the predictor that explained the variation in performance scores, with sex and age confounding these effects. These findings are to be expected, as experts were older and more likely to be male than novices. The data analysis confirms our hypothesis that the Touch Surgery™ simulator maintains construct validity in that it measures what it is intended to measure, which is a level of knowledge that is proportional to experience.

Additional findings of this study show that both study groups answered favorably (ratings of 4 or 5) on questions dealing with the simulator's realism and acceptability. These data upheld the presence of face validity. Moreover, experts alone answered favorably on items pertaining to the objective accuracy and content of the modules. From these data, we corroborated content validity. Given the observation of construct, face, and content validity, the investigators propose the incorporation of cognitive surgical simulation, such as that offered by the Touch Surgery™ platform, into the field of OMS for both training and assessment purposes.

Touch Surgery™ is a mobile VR simulator that provides trainees with the opportunity to rehearse a series of goal-oriented steps that make up a larger operative task and to train surgical decision-making skills using the principle of cognitive task analysis (CTA). Our results for construct validation are similar to previously published data by Sugand et al. using the same software for an orthopedic procedure, where median scoring ranges were 41-60% for novices and 72-77.5% for experts,32 compared with our median ranges of 44-47% and 77-77.5%, respectively. These data demonstrate a measurable gap with respect to the construct of interest (surgical proficiency) between two dissimilar groups, supporting Touch Surgery™ as a useful assessment tool to distinguish between users of varying skill level.

Face validity reflects the simulator's acceptance to the degree that it appears realistic and useful in portraying the surgical procedure.27,34 We support the presence of face validity in this study with median questionnaire ratings of 4 or greater from both experts and novices. Subjects in both groups aided in the measurement of face validity since both experts and novices had adequate prior exposure to the anatomic region of interest, by way of surgery and/or anatomy education and dissection. The questionnaire items mainly addressed categories of visual appearance, utility for training, and interest to use the software for future education. These categories were chosen in order to assess whether this simulation platform creates a virtual environment that is appropriately realistic for learning surgical procedures. Previous studies reporting on face validation used similar questionnaire-based methods to address categories on a numerical analog or Likert scale.29,32,35

On the other hand, only experts with independent experience in OFR were given questionnaire items to assess content validity, which measures the appropriateness of Touch Surgery™ as a tool that delivers and achieves a particular goal.34 While validity in a general sense refers to an instrument's ability to measure what it intends to measure, content validity derives exclusively from experts, who offer their opinions in order to bring diverse concepts to a working level.36 Early literature on the development of content validity argues that in order to meet the standards of content validation, a two-stage process is required in which wide-ranging content is first developed and then quantified into discrete testable items, ideally with an expert panel to identify any areas of omission.34 While the authors of the present study did not adhere to this rigorous methodology, the present study is the first to show a positive rating of the anatomic accuracy, surgical sequence, and depicted instrumentation to perform an OFR by what is essentially a small group of ten maxillofacial surgery "experts." A lower rating for surgical instrumentation can be explained by the fact that more than one set of instruments is suitable for common surgical steps, and personal preferences are likely to emerge in a relatively small number of expert participants.

Several limitations within this study are worth mentioning. First, we did not use an intermediate group between novices and experts, such as OMS residents with early exposure to surgery. Arguably, this would help to further corroborate the construct validity of the OFR module, particularly for Phase 1, where performance was in part related to knowledge of patient positioning and preparation that is obtained earlier in training. While non-surgeon graduate students share only the anatomical basis for surgery with their expert counterparts, second-year OMS residents would be proportionally more experienced than our chosen novices and would thus be expected to have scores in the intermediate range. The choice of subjects in this study limits construct validation to a gross measure of the simulator's ability to assess a difference at two extremes, rather than a more graded assessment of knowledge that would lend support for a more sensitive and widely applicable assessment tool.

Second, the assessment of face validity in our study is based on subjective answers that are at risk for response bias by nature of the tested population. Graduate students are typically grateful for the opportunity to sample advanced training and free education, whereas experts have varying exposure to technically innovative tools. However, in our study we observed median scores well above 3.0 out of 5.0; therefore, it is unlikely that this issue influenced our conclusions. Third, we did not evaluate whether cognitive skill acquisition persisted with time, and future studies in this field should investigate how long and how often simulation modules should be repeated to maintain a satisfactory proficiency level among trainees. The learning curve associated with repeated use of the OFR module should also be explored in the future, as was done for an orthopedic surgery procedure,33 which would allow for fine-tuning of the content and multiple-choice questions to develop a valid as well as an efficient training tool.

Fourth, the Touch Surgery™ simulator consists of fairly basic functions that are designed to measure a finite amount of knowledge rather than the intricate technical skills needed to respond to changing surgical conditions (decreased visibility, fracture severity, anatomical variants, etc.). This limitation should be weighed against the overall context of OFR, which can be performed in a variety of ways that are often dictated by complex intraoperative needs. The OFR module as it was authored for Touch Surgery™ is committed to a linear progression of steps without accounting for these special circumstances, and this may limit users from practicing more complex cognitive functions. While a high-fidelity interactive simulator experience is beyond the intention of the Touch Surgery™ platform, the investigators do support increased development of functions such as the addition of decision trees, ongoing revision of specific steps, and residency communication channels, both internal and external, that invite differing professional views about debated steps or tools for a given surgical procedure.

The most substantial limitation is the lack of evidence to show predictive validity. Prior studies attempting to show valuable data in this realm mostly rely on small subject numbers for their conclusions.37 A systematic review of 21 qualified papers pertaining to VR simulators for otolaryngology procedures elicited 7 predictive validity studies that showed improved cadaveric temporal bone dissection following a virtual temporal bone simulator; significantly increased surgical confidence following practice on an endoscopic sinus surgery simulator; as well as reduced overall operating time, fewer surgical errors, and better overall performance among simulation-trained residents.27 A recent study enrolled 17 trainees and 4 expert robotic surgeons and showed a significant positive relationship between simulated VR training tasks and intraoperative robotic performance.26 These data underscore that the results of the present study cannot support the assertion that exposure to this simulator will improve clinical performance. In order to become a useful training tool, the OFR module on the Touch Surgery™ platform must provide the training functions needed to reduce errors, provide rapid feedback with unlimited practice of the correct surgical sequence, and ultimately improve clinical outcomes for patients. Although it may be cumbersome to perform a VR-to-OR study when considering the limited evidence to support the use of a new tool and the ethical barriers involving the use of human subjects, predictive validity studies still bear the burden of proof to justify implementing this technology in resident training curricula.37

An area that has raised recent interest is the ability to assess and teach surgical judgment. At a cognitive level, expert surgical judgment is rooted in automated representations of a given surgery, with an allowance of free cognitive capacity to anticipate cues and problems rather than actively responding to them.38 While dexterity and manual sensitivity can be rehearsed without the presence of an expert on technical simulators,26 this may lead to insufficient training. Pugh et al. interviewed 13 experienced surgeons and performed a rigorous CTA-based analysis of their thought processes during critical, error-prone surgical moments. The researchers found several variables affecting intraoperative decision-making, indicating that "knowing" what to do in a crisis is key to task completion, leading the authors to conclude that knowledge-based behaviors outweigh skill-based ones in error rescue scenarios.21 These studies argue for the development of knowledge-based surgical simulators, which may help to equalize decision-making processes between surgeons and operative team members and promote situational awareness.21,38 A simulator platform like Touch Surgery™ that uses cognitive virtual task training is thus consistent with the teaching of knowledge-based, successful intraoperative decision-making.

Conclusion

The present study establishes the presence of face, content, and construct validity for the OFR module on the Touch Surgery™ cognitive task simulator, asserting Touch Surgery™ as a user-friendly platform that is well adapted for rehearsal of key surgical progressions. The investigators also believe that simulation offers an opportunity to create standardized competency goals for basic surgical procedures, one of which is the OFR in maxillofacial surgery. While improvements in training methodologies to protect patient safety have received tremendous focus in health policy, VR simulation has not yet been incorporated in most training programs.39 A validated assessment tool would allow residency programs to use a set of metrics to better quantify the amount of experience a trainee has received before operating on patients. On a larger scale, we propose a "blended" curriculum strategy that incorporates simulation into the current training schedule, along with conventional intraoperative guidance, which would allow trainees to remediate areas of deficiency, focus on specific portions of challenging procedures, and partake in meaningful feedback with mentors regarding their progression to mature and experienced surgeons.

Presently, no other VR simulators use cognitive task analysis to enhance decision-making skills in maxillofacial surgery training. Future work is required to validate this platform as a useful training tool by testing for predictive validity and describing the learning curve associated with repeated exposure. The ideal time point within post-graduate training to initiate simulation exposure for maximum resident benefit (early versus late with respect to operative experience) also requires further investigation. Given the potential positive impact on education, medical costs, and patient safety,3 it is reasonable to defend the interest in VR simulation for expanding surgical training on both a national and global level.

References

1. Dawson S. Procedural simulation: a primer. Radiology 241:17, 2006.
2. Philibert I, Friedman P, Williams WT. New requirements for resident duty hours. JAMA 288:1112, 2002.
3. Millenson ML. Pushing the profession: how the news media turned patient safety into a priority. Qual Saf Health Care 11:57, 2002.
4. Chahal HS. Work hour regulations and training of residents. J Oral Maxillofac Surg 65:154, 2007.
5. Fisher EL, Blakey GH. Perspective on work-hour restrictions in oral and maxillofacial surgery: the argument against adopting duty hours regulations. J Oral Maxillofac Surg 70:1249, 2012.
6. Cunningham LL, Salman SO, Savit E. Limiting resident work hours: the case for the 80-hour work week. J Oral Maxillofac Surg 70:1246, 2012.
7. Eckert M, Cuadrado D, Steele S, Brown T, Beekley A, Martin M. The changing face of the general surgeon: national and local trends in resident operative experience. Am J Surg 5:652, 2010.
8. Brennan TA, Leape LL, Laird NM, Hebert L, Localio AR, Lawthers AG, Newhouse JP, Weiler PC, Hiatt HH. Incidence of adverse events and negligence in hospitalized patients: results of the Harvard Medical Practice Study. Qual Saf Health Care 13:145, 2004.
9. Verma SP, Dialey SH, McMurray JS, Jiang JJ, McCulloch TM. Implementation of a program for surgical education in laryngology. Laryngoscope 120:2241, 2010.
10. Accreditation Council for Graduate Medical Education. ACGME Common Program Requirements, section III.B. Accessed August 01, 2016.
11. Zaleckas L, Pečiulienė V, Gendvilienė I, Pūrienė A, Rimkuvienė J. Prevalence and etiology of midfacial fractures: a study of 799 cases. Medicina 51:222, 2015.
12. Perry M. Maxillofacial trauma - developments, innovations and controversies. Injury 12:1252, 2009.
13. Converse JM, Smith B, Obear MF, Wood-Smith D. Orbital blowout fractures: a ten-year survey. Plast Reconstr Surg 39:20, 1967.
14. Losee JE, Gimbel ML, Rubin JP, Wallace CG, Wei F-C. Plastic and reconstructive surgery, in Schwartz's Principles of Surgery, 10e [online]. New York, NY, McGraw-Hill Education, 2014, Ch. 45.
15. Hall JC, Ellis C, Hamdorf J. Surgeons and cognitive processes. Br J Surg 90:10, 2003.
16. Ericsson KA. Deliberate practice and the acquisition and maintenance of expert performance in medicine and related domains. Acad Med 79:S70, 2004.
17. Tsuda S, Scott D, Doyle J, Jones DB. Surgical skills training and simulation. Curr Probl Surg 46:271, 2009.
18. Goldman S. The educational kanban: promoting effective self-directed adult learning in medical education. Acad Med 84:927, 2009.
19. Seymour NE, Gallagher AG, Roman SA, O'Brien MK, Bansal VK, Andersen DK, Satava RM. Virtual reality training improves operating room performance: results of a randomized, double-blinded study. Ann Surg 236:458, 2002.
20. Hamdorf JM, Hall JC. Acquiring surgical skills. Br J Surg 87:28, 2000.
21. Pugh CM, Santacaterina S, DaRosa DA, Clark RE. Intra-operative decision making: more than meets the eye. J Biomed Inform 44:486, 2011.
22. Salma A, Chow A, Ammirati M. Setting up a microneurosurgical skull base lab: technical and operational considerations. Neurosurg Rev 34:317, 2011.
23. Morris D, Sewell C, Barbagli F, Salisbury K, Blevins NH, Girod S. Visuohaptic simulation of bone surgery for training and evaluation. IEEE Comput Graph Appl 26:48, 2006.
24. Pflesser B, Petersik A, Tiede U, Hohne KH, Leuwer R. Volume cutting for virtual petrous bone surgery. Comput Aided Surg 2:74, 2002.
25. Dubois L, Jansen J, Schreurs R, Habets PE, Reinartz SM, Gooris PJ, Becking AG. How reliable is the visual appraisal of a surgeon for diagnosing orbital fractures? J Craniomaxillofac Surg 44:1015, 2016.
26. Aghazadeh MA, Mercado MA, Pan MM, Miles BJ, Goh AC. Performance of robotic simulated skills tasks is positively associated with clinical robotic surgical performance. BJU Int 118:475, 2016.
27. Arora A, Lau LY, Awad Z, Darzi A, Singh A, Tolley N. Virtual reality simulation training in Otolaryngology. Int J Surg 2:87, 2014.
28. Tanaka A, Graddy C, Simpson K, Perez M, Truong M, Smith R. Robotic surgery simulation validity and usability comparative analysis. Surg Endosc 30:3720, 2015.
29. Kelly DC, Margules AC, Kundavaram CR, Narins H, Gomella LG, Trabulsi EJ, Lallas CD. Face, content, and construct validation of the da Vinci skills simulator. Urology 79:1068, 2011.
30. Van Nortwich SS, Lendvay TS, Jensen AR, Wright AS, Horvath KD, Kim S. Methodologies for establishing validity in surgical simulation studies. Surgery 147:622, 2010.
31. Gallagher AG, Ritter EM, Satava RM. Fundamental principles of validation, and reliability: rigorous science for the assessment of surgical education and training. Surg Endosc 17:1525, 2003.
32. Sugand K, Mawkin M, Gupte C. Validating Touch Surgery™: a cognitive task simulation and rehearsal app for intramedullary femoral nailing. Injury 46:2212, 2015.
33. Sugand K, Mawkin M, Gupte C. Training effect of using Touch Surgery™ for intramedullary femoral nailing. Injury 47:448, 2016.
34. Lynn MR. Determination and quantification of content validity. Nurs Res 6:382, 1986.
35. Xu S, Perez M, Perrenot C, Hubert N, Hubert J. Face, content, construct, and concurrent validity of a novel robotic surgery patient-side simulator: the Xperience™ Team Trainer. Surg Endosc 30:3334, 2016.
36. Slocumb EM, Cole FL. A practical approach to content validation. Appl Nurs Res 4:192, 1991.
37. Hogle NJ, Widmann WD, Ude AO, Hardy MA, Fowler DL. Does training novices to criteria and does rapid acquisition of skills on laparoscopic simulators have predictive validity or are we just playing video games? J Surg Educ 65:431, 2008.
38. Kempton SJ, Bentz ML. Making master surgeons out of trainees: part I. teaching surgical judgment. Plast Reconstr Surg 137:1646, 2016.
39. Yule S, Flin R, Paterson-Brown S, Maran N. Non-technical skills for surgeons in the operating room: a review of the literature. Surgery 2:140, 2006.

Figure Legend

Figure 1a. Screen shot of animation for the cantholysis surgical step during simulation in "Learn" mode (the participants were not exposed to this mode in the present study).

Figure 1b. Corresponding multiple choice question for the cantholysis step in "Test" mode. A single correct answer choice must be chosen to proceed to the next step. The cumulative score is displayed in the top right corner of the screen.

Figure 2a. Screen shot of a swipe interaction during simulation in "Learn" mode. The user is instructed to perform a manual swipe from the depicted blue circle to the dotted purple circle, representing the correct anatomical area for the proposed instrument.

Figure 2b. Sample swipe interaction for the same step as in Figure 2a, now shown during the simulation module in "Test" mode. The dotted purple circle is no longer present, and the user is asked to "drag" the displayed instrument to the correct anatomical area to proceed to the next step.

Figure 3. Likert scale and items included in the post-module questionnaire for both novice and expert participants.

Table I. Descriptive analysis of novice and expert characteristics.

Study variable                 Novices, N(%) or Mean(sd)   Experts, N(%) or Mean(sd)   p-value
Sample size (n)                39                          10
Age in years¥                  24.8 ± 2.5                  46.6 ± 9.8                  <0.001
Sex*
  Male                         18 (47%)                    9 (90%)                     0.02
  Female                       20 (53%)                    1 (10%)
Student graduate year(a)
  2                            12 (30%)                    -
  3                            13 (33%)                    -
  4                            14 (36%)                    -
Expert years of experience(b)
  1-5                          -                           1 (10%)
  6-10                         -                           1 (10%)
  11-15                        -                           4 (40%)
  >15                          -                           4 (40%)

¥ Mean ± SD
* One participant chose not to identify
a Student graduate year data were only collected among novices.
b Years of experience data were only collected among experts.

Table II. Bivariate correlations between age and performance scores, and comparisons of mean performance scores by sex.

                        Performance score
Study variable          Phase I           Phase II          p-value
Age                     r = 0.72          r = 0.63          <0.001
Sex                     Mean(sd)          Mean(sd)          0.226 (Phase I); 0.714 (Phase II)
  Male                  56 ± 18.06        57 ± 21.14
  Female                45 ± 11.98        41 ± 15.8

r = correlation coefficient

Table III. Bivariate relationship between experience and performance scores per phase of the module (dependent samples t-tests).

                  Performance score
Experience        Phase I          Phase II         p-value
Novice            44 ± 8.9         42 ± 15.6        0.397
Expert            76 ± 4.8         78 ± 5.9         0.709

Table IV. Bivariate ordinary least squares regression models examining the unadjusted relationships between experience, sex, age, and performance score; n=46 (Phase 1) and n=44 (Phase 2).

                 Phase 1                                 Phase 2
Variable         Beta coefficient (95% CI)    p-value    Beta coefficient (95% CI)    p-value
Experience
  Novice         Referent                                Referent
  Expert         32.21 (25.40-39.01)          <0.001     35.71 (25.46-45.95)          <0.001
Age              1.13 (0.79-1.47)             <0.001     1.23 (0.76-1.70)             <0.001
Male             10.87 (1.39-20.45)           0.026      16.20 (4.27-28.13)           0.009

Table V. Multivariate ordinary least squares regression models testing the magnitude of the relationship between experience and Phase 1 and 2 scores, controlling for age and sex; n=46 (Phase 1) and n=44 (Phase 2).

                 Phase 1                                 Phase 2
Variable         Beta coefficient (95% CI)    p-value    Beta coefficient (95% CI)    p-value
Experience
  Novice         Referent                                Referent
  Expert         32.91 (16.87-48.95)          <0.001     33.85 (10.48-57.22)          0.006
Age              -0.05 (-0.68-0.56)           0.846      -0.05 (-0.95-0.85)           0.910
Male             0.97 (-5.49-7.43)            0.763      6.14 (-3.51-15.79)           0.206

Table VI. Descriptive and median comparison tests of validity assessments across novices and experts.

Questionnaire item                        Novices*    Experts*    p-value
Face Validity
  Visual appearance                       5 (4-5)     4.5 (3-5)   0.930
  Utility for training                    5 (3-5)     4.5 (4-5)   0.493
  Interest to rehearse more procedures    5 (3-5)     4.5 (3-5)   0.732
  Interest for use in training programs   5 (3-5)     4 (3-5)     0.756
Content Validity
  Anatomic accuracy                       -           4 (3-5)
  Surgical sequence                       -           3.5 (3-5)
  Instrumentation                         -           4 (3-5)

* Reported as Median (Range)
