Accepted Manuscript The test-retest reliability of three computerized neurocognitive tests used in the assessment of sport concussion
Jacob E. Resch, Mathew Schneider, C. Munro Cullum
PII: S0167-8760(17)30537-8
DOI: 10.1016/j.ijpsycho.2017.09.011
Reference: INTPSY 11321
To appear in: International Journal of Psychophysiology
Received date: 15 December 2016
Revised date: 6 September 2017
Accepted date: 12 September 2017
Please cite this article as: Jacob E. Resch, Mathew Schneider, C. Munro Cullum, The test-retest reliability of three computerized neurocognitive tests used in the assessment of sport concussion, International Journal of Psychophysiology (2017), doi: 10.1016/j.ijpsycho.2017.09.011
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
The Test-Retest Reliability of Three Computerized Neurocognitive Tests Used in the Assessment of Sport Concussion

Corresponding Author:
Jacob E. Resch, Ph.D. The University of Virginia 210 Emmet St. S Memorial Gymnasium Charlottesville, VA 22932
[email protected]

Mathew Schneider The University of Texas at Arlington 500 S. Nedderman Dr. Arlington, TX 7601

C. Munro Cullum, Ph.D. The University of Texas Southwestern Medical Center 6333 Forest Park Rd 1st Floor, BLA100 Dallas, TX 75235
[email protected]
Abstract
Computerized neurocognitive tests (CNTs) are widely used at all competitive levels of sport to assess sport concussion (SC). Whereas there are multiple CNTs available, little is known about how some of the most popular platforms compare. The purpose of this study was to investigate the test-retest reliability of the Automated Neuropsychological Assessment Metrics (ANAM), Concussion Vital Signs (CVS) and the Immediate Postconcussion Assessment and Cognitive Testing battery (ImPACT) using clinically relevant time points in healthy college-age participants. Participants were healthy college-age students (N = 128) randomly assigned into one of three groups which were administered ANAM, CVS, or ImPACT at Days 1, 45 and 50. Intraclass correlation coefficients (ICCs) and Pearson correlations were used to assess reliability of the various CNT scores and subtest scores between time points. Participants were tested approximately 47.1 ± 2.75 days after time point 1 and approximately 7.0 ± 2.45 days after time point 2. ICC values ranged from 0.18 (Procedural Reaction Time) to 0.53 (Mathematical Processing and Simple Reaction Time 1) for ANAM, 0.14 (Continuous Performance Test) to 0.85 (Reaction Time) for CVS, and 0.19 (Verbal Memory) to 0.89 (Visual Motor Speed) for ImPACT. Significant improvements (p < .05) across time were observed for 7 of 10 CNS Vital Signs composite scores, but no additional significant changes in performance were observed for the remaining CNTs. Overall, weak to strong reliability coefficients for ANAM, CVS, and ImPACT were observed when using clinically relevant time points of repeated administration.
Key Words: Concussion, Computerized Neurocognitive Test, Reliability, Mild Traumatic Brain Injury
1.1 Introduction
Computerized neurocognitive tests (CNTs) have become a commonly used clinical measure of sport concussion (SC) at all levels of sport. A survey of approximately 1000 certified athletic trainers (ATs) from a variety of clinical backgrounds identified CNTs as the fourth most common assessment tool for SC behind the clinical examination, the Standardized Assessment of Concussion and the Romberg test (Lynall et al., 2013). The application of CNTs to SC has increased by 29% compared to a similar survey of ATs in 1999 (Lynall et al., 2013). Since then, the increased use of CNTs to assess SC by health care professionals practicing in a variety of clinical settings has seen the introduction of several new CNT platforms to the marketplace (Ferrara et al., 2001, Resch et al., 2013). The most common tests at the time were the Immediate Postconcussion Assessment and Cognitive Test (ImPACT) battery (90% to 93.3%), Cogsport or Axon Sport (2.8% to 4%) and Headminder (1.4% to 4%) (Meehan et al., 2012, Lynall et al., 2013). Despite the widespread use of CNTs, each has been demonstrated to have variable psychometric evidence to support its clinical use in the management of SC (Randolph et al., 2005, Resch et al., 2013).
In 2005, a review paper addressed the measurement properties of paper-and-pencil and computerized neurocognitive tests used to assess SC. Based on their findings, the authors urged health care providers to use caution with CNTs until further empirical research was established (Randolph et al., 2005). An update of that review included 29 papers, published since 2005, that examined the reliability and validity evidence of four commonly used CNTs (Resch et al., 2013). The results of the reviewed articles revealed significant variability in findings, which the authors attributed to methodological differences (e.g. sample size, test-retest intervals and participant age). Each CNT demonstrated variable reliability and limited, but more narrow, ranges of validity evidence (Resch et al., 2013). In a systematic review of studies investigating the test-retest reliability of ImPACT, Alsalaheen and colleagues expanded upon the methodological differences between studies which addressed the variable test-retest
reliability of the CNT. The authors concluded that methodological differences partially contributed to the variable reported reliability coefficients (Alsalaheen et al., 2015).

Several sources of error may contribute to variable reliability coefficients, including the test-retest interval. For example, Broglio et al. and Resch et al. used clinically derived time points over 50 days to assess the test-retest reliability of ImPACT. The authors reported strong to weak (.76 to .23) intraclass correlation coefficient (ICC) values (Broglio et al., 2007a, Resch, 2013a). Authors of related research administered ImPACT using time frames which ranged from one month to two years and reported strong to moderate ICC values between .80 and .46 (Schatz and Ferris, 2013, Schatz, 2009, Elbin et al., 2011a). Littleton et al. addressed another CNT, Concussion Vital Signs, at three time points using one-week test-retest intervals. Similar to ImPACT, the authors reported test-retest reliability coefficients which ranged from strong (0.86) to weak (0.10) (Littleton et al., 2015). In addition to variable test-retest intervals, methodological considerations which may influence measurement properties of CNTs include: the inclusion/absence of an external measure of effort, the use of alternate vs. parallel forms, the administration of multiple CNTs during the same test period, administration in an individual or group setting (plus group size differences), and the particular reliability coefficients and differences associated with exclusion criteria (Broglio et al., 2007a, Schatz et al., 2006, Resch, 2013b, Elbin et al., 2011a, Broglio et al., 2007b, Nakayama et al., 2014, Alsalaheen et al., 2015).

To date, two studies have simultaneously addressed the test-retest reliability of multiple CNTs using the same methodology. Cole et al. compared ImPACT, CNS Vital Signs, and Automated Neuropsychological Assessment Metrics (ANAM) using a 30-day test-retest interval in a military sample. Participants were between 19 and 59 years of age with variable levels of education (Cole et al., 2013). Subjects were randomly assigned into one of four groups who were administered ANAM, CNS Vital Signs or ImPACT on Day 1 and again approximately 30 days later. Cole et al. reported strong to weak ICC values for ANAM (.79 to .40) and CNS Vital
Signs (.79 to .29) and strong to moderate ICC values for ImPACT (.83 to .50) (Cole et al., 2013). Nelson et al. compared the stability reliability of ANAM, Axon Sports/Cogstate Sport, and ImPACT using test-retest intervals which ranged from seven to 198 days in high school and collegiate athletes. Participants were randomly assigned two CNTs which were administered at each time point. The authors reported that the majority (~75%) of reliability coefficients fell below the criterion value (.70) deemed acceptable for clinical utility (Nelson et al., 2016).

A head-to-head comparison of CNTs using a narrower age range representative of adult athletes would be beneficial for clinicians who routinely assess SC. Additionally, using the same methodology (e.g. time points, measures of effort, individual or group setting) to assess multiple CNTs would reduce extraneous error which may influence reliability. The purpose of the current study was to assess the test-retest reliability of three commercially available CNTs used to assess SC in college-aged participants using clinically relevant time points (Broglio et al., 2007a, Resch, 2013a). We hypothesized that each test and its respective neurocognitive indices would achieve acceptable reliability (ICC > .75) for clinical utility. The criterion value (ICC > .75) was chosen as it is not as conservative as the value (ICC = .90) elected by Randolph et al. and not as liberal as the value (ICC = .60) suggested by Anastasi (Randolph et al., 2005, Anastasi, 1998).

1.2.1 Methods
Study site and subjects
The current study was conducted at a large public metropolitan university and was approved by the institutional review board. Participants were non-athlete university students who were recruited from the general student population. Non-athletes were selected because a) the university's student-athletes had previous exposure to ImPACT (as part of a university concussion management protocol), which may skew the data due to practice effects (Resch, 2013b, Broglio et al., 2007a), and b) enrolling student-athletes would potentially limit the clinical utility of ImPACT, either through repeated exposure to the test or by presenting multiple neurocognitive test formats to participants, which may lead to confusion between test design and instructions (e.g. confusing one test section with another), thereby decreasing the sensitivity of the clinical measure following suspected concussion. Subjects were excluded if English was not their primary language or they had a self-reported learning disability or attention deficit disorder, psychiatric condition, or a concussion within six months before or any time throughout the study. Additionally, participants were excluded if they did not complete all three time points, had an invalid baseline test as determined by each test manufacturer’s criteria, provided suboptimal effort at any time point on the Green’s Word Memory Test (WMT), or if they had prior exposure to any of the administered CNTs. Sample size was calculated to achieve a power of .80 and d = .75, which is consistent with related literature (Resch, 2013a, Broglio et al., 2007a, Maxwell, 2003).
Upon providing consent, all participants (N = 156) completed a health history questionnaire which was used to screen for any of the aforementioned exclusion criteria. Subjects were then randomly assigned into three groups to determine the administration of ANAM, CNS Vital Signs, or ImPACT at each time point. Participants were tested in a quiet and controlled laboratory where distractions were minimized. Subjects were tested individually or in groups consisting of < 4 participants to reduce additional distractions which may be detrimental to test performance (Moser et al., 2011, Echemendia et al., 2013). The majority of participants were assessed in groups of two. Participants completed their randomly assigned CNT at each time point on desktop computers operating on the Microsoft Windows 7™ operating system using the Internet Explorer™ web browser. Each desktop computer was equipped with an external mouse and keyboard for response selection. Throughout the study period both JAVA™ and Adobe Flash™ were updated to ensure optimal performance of each CNT.

1.2.2 Measures
Outcome Measures
Green’s Word Memory Test
The WMT (Green, 2005) is a measure of effort and has been previously described in the literature (Resch, 2013a). Participants are asked to remember 20 pairs of words. Next, each subject completes four subtests which are based on the immediate and delayed recall of each word in the provided word pair. Following the completion of the initial section, there is a 30-minute delay prior to the beginning of the delayed WMT subtests. The completion of four WMT subtests results in immediate and delayed recall, consistency of responses, multiple choice, paired associates, and free recall composite scores. For the current study, participants were excluded if they scored below 85% on the immediate recall, delayed recall and consistency composite scores.
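The effort screen described above reduces to a threshold check on three composite scores. The following is a minimal sketch in Python, assuming (the text does not state this explicitly) that a score below 85% on any one of the three composites was sufficient for exclusion. The function and key names are hypothetical and are not part of the WMT software.

```python
# Hypothetical sketch of the WMT effort criterion described in the text.
WMT_CUTOFF = 85.0  # percent, taken from the exclusion criterion in the text
WMT_COMPOSITES = ("immediate_recall", "delayed_recall", "consistency")


def passes_wmt(scores, cutoff=WMT_CUTOFF):
    """Return True if every screened composite is at or above the cutoff.

    Assumption: falling below the cutoff on any single composite is
    enough to exclude the participant.
    """
    return all(scores[name] >= cutoff for name in WMT_COMPOSITES)


# Illustrative (made-up) scores for one participant at one time point.
example = {"immediate_recall": 97.5, "delayed_recall": 95.0, "consistency": 92.5}
included = passes_wmt(example)
```

Because participants had to show adequate effort at every time point, a check of this kind would be applied to each of the three sessions separately.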
ANAM
The ANAM4™ Sports Medicine Battery (Vista Life Sciences, Norman, OK) is a CNT commonly used in the military setting as a predeployment cognitive screening tool which assesses reaction time, information processing and memory. ANAM consists of 10 subtests which include Simple Reaction Time, Code-Substitution Immediate and Delayed, Procedural Reaction Time, Spatial Processing, Matching to Sample, Memory Search, and Mathematical Processing. ANAM also provides a 21-item symptom scale. Throughput scores, defined as the number of correct responses per minute of available response time, are calculated for each task (Cernich et al., 2007). Test validity was determined by manually reviewing each participant’s composite report for each time point. Anyone who scored 56% correct or lower on any subtest was excluded from further analysis as suggested by the test manufacturer (Cernich et al., 2007). Time to complete ANAM was approximately 30 minutes.
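The throughput definition and the accuracy screen above can be written out directly. The sketch below is illustrative only: the function names, the unit of the time argument, and the example values are assumptions, and ANAM computes these scores internally.

```python
def throughput(correct_responses, available_response_seconds):
    """Correct responses per minute of available response time."""
    if available_response_seconds <= 0:
        raise ValueError("available response time must be positive")
    return correct_responses / (available_response_seconds / 60.0)


def passes_accuracy_screen(percent_correct_by_subtest, cutoff=56.0):
    """Return True only if every subtest is above the cutoff.

    The text excludes anyone scoring 56% correct or lower on any subtest,
    so a score exactly at the cutoff fails the screen.
    """
    return all(pct > cutoff for pct in percent_correct_by_subtest.values())


# Made-up example: 40 correct responses over 120 s of available response time.
example_throughput = throughput(40, 120)
```

Because throughput combines speed and accuracy in one number, a participant who responds quickly but inaccurately and one who responds slowly but accurately can receive similar scores, which is one reason the separate accuracy screen is applied.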
CNS Vital Signs
CNS Vital Signs® (Morrisville, NC) is a commercially available CNT used in the assessment of cognitive disorders (Gualtieri and Johnson, 2006, Cole et al., 2013). CNS Vital Signs consists of seven subtests which include Verbal and Visual Memory, Finger Tapping, Symbol Digit Coding, and Stroop, Shifting Attention and Continuous Performance Tasks. Two or more of the seven subtests are combined to calculate Verbal and Visual Memory, Psychomotor Speed, Executive Functioning, Cognitive Flexibility, Continuous Performance Task Correct Responses, and Reaction Time composite scores. Additionally, a 24-item symptom scale is included in which participants rate each item on a Likert scale which ranges from 1 (mild) to 6 (severe). CNS Vital Signs has automated invalidity criteria based on the manufacturer’s instructions which were used to determine participant inclusion (CNS Vital Signs, 2014).

CNS Vital Signs is available in two formats. The first format, which was used for this study, is the CNS Vital Signs test battery, which has been described as a brief clinical evaluation tool used to assess a variety of psychiatric conditions (Gualtieri and Johnson, 2006). From the CNS Vital Signs test battery, tests were selected which represented the more focused battery of tests used in the clinical measure Concussion Vital Signs. Though the CNS Vital Signs modules are identical to those administered by the Concussion Vital Signs platform, their output scores are different. The CNS Vital Signs report provides raw scores (which are not age adjusted), standard scores for which the age-group mean is 100 and the standard deviation is 15, and percentile rank. The Concussion Vital Signs report does not provide standardized scores. The choice to omit the standard score on Concussion Vital Signs was based on simplicity and ease of use (Rogers, 2015). The time needed to complete CNS Vital Signs was approximately 25 minutes.

ImPACT
ImPACT (ImPACT Applications, Pittsburgh, PA, Version 2.0) is a commonly used CNT for the evaluation of SC (Lynall et al., 2013). It comprises 8 subtests, including immediate and delayed word and design recall, a symbol-match test, 3-letter recall, the X’s and O’s test, and a color-match test, which assess attention, memory, reaction time, and information processing speed. Two or more of the aforementioned tasks are used to calculate Verbal and Visual Memory, Visual Motor Speed, and Reaction Time composite scores. Additionally, ImPACT administers a 22-item symptom scale in which participants rank their symptoms at the time of testing on a Likert scale which ranges from 1 “mild” to 6 “severe”. The values from each of the 22 items are summed to create a composite Total Symptom Score. ImPACT also accounts for effort using automated invalidity criteria which have been previously described in the literature (Lovell, 2011). Completion of ImPACT took approximately 25 minutes.
Assessment procedures
The current protocol is similar to that employed by Broglio et al., Resch et al., and Nakayama et al. (Broglio et al., 2007a, Resch, 2013a, Nakayama et al., 2014). Participants were assessed at three time points (Days 1, 45, and 50). These empirically chosen time points represent the typical timing between a pre-season baseline, the acute assessment of a concussive injury, and the average number of days after which an athlete would report being symptom free (Broglio et al., 2007a). Unlike these previous investigations, however, we used the WMT rather than the Rey 15-Item Memory Test as an external measure of effort and incorporated two additional CNTs (ANAM and CNS Vital Signs).

At time point 1, subjects reviewed and provided consent, which was followed by the completion of a health history questionnaire. Subjects then completed the initial portion of the WMT followed by the administration of their randomly assigned CNT. For each session, subjects completed a sequential alternate form of each CNT. For example, for ImPACT, participants completed the baseline assessment (Form 1) at Day 1 and then were administered post-injury assessments 1 and 2 (Forms 2 and 3) at Days 45 and 50, respectively. Following the completion of the assigned CNT, the remaining portion of the WMT was administered. Participants were reassessed approximately 45 days later (time point 2) and again five days after their second session (time point 3).
1.2.3 Statistical analyses
ICCs were calculated for each CNT’s (ANAM, CNS Vital Signs, and ImPACT) composite or throughput scores. Specifically, ICCs were calculated between time points 1, 2 and 3 using a 1-way analysis of variance (ANOVA) model for which single measure reliability coefficients were obtained. ICC values range from “0” to “1.0” with values closer to “1.0” indicating greater reliability (Baumgartner, 2007). For CNS Vital Signs, ICC values were calculated for both raw and transformed composite scores.
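For readers who wish to see how a single-measure ICC from a 1-way ANOVA model is obtained, the following Python sketch uses the standard one-way random-effects formula, ICC = (MS_between - MS_within) / (MS_between + (k - 1) MS_within), where k is the number of administrations being compared. It is an illustration under that assumption, not the SPSS procedure used in the study; the function names and the example scores are hypothetical.

```python
from math import sqrt
from statistics import mean


def icc_one_way_single(scores):
    """Single-measure ICC from a one-way random-effects ANOVA.

    `scores` holds one row per participant and one column per
    administration (e.g. two time points).
    """
    n = len(scores)      # participants
    k = len(scores[0])   # administrations per participant
    grand_mean = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("scores have no variance; ICC is undefined")
    return (ms_between - ms_within) / denominator


def pearson_r(x, y):
    """Pearson correlation, included only for comparison with the ICC."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up scores: every participant improves by exactly 2 points at retest.
test_retest = [[1, 3], [2, 4], [3, 5]]
icc = icc_one_way_single(test_retest)
r = pearson_r([row[0] for row in test_retest], [row[1] for row in test_retest])
```

In this contrived example the rank order of participants is perfectly preserved, so the Pearson correlation is 1, whereas the one-way ICC is 0 because the uniform shift between administrations is counted as within-participant error. This illustrates why the two coefficients can diverge when practice effects are present.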
Effort was assessed based on the WMT’s and each CNT’s invalidity criteria as described in the manufacturer’s instructions (Lovell, 2011, Cernich et al., 2007, 2013). An analysis of variance (ANOVA) was used to assess for between-group differences on the WMT at each time point. If significant, Tukey’s post-hoc analysis was used to identify the origin of the significant difference. A repeated measures ANOVA was used to assess differences in effort across time. Repeated measures ANOVAs were also used to examine differences in group performance across time for each CNT. Greenhouse-Geisser corrections were implemented when violations of sphericity were observed. A Bonferroni adjustment was made for multiple pairwise comparisons during post hoc analyses. Effect sizes were calculated with Cohen’s d. All analyses were performed with SPSS (Version 22.0, IBM, Armonk, NY) with statistical significance set a priori at α < .05.

1.3 Results
A total of 156 participants enrolled in the current study, of whom 128 (n = 128) were included for data analysis. Participants were excluded based on poor performance or missing data on the Green’s WMT (n = 14) or their assigned computerized neurocognitive test (n = 13) as depicted in Figure 1.

Overall, groups did not differ in terms of age or time between time points (p > .05). For those who were administered ImPACT, female participants were significantly older than male subjects (F(1,39) = 5.11, p = .03). On average, participants were reassessed 47.1 ± 2.75 days after time point 1 and 7.0 ± 2.45 days after time point 2. Descriptive data for each group are presented in Table 1.
Table 1. Participants’ Descriptive Data (Mean (SD)); † = p < .05

Group                       Age             Days Between Time Points 1 and 2    Days Between Time Points 2 and 3
ANAM (n = 42)               19.2 (1.61)     46.6 (2.06)                         7.0 (3.06)
  Males (n = 8)             18.9 (0.99)
  Females (n = 34)          19.3 (1.72)
CNS Vital Signs (n = 45)    19.4 (1.58)     47.7 (3.47)                         6.5 (1.37)
  Males (n = 14)            19.9 (1.94)
  Females (n = 31)          19.2 (1.40)
ImPACT (n = 41)             20.0 (2.80)     47.0 (2.46)                         7.2 (2.26)
  Males (n = 8)             21.0 (2.62)†
  Females (n = 33)          19.4 (1.62)
1.3.1 Assessment of Effort
In terms of the Green’s WMT, all participants included in our analyses exceeded the criterion score of 85% for Immediate and Delayed Recall and Consistency composite scores. Significant differences did exist amongst the three groups on one or more WMT composite scores at each time point (p < .05), although all participants (n = 128) exceeded the criterion necessary to be included in our analyses.

1.3.2 Computerized Neurocognitive Test Results

ICC Values
Calculated ICC values for ANAM, CNS Vital Signs, and ImPACT between time points 1 and 2, time points 1 and 3, and time points 2 and 3 may be found in Table 2. Each test was observed to have strong to weak reliability coefficients.

ANAM
Descriptive data for ANAM throughput scores may be found in Table 3. ANAM ICC (0.53 to 0.18) and Pearson (0.55 to 0.22) reliability values ranged from moderate to weak, and all fell below our criterion ICC value of 0.75. The highest reliability values were observed for Simple Reaction Time 1 (ICC = 0.53, r = 0.53) between time points 1 and 3 and Mathematical
Processing (ICC = 0.53, r = 0.54). The lowest reliability values were observed for Procedural Reaction Time (ICC = 0.18, r = 0.22) between time points 1 and 2.

CNS Vital Signs
Descriptive data for CNS Vital Signs may be found in Table 4. CNS Vital Signs ICC (0.85 to 0.14) and Pearson (0.91 to 0.18) reliability values ranged from strong to weak. Minimal differences in test-retest reliability values were observed between raw and standardized scores; therefore, only standardized composite scores are presented due to their clinical relevance to sport concussion management. The highest reliability values were observed for Psychomotor (ICC = 0.79, r = 0.82) and Reaction Time (ICC = 0.85, r = 0.91) composite scores between time points 2 and 3. The weakest reliability values were observed for the Continuous Performance Task (ICC = 0.14, r = 0.18). Practice effects were observed only for CNS Vital Signs and are presented in Table 4. Specifically, significant improvements across time were observed for Psychomotor, Reaction Time, Complex Attention, Cognitive Flexibility, Processing Speed, Executive Function, and Visual Memory raw and standardized composite scores.
Table 2. Intraclass correlation(1), [95% Confidence Intervals] and (Pearson correlation) coefficients and heat map for ANAM, CNS Vital Signs and ImPACT between time points 1, 2 and 3. Test Time Points 1 to 2 Time Points 1 to 3 Time Points 2 to 3 ImPACT
Reaction Time ANAM Code Substitution
0.63 [0.17 – 0.66] (0.63) 0.47 [0.20 – 0.68] (0.50) 0.89 [.80 – 0.94] (0.89) 0.59 [0.35 – 0.76] (0.65)
0.29 [-0.01 – 0.55] (0.29)
0.45 [0.17 – 0.66] (0.44)
0.25 [-0.05 – 0.51] (0.24)
Visual Motor Speed
0.32 [0.01 – 0.56] (0.33) 0.54 [0.28 – 0.72] (0.54) 0.74 [0.56 – 0.85] (0.76) 0.53 [0.27 – 0.72] (0.55)
Visual Memory
0.19 [-0.12 – 0.47] (0.20) 0.45 [0.28 – 0.72] (0.45) 0.81 [-0.12 – 0.47] (0.82) 0.57 [0.33 – 0.75] (0.57)
Verbal Memory
0.38 [0.09 – 0.61] (0.37)
0.26 [-0.04 – 0.52] (0.25)
0.31[0.02 – 0.56] (0.31)
0.18 [-0.13 – 0.46] (0.22) 0.28 [-0.03 – 0.51] (0.30)
0.27 [-0.04 – 0.52] (0.31) 0.45 [0.17 – 0.66] (0.48)
0.32 [0.02 – 0.56] (0.32) 0.53 [0.28 – 0.72] (0.54)
Match to Sample
0.43 [0.15 – 0.65] (0.42)
0.34 [0.05 – 0.58] (0.35)
0.45 [-0.01 – 0.54] (0.29)
0.32 [0.03 – 0.57] (0.31) 0.50 [0.24 – 0.70] (0.50)
0.53 [0.28 – 0.72] (0.53) 0.45 [0.18 – 0.66] (0.44)
0.46 [0.19 – 0.67] (0.45) 0.51 [0.25 – 0.70] (0.51)
Visual Memory
0.40 [0.13 – 0.62] (0.45) 0.76 [0.61 – 0.86] (0.78) 0.75 [0.58 – 0.85] (0.77) 0.56 [0.33 – 0.73] (0.64) 0.69 [0.50 – 0.82] (0.72) 0.67 [0.47 – 0.80] (0.72) 0.65 [0.45 – 0.79] (0.70) 0.32 [0.04 – 0.56] (0.33) 0.42 [0.15 – 0.63] (0.51)
0.33 [0.04 – 0.56] (0.39) 0.59 [0.36 – 0.75] (0.67) 0.58 [0.35 – 0.74] (0.70) 0.33 [0.05 – 0.57] (0.37) 0.45 [0.12 – 0.61] (0.61) 0.45 [0.19 – 0.66] (0.64) 0.37 [0.09 – 0.59] (0.61) 0.41 [0.13 – 0.62] (0.44) 0.55 [0.31 – 0.73] (0.61)
0.40 [0.12 – 0.47] (0.40) 0.79 [0.65 – 0.88] (0.82) 0.85 [0.74 – 0.91] (0.91) 0.50 [0.25 – 0.69] (0.65) 0.69 [0.47 – 0.80] (0.76) 0.69 [0.50 – 0.82] (0.73) 0.63 [0.47 – 0.80] (0.77) 0.48 [0.22 – 0.67] (0.48) 0.46 [0.20 – 0.66] (0.46)
CPT
0.14 [-0.15 – 0.42] (0.18)
0.21 [-0.08 – 0.47] (.22)
0.76 [0.60 – 0.86] (.79)
Memory
Reaction Time
Complex Attention
Cognitive Flexibility Processing Speed Executive Function Verbal Memory
Psychomotor
Simple Reaction Time 2 CNS Vital Signs
Simple Reaction Time 1
Code Substitution Delayed Procedural Reaction Time Mathematical Processing
Table 3. Group 3 Means (SD)s for ANAM throughput scores.

ANAM Throughput Scores (n = 42), Mean (SD)
Measure                       Time Point 1      Time Point 2      Time Point 3
Code Substitution             54.6 (14.89)      57.2 (16.35)      56.1 (15.29)
Code Substitution Delayed     61.5 (10.76)      62.0 (10.98)      61.0 (12.71)
Procedural Reaction Time      105.3 (12.02)     110.3 (16.98)     110.4 (12.54)
Match to Sample               40.4 (12.02)      39.7 (11.02)      37.9 (11.62)
Mathematical Processing       23.7 (5.77)       23.8 (7.19)       25.0 (6.94)
Simple Reaction Time 1        231.7 (32.70)     233.2 (29.36)     235.5 (31.27)
Simple Reaction Time 2        232.6 (31.91)     229.7 (32.99)     234.4 (34.10)
Table 4. Group 3 Means (SD)s for CNS Vital Signs raw and standardized composite scores. † = a significant difference (p < .05) when compared to time point 1. ǂ = a significant difference (p < .05) between time points 2 and 3. Continuous Performance Task (CPT)
Time Point 1
Memory
Psychomotor
102.4 (6.23) 104.2 (12.57)
182.0 (21.00) 101.1 (13.32)
2
100.6 (10.00) 100.6 (20.07)
186.6 (21.30) 104.0 (13.29)
3
101.2 (10.69) 103.4 (24.59)
190.4 (24.64) 107.5 (13.30)
†
†ǂ
CNS Vital Signs Composite Scores (n = 38) Raw: Mean (SD) Standardized: Mean (SD) Attention Cognitive Processing Flexibility Speed 7.3 (4.08) 48.9 (9.95) 66.0 (11.37) 102.1 (13.41) 101.7 (14.00) 104.7 (14.46)
Reaction Time 648.8 (74.08) 94.3 (14.23) 629.6 (85.69) 97.5 (16.08)
†
594.0 (102.78) 102.3(13.66)
†
8.3 (6.71) 98.7 (21.93) †ǂ
†ǂ
7.8 (13.96) ǂ 106.8 (13.16)
70.8 (13.27) † 110.4 (16.80)
†ǂ
56.1 (8.73) †ǂ 112.0 (13.01)
†
51.6 (10.90) † 105.8 (15.99)
Executive Function 50.3 (9.46) 102.2 (13.27)
†ǂ
75.6 (13.50) †ǂ 116.0 (16.94)
†
53.5 (10.44) † 106.9 (15.13) †ǂ
57.7 (8.36) †ǂ 113.0 (12.28)
Verbal Memory 52.3 (4.14) 98.6 (15.3)
Visual Memory 50.1 (3.73) 107.3 (10.76)
52.4 (5.37) 99.0 (19.9)
48.2 (5.89) 101.9 (16.66)
53.1 (10.31) 97.5 (23.74)
49.1 (5.90) 104.6 (16.59)
†
CPT 39.8 (0.53) -39.6 (1.03) -39.7 (0.82) --
ImPACT
Descriptive data for ImPACT may be found in Table 5. A significant improvement for Visual Motor Speed was observed across time (F(1.74, 69.57) = 3.34, p = .05) between time points 1 and 3 (t(40) = -2.19, p = .03). ImPACT ICC (.89 to .19) and Pearson (.89 to .20) reliability coefficients ranged from strong to weak. The highest reliability coefficients were observed for Visual Motor Speed (ICC = 0.89, r = 0.89) between time points 1 and 2 and between time points 2 and 3. The lowest reliability values were observed for Verbal Memory (ICC = 0.19, r = 0.20) between time points 1 and 2.

Table 5. Group 1 Means (SD)s for ImPACT composite scores. † = a significant difference (p < .05) when compared to time point 1. ǂ = a significant difference (p < .05) between time points 2 and 3.

ImPACT Composite Scores (n = 41), Mean (SD)
Measure                  Time Point 1     Time Point 2     Time Point 3
Verbal Memory            86.8 (9.67)      89.2 (8.47)      89.4 (10.37)
Visual Memory            75.3 (15.00)     76.8 (11.86)     73.2 (14.10)
Visual Motor Speed       40.0 (6.30)      41.0 (6.62)      41.5 (6.26)†
Reaction Time            .60 (.08)        .59 (.07)        .60 (.12)
Impulse Control Score    5.3 (4.24)       6.4 (5.01)       6.1 (4.26)
Total Symptom Score      6.8 (10.31)      5.3 (6.08)       4.6 (7.47)

1.4 Discussion
The current study is the second to compare the test-retest reliability of three commercially available CNTs used to assess SC (Cole et al., 2013) and the first to investigate this topic using healthy college-age participants at clinically relevant time points (Resch, 2013b, Broglio et al., 2007a, Nakayama et al., 2014). Unique to our study was the finding that digitized measures of memory across neurocognitive tests were observed to have the lowest reliability coefficients (ICC = 0.63 to 0.19, r = 0.63 to 0.20) and measures of reaction time (ICC = 0.85 to 0.18, r = .91 to .22) and information processing (ICC = .89 to .28, r = .89 to .30) were observed to have the highest values across platforms. Overall, two of the three tests examined
demonstrated one or more composite scores which met our criterion value of 0.75. That said, the majority (84%) of reliability coefficients, regardless of CNT, fell below what is considered acceptable for clinical utility when calculated between time points representative of the baseline, post-injury, and symptom free assessments.

Though several studies have addressed the test-retest reliability of specific CNTs, only two other investigations employed the same methodology to examine multiple platforms (Nelson et al., 2016, Cole et al., 2013). Cole et al. examined the stability reliability of four commonly used CNTs in a young to middle-aged military sample. Using a 30-day test-retest interval, the authors reported strong to weak (ICC = 0.83 to 0.22, r = 0.86 to 0.25) reliability coefficients. Despite demographic differences between samples (e.g. age, gender, level of education and compensation), our findings were similar to those reported by Cole et al. Similar to our findings, Cole et al. also reported that measures of response speed (e.g. Reaction Time and Information Processing) had superior reliability vs. other CNT indices. For the remaining neurocognitive outcome scores (e.g. memory scores) we observed predominantly moderate to weak reliability across platforms. Similar to our findings, Nelson and colleagues calculated test-retest reliability coefficients using serial test-retest intervals ranging from one week to approximately six months and reported that approximately 25% of all CNT indices met a clinically acceptable level of reliability (Nelson et al., 2016). Despite each participant being administered two CNTs at each time point, Nelson et al. reported similar reliability coefficients which ranged from strong (0.79) to weak (0.30) for ANAM and Axon (.81 to .32) and strong (0.78) to moderate (0.49) values for ImPACT (Nelson et al., 2016). Though it is difficult to make direct comparisons amongst CNTs due to variations in composite score calculations, computerized measures of working and short-term memory have been routinely demonstrated to be the least reliable subtests when compared to the aforementioned timed tests (Cole et al., 2013, Randolph et al., 2005, Resch et al., 2013, Nakayama et al., 2014, Littleton et al., 2015, Nelson et al., 2016). Given that these measures possess suboptimal reliability and hence questionable validity, caution is warranted
when using these indices for clinical decision making and/or drawing strong conclusions in research. Furthermore, CNT manufacturers should consider reevaluating and/or replacing subtests which contribute to outcome scores with lower reliability (e.g. word and visual memory) in order to improve test reliability and, in turn, the validity of each measure. It is also important to note that limited to no literature exists examining the measurement properties of CNT
outcome scores when administered in languages other than English, which may serve as another
direction for future research.
Though our reliability coefficients are largely consistent with those reported in related
research (Cole et al., 2013, Nelson et al., 2016), we observed lower values when compared to studies investigating each platform independently. These discrepancies are potentially due to
variable methodology (e.g. age of participants (Cole et al., 2013, Gualtieri and Johnson, 2006), test setting (Cole et al., 2013), test-retest interval (Schatz, 2009, Elbin et al., 2011b, Littleton et
al., 2015), the administration of multiple CNTs during the same test session (Broglio et al.,
2007a, Nelson et al., 2016), and hardware and/or software variations (Rahman-Filipiak and Woodard, 2013)) between the current and related studies. Our results may be more
generalizable due to factors such as recruitment from the general student body and a lack of
incentives to inspire effort. However, though our participants were not collegiate student-athletes, that does not preclude the possibility that they participated in sport at the secondary
school level or at a recreational level. In terms of a lack of incentivized participation, though monetary incentives (e.g. cash or gift cards) were not provided for the completion of one or more time points, the rationale for the study was explained to foster effort while completing the randomly assigned CNT. Additionally, each test session was proctored by a member of the research team to further evaluate visible effort (e.g. CNT attentiveness) and/or the understanding of CNT instructions.
One of the benefits of CNTs is the ability to administer alternate “equivalent” or parallel forms in order to reduce practice effects (Collie and Maruff, 2003). For the current study,
participants were administered alternate forms of each test which varied in the stimuli (e.g. words and designs) in order to reflect clinical practice. Similar investigations which have reported higher reliability coefficients either administered the same test form at each time point or did not report which forms of the test were used to determine reliability (Resch, In press, Schatz, 2009, Elbin et al., 2011b, Nakayama et al., 2014). The chosen statistical model(s) may
have influenced the reliability values reported in each study. Specific to ICC values, varying
models have been used to determine the test-retest reliability of CNTs (Alsalaheen et al., 2015). We elected to use a one-way ANOVA model to calculate ICCs and Pearson correlation
coefficients to determine test-retest reliability. An additional source of the reported variability may be the selection of “single” versus “average” measure ICC values. Single measure ICC
values were chosen because they reflect reliability based on a single rater rather than multiple raters (McGraw and Wong, 1996). As CNT summary scores are independent of a rater for calculation, using the
single measure ICC value appears to be more appropriate than electing to use average
measure values which generally provide higher reliability values. Another methodological consideration which may have influenced discrepancies
between reported reliability values is the iteration or version of the CNT employed. During the
past 15 years, CNTs have evolved from primarily desktop-based (Schatz, 2009, Broglio et al., 2007a, Resch, 2013b, Gualtieri and Johnson, 2006, Cernich et al., 2007) to web-based platforms (Elbin et al., 2011b, Nakayama et al., 2014, Littleton et al., 2015), in addition to the refinement of instructions and composite score calculation(s) (Roebuck-Spencer et al., 2007, Iverson et al., 2003). The current study employed the most recent versions of ANAM, CNS Vital Signs, and ImPACT at the time of data collection. Refinements of each CNT compared to earlier iterations may have resulted in higher stability reliability coefficients than previously reported for some of the observed outcome values. An example of algorithm modifications to tabulate outcome scores is CNS Vital Signs. In our study, the standardized composite scores calculated by CNS Vital Signs were comparable to the raw scores provided by Concussion Vital Signs. Our
results suggest that minimal differences exist between standardized and raw values in terms of test-retest reliability and hence the two test platforms are parallel to each other.
As previously mentioned, all CNTs were observed to have reliability coefficients which ranged from strong to weak with similarities illustrated in Table 2. Our results suggest that regardless of the CNT used, clinicians must be cognizant of the limitations of this form of
neurocognitive testing and factors which may further reduce their reliability and validity to assess SC. Lower reliability values inherently lead to larger confidence intervals. Larger
confidence intervals allow a student-athlete’s variable performance to be considered “within
normal limits” when re-administered a CNT. For example, a student-athlete is administered a CNT and achieves a Verbal Memory score of “80” on their pre-injury test. The CNT’s Verbal
Memory score, based on normative data, has a known standard deviation of “6” and an ICC = 0.40. Calculating the standard error of measurement (SEM) with a 95% confidence interval
reveals that the next time the same student is administered the same CNT and is healthy, they
may achieve a Verbal Memory score between 70.9 and 89.1. This hypothetical example is important as SEM serves as the denominator when calculating the reliable change index, a
commonly used metric to determine clinical change (Jacobson and Truax, 1991). Clinically, if a
CNT outcome score has weak to moderate reliability (0 to 0.60), a concussed student-athlete’s performance might need to exceed a set threshold in order to be considered clinically
meaningful. A large confidence interval due to low reliability values may give rise to false-positive or false-negative errors, which may be detrimental to a student-athlete’s care.
One factor which may decrease the variability associated with CNTs is the environment in which the test is administered. For the current study, participants were administered their assigned CNT in groups of four or fewer. Our methodology is in agreement with previous recommendations to test groups of fewer than 10 when administering CNTs (Moser et al., 2011, Echemendia et al., 2013). Additionally, participants were administered each CNT in a quiet, controlled environment with minimal distractions with up-to-date computers and software
(e.g. Adobe Flash and Java). A study by Schatz and colleagues investigated participant feedback related to environmental distractions, computer difficulty, and test instructions following the completion of ImPACT. The authors reported that environmental distractors and difficulty with instructions led to increased symptom reporting. In turn, increased symptom reporting was found to significantly influence performance on one of the four neurocognitive domains of ImPACT (Schatz et al.,
2010).
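To make the statistical points in the preceding paragraphs concrete, the following Python sketch is offered as an illustration only; it is not the analysis code used in the current study. It assumes common textbook formulas: one-way random-effects ICCs computed from between- and within-participant mean squares (single and average measure), SEM = SD × sqrt(1 − ICC), a 95% confidence interval of ± 1.96 SEM, and the Jacobson and Truax reliable change index, whose denominator is the standard error of the difference, sqrt(2) × SEM. The function names and the small score table are hypothetical.

```python
import math


def icc_one_way(scores):
    """One-way random-effects ICCs from rows of repeated scores.

    `scores` holds one row per participant and one column per time point.
    Returns (single_measure_icc, average_measure_icc).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of time points per participant
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    ms_between = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for row, m in zip(scores, row_means) for x in row
    ) / (n * (k - 1))
    single = (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
    average = (ms_between - ms_within) / ms_between
    return single, average


def sem(sd, icc):
    """Standard error of measurement: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - icc)


def confidence_interval(score, sd, icc, z=1.96):
    """Interval around an observed score (z = 1.96 for 95%)."""
    half_width = z * sem(sd, icc)
    return score - half_width, score + half_width


def reliable_change_index(baseline, retest, sd, icc):
    """Jacobson-Truax RCI: score difference over sqrt(2) * SEM."""
    return (retest - baseline) / (math.sqrt(2.0) * sem(sd, icc))


# Hypothetical two-session scores for four participants (made-up data).
example_scores = [[80, 84], [90, 88], [70, 76], [85, 85]]
single_icc, average_icc = icc_one_way(example_scores)
print(single_icc < average_icc)  # average-measure ICC is the larger value

# Hypothetical Verbal Memory example: score 80, SD 6, ICC 0.40.
low, high = confidence_interval(80, 6, 0.40)
print(round(low, 1), round(high, 1))  # 70.9 89.1
```

With the hypothetical values from the Verbal Memory example (score of 80, SD of 6, ICC of 0.40), the sketch gives an SEM of about 4.65 and an interval of roughly 70.9 to 89.1. Under the same assumptions, a retest score of 70 would yield a reliable change index of about -1.52, which does not exceed the conventional ±1.96 threshold.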
Despite our attempts to control for group size and sources of systematic and random error, suboptimal reliability was still observed for all three computerized neurocognitive tests.
Though a variety of settings are conducive to small (x < 10) group CNT administration (e.g. clinics, private high schools, some colleges), some venues find this recommendation logistically
impossible, hence limiting the ability to provide the optimal CNT environment. That said, when incorporating CNTs into a concussion management protocol, clinicians must strive to facilitate test
administration to the fewest number of athletes possible (i.e. < 5 (Echemendia et al., 2013)) at
one time, reduce distraction in the environment, consider supplemental methods of test instruction (e.g. verbal explanation) to reduce the likelihood of difficulties associated with
understanding a given task, remain vigilant about ensuring all hardware and software are functional and up-to-date, and carefully monitor all subjects during testing.
Intrinsic factors may have also influenced the values observed during the current and
related studies. Factors such as ADHD and/or other learning disabilities, limited motivation, and providing suboptimal effort at one or more time points may result in variable and questionable performances. Nelson et al. assessed high school and college student-athletes with ANAM, Axon Sports, and ImPACT and examined factors associated with invalid baseline assessments. The authors reported that a history of ADHD and/or learning disabilities, a lower GPA, and lower performance on the Wechsler Test of Adult Reading, a brief measure of estimated intelligence, contributed to invalid CNTs as determined by manufacturers’ guidelines (Nelson et al., 2015). Factors such as ADHD and learning disabilities have been well documented to result in overall
decreased CNT performance (Collins et al., 1999, Echemendia et al., 2013). Intrinsic motivation is another factor which may result in varying CNT performance. Schatz and Glatts investigated purposeful suboptimal performance or sandbagging while taking the ImPACT. The authors administered ImPACT to two groups of student-volunteers who were instructed to either provide their best effort while completing the test or to perform poorly, but not to perform below the test’s
validity measures which were made known to the subjects. The authors reported that 30% of
participants were able to successfully provide a suboptimal effort while completing the test and without being identified by the built-in validity indicators as invalid (Schatz and Glatts, 2013).
Though inconsistent effort at each time point may have contributed to the observed suboptimal reliability coefficients, all participants in our analyses scored well above the criterion value for
the WMT, suggesting generally adequate effort. Clinicians must be aware of factors predisposing student-athletes to poor or potentially invalid CNT performance and consider
special arrangements to foster optimal test performance.
The current study is not without limitations. First, we used college age students who were not athletes, which may limit generalizability. Additionally, given the age range of our
sample, our findings may not translate to athletes below 18 years of age or to older individuals.
Also, a majority of our sample (76.5%) was female, which may have influenced our findings, although each group had a proportionate number of male and female participants. As previously
mentioned, the use of multiple measures within the same test period may have influenced our results. Last, although less likely in a healthy population, the use of Green’s WMT as a measure of effort in addition to each CNT may have influenced performance on verbal memory tasks.
1.5 Conclusions
The current study was the first to assess the test-retest reliability of three commonly used computerized neurocognitive tests in healthy college-aged subjects. Our results suggest that regardless of the CNT used in a concussion management protocol, clinicians must be aware of each test’s measurement properties as well as factors which may further influence
variable test performance. Controlling for such factors, to the best of the clinicians’ ability, will ultimately increase the clinical utility of the CNT in order to assess SC. That said, a stand-alone measure of SC does not currently exist and CNTs should not be used as such. Our results should assist clinicians in making evidence-based decisions regarding the use of CNTs in an evidence-based concussion management protocol.
1.6 Acknowledgements
The authors would like to acknowledge Vista Life Sciences and CNS Vital Signs for access to the computerized neurocognitive test to complete the current study.
References
2013. Concussion Vital Signs User Guide.
ALSALAHEEN, B., STOCKDALE, K., PECHUMER, D. & BROGLIO, S. P. 2015. Measurement Error in the Immediate Postconcussion Assessment and Cognitive Testing (ImPACT): Systematic Review. J Head Trauma Rehabil.
ANASTASI, A. 1998. Psychological Testing, New York, NY, Macmillan.
BAUMGARTNER, T. A., JACKSON, A.S., MAHAR, M.T., ROWE, D.A. 2007. Measurement for Evaluation in Physical Education and Exercise Science, New York, New York, McGraw Hill.
BROGLIO, S. P., FERRARA, M. S., MACCIOCCHI, S. N., BAUMGARTNER, T. A. & ELLIOTT, R. 2007a. Test-retest reliability of computerized concussion assessment programs. J Athl Train, 42, 509-14.
BROGLIO, S. P., MACCIOCCHI, S. N. & FERRARA, M. S. 2007b. Sensitivity of the concussion assessment battery. Neurosurgery, 60, 1050-7; discussion 1057-8.
CERNICH, A., REEVES, D., SUN, W. & BLEIBERG, J. 2007. Automated Neuropsychological Assessment Metrics sports medicine battery. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists, 22 Suppl 1, S101-14.
CNS VITAL SIGNS, L. 2014. FAQ's - Frequently Asked Questions [Online]. CNS Vital Signs, LLC. [Accessed 2014].
COLE, W. R., ARRIEUX, J. P., SCHWAB, K., IVINS, B. J., QASHU, F. M. & LEWIS, S. C. 2013. Test-retest reliability of four computerized neurocognitive assessment tools in an active duty military population. Arch Clin Neuropsychol, 28, 732-42.
COLLIE, A. & MARUFF, P. 2003. Computerised neuropsychological testing. Br J Sports Med, 37, 2-3.
COLLINS, M. W., GRINDEL, S. H., LOVELL, M. R., DEDE, D. E., MOSER, D. J., PHALIN, B. R., NOGLE, S., WASIK, M., CORDRY, D., DAUGHERTY, K. M., SEARS, S. F., NICOLETTE, G., INDELICATO, P. & MCKEAG, D. B. 1999. Relationship between concussion and neuropsychological performance in college football players. JAMA, 282, 964-70.
ECHEMENDIA, R. J., IVERSON, G. L., MCCREA, M., MACCIOCCHI, S. N., GIOIA, G. A., PUTUKIAN, M. & COMPER, P. 2013. Advances in neuropsychological assessment of sport-related concussion. Br J Sports Med, 47, 294-8.
ELBIN, R., SCHATZ, P. & COVASSIN, T. 2011a. One-Year Test-Retest Reliability of the Online Version of ImPACT in High School Athletes. The American journal of sports medicine.
ELBIN, R. J., SCHATZ, P. & COVASSIN, T. 2011b. One-year test-retest reliability of the online version of ImPACT in high school athletes. The American journal of sports medicine, 39, 2319-24.
FERRARA, M. S., MCCREA, M., PETERSON, C. L. & GUSKIEWICZ, K. M. 2001. A Survey of Practice Patterns in Concussion Assessment and Management. J Athl Train, 36, 145-149.
GREEN, P. 2005. Green's Word Memory Test for Windows: User's Manual and Program, Edmonton, Canada, Green's Publishing Inc.
GUALTIERI, C. T. & JOHNSON, L. G. 2006. Reliability and validity of a computerized neurocognitive test battery, CNS Vital Signs. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists, 21, 623-43.
IVERSON, G. L., LOVELL, M. R. & COLLINS, M. W. 2003. Interpreting change on ImPACT following sport concussion. Clin Neuropsychol, 17, 460-7.
JACOBSON, N. S. & TRUAX, P. 1991. Clinical significance: a statistical approach to defining meaningful change in psychotherapy research. J Consult Clin Psychol, 59, 12-9.
LITTLETON, A. C., REGISTER-MIHALIK, J. K. & GUSKIEWICZ, K. M. 2015. Test-Retest Reliability of a Computerized Concussion Test: CNS Vital Signs. Sports Health, 7, 443-7.
LOVELL, M. 2011. Immediate Post-Concussion Assessment Testing (ImPACT) Test Clinical Interpretive Manual: Online ImPACT 2007-2012.
LYNALL, R. C., LAUDNER, K. G., MIHALIK, J. P. & STANEK, J. M. 2013. Concussion-assessment and management techniques used by athletic trainers. J Athl Train, 48, 844-50.
MAXWELL, S. E., DELANEY, H.D. 2003. Designing Experiments and Analyzing Data, Mahwah, New Jersey, Lawrence Erlbaum Associates.
MCGRAW, K. O. & WONG, S. P. 1996. Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1, 30-46.
MEEHAN, W. P., 3RD, D'HEMECOURT, P., COLLINS, C. L., TAYLOR, A. M. & COMSTOCK, R. D. 2012. Computerized neurocognitive testing for the management of sport-related concussions. Pediatrics, 129, 38-44.
MOSER, R. S., SCHATZ, P., NEIDZWSKI, K. & OTT, S. D. 2011. Group versus individual administration affects baseline neurocognitive test performance. The American journal of sports medicine, 39, 2325-30.
NAKAYAMA, Y., COVASSIN, T., SCHATZ, P., NOGLE, S. & KOVAN, J. 2014. Examination of the Test-Retest Reliability of a Computerized Neurocognitive Test Battery. Am J Sports Med, 42, 2000-2005.
NELSON, L., LA ROCHE, A., PFALLER, A., LERNER, E., HAMMEKE, T., RANDOLPH, C., BARR, W., GUSKIEWICZ, K. & MCCREA, M. 2016. Prospective, Head-to-Head Study of Three Computerized Neurocognitive Tools (CNTs): Reliability and Validity for the Assessment of Sport Concussion. Journal of the International Neuropsychological Society, 22, 24-37.
NELSON, L., PFALLER, A., REIN, L. & MCCREA, M. 2015. Rates and predictors of invalid baseline test performance in high school and collegiate athletes for 3 computerized neurocognitive tests: ANAM, Axon Sports, and ImPACT. American Journal of Sports Medicine.
RAHMAN-FILIPIAK, A. A. & WOODARD, J. L. 2013. Administration and environment considerations in computer-based sports-concussion assessment. Neuropsychol Rev, 23, 314-34.
RANDOLPH, C., MCCREA, M. & BARR, W. B. 2005. Is neuropsychological testing useful in the management of sport-related concussion? J Athl Train, 40, 139-52.
RESCH, J., DRISCOLL, A., MCCAFFREY, N., BROWN, C.N., MACCIOCCHI, S., BAUMGARTNER, T.A., WALPERT, K., FERRARA, M.S. 2013a. ImPACT Test Retest Reliability: Reliably Unreliable? J Athl Train, 48, 506-511.
RESCH, J., DRISCOLL, A., MCCAFFREY, N., BROWN, C.N., MACCIOCCHI, S., BAUMGARTNER, T.A., WALPERT, K., FERRARA, M.S. 2013b. ImPACT Test Retest Reliability: Reliably Unreliable? J Athl Train, 48, 506-511.
RESCH, J., DRISCOLL, A., MCCAFFREY, N., BROWN, C.N., MACCIOCCHI, S., BAUMGARTNER, T.A., WALPERT, K., FERRARA, M.S. In press. ImPACT Test Retest Reliability: Reliably Unreliable? J Athl Train, In Press.
RESCH, J. E., MCCREA, M. A. & CULLUM, C. M. 2013. Computerized neurocognitive testing in the management of sport-related concussion: an update. Neuropsychol Rev, 23, 335-49.
ROEBUCK-SPENCER, T., SUN, W., CERNICH, A. N., FARMER, K. & BLEIBERG, J. 2007. Assessing change with the Automated Neuropsychological Assessment Metrics (ANAM): issues and challenges. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists, 22 Suppl 1, S79-87.
ROGERS, C. 2015. RE: Concussion and CNS Vital Signs. Type to RESCH, J. E.
SCHATZ, P. 2009. Long-Term Test-Retest Reliability of Baseline Cognitive Assessments Using ImPACT. Am J Sports Med.
SCHATZ, P. & FERRIS, C. S. 2013. One-Month Test-Retest Reliability of the ImPACT Test Battery. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists.
SCHATZ, P. & GLATTS, C. 2013. "Sandbagging" Baseline Test Performance on ImPACT, Without Detection, Is More Difficult than It Appears. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists, 28, 236-44.
SCHATZ, P., NEIDZWSKI, K., MOSER, R. S. & KARPF, R. 2010. Relationship between subjective test feedback provided by high-school athletes during computer-based assessment of baseline cognitive functioning and self-reported symptoms. Archives of clinical neuropsychology : the official journal of the National Academy of Neuropsychologists, 25, 285-92.
SCHATZ, P., PARDINI, J. E., LOVELL, M. R., COLLINS, M. W. & PODELL, K. 2006. Sensitivity and specificity of the ImPACT Test Battery for concussion in athletes. Arch Clin Neuropsychol, 21, 91-9.