Archives of Physical Medicine and Rehabilitation journal homepage: www.archives-pmr.org Archives of Physical Medicine and Rehabilitation 2013;94:589-96
SPECIAL COMMUNICATION
An Introduction to Applying Individual Growth Curve Models to Evaluate Change in Rehabilitation: A National Institute on Disability and Rehabilitation Research Traumatic Brain Injury Model Systems Report
Allan J. Kozlowski, PhD,a,b Christopher R. Pretz, PhD,c,d Kristen Dams-O’Connor, PhD,e Scott Kreider, MS,c,d Gale Whiteneck, PhDc,d
From the aCenter for Rehabilitation Outcomes Research, Rehabilitation Institute of Chicago, and bCenter for Healthcare Studies, Feinberg Medical School, Northwestern University, Chicago, IL; cCraig Hospital and the dTraumatic Brain Injury National Statistical and Data Center, Englewood, CO; and eMount Sinai School of Medicine, New York, NY.
Abstract
The abundance of time-dependent information contained in the Spinal Cord Injury and the Traumatic Brain Injury Model Systems National Databases, and the increased prevalence of repeated-measures designs in clinical trials, highlight the need for more powerful longitudinal analytic methodologies in rehabilitation research. This article describes the particularly versatile analytic technique of individual growth curve (IGC) analysis. A defining characteristic of IGC analysis is that change in an outcome such as functional recovery can be described at both the patient and group levels, such that it is possible to contrast 1 patient with other patients, subgroups of patients, or a group as a whole. Other appealing characteristics of IGC analysis include its flexibility in describing how outcomes progress over time (whether in linear, curvilinear, cyclical, or other fashion), its ability to accommodate covariates at multiple levels of analysis to better describe change, and its ability to accommodate cases with partially missing outcome data. These features make IGC analysis an ideal tool for investigating longitudinal outcome data and better equip researchers and clinicians to explore a multitude of hypotheses. The goal of this special communication is to familiarize the rehabilitation community with IGC analysis and encourage the use of this sophisticated research tool to better understand temporal change in outcomes.
© 2013 by the American Congress of Rehabilitation Medicine
Supported by the National Institute on Disability and Rehabilitation Research through the Rehabilitation Research and Training Center on Improving Measurement of Medical Rehabilitation Outcomes (grant no. H133B090024), and the Traumatic Brain Injury Model Systems National Data and Statistical Center (grant no. H133A110006). No commercial party having a direct financial interest in the results of the research supporting this article has or will confer a benefit on the authors or on any organization with which the authors are associated.
0003-9993/13/$36 - see front matter © 2013 by the American Congress of Rehabilitation Medicine http://dx.doi.org/10.1016/j.apmr.2012.08.199
Evaluating change in patient functioning is a primary concern in rehabilitation clinical practice and research. Before the development of more sophisticated approaches, analysis of change relied on linear regression methods that do not account for relationships that exist across time or hierarchical levels. Investigation of longitudinal outcomes has been conducted with linked cross-sectional analyses or pre-post treatment models in which only the baseline status and a single endpoint are considered. However, cross-sectional analyses fail to model time explicitly; that is, time is not directly related to outcome, but instead, data collected for individuals are defined by “snapshots” of group means at milestones, such as admission, discharge, or 1-month follow-up. Consequently, cross-sectional analyses cannot model outcomes or covariates as they relate to time. Additionally, in studies where individuals are assessed at several time points, cross-sectional analyses ignore correlations of the individuals’ repeated measures.1 If correlated measures are treated as if they were independent, the variance of the parameter estimates, and consequently the inferences based on these estimates, will be inaccurate and potentially misleading.2 Likewise, in pre-post treatment designs, information on the nature of change is lost when interim measures are excluded from the analysis and change from baseline to endpoint is collapsed into a single indicator such as a difference score.1,3 Repeated-measures analysis of variance (ANOVA) provides a marked improvement in comparing groups over time by accounting for the correlations between measures taken on the same individual. However, the focus of such a design is limited to comparison of group means, and it cannot accommodate analysis at the individual level.4-6
Accurate documentation and modeling of change is important to researchers because it directly affects the veracity of the findings and is of critical importance to clinicians who hope to extract knowledge from the literature. Fortunately, rehabilitation researchers have access to a longitudinal analytic methodology that can incorporate time, address correlations between data points resulting from repeated measures taken from the same individual, retain information about individual as well as group change, and accommodate missing data without excluding individuals.
Although a number of modern longitudinal analytic techniques exist, and were applied in rehabilitation more than a decade ago,7 researchers in rehabilitation have only recently begun to adopt these techniques.8-14 One of these powerful yet underutilized methodologies is individual growth curve (IGC) analysis. IGC analysis is known by many names; some of the more common ones are latent growth curve analysis, hierarchical linear modeling, mixed-effect modeling, random effects modeling, and multilevel modeling. These labels represent variations of multilevel modeling and mean essentially the same thing. Even though IGC analysis has been available for more than 30 years, implementation in rehabilitation research is limited. IGC analysis has not been routinely applied for a variety of reasons, including the absence of longitudinal data, necessary statistical software, computational power, and the knowledge and skills necessary to implement the analysis itself.
However, with the maturation of longitudinal datasets in rehabilitation such as the Spinal Cord Injury (SCI) and the Traumatic Brain Injury (TBI) Model Systems National Databases, accessibility to fast and affordable computers, enhancements in software packages such as SAS,15 SPSS,16 Mplus,17 Stata,18 HLM-7,19 and the open-source ‘R’ Project,20 and the growing prevalence of advanced training in longitudinal data analyses, rehabilitation researchers are now well positioned to include IGC analysis as a key component of their methodologic repertoire.
IGC analysis: the basic idea
IGC analysis is an extension of traditional regression models; it is unique, however, in that it allows the researcher to simultaneously model outcomes at the individual and group levels. In terms of modeling at the individual level, measurements of a particular outcome can be directly related to time by use of a variety of mathematical functions that range from the simple to the complex. This feature is crucial because even though outcome measures often change over time in a consistent fashion, they may exhibit curvature (quadratic change), rising and falling patterns (cubic change), floor or ceiling effects (nonlinear change), or a number of other patterns. Although IGC analyses are suited to observational databases like the SCI and TBI Model Systems National Databases, they are also useful for longitudinal randomized controlled trials. In the following paragraphs we expound on the subtleties and describe the advantages of using IGC analysis as an analytic tool, and provide a detailed example of IGC analysis to reinforce the ideas discussed.
In the simplest case, suppose each individual’s response is best defined by a straight line. Consider the response patterns provided in figure 1, which are based on a dataset created for didactic purposes. Using a straight line to describe the response pattern of every individual may not be prudent because some individuals may be fit better by a different function, such as a curve. However, since a straight line describes the response patterns of most individuals fairly well, a line is used to describe all response patterns. As a result, each individual is represented by a linear trajectory, or the line of best fit that passes through the data points of the individual. The use of linear trajectories to describe change over time is shown in figure 2, where each individual’s trajectory, based on their unique response patterns across each time point presented in figure 1, is represented by a dashed straight line defined by 2 parameters: an intercept (starting level) and a slope (rate of change).
Since time is an integral part of an IGC model, the intercept (initial status) and rate of change (slope) parameters should be defined to facilitate interpretation of the models. For example, the intercept might be set to the date of injury or of admission to rehabilitation. All subsequent time points would be defined in days, weeks, months, years, or other meaningful time unit from the intercept. Change is reported as an increase or decrease in the outcome for a 1-unit increase in the selected time unit. Thus, change per day or week might not be optimal for a study that covers years of data.
Fig 1 Response patterns for a sample of individuals.
Fig 2 IGC trajectories for a sample of individuals. [Plot not reproduced: Outcome (0 to 40) against Year (0 to 5); legend: Group and individuals 101, 302, 311, 348, 351, 353, 360, 501, 603, 610.]
List of abbreviations:
AIC Akaike information criterion
ANOVA analysis of variance
COWA Controlled Oral Word Association
IGC individual growth curve
SCI spinal cord injury
TBI traumatic brain injury
In figure 2, each trajectory is distinct and consequently retains the intercept and slope growth parameters for each individual represented in the data. In IGC terminology, the intercept is often defined as a person’s initial status, while the slope is his/her rate of change. The trajectory that is defined by an intercept and slope describes how the outcome for the individual progresses over time. The utility of these individual growth parameters arises from the ability to make direct comparisons between persons, subgroups, and the group average. This utility accounts for the
strength of IGC analysis in modeling rehabilitation outcomes for research and clinical applications. In figure 2, individuals can be compared visually in terms of their initial status and rate of change estimates, but can also be compared statistically as described below.
The outcome trajectories for the group are described by averaging the values for the individual growth parameters to create the growth parameters for the group, also called fixed effects. That is, the group initial status is estimated by averaging the initial status across individuals, while the group rate of change is determined by calculating the mean individual rate of change, as seen in the bold red line in figure 2. It appears that the group starts with an average value of approximately 23.5, from which point the group average value steadily decreases over time at a rate of −2.4 outcome units per year. Not only does the group trajectory relate change in outcome to time, it also serves as a basis for subsequent comparisons. That is, an individual’s initial status and rate of change can be directly compared with those of the group or subgroups. Likewise, subgroups can be compared with other subgroups, such as women with men or strata of condition severity (mild, moderate, severe disability).
Another important component of IGC analysis is investigation of the variance of the individual growth parameters (called random effects) about the group means (fixed effects). Figure 2 demonstrates that individuals vary both in initial status and rate, that is, in terms of where they start and how they change over time on the outcome measure. The presence of significant variance in the individual growth parameters implies that related factors represented by 1 or more covariates might explain this variability. For example, as a covariate such as age at injury increases, rates of change may decrease.
Thus, if the outcome measure is level of depression, where a higher score indicates a greater degree of depression, such a relationship would indicate that older individuals tend to display less rapid decreases in depression compared with those who are younger. Similar statements could also be made about the relationships between a covariate and the initial status.
Many of the aforementioned characteristics of IGC analysis make it highly suited for rehabilitation research applications. This is particularly so where a patient’s impairments, activity limitations, participation restrictions, quality-of-life ratings, or other outcome measures are tracked over time in response to treatment or as resulting from natural recovery. Perhaps greater value lies in how clinical practice can be informed by the detail inherent in IGC models. IGC models provide group change estimates based on individual change, and retain the individual differences that contribute to that change. Thus, evidence from research in the form of comparative norms and prognostic models could be applied clinically to plan interventions and evaluate patient outcomes at both the group and individual levels. Consequently, IGC analysis is a useful technique for evaluating how rehabilitation patients progress while identifying the patient, condition, and treatment factors that are associated with change, and permitting comparisons among individuals, subgroups, and the group as a whole.
To augment the discussion above, we include an example to facilitate a deeper understanding of the basic features of IGC analysis. We encourage rehabilitation clinicians and researchers to become familiar with the equations that define the example model because they are generally reported in IGC research publications and other IGC-related literature. Additionally, understanding how the equations relate to the concepts of simultaneously modeling the individual and group levels will facilitate understanding the more complex manifestations of IGC analysis. However, every attempt was made to construct an example that conveys the principles of IGC analysis without equations so that those with a limited background in mathematics will be able to attain a conceptual understanding. Note that this example is for didactic purposes only and is not intended to be interpreted as an actual study.
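The idea of individual linear trajectories and a group trajectory formed from them, as described in this section, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the person labels and scores are invented, the helper name `fit_line` is hypothetical, and each line is fitted separately by ordinary least squares and then averaged, whereas IGC software estimates the individual and group levels simultaneously.

```python
# Hypothetical didactic data: (year, outcome score) pairs for three individuals.
# The labels and values are invented for illustration; they are not from the article.
scores = {
    "A": [(0, 30.0), (1, 27.0), (2, 25.0), (3, 21.0)],
    "B": [(0, 20.0), (1, 18.5), (2, 15.0), (3, 13.5)],
    "C": [(0, 24.0), (1, 22.0), (3, 17.0)],  # one assessment is missing
}


def fit_line(points):
    """Return the least-squares (intercept, slope) for one individual's points."""
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return intercept, slope


# Individual growth parameters: initial status (intercept) and rate (slope).
individual = {pid: fit_line(pts) for pid, pts in scores.items()}

# Group trajectory: the mean initial status and the mean rate across individuals.
group_intercept = sum(b0 for b0, _ in individual.values()) / len(individual)
group_slope = sum(b1 for _, b1 in individual.values()) / len(individual)

for pid, (b0, b1) in individual.items():
    print(f"{pid}: initial status {b0:.1f}, rate {b1:.2f} per year")
print(f"Group: initial status {group_intercept:.1f}, rate {group_slope:.2f} per year")
```

Each individual keeps an intercept and a slope, so any person can be set against another person or against the group line, which is the kind of comparison the text describes.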
Example
TBI can result in various cognitive deficits. Reduced verbal fluency, for instance, can be measured with the Controlled Oral Word Association (COWA) Test,21,22 as this test is a sensitive indicator of brain dysfunction.23 Patients name in 1 minute as many words as possible that begin with a given letter of the alphabet. The total score of all acceptable words (ie, no proper nouns or repetitions) produced across 3 trials is adjusted using nationally representative normative data for age, sex, and education to create the adjusted COWA score, which is our outcome variable (labeled as COWAadj in the formulas).
In this example, the word association test was administered during rehabilitation and each consecutive postinjury year for 3 years in a substantial number of patients enrolled in the TBI Model Systems National Database. Participants whose data are represented in the TBI Model Systems National Database consented to the collection of their data for research purposes at the facility in which they received rehabilitation. The current analyses were approved by the institutional review board of the TBI Model Systems National Data and Statistical Center. All analyses were conducted at the TBI Model Systems National Data and Statistical Center using a deidentified version of the database and SAS version 9.3a statistical software.
Investigating response patterns and trajectories
It is common in IGC analysis to examine data graphically as an initial step. Figure 3 depicts individual response patterns (dashed lines) for a selection of participants. Each individual response pattern is scrutinized to identify the mathematical function (ie, linear, quadratic, cubic, etc) that most appropriately describes the relationship between the outcome and time, and the type of change that conforms to most response patterns is chosen. Although response patterns may vary, common practice in IGC analysis is to select the mathematical expression that best fits most of the data. We placed the intercept (year 0) at the time of the first COWA assessment, which is at approximately 1 year postinjury, so the response patterns represent postrehabilitation time frames. Although some patterns curve upward and others downward, a straight line adequately describes the change for most individuals. Since most response patterns appear to follow a linear trend, the following equation (function) was selected to describe the adjusted COWA score:

$$\mathrm{COWAadj}_{ti} = \pi_{0i} + \pi_{1i}(\mathrm{Year}_{ti}) + \varepsilon_{ti}$$

This equation has components that are similar to those used in ordinary least-squares regression, except that this equation focuses exclusively on the individual, whereas equations in ordinary least-squares regression define the association between predictor and outcome variables for a group. For each individual (denoted by a subscript $i$), the equation linearly relates outcome ($\mathrm{COWAadj}_{ti}$) to time ($\mathrm{Year}_{ti}$), where the time point at which the data for the outcome were collected is denoted by $t$. Consequently, the above formula is often called an individual or level-1 equation. IGC parameters are interpreted similarly to ordinary least-squares regression coefficients, except $\pi_{0i}$ represents the true initial status of an individual (the intercept), and $\pi_{1i}$ represents the individual’s true rate of change (the slope). In practice, these parameters are estimated. The final term in the above equation, $\varepsilon_{ti}$, is the error term, and for a given time point, is the portion of the individual’s outcome that remains unaccounted for. Correct specification of this error term can be a complicated process, and discussion of this process is beyond the introductory nature of this article. However, many longitudinal analysis texts cover this topic extensively.24
Fig 3 Response patterns of adjusted COWA test scores over time for a sample of individuals from the Traumatic Brain Injury Model Systems database.
Fig 4 IGC trajectories of adjusted COWA test scores over time for a sample of individuals from the Traumatic Brain Injury Model Systems database.
The purpose of the equation is to provide a general representation of each response pattern and, in doing so, create a common basis of comparison for individuals and set the stage for group analysis. In other words, the individuals’ response patterns are modeled mathematically as trajectories that are defined by each individual’s initial status and rate of change. Thus, the initial status and rate of 1 individual can be compared with another, with a subgroup, or with the group as a whole. Trajectories that correspond to the response patterns in figure 3 are provided in figure 4. The trajectories displayed in figure 4 indicate variability in both initial status and rate of change. An essential component of IGC analysis is to understand the extent to which initial statuses and rates differ from the group averages.
Random effects
Determining the extent to which initial statuses and rates vary is important because significant variability implies the existence of a meaningful spread in initial statuses, rates of change, or both. Note that the variations in initial statuses, rates, or both could be explained by including patient, injury, or intervention characteristics in the model as covariates. Conversely, if initial statuses and rates do not significantly vary, trajectories appear almost identical across individuals, and the group average provides a reasonable description of everyone.
The variability in initial statuses and in rates is defined by the parameters $u_{00}$ and $u_{11}$, respectively, and the degree to which initial statuses and rates are related (their covariance) is denoted by the parameter $u_{01}$. The initial status and rate estimates are often related because both measures are derived from the same individual. Estimates of $u_{00}$, $u_{11}$, and $u_{01}$ are provided in table 1. In IGC terminology, $u_{00}$, $u_{11}$, and $u_{01}$ capture what are called random effects, and define the individual variation from the group average.
The variability in initial statuses (62.1) and the variability in rates (3.7) are both statistically significant (P<.05; see table 1). Therefore, we conclude that individuals differ in word fluency at initial status and in the rates of word fluency change over time. This variability in initial statuses and rates may be explained in part by 1 or more covariates. The covariance between initial statuses and rates is not significant in this case. If the covariance were positive and significant, this would imply that those with higher initial adjusted COWA scores would, in general, exhibit faster rates of improvement in word fluency.
Table 1 Estimates of variability in initial status and rate, and the covariance of initial status and rate

| Label | Parameter | Estimate | SE | z Score | P |
| --- | --- | --- | --- | --- | --- |
| Variability in initial status | $u_{00}$ | 62.1 | 6.4 | 9.7 | <.0001 |
| Variability in rate | $u_{11}$ | 3.7 | 1.7 | 2.2 | .02 |
| Covariance of initial status and rate | $u_{01}$ | 3.1 | 2.5 | 1.3 | .21 |
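Because a covariance is hard to read on its own scale, it can help to convert it to a correlation between initial status and rate. The sketch below does this with the rounded estimates from table 1; the function name is hypothetical, and the resulting correlation is not a value reported in the article.

```python
import math


def intercept_slope_correlation(var_initial_status, var_rate, covariance):
    """Convert the covariance of initial status and rate into a correlation."""
    return covariance / math.sqrt(var_initial_status * var_rate)


# Rounded estimates taken from table 1 (u00, u11, u01).
r = intercept_slope_correlation(62.1, 3.7, 3.1)
print(f"Correlation between initial status and rate: {r:.2f}")
```

A small positive value here is in line with the text's statement that the covariance is not statistically significant in this example.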
Fixed effects
Also included in figure 4 is the average or group trajectory indicated by the red line. It is common in IGC analysis to use $\beta_{00}$ to represent the average initial status and $\beta_{10}$ to represent the average rate. The individual’s initial status is equal to the average initial status plus a random error term ($a_{0i}$) such that $\pi_{0i} = \beta_{00} + a_{0i}$. Consequently, each individual’s initial status can be defined as varying randomly about the average initial status of the group. Likewise, the individual’s rate of change is equal to the average rate plus a random error term ($a_{1i}$) such that $\pi_{1i} = \beta_{10} + a_{1i}$, meaning that each individual’s rate varies randomly around the group average. Together, equations $\pi_{0i} = \beta_{00} + a_{0i}$ and $\pi_{1i} = \beta_{10} + a_{1i}$ are known as the “group” or level-2 equations because they contain information at the group level. Additionally, both the average initial status and rate are known as fixed effects because they do not vary, unlike the initial statuses and rates of individuals.
Estimates of the fixed effects for our example are given in table 2. Based on their respective P values (see table 2), the average initial status and the average rate both differ significantly from zero (P<.05). Specifically, the average adjusted COWA initial status is 25.4, while the average rate suggests a gain of 4.8 adjusted COWA points per year.
Upon obtaining values for the fixed effects, these values can be compared, if desired, with those of an individual or subgroup. Figure 5 depicts a comparison between trajectories of 2 individuals indicated by the dashed lines and the group indicated by the bold red line. The values of each individual’s initial status and rate are given in table 3 along with the corresponding P values that compare the individual’s initial status and rate with those of the group. The first individual’s initial status is similar to the group average with an adjusted COWA value of 25.9. However, the rate at which the person improved (9.8 adjusted COWA points per year) is significantly different from that of the group. The second individual differs from the group on initial status (13.1), though this person’s rate does not differ from the rate of the group (P = .77). Although not demonstrated here, the principles discussed above can easily be used to compare 1 individual with another or 1 subgroup with another.
The unconditional model describes the best-fit average trajectory for the available data, and the addition of 1 or more covariates to explain variance of the growth parameters will produce models that are conditional on the specific associations between the covariate(s) and the growth parameters that are included.
Table 2 Estimates of group intercept and rate

| Parameter | Estimate | SE | t | P |
| --- | --- | --- | --- | --- |
| Group initial status | 25.4 | 0.4 | 57.2 | <.0001 |
| Group rate | 4.8 | 0.2 | 22.6 | <.0001 |

Fig 5 Comparison of individual and group average fitted trajectories.

Table 3 Individual initial statuses and rates

| Individual | Initial Status | P | Rate | P |
| --- | --- | --- | --- | --- |
| 1 | 25.9 | .89 | 9.8 | <.0001 |
| 2 | 13.1 | <.0001 | 4.6 | .77 |

Covariate introduction
If the random effects are statistically significant, variability in initial statuses, rates, or both may be explained by introducing 1 or more covariates into the group level equations. Covariates can be continuous, dichotomous, or categorical. It is common practice to center continuous covariates by subtracting the mean of the covariate from each individual’s value of the covariate. This process transforms the average of the covariate to be zero and allows for accurate interpretation of growth parameters, which we illustrate below. The parameter estimates for continuous covariates are interpreted as the amount of change in the outcome variable for a 1-unit change in the covariate. Singer25 provides an in-depth discussion on the process and importance of centering covariates. Dichotomous covariates can be used by assigning a reference category as is commonly done in regression analysis. Similarly, categorical covariates can be included in the analysis where the selected reference category serves as the basis of comparison between itself and the other levels of the covariate. Note that the choice of the reference category for a categorical covariate is arbitrary, though it should represent the level of the covariate against which one wishes to draw contrast.
For this example, we use estimated years of education at injury as the covariate. To maintain interpretability of the fixed effects, we centered the covariate about the mean, creating a new variable, mean-centered education (labeled in formulas as $\mathrm{Education}_{MC}$). With the covariate, the group level equations become the following:

$$\pi_{0i} = \beta_{00} + \beta_{01}(\mathrm{Education}_{MC}) + a_{0i}$$

and

$$\pi_{1i} = \beta_{10} + \beta_{11}(\mathrm{Education}_{MC}) + a_{1i}$$

Notice that when $\mathrm{Education}_{MC} = 0$, which is now the average for $\mathrm{Education}_{MC}$, the above equations reduce to $\pi_{0i} = \beta_{00} + a_{0i}$ and $\pi_{1i} = \beta_{10} + a_{1i}$, respectively. Thus, the average trajectory with the centered covariate set at a value of zero reflects the trajectory for the unconditional model.
Introducing a covariate requires the addition of 2 fixed effects, $\beta_{01}$ and $\beta_{11}$. The first, $\beta_{01}$, describes the linear relationship between centered education and the individual’s initial status. The second, $\beta_{11}$, describes the effect of the covariate on the individual’s rate. Thus, the initial status of an individual is a function of the average initial status plus the effect of the covariate on the initial status, and an error term. Likewise, the rate of change of an individual is a function of the average rate plus the effect of the covariate on the rate, and an error term.
To assess the variability explained by the covariate, new estimates of variability in initial status and rate (table 4) are compared with those from the unconditional model. The addition of the covariate reduces the variance in both initial statuses and rates. To compute the variability accounted for by grand mean-centered education, we subtract the estimate of the variance based on the inclusion of the covariate from the estimate provided by the initial model and divide this quantity by the latter. For the variability in rates:

$$\frac{3.7 - 3.3}{3.7} = 0.11$$
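The proportion-of-variance calculation described above can be written as a small helper. This is a sketch only: the function name is hypothetical, and the inputs are the rounded variance estimates reported in tables 1 and 4, so the results match the article's figures only to rounding.

```python
def proportion_variance_explained(var_unconditional, var_conditional):
    """Share of a variance component accounted for after adding a covariate."""
    return (var_unconditional - var_conditional) / var_unconditional


# Variability in rates: 3.7 without the covariate, 3.3 with centered education.
rate_explained = proportion_variance_explained(3.7, 3.3)

# Variability in initial statuses: 62.1 without the covariate, 60.9 with it.
status_explained = proportion_variance_explained(62.1, 60.9)

print(f"Rates: {rate_explained:.1%} of variability explained")
print(f"Initial statuses: {status_explained:.1%} of variability explained")
```

The same helper applies to any variance component that is estimated under both the unconditional and the conditional model.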
Broader applications
Thus, years of education explained roughly 11% of the variability in the rates of change, and (62.1 e 60.9)/62.1 Z .02 or 2.0% of the variability in initial statuses. In both instances, the amount of variability explained is relatively small. The small contribution of this covariate in explaining overall variance is also evident in the finding that variability in both initial statuses and rates remains statistically significant (P<.05). Based on these results, additional covariates may explain additional variance. To demonstrate that the addition of the covariate improves overall model fit, model fit statistics such as the Akaike information criterion (AIC) can be compared between the unconditional model and models containing a covariate(s). In this instance, the AIC for the unconditional model is 10,485, while the AIC for the model containing education is 10,387. Since a drop in the AIC greater than 10 indicates model separation,26 we conclude that the addition of education as a covariate improves model fit. The estimates of the fixed effects are displayed in table 5. Because of the inclusion of the covariate, the model now contains a total of 4 fixed effects: the group intercept, the group rate, the relationship between the covariate and initial status, and the relationship between the covariate and rate. In comparison with the results in table 2, values of the estimates of the group’s initial status and rate have changed only slightly after the inclusion of the covariate, which was expected. Years of education is linearly related to initial status (P Z .01), such that every additional year of education (before TBI) above Table 4 Estimates of variability in initial status and rate, and the covariance of initial status and rate Label
Parameter Estimate SE
Variability in initial u00 status Variability in rate u11 Covariance of initial u01 status and rate
the average for the group is associated with a 0.4-unit increase on initial adjusted COWA score. The significant effect of the covariate on rates of change indicates that each 1-year increase in education before TBI is associated with an increase in rate by an average of 0.2 adjusted COWA units per year. Thus, those with more years of education before TBI start with higher word association scores and demonstrate faster improvement, in comparison to those with fewer years of education. Had this example described an actual study, examining the significant associations between the COWAadj covariate and the initial status and rate would provide an interesting side note. While scores for the adjusted COWA measure are initially adjusted for age, sex, and education based on norm tables, centered education was significantly associated with both the initial status and the rate of change estimates. This could indicate that the initial adjustment in COWA scores based on normative non-TBI data was insufficient, or it could indicate that the IGC modeling of associations between centered education and the initial status and rate parameters captured variance for which the original adjustment method was unable to account. This finding demonstrates one of the clinically relevant questions that can be explored with IGC methods.
z Score P
60.9
6.4 9.6
<.0001
3.3 2.4
1.7 1.9 2.5 1.0
.0311 .3229
This example provides only a glimpse of the possibilities offered by the application of IGC analyses in rehabilitation. In our example, we considered 1 covariate for illustrative purposes, although more would likely be included in hypothesis-driven research. We modeled assessments taken over time within individuals; however, patients are nested within rehabilitation programs or facilities, which may have differential effects on outcome. The IGC model can be expanded to account for these organizational structures by introducing a series of level-3 equations that define the effects of an additional “program” level.4 Although there is no theoretic limit to the number of levels one can model, higher levels should have some form of clustering of individuals at lower levels, and as the number of levels increase, so does the computational power needed to perform the analysis. Often individual response patterns do not conform to simple straight lines but instead display curves, cyclical patterns, or other types of change. To accommodate different types of change, IGC analysis is flexible in modeling a wide variety of response patterns. Covariates can also be used to explain variability in the growth parameters that describe more complex types of change, along with defining the trajectory of a curvilinear trend for group comparison. For instance, those who are 20 years of age at injury may have a completely different curvilinear trend than those whose injury occurred at the age of 60. Multiple covariates can also be modeled and assessed simultaneously.
Table 5 Estimates of group intercept and rate, effect of the covariate on initial status and rate
Parameter                                Estimate   SE     t      P
Group intercept                          25.5       0.5    57.4   <.0001
Group rate                               4.8        0.2    22.7   <.0001
Effect of covariate on initial status    0.4        0.2    2.5    .0137
Effect of covariate on rate              0.2        0.09   2.70   .0073
IGC analysis models time explicitly, which can be treated either as continuous (ie, using the actual amount of time from the initial status in days, weeks, years, or other time unit at which each individual’s scores were measured) or as categorical (ie, by assigning common time points to comparable scores for individuals, such as week 1, week 2, week 6, etc, where the actual data collection took place in a window of 2 or 3 days). Intervals between time points need not be equally spaced, and as long as the data are at least missing at random, IGC analysis does not require individuals to have scores for all time points.5 In addition, if data are missing systematically, it is possible to assess the validity of the results by use of pattern mixture modeling, meaning that in some instances, even cases with systematically absent data may be used.6 Therefore, an individual need not be removed from the analysis simply because 1 or more outcome measures are missing.
As discussed earlier, IGC analysis allows an individual’s trajectory to be compared with that of the group; the trajectory of 1 individual (or subgroup of individuals) can also be directly compared with that of another. This capability provides advantages to clinicians in the application of research findings to practice and in the evaluation of clinical outcomes to inform practice. In applying research findings, clinicians could compare individuals or a subgroup of specific patients with each other or the group average. In this respect, individual growth models are more clinically relevant than fixed-effects-only models (eg, repeated-measures ANOVA) because of their ability to describe recovery at the individual, subgroup, or group level. However, IGC analysis only works well when the response patterns of most individuals fit the selected model structure. When individual response patterns do not adhere to the model, then alternative analytic methods such as fixed-effects models should be used.
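The two ways of coding time described above can be illustrated with a small data-preparation sketch. Everything named here is an assumption made for illustration: the records are invented, the nominal assessment days (7, 14, and 42 for weeks 1, 2, and 6) and the 3-day window are arbitrary choices, and `categorical_time` is a hypothetical helper rather than part of any statistical package. The sketch only prepares a long-format data set; it does not fit a model.

```python
# Nominal assessment days for the categorical coding (assumed for illustration).
NOMINAL_DAYS = {"week 1": 7, "week 2": 14, "week 6": 42}


def categorical_time(actual_day, window_days=3):
    """Return the nominal time point whose window contains actual_day, else None."""
    for label, nominal in NOMINAL_DAYS.items():
        if abs(actual_day - nominal) <= window_days:
            return label
    return None


# Hypothetical records: (patient id, actual day of assessment, score).
# Patient B has no week 2 assessment and still contributes two rows.
records = [
    ("A", 6, 20.0), ("A", 15, 23.0), ("A", 43, 30.0),
    ("B", 8, 18.0), ("B", 41, 27.0),
]

# Long format: one row per assessment, carrying both time codings.
long_format = [
    {
        "id": pid,
        "time_years": day / 365.25,            # continuous coding
        "time_point": categorical_time(day),   # categorical coding
        "score": score,
    }
    for pid, day, score in records
]

for row in long_format:
    print(row)
```

Because each assessment is its own row, unequal spacing and missing visits need no special handling at this stage; a patient simply has fewer rows.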
Individual-level prognoses could be generated for new patients based on previous patients’ data. Once an IGC model has been developed from existing patient data, a trajectory can be predicted for a new patient by substituting the values for the new patient’s demographic and injury characteristics, and for any other covariates included in the IGC model, into the equations for the growth parameters. This predicted trajectory could then be used as an individual benchmark to evaluate the new patient’s actual recovery. Decisions like estimating discharge date and disposition could be based on the predicted magnitude and timing of a plateau in a curvilinear trajectory. In addition, where different components of functioning demonstrate different trajectories of recovery (eg, mobility vs self-care), interventions might be sequenced rather than provided concurrently. Competing interventions could be evaluated for clinically significant differences in rate of recovery in addition to just the magnitude of posttreatment outcome differences.
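The benchmarking idea in this paragraph can be sketched as follows, using the fixed-effect estimates reported in Table 5 for the illustrative linear model with centered education as the only covariate. The names `FixedEffects`, `predicted_score`, and `benchmark`, and the new patient's observed scores, are hypothetical. The prediction uses fixed effects only, because a new patient has no estimated random effects, so it represents the expected trajectory for patients with the same covariate value rather than a patient-specific estimate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FixedEffects:
    """Fixed-effect estimates as reported in Table 5 (time measured in years)."""
    intercept: float = 25.5         # group initial status
    rate: float = 4.8               # group rate of change per year
    edu_on_intercept: float = 0.4   # effect of centered education on initial status
    edu_on_rate: float = 0.2        # effect of centered education on rate


def predicted_score(years, education_centered, fx=FixedEffects()):
    """Expected score at `years` after the initial assessment."""
    initial = fx.intercept + fx.edu_on_intercept * education_centered
    rate = fx.rate + fx.edu_on_rate * education_centered
    return initial + rate * years


def benchmark(observations, education_centered):
    """Compare observed scores with the predicted trajectory.

    observations: iterable of (years, observed_score) pairs.
    Returns (years, observed, predicted, observed - predicted) tuples.
    """
    rows = []
    for years, observed in observations:
        expected = predicted_score(years, education_centered)
        rows.append((years, observed, round(expected, 2), round(observed - expected, 2)))
    return rows


# Hypothetical new patient: education 2 years above the group mean.
for row in benchmark([(0.0, 27.0), (0.5, 28.0), (1.0, 31.0)], education_centered=2.0):
    print(row)
```

A positive difference in the last column means the patient is scoring above the benchmark at that assessment, and a negative difference means below it. How large a difference is clinically meaningful is not something this sketch addresses.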
Discussion
In this special communication, we have described and demonstrated the application of IGC methods and highlighted the benefits and appropriateness of this approach in modeling rehabilitation outcomes. In addition to accounting for interindividual differences in change over time, IGC models can account for associations between covariates and any or all of the growth parameters considered, as well as for associations between the growth parameters themselves. These features offer analytic options that are particularly well suited for rehabilitation research. Rehabilitation patient populations change during and after intervention, and a variety of factors can influence individual outcomes. IGC models offer the flexibility and complexity to better understand the factors that account for individual variability in outcomes over time.
The integration of IGC into rehabilitation research and practice can provide for revolutionary advances in our understanding of how individuals recover and live with disabilities, and offers a powerful method to evaluate the effects of rehabilitation interventions on outcomes. Thus, application of IGC analysis may provide the opportunity to understand intricacies of recovery as trajectories of change not apparent when evaluating an outcome at a single time point or as a set of cross-sectional time points, and could inform all aspects of rehabilitation science, practice, administration, and policy. Additionally, continued development and application of IGC methods for the evaluation of patient outcomes have the potential to revolutionize decision-making for all stakeholders in rehabilitation. Interpreting outcome as an ever-changing process mapped by linear, curvilinear, or nonlinear trajectories rather than as a result at a single point in time (or a series of points in time) could improve treatment and discharge planning. However, IGC and other mixed methods are not appropriate for all applications, and researchers should ensure that their statistical methods are compatible with their study design, particularly in the case of small samples or few time points.27 Many resources are available to readers who are interested in learning more about IGC models specifically and mixed models generally.4,5,24,27-29
In summary, IGC analysis is highly versatile, offering numerous options for longitudinal data analysis that are unavailable in other approaches. The benefits of IGC analysis are as follows:
- Simultaneous evaluation of change over time at both the individual and group level
- The ability to model time as related to outcome in a flexible manner
- The capacity to treat time as either a continuous or a discrete variable
- The ability to model continuous or categorical covariates, or both
- The capacity to retain data for individuals with missing data at 1 or more time points
- The capacity to examine the variability in the parameter estimates (ie, intercepts and slopes) used in describing how an outcome changes over time
- The capability to evaluate the extent to which 1 or more covariates explain variability in the parameter estimates of change of the outcome over time
- The option to include additional levels in the modeling process, by modeling factors that introduce homogeneity into otherwise heterogeneous groups
- The ability to investigate an expansive set of hypotheses
While the benefits of IGC analysis for rehabilitation research are considerable, these methods have limitations. Limitations include the following; the assumptions needed for proper inference are indicated by an asterisk:
- Data should ideally be available for at least 3 time points.
- Dependent variables should be continuous, though the use of pseudocontinuous outcomes is not uncommon in the social sciences.
- Missing data are ideally missing at random.
- The trajectories of the individual response profiles (level-1 model) are correctly specified (ie, linear, quadratic, cubic change, etc).*
- Similarly, level-2 models are correctly specified.*
- The structure of the residuals for the level-1 model is correctly specified.*
- The structure of the residuals for the level-2 models assumes a bivariate normal distribution.*
Conclusions
IGC modeling has widespread application in rehabilitation research and clinical practice, and can serve as a link between the 2 arenas by simultaneously modeling individual and group-level change in outcome over time. IGC modeling is versatile, more appropriate for longitudinal data analyses than are cross-sectional analyses or pre-post treatment designs, and in many cases more appropriate than repeated-measures ANOVA, as these methods fail to model at the individual level. IGC models should be considered where both the individual and group represent important components of analysis. These situations include those where data are nested within the individual (ie, multiple assessments over time), and where individuals are nested within higher-order structures such as treatment programs or facilities. While applicable to randomized designs, the IGC model is particularly useful for analyses of observational longitudinal data such as those found in the TBI and SCI Model Systems databases. The combination of faster computers, advancements in statistical software packages, growing exposure to IGC analysis by rehabilitation researchers, and the expansion of longitudinal databases all point toward incorporating IGC analysis as an essential component of the rehabilitation researcher’s statistical arsenal.
Supplier
a. SAS Institute Inc, 100 SAS Campus Dr, Cary, NC 27513-2414.
Keywords
Longitudinal studies; Regression analysis; Rehabilitation; Treatment outcome
Corresponding author
Allan J. Kozlowski, PhD, Rehabilitation Institute of Chicago, 345 E Ontario St, Chicago, IL 60611. E-mail address: akozlowski@ric.org.
Acknowledgments
We thank John D. Corrigan, PhD, Mark Sherer, PhD, Jennifer Bogner, PhD, Flora M. Hammond, MD, Jeffrey P. Cuthbert, PhD, and Allen W. Heinemann, PhD, for their contributions in the writing and editing of this manuscript.
References
1. Rogosa DR, Brandt D, Zimowski M. A growth curve approach to the measurement of change. Psychol Bull 1982;92:726-48.
2. Dunlop DD. Regression for longitudinal data: a bridge from least squares regression. Am Stat 1994;48:299-303.
3. Rogosa DR. Understanding correlates of change by modeling individual differences in growth. Psychometrika 1985;50:203-28.
4. Raudenbush SW, Bryk AS. Hierarchical linear models: applications and data analysis methods. 2nd ed. Thousand Oaks: Sage Publications; 2002.
5. Fitzmaurice G, Laird M, Ware J. Applied longitudinal analysis. 2nd ed. Hoboken: John Wiley & Sons; 2011.
6. Hedeker DG, Gibbons RD. Longitudinal data analysis. Hoboken: John Wiley & Sons; 2006.
7. Chan L, Koepsell TD, Deyo RA, et al. The effect of Medicare’s payment system for rehabilitation hospitals on length of stay, charges, and total payments. N Engl J Med 1997;337:978-85.
8. Warschausky S, Kay JB, Kewman DG. Hierarchical linear modeling of FIM instrument growth curve characteristics after spinal cord injury. Arch Phys Med Rehabil 2001;82:329-34.
9. Kwok OM, Underhill AT, Berry JW, Luo W, Elliott TR, Yoon M. Analyzing longitudinal data with multilevel models: an example with individuals living with lower extremity intra-articular fractures. Rehabil Psychol 2008;53:370-86.
10. Barker GM, O’Brien SM, Welke KF, et al. Major infection after pediatric cardiac surgery: a risk estimation model. Ann Thorac Surg 2010;89:843-50.
11. van Leeuwen CM, Post MW, Hoekstra T, et al. Trajectories in the course of life satisfaction after spinal cord injury: identification and predictors. Arch Phys Med Rehabil 2011;92:207-13.
12. Putzke JD, Richards S, Hicken BL, DeVivo MJ. Predictors of life satisfaction: a spinal cord injury cohort study. Arch Phys Med Rehabil 2002;83:555-61.
13. Chen Y, Anderson CJ, Vogel LC, Chlan KM, Betz RR, McDonald CM. Change in life satisfaction of adults with pediatric-onset spinal cord injury. Arch Phys Med Rehabil 2008;89:2285-92.
14. Chu B-C, Millis S, Arango-Lasprilla JC, Hanks R, Novack T, Hart T. Measuring recovery in new learning and memory following traumatic brain injury: a mixed-effects modeling approach. J Clin Exp Neuropsychol 2007;29:617-25.
15. SAS statistical software, version 9.3. Cary: SAS Institute; 2012.
16. SPSS statistics, version 20. Armonk: IBM; 2012.
17. MPlus statistical software. Los Angeles: Muthén & Muthén; 2012.
18. Stata analysis and statistical software. College Station: StataCorp; 2012.
19. HLM-7 statistical software for hierarchical linear modeling. Version 7. Lincolnwood: Scientific Software International; 2012.
20. Institute for Statistics and Mathematics, Vienna University of Economics and Business. The comprehensive R archive network website 2012. Available at: http://cran.r-project.org/. Accessed July 3, 2012.
21. Benton AL, Hamsher KD. Multilingual aphasia examination. Iowa City: AJA Associates; 1989.
22. Spreen O, Strauss E. A compendium of neuropsychological tests. New York: Oxford University Pr; 1998.
23. Lezak M, Howieson DB, Loring DW. Neuropsychological assessment. 4th ed. New York: Oxford University Pr; 2004.
24. Singer J, Willett J. Applied longitudinal data analysis. Oxford: Oxford University Pr; 2003.
25. Singer JD. Using SAS Proc Mixed to fit multilevel models, hierarchical models, and individual growth models. J Educ Behav Stat 1998;24:323-55.
26. Burnham KP, Anderson DR. Multimodel inference: understanding AIC and BIC in model selection. Sociol Methods Res 2004;33:261-304.
27. Liu S, Rovine MJ, Molenaar PC. Selecting a linear mixed model for longitudinal data: repeated measures analysis of variance, covariance pattern model, and growth curve approaches. Psychol Methods 2012;17:15-30.
28. Rabe-Hesketh S, Skrondal A. Multilevel and longitudinal modeling using Stata. Vol I: continuous responses. 3rd ed. College Station: Stata Pr; 2012.
29. Rabe-Hesketh S, Skrondal A. Multilevel and longitudinal modeling using Stata. Vol II: categorical responses, counts, and survival. 3rd ed. College Station: Stata Pr; 2012.