Classifying developmental trajectories over time should be done with great caution: a comparison between methods


Journal of Clinical Epidemiology 65 (2012) 1078–1087

Jos Twisk a,b,*, Trynke Hoekstra a,b

a Department of Methodology and Applied Biostatistics, Institute of Health Sciences, Faculty of Earth and Life Science, Vrije Universiteit, De Boelelaan 1087, 1081 HV Amsterdam, The Netherlands
b Department of Epidemiology and Biostatistics, EMGO Institute for Health and Care Research, Vrije Universiteit Medical Center, Amsterdam, The Netherlands

Accepted 18 April 2012; Published online 20 July 2012

Abstract

Objective: In the analysis of data from longitudinal cohort studies, there is a growing interest in the analysis of developmental trajectories in subpopulations of the cohort under study. There are different advanced statistical methods available to analyze these trajectories, but in the epidemiologic literature, most of those are never used. The purpose of the present study is to compare five statistical methods to detect developmental trajectories in a longitudinal epidemiological data set.

Study Design and Setting: All five statistical methods (K-means clustering, a "two-step" approach with mixed modeling and K-means clustering, latent class analysis [LCA], latent class growth analysis [LCGA], and latent class growth mixture modeling [LCGMM]) were performed on a real-life data set and two manipulated data sets. The first manipulated data set contained four different linear developments over time, whereas the second contained two linear and two quadratic developments.

Results: For the real-life data set, all five classification methods revealed comparable trajectories. Regarding the manipulated data sets, LCGA performed best in detecting linear trajectories, whereas none of the methods performed well in detecting a combination of linear and quadratic trajectories. Furthermore, the optimal solution for LCA and LCGA contained more classes than the optimal solution for LCGMM.

Conclusion: Although LCGA and LCGMM seem to be preferable to the simpler methods, all classification methods should be applied with great caution.

© 2012 Elsevier Inc. All rights reserved.

Keywords: Epidemiological methods; Longitudinal studies; Developmental trajectories; Cluster analysis; Latent class growth analysis; Growth mixture modeling

Conflict of interest: None.
* Corresponding author. Tel.: +31-20-5982519; fax: +31-20-5983668. E-mail address: [email protected] (J. Twisk).
0895-4356/$ - see front matter © 2012 Elsevier Inc. All rights reserved. doi: 10.1016/j.jclinepi.2012.04.010

1. Introduction

Within medical and epidemiological research, prospective cohort studies are becoming more and more important. One of the main reasons for this increasing popularity is the possibility to study individual development over time. In addition, researchers are often interested in dividing the cohort under study into groups of subjects with comparable developmental trajectories, first as a tool to describe the population under study and second as a first step in studying either the determinants or the consequences of different trajectories. Although the division into subgroups can be done in many different ways, it is surprising that in medical and epidemiological research, sophisticated methods are hardly used. This could be owing to the fact that most of them are based on structural equation modeling (SEM) [1], a fairly complex statistical technique that is particularly popular in psychology and social science, but not so much in medical science and epidemiology.

Reviewing the available literature, the techniques used to define subgroups of developmental trajectories can be divided into (1) cross-sectional (naive) techniques (i.e., techniques that ignore the longitudinal structure of the data) and (2) longitudinal techniques that define the subgroups according to the parameters of the individual growth curves. The few examples of the classification of developmental trajectories found in the medical and epidemiological literature deal mainly with the development of substance abuse [2–5]; the development of functional limitations in the elderly [6–8]; pediatrics [9–12]; and some specific topics, such as low back pain [13], nighttime bladder control [14], anxiety and depressive disorders [15], and body fatness [16]. It is striking that in these medical and epidemiological studies, there is no consistency in the statistical approach used to classify developmental trajectories. The methodology ranges from relatively simple cross-sectional methods to complicated SEM techniques.

The purpose of the present study is to compare several methods that classify individuals according to their developmental trajectories. This is done first on two data sets in which particular developments were manipulated into the data and second on a real-life data set.

What is new?

Key findings
• Although latent class growth analysis (LCGA) and latent class growth mixture modeling (LCGMM) seem to be preferable to the simpler methods, all classification methods should be applied with great caution.

What this adds to what was known?
• To our knowledge, this is the first study that compares five statistical methods to classify developmental trajectories over time with each other in a practical way without any complicated mathematical issues.
• When there are only linear developments over time, LCGA performed the best.
• When there are both linear and quadratic developments over time, none of the classification methods performed well.
• The number of classes in the optimal solution derived from LCA and LCGA is (much) larger than in the optimal solution derived from LCGMM.

What is the implication and what should change now?
• This article can be used by practical researchers to choose the optimal method to classify developmental trajectories over time.
• This article shows that all classification methods should be applied with great caution.

2. Methods

2.1. Manipulated data sets

Two data sets were created; the first data set consists of four linear developmental trajectories over time, whereas the second data set consists of a combination of two linear and two quadratic developmental trajectories. The following procedure was used to create the data sets, in which the starting point was an epidemiological data set with six repeated measurements on 588 subjects:
(1) The data at each time point were standardized to set the average development over time to zero. Then 2.5 standard deviation units were added to the values to make all values positive.
(2) Based on the median at the first measurement, the population was divided into two groups, and 0.5 standard deviation units were added to all the values of the highest group to get a better distinction between the two groups.
(3) Both the highest and the lowest group were then randomly divided into two groups (one group consisted of approximately 40% of the subjects, whereas the other one consisted of approximately 60% of the subjects). One of the two groups was defined as the stable group, whereas the other one was defined as the increasing or decreasing group. For both groups, the increase or decrease was set at 0.5 standard deviation units at each of the follow-up measurements.
(4) For the quadratic development over time, the same procedure was followed. However, for the quadratic development, an increase (or decrease) of 1 standard deviation unit at time points 2, 3, and 4 was followed by a decrease (or increase) of 1 standard deviation unit at time points 5 and 6 (see Figs. 1A and 2A).

Fig. 1. Manipulated developmental trajectories and the ones discovered by different methods regarding the linear development over time.

Fig. 2. Manipulated developmental trajectories and the ones discovered by different methods regarding the quadratic development over time.

2.2. Real-life data

The real-life data set is taken from the Amsterdam Growth and Health Longitudinal Study (AGAHLS), an observational longitudinal study that started in the late seventies among Dutch adolescents living in and around Amsterdam. For the present example, data from three follow-up measurements were used. These measurements were performed at the ages of 32, 36, and 42 years. The parameter of interest in this example was body mass index (BMI). First, the subjects were classified into developmental trajectories for BMI, and second, these trajectories were used in two additional analyses: (1) the trajectories were used as outcome in a multinomial logistic regression analysis regarding the relationship with birth weight and (2) the trajectories were used as determinant in a linear regression analysis regarding the relationship with mean arterial blood pressure (MAP) measured at the age of 42 years. MAP was calculated as (systolic blood pressure + 2 × diastolic blood pressure)/3. For details about the AGAHLS and the way birth weight and blood pressure were measured, the reader is referred to other articles [17–19].

2.3. Statistical methods for defining subgroups

In this article, five statistical methods are compared with each other. The methods were chosen because they are all currently being used in the scientific literature. The general idea of all five classification methods is to minimize the variance within the classes and to maximize the variance between the classes.

The first, and probably simplest, statistical method that is used to define subgroups of developmental trajectories is K-means clustering. K-means clustering is a cross-sectional iterative technique that creates K classes (subgroups) in such a way that the distance between each observation point and the mean of the class to which that observation is assigned is as small as possible. With this technique, the repeated measurements are not treated as the same variable over time, but as different variables measured on the same individual. The technique requires an a priori specification of the number of classes.

The second method is a "two-step" approach. In this approach, the longitudinal nature of the data is not ignored; the approach starts by performing a mixed model analysis to obtain the population growth curve parameters and the individual random effects. With the individual random effects, the differences between the individuals are added to the model for both the average value over time (i.e., random intercept) and the development over time (i.e., random slope) [20]. Using both the individual and population parameters, the outcome over time is predicted for each subject. The second step is then to use these predicted values in a K-means cluster analysis, as described previously.

The other three methods, latent class analysis (LCA), latent class growth analysis (LCGA), and latent class growth mixture modeling (LCGMM), are all techniques based on SEM, using latent variables [21]. One of the assumptions of these methods is that the data consist of one or more subgroups of individuals with similar developmental trajectories. The goal is therefore to capture this population heterogeneity with a (latent) categorical variable denoting the number of subgroups. In contrast to K-means clustering, for all three methods performed within an SEM framework, the optimal number of classes can be derived from relatively objective criteria, that is, model fit indicators.

Basically, LCA is comparable to K-means cluster analysis within an SEM framework. As in K-means cluster analysis, the longitudinal nature of the data is ignored in LCA, which means that the repeated measurements of the same variable are treated as different variables measured on the same individual.

LCGA is an extension of LCA that does not ignore the longitudinal nature of the data. In fact, LCGA is comparable with the "two-step" approach described earlier, although the mathematical algorithms used in both methods are different. First, LCGA is performed within an SEM framework, whereas the "two-step" approach is not, and second, the two steps in LCGA are performed within one analysis. Fig. 3 illustrates the LCGA model for a longitudinal study with six repeated measurements. The six repeated measurements of the outcome variable are the observed variables, and they are summarized into two latent growth curve parameters, the intercept and the slope. The classification is performed on the basis of the individual intercepts and slopes. Fig. 3 illustrates a linear growth curve analysis; for a quadratic growth curve analysis, a further latent growth curve parameter is added to the model besides the intercept and the slope, that is, the quadratic slope.
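The data manipulation of Section 2.1 and the K-means step of Section 2.3 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the starting data are synthetic rather than the original epidemiological cohort, the high group is assumed to decrease and the low group to increase, the 40% subgroup is assumed to be the changing one, and the function names `make_linear_dataset` and `kmeans` are hypothetical. The analyses in the article were run in SPSS, not in Python.

```python
import random
from statistics import mean, median, pstdev


def make_linear_dataset(n_subjects=588, n_times=6, seed=1):
    """Illustrative re-creation of steps (1) to (3) for the linear data set."""
    rng = random.Random(seed)
    # Assumption: correlated synthetic measurements stand in for the real cohort data.
    raw = []
    for _ in range(n_subjects):
        level = rng.gauss(0.0, 1.0)  # subject-specific level
        raw.append([level + rng.gauss(0.0, 0.5) for _ in range(n_times)])

    # Step 1: standardize each time point, then add 2.5 SD units.
    data = [[0.0] * n_times for _ in range(n_subjects)]
    for t in range(n_times):
        column = [row[t] for row in raw]
        m, s = mean(column), pstdev(column)
        for j in range(n_subjects):
            data[j][t] = (raw[j][t] - m) / s + 2.5

    # Step 2: median split at the first measurement; shift the high group by 0.5 SD.
    cut = median(row[0] for row in data)
    groups = []
    for row in data:
        high = row[0] > cut
        if high:
            for t in range(n_times):
                row[t] += 0.5
        # Step 3: roughly 40% of each half changes by 0.5 SD per follow-up.
        changing = rng.random() < 0.4
        if changing:
            step = -0.5 if high else 0.5  # assumed direction of change
            for t in range(1, n_times):
                row[t] += step * t
        groups.append((high, changing))
    return data, groups


def kmeans(rows, k, n_iter=100, seed=1):
    """Plain Lloyd's algorithm; each time point is treated as a separate variable."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda c: sum((x - m) ** 2 for x, m in zip(row, centers[c])))
            for row in rows
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:  # keep the old center if a class becomes empty
                centers[c] = [mean(col) for col in zip(*members)]
    return labels, centers


data, groups = make_linear_dataset()
labels, centers = kmeans(data, k=4)
```

Because the six measurements enter the distance as six unordered variables, permuting the time points would give the same classes, which is what is meant by K-means ignoring the longitudinal structure.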

Fig. 3. Schematic representation of the latent variable modeling framework. The observed variables at time point 1 to time point 6 (yt1–yt6) are represented by squares. The latent variables are represented by circles, the latent growth curve parameters are indicated by i (for intercept) and s (for slope), and the latent class is indicated by c.
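As a hedged illustration of the model in Fig. 3, the growth curves can be written out for a subject j in latent class c. The subject index j, the residual term, the quadratic parameter q, and the time coding 0 to 5 are notational assumptions introduced here; the article itself gives no formulas.

```latex
% Linear LCGA growth curve for subject j in latent class c (time coded t = 0,...,5, an assumption)
y_{tj} = i_c + s_c \, t + \varepsilon_{tj}, \qquad t = 0, 1, \dots, 5

% Quadratic extension: a quadratic slope q_c is added to the intercept and the slope
y_{tj} = i_c + s_c \, t + q_c \, t^{2} + \varepsilon_{tj}
```

Under this coding, the intercept is the expected value at the first measurement and the slope is the expected change per measurement occasion.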

LCGMM can be seen as an extension of LCGA. The difference between LCGA and LCGMM has to do with the assumptions regarding the individual trajectories within a certain class. Looking at the development from an LCGA perspective, the cohort under study consists of a number of classes, each of which has its own growth trajectory. The classes differ in trajectory shape, but within the classes the individuals are all assumed to have the same growth trajectory (there is no within-class variation). Looking at the development from an LCGMM perspective, the interpretation of the classes is similar, but within the classes, individuals are allowed to differ in growth trajectory (so there can be within-class variation). This means that subjects with slightly different growth parameters are more readily assigned to the same class in LCGMM than in LCGA.

The purpose of this article is not to give an extensive mathematical overview of the different methods. Insight into the (complicated) mathematical background can be found in other articles [21–27].

For the real-life example, the optimal number of classes for LCA, LCGA, and LCGMM was determined by the Bayesian information criterion (BIC) and by the clinical relevance of the different classes; for example, solutions with classes that consist of only one subject were considered clinically irrelevant. The search for the optimal number of classes was performed by a "forward" classifying approach, which starts with a one-class solution (i.e., there are no subgroups; all individuals follow the same trajectory over time) and then adds extra classes one at a time to investigate whether or not model fit improves (i.e., BIC becomes lower) owing to the additional class. For the manipulated data sets, the same procedure was followed, although in general the four-class solution was presented.

2.4. Computer software

Several software programs were used to define the subgroups: K-means clustering was performed with SPSS version 18.0; mixed model analysis was performed with MLwiN version 2.22 [20]; and LCA, LCGA, and LCGMM were performed with Mplus version 4.1 [28]. Furthermore, SPSS version 18.0 was used for the additional multinomial logistic regression analyses and linear regression analyses.
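The "forward" search described at the end of Section 2.3 can be sketched as a simple stopping rule on BIC. This is an illustrative sketch only: the function names `bic` and `forward_select` are hypothetical, the example uses the LCGMM values reported in Table 2 for the linear manipulated data set (which start at two classes, whereas the procedure in the text starts at one), and the clinical relevance check on class size is not modeled.

```python
import math


def bic(log_likelihood, n_parameters, n_subjects):
    """Bayesian information criterion; lower values indicate better fit."""
    return -2.0 * log_likelihood + n_parameters * math.log(n_subjects)


def forward_select(bic_by_k):
    """Add classes one at a time; stop when BIC stops decreasing or a model fails.

    bic_by_k maps the number of classes to a BIC value, or to None when the
    model did not converge.
    """
    best_k, best_bic = None, math.inf
    for k in sorted(bic_by_k):
        value = bic_by_k[k]
        if value is None or value >= best_bic:
            break
        best_k, best_bic = k, value
    return best_k


# LCGMM, linear manipulated data set (Table 2); the five-class model did not converge.
lcgmm_linear = {2: 7326, 3: 7220, 4: 7172, 5: None}
chosen = forward_select(lcgmm_linear)  # -> 4
```

A rule of this kind stops at the first non-improvement, so it cannot recover from a local dip in fit; that is one reason the resulting solution still needs to be judged on interpretability and class size.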

3. Results

3.1. Manipulated data sets

Figure 1 shows the results of the different analyses regarding classification of the linear developmental trajectories, and Table 1 shows the cross-tabulation between the numbers in the original classes and the classes detected by the different methods. First of all, LCGA seems to perform best by detecting all four trajectories almost perfectly (91% of the subjects were classified in the same trajectory as in the original classes). The most relevant solution in LCGMM was a three-class solution in which the subjects with a stable high and a decreasing development were classified into the same group (see Table 1). Surprisingly, when a four-class solution was estimated for LCGMM (which was statistically the best solution; see Table 2), the increasing group was divided into two separate classes. Regarding the linear development, K-means clustering, the "two-step" approach, and LCA gave more or less the same classes. For all three methods, only the increasing group was detected well.

Figure 2 shows the results of the different analyses regarding the classification of the quadratic developmental trajectories, and Table 1 shows the cross-tabulation between the numbers in the original classes and the classes detected by the different methods. None of the methods seems to perform well, which is owing to the fact that the stable high and stable low groups were not detected by the different methods. For K-means clustering, the "two-step" approach, LCA, and LCGA, the class with an increasing quadratic development is detected relatively well. Furthermore, the stable low class and the class with a decreasing quadratic development were combined by these four methods. The optimal LCGMM solution (see Table 2) was a three-class solution in which the stable high and the decreasing quadratic development were combined into one class. This was also more or less the case for the stable low and the increasing quadratic development. In the optimal solution, however, the latter was divided into two separate classes.

From Table 2, it can be seen that for both the linear and quadratic development over time, the optimal solutions of LCA and LCGA revealed more than the four classes that were manipulated in the data.

Table 1. Cross-tabulation between the number of subjects in the original classes and the classes detected by the different methods regarding the linear and quadratic development over time

                        Linear development               Quadratic development
                        Original groups                  Original groups
Detected class            1     2     3     4     N        1     2     3     4     N
K-means clustering
  1                     112     0     0    76   188        0     0   144    16   160
  2                       4   168     8     0   180        8   172    16     0   196
  3                       0     4   104     0   108        0     0    16     0    16
  4                       4     0    64    44   112      112     0     0   104   216
"Two-step" approach
  1                     112     0     0    84   196        0     0   120    16   136
  2                       4   172    20     0   196        8   140     4     0   152
  3                       0     0    72     0    72        0    32    52     0    84
  4                       4     0    84    36   124      112     0     0   104   216
LCA
  1                     112     0     0    76   188        0     0   132    16   148
  2                       8   168    12     0   188        8   172    20     0   200
  3                       0     4   104    36   144        0     0    20     0    20
  4                       0     0    60     8    68      112     0     4   104   220
LCGA
  1                     112     0     4     8   124        0     0   152    16   168
  2                       8   172    16     0   196        8   172    12     0   192
  3                       0     0   152    12   164        0     0    12     0    12
  4                       0     0     4   100   104      112     0     0   104   216
LCGMM
  1                     112     0     0     0   112      120   120     0     0   240
  2                       8   168     4     0   180        0    52     4     0    56
  3/4                     0     4   172   120   296        0     0   172   120   292

Abbreviations: N, number of subjects; LCA, latent class analysis; LCGA, latent class growth analysis; LCGMM, latent class growth mixture model.

Table 2. Bayesian information criteria for the different latent class analyses performed on the two manipulated data sets with a different number of classes

Number of classes: 2, 3, 4, 5, 6, 7, 8, Optimal

Linear development LCA

LCGA

10106 10219 9173 9402 8689 9027 8369 8950 8080 8392 7829 8087 a 7660 7178 (12)

Quadratic development

LCGMM 7326 7220 7172 a a a a

LCA

LCGA

LCGMM

9795 10034 7940 9330 9633 7823 a 9006 9339 a 8665 9094 a 8294 8911 a 8058 8651 b a 7856 7515 (10)

Abbreviations: LCA, latent class analysis; LCGA, latent class growth analysis; LCGMM, latent class growth mixture model.
a Models did not converge.
b One of the classes consists of only one subject.
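The 91% agreement quoted for LCGA on the linear data set can be checked from the LCGA counts in Table 1. The sketch below is a minimal illustration, assuming that detected class k corresponds to original group k (so that agreement is the diagonal share) and using a hypothetical helper name `agreement`.

```python
# Rows: classes detected by LCGA; columns: original groups 1 to 4 (linear data set, Table 1).
lcga_linear = [
    [112, 0, 4, 8],
    [8, 172, 16, 0],
    [0, 0, 152, 12],
    [0, 0, 4, 100],
]


def agreement(table):
    """Share of subjects on the diagonal of a square cross-tabulation."""
    total = sum(sum(row) for row in table)
    matched = sum(table[k][k] for k in range(len(table)))
    return matched / total


share = agreement(lcga_linear)
print(f"{share:.1%}")  # 536 of 588 subjects: 91.2%
```

The same helper applied to the other methods requires first deciding which detected class corresponds to which original group, which is less obvious when classes are merged.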


To give some more detail of the results obtained by the different methods, the Appendix (at www.jclinepi.com) shows the mean values and standard deviations of the different classes detected by the different methods.

3.2. Real-life data set

Table 3 shows descriptive information of the variables used in the real-life data example, and Fig. 4 shows the results of the classification of BMI trajectories from 32 to 42 years of age in the AGAHLS. Table 4 shows the model fit values used to define the optimal number of BMI trajectories in the three methods based on SEM. All methods used to classify the BMI trajectories showed comparable results: stable, slightly increasing trajectories that differ from each other in the average BMI over time. However, the optimal number of classes derived from LCA and LCGA was higher than the optimal number of classes derived from LCGMM. Although it could not be based on objective criteria, a three-class solution was also chosen for the "two-step" approach and K-means clustering. In a five-class solution, in both methods, one of the subgroups contained only one subject.

Tables 5 and 6 show the results of the additional analyses using the different trajectories. Table 5 shows the results of the multinomial logistic regression analysis regarding the relationship between birth weight and BMI trajectories, and Table 6 shows the results of the linear regression analysis regarding the relationship between the BMI trajectories and MAP. Regarding the relationship between birth weight and BMI trajectory, more or less the same results were found for the trajectories created with the SEM approaches. Increasing odds ratios for the trajectories with higher BMIs suggested a positive relationship between birth weight and BMI. Although these results are rather surprising and in contrast with Barker's hypothesis, it is not the purpose of this study to discuss this any further [18].

Regarding the linear relationship between BMI class membership and MAP measured at the age of 42 years, all results were more or less the same; the higher the average value of the BMI trajectory, the stronger the relationship with MAP.

Table 3. Descriptive information of the variables used in the real-life data set example

Variables                                  Mean    Standard deviation
BMI at age 32                              23.3    2.9
BMI at age 36                              23.9    3.1
BMI at age 42                              24.5    3.5
Mean arterial blood pressure (mm Hg)a      84.5    9.9
Birth weight (kg)                           3.4    0.7

Abbreviation: BMI, body mass index.
a Measured at 42 years of age.

4. Discussion

In the present article, five statistical methods (K-means clustering, a "two-step" approach involving a mixed model analysis and K-means clustering, LCA, LCGA, and LCGMM) were compared with each other to create subgroups of individuals with comparable developmental trajectories. This was done in two manipulated data sets and in a real-life data set. In the real-life data set, the results regarding the classification of subjects in different developmental trajectories were comparable for the different methods, whereas for the manipulated data sets, LCGA seemed to perform best for detecting linear developments over time. For the quadratic developments over time, none of the methods performed well, although LCGMM probably resulted in the most clinically interpretable solution. However, the trajectories detected by LCGMM were not the same as the ones manipulated into the data. The biggest difference within the SEM approaches was (in both the real-life data set and the manipulated data sets) the higher number of classes in the optimal solution for LCA and LCGA compared with LCGMM.

Currently, in the (psychological methodology) literature, there is some discussion about the use of LCGA or LCGMM [29–32]. The LCGA methodology was developed by Nagin and Tremblay [23,33] and implemented in the SAS procedure Traj [34], whereas the LCGMM methodology was developed by Muthén et al. [21,22,24] and implemented in the Mplus software program [28]. In the present article, LCGA was also performed with the Mplus software, as was LCA, making it possible to explore the three methods within one program. As previously mentioned, LCGMM can be seen as an extension of LCGA. The difference between the two methods has to do with the variation within a certain class. With LCGA, this variation is set to zero, whereas for LCGMM this is not the case. This implies that the (optimal) number of classes derived from LCGA is always bigger than the (optimal) number of classes derived from LCGMM.
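The difference in within-class variation can be written compactly. The notation below is an assumption introduced for illustration and is not taken from the article: the vector of growth parameters of subject j is written as eta, the class-specific mean growth parameters as alpha, the subject-specific deviation as zeta, and its covariance matrix as Psi.

```latex
% Growth parameters (intercept, slope) of subject j in latent class c
\eta_j = \alpha_c + \zeta_j, \qquad \zeta_j \sim N(0, \Psi_c)

% LCGA fixes the within-class covariance at zero; LCGMM estimates it
\text{LCGA: } \Psi_c = 0, \qquad \text{LCGMM: } \Psi_c \text{ freely estimated}
```

With the covariance fixed at zero, any spread in individual intercepts and slopes can only be absorbed by adding classes, which is consistent with the larger number of classes found for LCGA.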
Fig. 4. Developmental trajectories of BMI detected by different methods. BMI, body mass index.

Table 4. Bayesian information criteria for the different latent class analyses performed in the real-life data set to detect different developmental trajectories of BMI

Number of classes    LCA     LCGA    LCGMM
2                    5524    5512    4753
3                    5299    5281    4766
4                    5107    5084    a
5                    5107    4993    a
6                    b       b       a

Abbreviations: BMI, body mass index; LCA, latent class analysis; LCGA, latent class growth analysis; LCGMM, latent class growth mixture model.
a Models did not converge.
b One of the classes consists of only one subject.

Within LCGA, subjects with slightly different growth parameters are more readily assigned to different classes than in LCGMM, in which the growth parameters within a class are allowed to differ.

To find the optimal number of classes, various approaches are available. Probably the best way is the strategy performed in the present article, that is, a "forward" classifying approach, which starts with a one-class solution (i.e., there are no subgroups; all individuals follow the same trajectory over time) and then adds extra classes one at a time to investigate whether or not the model fit improves owing to the additional class. This procedure ends the moment the model fit no longer improves. As has been shown in the present article, one has to be very careful with this procedure. It could be that the solution that statistically describes the data best is a solution with one (or more) clinically uninterpretable classes, or with one or more classes with very few subjects. Both issues should be kept in mind when analyzing the data with these classification methods.

In the present article, the BIC was used to decide on the optimal number of classes in the three latent class techniques. Although the BIC is probably the most used model fit parameter, there is still a lively discussion going on about this topic in the literature. For instance, Lo et al. [35] developed a likelihood ratio test (LMR-LRT), while more recently the bootstrap likelihood ratio test has been shown to be a good indicator for choosing the optimal number of classes (for a clear overview of these indicators, see Nylund et al. [36] or Muthén [37]). On the other hand, within K-means clustering, there is no possibility to use model fit parameters to decide on the optimal number of classes. This is owing to the fact that the classification is not based on maximum likelihood estimates, but on distance measures [38,39].

When using maximum likelihood estimation, it is important to know that the final results can easily be influenced by so-called local maxima. This has to do with the fact that the likelihood as a function of the parameters does not have only one maximum, but several maxima. Which solution is obtained depends strongly on the starting values of the parameters, and it is striking that the nonoptimal solutions can be highly different from the optimal solution. It is therefore strongly advised to recompute the solution with different starting values to obtain a final, optimal solution. Nowadays, software programs such as Mplus or LatentGOLD [40] provide the possibility to use different starting values in one analysis to obtain the optimal solution directly. In the present article, all three latent class techniques were performed with at least 100 different (random) starting values.

There are a few other studies comparing different methods to classify developmental trajectories with each other [14,27,41]. Two of them include LCGMM in their comparison. Kreuter and Muthén [41] investigated criminal behavior over time and used Poisson and zero-inflated Poisson distributions to model their outcomes. Based on model fit parameters and pragmatic evaluation, they

J. Twisk, T. Hoekstra / Journal of Clinical Epidemiology 65 (2012) 1078e1087 Table 5. Results of the multinomial logistic regression analysis regarding the relationship between birth weight and class membership for the development of BMI between 32 and 42 years of age

1085

Table 6. Results of the linear regression analysis regarding the relationship between class membership for the development of BMI between 32 and 42 years of age and mean arterial blood pressure at 42 years of age

95% CI

P-value

K-means clustering 2 1.17 3 1.07

0.80e1.71 0.60e1.90

0.42 0.82

K-means clustering 2 3

‘‘Two-step’’ approach 2 1.14 3 1.19

0.78e1.66 0.67e2.09

0.51 0.55

‘‘Two-step’’ approach 2 2.1 3 6.3

0.2, 4.6 2.7, 9.8

LCA 2 3 4 5

1.25 1.44 1.46 2.72

0.79e1.96 0.87e2.37 0.82e2.59 0.96e7.68

0.34 0.16 0.20 0.06

LCA 2 3 4 5

1.9 3.2 8.2 6.8

0.9, 0.1, 4.5, 1.5,

4.7 6.4 11.9 15.2

0.19 0.05 !0.001 0.11

LCGA 2 3 4 5

1.26 1.41 1.45 2.72

0.80e1.98 0.86e2.33 0.82e2.58 0.96e7.67

0.33 0.17 0.20 0.06

LCGA 2 3 4 5

1.7 3.5 8.2 6.8

1.1, 0.3, 4.5, 1.5,

4.5 6.6 11.9 15.1

0.23 0.03 !0.001 0.11

LCGMM 2 3

1.43 2.16

0.96e2.12 0.53e8.69

0.08 0.28

LCGMM 2 3

4.7 8.9

2.1, 7.3 1.6, 19.5

!0.001 0.09

Groups

Odds ratioa

Groups

Regression coefficienta 2.1 6.6

95% CI

P-value

0.2, 4.4 3.0, 10.2

0.07 !0.001 0.07 0.001

Abbreviations: BMI, body mass index; CI, confidence interval; LCA, latent class analysis; LCGA, latent class growth analysis; LCGMM, latent class growth mixture model. a All analyses were adjusted for gender and the first group was used as reference category.

Abbreviations: BMI, body mass index; CI, confidence interval; LCA, latent class analysis; LCGA, latent class growth analysis; LCGMM, latent class growth mixture model. a All analyses were adjusted for gender and the first group was used as a reference category.

concluded that LCGMM was the best way of classifying their subjects. Feldman et al. [27] investigated alcohol use over time and preferred LCGA over LCGMM. This preference was based mainly on the computational difficulties of LCGMM, which stem from its flexibility in modeling the earlier mentioned heterogeneity within a class. This flexibility comes at a price: the computations of LCGMM are quite complicated, and in many situations (such as the examples in the present article) the models do not converge; that is, they do not lead to a proper (statistical) solution.

Another issue that should be taken into account when classifying developmental trajectories over time is that the methods described in this article aim to detect subgroups in the study sample. This is not necessarily the same as detecting underlying subpopulations. Besides “real” subpopulation differences, there are many other causes (non-normality of the original distribution in the population or subpopulations, sample fluctuation, and so on) for the presence of subgroups in a particular study sample. These problems often lead to overextraction of classes; that is, the number of classes retrieved is too high [31,42].

Within epidemiology, creating groups with the same developmental trajectory is increasingly popular. It is mostly used as a descriptive tool, but the trajectories are also used either as a determinant for future health outcomes or as an outcome variable to investigate potential predictors of these trajectories. In the present article, both additional analyses were illustrated in the real-life data set. However, both questions can be answered with different methods as well. To investigate potential predictors of the development over time in a certain outcome, it is not necessary to classify the population under study into several groups; the outcome variable itself can be analyzed with, for instance, a mixed model analysis. Likewise, to investigate the relationship between different developments over time and future health outcomes, classification is not necessary: instead of using the categorical group variable as a determinant for future health outcomes, the individual growth parameters can be used as determinants. In general, classifying developmental trajectories is usually not the only way to answer such research questions.

It is also important to realize that there is some uncertainty in the class assignment. The three classification methods based on SEM yield an estimate of the probability that a particular individual belongs to a certain class. Because this probability (known as the posterior class probability) is never equal to one, the assignment of an individual to a class is never certain. It is argued that this uncertainty should be taken into account when the classes are used in further analyses. Several methods, such as using the

posterior class probabilities as a weight in the regression analysis or using “pseudo-class” draws, are available [43]. However, it is not clear which method is best. On the other hand, when the posterior class probabilities are high, the uncertainty of class assignment will be low.

5. Conclusion

In conclusion, based on both the real-life data set and the manipulated data sets, it is not clear which method for classifying developmental trajectories in prospective epidemiological and medical studies is best, although LCGA and LCGMM seem preferable to the simpler methods. However, all methods should be applied with great caution.

Appendix
Supplementary data

Supplementary data related to this article can be found online at doi:10.1016/j.jclinepi.2012.04.010

References

[1] Duncan T, Duncan S, Stryker L, Li F, Alpert A, editors. An introduction to latent variable modelling. Concepts, issues and applications. Mahwah, NJ: Lawrence Erlbaum Associated Publishers; 1999.
[2] Conklin C, Perkins K, Sheidow A, Jones B, Levine M, Marcus M. The return to smoking: 1-year relapse trajectories among female smokers. Nicotine Tob Res 1999;7:533–40.
[3] Casswell S, Pledger M, Pratap S. Trajectories of drinking from 18 to 26 years: identification and prediction. Addiction 2002;97:1427–37.
[4] Schulenberg J, Merline A, Johnston L, O’Malley P, Bachman J, Laetz V. Trajectories of marijuana use during the transition to adulthood: the big picture based on national panel data. J Drug Issues 2005;35:255–79.
[5] Reboussin B, Lohman K, Wolfson M. Modeling adolescent drug-use patterns in cluster-unit trials with multiple sources of correlation using robust latent class regressions. Ann Epidemiol 2006;16:850–9.
[6] Deeg D. Longitudinal characterization of course types of functional limitations. Disabil Rehabil 2005;27:253–61.
[7] Liang J, Shaw B, Krause N, Bennet JM, Kobayashi E, Fukaya T, et al. How does self-assessed health change with age? A study of older adults in Japan. J Gerontol B Psychol Sci Soc Sci 2005;60B:S224–32.
[8] Liang J, Shaw B, Krause N, Bennet JM, Blaum C, Kobayashi E, et al. Changes in functional status among older adults in Japan: successful and usual aging. Psychol Aging 2003;18:684–95.
[9] Alexy U, Sichert-Hellert W, Kersting M, Schultze-Pawlitschko V. Pattern of long-term fat intake and BMI during childhood and adolescence – results of the DONALD study. Int J Obes Relat Metab Disord 2004;28:1203–9.
[10] Barrett A, White H. Trajectories of gender role orientations in adolescence and early adulthood: a prospective study of the mental health effects of masculinity and femininity. J Health Soc Behav 2002;43:451–68.
[11] Li C, Goran M, Kaur H, Nollen N, Ahluwalia J. Developmental trajectories of overweight during childhood: role of early life factors. Obesity 2007;15:760–71.
[12] Ventura A, Loken E, Birch L. Risk profiles for metabolic syndrome in a nonclinical sample of adolescent girls. Pediatrics 2006;118:2434–42.
[13] Dunn K, Jordan K, Croft P. Characterizing the course of low back pain: a latent class analysis. Am J Epidemiol 2006;163:754–61.
[14] Croudace T, Jarvelin M-R, Wadsworth M, Jones P. Development typology of trajectories to nighttime bladder control: epidemiologic application of longitudinal latent class analysis. Am J Epidemiol 2003;157:834–42.
[15] Ferdinand R, de Nijs P, van Lier P, Verhulst F. Latent class analysis of anxiety and depressive symptoms in referred adolescents. J Affect Disord 2005;88:299–306.
[16] Hoekstra T, Barbosa-Leiker C, Koppes LLJ, Twisk JWR. Developmental trajectories of body mass index throughout the life course: an application of latent class growth (mixture) modelling. Longit Life Course Stud 2011;2:319–30.
[17] Kemper HC, editor. Amsterdam Growth and Health Longitudinal Study (AGAHLS): a 23-year follow-up from teenager to adult about lifestyle and health. Basel, Switzerland: Karger; 2004.
[18] Te Velde SJ, Twisk JW, Van Mechelen W, Kemper HCG. Birth weight, adult body composition, and subcutaneous fat distribution. Obes Res 2003;11:202–8.
[19] Ferreira I, Henry RM, Twisk JW, van Mechelen W, Kemper HC, Stehouwer CD. The metabolic syndrome, cardiopulmonary fitness, and subcutaneous trunk fat as independent determinants of arterial stiffness: the Amsterdam Growth and Health Longitudinal Study. Arch Intern Med 2005;165:875–82.
[20] Goldstein H. Multilevel statistical models. 3rd ed. London, UK: Edward Arnold; 2003.
[21] Muthen B. Latent variable analysis: growth mixture modeling and related techniques for longitudinal data. In: Kaplan D, editor. Handbook of quantitative methodology for the social sciences. Newbury Park, CA: Sage Publications; 2004.
[22] Muthen B, Shedden K. Finite mixture modeling with mixture outcomes using the EM algorithm. Biometrics 1999;55:463–9.
[23] Nagin D. Analyzing developmental trajectories: a semi-parametric group based approach. Psychol Methods 1999;6:18–34.
[24] Muthen B, Muthen L. Integrating person-centered and variable-centered analyses: growth mixture modeling with latent trajectory classes. Alcohol Clin Exp Res 2000;24:882–91.
[25] Jung T, Wickrama KAS. An introduction to latent class growth analysis and growth mixture modeling. Soc Personal Psychol Compass 2008;2:302–17.
[26] Muthen B, Asparouhov T. Growth mixture modeling: analysis with non-Gaussian random effects. In: Fitzmaurice G, Davidian M, Verbeke G, Molenberghs G, editors. Longitudinal data analysis. Boca Raton, FL: Chapman & Hall/CRC Press; 2008:143–65.
[27] Feldman B, Masyn K, Conger R. New approaches to studying problem behaviors: a comparison of methods for modeling longitudinal, categorical adolescent drinking data. Dev Psychol 2009;45:652–76.
[28] MPlus [computer program]. Version 4.1; 1998–2007.
[29] Muthen B. The potential of growth mixture modelling. Infant Child Dev 2006;15:623–5.
[30] Connell A, Frye A. Response to commentaries on target paper, “Growth mixture modelling in developmental psychology”. Infant Child Dev 2006;15:639–42.
[31] Connell A, Frye A. Growth mixture modelling in developmental psychology: overview and demonstration of heterogeneity in developmental trajectories of adolescent antisocial behaviour. Infant Child Dev 2006;15:609–21.
[32] Hoeksma J, Kelderman H. On growth curves and mixture models. Infant Child Dev 2006;15:627–34.
[33] Nagin D, Tremblay R. Analyzing developmental trajectories of distinct but related behaviors: a group-based method. Psychol Methods 2001;229:374–93.
[34] Jones B, Nagin D, Roeder K. A SAS procedure based on mixed models for estimating developmental trajectories. Sociol Methods Res 2001;229:374–93.

[35] Lo Y, Mendell NR, Rubin DB. Testing the number of components in a normal mixture. Biometrika 2001;88:767–88.
[36] Nylund K, Asparouhov T, Muthen B. Deciding on the number of classes in latent class analysis and growth mixture modeling: a Monte Carlo simulation study. Struct Equ Modeling 2007;14:535–69.
[37] Muthen B. Statistical and substantive checking in growth mixture modeling. Psychol Methods 2003;8:369–77.
[38] Hartigan J, Wong M. A K-means clustering algorithm: Algorithm AS 136. Appl Stat 1979;28:126–30.
[39] Hartigan J. Clustering algorithms. New York, NY: John Wiley & Sons, Inc; 1975.
[40] LatentGOLD [computer program]. Version 4.0. Belmont, MA: Statistical Innovations; 2000.
[41] Kreuter F, Muthen B. Analyzing criminal trajectory profile: bridging multilevel and group-based approaches using growth mixture modelling. J Quant Criminol 2008;24:1–31.
[42] Bauer D, Curran P. Distributional assumptions of growth mixture models: implications for over extraction of latent trajectory classes. Psychol Methods 2003;8:338–63.
[43] Clark SL, Muthen B. Relating latent class analysis results to variables not included in the analysis. 2009; Available at http://www.statmodel.com/download/Relatinglca.pdf. Accessed June 12, 2012.