Conditional Growth Models: An Exposition and Some Extensions

Chapter 11

Clive Osmond¹ and Caroline H.D. Fall
MRC Lifecourse Epidemiology Unit, University of Southampton, Southampton General Hospital, Southampton, United Kingdom
¹ Corresponding author: e-mail: [email protected]

ABSTRACT

We discuss the analysis of data in which, for each subject, size measures, typically height and weight, are combined to form a growth trajectory, and the interest is in how that growth trajectory is associated with an outcome measured at or after the last size measurement. This is a common scenario in studies of the “Developmental Origins of Health and Disease.” As an illustration we use data on impaired glucose tolerance and diabetes from the New Delhi Birth Cohort Study. We define a model in which growth is assessed in sequential age intervals as a deviation from what might have been predicted at the start of the interval. We develop extensions to this model that consider height and weight simultaneously, that incorporate measures of body fatness, and that consider the problem in reverse time. We apply these models to the Delhi data and conclude by discussing their strengths and weaknesses, and some of the pitfalls inherent in this form of analysis.

Keywords: Growth models, Developmental origins, Growth trajectory, Epidemiology

1 INTRODUCTION: THE PROBLEM TO BE ADDRESSED

We discuss the analysis of data in which, for each subject, a sequence of size measures, typically height and weight, is combined to form a growth trajectory, and the interest is in how that growth trajectory is associated with an outcome measured at or after the last size measurement. This is a common scenario in studies of the “Developmental Origins of Health and Disease.” As an illustration we use data from the New Delhi Birth Cohort Study, which we summarize in Section 2. We describe the conditional growth model in Section 3. Section 4 gives descriptive data for the cohort and the results of analyses using traditional cross-sectional approaches. The conditional growth model results are in Section 5, which is followed by a final Section 6, which discusses the model’s strengths and weaknesses and some of the pitfalls inherent in this form of analysis.

Handbook of Statistics, Vol. 37, Section VIII: Public Health and Epidemic Data Modeling. https://doi.org/10.1016/bs.host.2017.08.012 © 2017 Elsevier B.V. All rights reserved.

2 THE NEW DELHI BIRTH COHORT STUDY

The New Delhi Birth Cohort Study began in 1969. A full description of the study design is available in Bhargava et al. (2004). Briefly, the study started when fieldworkers identified all families living in a relatively prosperous 12 km² area of South Delhi, India, among whom there were 20,755 married women of reproductive age. Fieldworkers visited these women every 2 months to record menstrual dates. A health visitor then saw those who became pregnant every 2 months until 37 weeks of gestation, and every 2 days afterward. There were 8030 singleton births in this group before the end of 1972. Trained fieldworkers recorded the weight and height of each baby within 3 days of birth, at the ages of 3, 6, and 9 months, and then 6-monthly from age 1 year onward, within 15 days of the exact age, up to age 21 years.

Between August 1998 and August 2002 fieldworkers traced and visited 2584 of the cohort members, recording life histories. They invited the subjects to clinics, and 1526 attended for measurement of anthropometry, blood pressure, and lipids, and for a standard 75 g glucose tolerance test. Subjects came after an overnight fast. Plasma glucose concentrations at baseline and 120 min after ingestion of glucose were measured using the glucose oxidase method with a Beckman autoanalyzer. Impaired glucose tolerance was defined as a fasting plasma glucose concentration of less than 126 mg per dL and a 120-min value of at least 140 mg per dL; diabetes was defined as a fasting value of at least 126 mg per dL or a 120-min concentration of at least 200 mg per dL. The All India Institute of Medical Sciences approved the study. Informed consent was obtained from each subject.

3 CONDITIONAL GROWTH MODELS

3.1 The Basic Concept

Keijzer-Veen et al. (2005) give an early exposition of the basic idea of the conditional model, while Gale et al. (2006) provide an early application. The age range we study is partitioned into $n$ intervals, $(a_0 \text{ to } a_1), (a_1 \text{ to } a_2), \ldots, (a_{n-2} \text{ to } a_{n-1}), (a_{n-1} \text{ to } a_n)$, where $a_0$ and $a_n$ are the youngest and oldest ages considered. Everything we know about subject $i$ up to and including age $a_j$ is summarized as $H_{i,a_j}$, where “H” is chosen to convey the idea of the subject’s History. We study size measures at the ages $a_0, a_1, a_2, \ldots, a_{n-2}, a_{n-1}, a_n$, which we denote for subject $i$ as $x_{i,a_0}, x_{i,a_1}, x_{i,a_2}, \ldots, x_{i,a_{n-2}}, x_{i,a_{n-1}}, x_{i,a_n}$. At the beginning of an age interval, say $a_j$ to $a_{j+1}$, we develop a model using data from all the subjects and apply it to subject $i$, using the content of $H_{i,a_j}$, to form a prediction of $x_{i,a_{j+1}}$, which we denote

$$\hat{x}_{i,a_{j+1}} = E\left(x_{i,a_{j+1}} \mid H_{i,a_j}\right),$$

the expected value of $x_{i,a_{j+1}}$ given $H_{i,a_j}$. The residual $r_{i,a_{j+1}}$, which is the difference $x_{i,a_{j+1}} - \hat{x}_{i,a_{j+1}}$, measures the amount by which the size at the end of the interval (age $a_{j+1}$) exceeds the prediction made at the beginning of the interval (age $a_j$). This is a measure of the subject’s growth in the interval beyond expectation.

For a simple example, if weight is known at ages $a_0 = 0$, $a_1 = 2$, and $a_2 = 11$ years ($wt_{i,0}$, $wt_{i,2}$, $wt_{i,11}$, respectively) in a large population of boys, $i = 1$ to $I$, then a possible prediction model for the interval from age 0 to 2 years, for which $H_{i,0} = \{wt_{i,0}\}$, is the simple linear model

$$wt_{i,2} = \alpha_2 + \beta_{0,2}\, wt_{i,0} + r_{i,2},$$

and for the interval from age 2 to 11 years, for which $H_{i,2} = \{wt_{i,0}, wt_{i,2}\}$, is the linear regression model

$$wt_{i,11} = \alpha_{11} + \beta_{0,11}\, wt_{i,0} + \beta_{2,11}\, wt_{i,2} + r_{i,11},$$

where $\alpha_2$, $\beta_{0,2}$, $\alpha_{11}$, $\beta_{0,11}$, and $\beta_{2,11}$ are regression coefficients, estimated as $\hat{\alpha}_2$, $\hat{\beta}_{0,2}$, $\hat{\alpha}_{11}$, $\hat{\beta}_{0,11}$, and $\hat{\beta}_{2,11}$, where

$$\widehat{wt}_{i,2} = E(wt_{i,2} \mid H_{i,0}) = \hat{\alpha}_2 + \hat{\beta}_{0,2}\, wt_{i,0}$$

and

$$\widehat{wt}_{i,11} = E(wt_{i,11} \mid H_{i,2}) = \hat{\alpha}_{11} + \hat{\beta}_{0,11}\, wt_{i,0} + \hat{\beta}_{2,11}\, wt_{i,2},$$

and where $r_{i,2} \sim N(0, \sigma_2^2)$ and $r_{i,11} \sim N(0, \sigma_{11}^2)$ are all independent and estimated as

$$\hat{r}_{i,2} = wt_{i,2} - \hat{\alpha}_2 - \hat{\beta}_{0,2}\, wt_{i,0}$$

and

$$\hat{r}_{i,11} = wt_{i,11} - \hat{\alpha}_{11} - \hat{\beta}_{0,11}\, wt_{i,0} - \hat{\beta}_{2,11}\, wt_{i,2}.$$

By a standard property of regression analysis it follows that $\hat{r}_{i,2}$ is uncorrelated with $wt_{i,0}$, and that $\hat{r}_{i,11}$ is uncorrelated with each of $wt_{i,0}$, $wt_{i,2}$, and $\hat{r}_{i,2}$.

The set of size measures $x_{i,a_0}, x_{i,a_1}, \ldots, x_{i,a_n}$ is to be used to predict an outcome measure $y_i$ which is measured at or after the last age $a_n$. It would be possible to include each of the size measures as predictors in a regression model. However, their likely correlation leads to difficulty in interpretation of the regression coefficients. The preprocessing described earlier leads to the set of predictors $\{x_{i,a_0}, \hat{r}_{i,a_1}, \hat{r}_{i,a_2}, \ldots, \hat{r}_{i,a_n}\}$, which are all uncorrelated. In a model to predict $y_i$ each of these terms has a direct interpretation, for it represents growth in a specific interval. The regression coefficients address the question of whether there are “critical windows” when growth is associated with the outcome. In the example above a simple linear model could be

$$y_i = \phi + \gamma_0\, wt_{i,0} + \gamma_2\, \hat{r}_{i,2} + \gamma_{11}\, \hat{r}_{i,11} + \eta_i,$$

where $\phi$, $\gamma_0$, $\gamma_2$, $\gamma_{11}$ are regression coefficients, and $\eta_i$ is an error term. Note that this model produces an identical fit to the model

$$y_i = \phi' + \gamma'_0\, wt_{i,0} + \gamma'_2\, wt_{i,2} + \gamma'_{11}\, wt_{i,11} + \eta'_i,$$

because the variables $\{wt_{i,0}, \hat{r}_{i,2}, \hat{r}_{i,11}\}$ and $\{wt_{i,0}, wt_{i,2}, wt_{i,11}\}$ span the same space. The new parametrization just restructures the predictor variables to make the regression coefficients more interpretable. We show examples of such analyses in Section 5.1.
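The weight example of this section can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' code: the weights are simulated, the helper names (`conditional_residuals`, `corr`) are hypothetical, and each sequential regression is computed by orthogonalizing the centered size measure against the residuals that precede it, which gives the same residuals as ordinary least squares with an intercept. The first element returned is birth weight centered on its mean, which differs from $wt_{i,0}$ only by a constant.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def corr(u, v):
    cu, cv = center(u), center(v)
    return dot(cu, cv) / (dot(cu, cu) * dot(cv, cv)) ** 0.5

def conditional_residuals(columns):
    """Residual of each column after least-squares regression (with
    intercept) on all earlier columns; the first column is only centered."""
    basis = []
    for col in columns:
        r = center(col)
        for b in basis:  # remove the part predicted by earlier size and growth
            coef = dot(r, b) / dot(b, b)
            r = [ri - coef * bi for ri, bi in zip(r, b)]
        basis.append(r)
    return basis

# Simulated weights (kg) at ages 0, 2 and 11 years; illustrative values only.
rng = random.Random(0)
wt0 = [rng.gauss(2.9, 0.4) for _ in range(500)]
wt2 = [4.0 + 2.0 * w + rng.gauss(0, 1.0) for w in wt0]
wt11 = [5.0 + 1.0 * a + 1.5 * b + rng.gauss(0, 4.0) for a, b in zip(wt0, wt2)]

c0, r2, r11 = conditional_residuals([wt0, wt2, wt11])

# The raw sizes are correlated; the conditional measures are not.
print("corr(wt2, wt11):", round(corr(wt2, wt11), 2))
print("corr(r2, wt0):  ", round(corr(r2, wt0), 6))
print("corr(r11, wt2): ", round(corr(r11, wt2), 6))
print("corr(r11, r2):  ", round(corr(r11, r2), 6))
```

In practice a regression library would normally be used for these fits; the orthogonalization above is chosen only to keep the sketch free of external dependencies.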


3.2 Data Checking and Choices in Model Formulation

This is a three-stage modeling process. First it is necessary to check the data; then there are two stages of model formulation, first modeling the growth process and then modeling the outcome. Two stages are possible because the size and growth measures are established by the last age of measurement, which is no later than the age at which all the outcomes are available. The same characterization of growth can apply to each outcome to be considered.

(a) Data checking. It is usually sensible to do this separately for males and females, because their tempo of growth differs. Some size measures are skewed in distribution, so that a normalizing transformation may be helpful. One way of achieving this is to age- and sex-standardize the size measures. This can be done using a published reference standard or using an internal standardizing procedure, such as the LMS approach (Cole and Green, 1992), which models the median, variability, and skewness of the size measures using smooth functions of age. Cole’s method produces values that have mean around zero and standard deviation around unity, whereas using published references may lead to substantially different mean values. A further and considerable benefit of Cole’s procedure is that it helps to highlight observations in the database that are worth checking before going further. Observations that are more than, say, four standard deviations from the mean should be checked. Further internal checks are possible. For example, interpolation of the standardized values of the size measurements made immediately before and after each observation gives an estimate of a likely standardized value at that age. Comparison of the observed and estimated values may suggest other necessary checks. It is possible to interpolate the standardized values to exact ages and back-transform to obtain size estimates, albeit with reduced variance. When there are several size parameters, studying their joint distribution also helps. In the authors’ experience, such checks are likely to take more than 90% of the total analysis time.

(b) The growth models. The set of possible predictors may include not only the earlier size measures but also other pertinent variables. For example, the exact ages at which the size measurements were made could be important. It might again be sensible to run the models separately for males and females. For each of the size prediction models it is necessary to apply the standard principles of regression analysis. How many variables are needed? Do the predictors have linear associations with the final size? Are there any interactions? Are the residuals normally distributed and of equal variance? Here again it can be helpful to use the standardized data, so that the final size is symmetrically distributed. If the data are left untransformed, the residuals, which are the goals of the analysis, are measured in, say, centimeters or kilograms. These can be standardized to enable comparisons across age intervals when used in the outcome model. Standardization makes even more sense if the residuals are already derived from standardized data.

(c) The outcome model. The outcome variable may have a variety of forms: continuous, binary, ordinal, etc. Correspondingly the outcome model is multiple linear, logistic, ordinal logistic, etc. We may also study repeated outcomes. The set of predictors will include the calculated residuals, and others, such as sex, age, and socioeconomic status. Some researchers seek associations that apply in many contexts and so, where possible, pool the results of analyses across males and females. They are “lumpers.” Others do not expect the same results in different settings and so are more inclined to produce results for males and females separately. They are “splitters.” Whichever approach is taken, the principles of regression analysis, as discussed in the previous paragraph, apply equally here.
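The data-checking step in (a) can be sketched as follows. The LMS method converts a measurement $y$ to a standardized score through $z = \left[(y/M)^L - 1\right]/(L\,S)$, with $z = \ln(y/M)/S$ when $L = 0$, where $L$, $M$, and $S$ are the age- and sex-specific skewness, median, and coefficient of variation. The parameter values, the measurements, and the function names below are hypothetical and serve only to illustrate the four-standard-deviation check and the interpolation check described above.

```python
import math

def lms_z(y, L, M, S):
    """Standardized score for measurement y given LMS parameters
    for the subject's age and sex."""
    if abs(L) < 1e-12:  # limiting case L = 0
        return math.log(y / M) / S
    return ((y / M) ** L - 1.0) / (L * S)

def flag_for_checking(z_scores, limit=4.0):
    """Indices of observations more than `limit` SDs from the mean."""
    return [i for i, z in enumerate(z_scores) if abs(z) > limit]

def interpolated_z(age, age_before, z_before, age_after, z_after):
    """Linear interpolation of the neighbouring standardized values,
    giving a likely standardized value at `age`."""
    w = (age - age_before) / (age_after - age_before)
    return z_before + w * (z_after - z_before)

# Hypothetical LMS parameters for weight (kg) at one age and sex.
L, M, S = -0.5, 10.0, 0.12
weights = [10.0, 11.0, 19.0]  # the last value might be a recording error
z = [lms_z(w, L, M, S) for w in weights]
print("flagged indices:", flag_for_checking(z))

# Compare an observed score with the value interpolated from its neighbours.
observed = 2.1
expected = interpolated_z(1.0, 0.5, 0.2, 1.5, 0.6)
print("observed minus interpolated:", round(observed - expected, 2))
```

A large gap between the observed and interpolated scores does not prove an error; it only identifies a record worth checking against the source documents.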

3.3 An Extension Using Height and Weight Measures Simultaneously

Often both height and weight are measured together, allowing the calculation of body mass index (weight divided by the square of height). Height is squared in the denominator so that the index is approximately uncorrelated with height; it thus represents weight having controlled for height and is designed to represent fatness. Its effectiveness varies with age. For example, at birth some prefer to use the ponderal index, in which the cube of height is used in the denominator. An alternative is to model the weight and height directly. Adopting the notation of Section 3.1 we may use

$$ht_{i,a_{j+1}} - E\left(ht_{i,a_{j+1}} \mid H_{i,a_j}\right)$$

to measure height growth in the age interval ($a_j$ to $a_{j+1}$), and

$$wt_{i,a_{j+1}} - E\left(wt_{i,a_{j+1}} \mid H_{i,a_j},\, ht_{i,a_{j+1}}\right)$$

to measure weight growth in that interval, beyond that associated with height. This is achieved by incorporating height at the end of the interval into the prediction model. In the example of Section 3.1 the model equations are

$$
\begin{aligned}
wt_{i,0} &= \alpha_{w0} + \beta_{h0,w0}\, ht_{i,0} + rw_{i,0},\\
ht_{i,2} &= \alpha_{h2} + \beta_{h0,h2}\, ht_{i,0} + \beta_{w0,h2}\, wt_{i,0} + rh_{i,2},\\
wt_{i,2} &= \alpha_{w2} + \beta_{h0,w2}\, ht_{i,0} + \beta_{w0,w2}\, wt_{i,0} + \beta_{h2,w2}\, ht_{i,2} + rw_{i,2},\\
ht_{i,11} &= \alpha_{h11} + \beta_{h0,h11}\, ht_{i,0} + \beta_{w0,h11}\, wt_{i,0} + \beta_{h2,h11}\, ht_{i,2} + \beta_{w2,h11}\, wt_{i,2} + rh_{i,11},\\
wt_{i,11} &= \alpha_{w11} + \beta_{h0,w11}\, ht_{i,0} + \beta_{w0,w11}\, wt_{i,0} + \beta_{h2,w11}\, ht_{i,2} + \beta_{w2,w11}\, wt_{i,2} + \beta_{h11,w11}\, ht_{i,11} + rw_{i,11}.
\end{aligned}
$$


The measures of weight growth, allowing for height growth, are analogous to measures of growth in body mass index, but are calculated more directly. Adair et al. (2013) give examples of their use, studying growth and cardiometabolic outcomes in data from five low- and middle-income countries. We show an example of such an analysis in Section 5.2.
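The key design choice in this extension is the order in which the measures enter the sequence of regressions: height at each age is placed before weight at the same age, so that the weight residual is adjusted for end-of-interval height. The sketch below illustrates this ordering for birth and age 2 years only. It is a minimal illustration under stated assumptions: the data are simulated with arbitrary coefficients, the function name `sequential_residuals` is hypothetical, and the regressions are computed by orthogonalization, which is equivalent to least squares with an intercept.

```python
import random

def center(v):
    m = sum(v) / len(v)
    return [x - m for x in v]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def corr(u, v):
    cu, cv = center(u), center(v)
    return dot(cu, cv) / (dot(cu, cu) * dot(cv, cv)) ** 0.5

def sequential_residuals(named_columns):
    """Least-squares residual of each column given every column listed
    before it; the first column is only centered."""
    out, basis = {}, []
    for name, col in named_columns:
        r = center(col)
        for b in basis:
            coef = dot(r, b) / dot(b, b)
            r = [ri - coef * bi for ri, bi in zip(r, b)]
        basis.append(r)
        out[name] = r
    return out

# Simulated heights (cm) and weights (kg); coefficients are arbitrary.
rng = random.Random(1)
n = 400
ht0 = [rng.gauss(48.7, 2.1) for _ in range(n)]
wt0 = [2.9 + 0.12 * (h - 48.7) + rng.gauss(0, 0.3) for h in ht0]
ht2 = [81.0 + 0.7 * (h - 48.7) + rng.gauss(0, 3.0) for h in ht0]
wt2 = [10.4 + 0.2 * (h - 81.0) + 1.0 * (w - 2.9) + rng.gauss(0, 0.9)
       for h, w in zip(ht2, wt0)]

# Height precedes weight at each age, mirroring the model equations.
res = sequential_residuals(
    [("ht0", ht0), ("wt0", wt0), ("ht2", ht2), ("wt2", wt2)]
)

print("corr(wt2, ht2):            ", round(corr(wt2, ht2), 2))
print("corr(weight growth, ht2):  ", round(corr(res["wt2"], ht2), 6))
print("corr(weight growth, ht gr):", round(corr(res["wt2"], res["ht2"]), 6))
```

Raw weight at age 2 is correlated with height at age 2, whereas the conditional weight measure is uncorrelated with both height at age 2 and height growth in the interval.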

3.4 An Extension Using Height, Weight, and Skinfold Thickness

The weight growth in an interval beyond that predicted from the height growth represents gain in soft tissues, of which there are two components, fat and lean. To assess body fatness more directly, researchers sometimes measure the thicknesses of several skinfolds, alongside weight and height, and use their sum for analysis, denoted as “sf” below. Typical sites for skinfold thickness measurement are the biceps, the triceps, and the subscapular and suprailiac regions. An extension of the conditional model to distinguish the fat and lean components is given by

$$ht_{i,a_{j+1}} - E\left(ht_{i,a_{j+1}} \mid H_{i,a_j}\right)$$

to measure height growth in the age interval ($a_j$ to $a_{j+1}$),

$$sf_{i,a_{j+1}} - E\left(sf_{i,a_{j+1}} \mid H_{i,a_j},\, ht_{i,a_{j+1}}\right)$$

to measure the gain in fat mass in that interval, beyond that associated with height, and

$$wt_{i,a_{j+1}} - E\left(wt_{i,a_{j+1}} \mid H_{i,a_j},\, ht_{i,a_{j+1}},\, sf_{i,a_{j+1}}\right)$$

to measure the gain in lean mass, beyond that associated with height and fat mass. An example of this approach is given in Krishnaveni et al. (2015). Similar approaches may be used to handle the more direct measures of fat mass available from more advanced technology.

3.5 An Extension Using the Reversal of Time

As described so far the conditional approach is constructed prospectively. Everything known at the beginning of an interval is available to inform a model of what the size measure will be at the end of the interval. But the approach can also be defined retrospectively to address a different set of questions. Everything known about subject $i$ at age $a_j$ and subsequently is summarized as $F_{i,a_j}$, where “F” is chosen to convey the idea of the subject’s Future. At the end of an age interval, say $a_j$ to $a_{j+1}$, we use the content of $F_{i,a_{j+1}}$ to form a prediction of $x_{i,a_j}$, which we denote

$$\hat{x}_{i,a_j} = E\left(x_{i,a_j} \mid F_{i,a_{j+1}}\right),$$

the expected value of $x_{i,a_j}$ given $F_{i,a_{j+1}}$. The residual $r_{i,a_j}$, which is the difference $x_{i,a_j} - \hat{x}_{i,a_j}$, measures the amount by which the size at the beginning of the interval (age $a_j$) exceeds the prediction made from information available at and after the end of the interval (age $a_{j+1}$). This is a measure of the subject’s growth in the interval below expectation, noting that the time reversal means large values now suggest that growth is poor. In this retrospective approach we address the relevant question, “Given what I learn from a subject’s current size, has the route taken to this point had any influence on their current disease status?” We show an example of such an analysis in Section 5.3.
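A two-age sketch makes the change of direction concrete. It assumes simulated standardized sizes at ages 2 and 30 years and hypothetical helper names, so it illustrates the idea rather than the analysis reported later. The forward residual is size at age 30 given size at age 2; the backward residual is size at age 2 given size at age 30.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(u, v):
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)

def residuals(predictor, target):
    """Residuals from the least-squares line of target on predictor."""
    slope = cov(predictor, target) / cov(predictor, predictor)
    intercept = mean(target) - slope * mean(predictor)
    return [t - (intercept + slope * p) for p, t in zip(predictor, target)]

# Simulated standardized sizes; the tracking coefficient 0.5 is arbitrary.
rng = random.Random(2)
z2 = [rng.gauss(0, 1) for _ in range(300)]
z30 = [0.5 * a + rng.gauss(0, 0.87) for a in z2]

forward = residuals(z2, z30)    # age 30 size given the History at age 2
backward = residuals(z30, z2)   # age 2 size given the Future at age 30

print("cov(backward, z30):    ", round(cov(backward, z30), 6))
print("cov(forward, backward):", round(cov(forward, backward), 2))
```

The backward residual is uncorrelated with adult size by construction. When sizes track positively, the forward and backward residuals for the same interval are negatively related, which is the sense in which a large backward residual indicates poor growth over the interval.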

3.6 Selection of Suitable Age Intervals

What guides the choice of the $n + 1$ age interval endpoints $a_0, a_1, \ldots, a_n$ when there are $N + 1$ options $a'_0, a'_1, \ldots, a'_N$ from which to choose? Usually $a_0 = a'_0$ and $a_n = a'_N$ are chosen as the limits of the age range. There are two guiding principles: one statistical, one biological. In the statistical we calculate the correlations $r(j,k)$ between the measurements of size $x_{i,a'_j}$ and $x_{i,a'_k}$ across all subjects. High correlations indicate “tracking” and suggest that there is little change in the rank ordering of the subjects’ size measures in the interval $a'_j$ to $a'_k$; low correlations indicate periods of greater change. For two intervals we might choose $a_1 = a'_m$ so that $r(0,m)$ and $r(m,N)$ are as equal as possible. This spreads the variability between the two intervals. The same principle of equalizing the correlations extends to more intervals. As tracking increases with age, the intervals selected in this way usually emphasize early growth. The alternative biological principle suggests choosing intervals that correspond to the different stages of hormonal control of growth.

4 DESCRIPTIVE DATA AND TRADITIONAL ANALYSES

4.1 Descriptive Data

Table 1 shows summary statistics for height, weight, and body mass index data at each of the measurement occasions, separately for the 886 males and 640 females. Of these 1526 subjects, 1442 completed glucose tolerance tests at the adult clinic, among whom 136 out of 849 men (16.0%) and 83 out of 593 women (14.0%) had impaired glucose tolerance or diabetes.

4.2 Choice of Age Intervals

Table 2 shows the matrix of partial correlations for height across age, pooling the data for males and females. The age which most nearly equalizes the correlations r(0, age) and r(age, 30) is 6 months. Expanding this to equalize the correlations r(0, age1), r(age1, age2), and r(age2, 30) we find age1 = 3 months and age2 = 1 year to be the best choice. These are perhaps surprisingly young ages. The equivalent values for weight are 3 months and 3 years, and for body mass index are 3 months and 6 years, respectively.
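The single-cut rule can be checked directly from the height correlations in Table 2. The sketch below uses the correlations with birth and with age 30 years for the candidate ages up to 5 years, transcribed from that table; the function name `best_single_cut` is hypothetical.

```python
# Height correlations transcribed from Table 2 (pooled men and women).
ages = [0.25, 0.5, 1, 2, 3, 4, 5]                      # candidate cut points (years)
r_birth = [0.65, 0.54, 0.50, 0.43, 0.40, 0.37, 0.35]   # r(0, age)
r_adult = [0.50, 0.55, 0.61, 0.53, 0.57, 0.57, 0.58]   # r(age, 30)

def best_single_cut(ages, r_first, r_last):
    """Candidate age at which the two interval correlations are most nearly equal."""
    gaps = [abs(a - b) for a, b in zip(r_first, r_last)]
    k = min(range(len(ages)), key=gaps.__getitem__)
    return ages[k], gaps[k]

cut, gap = best_single_cut(ages, r_birth, r_adult)
print(cut)  # 0.5, that is, 6 months
```

This reproduces the 6-month choice stated above. Extending the rule to two cut points requires the full correlation matrix, because the middle correlation r(age1, age2) depends on both cuts.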

TABLE 1 Descriptive Data for Size Variables in the New Delhi Birth Cohort Study

        Men (N = 886)                                  Women (N = 638)
Age        N  Height (cm)  Weight (kg)  BMI (kg/m²)       N  Height (cm)  Weight (kg)  BMI (kg/m²)
(years)         Mean    SD   Mean    SD   Mean    SD           Mean    SD   Mean    SD   Mean    SD
0        873   48.7   2.1    2.9   0.4   12.1   1.2     629   48.2   1.9    2.8   0.4   11.9   1.2
0.25     677   59.6   2.4    5.5   0.7   15.4   1.5     517   58.1   2.3    5.0   0.7   14.7   1.6
0.5      729   65.4   2.5    7.0   0.9   16.5   1.6     536   63.7   2.4    6.4   0.9   15.6   1.6
1        704   72.0   2.8    8.5   1.1   16.3   1.5     524   70.1   2.9    7.8   1.1   15.9   1.5
2        832   81.0   3.5   10.4   1.2   15.8   1.2     603   79.6   3.7    9.8   1.3   15.4   1.3
3        865   88.7   4.0   12.2   1.4   15.4   1.1     626   86.9   4.2   11.5   1.4   15.2   1.2
4        871   95.4   4.2   13.9   1.6   15.2   1.1     630   93.9   4.4   13.3   1.6   15.0   1.1
5        863  102.0   4.5   15.5   1.8   14.9   1.1     628  100.6   4.5   14.9   1.8   14.7   1.1
6        866  108.2   4.8   17.1   2.2   14.6   1.1     627  106.7   4.7   16.3   1.9   14.3   1.0
7        862  114.0   5.1   18.7   2.6   14.4   1.2     628  112.6   5.1   18.0   2.2   14.1   1.0
8        859  119.7   5.4   20.7   3.1   14.4   1.3     625  118.1   5.4   19.9   2.7   14.2   1.2
9        830  124.7   5.7   22.8   3.7   14.6   1.4     596  123.6   5.7   22.2   3.3   14.4   1.3
10       809  129.8   5.9   25.2   4.3   14.8   1.6     592  129.2   6.1   24.7   4.0   14.8   1.5
11       830  135.0   6.4   28.0   5.1   15.2   1.7     606  134.8   6.7   27.9   5.0   15.3   1.7
12       864  140.5   7.2   31.1   6.2   15.6   2.0     623  141.3   7.1   32.2   6.3   16.0   2.1
13       875  146.5   8.1   34.9   7.4   16.1   2.2     633  147.3   6.9   37.3   7.3   17.1   2.4
14       876  153.5   8.7   39.9   8.6   16.8   2.4     634  151.0   6.1   41.5   7.6   18.1   2.7
15       854  160.1   8.4   45.6   9.5   17.6   2.6     624  153.1   5.7   44.6   7.8   18.9   2.9
16       766  164.7   7.8   50.4  10.1   18.4   2.8     574  154.1   5.6   46.6   7.7   19.6   3.0
17       518  167.2   6.8   53.9  10.1   19.1   2.9     362  155.0   5.3   48.2   8.1   20.1   3.3
18       323  169.0   6.6   56.3  10.4   19.6   3.0     228  155.3   5.5   48.4   8.2   20.1   3.2
19       305  169.8   6.6   57.6  10.3   19.9   2.9     222  155.4   5.6   49.5   9.0   20.5   3.6
20       291  169.7   6.6   58.1  10.4   20.2   3.0     209  156.0   5.5   49.6   8.9   20.5   3.6
21       252  169.6   6.7   58.8  10.5   20.6   3.0     187  156.0   5.5   49.5   8.9   20.6   3.6
30       886  169.3   6.3   73.3  14.3   25.2   4.3     638  154.9   5.5   60.6  13.6   25.0   5.1

BMI, body mass index; SD, standard deviation. Two of the original 1526 subjects did not have adult height recorded.

TABLE 2 Partial Correlations of Height According to Age (rows and columns are ages in years; the matrix is symmetric and the lower triangle is shown)

Age       0 0.25  0.5    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   30
0         1
0.25   0.65    1
0.5    0.54 0.77    1
1      0.50 0.65 0.78    1
2      0.43 0.59 0.68 0.84    1
3      0.40 0.56 0.65 0.77 0.91    1
4      0.37 0.52 0.60 0.73 0.85 0.93    1
5      0.35 0.50 0.59 0.72 0.82 0.89 0.94    1
6      0.34 0.49 0.58 0.71 0.80 0.86 0.91 0.96    1
7      0.34 0.48 0.57 0.69 0.78 0.84 0.89 0.94 0.97    1
8      0.32 0.47 0.55 0.68 0.76 0.82 0.87 0.92 0.95 0.98    1
9      0.31 0.45 0.54 0.66 0.75 0.81 0.85 0.91 0.94 0.97 0.98    1
10     0.29 0.44 0.52 0.63 0.72 0.77 0.82 0.88 0.91 0.94 0.96 0.98    1
11     0.28 0.40 0.49 0.60 0.68 0.73 0.78 0.83 0.87 0.90 0.92 0.96 0.99    1
12     0.27 0.36 0.45 0.55 0.64 0.67 0.72 0.78 0.82 0.85 0.87 0.91 0.95 0.98    1
13     0.24 0.34 0.42 0.52 0.61 0.65 0.70 0.75 0.79 0.82 0.84 0.87 0.91 0.94 0.97    1
14     0.25 0.39 0.47 0.55 0.61 0.65 0.69 0.74 0.77 0.80 0.82 0.84 0.86 0.88 0.89 0.93    1
15     0.28 0.46 0.53 0.61 0.63 0.66 0.69 0.72 0.75 0.77 0.79 0.79 0.79 0.79 0.77 0.80 0.93    1
16     0.29 0.50 0.56 0.63 0.61 0.65 0.66 0.68 0.71 0.72 0.74 0.73 0.71 0.70 0.65 0.67 0.82 0.96    1
17     0.30 0.50 0.54 0.62 0.58 0.62 0.60 0.62 0.64 0.64 0.66 0.65 0.62 0.59 0.53 0.52 0.69 0.88 0.97    1
18     0.36 0.50 0.55 0.64 0.58 0.62 0.59 0.61 0.64 0.63 0.64 0.63 0.59 0.56 0.51 0.46 0.61 0.83 0.94 0.99    1
19     0.36 0.49 0.54 0.63 0.58 0.62 0.59 0.60 0.63 0.63 0.63 0.62 0.58 0.55 0.49 0.44 0.60 0.81 0.93 0.98 0.99    1
20     0.36 0.51 0.54 0.63 0.58 0.62 0.61 0.61 0.63 0.63 0.63 0.62 0.58 0.54 0.49 0.44 0.59 0.80 0.92 0.97 0.99 0.99    1
21     0.35 0.52 0.52 0.61 0.57 0.61 0.60 0.60 0.62 0.62 0.62 0.60 0.57 0.54 0.49 0.44 0.59 0.80 0.92 0.97 0.98 0.99    1    1
30     0.29 0.50 0.55 0.61 0.53 0.57 0.57 0.58 0.60 0.60 0.61 0.58 0.54 0.50 0.42 0.40 0.55 0.75 0.87 0.94 0.98 0.98 0.98 0.98    1

Data are pooled for men and women.


To reflect the importance of the earliest measurements in establishing the rank order of size and to have biologically meaningful intervals in our further analyses we opted for birth, early infancy (0–6 months), late infancy (6 months–2 years), childhood (2–8 years), adolescence (8–15 years), and adult life (15–30 years). In Table 3 we give the correlation matrices for height, weight, and body mass index that correspond to this choice. Table 2 also shows how variations in the timing of the pubertal growth spurt lead to reductions in correlations at that stage. Such effects are even clearer in data separated for males and females.

4.3 Classical Approaches

We used Cole’s LMS method to represent each measurement of height, weight, and body mass index as an age- and sex-standardized score. Fig. 1 shows the difference in the mean value of these scores for the 219 men and women who had impaired glucose tolerance or diabetes, “disease,” less the mean value of the scores for the 1223 men and women who were normoglycemic, along with 95% confidence intervals. These are cross-sectional analyses. There is a suggestion that those with disease had lower values at birth and in infancy, most convincingly for body mass index, and that the values for weight and body mass index rose after that, so that the highest values occurred in adult life.

These observations are developed in Table 4. Pairs of size measures made at ages 2 and 30 years are included in logistic regression models used to predict disease. Height is not convincingly associated with disease. However, weight at age 2 years is negatively associated and weight at age 30 years is positively associated, both in men and women, and in much the same way. There are similar associations with body mass index, for which the 2 degree of freedom chi-square test suggests slightly better prediction, despite the odds ratios being slightly weaker. This is because there is more correlation between the weight measures at ages 2 and 30 than there is between the body mass index measures (Table 3).

These observations are further developed in Table 5. In the top half of the table the percentage of subjects with disease is shown according to body mass index measures at ages 2 and 30 years, shown in sex-specific fourths. The percentages generally fall with body mass index at age 2 years and rise with body mass index at age 30 years. There are 39 men and women in each of the two extreme cells. In the “fat/thin” cell (highest fourth at age 2 years, lowest fourth at age 30 years) one develops disease, whereas in the “thin/fat” cell there are 15 cases. The values are expressed as odds ratios relative to the thin/thin group in the lower half of the table, and the margins measure the odds ratio per fourth of body mass index. There is no convincing evidence for interaction. We include a brief discussion of the public health message of these results in Section 6.7. Bhargava et al. (2004) rely on such techniques. We now consider the conditional model, an approach designed to exploit the longitudinal nature of the size measurements.
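As a rough check on the extreme cells of Table 5 discussed above, a crude odds ratio can be computed from the cell counts alone. The sketch below is not the model behind Table 5: it ignores sex, uses a Woolf (log-scale) confidence interval, and the function name `crude_odds_ratio` is hypothetical, so its result is expected to be close to, but not identical with, the tabulated value of 4.5 (1.8–11.1).

```python
import math

def crude_odds_ratio(cases_a, n_a, cases_b, n_b, z=1.96):
    """Unadjusted odds ratio of group A versus group B with a
    Woolf (log-scale) confidence interval."""
    a, b = cases_a, n_a - cases_a
    c, d = cases_b, n_b - cases_b
    odds_ratio = (a / b) / (c / d)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Counts from Table 5: thin at age 2 and fat at age 30 (15 cases of 39)
# versus thin at both ages (11 cases of 87).
or_thin_fat, lower, upper = crude_odds_ratio(15, 39, 11, 87)
print(round(or_thin_fat, 2), round(lower, 2), round(upper, 2))
```

The crude estimate is roughly 4.3, which is consistent with the sex-adjusted figure in the table.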


TABLE 3 Partial Correlations of Height, Weight, and Body Mass Index According to Age

                        Age (years)
Age (years)         0    0.5      2      8     15     30
Height
  0                 1   0.54   0.43   0.32   0.28   0.29
  0.5            0.54      1   0.68   0.55   0.53   0.55
  2              0.43   0.68      1   0.76   0.63   0.53
  8              0.32   0.55   0.76      1   0.79   0.61
  15             0.28   0.53   0.63   0.79      1   0.75
  30             0.29   0.55   0.53   0.61   0.75      1
Weight
  0                 1   0.51   0.42   0.30   0.20   0.22
  0.5            0.51      1   0.74   0.58   0.43   0.48
  2              0.42   0.74      1   0.75   0.56   0.51
  8              0.30   0.58   0.75      1   0.79   0.61
  15             0.20   0.43   0.56   0.79      1   0.65
  30             0.22   0.48   0.51   0.61   0.65      1
Body mass index
  0                 1   0.31   0.25   0.18   0.09   0.10
  0.5            0.31      1   0.49   0.40   0.23   0.23
  2              0.25   0.49      1   0.51   0.31   0.26
  8              0.18   0.40   0.51      1   0.67   0.50
  15             0.09   0.23   0.31   0.67      1   0.65
  30             0.10   0.23   0.26   0.50   0.65      1

Data are pooled for men and women.

5 CONDITIONAL MODELS APPLIED TO THE NEW DELHI BIRTH COHORT STUDY DATA

We restrict analyses to the 1084 subjects (75%) with a complete set of size measurements available at ages 0, 6 months, 2, 8, 15, and 30 years. As in Section 3 we derive the growth measures separately for males and females, including all previous standardized size measures of the same type (height, weight, or body mass index), and use the standardized residuals as the growth measures. This leads to three sets of six mutually uncorrelated growth measures to be used as predictors of disease. There are 48 regression coefficients reported in the next three tables. Formal interaction tests show that three of these coefficients differ between men and women at the 5% level of statistical significance. This is about as many as one might expect to arise simply by chance (48 × 0.05 = 2.4). We therefore choose to ignore these differences and focus on the results pooled for men and women.

FIG. 1 Difference in mean value of the standardized score for height, weight, and body mass index at selected ages (impaired glucose tolerance and diabetes less normoglycemia) and 95% confidence intervals.

5.1 Models in Height, Weight, and Body Mass Index Separately

We use a logistic regression model with a binary variable for impaired glucose tolerance and diabetes as the outcome and include terms for sex and a block of six conditional growth measures, as set out in Section 3.1. Table 6 shows the odds ratios according to the three sets of conditional growth measures, and these are plotted in Fig. 2. The impression is similar to that given in Fig. 1. Again the greatest contrast occurs for the early and late changes in body mass index.

5.2 Models for Height and Weight Simultaneously

Table 7 and Fig. 3 show the results of the extension of this idea to the mutually adjusted height and weight measures that we define in Section 3.3. The height coefficients correspond closely to those in the independent analysis of Section 5.1, and the coefficients for weight controlled for height correspond closely to those for body mass index.

TABLE 4 Odds Ratios for Impaired Glucose Tolerance or Diabetes According to Size at Ages 2 and 30 Years

                   Age 2 Years                    Age 30 Years                   Model χ²
                   OR     95% CI      P           OR     95% CI      P           (2 df)
Height
  Males            1.04   0.79–1.37   0.8         1.03   0.70–1.51   0.9          0.2
  Females          1.21   0.85–1.73   0.3         0.57   0.33–0.97   0.04         4.4
  Pooled           1.09   0.88–1.35   0.4         0.84   0.61–1.15   0.3          1.2
Weight
  Males            0.67   0.52–0.87   0.003       2.07   1.53–2.81   <0.001      24.1
  Females          0.68   0.50–0.92   0.01        1.43   1.05–1.95   0.02         7.8
  Pooled           0.69   0.56–0.83   <0.001      1.74   1.40–2.15   <0.001      28.3
Body mass index
  Males            0.70   0.55–0.89   0.003       1.89   1.45–2.46   <0.001      26.3
  Females          0.69   0.52–0.90   0.007       1.37   1.06–1.76   0.01        11.0
  Pooled           0.70   0.58–0.84   <0.001      1.60   1.33–1.92   <0.001      34.0

The table gives results from nine logistic regression models, each of which includes size both at age 2 years and at age 30 years. The pooled models also include a binary term for sex. CI, confidence interval; OR, odds ratio per SD; P, P-value.

TABLE 5 Impaired Glucose Tolerance or Diabetes According to Body Mass Index at Ages 2 and 30 Years

Percent (Cases/Subjects). Rows are fourths of body mass index at age 2 years; columns are fourths of body mass index at age 30 years.

| Age 2 years | First | Second | Third | Fourth | Total |
|---|---|---|---|---|---|
| First | 13 (11/87) | 21 (16/77) | 19 (13/67) | 38 (15/39) | 20 (55/270) |
| Second | 10 (9/91) | 13 (8/64) | 22 (13/58) | 20 (12/59) | 15 (42/272) |
| Third | 13 (7/53) | 16 (11/67) | 8 (6/79) | 19 (14/73) | 14 (38/272) |
| Fourth | 3 (1/39) | 3 (2/64) | 6 (4/68) | 24 (24/99) | 11 (31/270) |
| Total | 10 (28/270) | 14 (37/272) | 13 (36/272) | 24 (65/270) | 15 (166/1084) |

OR (95% CI)^a. Rows are fourths of body mass index at age 2 years; columns are fourths of body mass index at age 30 years.

| Age 2 years | First | Second | Third | Fourth | Total |
|---|---|---|---|---|---|
| First | 1.0 (Baseline) | 1.8 (0.8–4.2) | 1.6 (0.7–3.9) | 4.5 (1.8–11.1) | 1.5 (1.1–2.0) |
| Second | 0.8 (0.3–1.9) | 1.0 (0.4–2.6) | 2.0 (0.8–4.9) | 1.7 (0.7–4.3) | 1.4 (1.0–1.8) |
| Third | 1.1 (0.4–3.0) | 1.4 (0.5–3.4) | 0.6 (0.2–1.6) | 1.6 (0.7–3.8) | 1.0 (0.8–1.5) |
| Fourth | 0.2 (0.0–1.4) | 0.2 (0.0–1.0) | 0.4 (0.1–1.4) | 2.2 (1.0–4.9) | 3.1 (1.8–5.5) |
| Total | 0.8 (0.5–1.2) | 0.6 (0.5–0.9) | 0.6 (0.4–0.9) | 0.8 (0.7–1.1) | 1.5 (1.3–1.7); 0.7 (0.6–0.9) |

CI, confidence interval; OR, odds ratio. Quartiles for body mass index (kg/m²) at age 2 years are: males 14.92, 15.74, 16.59; females 14.43, 15.23, 16.08. Quartiles for body mass index (kg/m²) at age 30 years are: males 22.49, 25.18, 27.64; females 21.48, 24.73, 27.99.
^a Odds ratios within the body of the table come from a logistic regression model including sex and 15 categorical variables that pick out the cells; odds ratios in the margins come from logistic regression models including sex and body mass index fourths, running from 1 to 4.
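The cell counts in Table 5 are enough to recompute the percentages and a crude (unadjusted) odds ratio for any cell against the baseline cell. The sketch below does this for subjects in the lowest fourth of body mass index at age 2 years and the highest fourth at age 30 years. The function names are illustrative, and the crude value is not expected to match the tabulated 4.5 exactly, because the published odds ratios come from a model that also includes sex.

```python
def odds(cases, subjects):
    """Odds of disease in a group: cases divided by non-cases."""
    return cases / (subjects - cases)


def crude_odds_ratio(cases, subjects, base_cases, base_subjects):
    """Unadjusted odds ratio of one cell relative to a baseline cell."""
    return odds(cases, subjects) / odds(base_cases, base_subjects)


# Counts copied from Table 5 (cases, subjects).
baseline = (11, 87)   # first fourth at age 2, first fourth at age 30
cell = (15, 39)       # first fourth at age 2, fourth fourth at age 30

percent = 100 * cell[0] / cell[1]
crude = crude_odds_ratio(cell[0], cell[1], baseline[0], baseline[1])

print(f"percent with disease: {percent:.0f}")
print(f"crude odds ratio vs baseline: {crude:.2f}")
```

The crude ratio is close to, but not the same as, the sex-adjusted value in the table, which is the behavior one would expect if sex is only a modest confounder of this contrast.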


TABLE 6 Odds Ratios From Three Logistic Regression Models With Presence of Impaired Glucose Tolerance or Diabetes as the Outcome and Conditional Growth Variables as the Predictors

| Measure | Age | Odds Ratio | 95% Confidence Interval | P-Value |
|---|---|---|---|---|
| Height | 0 | 0.89 | 0.75–1.05 | 0.2 |
| Height | 0–6 months | 0.94 | 0.79–1.11 | 0.4 |
| Height | 6 months–2 years | 1.15 | 0.98–1.36 | 0.09 |
| Height | 2–8 years | 1.05 | 0.89–1.24 | 0.6 |
| Height | 8–15 years | 0.95 | 0.81–1.13 | 0.6 |
| Height | 15–30 years | 0.87 | 0.73–1.03 | 0.1 |
| Weight | 0 | 0.87 | 0.74–1.03 | 0.1 |
| Weight | 0–6 months | 0.89 | 0.75–1.05 | 0.2 |
| Weight | 6 months–2 years | 0.99 | 0.83–1.17 | 0.9 |
| Weight | 2–8 years | 1.21 | 1.03–1.42 | 0.02 |
| Weight | 8–15 years | 1.23 | 1.04–1.45 | 0.02 |
| Weight | 15–30 years | 1.45 | 1.23–1.72 | <0.001 |
| Body mass index | 0 | 0.92 | 0.77–1.08 | 0.3 |
| Body mass index | 0–6 months | 0.89 | 0.75–1.05 | 0.2 |
| Body mass index | 6 months–2 years | 0.83 | 0.70–0.99 | 0.03 |
| Body mass index | 2–8 years | 1.15 | 0.97–1.36 | 0.1 |
| Body mass index | 8–15 years | 1.33 | 1.13–1.58 | <0.001 |
| Body mass index | 15–30 years | 1.44 | 1.22–1.72 | <0.001 |

Each conditional growth variable is in standardized units. Sex is included in each of the three models as a binary variable. Model χ² values on 6 degrees of freedom are 9.2 (height), 34.6 (weight), and 39.5 (body mass index).
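The P-values in Table 6 can be checked approximately from the published odds ratios and confidence intervals. The sketch below assumes the intervals are Wald intervals on the log-odds scale (an assumption, since the chapter does not state how they were computed) and recovers the standard error from the interval width. Because the inputs are rounded to two decimals, the results are only approximate. The function name is illustrative.

```python
import math


def wald_p_from_or_ci(odds_ratio, lower, upper, z_crit=1.959964):
    """Approximate z statistic and two-sided P-value from an OR and 95% CI.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    log_or = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)).
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


# Weight, 15-30 years: OR 1.45 (1.23-1.72), reported P < 0.001.
z_weight, p_weight = wald_p_from_or_ci(1.45, 1.23, 1.72)

# Height, 6 months-2 years: OR 1.15 (0.98-1.36), reported P = 0.09.
z_height, p_height = wald_p_from_or_ci(1.15, 0.98, 1.36)

print(f"weight 15-30 y: z = {z_weight:.2f}, P = {p_weight:.5f}")
print(f"height 6 m-2 y: z = {z_height:.2f}, P = {p_height:.3f}")
```

Both recomputed values agree with the tabulated P-values to the precision reported there.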

FIG. 2 Odds ratios (95% confidence intervals) for impaired glucose tolerance and diabetes according to conditional growth measures (standardized units). Results for men and women are combined.

TABLE 7 Odds Ratios From a Logistic Regression Model With Presence of Impaired Glucose Tolerance or Diabetes as the Outcome and Mutually Derived Conditional Growth Variables as the Predictors

| Measure | Age | Odds Ratio | 95% Confidence Interval | P-Value |
|---|---|---|---|---|
| Height | 0 | 0.87 | 0.74–1.04 | 0.1 |
| Height | 0–6 months | 0.94 | 0.79–1.12 | 0.5 |
| Height | 6 months–2 years | 1.20 | 1.02–1.42 | 0.03 |
| Height | 2–8 years | 1.10 | 0.93–1.30 | 0.3 |
| Height | 8–15 years | 0.97 | 0.82–1.15 | 0.7 |
| Height | 15–30 years | 0.93 | 0.78–1.11 | 0.4 |
| Weight, given height | 0 | 0.94 | 0.79–1.11 | 0.4 |
| Weight, given height | 0–6 months | 0.90 | 0.76–1.07 | 0.2 |
| Weight, given height | 6 months–2 years | 0.85 | 0.72–1.02 | 0.08 |
| Weight, given height | 2–8 years | 1.14 | 0.96–1.35 | 0.1 |
| Weight, given height | 8–15 years | 1.36 | 1.15–1.60 | <0.001 |
| Weight, given height | 15–30 years | 1.44 | 1.21–1.72 | <0.001 |

Each conditional growth variable is in standardized units. Sex is also included as a binary variable. The model χ² value on 12 degrees of freedom is 47.2.

FIG. 3 Odds ratios (95% confidence intervals) for impaired glucose tolerance and diabetes according to mutually adjusted conditional growth measures (standardized units). Results for men and women are combined.

5.3 Models That Reverse Time

We define the models that look back over time in Section 3.4. The results of applying this procedure are shown in Table 8 and in Fig. 4. There is little suggestion that height at any stage is importantly associated with impaired glucose tolerance and diabetes. The primary impression from the weight and body mass index analyses is that the adult value is the strongest associate. The reversed model addresses the issue of whether the path to the adult size conveys any further information. It appears that there is little to be learned before age 2 years. However, knowing growth back to age 2 years does indeed seem useful. That is brought out clearly by the traditional analyses of Tables 4 and 5.

6 STRENGTHS AND WEAKNESSES OF THE CONDITIONAL GROWTH MODEL: CONCLUSIONS

6.1 These Models Are Limited to Internal Comparisons

The models described generate measures of growth derived entirely from within one data set. They identify departures from the prevailing growth pattern for the cohort. It is possible that what really matters is the trajectory of the whole cohort, not the fine details of individual differences. To address such wider issues requires data from other contexts. This has been recognized, for example, by the COHORTS collaboration, which pools data from five birth cohort studies in low- and middle-income countries (Adair et al., 2013).

TABLE 8 Odds Ratios From Three Logistic Regression Models With Presence of Impaired Glucose Tolerance or Diabetes as the Outcome and Reverse-Calculated Conditional Growth Variables as the Predictors

| Measure | Age | Odds Ratio | 95% Confidence Interval | P-Value |
|---|---|---|---|---|
| Height | 30 years | 1.02 | 0.86–1.21 | 0.8 |
| Height | 30–15 years | 1.13 | 0.95–1.34 | 0.2 |
| Height | 15–8 years | 1.08 | 0.91–1.27 | 0.4 |
| Height | 8–2 years | 0.99 | 0.84–1.17 | 0.9 |
| Height | 2 years–6 months | 0.86 | 0.73–1.02 | 0.07 |
| Height | 6 months–0 | 0.92 | 0.78–1.08 | 0.3 |
| Weight | 30 years | 1.40 | 1.18–1.67 | <0.001 |
| Weight | 30–15 years | 0.88 | 0.75–1.04 | 0.1 |
| Weight | 15–8 years | 0.80 | 0.68–0.95 | 0.01 |
| Weight | 8–2 years | 0.80 | 0.67–0.94 | 0.007 |
| Weight | 2 years–6 months | 0.86 | 0.72–1.02 | 0.08 |
| Weight | 6 months–0 | 0.94 | 0.79–1.11 | 0.47 |
| Body mass index | 30 years | 1.46 | 1.23–1.74 | <0.001 |
| Body mass index | 30–15 years | 0.89 | 0.76–1.06 | 0.2 |
| Body mass index | 15–8 years | 0.79 | 0.67–0.93 | 0.005 |
| Body mass index | 8–2 years | 0.77 | 0.65–0.91 | 0.002 |
| Body mass index | 2 years–6 months | 0.93 | 0.78–1.10 | 0.4 |
| Body mass index | 6 months–0 | 0.96 | 0.81–1.14 | 0.7 |

Each conditional growth variable is in standardized units. Sex is included in each of the three models as a binary variable. Model χ² values on 6 degrees of freedom are 7.8 (height), 32.9 (weight), and 38.1 (body mass index). χ² values from these reversed conditionals would be identical to those from forward conditionals in models that were not pooled for males and females.

FIG. 4 Odds ratios (95% confidence intervals) for impaired glucose tolerance and diabetes according to reverse-calculated conditional growth measures (standardized units). Results for men and women are combined.

6.2 The Reversal Paradox

Tu et al. (2005) use the expression “reversal paradox” to express surprise that the regression coefficient of an outcome regressed on an early measure differs from the corresponding regression coefficient obtained when a later size measure is also included in the model, often changing sign. Let U be the early size, V the later size, and Y the outcome, all of which may be standardized without loss of generality to have mean zero and standard deviation one, and let their correlations be $r_{UV}$, $r_{UY}$, and $r_{VY}$. Then standard results are that the regression coefficient of Y on U ignoring V is $r_{UY}$, while the regression coefficient of Y on U including V in the regression model is

$$\frac{r_{UY} - r_{UV}\, r_{VY}}{1 - r_{UV}^{2}}.$$

For only moderate correlation between U and V the denominator is close to 1 and ignorable. When there is tracking, $r_{UV}$ is greater than zero. So if $r_{VY}$ is positive, it follows that the regression coefficient will be smaller when V is included in the model than when it is not, and may sometimes switch from being positive to being negative.

The two models address different questions. The first is prospective and could be posed by a pediatrician, “Given all the information that I have at this early moment in time, to what extent is this early size associated with the outcome?” The second is retrospective and could only be posed by seeing the subject as an adult, “Given the full information that I now have, including later size, to what extent is early size associated with the outcome?” This would be to attempt to extract any developmental contribution to adult disease risk. These are two different questions, so it is not a paradox that they lead to different answers. The joint regression model attempts to extract the developmental component by mimicking a study in which subjects are matched according to their adult size. It is possible that this component will vary according to adult size. This is the simplest illustration available. Structural equation models, path models, and directed acyclic graphs are used to explore more complex scenarios. The two questions also reflect the aims of the forward and reversed conditional models.
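A small numerical illustration may help to show how the sign change arises. The sketch below evaluates the two coefficients given in the text for a set of made-up correlations; the values 0.05, 0.3, and 0.3 are hypothetical and are not estimates from the Delhi data.

```python
def coef_ignoring_later_size(r_uy):
    """Coefficient of Y on U alone (all variables standardized)."""
    return r_uy


def coef_adjusting_for_later_size(r_uy, r_uv, r_vy):
    """Coefficient of Y on U when V is also in the model."""
    return (r_uy - r_uv * r_vy) / (1 - r_uv ** 2)


# Hypothetical correlations: weak positive early-size/outcome association,
# moderate tracking, and a moderate later-size/outcome association.
r_uy, r_uv, r_vy = 0.05, 0.3, 0.3

unadjusted = coef_ignoring_later_size(r_uy)
adjusted = coef_adjusting_for_later_size(r_uy, r_uv, r_vy)

print(f"ignoring V:      {unadjusted:+.3f}")
print(f"adjusting for V: {adjusted:+.3f}")
```

With these inputs the coefficient for early size is positive when later size is ignored and negative once it is included, which is the pattern that Tu et al. describe.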


6.3 Analogies With the Classical “Age, Period, Cohort” Problem

The “age, period, cohort problem” is a well-known epidemiological trap (see Clayton and Schifflers, 1987). Suppose that disease rates $r(a, p)$ are available according to age, $a$, and period, $p$, which is the time of measurement. The corresponding time of birth, $c$, is given by $p - a$ and is referred to as the cohort. To measure the relative importance of age, period, and cohort on the disease rates it seems natural to express some function, $f$, of the rates as a linear combination of functions of these three components, $u(a)$, $v(p)$, and $w(c)$. So,

$$f(r(a, p)) = u(a) + v(p) + w(c) + \text{error}.$$

But this is impossible to achieve uniquely; there is an “identifiability” problem. For example, we can define new functions of age, period, and cohort which differ by a linear shift. Thus,

$$u_\lambda(a) = u(a) + \lambda a,$$
$$v_\lambda(p) = v(p) - \lambda p,$$
$$w_\lambda(c) = w(c) + \lambda c = w(c) + \lambda (p - a).$$

It follows that

$$u_\lambda(a) + v_\lambda(p) + w_\lambda(c) = u(a) + v(p) + w(c)$$

for all values of $\lambda$. The problem is that we are trying to characterize the influence of three variables when any two of them define the third. The reality is that there are only two free, independent variables. All attempts to resolve the problem must contain a step which fixes a particular value of $\lambda$.

But this is precisely the problem that occurs when we assess the influence of two sizes and growth, measured as the difference between them, on a disease outcome. There are only two free variables. Lucas et al. (1999) first made this point in the Developmental Origins context. Nor is the problem resolved by adding more data points. Where there are n size measurements, these can generate only up to n linearly independent variables.
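The invariance under a linear shift can be confirmed numerically. In the sketch below the three component functions are arbitrary made-up examples (they are not fitted to any data), and the check shows that the predicted value is the same whatever shift is applied, so the data cannot distinguish between the shifted sets of functions.

```python
def u(a):
    """Hypothetical age effect."""
    return 0.04 * a


def v(p):
    """Hypothetical period effect."""
    return 0.01 * (p - 1950)


def w(c):
    """Hypothetical cohort effect."""
    return -0.02 * (c - 1900)


def shifted_sum(a, p, lam):
    """Sum of the shifted age, period, and cohort functions."""
    c = p - a  # cohort is determined by period and age
    u_shift = u(a) + lam * a
    v_shift = v(p) - lam * p
    w_shift = w(c) + lam * c
    return u_shift + v_shift + w_shift


reference = shifted_sum(40, 1990, 0.0)
for lam in (-1.25, 0.1, 0.5, 3.0):
    print(lam, shifted_sum(40, 1990, lam) - reference)
```

Every printed difference is zero up to floating-point rounding, whichever shift is chosen.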

6.4 Other Epidemiological Principles

The example in this chapter illustrates a technique. There are at least four further issues that need to be addressed for a full epidemiological analysis.

6.4.1 Confounding

It is possible, for example, that poor early growth and adult impaired glucose tolerance and diabetes are both a consequence of poverty, measures of which could be introduced into the regression model.

6.4.2 Survival and Selection Effects

Less than one quarter of the subjects originally enrolled in the New Delhi Birth Cohort are included in this study. It would be possible to compare those included and excluded on measures made early in life in an attempt to assess their representativeness. However, the fundamental question, which is unanswerable, is whether the association between growth and disease is the same in those who are and who are not in the study.

6.4.3 Missing Data

Even among those whose disease status is known there are losses due to incomplete growth records. Again, representativeness comparisons are possible, but are not the full answer. It may be that progress can be made through use of multiple imputation techniques (see, for example, Kenward, 2013).

6.4.4 Measurement Errors

It is necessary to consider how seriously errors in the size measurements impact the growth measures and so attenuate the associations between growth and outcome. This can be addressed algebraically in simple cases of additive random error, or by simulation in more complex situations. There may be a distinction between studies in which size measurements are recorded in routine clinical care and studies which use dedicated, trained fieldworkers.
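The simple additive case can be illustrated by simulation. The sketch below is a deliberately simplified stand-in, using a continuous outcome and ordinary least squares rather than the logistic models of this chapter, and all parameter values are hypothetical. With a true size of unit variance and independent error of standard deviation 0.5, the classical result is that the slope is multiplied by 1/(1 + 0.25) = 0.8.

```python
import random


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def simulate(n=20000, true_slope=0.5, error_sd=0.5, seed=1):
    """Return slopes estimated from the true and the error-prone size."""
    rng = random.Random(seed)
    true_size = [rng.gauss(0, 1) for _ in range(n)]
    outcome = [true_slope * x + rng.gauss(0, 1) for x in true_size]
    # Additive random measurement error, independent of everything else.
    observed_size = [x + rng.gauss(0, error_sd) for x in true_size]
    return ols_slope(true_size, outcome), ols_slope(observed_size, outcome)


slope_true, slope_observed = simulate()
print(f"slope using true size:     {slope_true:.3f}")
print(f"slope using observed size: {slope_observed:.3f}")
```

The slope based on the error-prone measurement is pulled toward zero by roughly the predicted factor. More realistic error structures, such as errors that differ between clinic and fieldworker measurements, could be explored by changing how `observed_size` is generated.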

6.5 Linear Spline Mixed Models

This is another two-stage approach. We first model a size measure and will use weight, wt, as an example in this exposition. As for the conditional models, we partition the age range into n intervals $(a_0 \text{ to } a_1), (a_1 \text{ to } a_2), \ldots, (a_{n-2} \text{ to } a_{n-1}), (a_{n-1} \text{ to } a_n)$, where the ages $a_1, a_2, \ldots, a_{n-1}$ are “knots,” a term that is used to convey how weight is modeled as a sequence of straight lines, one in each age interval, that are “tied together” at these ages to form a continuous curve, called a spline. For subject i we have $n_i$ measurements of weight, denoted $wt_{i1}, \ldots, wt_{in_i}$, measured at ages $t_{i1}, \ldots, t_{in_i}$, which fall in the interval from $a_0$ to $a_n$ inclusive. For the spline model we use the full set of observations; whereas in the conditional models we use only weights at the beginning, knots, and end. The line fitted to the weights for subject i at age t is

$$\widehat{wt}_i(t) = \sum_{k=0}^{n} (\beta_k + u_{ki})\, s_k(t),$$

where $s_0(t) = 1$ and $s_k(t) = \max(t - a_{k-1}, 0)$ for $k = 1, \ldots, n$ are spline components; where $\beta_0$ (intercept) and $\beta_1, \ldots, \beta_n$ (slopes) are parameters that define the spline for the whole population; and where $u_{0i}, u_{1i}, \ldots, u_{ni}$ are deviations that measure the differences from the corresponding $\beta$ term for subject i. The residual for subject i at age $t_{ij}$, $e_{ij}$, is thus given by

$$e_{ij} = wt_i(t_{ij}) - \widehat{wt}_i(t_{ij}).$$

The model is specified by requiring that all $e$ terms are independent random variables with mean zero and variance $\sigma^2$ and are independent of all $u_{ki}$ terms. All $u_{ki}$ terms are random variables with mean zero, and the covariance between $u_{ki}$ and $u_{li}$ is given by $c_{kl}$. Software to fit such multilevel spline models is now available in most statistical packages.

This growth model has age as the predictor variable, unlike the conditional model, which uses earlier sizes. Because the standard deviation of, say, weight increases substantially with age (over 30-fold in Table 1) leading to heteroscedasticity, and because height and weight velocity is not constant (see, for example, Tanner and Whitehouse, 1976), whereas constancy is implied by the spline model, it makes sense to fit the splines on the standardized data. This resolves both issues.

Taking this approach the fitting of the spline model is still delicate. When the age intervals are short, as in our example, there is a tendency for a large positive slope in one time interval to be followed by a large negative slope in the next, in order to remain within the bounds required for a standardized score. This leads to correlation among the predicted values and instability in the second-stage model used to predict the disease outcome. With these provisos, it is certainly worth considering these spline models, because they use all available data and do not require measurements to be made near to knot points.
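To make the spline components concrete, the sketch below evaluates them, and the fitted line for one subject, at a single age. It does not fit the mixed model. The interval boundaries follow the ages used in this chapter's tables (0, 0.5, 2, 8, 15, and 30 years), while the population coefficients and the subject's deviations are made-up numbers chosen only for illustration.

```python
def spline_components(t, boundaries):
    """Return [s_0(t), ..., s_n(t)] for interval boundaries a_0, ..., a_n.

    s_0(t) = 1 and s_k(t) = max(t - a_{k-1}, 0) for k = 1, ..., n.
    """
    return [1.0] + [max(t - a, 0.0) for a in boundaries[:-1]]


def fitted_size(t, boundaries, population_coefs, subject_deviations):
    """Fitted value: sum over k of (population coef + deviation) * s_k(t)."""
    components = spline_components(t, boundaries)
    return sum(
        (b + dev) * s
        for b, dev, s in zip(population_coefs, subject_deviations, components)
    )


boundaries = [0.0, 0.5, 2.0, 8.0, 15.0, 30.0]        # a_0, ..., a_5 (years)
population_coefs = [0.0, 0.2, -0.1, 0.0, 0.0, 0.0]   # hypothetical values
subject_deviations = [0.5, 0.1, 0.0, 0.0, 0.0, 0.0]  # hypothetical values

print(spline_components(1.0, boundaries))
print(fitted_size(1.0, boundaries, population_coefs, subject_deviations))
```

At age 1 year only the first three components are non-zero, so only the intercept and the first two spline terms contribute to the fitted value.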

6.6 The Markov Principle and Regression to the Mean

It is instructive to consider the Markov principle of stochastic processes: “Given my current state, do I learn anything about the future from knowing my past states?” The developmental origins hypothesis suggests that measurements made in the past do retain predictive usefulness, though it is possible that this usefulness is rooted in other current measures, perhaps epigenetic. The conditional growth model approach is motivated by a non-Markov viewpoint of how growth impacts disease, as it seeks to establish critical windows of risk.

Another statistical principle for which it controls is regression to the mean. The growth models explicitly include past size measurements into the modeling of future size, so that the expected size will be somewhat shrunk toward the mean. This avoids part of the problem that would arise in analyzing simple standardized differences as the growth measures.
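The shrinkage toward the mean is easiest to see with only two measurements. The sketch below treats that special case, assuming both sizes are standardized and have correlation r; the chapter's models condition on all earlier sizes, so this is a simplification, and the function names and the value r = 0.6 are illustrative. The expected later score is r times the earlier score, and the conditional measure is the rescaled departure from that expectation.

```python
import math


def expected_later_z(z_early, r):
    """Expected later z-score given the earlier one (shrunk toward zero)."""
    return r * z_early


def conditional_growth(z_early, z_late, r):
    """Standardized residual of later size given earlier size."""
    return (z_late - expected_later_z(z_early, r)) / math.sqrt(1 - r ** 2)


def simple_difference(z_early, z_late):
    """Naive growth measure: change in z-score."""
    return z_late - z_early


# A child who is 2 SD above the mean at both ages, with tracking r = 0.6.
z_early, z_late, r = 2.0, 2.0, 0.6
print(expected_later_z(z_early, r))
print(simple_difference(z_early, z_late))
print(conditional_growth(z_early, z_late, r))
```

A child who stays 2 SD above the mean shows no change in z-score, yet has grown more than expected once regression to the mean is taken into account, which is what the positive conditional value conveys.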

6.7 Public Health Relevance of the Results Reported Here

Adair et al. (2013) study early growth trajectories and adult outcomes in cohort studies from five low- and middle-income countries, including the New Delhi Birth Cohort. They suggest that early linear growth is not associated with adverse consequences for adult cardiometabolic disease, but is associated with benefits, including improved survival and enhanced cognitive function. However, weight gain in childhood, independent of linear growth, is associated with adverse cardiometabolic consequences, particularly when it occurs after the first 1000-day window (Victora et al., 2010). This is much as suggested by our analysis of impaired glucose tolerance and diabetes, as illustrated in Fig. 3. Effective interventions should reduce stunting and should restrict measures to combat underweight to the first 2 years of life.

6.8 Summary of the Conditional Growth Models

Two major weaknesses are the potential loss of sample caused by missing size measurements near the chosen critical age points, and the reliance on observations made near those ages to the exclusion of others. Thus, the study design will be critical in determining whether this is a valuable tool in any new context. The advantage is the focus that the method gives to potential critical developmental windows. It is also notable that the traditional methods of analysis shown in Section 4.3 remain useful and can convey a simple, effective summary message.

ACKNOWLEDGMENTS

The authors are grateful to Dr. S.K. Bhargava and Professor H.P. Sachdev of the New Delhi Birth Cohort Study for all their collaboration and for their willingness to allow us to use their data. We are also indebted to the study participants.

REFERENCES

Adair, L.S., Fall, C.H.D., Osmond, C., Stein, A.D., Martorell, R., Ramirez-Zea, M., Sachdev, H.S., Dahly, D.L., Bas, I., Norris, S.A., Micklesfield, L., Hallal, P., Victora, C.G., COHORTS Group, 2013. Associations of linear growth and relative weight gain during early life with adult health and human capital in countries of low and middle income: findings from five birth cohort studies. Lancet 382, 525–534.
Bhargava, S.K., Sachdev, H.S., Fall, C.H.D., Osmond, C., Lakshmy, R., Barker, D.J.P., Biswas, S.K.D., Ramji, S., Prabhakaran, D., Reddy, K.S., 2004. Relation of serial changes in childhood body mass index to impaired glucose tolerance in young adulthood. N. Engl. J. Med. 350, 865–875.
Clayton, D.G., Schifflers, E., 1987. Models for temporal variation in cancer rates. II: age-period-cohort models. Stat. Med. 6, 469–481.
Cole, T.J., Green, P.J., 1992. Smoothing reference centile curves: the LMS method and penalized likelihood. Stat. Med. 11, 1305–1319.
Gale, C.R., O’Callaghan, F.J., Bredow, M., Martyn, C.N., The ALSPAC Study Team, 2006. The influence of head growth in fetal life, infancy and childhood on intelligence at the ages of 4 and 8 years. Pediatrics 118, 1486–1492.
Keijzer-Veen, M.G., Euser, A.M., van Montfoort, N., Dekker, F.W., Vandenbroucke, J.P., van Houwelingen, H.C., 2005. A regression model with unexplained residuals was preferred in the analysis of the fetal origins of adult disease hypothesis. J. Clin. Epidemiol. 58, 1320–1324.
Kenward, M.G., 2013. The handling of missing data in clinical trials. Clin. Investig. 3, 241–250.
Krishnaveni, G.V., Veena, S.R., Srinivasan, K., Osmond, C., Fall, C.H.D., 2015. Linear growth and fat and lean tissue gain during childhood: associations with cardiometabolic and cognitive outcomes in adolescent Indian children. PLoS One 10 (11), e0143231.
Lucas, A., Fewtrell, M.S., Cole, T.J., 1999. Fetal origins of adult disease—the hypothesis revisited. Br. Med. J. 319, 245–249.
Tanner, J.M., Whitehouse, R.H., 1976. Clinical longitudinal standards for height, weight, height velocity, weight velocity, and stages of puberty. Arch. Dis. Child. 51, 170–179.
Tu, Y.-K., West, R., Ellison, G.T.H., Gilthorpe, M.S., 2005. Why evidence for the fetal origins of adult disease might be a statistical artifact: the “reversal paradox” for the relation between birth weight and blood pressure in later life. Am. J. Epidemiol. 161, 27–32.
Victora, C.G., de Onis, M., Hallal, P.C., Blossner, M., Shrimpton, R., 2010. Worldwide timing of growth faltering: revisiting implications for interventions. Pediatrics 125, e473–e480.