A multilevel time-series model for the incidence of AIDS cases in Spain


Health & Place 6 (2000) 309–317

www.elsevier.com/locate/healthplace

Jesús Rosel*, Juan C. Oliver, Pilar Jara, Antonio Caballer

Universidad 'Jaume I', Castellón, Spain

Abstract

Objective. The main aim of this research is to study the quantitative evolution of the incidence of AIDS in the 19 Spanish Communities. The hypothesis is that incidence follows a multilevel autoregressive model, where each Community shows random variability around a general process.

Method. On the basis of official data on the number of existing AIDS cases, an autoregressive multilevel time-series model was developed.

Results and Conclusions. The analysis supports the hypothesis and indicates that overall AIDS incidence in Spain has already reached a maximum and tends to remain stable or to decline in the future. Long term expected values have become stable in most Communities; a slight increase is expected only in Extremadura. However, this Community has a relatively sparse population, and its contribution to the overall Spanish incidence is small. Long term expected values are estimated to be around 152.99 new cases per million inhabitants per year. This value is slightly smaller than the maximum incidence, observed in 1994 (179.4 cases). © 2000 Elsevier Science Ltd. All rights reserved.

Keywords: AIDS cases in Spain; Development of AIDS cases; Multilevel models; Panel analysis; Time-series models; Random coefficient models

* Corresponding author. Tel: +964-72-80-00; fax: 964-72-9349. E-mail address: [email protected] (J. Rosel).

1353-8292/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved. PII: S1353-8292(00)00012-5

1. Introduction

The relatively recent occurrence of the Acquired Immune-Deficiency Syndrome (AIDS) as an endemic disease, and the resulting scarcity of data, make it difficult to describe long term epidemiological trends; in Spain, data are available only from 1983 onwards, for each of the 19 Communities. There is also a methodological difficulty in obtaining AIDS data, due to their retrospective nature. When somebody is diagnosed as being HIV positive, they say which year they were infected in, and between these two dates (acquisition and diagnosis) 2 years can easily have elapsed. For this reason, official AIDS statistics are only valid with a lag of 2 years or more (so, in the year 2000, AIDS data are reliable for 1997 or before). Nevertheless, research and statistical prediction models may be employed in the design of public health policies aimed either at improving patient assistance or at planning prevention campaigns (Brookmeyer and Gail, 1994).

The Centro Nacional de Epidemiología (National Centre for Epidemiology) in Spain (1997) has recently published data on the incidence of AIDS (number of new cases per million inhabitants) in the 19 Spanish Communities. The objective of this study is to develop a predictive model on the basis of that information (Table 1).

Our general hypothesis is that changes in AIDS incidence in Spain over the years follow an autoregressive model. Particular rates in the different Communities are considered as random fluctuations around that general trend. An autoregressive model of order p is an equation where the time variable ($Y_t$) is a function of its p previous values, according to the formula:

$$Y_t = \beta_0 + \beta_1 Y_{t-1} + \beta_2 Y_{t-2} + \dots + \beta_p Y_{t-p} + e_t, \qquad (1)$$

the terms $\beta_0, \beta_1, \beta_2, \dots, \beta_p$ being the autoregressive coefficients and $e_t$ the prediction errors (a white noise process).

This general hypothesis has two subcomponents: (a) changes in AIDS incidence in Spain follow a first-order autoregressive model that can be estimated by a pooled time series model (Judge et al., 1985; Hsiao, 1986; Dielman, 1988; Sayrs, 1989; Hamilton, 1994; Baltagi, 1995); (b) the regression coefficients (constant and slope of the autoregressive variable) constitute random effects across the different Communities (Bryk and Raudenbush, 1992; Longford, 1993; Hox, 1995; Goldstein, 1996). According to this formulation, the following linear model is proposed:

$$Y_{C,t} = (\beta_0 + \beta_{0C}) + (\beta_1 + \beta_{1C})\, Y_{C,t-1} + e_{C,t} \qquad (2)$$

where $Y_{C,t}$ represents the number, per million inhabitants, of new AIDS cases (Y) for a given Community (C) in year t; $\beta_0$ is the model constant and $\beta_1$ is the model coefficient ($\beta_0$ and $\beta_1$ have to be estimated, and they are supposed to be fixed effects common to all Communities); $Y_{C,t-1}$ is the corresponding incidence in the same Community during the previous year (t−1); and $e_{C,t}$ is the prediction error for Community C in year t. Observe that C = 1, 2, ..., 19 and t = 1983, 1984, ..., 1996. $\beta_{0C}$ is a value specific to each Community, to be added to the fixed effects constant $\beta_0$, and is distributed normally with mean 0 and variance $\sigma^2_{0C}$; $\beta_{1C}$ also has a different value for each Community, to be added to the fixed coefficient $\beta_1$, and is distributed as $\beta_{1C} \sim N(0, \sigma^2_{1C})$.

Note how Eq. (2) encompasses an ordinary regression model where $\beta_0$ and $\beta_1$ are the fixed effects, whereas $\beta_{0C}$ and $\beta_{1C}$ are the random coefficients; in the case where only $\beta_0$ and $\beta_1$ yield significant results, the statistical representation of this hypothesis would be:

$$Y_{C,t} = \beta_0 + \beta_1 Y_{C,t-1} + u_{C,t},$$

that is, a general cross-sectional first-order autoregressive equation.

Table 1. Raw data (new AIDS cases per million inhabitants) provided by the Centro Nacional de Epidemiología (1997), as they were used in the analysis

| No. | Community | 1987 | 1988 | 1989 | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Andalucía | 19.2 | 31.4 | 53.1 | 65 | 66.5 | 90 | 108.8 | 143.2 | 140.4 | 107.3 |
| 2 | Aragón | 8.4 | 29.5 | 42.2 | 59.2 | 85.5 | 59.3 | 89.7 | 132.4 | 90.5 | 62 |
| 3 | Asturias | 15.2 | 37 | 47.2 | 44.8 | 68.2 | 90 | 89.5 | 125.1 | 121 | 90.3 |
| 4 | Baleares | 51.8 | 81.6 | 115.5 | 133.8 | 146.6 | 203.6 | 216.7 | 255.6 | 280.2 | 213 |
| 5 | Canarias | 11.6 | 25.8 | 39.1 | 47.7 | 58.2 | 70 | 87.7 | 105.2 | 126.2 | 95.3 |
| 6 | Cantabria | 17.2 | 47.6 | 70.5 | 70.5 | 89.4 | 87.4 | 102.2 | 129.7 | 85.5 | 83.6 |
| 7 | Castilla-León | 15.3 | 22.7 | 29.8 | 47.1 | 54.5 | 58.3 | 78.6 | 91.4 | 84.3 | 86.6 |
| 8 | Castilla-La Mancha | 8.5 | 23.6 | 30.8 | 37.4 | 43.3 | 51.6 | 46 | 66.4 | 75.7 | 48.7 |
| 9 | Cataluña | 43.8 | 94.7 | 131.1 | 165.7 | 176.3 | 190.7 | 191.4 | 250.7 | 240.2 | 177.1 |
| 10 | Comunidad Valenciana | 20.2 | 48.2 | 59.9 | 82.8 | 100.3 | 99 | 106.1 | 137.4 | 116.8 | 94.1 |
| 11 | Extremadura | 7.5 | 27.4 | 32 | 40.4 | 31.8 | 45.5 | 56.1 | 76.2 | 43.6 | 78.2 |
| 12 | Galicia | 13.6 | 31.3 | 51.4 | 71.4 | 91.1 | 92.1 | 87.9 | 120.6 | 106.9 | 124.6 |
| 13 | Comunidad Madrid | 54.5 | 119.8 | 160.3 | 175.4 | 223.3 | 244.9 | 260 | 345.3 | 304.4 | 244.4 |
| 14 | Murcia (Región de) | 16.7 | 22.4 | 32 | 46.2 | 62.1 | 92 | 81.9 | 105.4 | 106 | 97.7 |
| 15 | Comunidad Navarra | 23.2 | 48.3 | 32.8 | 85.1 | 94.8 | 119.9 | 158.5 | 160.1 | 163.3 | 121 |
| 16 | País Vasco | 41.2 | 81.9 | 116.8 | 138.8 | 172.4 | 188.8 | 198.8 | 249.2 | 238.7 | 203.8 |
| 17 | La Rioja | 11.5 | 45.9 | 45.8 | 106.6 | 121.4 | 117 | 93.6 | 177.6 | 171 | 145.5 |
| 18 | Ceuta | 15.1 | 60.2 | 75 | 74.7a | 74.3 | 147.9 | 117.7 | 146 | 274.5 | 332.3 |
| 19 | Melilla | 38.3 | 56.9 | 18.8 | 18.5a | 18.1 | 52.7 | 84.8 | 145.8 | 152.9 | 76.5 |
| | Total | 27.3 | 57.4 | 79.8 | 98 | 114.6 | 127.8 | 137.6 | 179.4 | 167 | 135.2 |
| | Standard deviation b | 15.11 | 27.53 | 39.91 | 44.76 | 53.31 | 58.23 | 58.18 | 71.40 | 77.42 | 72.78 |

a Missing data in the original table; it was estimated by interpolation.
b The standard deviation was calculated for individual columns from the raw data.
The columns for 1983–1986 contain blank cells for several Communities; their Total and standard deviation entries are 0.4 and 0.65 (1983), 1.3 and 1.15 (1984), 4.3 and 2.96 (1985), and 12.5 and 8.66 (1986).

2. Data analysis

A graphical representation of the existing data is shown in Fig. 1; it can easily be seen that the standard deviation of AIDS incidence varies with time.

Fig. 1. AIDS incidence in Spain from 1984 to 1996. The thick line represents the overall mean, and the vertical bars indicate ±1 SD of the incidence of specific Communities for each year.

A Breusch–Pagan test was conducted, and the null hypothesis of homogeneity of variance was rejected (test statistic = 86.17, p < 0.005) (Gujarati, 1988; Judge et al., 1985); therefore, $\mathrm{Var}(Y_{C,t} \mid t) \neq \text{Constant}$. Ordinary least squares estimation of the regression coefficients would not be efficient in this case, since variances are heterogeneous (Draper and Smith, 1981). For that reason, a weighting procedure was used, by which each term of Eq. (2) is divided by the standard deviation of the corresponding year ($s_t$):

$$\frac{Y_{C,t}}{s_t} = (\beta_0 + \beta_{0C})\,\frac{1}{s_t} + (\beta_1 + \beta_{1C})\,\frac{Y_{C,t-1}}{s_t} + \frac{e_{C,t}}{s_t} \qquad (3)$$

which can be re-expressed as:

$$Y^{*}_{C,t} = (\beta^{*}_0 + \beta^{*}_{0C}) + (\beta^{*}_1 + \beta^{*}_{1C})\, Y^{*}_{C,t-1} + e^{*}_{C,t} \qquad (4)$$

where $1/s_t$ is the weighting variable. The transformed expressions from Eq. (3) to Eq. (4) are:

$$Y^{*}_{C,t} = \frac{Y_{C,t}}{s_t}, \qquad Y^{*}_{C,t-1} = \frac{Y_{C,t-1}}{s_t}, \qquad \text{and} \qquad e^{*}_{C,t} = \frac{e_{C,t}}{s_t}, \qquad (5)$$

whose transversal variance is constant and equal to 1 ($\mathrm{Var}(Y^{*}_{C,t} \mid t) = 1 = \text{Constant}$). The magnitude of the coefficients changes in Eq. (4) along with the transformation of the variables, but the process can be reversed: there is a one-to-one relationship between the coefficients corresponding to $Y_{C,t}$ and $Y^{*}_{C,t}$, and one set can be recovered from the other. Fig. 2 provides a graphical representation of the $Y^{*}_{C,t}$ values, where each original value $Y_{C,t}$ is divided by its corresponding transversal standard deviation, $s_t$.

A random coefficient regression analysis on the weighted data was then performed, using the linear model in Eq. (4) with an autoregressive covariance structure of the errors and generalized least squares estimation (Judge et al., 1985; Blundell and Windmeijer, 1997); computations were done with PROC MIXED, a procedure in the Statistical Analysis System (SAS) package (Littell et al., 1996). The resulting equation was:

$$Y^{*}_{C,t} = (1.765 + \beta^{*}_{0C}) + 0.645\,(Y^{*}_{C,t-1} - 1.712) + e^{*}_{C,t} \qquad (6)$$

The solution for point estimates and the tests of hypotheses for fixed effects are described in Table 2. Both the overall F-test and the fixed coefficient effects were statistically significant, indicating a good model fit. For a 95% confidence interval on $\beta^{*}_1$, the value of the parameter was between 0.553 and 0.737 ($\hat{\beta}^{*}_1 - z_{\alpha}\,se(\hat{\beta}^{*}_1) < \beta^{*}_1 < \hat{\beta}^{*}_1 + z_{\alpha}\,se(\hat{\beta}^{*}_1)$), in which $z_{0.05} = 1.96$ and $se(\hat{\beta}^{*}_1) = 0.047$; therefore, there is enough evidence to conclude that its value is different from 1. Difference predictor scores ($Y^{*}_{C,t-1} - \bar{Y}^{*}$) were used instead of raw predictor scores ($Y^{*}_{C,t-1}$) so that the overall intercept could be interpreted as the grand mean of AIDS incidence, with $\bar{Y}^{*} = 1.712$ (Goldstein, 1996; Littell et al., 1996; Plewis, 1997).

Only the random coefficient $\beta^{*}_{0C}$ in Eq. (6) is statistically significant; $\beta^{*}_{1C}$ is not significant. Table 3 lists the $\beta^{*}_{0C}$ estimates for each Community in Eq. (6). The model in Eq. (6) has been selected as more accurate for the present data according to the following criteria:

1. It has more explanatory power than the basic model:

$$Y^{*}_{C,t} = \beta^{*}_0 + e^{*}_{C,t} \qquad (7)$$

(model (7) has a −2 REML Log Likelihood of 638.0797, 2 parameters, and the value of −2 REML Log Likelihood for Eq. (6) is 275.4268, 4 parameters; −2 REML Log Likelihood difference = 362.6529; difference of parameters = 2; p < 0.0001).

2. The model in Eq. (6) also fits better than the simple autoregression model:

$$Y^{*}_{C,t} = \beta^{*}_0 + \beta^{*}_1\,(Y^{*}_{C,t-1} - 1.712) + e^{*}_{C,t} \qquad (8)$$

(−2 REML Log Likelihood = 285.1224, 4 parameters; difference = 9.6956, difference of parameters = 1, p = 0.0018).
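The interval for the autoregressive coefficient and the two model comparisons reported above can be re-derived from the published figures. The following Python sketch is illustrative only: it assumes that the reported p-values come from a chi-square reference distribution whose degrees of freedom equal the difference in the number of parameters, and the function names are hypothetical, not part of the original SAS analysis.

```python
import math


def wald_interval(estimate, std_error, z=1.96):
    """Symmetric interval: estimate +/- z * standard error."""
    half_width = z * std_error
    return estimate - half_width, estimate + half_width


def chi2_upper_tail(x, df):
    """Upper-tail chi-square probability; closed forms exist for df = 1 and 2."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def likelihood_ratio_test(m2ll_reduced, m2ll_full, df):
    """Difference of two -2 log-likelihood values and its chi-square p-value."""
    diff = m2ll_reduced - m2ll_full
    return diff, chi2_upper_tail(diff, df)


# Slope estimate and standard error as reported in Table 2.
low, high = wald_interval(0.645, 0.047)

# Eq. (7) versus Eq. (6), and Eq. (8) versus Eq. (6), using the reported values.
diff_basic, p_basic = likelihood_ratio_test(638.0797, 275.4268, df=2)
diff_ar, p_ar = likelihood_ratio_test(285.1224, 275.4268, df=1)

print(f"95% interval for the slope: ({low:.3f}, {high:.3f})")
print(f"Eq. (7) vs Eq. (6): difference = {diff_basic:.4f}, p = {p_basic:.3g}")
print(f"Eq. (8) vs Eq. (6): difference = {diff_ar:.4f}, p = {p_ar:.4f}")
```

Under these assumptions the interval excludes 1 and both differences match the values quoted in the text.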

Values for all weighted residuals are represented in Fig. 3(a), where vertical lines divide results for different Communities. Residuals for Andalucía, Aragón and Asturias are depicted in more detail in Fig. 3(b) so as to facilitate the observation of the year-by-year sequence. Residuals are normally distributed with mean equal to zero (W: Normal = 0.9839, p = 0.5805), the Durbin–Watson statistic was equal to 2.047, and the first lag autocorrelation function was −0.031 (p = 0.645); taken together, these results are indicative of white noise residuals (Box et al., 1994; Durbin and Watson, 1950, 1951; Greene, 1993). The procedure for calculating predicted values and residuals, both for weighted and original data, is explained in Appendix A. Appendix B details both the rationale and calculations for obtaining predicted values for each Community in 1997.

Fig. 2. Time graph of the weighted variable $Y^{*}_{C,t}$ for the different Communities. Note that the transversal variance is equal to unity for each year. The thick line represents the overall weighted incidence for AIDS in Spain.

Table 2. Results of fixed and random effect coefficients and overall statistics from the regression Eq. (6)

Solution for fixed effects

| Parameter | Estimate | Standard Error | DDF | t | Pr > \|t\| |
|---|---|---|---|---|---|
| $\beta^{*}_0$ | 1.765 | 0.081 | 18 | 21.88 | 0.0001 |
| $\beta^{*}_1$ | 0.645 | 0.047 | 201 | 13.77 | 0.0001 |

Test of random effects

| Source | NDF | DDF | Type III F | Pr > F |
|---|---|---|---|---|
| ($Y^{*}_{C,t-1}$ − 1.7119) | 1 | 201 | 189.55 | 0.0001 |

Table 3. Results of the $\beta^{*}_{0C}$ random coefficient in Eq. (6)

| Community | Estimate | SE Pred | DDF | t | Pr > \|t\| |
|---|---|---|---|---|---|
| Andalucía | −0.1217 | 0.1303 | 201 | −0.93 | 0.3516 |
| Aragón | −0.2269 | 0.1364 | 201 | −1.66 | 0.0978 |
| Asturias | −0.1356 | 0.1396 | 201 | −0.97 | 0.3325 |
| Baleares | 0.3855 | 0.1391 | 201 | 2.77 | 0.0061 |
| Canarias | −0.2092 | 0.1345 | 201 | −1.56 | 0.1213 |
| Cantabria | −0.1925 | 0.1323 | 201 | −1.45 | 0.1473 |
| Castilla-León | −0.2590 | 0.1328 | 201 | −1.95 | 0.0525 |
| Castilla-La Mancha | −0.3486 | 0.1339 | 201 | −2.60 | 0.0099 |
| Cataluña | 0.4183 | 0.1363 | 201 | 3.07 | 0.0024 |
| Comunidad Valenciana | −0.0500 | 0.1319 | 201 | −0.38 | 0.7052 |
| Extremadura | −0.2654 | 0.1442 | 201 | −1.84 | 0.0671 |
| Galicia | −0.1063 | 0.1329 | 201 | −0.80 | 0.4246 |
| Comunidad Madrid | 0.7213 | 0.1605 | 201 | 4.50 | 0.0001 |
| Murcia (Región de) | −0.2129 | 0.1341 | 201 | −1.59 | 0.1140 |
| Comunidad Navarra | −0.0168 | 0.1350 | 201 | −0.12 | 0.9012 |
| País Vasco | 0.4249 | 0.1419 | 201 | 3.00 | 0.0031 |
| La Rioja | 0.0348 | 0.1388 | 201 | 0.25 | 0.8023 |
| Ceuta | 0.3909 | 0.1437 | 201 | 2.72 | 0.0071 |
| Melilla | −0.2307 | 0.1391 | 201 | −1.66 | 0.0988 |

Fig. 3. (a) Time graph of the weighted residuals ($e^{*}_{C,t}$) corresponding to Eq. (6) for each Community and year. (b) Enlargement for Communities 1 (Andalucía), 2 (Aragón), and 3 (Asturias). Note: the names corresponding to the numbers of each Community are: 1 Andalucía, 2 Aragón, 3 Asturias, 4 Baleares, 5 Canarias, 6 Cantabria, 7 Castilla-León, 8 Castilla-La Mancha, 9 Cataluña, 10 Comunidad Valenciana, 11 Extremadura, 12 Galicia, 13 Comunidad Madrid, 14 R. Murcia, 15 Comunidad Navarra, 16 País Vasco, 17 La Rioja, 18 Ceuta, 19 Melilla.

We have obtained the AIDS data declared for 1997, which are included in Table 4. The same table shows the expected values for 1997 as obtained from Eq. (6). A comparison of the variances of the estimated weighted errors for the series shown in Eq. (6) and those for 1997 was carried out, giving F(18, 220) = 1.626, which is not significant; that is, there are no differences between the two variances, and thus Eq. (6) gives a reliable forecast for 1997. The forecast values for the years 1997–2005 and the long term expected values in the Communities and in Spain obtained from Eq. (6) are shown in Table 4, in which it can be seen that the expected values for 2005 are very similar to the long term values; it was therefore considered unnecessary to continue the forecast. Appendix C shows how to compute overall predicted values of AIDS incidence in Spain as a function of the number of existing cases in the different Communities. Appendix D shows the long term expected incidence for each Spanish Community, obtained by using the 'equilibrium point' method (Bowerman and O'Connell, 1987; Wei, 1989; Banerjee et al., 1993; Hamilton, 1994).

Table 4. Actual values for 1997a, expected values for the period 1997–2005 and long run AIDS incidence per million people for each Spanish Community

| No. | Community | 1997a | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | ... | Long run |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Andalucía | 96.7 | 108.4 | 109.2 | 109.7 | 110.0 | 110.2 | 110.3 | 110.4 | 110.4 | 110.5 | ... | 110.6 |
| 2 | Aragón | 79.8 | 71.6 | 77.7 | 81.7 | 84.3 | 85.9 | 87.0 | 87.7 | 88.2 | 88.4 | ... | 89.0 |
| 3 | Asturias | 54.4 | 96.5 | 100.4 | 103.0 | 104.7 | 105.7 | 106.4 | 106.9 | 107.2 | 107.4 | ... | 107.7 |
| 4 | Baleares | 189.6 | 213.6 | 213.9 | 214.2 | 214.3 | 214.4 | 214.5 | 214.5 | 214.5 | 214.6 | ... | 214.6 |
| 5 | Canarias | 78.0 | 94.3 | 93.7 | 93.3 | 93.1 | 92.9 | 92.8 | 92.7 | 92.7 | 92.6 | ... | 92.6 |
| 6 | Cantabria | 81.7 | 88.0 | 90.8 | 92.7 | 93.9 | 94.6 | 95.1 | 95.4 | 95.6 | 95.8 | ... | 96.0 |
| 7 | Castilla-León | 69.5 | 143.5 | 121.8 | 107.8 | 98.8 | 93.0 | 89.2 | 86.8 | 85.2 | 84.2 | ... | 82.4 |
| 8 | Castilla-La Mancha | 46.4 | 237.1 | 175.7 | 136.1 | 110.5 | 94.0 | 83.4 | 76.5 | 72.1 | 69.2 | ... | 64.0 |
| 9 | Cataluña | 147.0 | 156.6 | 179.6 | 194.4 | 203.9 | 210.1 | 214.1 | 216.6 | 218.3 | 219.4 | ... | 221.3 |
| 10 | Comunidad Valenciana | 97.9 | 100.3 | 109.2 | 114.9 | 118.6 | 120.9 | 122.5 | 123.4 | 124.1 | 124.5 | ... | 125.3 |
| 11 | Extremadura | 75.1 | 186.4 | 149.1 | 124.9 | 109.4 | 99.3 | 92.8 | 88.7 | 86.0 | 84.2 | ... | 81.1 |
| 12 | Galicia | 93.7 | 71.8 | 86.6 | 96.2 | 102.4 | 106.4 | 109.0 | 110.7 | 111.7 | 112.4 | ... | 113.7 |
| 13 | Comunidad Madrid | 202.3 | 161.3 | 204.6 | 232.6 | 250.6 | 262.3 | 269.8 | 274.7 | 277.8 | 279.8 | ... | 283.5 |
| 14 | R. Murcia | 80.7 | 83.0 | 86.2 | 88.2 | 89.5 | 90.3 | 90.8 | 91.2 | 91.4 | 91.6 | ... | 91.8 |
| 15 | Comunidad Navarra | 81.5 | 127.2 | 129.0 | 130.1 | 130.8 | 131.2 | 131.5 | 131.7 | 131.8 | 131.9 | ... | 132.1 |
| 16 | País Vasco | 146.0 | 172.9 | 190.5 | 201.9 | 209.3 | 214.0 | 217.1 | 219.1 | 220.3 | 221.2 | ... | 222.7 |
| 17 | La Rioja | 92.3 | 100.0 | 115.1 | 124.9 | 131.2 | 135.2 | 137.9 | 139.6 | 140.7 | 141.4 | ... | 142.7 |
| 18 | Ceuta | 291.9 | 208.0 | 210.7 | 212.5 | 213.6 | 214.4 | 214.8 | 215.1 | 215.3 | 215.5 | ... | 215.7 |
| 19 | Melilla | 111.6 | 94.3 | 92.1 | 90.7 | 89.8 | 89.2 | 88.9 | 88.6 | 88.5 | 88.4 | ... | 88.2 |
| | Total | 114.8 | 141.8 | 145.8 | 148.3 | 150.0 | 151.0 | 151.7 | 152.2 | 152.5 | 152.6 | ... | 153.0 |

a The data for this column were obtained from the website www.isciii.es (Instituto de Salud Carlos III, Centro Nacional de Epidemiología), updated 30 September 1999; all other columns show expected values.

3. Conclusions and discussion

Results give support to our initial hypothesis of a common autoregressive model for AIDS incidence in Spain with random coefficient variability across the different Communities. The resulting mixed model equation (which contains both fixed and random effects) allows for the estimation of AIDS incidence in the future.

In spite of the recent appearance of AIDS as an individual and social disease and its fast early spread in Spain, it seems to have already reached a maximum value. It is expected that incidence will become more stable, and even decrease, in coming years. This interpretation is based on two pieces of evidence: (a) AIDS incidence in Spain has been steadily decreasing since 1994, with a maximum transversal standard deviation in 1995; (b) there is enough statistical evidence to conclude that the $\beta^{*}_1$ coefficient, 0.6452, in Eq. (6), is less than 1. Had the confidence interval of the $\beta^{*}_1$ coefficient included the value of 1, the Eq. (6) model would have been 'explosive' in mean and in variance and would therefore be of no statistical value. There are also 'long term equilibrium point' values (see Appendix D) which are consistent with the AIDS values observed up to the present. These results lend statistical support to the conclusion that AIDS incidence will vary without any large fluctuations in coming years.

The expected values for 1997 are shown in Table 4, and an incidence of 141.79 people per million inhabitants affected by AIDS is foreseen for the whole of Spain. An estimated incidence of approximately 152.99 cases per million inhabitants per year is expected in Spain in the future, according to the results in Appendix D. The maximum number was reached in 1994 (with 179.4 cases per million inhabitants). It is expected that the process will become more stable and slowly reach the first figure (152.99).
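The way the forecasts settle toward a long term value when the slope is below 1 can be illustrated by iterating Eq. (6) for a single Community. The Python sketch below uses the Andalucía figures quoted in this paper (1996 incidence 107.3, random intercept −0.1217, slope 0.6452, centring constant 1.7119). It assumes that the 1996 standard deviation (72.78) is kept as the weighting factor for every future year, and the function names are hypothetical.

```python
def iterate_forecast(y_weighted, intercept, slope, centre, steps):
    """Apply the first-order recursion repeatedly to a weighted value."""
    path = []
    for _ in range(steps):
        y_weighted = intercept + slope * (y_weighted - centre)
        path.append(y_weighted)
    return path


def equilibrium(intercept, slope, centre):
    """Fixed point of the recursion; it exists only when slope differs from 1."""
    return (intercept - slope * centre) / (1.0 - slope)


# Values quoted in the paper for Andalucía (Community 1).
B0, B0C, B1, CENTRE = 1.765, -0.1217, 0.6452, 1.7119
S_1996 = 72.78          # assumption: reused as the weight for all future years
Y_1996 = 107.3

intercept = B0 + B0C
weighted_path = iterate_forecast(Y_1996 / S_1996, intercept, B1, CENTRE, steps=50)
raw_path = [y * S_1996 for y in weighted_path]
long_run = equilibrium(intercept, B1, CENTRE) * S_1996

for year, value in zip(range(1997, 2002), raw_path):
    print(year, round(value, 1))
print("long run:", round(long_run, 1))
```

Because the gap to the equilibrium shrinks by the factor 0.6452 at every step, the forecasts approach the long term value within a few years; with a slope of 1 or more no such convergence would occur.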
According to the long term expected values for the individual Spanish Communities (see Appendix D and Table 4), Madrid (283.46), País Vasco (222.67) and Cataluña (221.32) will probably have the highest rates, while Castilla-León (82.37), Castilla-La Mancha (64.00) and Extremadura (81.07) are expected to have the lowest incidences. In all of the Spanish Communities, except Extremadura, the long term expected value for AIDS is lower than the maximum incidence of AIDS observed in each of the respective Communities within the period 1993–96. In the case of Extremadura, the maximum value detected up to now has been 78.2 in 1996, while a long term incidence of 81.07 per million inhabitants is to be expected. Fortunately, Extremadura is a Community with a low AIDS incidence and a relatively low population (1,070,244 inhabitants).

The reliability of the model can be seen in the fact that the variance of the errors for the estimated 1997 values is not significantly different from that of the errors for the whole series calculated by Eq. (6), although the data for 1997 were made available in June 1999 (which means that they are still not totally trustworthy, and they tend to increase as new cases are reported retrospectively). Because our forecast for 1997 from Eq. (6) is higher than the declared figure, and the declared values will increase over the coming years as late reports arrive, the estimation errors for 1997 (and their variance) obtained from Eq. (6) will tend to decrease.

The above model of epidemiological changes in the number of AIDS cases in the different Spanish Communities could obviously be improved. One way of fine-tuning the prediction process would be to incorporate new data into the forecasting process on a year-by-year basis. Control and feedback procedures on incidence could be tailored to individual Communities; in that way, one could adjust intervention efforts and prevention campaigns to the specific characteristics of Communities in an adaptive manner.

The previous calculations have been performed on the assumption that AIDS propagation across Communities will follow a process similar to the one observed up to the present time; that is, that the statistical model will be invariant. However, one cannot rule out new and unforeseen aggravating circumstances such as virus mutations, new transmission processes or new subgroups with a larger infection risk, or even new events that may improve conditions, such as effective prevention policies, new drugs or vaccines.

It could be interesting to test whether countries or communities that are similar to Spain in social and economic characteristics show a similar model in the evolution of the incidence of AIDS. In any case, there should be an increase in research efforts and in prevention and information policies, especially in Communities where the expected incidence tends to increase.
Acknowledgements

Sincere thanks to Francisco Herrero Machancoses for informing the authors of the availability of the data and to Jesús Castilla (of the Instituto de Salud Carlos III) for his explanations about the nature of the AIDS data. This work has been carried out with partial funding by the Fondo de Investigaciones Sanitarias (Proyecto 97/2121), Ministerio de Sanidad y Consumo, Spain.

Appendix A. Expected values for observations in the 1984–1996 time period, along with their corresponding errors

Let us take, as an example, the direct value (see Table 1) for 1988 of the 10th Community (Comunidad Valenciana): it is $Y_{C,t} = 48.2$ ($s_t = 27.53$), and the corresponding value for the previous year (1987) is $Y_{C,t-1} = 20.2$ ($s_{t-1} = 15.11$). The weighted values are therefore:

$$Y^{*}_{C,t} = \frac{Y_{C,t}}{s_t} = \frac{48.2}{27.53} = 1.751$$

$$Y^{*}_{C,t-1} = \frac{Y_{C,t-1}}{s_{t-1}} = \frac{20.2}{15.11} = 1.337$$

It can be seen in Table 3 that the $\beta^{*}_{0C}$ value for the Comunidad Valenciana is −0.050. Substituting this value in Eq. (6):

$$Y^{*}_{C,t} = (1.765 + \beta^{*}_{0C}) + 0.645\,(Y^{*}_{C,t-1} - 1.712) + e^{*}_{C,t}$$

$$1.751 = (1.765 - 0.050) + 0.645\,(1.337 - 1.712) + e^{*}_{C,t} = 1.706 + 0.645\,(-0.375) + e^{*}_{C,t} = 1.706 - 0.242 + e^{*}_{C,t} = 1.464 + e^{*}_{C,t}$$

The weighted predicted value is therefore $\hat{Y}^{*}_{C,t} = 1.464$, and the weighted prediction error is $e^{*}_{C,t} = Y^{*}_{C,t} - \hat{Y}^{*}_{C,t} = 1.751 - 1.464 = 0.287$. The raw prediction error and predicted value will be $e_{C,t} = e^{*}_{C,t} \times s_t = 0.287 \times 27.53 = 7.901$ and $\hat{Y}_{C,t} = \hat{Y}^{*}_{C,t} \times s_t = 1.464 \times 27.53 = 40.304$, respectively. Note that $Y_{C,t} = \hat{Y}_{C,t} + e_{C,t} = 40.304 + 7.901 = 48.205 \approx 48.20$, which is the original data point.

Appendix B. Predicted values for 1997

Let us suppose that it is of interest to predict the AIDS incidence in Community 1 (Andalucía) in 1997; in Table 1, the 1996 value is $Y_{C,t-1} = 107.3$ ($s_{t-1} = 72.78$), and its corresponding weighted value is

$$Y^{*}_{C,t-1} = \frac{Y_{C,t-1}}{s_{t-1}} = \frac{107.3}{72.78} = 1.474$$

By substituting in Eq. (6), the weighted predicted value for 1997 can be calculated:

$$Y^{*}_{C,t} = (1.765 + \beta^{*}_{0C}) + 0.645\,(Y^{*}_{C,t-1} - 1.712) + e^{*}_{C,t}$$

$$\hat{Y}^{*}_{C,t} = (1.765 - 0.122) + 0.645\,(1.474 - 1.712) = 1.643 + 0.645\,(-0.238) = 1.643 - 0.154 = 1.489$$

In order to calculate the raw predicted value, $\hat{Y}_{C,t} = \hat{Y}^{*}_{C,t} \times s_t$, $s_{1997}$ has to be estimated. If we assume that, the model being autoregressive, the standard deviation for 1997 will be similar to that of 1996, the value for the standard deviation of 1997 will be 72.78 (which is the standard deviation for 1996). The predicted value for 1997 will accordingly be $\hat{Y}_{C,t} = \hat{Y}^{*}_{C,t} \times s_t = 1.489 \times 72.78 = 108.45$ per million. Predictions for each Spanish Community in 1997 are shown in Table 4.
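The steps of Appendices A and B (weight the lagged value, apply Eq. (6), return to the raw scale, and split an observation into prediction plus error) can be sketched in Python for the Andalucía example. The coefficient values are the rounded ones used in the appendix, the declared 1997 value (96.7) is taken from Table 4, and the 1996 standard deviation is reused for 1997 as assumed above; the function names are hypothetical.

```python
def to_weighted(raw_value, yearly_sd):
    """Divide a raw incidence by the cross-sectional SD of its year."""
    return raw_value / yearly_sd


def predict_weighted(lag_weighted, b0c, b0=1.765, b1=0.645, centre=1.712):
    """One-step prediction from Eq. (6) on the weighted scale."""
    return (b0 + b0c) + b1 * (lag_weighted - centre)


S_1996 = 72.78           # also used for 1997, following Appendix B
lag_weighted = to_weighted(107.3, S_1996)          # 1996 incidence, weighted
pred_weighted = predict_weighted(lag_weighted, b0c=-0.122)
pred_raw = pred_weighted * S_1996                  # back to cases per million

observed_1997 = 96.7                               # declared value, Table 4
error_raw = observed_1997 - pred_raw
error_weighted = error_raw / S_1996

print(f"weighted prediction: {pred_weighted:.3f}")
print(f"raw prediction:      {pred_raw:.2f}")
print(f"raw error:           {error_raw:.2f}")
print(f"weighted error:      {error_weighted:.3f}")
```

The raw prediction plus the raw error always returns the observed value, which is the identity checked at the end of Appendix A.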

Appendix C. Expected value for AIDS incidence in Spain in 1997

The last row of Table 1 (overall incidence of AIDS in Spain) has not been used in the development of the model. From the column of expected values in Table 4, the average expected value of AIDS incidence in Spain can be calculated; to do that, the expected values for each Community are multiplied by their respective population sizes, and the overall quantity is then divided by the population size of the whole country:

$$\text{Overall incidence in Spain} = \frac{\hat{Y}_{C1,1997} \times n_{C1} + \hat{Y}_{C2,1997} \times n_{C2} + \dots + \hat{Y}_{C19,1997} \times n_{C19}}{n_{Sp}}$$

where $\hat{Y}_{C1,1997}, \hat{Y}_{C2,1997}, \dots, \hat{Y}_{C19,1997}$ are the expected AIDS incidence values in 1997 from Community 1 (Andalucía) to 19 (Melilla), and $n_{C1}, n_{C2}, \dots, n_{C19}$ are their respective populations; $n_{Sp} = n_{C1} + n_{C2} + \dots + n_{C19}$ is therefore the overall population obtained from the Official Census reports. Thus:

$$\text{Overall AIDS incidence in Spain} = \frac{108.45 \times 7216649 + 71.56 \times 1187546 + \dots + 80.64 \times 59576}{39652742} = 141.79$$

that is, the overall expected value for AIDS incidence in Spain for 1997 is 141.79 cases per million inhabitants.
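The population-weighted average of Appendix C can be written as a short Python function. Only three of the nineteen terms are printed in the appendix (Andalucía, Aragón and Melilla), so the example below applies the function to that subset purely as an illustration; its result is not the national figure, and the function name is hypothetical.

```python
def population_weighted_incidence(incidences, populations):
    """Average of per-million incidences, weighted by population size."""
    if len(incidences) != len(populations):
        raise ValueError("one population is needed for each incidence")
    total_population = sum(populations)
    weighted_sum = sum(y * n for y, n in zip(incidences, populations))
    return weighted_sum / total_population


# The three terms printed in Appendix C: Andalucía, Aragón, Melilla.
expected_1997 = [108.45, 71.56, 80.64]
population = [7216649, 1187546, 59576]

subset_incidence = population_weighted_incidence(expected_1997, population)
print(f"weighted incidence for the three Communities: {subset_incidence:.2f}")
```

Passing all nineteen expected values and populations in the same way would give the overall figure described in the appendix.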

Appendix D. Overall and community-specific long term expected values of the number of AIDS cases in Spain

It would be desirable to know the long term expected value of AIDS incidence for the different Spanish Communities; to do that, it is possible to calculate the 'equilibrium point' (Huckfeldt et al., 1982) for each Community by means of Eq. (6), taking into account the property of the 'equilibrium point':

$$\dots = Y^{*}_{C,t-1} = Y^{*}_{C,t} = Y^{*}_{C,t+1} = \dots = Y^{*}_{EP}$$

$$\dots = e^{*}_{C,t-1} = e^{*}_{C,t} = e^{*}_{C,t+1} = \dots = 0$$

i.e., on the basis of the previous relationships, Eq. (6) becomes an equilibrium point in the long run:

$$Y^{*}_{EP} = (1.765 + \beta^{*}_{0C}) + \beta^{*}_1\,(Y^{*}_{EP} - 1.712)$$

Solving for $Y^{*}_{EP}$ in the case of each Community:

$$Y^{*}_{EP} = (1.765 + \beta^{*}_{0C}) + 0.645\,(Y^{*}_{EP} - 1.712)$$

$$Y^{*}_{EP} = \frac{(1.765 + \beta^{*}_{0C}) - 1.105}{0.355}$$

As an example, in Andalucía (Community 1),

$$Y^{*}_{EP} = \frac{(1.765 - 0.122) - 1.105}{0.355} = 1.519$$

Therefore the unweighted incidence value will be:

$$\hat{Y}_{EP} = \hat{Y}^{*}_{EP} \times s_t = 1.519 \times 72.78 = 110.55$$

which indicates that the long term expected AIDS incidence in Andalucía will be 110.55 cases per million. Table 4 shows the long term expected values for the rest of the Communities, obtained using the same procedure.

The overall long term expected value for AIDS incidence in Spain is calculated in a second stage on the basis of the specific Community values, using the procedure described in Appendix C:

$$\text{Long term overall incidence in Spain} = \frac{\hat{Y}_{EP,C1} \times n_{C1} + \hat{Y}_{EP,C2} \times n_{C2} + \dots + \hat{Y}_{EP,C19} \times n_{C19}}{n_{Sp}} = \frac{110.55 \times 7216649 + 88.96 \times 1187546 + \dots + 88.18 \times 59576}{39652742} = 152.99$$

Thus, it is expected that there will be approximately 152.99 cases per million inhabitants per year in Spain in the long term.

References

Baltagi, B., 1995. Econometric analysis of panel data. Wiley, New York.
Banerjee, A., Dolado, J.J., Galbraith, J.W., Hendry, D.F., 1993. Co-integration, error correction, and the econometric analysis of non-stationary data. Oxford University Press, Oxford.
Blundell, R., Windmeijer, F., 1997. Correlated cluster effects and simultaneity in multilevel models. Health Economics 1, 6–13.
Bowerman, B.L., O'Connell, R.T., 1987. Time series forecasting. Unified concepts and computer implementation. Duxbury Press, Boston, MA.
Box, G.E.P., Jenkins, G.M., Reinsel, G.C., 1994. Time series analysis: Forecasting and control (Rev. ed.). Holden-Day, San Francisco, CA.
Brookmeyer, R., Gail, M.H., 1994. AIDS epidemiology: A quantitative approach. Oxford University Press, Oxford.
Bryk, A.S., Raudenbush, S.W., 1992. Hierarchical linear models. Sage, Newbury Park, CA.
Centro Nacional de Epidemiología, 1997. Información epidemiológica. Vigilancia epidemiológica del SIDA en España a fecha de actualización 31 de marzo de 1997. Publicación Oficial de la Sociedad Española Interdisciplinaria del S.I.D.A. 8, 550–557.
Dielman, T.E., 1988. Pooled cross-sectional and time series data analysis. Dekker, New York.
Draper, N.R., Smith, H., 1981. Applied regression analysis. Wiley, New York.
Durbin, J., Watson, G., 1950. Testing for serial correlation in least squares regression. I. Biometrika 37, 409–428.
Durbin, J., Watson, G., 1951. Testing for serial correlation in least squares regression. II. Biometrika 38, 159–178.
Goldstein, H., 1996. Multilevel statistical models. Arnold, London.
Greene, W.H., 1993. Econometric analysis. Prentice Hall, Englewood Cliffs, NJ.
Gujarati, D.N., 1988. Basic econometrics (2nd ed.). McGraw-Hill, New York.
Hamilton, J.D., 1994. Time series analysis. Princeton University Press, Princeton, NJ.
Hox, J.J., 1995. Applied multilevel analysis. TTP, Amsterdam.
Hsiao, C., 1986. Analysis of panel data. Cambridge University Press, Cambridge.
Huckfeldt, R.R., Kohfeld, C.W., Likens, T.W., 1982. Dynamic modelling. Sage, Newbury Park, CA.
Judge, G.G., Griffiths, W.E., Hill, R.C., Lütkepohl, H., Lee, T.C., 1985. The theory and practice of econometrics. Wiley, New York.
Littell, R.C., Milliken, G.A., Stroup, W.W., Wolfinger, R.D., 1996. SAS system for mixed models. SAS Institute, Cary, NC.
Longford, N.T., 1993. Random coefficient models. Clarendon Press, Oxford.
Plewis, I., 1997. Statistics in Education. Arnold, London.
Sayrs, L.W., 1989. Pooled time series analysis. Sage, Newbury Park, CA.
Wei, W.W.S., 1989. Time series analysis. Univariate and multivariate methods. Addison-Wesley, Redwood City, CA.