Regression analysis with nested effects in epidemiological studies: Assessment of a method eliminating one level of clustering


Preventive Veterinary Medicine 25 (1996) 315-325

F. Beaudeau a,b,*, C. Fourichon a, K. Frankena b

a Unit of Animal Health Management, INRA-Veterinary School, C.P. 3013, 44087 Nantes Cedex 03, France
b Department of Animal Husbandry, Division of Animal Health and Epidemiology, Agricultural University, P.O. Box 338, 6700 AH Wageningen, Netherlands

Accepted 26 April 1995

The analysis of binary data originating from clustered observations is often a problem in epidemiological studies. In most epidemiological studies on cattle, clustering occurs at two levels: cows are clustered within herds and lactation records are clustered within cows. This paper proposes and assesses a method of eliminating one level of clustering in regression analysis. Data from an observational study were used to assess the relationships between health disorders and culling, by means of logistic regression analysis, in two different ways. In the Random One Lactation (ROL) analysis, ten random selections of one lactation record per cow were made. Logistic regression analysis was performed in each sample separately. Results showed qualitative and quantitative variations between the ten final models. From the ten random samples, a disease was considered significantly associated with culling if it was present in at least four final models. Using a single random sample might give incomplete information. In the All Lactations (AL) analysis, all the available lactation records were included in one model, despite the dependency between lactations. The two methods generally agreed. All diseases significantly related to culling in the AL analysis, except one, were also significantly associated with culling in the ROL analysis. However, differences in the regression coefficients between the ROL and the AL analyses showed the importance of dispersion parameters.

Keywords: Cluster effect; Logistic regression; Culling; Disease

*Corresponding author.
0167-5877/96/$15.00 © 1996 Elsevier Science B.V. All rights reserved
SSDI 0167-5877(95)00503-X


1. Introduction

Multiple logistic regression is an appropriate method of investigating risk factors associated with dichotomous outcomes (Breslow and Day, 1980). The use of ordinary logistic regression requires the assumption that the individual responses are independent. However, in most field surveys, the unit of sampling is a group of animals (herd, litter), so ordinary logistic regression is often not appropriate: distributions of responses vary between groups (Curtis et al., 1993).

The productive lives of most large animals have biological cycles. This is especially true of lactations in cattle. Thus, in most epidemiological surveys involving cows, data are recorded on a lactational basis and the unit of observation considered in the analyses is often the lactation, sometimes as an approximation of the cow. Lactation records within a cow are then not independent. When analysing data at the lactational level, an investigator should control for both cow and herd effects.

For instance, in some studies investigating health disorders as risk factors for culling, the authors focused on the role of health disorders on culling in the same lactation. Strictly speaking, the analyses were performed at the lactational level. Herd effect was accounted for by Martin et al. (1982), Dohoo and Martin (1984) and Erb et al. (1985). In some studies, only one lactation per cow was available (Martin et al., 1982; Oltenacu et al., 1984; Gröhn et al., 1986). In such cases, the lactational level was in fact the cow level. In other studies, all the available lactations were considered (Dohoo and Martin, 1984; Erb et al., 1985; Bendixen, 1988; Bendixen and Astrand, 1989), but the cow effect was not accounted for. The same was observed when considering disease history of cows using two consecutive lactations as unit of analysis (Cobo-Abreu et al., 1979). Possible bias induced by dependency between observations in computed solutions was not evaluated.

For categorical responses, several methods are available to account for the group effects (McDermott et al., 1994). One way is to model them as fixed effects. Mauritsen (1984), cited by Curtis et al. (1993), has put forward the disadvantages of this method. Modelling herd or cow effects as fixed effects is inappropriate, since herds and cows are usually a random sample from a population of herds and cows and should be considered accordingly in the analysis. Alternatives to the fixed effects model include conditional logistic regression, random-effects logistic regression (beta-binomial, logistic-normal and logistic-binomial regression) and generalized estimating equations (GEE) (McDermott et al., 1994). Random-effects logistic-binomial regression was deemed one of the most appropriate methods to model dichotomous outcomes when analysis is done at the individual level but animals are grouped (Curtis et al., 1993). The problem is that, with many clusters of relatively small size (e.g. cows), parameter estimates from these models may become unstable when some groups have no event (McDermott et al., 1994). Moreover, the huge computing time needed owing to the large number of random clusters of small size remains a major limit for practical use. In addition, no software was available to most researchers to account for random effects at two levels (herd and cow). Therefore, there is a lack of practical options for considering the cow effect in epidemiological studies.

A strategy to overcome this methodological constraint may consist of effectively eliminating the need to account for the cow level, by randomly selecting one record per cluster. However, random selection induces variability due to sampling error (Martin et al., 1987)


and therefore, the outcomes of the analyses might vary from one sample to another. This variation might be qualitative, meaning that some parameters do not appear to be significant in all final models, and/or quantitative, meaning that the point estimates of significant parameters vary between analyses.

The objectives of the present study were (1) to propose a method based on repeated random sampling of one unit per cluster to avoid clustering, (2) to evaluate the variability in the resulting outputs, and (3) to compare the computed solutions with those obtained when the assumption of independence between observations is ignored. We base our assessments on data used to investigate health disorders as risk factors for culling in dairy herds.

2. Materials and methods

2.1. Data

The study population, data collection, storage and validation have been described previously (Beaudeau et al., 1994). Data were recorded in 47 commercial Holstein herds, situated in western France, enrolled in a milk recording program, with cows served by artificial insemination. Data included all lactations started between 1 February 1986 and 31 January 1990 and ended (by a culling or a new calving) before 1 July 1990. A total of 16 disease conditions, reproductive performance and milk production parameters were investigated as possible risk factors for late culling. Definitions are shown in the Appendix. A prefix 'P' was attached to variables applying to the previous lactation in all models. All cullings occurring after 45 days post-partum (except cows culled with only one milk record within 30 days post-partum) were considered as late cullings (Beaudeau et al., 1994).

2.2. Modelling

In order to investigate diseases as risk factors for late culling, the late culling group was compared to a group including cows which were not culled. The model-building process used has been described previously (Beaudeau et al., 1994). It involved three stages. Stages 1 and 2 were carried out using the LOGISTIC procedure of the Statistical Analysis Systems Institute Inc. (SAS Institute Inc., 1989), according to the method described by Hosmer and Lemeshow (1989). A univariable analysis was performed to relate culling to each possible health disorder occurring in both previous and current lactations. Only factors associated with culling (P < 0.25 for the likelihood ratio chi-square test) were offered to multivariable models. Multivariable models were built using a backward-elimination procedure based on the likelihood-ratio chi-square statistic, until a final model was obtained with all remaining diseases significant at P

(Jansen, 1990). Computations were carried out using a FORTRAN program developed by Jansen. Disease conditions which became non-significant were removed using a backward procedure until a final model was obtained with only significant diseases (P < 0.10).

2.3. Analyses

The original dataset included cows which experienced two, three and four complete lactations (1016, 934 and 113 cows respectively) during the study period. All lactations that lacked a record of a previous lactation within the dataset (thus, including all selected first lactations) were excluded from analyses. The dependent variable was late culling (yes/no); the independent variables were based on data from the current and the previous lactations. In total, 3223 lactation records for which data from the previous lactation were not lacking were eligible. The unit of analysis was the lactation record.

2.3.1. AL analysis

In analysis AL (All Lactations), all lactation records available in the dataset were used, ignoring the fact that these records were not independent within cows.

2.3.2. ROL analysis

In analysis ROL (Random One Lactation), to account for non-independence of lactation records within cows, one lactation record per cow was randomly selected using the PLAN procedure of the SAS Institute Inc. (1989). In order to account for the variability due to the random selection, selection and analyses were performed ten times. Increasing the number of random selections would have resulted in keeping more information from the native dataset. However, considering the number of possible lactation records available per cow, and given the ten random selections, the probability for a lactation record not to be used in any model was below 0.001 for cows with two lactation records eligible, and below 0.02 for cows with three lactation records eligible. The ten final models are a sample of all possible models.
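The selection step and the coverage probabilities quoted above can be illustrated with a minimal sketch. It assumes the ten selections are independent and uniform within each cow; the study itself used the SAS PLAN procedure, so the Python functions, the toy `records_by_cow` data and the seed below are hypothetical stand-ins, not the authors' program.

```python
import random


def select_one_lactation_per_cow(records_by_cow, rng):
    """Pick one eligible lactation record per cow, uniformly at random."""
    return {cow: rng.choice(records) for cow, records in records_by_cow.items()}


def prob_never_selected(n_eligible, n_selections=10):
    """Probability that one given record is missed by every selection,
    assuming independent, uniform selections among n_eligible records."""
    return (1 - 1 / n_eligible) ** n_selections


# Hypothetical toy data: cow identifier -> eligible lactation numbers.
records_by_cow = {"cow_a": [2], "cow_b": [2, 3], "cow_c": [2, 3, 4]}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [select_one_lactation_per_cow(records_by_cow, rng) for _ in range(10)]

p_two = prob_never_selected(2)    # cows with two eligible records
p_three = prob_never_selected(3)  # cows with three eligible records
print(f"two eligible records:   {p_two:.5f}")
print(f"three eligible records: {p_three:.4f}")
```

Under these assumptions the two probabilities are (1/2)^10 and (2/3)^10, which fall below the 0.001 and 0.02 bounds stated in the text. A cow with a single eligible record, such as `cow_a`, is necessarily included in every sample.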
In this sample, the number of models in which a given disease was significantly (P < 0.10) related to culling follows a binomial distribution (Snedecor and Cochran, 1980). If the disease is not related to culling, it has a 0.10 probability of being significant in any one model. Thus, in a sample of ten models, the probabilities of having such a disease present in three or more and four or more models out of ten are respectively 0.07 and 0.013. Therefore, a disease was considered as significantly associated with culling at a 5% significance level if present in four models or more out of ten.

Variability in resulting outputs due to the use of the method ROL was assessed. Within analysis ROL, qualitative two-by-two between-final-models comparisons were performed by calculating the number and percentages of mismatches between two given models. A percent mismatch was defined as the number of diseases retained in one final model and not in the other one, divided by the total number of diseases retained in at least one model. For all diseases significantly associated with culling based on PP (the proportion of models for which the disease was significantly related to culling), means of their n associated regression coefficients (β) and of their corresponding standard errors were calculated as summary point estimates and dispersion parameters. Standard errors of the mean of β from logistic binomial regression were also assessed. These were compared with corresponding regression coefficients and standard errors resulting from the AL analysis to measure the possible bias due to ignoring the cow level of clustering.

Table 1
Diseases significantly (P < 0.10) related to late culling in at least one final model and proportion of models in which the corresponding disease was a risk factor

Disease a   Sample                                    Proportion PP b
            1   2   3   4   5   6   7   8   9   10
P-ABO1      Y   Y   Y   Y   Y   Y   Y   Y   Y   Y     1*
P-ACC       N   Y   Y   N   N   N   N   N   Y   Y     0.4*
P-KETO      N   N   N   Y   Y   N   N   N   N   N     0.2
P-LOC       N   N   Y   N   N   N   N   N   N   N     0.1
P-MAS3      N   N   N   N   N   N   N   Y   N   N     0.2
P-MAS5      Y   N   N   Y   N   Y   N   N   N   N     0.3
P-MET1      N   N   N   N   N   Y   N   N   N   N     0.1
P-OVA3      N   N   Y   Y   Y   N   Y   Y   Y   Y     0.7*
P-UDD2      N   N   N   N   Y   Y   N   N   N   N     0.2
P-SCC       Y   Y   Y   Y   Y   Y   Y   Y   Y   Y     1*
ABO1        Y   Y   Y   Y   Y   Y   N   N   Y   Y     0.8*
ABO2        Y   N   Y   Y   Y   Y   N   N   Y   Y     0.7*
ASSIST      N   Y   N   Y   N   Y   Y   Y   N   Y     0.6*
MAS1        Y   Y   Y   Y   Y   Y   Y   Y   Y   Y     1*
MAS2        Y   Y   Y   Y   Y   N   Y   Y   Y   Y     0.9*
MAS3        Y   N   N   Y   N   Y   Y   N   N   Y     0.5*
MAS5        N   N   N   N   N   Y   N   Y   N   Y     0.3
MET1        Y   Y   Y   Y   Y   N   Y   N   Y   Y     0.8*
RP          N   N   N   Y   Y   Y   N   N   N   N     0.3
TEA1        N   N   Y   N   N   N   Y   Y   N   N     0.3
SCC         N   N   Y   Y   N   N   Y   N   N   N     0.3

a For definition, see Appendix.
b Number of final models (n) in which the concerned disease was significantly (P < 0.10) related to culling, divided by the total number of models (N = 10).
Y, significantly related to late culling (P < 0.10); N, not significantly related to late culling (P > 0.10).
* Significant proportion (PP) (P < 0.10).
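The decision rule of Section 2.3.2 (four or more significant models out of ten) rests on two binomial tail probabilities. The short sketch below recomputes them, assuming ten independent models and a per-model false-positive probability of 0.10 as stated in the text; the function name `prob_at_least` is ours.

```python
from math import comb


def prob_at_least(k, n=10, p=0.10):
    """Upper tail P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_three_or_more = prob_at_least(3)
p_four_or_more = prob_at_least(4)
print(f"P(X >= 3) = {p_three_or_more:.3f}")
print(f"P(X >= 4) = {p_four_or_more:.3f}")
```

The first probability is close to 0.07 and exceeds 0.05, while the second is close to 0.013, which is why four models, not three, is the smallest count meeting the 5% level. The independence assumption is only approximate here, since the ten samples share many lactation records.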

3. Results

The incidences of the health disorders taken into account in the analyses are displayed in the Appendix.

3.1. Between-final-models comparison, ROL approach

Table 1 gives between-final-models comparisons for disease conditions. Twenty-one disease conditions were related at least once to late culling. The number of significant parameters varied from 8 to 14 per model depending on the sample. Eleven disease conditions were significantly associated with late culling.
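The mismatch count defined in Section 2.3.2 amounts to a set symmetric difference between the diseases retained in two final models. The sketch below applies it to samples 1 and 2, with disease sets transcribed from the corresponding columns of Table 1 and the 21 diseases retained in at least one model as denominator; the variable and function names are illustrative only.

```python
N_DISEASES = 21  # diseases retained in at least one of the ten final models


def mismatches(model_a, model_b):
    """Diseases retained in exactly one of the two final models."""
    return model_a ^ model_b  # set symmetric difference


# Diseases marked Y in Table 1 for samples 1 and 2.
sample_1 = {"P-ABO1", "P-MAS5", "P-SCC", "ABO1", "ABO2",
            "MAS1", "MAS2", "MAS3", "MET1"}
sample_2 = {"P-ABO1", "P-ACC", "P-SCC", "ABO1", "ASSIST",
            "MAS1", "MAS2", "MET1"}

diff = mismatches(sample_1, sample_2)
n_mismatch = len(diff)
pct_mismatch = round(100 * n_mismatch / N_DISEASES)
print(sorted(diff), n_mismatch, pct_mismatch)
```

For this pair the sketch gives five mismatching diseases, i.e. 24% of 21. The same function can be restricted to the 11 diseases retained on the basis of PP by intersecting each set with that list and dividing by 11.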


Table 2
Number of mismatches (number of diseases retained in one final model and not in the other) between pairs of samples in the Random One Lactation (ROL) analysis. The numbers in brackets are the percentage representation of these mismatches

Sample   Sample
         1       2       3       4       5       6       7       8       9       10
1        _       5(24)   7(33)   6(29)   6(29)   7(33)   7(33)   10(48)  5(24)   5(24)
2                _       5(24)   8(38)   7(33)   10(48)  6(29)   7(33)   4(19)   4(19)
3                3(27)   _       8(38)   7(33)   14(67)  6(29)   9(43)   4(19)   6(29)
4                4(36)   3(27)   _       5(24)   8(38)   6(29)   11(52)  8(38)   6(29)
5                4(36)   1(9)    2(18)   _       9(43)   9(43)   10(48)  5(24)   7(33)
6                5(45)   6(54)   3(27)   5(45)   _       12(57)  11(52)  12(57)  8(38)
7                4(36)   5(45)   2(18)   4(36)   5(45)   _       5(24)   8(38)   6(29)
8                4(36)   5(45)   4(36)   4(36)   5(45)   2(18)   _       7(33)   7(33)
9                3(27)   0(0)    3(27)   1(9)    6(54)   5(45)   5(45)   _       4(19)
10               3(27)   2(18)   3(27)   3(27)   4(36)   3(27)   5(45)   2(18)   _

Table 2 summarizes the overall two-by-two between-final-models comparison. This table is in two sections. The part above the diagonal gives the number of diseases that appear in one of the two models but not in the other. The number in brackets beside it is the percentage representation of these mismatches. For example, five diseases (P-ACC, P-MAS5, ABO2, ASSIST and MAS3) appear in just one of samples 1 and 2. This represents 24% of the 21 diseases offered in the models. The part below the diagonal gives the same counts and proportions, but is based only on those 11 diseases for which PP ≥ 0.4.

When all diseases significantly related to culling were taken into account, the number of mismatches between models varied from 4 to 14. When the number of diseases was restricted to those that appeared in four or more samples (PP ≥ 0.4), this number varied from 0 to 6. Under these circumstances, 22 of the 45 two-by-two comparisons had more than 30% mismatches. The directions of effect estimates for a given disease were consistent (Table 3).

3.2. Health disorders as risk factors for culling in ROL analysis

Table 3 gives the descriptive statistics of the logistic regression coefficients corresponding to diseases significantly associated with culling, i.e. PP ≥ 0.4. A total of 1988 lactation records were considered in analysis ROL. The number of lactation records that ended by a late culling ranged from 759 to 791 (mean 780). Four health disorders from the previous lactation were significantly associated with late culling. Cows previously experiencing abortion (P-ABO1), accident (P-ACC), P-SCCB and P-SCCC had a higher risk of being culled than cows without these disorders. In contrast, cystic ovaries (P-OVA3) were protective for culling in the subsequent lactation. Seven health disorders (ABO1, ABO2, ASSIST, MAS1, MAS2, MAS3, MET1) were positively associated with culling in the current lactation.
The means of standard errors of β suggested that estimates of regression coefficients for a given disease showed large variations in many cases. However, as suggested by the standard errors of the mean of β, the number of final models including the disease appeared not to greatly influence the dispersion of β attached to that disease.

Table 3
Regression coefficients and their standard errors corresponding to diseases significantly related to late culling in ROL (Random One Lactation) analysis (1988 cows, based on PP) and AL (All Lactations) analysis (3058 lactation records)

Disease a   Analysis ROL                                                    Analysis AL
            PP b   Regression coefficient (β)   Standard error              Regression    Standard
                   Mean     Min.     Max.       Mean    SE of mean of β     coefficient   error
P-ABO1      1       2.17     1.72     3.0       0.93    0.13                 1.85         0.74
P-ACC       0.4     0.91     0.76     1.10      0.43    0.08                 0.67         0.37
P-OVA3      0.7    -2.07    -2.47    -1.75      1.12    0.10                -1.80         0.94
P-SCC
  P-SCCB    1       0.25     0.16     0.43      0.13    0.03                 0.30         0.11
  P-SCCC    1       0.50     0.36     0.67      0.16    0.03                 0.54         0.14
ABO1        0.8     1.37     0.97     1.70      0.63    0.09                 1.25         0.52
ABO2        0.7     1.25     0.87     1.77      0.51    0.13                 1.16         0.39
ASSIST      0.6     0.48     0.44     0.61      0.23    0.02                 0.41         0.19
MAS1        1       0.41     0.28     0.50      0.17    0.02                 0.49         0.14
MAS2        0.9     0.54     0.39     0.73      0.21    0.04                 0.50         0.18
MAS3        0.5     0.47     0.40     0.58      0.21    0.03                 0.38         0.17
MET1        0.8     0.51     0.40     0.68      0.21    0.03                 0.55         0.18
SCC
  SCCB      0.3                                                             -0.25         0.11
  SCCC      0.3                                                             -0.29         0.14

a For definition, see Appendix.
b PP, number of final models (n) in which the concerned disease was significantly (P < 0.10) related to culling, divided by the total number of models (N = 10).

3.3. ROL analysis versus AL analysis

General trends in the relationship between each health disorder and culling can be assessed by presenting outcomes of the AL analysis, in addition to those of the ROL analysis (Table 3). No discrepancies can be noticed in the direction of the relationship, since each disease was either a risk indicator or a protective indicator in both analyses. However, the magnitude of the relationship showed variations. Nine (out of 12) regression coefficients obtained from the AL analysis were included in the range (minimum-maximum) of the β obtained from the ROL analysis.
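The count of nine out of 12 can be rechecked with a few lines of code. The sketch below uses the minimum, maximum and AL coefficients as we read them from Table 3; the assignment of values to rows is our reading of the table, so it should be treated as an assumption, and the names `table3`, `inside` and `outside` are illustrative.

```python
# disease: (ROL minimum, ROL maximum, AL coefficient), as read from Table 3
table3 = {
    "P-ABO1": (1.72, 3.0, 1.85),
    "P-ACC": (0.76, 1.10, 0.67),
    "P-OVA3": (-2.47, -1.75, -1.80),
    "P-SCCB": (0.16, 0.43, 0.30),
    "P-SCCC": (0.36, 0.67, 0.54),
    "ABO1": (0.97, 1.70, 1.25),
    "ABO2": (0.87, 1.77, 1.16),
    "ASSIST": (0.44, 0.61, 0.41),
    "MAS1": (0.28, 0.50, 0.49),
    "MAS2": (0.39, 0.73, 0.50),
    "MAS3": (0.40, 0.58, 0.38),
    "MET1": (0.40, 0.68, 0.55),
}

# AL coefficients falling inside the ROL minimum-maximum range
inside = [d for d, (lo, hi, al) in table3.items() if lo <= al <= hi]
outside = sorted(d for d in table3 if d not in inside)
print(len(inside), "of", len(table3), "inside the ROL range; outside:", outside)
```

With these values, nine AL coefficients lie within the ROL range and three (P-ACC, ASSIST and MAS3) fall just below the ROL minimum.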

4. Discussion

Our study proposed a method to eliminate the cow level of clustering and partly assessed the advantages and drawbacks of the method. This method was based on repeated randomized samplings. Bigras-Poulin (1985) provides another example of random selection of cows in epidemiological studies. In contrast to Bigras-Poulin (1985), who did only one random sampling, ten random samplings were done in the present study to assess the variability of outcomes. In our study, the ten samples showed considerable overlap because 1016 (out of 2063) cows experienced only one eligible lactation record during the study period and therefore were forced into each sample. Despite this, the between-samples comparison showed qualitative variations (Tables 1 and 2). This variation was not dramatically reduced after exclusion of the parameters that were not significant based on the proportion PP. Using a single random sample might give incomplete information, so the understanding of the studied process will be suboptimal. The estimated coefficients of significant parameters vary between analyses (Table 3). Both the range of the regression coefficients and the mean of the standard errors of β attached to each disease support the utility of repeated random selections for a more accurate description.

An apparent drawback of this method is that it provides point estimates and intervals that have to be summarized to answer the research question (in the present case: what is the magnitude of the association between a given disease and late culling?). A method effectively accounting for more than one level of nested effects would have directly provided appropriate point estimates and confidence intervals. However, summary estimates of the βs calculated from the ROL method (such as means, medians or percentiles) can be exponentiated to obtain odds ratios.

The overall agreement in results, whether or not dependency of observations was accounted for, stresses the fact that ignoring the within-cow dependency of lactations would not lead to large bias in computed solutions in our study. However, some qualitative differences can be pointed out.
SCCB and SCCC are significantly related to culling in the AL analysis and not in the ROL analysis. One explanation is the potential selection bias due to heterogeneity of disease incidences between the three subpopulations of cows, i.e. cows with two, three and four lactations available respectively. If the disease is more (respectively less) frequent in cows with two lactations, its prevalence in the random sample is higher (respectively lower) than when considering all lactations. For a given disease, the confidence interval of β partly depends on its prevalence. For a given point estimate of β, the confidence interval of β is wider (respectively narrower) if the disease is less (respectively more) frequent in cows with only two lactations. Therefore, when incidence is unequally distributed, the probability of being significantly associated with culling is larger (respectively lower) for a disease that is more (respectively less) frequent in cows with only two lactations. One way of correcting such bias could be to apply the same selection rate in the three subpopulations, but that implies a loss of power. However, when health disorders are equally distributed, it is worthwhile to maintain the power as large as possible. Furthermore, for diseases with low prevalence, the power is more sensitive to sample size. Excluding some lactations may decrease the power.

As Table 3 shows, the effect estimates were always in the same direction, even when the corresponding PP was low. Although the β of three disease conditions estimated in the AL analysis were out of the range of the β estimated in the ROL analysis, there is an overall agreement between ROL and AL methods. These results are consistent with studies stating


that ignoring clustering does not introduce huge problems with point estimates, but leads to spuriously low estimates of standard errors (Curtis et al., 1993).

The proposed method, which was performed at the cow level, alleviated the problem of overdispersion due to the cow effect in each sample. Thus, in the ROL analysis, the potential extra-binomial variation due to the cow effect did not interfere with the procedure of variable selection at each step of the modelling. Therefore, the ROL method really accounted for the residual variance due to the nested effects. However, owing to the resampling, this technique may induce variations in parameter estimates.

From this study, there is evidence of the utility of repeated random sampling to model the same 'question' several times from the same initial dataset, each sample containing only one unit of interest per cluster. We find this method particularly interesting for the case of a large number of small clusters. Repeated sampling before modelling, although very simple in principle, obviously requires huge computing time. Nevertheless, increasing the number of analyses performed is necessary to provide a more precise identification and assessment of potential risk factors. The method may be used when accounting for nested effects in regression analyses is necessary.
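The exponentiation of summary estimates mentioned in the Discussion can be sketched as follows. The mean coefficients are taken from the ROL columns of Table 3 for three diseases chosen for illustration; the per-sample coefficients are not reported, so medians and percentiles are not shown, and the names `mean_beta` and `odds_ratio` are ours.

```python
from math import exp

# Mean ROL regression coefficients for three diseases, from Table 3.
mean_beta = {"P-ACC": 0.91, "P-OVA3": -2.07, "MAS1": 0.41}

# Odds ratio associated with each summary coefficient: OR = exp(beta).
odds_ratio = {disease: exp(beta) for disease, beta in mean_beta.items()}

for disease, value in odds_ratio.items():
    print(f"{disease}: OR = {value:.2f}")
```

Note that exponentiating the mean of the βs gives the geometric mean of the per-sample odds ratios, not their arithmetic mean. An odds ratio below 1, as for P-OVA3, corresponds to the protective association reported in Section 3.2.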

Acknowledgments

This project was supported in part by grants from the GALA Foundation and Rhône Poulenc Animal Nutrition.


Appendix

Table 1
Description of disease variables taken into account in the analyses

Disease variable a          Incidence per    Definition
Current      Previous       lactation (%)
lactation    lactation
ABO1         P-ABO1          0.9             Abortion from 100 to 180 days of gestation
ABO2         P-ABO2          1.1             Abortion at more than 180 days of gestation
ACC          P-ACC           1.6             Accident, trauma, haemorrhage of genital tractus, embryotomy, caesarean section, leg paralysis diagnosed at 10 days or less post-partum
ASSIST       P-ASSIST        8.2             Dystocia: calving provided with assistance
DIG          P-DIG           4.5             Miscellaneous digestive disorders
KETO         P-KETO          2.9             Ketosis or loss of appetite without any concurrent health disorder within a [-3, +3] day interval, diagnosed at 45 days or less post-partum
LOC          P-LOC          14.2             Foot and legs disorders
MAS1         P-MAS1         11.3             Mastitis diagnosed at 45 days or less post-partum
MAS2         P-MAS2          5.2             Mastitis diagnosed from 46 to 90 days post-partum
MAS3         P-MAS3          6.5             Mastitis diagnosed from 91 to 180 days post-partum
MAS4         P-MAS4          4.1             Mastitis diagnosed from 181 to 270 days post-partum
MAS5         P-MAS5          1.7             Mastitis diagnosed at 271 days or more post-partum
MASD         P-MASD          2.0             Mastitis diagnosed within the dry period
MET1         P-MET1          8.0             Vaginitis, vulvitis, vulvo-vaginitis, endometritis, metritis, pyometritis diagnosed at 21 days or less post-partum
MET2         P-MET2          6.8             Vaginitis, vulvitis, vulvo-vaginitis, endometritis, metritis, pyometritis diagnosed from 22 to 49 days post-partum
MET3         P-MET3          6.5             Vaginitis, vulvitis, vulvo-vaginitis, endometritis, metritis, pyometritis diagnosed at 50 days or more post-partum
MF           P-MF            7.7             Milk fever
OVA1         P-OVA1          2.5             Cystic ovaries diagnosed from 46 to 90 days post-partum
OVA2         P-OVA2          1.4             Cystic ovaries diagnosed from 91 to 150 days post-partum
OVA3         P-OVA3          0.6             Cystic ovaries diagnosed at 151 days or more post-partum
RES1         P-RES1          3.0             Respiratory disorders diagnosed at 150 days or less post-partum
RES2         P-RES2          1.9             Respiratory disorders diagnosed at 151 days or more post-partum
RP           P-RP           10.8             Retained placenta (> 12 h)
SCC          P-SCC                           Milk somatic cell count status
                            51.1               SCCA: healthy: all SCC < 300 000 cells ml^-1
                            29.5               SCCC: infected: at least 2 SCC > 800 000 cells ml^-1
                            19.4               SCCB: no conclusion: other case
TEA1         P-TEA1          0.5             Teat injuries and trauma diagnosed at 45 days or less post-partum
TEA2         P-TEA2          1.0             Teat injuries and trauma diagnosed at 46 days or more post-partum
UDD1         P-UDD1          2.1             Non-traumatic udder disorders diagnosed at 45 days or less post-partum
UDD2         P-UDD2          1.2             Non-traumatic udder disorders diagnosed at 46 days or more post-partum

a For each variable, the cow was positive when the event occurred at least once.


References

Beaudeau, F., Frankena, K., Fourichon, C., Seegers, H., Faye, B. and Noordhuizen, J.P., 1994. Associations between health disorders of French dairy cows and early and late culling decision making within the lactation. Prev. Vet. Med., 19: 213-231.
Bendixen, P.H., 1988. Risk indicators of disease occurrence in dairy cows in Sweden. Rep. 18, Department of Animal Hygiene, Faculty of Veterinary Medicine, Swedish University of Agricultural Sciences, Skara.
Bendixen, P.H. and Astrand, D.B., 1989. Removal risks in Swedish Friesian dairy cows according to parity, stage of lactation, and occurrence of clinical mastitis. Acta Vet. Scand., 30: 37-42.
Bigras-Poulin, M., 1985. Interrelationships among calving events, selected health problems, milk production, disposal and death in Ontario Holstein cows. Ph.D. Thesis, University of Guelph, Canada, 220 pp.
Breslow, N.E. and Day, N.E., 1980. Statistical methods in cancer research. Vol. I. The analysis of case-control studies. IARC Scientific Publications, Lyon, France, 338 pp.
Cobo-Abreu, R., Martin, S.W., Willoughby, R.A. and Stone, J.B., 1979. The association between disease, production and culling in a university dairy herd. Can. Vet. J., 20: 191-195.
Curtis, C.R., Mauritsen, R.H., Kaas, P.H., Salman, M.D. and Erb, H.N., 1993. Ordinary versus random-effects logistic regression for analysing herd-level calf morbidity and mortality data. Prev. Vet. Med., 16: 207-222.
Dohoo, I.R. and Martin, S.W., 1984. Disease, production and culling in Holstein-Friesian cows, V. Survivorship. Prev. Vet. Med., 2: 771-784.
Erb, H.N., Smith, R.D., Oltenacu, P.A., Guard, C.L., Hillman, R.B., Powers, P.A., Smith, M.C. and White, W.E., 1985. Path model of reproductive disorders and performance, milk fever, mastitis, milk yield and culling in Holstein cows. J. Dairy Sci., 68: 3337-3349.
Gröhn, Y.T., Saloniemi, H.S. and Syväjärvi, J., 1986. An epidemiological and genetic study on registered diseases in Finnish Ayrshire cattle, 1. The data, disease occurrence and culling. Acta Vet. Scand., 27: 182-195.
Hosmer, D.W. and Lemeshow, S., 1989. Applied Logistic Regression. Wiley-Interscience, New York, 307 pp.
Jansen, J., 1990. On the statistical analysis of ordinal data when extravariation is present. Appl. Stat., 39: 75-84.
Martin, S.W., Aziz, S.A., Sandals, W.C.D. and Curtis, R.A., 1982. The association between clinical disease, production and culling of Holstein Friesian cows. Can. J. Anim. Sci., 62: 633-640.
Martin, S.W., Meek, A.H. and Willeberg, P., 1987. Veterinary Epidemiology. Principles and Methods. Iowa State University Press, Ames.
Mauritsen, R.H., 1984. Logistic regression with random effects. Ph.D. Thesis, Department of Biostatistics, University of Washington, USA.
McDermott, J.J., Schukken, Y.H. and Shoukri, M.M., 1994. Study design and analytic methods for data collected from clusters of animals. Prev. Vet. Med., 18: 175-191.
Oltenacu, P.A., Britt, J.H., Braun, R.K. and Mellenberger, R.W., 1984. Effect of health status on culling and reproductive performance of Holstein cows. J. Dairy Sci., 67: 1783-1792.
Statistical Analysis Systems Institute Inc., 1989. SAS/STAT User's Guide, Version 6, 4th edn. SAS Institute Inc., Cary, NC.
Snedecor, G.W. and Cochran, W.G., 1980. Statistical Methods, 7th edn. Iowa State University Press, Ames.