J. Dairy Sci. 92:4063–4071 doi:10.3168/jds.2008-1918 © American Dairy Science Association, 2009.
Comparison between a sire model and an animal model for genetic evaluation of fertility traits in Danish Holstein population
C. Sun,*† P. Madsen,* U. S. Nielsen,‡ Y. Zhang,† M. S. Lund,* and G. Su*1
*Department of Genetics and Biotechnology, Faculty of Agricultural Sciences, Aarhus University, DK-8830, Tjele, Denmark
†College of Animal Science and Technology, China Agricultural University, Beijing, China
‡Danish Agricultural Advisory Service, DK-8200 Aarhus N, Denmark
ABSTRACT
Comparisons between a sire model, a sire-dam model, and an animal model were carried out to evaluate the ability of the models to predict breeding values of fertility traits, based on data including 471,742 records from the first lactation of Danish Holstein cows, covering insemination years from 1995 to 2004. The traits in the analysis were days from calving to first insemination, calving interval, days open, days from first to last insemination, number of inseminations per conception, and nonreturn rate within 56 d after first service. The correlations between sire estimated breeding value (EBV) from the animal model and the sire-dam model were close to 1 for all the traits, and those between the animal model and the sire model ranged from 0.95 to 0.97. Model ability to predict sire breeding value was assessed using 4 criteria: 1) the correlation between sire EBV from 2 data subsets (DATAA and DATAB); 2) the correlation between sire EBV from training data (DATAA or DATAB) and yield deviation from test data (DATAB or DATAA) in a cross-validation procedure; 3) the correlation between the EBV of proven bulls, obtained from the whole data set (DATAT) and from a reduced set of data (DATAC1) that contained only the first-crop daughters of sires; and 4) the reliability of sire EBV, calculated from the prediction error variance of EBV. All criteria used showed that the animal model was superior to the sire model for all the traits. The sire-dam model performed as well as the animal model and had a slightly smaller computational demand. Averaged over the 6 traits, the correlations between sire EBV from DATAA and DATAB were 0.61 (sire model) versus 0.64 (animal model), the correlations between EBV from DATAT and DATAC1 for proven bulls were 0.59 versus 0.67, the correlations between EBV and yield deviation in the cross-validation were 0.21 versus 0.24, and the reliabilities of sire EBV were 0.42 versus 0.46. Model ability to predict cow breeding value was measured by the reliability of cow EBV, which increased from 0.21 using the sire model to 0.27 using the animal model. All the results suggest that the animal model, rather than the sire model, should be used for genetic evaluation of fertility traits.
Key words: animal model, female fertility, genetic evaluation, model validation
Received November 21, 2008. Accepted April 8, 2009.
1 Corresponding author: [email protected]
INTRODUCTION
Intensive selection for yield and unfavorable genetic correlation between yield and fertility traits has resulted in a downward genetic trend in the fertility of dairy cows (Weller and Ezra, 1997; Wall et al., 2003; Pryce et al., 2004). Poor fertility decreases economic efficiency in dairy cattle production because reproduction problems are often followed by extra inseminations and veterinary treatment costs, prolonged calving intervals, and greater rates of involuntary culling. Many studies have reported that poor fertility could be a major reason for involuntary culling of dairy cows (Westell et al., 1992; Esslemont, 1993; Olori et al., 2002). Therefore, fertility is considered one of the most important traits in the breeding goal after milk production and mastitis.
Female fertility is a combination of many factors. The traits often used in genetic evaluation of female fertility are those that reflect the ability to return to cycling after calving [days from calving to first insemination (ICF)], the ability to conceive following insemination [days from the first to last insemination (IFL), number of inseminations per conception (AIS), and nonreturn rate within 56 d after first service (NRR56)], and the combination of these abilities [calving interval (CI) and days open (DO)] (Jorjani, 2006, 2007). These traits have a low heritability (Dematawewa and Berger, 1998; Pryce et al., 2004), and the data are not normally distributed and often include censored records (e.g., the records for cows that have not cycled or conceived within the inspecting period). More sophisticated models (e.g., proportional hazards model, censored Gaussian model, censored threshold
SUN ET AL.
model, and threshold-linear model) have been proposed for genetic evaluation of fertility traits to account for censoring and the distribution of the data (Schneider et al., 2005; González-Recio et al., 2006; Urioste et al., 2007; Hou et al., 2009). However, these more sophisticated models require more computational resources and are more complicated to implement. Currently, a linear sire model (SM) is used for genetic evaluation of fertility traits in most countries, including Denmark (Interbull, 2009). The SM is theoretically inferior to the animal model (AM) in the estimation of variances and other genetic parameters (Everett et al., 1979; Schaeffer, 1983; Hudson and Schaeffer, 1984), but the superiority of the linear AM over the linear SM in the ability to predict breeding values has received less attention (Ramirez-Valverde et al., 2001). Therefore, the aim of this study was to compare the AM, sire-dam model (SDM), and SM for genetic evaluation of fertility traits, based on the data from first lactation in the Danish Holstein population, with regard to the ability to predict sire breeding value (i.e., the future fertility performance of the daughters of sires). In addition, effects of the models on the reliability of cow EBV were investigated.
MATERIALS AND METHODS
Data
Female fertility data on the Danish Holstein population were obtained from the Danish Cattle Federation (Aarhus, Denmark). Detailed information on the breeding scheme for this population can be found online (http://www.nordicebv.info/BreedingWork/Breedingwork.htm). The raw data included records from heifers and from the first 3 lactations of cows, covering insemination years from 1992 to 2006. Only the data from the first lactation during insemination years 1995 to 2004 were used in the present study. The restriction to the period from 1995 to 2004 was imposed to exclude left-censored cows and cows with fertility events in progress (still undergoing inspection of fertility events). The traits in the analysis were ICF, CI, DO, IFL, AIS, and NRR56. The raw data were edited using the following 3 steps.
• Step 1. Editing criteria in this step were as follows: 1) Age at first insemination as a heifer should be between 270 and 900 d, and age at first calving should be between 550 and 1100 d. 2) Herds should have records in each year from 1995 to 2004, and on average should have at least 50 records (sum of the number of records across heifers and the first 3 parities) per year. 3) Sires of cows should be known.
• Step 2. Data were further edited for each particular trait in the first lactation. First, data from cows that were not inseminated and cows with ICF <20 d were deleted. Second, herd-year subclasses were required to have a minimum of 5 records, and sires should have at least 5 daughters with records.
• Step 3. The aim of this step was to handle censored and extreme records. Approximately 15% of the cows had no known date of confirmed successful insemination. For these cows, the last insemination was taken as an unsuccessful insemination, and the corresponding records were taken as censored records. Many strategies have been used to handle censored records of fertility traits (Donoghue et al., 2004; González-Recio et al., 2006; Urioste et al., 2007; Hou et al., 2009). In this study, a penalty of 21 d was added to censored IFL, DO, and CI, and a penalty of 1 was added to censored AIS. This was a simple approach, although not satisfactory, to deal with censoring in a linear model setting. For cows without a subsequent calving, CI was calculated as DO plus 280 d. The following upper limits were imposed: 200 d for ICF, 600 d for CI, 320 d for DO, 230 d for IFL, and 8 for AIS. Records with values larger than the upper limit were replaced with the upper limit.
Journal of Dairy Science Vol. 92 No. 8, 2009
After editing, the whole data set (DATAT) contained 471,742 first-lactation records from 6,887 sires and 1,899 herds. Pedigrees for the 3 models were built by tracing the ancestors back as far as possible by using the sire-dam structure. Consequently, the pedigrees included 928,665; 645,444; and 23,744 individuals for the AM, SDM, and SM, respectively. Another data set was created from the raw data by the above editing procedure, with the exception that herds were not required to have records in each year or to have at least 50 records per year.
This approach resulted in a large data set with 1,050,494 records (DATADH), which was used to investigate the reliabilities of EBV of fertility traits for first parity in the Danish Holstein population. In addition, a reduced data set (DATADHR) was created from DATADH by leaving out the records of the last 2 insemination years, to estimate the reliabilities of EBV of the cows without their own records. For the purpose of model validation, 3 data subsets were created from DATAT. Subsets DATAA and DATAB were created by a division of the whole data set (DATAT) randomly by herds. Subset DATAC1 consisted of the records from first-crop daughters of
GENETIC ANALYSIS OF FEMALE FERTILITY
Table 1. Data structures for data sets DATAT, DATAA, DATAB, DATAC1, DATADH, and DATADHR1
Item                                      DATAT     DATAA     DATAB     DATAC1    DATADH      DATADHR
No. of records                            471,742   236,481   235,261   161,312   1,050,494   852,266
No. of sires with daughter records        6,887     5,779     5,640     6,444     10,949      9,556
No. of herds                              1,899     948       951       1,896     8,516       8,388
No. of individuals in pedigree for AM     928,665                                 2,048,317
No. of individuals in pedigree for SDM    645,444
No. of individuals in pedigree for SM     23,744
1 DATAT was the whole data set for detailed analysis. DATAA and DATAB were the divisions of DATAT. DATAC1 contained only the records of first-crop daughters of sires in DATAT. DATADH was an extra data set containing the records from the whole Danish Holstein population. DATADHR was a reduced data set from DATADH created by deleting the records of the last 2 insemination years. AM = animal model; SDM = sire-dam model; SM = sire model.
sires. Detailed information on the data sets is shown in Table 1. Although a unique pedigree could be built for each data set, the pedigree built for DATAT was applied to the analysis based on the subsets. Similarly, the pedigree built for DATADH was applied to the analysis based on DATADHR.
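The censoring penalties and upper limits of Step 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `edit_record` and the dictionary layout are hypothetical, the penalty is assumed to be applied before the upper limits, and for cows without a subsequent calving CI is assumed to be derived from the already penalized DO, so the 21-d penalty enters CI only once. The paper does not state the order of these operations.

```python
# Constants taken from Step 3 of the editing procedure.
PENALTY_DAYS = 21      # added to censored IFL, DO, and CI
PENALTY_AIS = 1        # added to censored AIS
GESTATION_DAYS = 280   # CI = DO + 280 d for cows without a subsequent calving
UPPER = {"ICF": 200, "CI": 600, "DO": 320, "IFL": 230, "AIS": 8}


def edit_record(rec, censored):
    """Return an edited copy of one first-lactation record.

    rec is a dict with keys ICF, IFL, DO, AIS, and CI (CI may be None when
    the cow has no subsequent calving). The layout is hypothetical.
    """
    out = dict(rec)
    if censored:
        # Penalize traits that depend on the unconfirmed last insemination.
        out["IFL"] += PENALTY_DAYS
        out["DO"] += PENALTY_DAYS
        out["AIS"] += PENALTY_AIS
    if out.get("CI") is None:
        # Assumption: CI is built from the (possibly penalized) DO.
        out["CI"] = out["DO"] + GESTATION_DAYS
    elif censored:
        out["CI"] += PENALTY_DAYS
    # Replace extreme values with the upper limit of each trait.
    for trait, limit in UPPER.items():
        out[trait] = min(out[trait], limit)
    return out


censored_cow = {"ICF": 80, "IFL": 150, "DO": 230, "AIS": 4, "CI": None}
edited = edit_record(censored_cow, censored=True)
```

Capping after penalizing keeps every edited value within the stated limits; the reverse order would let penalized records exceed them.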
Models
The following 3 univariate linear Gaussian models were used for genetic evaluation of the 6 fertility traits.

AM: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}\mathbf{a} + \mathbf{e}$,
SM: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_1\mathbf{s} + \mathbf{e}$, and
SDM: $\mathbf{y} = \mathbf{X}\boldsymbol{\beta} + \mathbf{Z}_2\mathbf{s} + \mathbf{Z}_3\mathbf{d} + \mathbf{e}$,

where $\mathbf{y}$ was the vector of observations of a particular fertility trait; $\boldsymbol{\beta}$ was the vector of fixed effects, including age group (in months, defined as the age at first insemination as being a heifer), year-month of calving (ICF, CI, DO) or insemination (IFL, AIS, NRR56), herd-year of calving (ICF, CI, DO) or insemination (IFL, AIS, NRR56), regression on the breed proportion of US Holstein, and regression on total heterozygosity; $\mathbf{a}$ was the vector of animal additive genetic effects; $\mathbf{s}$ was the vector of sire genetic effects; $\mathbf{d}$ was the vector of dam genetic effects; $\mathbf{e}$ was the vector of random residuals; and $\mathbf{X}$, $\mathbf{Z}$, $\mathbf{Z}_1$, $\mathbf{Z}_2$, and $\mathbf{Z}_3$ were corresponding incidence matrices. The distributions for the random effects were assumed as follows:

For AM: $\mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma_a^2)$, $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2)$,
For SM: $\mathbf{s} \sim N(\mathbf{0}, \mathbf{A}_s\sigma_s^2)$, $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_{e_s}^2)$, and
For SDM: $\begin{bmatrix}\mathbf{s}\\ \mathbf{d}\end{bmatrix} \sim N(\mathbf{0}, \mathbf{A}_{sd}\sigma_{sd}^2)$, $\mathbf{e} \sim N(\mathbf{0}, \mathbf{I}\sigma_{e_{sd}}^2)$,

where $\mathbf{A}$, $\mathbf{A}_s$, and $\mathbf{A}_{sd}$ were the matrices of the genetic relationship between individuals in the pedigree of all animals, of sires, and of sires and dams, respectively; $\sigma_a^2$, $\sigma_s^2$, and $\sigma_{sd}^2$ were animal additive genetic variance, sire genetic variance, and sire-dam genetic variance; $\mathbf{I}$ was an identity matrix; and $\sigma_e^2$, $\sigma_{e_s}^2$, and $\sigma_{e_{sd}}^2$ were residual variances for the 3 models, respectively. For SDM, we assumed that sire and dam genetic effects on the offspring came from the same distribution. In these models, all unknown parents were assigned to 10 phantom groups based on sex and birth year. Breed proportion of US Holstein was included in the models to account for genetic differences in the base generation between the domestic and foreign populations. However, the estimate of breed contribution was added to the EBV afterward because this component is a part of the breeding value. Variance components were estimated by using the AM based on DATAT for each of the fertility traits, applying REML with the average information algorithm (AI-REML). These estimates of variance components were then used to predict breeding values by using the 3 models shown above. Variance component analysis
Table 2. Mean and standard deviation for all the fertility traits, calculated from the whole data set (DATAT)1
Statistic                  ICF, d      CI, d        DO, d       IFL, d       AIS       NRR56, 0 or 1
Mean                       81.27       413.06       133.29      51.09        2.244     0.563
SD                         40.29       75.88        75.66       65.44        1.530     0.500
Median                     72          393          114         21           2         -
5th to 95th percentiles    34 to 169   323 to 589   44 to 309   0¹ to 212    1¹ to 5   -
1 Frequency(IFL = 0) = 41%; Frequency(AIS = 1) = 41%. ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
and breeding value prediction were carried out using the DMU package (Madsen et al., 2006; Madsen and Jensen, 2007).
Model Comparison
Model stability in terms of sire EBV was assessed as the correlation between EBV from subsets DATAA and DATAB. The predictive ability of the models with regard to the accuracy of sire EBV was evaluated using 3 criteria. The first criterion was the correlation between EBV from DATAC1 and DATAT for proven bulls. These bulls had at least 100 daughter records, with an average of 1,279 records in DATAT. Therefore, EBV based on DATAT for these bulls might be considered as approximations of true breeding values. The second criterion was a cross-validation based on the correlation between EBV and yield deviation (YD). The YD were calculated separately for DATAA and DATAB by adjusting daughter performance for fixed and nongenetic random effects, which were estimated from DATAT using each of the 3 models. The correlation between EBV from DATAA and YD from DATAB, and between EBV from DATAB and YD from DATAA, reflected the accuracy of sire EBV. The third criterion was the expected reliability ($R^2$) of EBV, calculated from the prediction error variance (PEV) of EBV as $R^2 = 1 - \mathrm{PEV}/\sigma_a^2$. In this study, the PEV was obtained from the inverse of the coefficient matrix of the mixed model equations. The ability of the SM and AM to predict the breeding value of cows was evaluated by the expected reliability of EBV for the cows that had their own records. For this purpose, the SM integrated the pedigree built for the AM to predict cow breeding value.
RESULTS
Based on the whole data set (DATAT), the means (Table 2) for ICF, CI, DO, and IFL were 81.3, 413.1, 133.3, and 51.1 d, respectively. The average number of recorded inseminations per conception was 2.24, and 56.3% of the cows did not return to service within 56 d after the first insemination. Coefficients of variation for different traits ranged from 18.4% for CI to 128.1% for IFL. The difference between the 95th percentile and the median was much larger than that between the fifth percentile and the median for all the traits (except for NRR56, which was a binary trait), indicating a right-skewed distribution for the data of these traits. Variance components for each trait, estimated from DATAT using the AM, are shown in Table 3. Estimates of heritability were low for all fertility traits (below 0.1). Heritability was 0.081 for ICF, 0.030 for IFL, and 0.028 for AIS. Correspondingly, DO and CI (which had ICF and IFL as components) had heritabilities of 0.067. The NRR56 had the lowest heritability (0.012).
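The coefficients of variation and the skewness argument above can be reproduced from the summary statistics in Table 2. The sketch below is illustrative only: the dictionary and function names are hypothetical, and the numbers are copied from Table 2 (NRR56 is left out because no median or percentiles are reported for it).

```python
# (mean, SD, 5th percentile, median, 95th percentile) copied from Table 2.
TABLE2 = {
    "ICF": (81.27, 40.29, 34, 72, 169),
    "CI": (413.06, 75.88, 323, 393, 589),
    "DO": (133.29, 75.66, 44, 114, 309),
    "IFL": (51.09, 65.44, 0, 21, 212),
    "AIS": (2.244, 1.530, 1, 2, 5),
}


def coefficient_of_variation(mean, sd):
    """Coefficient of variation in percent."""
    return 100.0 * sd / mean


def is_right_skewed(p5, median, p95):
    """Upper tail longer than lower tail, the criterion used in the text."""
    return (p95 - median) > (median - p5)


cv = {t: coefficient_of_variation(v[0], v[1]) for t, v in TABLE2.items()}
skewed = {t: is_right_skewed(v[2], v[3], v[4]) for t, v in TABLE2.items()}
```

With these inputs the smallest coefficient of variation belongs to CI and the largest to IFL, and every listed trait satisfies the right-skew criterion.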
Table 3. Estimates of additive genetic variances ($\sigma_a^2$), residual variances ($\sigma_e^2$), heritabilities ($h^2$), and the standard error [SE($h^2$)] for ICF, CI, DO, IFL, AIS, and NRR56 in first lactation, estimated using an animal model and based on the whole data set (DATAT)1
Trait            $\sigma_a^2$   $\sigma_e^2$   $h^2$    SE($h^2$)
ICF, d           108.2          1,223.6        0.081    0.0063
CI, d            355.2          4,919.1        0.067    0.0059
DO, d            350.0          4,889.4        0.067    0.0059
IFL, d           122.7          3,964.4        0.030    0.0043
AIS              0.0629         2.161          0.028    0.0042
NRR56, 0 or 1    0.0027         0.234          0.012    0.0024
1 ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
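As a consistency check on Table 3, heritability under the animal model can be recomputed as the ratio of additive genetic variance to total phenotypic variance, assuming (as the models above imply) that these are the only two random components. The names below are hypothetical, and the variances are the rounded values printed in Table 3, so small rounding differences from the published $h^2$ are expected (for NRR56 the rounded inputs give approximately 0.011 rather than 0.012).

```python
# (additive genetic variance, residual variance) copied from Table 3.
TABLE3 = {
    "ICF": (108.2, 1223.6),
    "CI": (355.2, 4919.1),
    "DO": (350.0, 4889.4),
    "IFL": (122.7, 3964.4),
    "AIS": (0.0629, 2.161),
    "NRR56": (0.0027, 0.234),
}


def heritability(sigma2_a, sigma2_e):
    """Narrow-sense heritability: sigma2_a / (sigma2_a + sigma2_e)."""
    return sigma2_a / (sigma2_a + sigma2_e)


h2 = {trait: heritability(va, ve) for trait, (va, ve) in TABLE3.items()}
```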
Table 4. Spearman rank correlations between the breeding values predicted using different models for sires with at least 20 daughter records in the whole data set (DATAT), and the number of common individuals in the 100 top-ranking sires (in parentheses) from different models1
Model     ICF           CI            DO            IFL           AIS           NRR56
AM-SDM    0.9995 (98)   0.9996 (99)   0.9996 (99)   0.9997 (98)   0.9997 (98)   0.9997 (99)
AM-SM     0.9637 (77)   0.9667 (68)   0.9663 (69)   0.9648 (75)   0.9604 (65)   0.9518 (67)
SDM-SM    0.9624 (76)   0.9656 (69)   0.9653 (68)   0.9648 (73)   0.9602 (65)   0.9515 (66)
1 ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service; AM = animal model; SDM = sire-dam model; SM = sire model.
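The two statistics reported in Table 4 (a Spearman rank correlation and the number of sires shared by two top-100 lists) can be computed as in the following sketch. It uses only the Python standard library; the function names and the toy EBV values are hypothetical, and "top-ranking" is assumed here to mean the largest EBV, which may not match the sign convention used for each trait in the paper.

```python
def average_ranks(values):
    """Ranks 1..n, with tied values receiving their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def top_overlap(ebv_a, ebv_b, top=100):
    """Number of sires in the top `top` of both dicts (sire id -> EBV)."""
    best_a = set(sorted(ebv_a, key=ebv_a.get, reverse=True)[:top])
    best_b = set(sorted(ebv_b, key=ebv_b.get, reverse=True)[:top])
    return len(best_a & best_b)


# Toy EBV for 4 hypothetical sires under two models.
ebv_am = {"s1": 1.2, "s2": 0.4, "s3": -0.3, "s4": 0.9}
ebv_sm = {"s1": 1.0, "s2": 0.7, "s3": -0.1, "s4": 0.5}
sires = sorted(ebv_am)
rho = spearman([ebv_am[s] for s in sires], [ebv_sm[s] for s in sires])
shared = top_overlap(ebv_am, ebv_sm, top=2)
```

The toy example shows why both statistics are reported: a rank correlation can stay high while the membership of the top group still changes.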
Table 4 gives the Spearman rank correlation and the number of common superior individuals based on EBV from DATAT when using the AM, SDM, and SM for sires that had at least 20 daughter records in DATAT. The correlations between EBV from the AM and SDM were very large (more than 0.999) for all traits. The correlations between the SM and AM ranged from 0.95 to 0.97. The number of common individuals in the 100 top-ranking sires from the AM and SDM was at least 98 for all the fertility traits. The top-ranking sires between the SM and AM or SDM were inconsistent. The number of common individuals in the 100 top-ranking sires from the SM and AM ranged from 65 for AIS to 77 for ICF. For each alternative model, the rank correlations between EBV from DATAA and DATAB were calculated for sires with at least 20 daughter records in DATAT (Table 5). The correlations ranged from 0.54 to 0.65 for different traits and models. The AM and SDM led to almost the same correlations. Correlations for the SM were lower than those for the AM and SDM in all the traits, particularly for NRR56, AIS, and IFL. Averaged over the 6 traits, correlation for the AM was 5% higher than that for the SM. Table 6 gives the rank correlations between EBV from DATAT and subset DATAC1 for proven bulls with at least 100 daughters in DATAT. The correlations for different traits ranged from 0.51 to 0.68 when using the SM and from 0.62 to 0.77 when using the AM or SDM, and the correlations were higher for ICF, CI, and DO than for IFL, AIS, and NRR56. The consistency of EBV calculated from the 2 data sets was better for the AM and SDM than that for the SM for all the traits, particularly for NRR56, AIS, and IFL. On average, the correlation for the AM or SDM was increased by 13%, compared with the SM. From the cross-validation (DATAA vs. 
DATAB), the rank correlations between EBV from one data set and YD from the other data set were calculated for sires that appeared in the test data and that had at least 20 daughter records in DATAT (Table 7). The correlations were highest for ICF, followed by CI and DO, and then
IFL and AIS, and were lowest for NRR56. The rank was consistent with the rank of estimated heritabilities for these traits. With regard to the models, the AM and SDM resulted in higher correlations than the SM for all the traits, with an average increase of 12%. Table 8 gives the reliability of EBV based on DATAT for sires that had at least 20 daughters with records and for the cows with their own records. The reliabilities of EBV estimated from the SM were lower than those from the AM or SDM for all the traits. The reliabilities of sire EBV ranged from 0.27 to 0.52, with an average of 0.42 when using the SM, and from 0.31 to 0.56, with an average of 0.46 when using the AM or SDM. For cow EBV, the reliabilities ranged from 0.17 to 0.23, with an average of 0.21 when using the SM, and from 0.21 to 0.31, with an average of 0.27 when using the AM. The reliabilities for different traits were consistent with the heritabilities of the traits, and were largest for ICF, followed by CI and DO, and then IFL and AIS, and were smallest for NRR56. The reliabilities of EBV for different groups of bulls and cows are shown in Table 9. Based on DATADH, the reliabilities of EBV for sires with at least 20 daughter records ranged from 0.37 for NRR56 to 0.62 for ICF. The reliabilities for the young sires (which had only
Table 5. Spearman rank correlations between EBV from 2 data subsets (DATAA and DATAB) for sires with at least 20 daughter records in the whole data set (DATAT)1
Trait    AM       SDM      SM
ICF      0.648    0.646    0.642
CI       0.648    0.646    0.636
DO       0.651    0.649    0.640
IFL      0.637    0.636    0.601
AIS      0.661    0.661    0.628
NRR56    0.618    0.619    0.538
1 AM = animal model; SDM = sire-dam model; SM = sire model; ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
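The averages quoted for this criterion (0.61 for the SM versus 0.64 for the AM, a gain of about 5%) can be recovered from Table 5. The sketch below assumes that the reported percentage is the relative difference between the trait-averaged correlations; the paper does not state the exact calculation, and the names used are hypothetical.

```python
# Correlations copied from Table 5, in the order ICF, CI, DO, IFL, AIS, NRR56.
TABLE5 = {
    "AM": [0.648, 0.648, 0.651, 0.637, 0.661, 0.618],
    "SDM": [0.646, 0.646, 0.649, 0.636, 0.661, 0.619],
    "SM": [0.642, 0.636, 0.640, 0.601, 0.628, 0.538],
}


def mean(values):
    return sum(values) / len(values)


def relative_gain(new, base):
    """Percentage by which the mean of `new` exceeds the mean of `base`."""
    return 100.0 * (mean(new) / mean(base) - 1.0)


avg = {model: mean(r) for model, r in TABLE5.items()}
gain_am_over_sm = relative_gain(TABLE5["AM"], TABLE5["SM"])
```

The same two functions can be applied to Tables 6 and 7 to check the 13% and 12% figures given in the text.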
Table 6. Spearman rank correlations between EBV from the whole data set (DATAT) and data subset (DATAC1)1 for proven sires (n = 235) with more than 100 daughter records in DATAT2
Trait    AM       SDM      SM
ICF      0.773    0.772    0.683
CI       0.689    0.686    0.646
DO       0.686    0.686    0.649
IFL      0.643    0.640    0.548
AIS      0.620    0.625    0.522
NRR56    0.627    0.626    0.511
1 DATAC1 contained only the records of first-crop daughters of sires in DATAT.
2 AM = animal model; SDM = sire-dam model; SM = sire model; ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
first-crop daughters) with at least 20 daughter records ranged from 0.35 to 0.61, and the values were between 0.23 and 0.32 for cows with their own records. Reliabilities of EBV for cows that did not have their own records ranged from 0.16 to 0.24, based on DATADHR.
DISCUSSION
The SM is currently used for genetic evaluation of fertility traits in most countries (Interbull, 2009). In theory, an SM is inferior to an AM, but this theory has not been assessed with regard to the ability to predict breeding value of fertility traits in dairy cattle. In particular, there is a lack of information on the benefit to genetic evaluation of replacing the SM with the AM. In the present study, comparisons were carried out between the SM, SDM, and AM using different criteria, including a cross-validation. The results confirmed that the AM had a superior ability to predict breeding value in terms of stability and accuracy of the EBV. Based on the present data, the correlations between sire EBV from the AM and SDM were close to unity for all traits, whereas the correlations between sire EBV from the AM and SM ranged from 0.95 to 0.97 for different fertility traits. Ferreira et al. (1999) analyzed the growth traits of beef cattle using different statistical models and found that the correlations between EBV from the AM and SDM were greater than those between the AM and SM. Although the correlations between EBV from the AM and SM were large, only 65 to 77 individuals were common in the groups of the top 100 sires from the AM and from the SM, suggesting that the AM and SM could lead to a considerable reranking of bull merits. The SM has the advantage of less computational demand, and might have good predictive properties under
the conditions that no genetic relationship exists between the sire and dam, that no genetic relationship exists between dams, and that mating is random. However, the assumed conditions necessary for accurate and unbiased EBV using an SM are frequently violated in current dairy populations. If mates are chosen in some nonrandom manner, and if the model does not account for mating schemes, sire evaluations may be adversely affected and could be biased (Schaeffer, 1983). In practical dairy cattle breeding, elite sires may be used to mate superior cows (positive assortative mating) to produce better offspring, or may be used to mate inferior cows (negative assortative mating) for genetic improvement. With the SM, sire EBV could be overestimated because of positive assortative mating or could be underestimated because of negative assortative mating. In addition, the disadvantage of the SM could be more obvious in genetic evaluations based on data collected over many years from a population under parental selection, because the model does not account for the selection effects on mates. This problem, however, is less likely for traits such as fertility, for which little direct selection is applied relative to traits associated with productivity. Nevertheless, models that adequately incorporate the genetic relationship among both males and females could overcome the problem. Everett et al. (1979) and Schaeffer (1983) argued that a sire-maternal grandsire model could reduce bias, compared with the SM. In the present study, the stability of sire EBV using different models was evaluated by the correlation between EBV obtained from 2 subsets. The AM and SDM led to higher stability of sire EBV than did the
Table 7. Spearman rank correlations between yield deviation (YD) from test data (DATAA or DATAB) and EBV from training data (DATAB or DATAA) for sires that appeared in the test data and had at least 20 daughter records in the whole data set (DATAT)1
Test data            Trait    AM       SDM      SM
DATAA (n = 3,937)    ICF      0.322    0.323    0.312
                     CI       0.317    0.318    0.278
                     DO       0.318    0.319    0.280
                     IFL      0.205    0.206    0.166
                     AIS      0.195    0.195    0.168
                     NRR56    0.114    0.114    0.105
DATAB (n = 3,899)    ICF      0.332    0.331    0.310
                     CI       0.312    0.313    0.285
                     DO       0.312    0.312    0.284
                     IFL      0.193    0.193    0.161
                     AIS      0.171    0.172    0.145
                     NRR56    0.064    0.065    0.059
1 AM = animal model; SDM = sire-dam model; SM = sire model; ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
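The reliability criterion from the Model Comparison section (reliability as 1 minus the ratio of prediction error variance to genetic variance) can be illustrated with a deliberately simplified sire model. The sketch below is not the computation used in the study: it assumes unrelated sires and known fixed effects, so that the genetic part of the mixed model coefficient matrix is diagonal and can be inverted element by element, and it assumes the usual conversion of animal model variances to sire model variances (sire variance equal to one-quarter of the additive variance). The function name is hypothetical; the variance components are those of ICF in Table 3.

```python
# Variance components for ICF copied from Table 3 (animal model, DATAT).
SIGMA2_A = 108.2
SIGMA2_E = 1223.6


def sire_model_reliability(n_daughters, sigma2_a=SIGMA2_A, sigma2_e=SIGMA2_E):
    """Reliability of a sire effect from n daughters, one record each.

    Assumptions (not from the paper): unrelated sires, known fixed effects.
    """
    sigma2_s = sigma2_a / 4.0               # sire variance
    sigma2_es = sigma2_e + 0.75 * sigma2_a  # residual variance in the sire model
    lam = sigma2_es / sigma2_s              # variance ratio added to the diagonal
    # Diagonal element of the coefficient matrix is n + lam; its inverse,
    # scaled by the residual variance, is the prediction error variance.
    pev = sigma2_es / (n_daughters + lam)
    return 1.0 - pev / sigma2_s


rel_20 = sire_model_reliability(20)      # about 0.29
rel_1279 = sire_model_reliability(1279)  # about 0.96
```

Under these simplifications the expression reduces to n / (n + lam), which shows why reliability rises toward 1 as the number of daughters grows. The reliabilities in Tables 8 and 9 come from the full coefficient matrix with pedigree relationships and estimated fixed effects, so they will not match this simplified calculation.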
Table 8. Reliabilities of EBV based on the whole data set (DATAT) using the animal model (AM), sire-dam model (SDM), and sire model (SM) for the sires having at least 20 daughter records, and for the cows having their own records1
Animal    Model    N          ICF      CI       DO       IFL      AIS      NRR56
Sire      AM       4,201      0.563    0.534    0.533    0.421    0.410    0.311
Sire      SDM      4,201      0.564    0.535    0.534    0.421    0.410    0.310
Sire      SM       4,201      0.520    0.490    0.489    0.377    0.367    0.269
Cow       AM       471,742    0.314    0.300    0.300    0.253    0.249    0.209
Cow       SM       471,742    0.233    0.228    0.228    0.203    0.200    0.170
1 ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
SM, which was particularly clear for NRR56. Ramirez-Valverde et al. (2001) analyzed calving difficulty in beef cattle and reported that a linear AM had higher stability than a linear sire-maternal grandsire model, based on the correlation between EBV from 2 random subsets for sires with 50 or fewer progeny records in the data set. The expected reliability of EBV, which is important information for genetic evaluation, is usually released together with EBV of the candidates. Comparison of the expected reliability of EBV suggested that the AM could provide EBV with higher accuracy than the SM for all fertility traits. The AM allows all relatives to contribute to the evaluation of an animal, whereas the SM uses less information on the relatives. As expected, the reliability of sire EBV from the AM was higher than that from the SM for all 6 fertility traits. The prediction error for sire EBV decreases with an increasing number of daughters. Veerkamp et al. (2001) reported that accuracies of EBV for fertility would approach 1 when the number of daughters was more than 1,000. In this study, the proven sires had 1,279 daughters on average in DATAT. Therefore, EBV calculated based on DATAT for these proven sires would be close to the true breeding values, and the correlation between EBV from DATAC1 and from DATAT for these
bulls could be a good indicator of the accuracy of sire EBV. According to this criterion, the AM and SDM showed better predictive ability than the SM. Cross-validation is a popular method to validate the predictive ability of models. In a typical cross-validation in which data are split into 2 subsets, predicted values based on the estimates from one subset are compared with observations in the other subset. To account for different contemporary effects, in this study, the cross-validation was carried out by comparing EBV from one subset with YD from the other subset, instead of with original observations. The cross-validation again suggests that the AM and SDM had better predictive ability for sire EBV than did the SM. A potential advantage of the SDM over the AM for genetic evaluation is a reduced computational demand. If there are no records on the dams, the SDM is equivalent to the AM with regard to sire EBV. In the present data, the dams of some cows (40%) also had their own records, indicating that the AM and SDM were not equivalent even though, based on the validation criteria in the present study, the SDM performed as well as the AM. However, in cattle data, using an SDM rather than an AM would not reduce computational demands dramatically. As shown in the present study, the number of equations in the SDM was 70% of that
Table 9. Reliabilities of EBV using the animal model (AM), for all the sires with at least 20 daughter records (Sire_all), young sires (excluding proven sires) with at least 20 daughter records (Sire_young), and cows with their own records (Cow), estimated using AM based on DATADH,1 as well as reliabilities of EBV of cows without their own records (Cow_0), based on DATADHR2
Group         N            ICF      CI       DO       IFL      AIS      NRR56
Sire_all      5,837        0.620    0.591    0.590    0.478    0.467    0.366
Sire_young    4,767        0.610    0.581    0.579    0.463    0.453    0.351
Cow           1,050,494    0.323    0.311    0.310    0.268    0.265    0.231
Cow_0         198,228      0.235    0.228    0.228    0.199    0.202    0.162
1 DATADH = the data set that included the records of first lactation from the whole Danish Holstein population during the period from 1995 to 2004.
2 DATADHR = the reduced data set created by excluding the records of the last 2 insemination years from DATADH. ICF = days from calving to first insemination; CI = calving interval; DO = days open; IFL = days from the first to last insemination; AIS = number of inseminations per conception; NRR56 = nonreturn rate within 56 d after first service.
in the AM. On the other hand, the SDM cannot use a cow’s own record to predict its breeding value. This is a serious restriction because the EBV of dams of young bulls are important for preselection of young bulls to enter a progeny test.

One of the advantages of the AM is that the model allows the cow’s own record to be used directly to predict its breeding value, whereas in the SM and SDM, the cow’s record is used only as a progeny record and its EBV is obtained through its relatives. Averaged over the 6 traits, the reliability of cow EBV using the AM was 1.3 times as high as that using the SM in the present study. Young bulls are usually preselected using the parent average, which is calculated from the EBV of the sire and dam if the EBV of the dam is available, and otherwise from the EBV of the sire and maternal grandsire. Despite the low heritabilities of fertility traits, reliabilities of EBV for cows ranged from 0.231 to 0.323, based on the analysis of first-lactation data from the whole Danish Holstein population from 1995 to 2004. These values were more than half the reliabilities of sire EBV. Therefore, reliabilities of the parent average based on the EBV of the sire and dam were clearly greater than those based on the EBV of the sire and maternal grandsire.

Reliabilities of EBV differed between traits. The ICF, CI, and DO had larger reliabilities of EBV than did the IFL, AIS, and NRR56, which was consistent with the differences in heritabilities among these traits. Reverter (1998) compared methods of estimating genetic parameters using simulated data with different levels of heritability and found that higher heritability corresponded to higher accuracy of EBV. In addition, the categorical or binary nature of AIS and NRR56 could, in theory, contribute to poor accuracy of EBV for these traits when using a linear Gaussian model. However, Andersen-Ranberg et al.
(2005) studied NRR56 and ICF using a bivariate threshold-linear model and a linear-linear model, and did not arrive at consistent conclusions regarding the predictive ability of the 2 kinds of models, as measured by the χ² statistic for expected and observed NRR56 from different data sets.

In the present study, the comparisons between the AM, SM, and SDM were based on data for primiparous cows only. This selection of data could bias the prediction of breeding values (Henderson 1975a,b). It is not clear whether such a bias would be the same for the AM, SM, and SDM.

Genetic evaluation of fertility traits using the conventional linear model may not be the best approach because of censoring and the lack of normality in the data. Several studies have made comparisons between the conventional linear model and more sophisticated models for genetic evaluation of fertility traits
in dairy cattle. Schneider et al. (2005, 2006), performing simulation studies, reported that the sire EBV of DO from a Weibull proportional hazards model and the sire EBV of IFL from a grouped-data proportional hazards model had a higher correlation with the true breeding values of conception rate than did sire EBV from a conventional linear model. In a study on AIS using a discrete proportional hazards model, an ordinal censored threshold model, and a sequential threshold model, González-Recio et al. (2005) found that the sequential threshold model had greater predictive ability for daughter fertility at first insemination and that the ordinal censored threshold model predicted daughter fertility more accurately in subsequent inseminations. González-Recio et al. (2006) reported that the Weibull proportional hazards model and the censored linear model predicted the DO of daughters better than the conventional linear model. Hou et al. (2009) reported that the Cox proportional hazards model with a piecewise constant baseline hazard function led to higher stability and accuracy of sire EBV for ICF and DO than did the other 4 models in the study. These results indicate that it is still unclear which alternative model should be used in the genetic evaluation of fertility traits. In addition, more sophisticated models require more computational resources and complicate the index of total merit, because EBV from such models are often not on the same scale as observed performance.

CONCLUSIONS
The results from this study indicate that the AM can yield increased stability and accuracy of sire genetic evaluation of fertility traits in dairy cattle compared with the SM. The SDM performs as well as the AM, but does not dramatically reduce computational demands. In addition, the AM has the advantage of providing EBV of cows with moderate reliability. Therefore, we recommend the AM for genetic evaluation of fertility traits.

ACKNOWLEDGMENTS
This study is a part of the project “Improved models and key indicators for genetic evaluation and management to improve reproduction.” We acknowledge the Danish Cattle Federation (Aarhus, Denmark) for funding the project and providing data.

REFERENCES
Andersen-Ranberg, I. M., B. Heringstad, D. Gianola, Y. M. Chang, and G. Klemetsdal. 2005. Comparison between bivariate models for 56-day nonreturn and interval from calving to first insemination in Norwegian Red. J. Dairy Sci. 88:2190–2198.
Dematawewa, C. M. B., and P. J. Berger. 1998. Genetic and phenotypic parameters for 305-day yield, fertility, and survival in Holsteins. J. Dairy Sci. 81:2700–2709.
Donoghue, K. A., R. Rekaya, and J. K. Bertrand. 2004. Comparison of methods for handling censored records in beef fertility data: Field data. J. Anim. Sci. 82:357–361.
Esslemont, R. J. 1993. Relationship between herd calving to conception interval and culling rate for failure to conceive. Vet. Rec. 133:163–164.
Everett, R. W., R. L. Quaas, and A. E. McClintock. 1979. Daughters’ maternal grandsires in sire evaluation. J. Dairy Sci. 62:1304–1313.
Ferreira, G. B., M. D. MacNeil, and L. D. Van Vleck. 1999. Variance components and breeding values for growth traits from different statistical models. J. Anim. Sci. 77:2641–2650.
González-Recio, O., Y. M. Chang, D. Gianola, and K. A. Weigel. 2005. Number of inseminations to conception in Holstein cows using censored records and time-dependent covariates. J. Dairy Sci. 88:3655–3662.
González-Recio, O., Y. M. Chang, D. Gianola, and K. A. Weigel. 2006. Comparison of models using different censoring scenarios for days open in Spanish Holstein cows. Anim. Sci. 82:233–239.
Henderson, C. R. 1975a. Best linear unbiased estimation and prediction under a selection model. Biometrics 31:423–447.
Henderson, C. R. 1975b. Comparison of alternative sire evaluation methods. J. Anim. Sci. 41:760–770.
Hou, Y., P. Madsen, R. Labouriau, Y. Zhang, M. S. Lund, and G. Su. 2009. Genetic analysis of days from calving to first insemination and days open in Danish Holsteins using different models and censoring scenarios. J. Dairy Sci. 92:1229–1239.
Hudson, G. F. S., and L. R. Schaeffer. 1984. Monte Carlo comparison of sire evaluation models in populations subject to selection and nonrandom mating. J. Dairy Sci. 67:1264–1272.
Interbull. 2009. National GES information. http://www-interbull.slu.se/national_ges_info2/framesida-ges.htm Accessed Jan. 22, 2009.
Jorjani, H. 2006. International genetic evaluation for female fertility traits. Interbull Bull. 35:42–46.
Jorjani, H. 2007. International genetic evaluation of female fertility traits in five major breeds. Interbull Bull. 37:144–147.
Madsen, P., and J. Jensen. 2007. A User’s Guide to DMU, Version 6, Release 4.7. Faculty of Agricultural Sciences, University of Aarhus, Aarhus, Denmark.
Madsen, P., P. Sørensen, G. Su, L. H. Damgaard, H. Thomsen, and R. Labouriau. 2006. DMU—A package for analyzing multivariate mixed models. In Book of Abstracts, Proc. 8th World Congr. Genet. Appl. Livest. Prod., Belo Horizonte, Brazil, Commun. No. 07-06.
Olori, V. E., T. H. E. Meuwissen, and R. F. Veerkamp. 2002. Calving interval and survival breeding values as measure of cow fertility in a pasture-based production system with seasonal calving. J. Dairy Sci. 85:689–696.
Pryce, J. E., M. D. Royal, P. C. Garnsworthy, and I. L. Mao. 2004. Fertility in high-producing dairy cow. Livest. Prod. Sci. 86:125–135.
Ramirez-Valverde, R., I. Misztal, and J. K. Bertrand. 2001. Comparison of threshold vs. linear and animal vs. sire models for predicting direct and maternal genetic effects on calving difficulty in beef cattle. J. Anim. Sci. 79:333–338.
Reverter, A. 1998. Empirical evidence of the optimality of method R estimates. In Book of Abstracts, Proc. 6th World Congr. Genet. Appl. Livest. Prod., Armidale, Australia. Commun. No. 25533.
Schaeffer, L. R. 1983. Effectiveness of model for cow evaluation intraherd. J. Dairy Sci. 66:874–880.
Schneider, M. del P., E. Strandberg, V. Ducrocq, and A. Roth. 2005. Survival analysis applied to genetic evaluation for female fertility in dairy cattle. J. Dairy Sci. 88:2253–2259.
Schneider, M. del P., E. Strandberg, V. Ducrocq, and A. Roth. 2006. Short communication: Genetic evaluation of the interval from first to last insemination with survival analysis and linear models. J. Dairy Sci. 89:4903–4906.
Urioste, J. I., I. Misztal, and J. K. Bertrand. 2007. Fertility traits in spring-calving Aberdeen Angus cattle. 1. Model development and genetic parameters. J. Anim. Sci. 85:2854–2860.
Veerkamp, R. F., E. P. C. Koenen, and G. De Jong. 2001. Genetic correlations among body condition score, yield, and fertility in first-parity cows estimated by random regression models. J. Dairy Sci. 84:2327–2335.
Wall, E., S. Brotherstone, J. A. Wooliams, G. Banos, and M. P. Coffey. 2003. Genetic evaluation of fertility using direct and correlated traits. J. Dairy Sci. 86:4093–4102.
Weller, J. I., and E. Ezra. 1997. Genetic analysis of somatic cell score and female fertility of Israeli Holsteins with an individual animal model. J. Dairy Sci. 80:586–593.
Westell, R. A., E. B. Burnside, and L. R. Schaeffer. 1992. Evaluation of Canadian Holstein-Friesian sires on disposal reasons of their daughters. J. Dairy Sci. 65:2366–2372.
Journal of Dairy Science Vol. 92 No. 8, 2009