Effect of type I error threshold on marker-assisted selection in dairy cattle


Livestock Production Science 85 (2004) 189–199 www.elsevier.com/locate/livprodsci


C. Israel a, J.I. Weller b,*

a Department of Genetics, Hebrew University of Jerusalem, Jerusalem 91904, Israel
b Institute of Animal Sciences, A.R.O., The Volcani Center, P.O. Box 6, Bet Dagan 50250, Israel

Received 29 January 2002; received in revised form 1 November 2002; accepted 6 May 2003

Abstract

Two markers bracketing a quantitative gene with a substitution effect of 0.5 or 0.3 phenotypic standard deviations were simulated, with recombination frequencies of 0.1 and 0.2 between the markers and the quantitative gene. Ten simulated populations, each with 20 sires heterozygous for both markers and varying numbers of daughters, were analyzed for each combination of gene effect and allele frequency. Sire quantitative gene genotype was determined by the regression of the daughter genetic evaluations on their paternal markers. Sires were determined to be heterozygous by an F-test of the ratio of model to residual mean squares. Three values for the probability of type I error were simulated: 0.05, 0.10 and 0.20. Marker allele effects were then included in an animal model analysis of the simulated populations. The algorithm of Whittaker et al. [Heredity 77 (1996) 23] was used to estimate gene effects and location. Estimates for both effect and location of the quantitative gene were nearly unbiased. Cow genetic evaluations were always more accurate by the model proposed than by a standard animal model. The increase of genetic gain due to marker-assisted selection of young sires, as compared to dam selection, was between 2% and 15%. Type I error rate did not appreciably affect selection decisions.

© 2003 Elsevier B.V. All rights reserved.

Keywords: Animal model; Dairy cattle; Genetic marker; Marker-assisted selection; Quantitative trait loci

* Corresponding author. Tel.: +972-8-948-4430; fax: +972-8-947-0587. E-mail address: [email protected] (J.I. Weller).
0301-6226/$ - see front matter © 2003 Elsevier B.V. All rights reserved. doi:10.1016/S0301-6226(03)00136-2

1. Introduction

Most economically important traits of dairy cattle are determined by the joint effects of many environmental and genetic factors. Several studies showed that the rate of genetic gain could be increased by marker-assisted selection (MAS) on identified quantitative trait loci (QTL) (reviewed by Spelman et al., 1999). Segregating QTL affecting these traits have been identified with the aid of genetic markers (Heyen et al., 1999; Zhang et al., 1998). Potentially, MAS can increase the rate of genetic gain by increasing the accuracy of evaluation, increasing the selection differential, or decreasing the generation interval (Weller and Fernando, 1991). A priori, MAS should be able to make a major contribution to dairy cattle breeding, because the important economic traits are expressed only in females, which have very limited fertility. Traditionally, dairy cattle breeding programs rely on progeny testing of young sires, based on relatively small samples of daughters. Sires with superior evaluations are then returned to general service. However, by the time these evaluations are available, the sires are at least five years old, even though bulls reach sexual maturity at the age of 1 year. Thus the average generation interval along the sire-to-dam path is much greater than required by purely biological considerations.

Even once segregating QTL have been identified via linked genetic markers, application of MAS is nontrivial. Only a fraction of the total genetic variation will be detected by linkage studies. Thus it will be necessary to incorporate MAS into existing breeding programs. In addition, linkage relationships between QTL and genetic markers will be different in different families, and will tend to break down over time.

Two basic strategies have been proposed to apply MAS in dairy cattle. Kashi et al. (1990) proposed preselection of young sires based only on marker information. Segregating QTL in widely used sires are first detected by application of a daughter or granddaughter design (Weller et al., 1990), and sons of these sires with the favorable QTL alleles are selected for progeny testing. In the case of a type I error (assuming incorrectly that a sire is heterozygous for a QTL), selection pressure based on linked marker genotypes is wasted, because all sons will receive the same paternal QTL allele. In the case of a type II error (a segregating QTL is not detected), sons carrying the favorable paternal QTL allele are not selected. Several studies found that genetic progress is maximized with a relatively large type I error rate, because the type II error rate is reduced (Kashi et al., 1990; Mackinnon and Georges, 1998; Spelman and Garrick, 1998). None of these studies addressed the problem that candidates for selection can have unequal genetic evaluations based on pedigree and phenotypic information.
In this case it will be necessary to correctly weight the different types of information available.

Fernando and Grossman (1989) first proposed that QTL effects could be incorporated into an individual animal model analysis as random effects. A breeding value consisting of the polygenic and QTL effects is computed for each individual, and these estimated breeding values (EBV) are then used to rank candidates for selection within or across families. They assumed that for each QTL included in the model, each individual with unknown parents received two different alleles sampled from a continuous distribution with a known variance. Thus, two additional equations must be added to the analysis for each individual for each QTL included in the analysis. However, most QTL studies to date tend to indicate that most individuals are homozygous for any particular QTL (e.g. Heyen et al., 1999). Thus the effective number of alleles for any specific segregating QTL is generally low. Fernando and Grossman (1989) also assumed that all individuals included in the analysis were genotyped for all markers. In practice only a small fraction of the population will be genotyped.

Israel and Weller (1998) presented a method to include effects of a known gene as a fixed effect in the model, even if only a small fraction of the population is genotyped. They assumed that only two QTL alleles are segregating in the population. If additive effects are assumed for the segregating QTL, then only a single equation is added to the analysis for each QTL included. Similar to the method of Fernando and Grossman (1989), EBV are obtained for each individual consisting of the polygenic and marked QTL effects. These EBV can then be used to rank candidates for selection within or across families. Recently, Israel and Weller (2002) extended this method to estimate a QTL bracketed between two markers, based on the algorithm of Whittaker et al. (1996). Nearly unbiased estimates were obtained for QTL effect and location.

Kadarmideen and Dekkers (1999) extended the method of Whittaker et al. (1996) to situations of uncertain parental allele transmission. This will be the case in an outbreeding population if only one parent is genotyped, and both parent and progeny have the same genotype (Ron et al., 1993). QTL effects can be estimated across families with this method, but it must be assumed that the same alleles are segregating in all families, and that sire QTL genotype and linkage phase are known. Thus, similar to Kashi et al. (1990) and Mackinnon and Georges (1998), it is necessary to conduct a preliminary analysis to determine which sires are heterozygous for the segregating QTL and their phase.

The objective of this study was to determine the optimal type I error rate for deciding which sires are heterozygous for segregating QTL of varying magnitudes, based on simulated populations. The effect of three different values for type I error on QTL mapping and on accuracy of the estimated breeding values and selection of elite cows was investigated. In addition, MAS of young sires was considered, by calculating the expected transmitting ability of the elite dams, taking into account the potential contribution of selection among their sons using the marker information of the dam.

2. Materials and methods

2.1. Simulations

A single sex-limited trait with a heritability of 0.25 was simulated. For cows, the phenotypic record was computed as the sum of the additive genetic, QTL, permanent environmental, herd-year-season (HYS), and residual effects. All effects except for QTL were simulated from normal distributions with means of zero. Variances were 0.25 for the HYS effect, 0.25 for the total additive genetic variance including the variance due to the QTL, 0.25 for the permanent environmental effect, and 0.5 for the residual. Therefore, the total phenotypic variance excluding the HYS effect, which was considered fixed, was equal to unity.

A codominant diallelic QTL was simulated with a Hardy–Weinberg distribution of genotypes for all individuals. Three values were simulated for the QTL additive effect, a = 0.5, 0.3, and 0. The 'additive effect' is defined as half the difference between the mean values of the two QTL homozygotes. Populations were simulated with PQ = 0.5 and with PQ = 0.3, where PQ is the frequency of the QTL allele with the positive effect. Two genetic markers bracketing the QTL were simulated with recombination frequencies of r1 = 0.1 and r2 = 0.2 between the QTL and the two markers. Zero recombination interference was assumed. Thus the recombination frequency between the two flanking markers was:

$$R = r_1 + r_2 - 2 r_1 r_2 = 0.26$$

The QTL was simulated at an asymmetric location relative to the marker bracket, first because this more closely reflects reality, and second to determine if the estimated QTL location is biased relative to the center of the marker interval, as some previous studies have found (Weller, 2001).

Twenty sires were simulated for each simulated population. Each sire's QTL genotype was simulated by sampling twice from a uniform distribution. If a value smaller than PQ was obtained, then it was assumed that the sire received the positive QTL allele; otherwise the sire received the negative QTL allele. Each daughter had a probability of one half to receive either paternal allele. The maternal QTL allele was determined by sampling independently from a uniform distribution, according to the simulated frequencies of the two QTL alleles. All sires were assumed to be heterozygous for both markers with known linkage phase. Linkage phase for the QTL with respect to the genetic markers was not known. It was further assumed that the sire alleles were sufficiently rare so that the paternal allele could be determined unequivocally for all genotyped daughters (Ron et al., 1993). Significance of the difference between the observed number of heterozygous sires for the QTL and the expected number based on the simulated allelic frequencies and a Hardy–Weinberg distribution of genotypes was determined by a χ² test for 'goodness-of-fit'.

From 100 to 1000 daughters were simulated for each sire. The number of daughters per sire was determined as 10 multiplied by the base 10 antilog of a randomly generated variable from a uniform distribution between 1 and 2. The mean number of daughters was 380. A different dam was assumed for each cow. All dams were assumed to be unrelated to each other and to the sires. Since the scheme consisted of only two generations, parents and progeny, and only progeny records were analyzed, records of dams were not included in the analysis. From one to five lactation records were simulated for each cow. The conditional probability for each additional parity record was set at 0.6. One hundred HYS were simulated. Daughters of the same sire and lactations of the same cow were assigned to HYS nonrandomly, as described previously (Israel and Weller, 1998). All the daughters were genotyped for both markers.

2.2. Analysis models

Two analysis models were considered. AM, a standard animal model, was:

$$y = Xh + Z_1 p + Z_2 u + e$$

where y = vector of production records; X, Z1 and Z2 = incidence matrices for fixed and random effects; and h, u, p, and e = unknown vectors of solutions for HYS, additive genetic, permanent environment, and residual effects. The HYS effects were assumed fixed; all other effects were assumed to be random. Solutions were derived by iteration of the mixed model equations, which included the complete relationship matrix and grouping of individuals with unknown parents, as described by Israel and Weller (1998). All dams were assigned to a single phantom group, and all parents of all sires were assigned to an additional phantom group. This should not have affected the estimates of QTL effects, which were only followed from sires to daughters.

AMQ, a modified AM, included the effects of the two linked markers as regression effects, as follows:

$$y = Xh + Mb + Z_1 p + Z_2 u + e$$

where M = the matrix of indicator variables for the paternal alleles of the two markers, b = the vector of marker solutions, and the other terms are as described for the AM model. M has two columns and rows equal to the number of animals with records, and b has two elements for the paternal allele substitution effect of each marker. For the daughters of sires assumed to be heterozygous for the QTL, the elements of M are either zero or 1, depending on the allele they received from their sire: one if they received the marker allele assumed to be linked to the positive QTL allele, and zero otherwise. If the sire was assumed to be homozygous, the elements of M are both 0.5.

Sire QTL genotype and phase were determined by a regression analysis of the daughter EBV on their paternal alleles for the two markers. Each family was analyzed separately. Sires were determined to be heterozygous for the QTL if the F probability for the ratio of model to residual mean squares was below the specified type I error rate of α. Otherwise sires were assumed to be homozygous for the QTL.
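The family-wise test described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' program: the function names (`sample_sire_genotype`, `simulate_family`, `f_test_two_markers`, `f_sf_2df`) are hypothetical, a simulated trait value with unit residual standard deviation stands in for the daughter EBV, and marker allele 1 is taken to be in coupling with the positive QTL allele. Because the regression has two marker regressors, the upper tail of the F(2, n - 3) distribution has the closed form (1 + 2F/(n - 3))^(-(n - 3)/2), so no statistics library is needed.

```python
import random


def sample_sire_genotype(p_q, rng):
    """Draw two QTL alleles for a sire; 1 = positive allele, 0 = negative."""
    return tuple(1 if rng.random() < p_q else 0 for _ in range(2))


def simulate_family(n, a, r1, r2, rng, heterozygous=True, sd=1.0):
    """Simulate paternal marker alleles and a trait value for n daughters.

    Assumption: marker allele 1 is in coupling with the positive QTL allele,
    and recombination in the two intervals is independent (no interference).
    """
    m1, m2, y = [], [], []
    for _ in range(n):
        q = 1 if rng.random() < 0.5 else 0             # paternal QTL haplotype
        m1.append(q if rng.random() >= r1 else 1 - q)  # left marker allele
        m2.append(q if rng.random() >= r2 else 1 - q)  # right marker allele
        effect = a * q if heterozygous else 0.0
        y.append(effect + rng.gauss(0.0, sd))
    return m1, m2, y


def f_sf_2df(f_stat, df_resid):
    """Upper-tail probability of an F(2, df_resid) variate (closed form)."""
    return (1.0 + 2.0 * f_stat / df_resid) ** (-df_resid / 2.0)


def f_test_two_markers(m1, m2, y):
    """Regress y on two marker indicators; return (F, p_value, b1, b2)."""
    n = len(y)
    mean1, mean2, mean_y = sum(m1) / n, sum(m2) / n, sum(y) / n
    d1 = [v - mean1 for v in m1]
    d2 = [v - mean2 for v in m2]
    dy = [v - mean_y for v in y]
    # Centered sums of squares and cross-products.
    s11 = sum(u * u for u in d1)
    s22 = sum(u * u for u in d2)
    s12 = sum(u * v for u, v in zip(d1, d2))
    s1y = sum(u * v for u, v in zip(d1, dy))
    s2y = sum(u * v for u, v in zip(d2, dy))
    syy = sum(v * v for v in dy)
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    ss_model = b1 * s1y + b2 * s2y
    ss_resid = syy - ss_model
    df_resid = n - 3                      # intercept plus two markers
    f_stat = (ss_model / 2.0) / (ss_resid / df_resid)
    return f_stat, f_sf_2df(f_stat, df_resid), b1, b2


rng = random.Random(1)
n_daughters = round(10 * 10 ** rng.uniform(1, 2))   # 100 to 1000 daughters
m1, m2, y = simulate_family(n_daughters, a=0.5, r1=0.1, r2=0.2, rng=rng)
f_stat, p_value, b1, b2 = f_test_two_markers(m1, m2, y)
alpha = 0.10
print(n_daughters, round(f_stat, 2), "heterozygous" if p_value < alpha else "homozygous")
```

Raising `alpha` makes the sire more likely to be declared heterozygous, which trades type II errors for type I errors.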
QTL phase relative to the genetic markers was determined based on the difference in mean EBV between the two daughter groups that received the two paternal marker haplotypes. Since the linkage phase between the two markers is assumed known, and only two QTL alleles are segregating, there are two possible linkage phases for the QTL. However, even if a sire was determined correctly to be heterozygous for the QTL, it was still possible to incorrectly determine QTL linkage phase. Three values for α were simulated: 0.05, 0.10 and 0.20. The number and type of errors were followed in each simulation.

Marker solutions were then obtained by the AMQ model for the different values of α. QTL effects and location were then derived as described by Whittaker et al. (1996). The regression coefficients for both markers must have the same sign in order to estimate a QTL effect between the markers (Whittaker et al., 1996). If this criterion was not met, the QTL effect and location were not estimated.

Solutions for both models were computed by iteration assuming $\sigma_e^2/\sigma_a^2 = 2$ and $\sigma_e^2/\sigma_p^2 = 2$, where $\sigma_e^2$, $\sigma_a^2$, and $\sigma_p^2$ are the residual, additive genetic (excluding the QTL), and permanent environmental variance components. For PQ = 0.5, the variance contributed by the QTL, $2P_Q(1 - P_Q)a^2$, equals 0.125 for a = 0.5, and 0.045 for a = 0.3. For PQ = 0.3, the variance contributed by the QTL is 0.105 for a = 0.5, and 0.0378 for a = 0.3. Thus, the true variance ratio for $\sigma_e^2/\sigma_a^2$ is between 2 and 4.

Ten replicate simulations were analyzed by both models for each combination of allelic frequencies and QTL effect: a = 0.3 and PQ = 0.5, a = 0.3 and PQ = 0.3, a = 0.5 and PQ = 0.5, a = 0.5 and PQ = 0.3, and a = 0. The parameters for the simulations are summarized in Table 1.
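The derived quantities quoted in this section can be checked with a few lines of Python. The sketch below is illustrative only; the function and constant names are hypothetical, and the polygenic variance is taken as the total additive variance of 0.25 minus the QTL variance, which is how the stated range of 2 to 4 for the true variance ratio is reproduced.

```python
def qtl_variance(p_q, a):
    """Additive variance of a diallelic QTL under Hardy-Weinberg proportions."""
    return 2.0 * p_q * (1.0 - p_q) * a ** 2


def marker_recombination(r1, r2):
    """Recombination between the flanking markers with no interference."""
    return r1 + r2 - 2.0 * r1 * r2


TOTAL_ADDITIVE = 0.25   # total additive variance, including the QTL
RESIDUAL = 0.5
N_SIRES = 20

print("R =", round(marker_recombination(0.1, 0.2), 4))
for a in (0.5, 0.3):
    for p_q in (0.5, 0.3):
        v_qtl = qtl_variance(p_q, a)
        polygenic = TOTAL_ADDITIVE - v_qtl          # additive variance excluding the QTL
        ratio = RESIDUAL / polygenic                # true residual-to-polygenic ratio
        expected_het = N_SIRES * 2.0 * p_q * (1.0 - p_q)
        print(a, p_q, round(v_qtl, 4), round(ratio, 2), round(expected_het, 1))
```

The last printed column is the expected number of heterozygous sires among 20 under Hardy–Weinberg proportions.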

Table 1
Parameters of the simulations

| Parameter | Value |
|---|---|
| Sires | 20 |
| Daughters per sire | 100–1000 |
| Herd-year-seasons (HYS) | 100 |
| Variance components: | |
| HYS (considered as fixed in the analysis models) | 0.25 |
| Additive genetic (including QTL effect) | 0.25 |
| Permanent environmental | 0.25 |
| Residual | 0.5 |
| Lactations per cow | 1–5 |
| Number of loci: | |
| Markers | 2 |
| QTL | 1 |
| Number of alleles per locus: | |
| QTL | 2 |
| Genetic markers | ∞ |
| Frequency of QTL allele Q | 0.3 or 0.5 |
| Marker-QTL recombination frequencies | 0.1 and 0.2 |
| QTL effects: | |
| Additive | 0, 0.3, or 0.5 |
| Dominance | 0 |

2.3. Parameter estimation and comparison between models

Two criteria were employed to evaluate the estimators for the QTL effect and location: mean squared error (MSE), defined as $\sum(\hat{\theta} - \theta)^2/n$, and mean bias, defined as $\sum[(\hat{\theta} - \theta)/\theta]/n$, where θ = a or r1, $\hat{\theta}$ is the estimate of θ, and n = 10, the number of populations simulated for each combination of a and PQ. In addition, significance of the difference between the mean of the estimated values and the simulated value was determined by a t-test.

For the standard AM, EBV for all the cows and sires were computed as the solutions for the additive genetic effect. For the AMQ, the EBV for cow i, EBV_i, was computed as follows:

$$\mathrm{EBV}_i = \hat{u}_i + [m_{1i} \;\; m_{2i}]\,\hat{b}$$

where $\hat{u}_i$ is the additive polygenic solution for cow i, $m_{1i}$ and $m_{2i}$ are the coefficients of M for cow i, and $\hat{b}$ is the solution vector for b.

Correlations were computed between EBV derived by AM and AMQ, and between the simulated breeding values (SBV) and the EBV for all the daughters in each simulated population for each value of α. 'SBV' were defined as the sum of the additive genetic and QTL effects for each individual. The 20 daughters with the highest EBV in AM and AMQ were compared, and also compared to the 20 daughters with the highest SBV in each simulation. The mean SBV of the elite cows were calculated according to selection based on SBV and EBV computed from AM and AMQ.

Within-family selection of candidate bulls from these 20 elite cows was also considered. The genetic gain obtained by selection of young sires, based on their marker genotypes, was estimated by the expected simulated transmitting ability (TA) of their dams. For the AM analysis, no information was available on the bull dams' QTL genotype, and the cows' TA were calculated as half the simulated breeding value. For the AMQ analysis, if the bull dam's sire was determined to be homozygous for the QTL, then no selection was possible among her sons based on marker haplotype. The bull dam's TA was then also computed as half of her SBV. If the bull dam's sire was determined to be heterozygous for the QTL, the following strategy was applied to select candidate bulls among her sons:

1. If the cow had marker coefficients of [1 1], bull calves that received both marker alleles from their grandsire were selected.
2. If the cow had marker coefficients of [0 0], bull calves that received neither marker allele from their grandsire were selected.
3. If the cow had coefficients [1 0] or [0 1], bull calves that received the positive grand-paternal marker allele, but did not receive the negative grand-paternal marker allele, were selected.

The TA was then computed as half of the cow's polygenic effect, plus the expected QTL effect transmitted from the dam, which is a linear function of b1 and b2, where:

$$b_1 = \frac{a\,r_2(1 - r_2)(1 - 2r_1)}{R(1 - R)}$$

$$b_2 = \frac{a\,r_1(1 - r_1)(1 - 2r_2)}{R(1 - R)}$$

as shown by Whittaker et al. (1996). The estimated TA of bull dams selected by AM or AMQ were compared to the optimal simulated TA. If the cow was homozygous for the QTL, then the optimal simulated TA is half of her breeding value, including the QTL effect. If the cow was heterozygous for the QTL, sons inheriting the favorable allele from their mother were chosen without error, and the TA was calculated as half the polygenic breeding value plus a.
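The expressions for b1 and b2 can be evaluated and inverted numerically, as in the Python sketch below. The forward function follows the two formulas as given. The inverse is an algebraic rearrangement made here, not quoted from the source: with no interference, 1 - 2R = (1 - 2r1)(1 - 2r2), from which b1 + (1 - 2R)b2 = a(1 - 2r1) and b2 + (1 - 2R)b1 = a(1 - 2r2). The function names are hypothetical.

```python
import math


def marker_coefficients(a, r1, r2):
    """Expected regression coefficients (b1, b2) for a QTL between two markers."""
    big_r = r1 + r2 - 2.0 * r1 * r2
    denom = big_r * (1.0 - big_r)
    b1 = a * r2 * (1.0 - r2) * (1.0 - 2.0 * r1) / denom
    b2 = a * r1 * (1.0 - r1) * (1.0 - 2.0 * r2) / denom
    return b1, b2


def qtl_from_coefficients(b1, b2, big_r):
    """Recover (a, r1) from marker coefficients, given marker recombination R.

    Returns None when the coefficients differ in sign or are zero, in which
    case no QTL effect is estimated within the bracket.
    """
    if b1 * b2 <= 0.0:
        return None
    c = 1.0 - 2.0 * big_r
    left = b1 + c * b2    # equals a * (1 - 2 * r1)
    right = b2 + c * b1   # equals a * (1 - 2 * r2)
    a = math.copysign(math.sqrt(left * right / c), b1)
    x = math.sqrt(c * left / right)   # x = 1 - 2 * r1
    return a, (1.0 - x) / 2.0


b1, b2 = marker_coefficients(a=0.5, r1=0.1, r2=0.2)
a_hat, r1_hat = qtl_from_coefficients(b1, b2, big_r=0.26)
print(round(b1, 4), round(b2, 4), round(a_hat, 4), round(r1_hat, 4))
```

With exact coefficients the round trip returns the simulated values; with estimated coefficients the same inversion gives the estimates of a and r1.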

3. Results

The mean number of heterozygous sires for the QTL, and the frequencies of type I and type II errors for the different values of α are given in Table 2. The expected number of heterozygous sires is 10 for PQ = 0.5 and 8.4 for PQ = 0.3. Observed values were close to the expected values, and were not significantly different as determined by a χ² test for 'goodness-of-fit'. As expected, the frequency of type I errors increased with increasing α, while the frequency of type II errors decreased. The total number of errors was lowest at α = 10% for a = 0.3 and at α = 5% for a = 0.5. There were fewer errors with a = 0.5, as compared to a = 0.3. QTL linkage phase of heterozygous sires was always determined correctly, if the sire was determined to be heterozygous for the QTL.

Table 2
Mean number of heterozygous sires, and mean number of type I and type II errors^a, for the different probabilities of type I error (α)

| a | PQ | Heterozygous sires | Type I (α = 5%) | Type II (α = 5%) | Type I (α = 10%) | Type II (α = 10%) | Type I (α = 20%) | Type II (α = 20%) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.5 | 8.6 | 0.5 | 2.8 | 0.9 | 2.2 | 1.7 | 1.7 |
| 0.3 | 0.3 | 9.2 | 0.6 | 3 | 1.3 | 1.5 | 2 | 0.8 |
| 0.5 | 0.5 | 10.6 | 0.6 | 1 | 1.3 | 0.5 | 2.1 | 0.4 |
| 0.5 | 0.3 | 8.5 | 0.6 | 1 | 1.3 | 0.3 | 2.7 | 0.1 |

Results are the mean of 10 independent simulations for each set of parameters. Twenty sires were analyzed in each simulation.
^a Type I error: a sire homozygous for the QTL is determined to be heterozygous; type II error: a heterozygous sire is determined to be homozygous.

Estimates, mean squared errors, and bias for the estimate of r1 and the QTL effect, for the different values of α, PQ and a are given in Table 3. In general, estimates for r1 were very close to the simulated value of 0.1 for all values of α and were not significantly different as determined by a t-test, comparing the mean of the estimates to the simulated value. MSE for this parameter is nearly equal to zero for a = 0.5. The estimates for a decreased consistently with increased type I error, for all four parameter combinations. For a = 0.5, MSE and bias are minimum at α = 5%, while higher values of α underestimated the effect at both values of PQ. For a = 0.3, bias is minimum at α = 10% for PQ = 0.5, and at α = 20% for PQ = 0.3. Thus, higher values of α seem preferable as the variance explained by the QTL decreases. Estimates for a were significantly different from the simulated values for a = 0.3, PQ = 0.3, and α = 5%; and for a = 0.5 and α = 20% for both values of PQ.

Table 3
Estimates, mean squared errors (MSE), and bias for the recombination frequency between the QTL and the left marker (r1) and the QTL effect (a), for the different probabilities of type I error (α)

| Simulated a | Simulated PQ | Parameter estimated | Statistic | Est. (α = 5%) | S.D. (α = 5%) | Est. (α = 10%) | S.D. (α = 10%) | Est. (α = 20%) | S.D. (α = 20%) |
|---|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.5 | r1 | Mean | 0.101 | 0.033 | 0.098 | 0.032 | 0.101 | 0.034 |
| | | | MSE^a | 0.001 | 0.002 | 0 | 0.002 | 0.001 | 0.002 |
| | | | Bias^b | 0.008 | 0.334 | −0.016 | 0.325 | 0.012 | 0.345 |
| | | a | Mean | 0.318 | 0.044 | 0.303 | 0.045 | 0.290 | 0.062 |
| | | | MSE | 0.002 | 0.002 | 0.002 | 0.002 | 0.003 | 0.003 |
| | | | Bias | 0.061 | 0.145 | 0.011 | 0.151 | −0.033 | 0.206 |
| 0.3 | 0.3 | r1 | Mean | 0.092 | 0.036 | 0.096 | 0.032 | 0.1 | 0.032 |
| | | | MSE | 0.001 | 0.002 | 0.001 | 0.002 | 0.001 | 0.002 |
| | | | Bias | −0.077 | 0.360 | −0.036 | 0.323 | 0 | 0.316 |
| | | a | Mean | 0.325 | 0.034 | 0.310 | 0.035 | 0.295 | 0.040 |
| | | | MSE | 0.001 | 0.002 | 0.001 | 0.002 | 0.001 | 0.001 |
| | | | Bias | 0.084 | 0.112 | 0.032 | 0.118 | −0.015 | 0.132 |
| 0.5 | 0.5 | r1 | Mean | 0.103 | 0.016 | 0.105 | 0.018 | 0.102 | 0.019 |
| | | | MSE | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | Bias | 0.027 | 0.166 | 0.046 | 0.184 | 0.022 | 0.190 |
| | | a | Mean | 0.499 | 0.03 | 0.479 | 0.029 | 0.454 | 0.047 |
| | | | MSE | 0.001 | 0.001 | 0.001 | 0.001 | 0.004 | 0.005 |
| | | | Bias | −0.002 | 0.06 | −0.041 | 0.059 | −0.092 | 0.095 |
| 0.5 | 0.3 | r1 | Mean | 0.098 | 0.019 | 0.101 | 0.021 | 0.102 | 0.023 |
| | | | MSE | 0 | 0 | 0 | 0 | 0 | 0 |
| | | | Bias | −0.015 | 0.194 | 0.012 | 0.216 | 0.019 | 0.229 |
| | | a | Mean | 0.504 | 0.051 | 0.479 | 0.053 | 0.427 | 0.049 |
| | | | MSE | 0.002 | 0.002 | 0.003 | 0.003 | 0.007 | 0.004 |
| | | | Bias | 0.008 | 0.102 | −0.041 | 0.106 | −0.146 | 0.097 |

Results are the mean (± S.D.) of 10 independent simulations for each set of parameters. r1 = 0.1 for all simulations.
^a MSE is computed as $\sum(\hat{\theta} - \theta)^2/10$, where θ is a or r1.
^b Bias is computed as $\sum[(\hat{\theta} - \theta)/\theta]/10$, where θ is a or r1.

The mean correlations between the simulated breeding value and the EBV for AM and AMQ for the cow population are presented in Table 4. The correlations between the AM and AMQ EBV are about 0.95 for a = 0.5, and 0.98 for a = 0.3. Thus including the marker effects in the model does not result in dramatic changes in the daughters' EBV. The correlations between the simulated breeding value and the EBV are always higher for the AMQ than for the standard AM. The increase in correlation is greater for a = 0.5 than for a = 0.3. For a = 0.5, correlations between EBV and simulated BV were lowest at α = 20%, but all differences between different α values were very small.

Table 4
Mean correlation coefficient between simulated breeding value (SIM), and estimated breeding value by the AM and AMQ for the cow population for the different probabilities of the type I error (α)

| a | PQ | SIM-AM | SIM-AMQ (α = 5%) | AM-AMQ (α = 5%) | SIM-AMQ (α = 10%) | AM-AMQ (α = 10%) | SIM-AMQ (α = 20%) | AM-AMQ (α = 20%) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.5 | 0.661 | 0.669 | 0.985 | 0.669 | 0.984 | 0.668 | 0.984 |
| 0.3 | 0.3 | 0.669 | 0.678 | 0.982 | 0.679 | 0.980 | 0.679 | 0.980 |
| 0.5 | 0.5 | 0.658 | 0.691 | 0.949 | 0.691 | 0.949 | 0.689 | 0.951 |
| 0.5 | 0.3 | 0.645 | 0.674 | 0.955 | 0.673 | 0.956 | 0.670 | 0.958 |

Results are the mean of 10 independent simulations for each set of parameters.

The mean SBV of the 20 daughters with the highest SBV, and mean SBV of the 20 daughters with the highest EBV according to the AM and AMQ, are presented in Table 5 for the different values of α, PQ and a. The ratio of AMQ to AM for α = 10% is also presented. Mean SBV of the cows selected by either AM or AMQ was considerably less than that of the 20 cows with the highest SBV. Mean SBV for AMQ with α = 10% was between 2% and 7% higher than for AM. Thus, AMQ was able to more accurately select the elite cows. For a = 0.3, highest mean SBV was obtained at α = 10%, but all differences were very small.
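The elite-cow comparison amounts to a top-k selection on one ranking followed by an evaluation on another. The Python sketch below illustrates the idea with hypothetical function names and a made-up toy data set of six cows; the study itself used the 20 highest-ranking daughters per population.

```python
def top_k(values, k=20):
    """Indices of the k largest values (ties broken by index order)."""
    order = sorted(range(len(values)), key=lambda i: (-values[i], i))
    return order[:k]


def elite_summary(sbv, ebv, k=20):
    """Mean SBV of the k cows with the highest EBV, and their overlap
    with the k cows that have the highest SBV."""
    chosen = top_k(ebv, k)
    best = set(top_k(sbv, k))
    mean_sbv = sum(sbv[i] for i in chosen) / k
    return mean_sbv, len(best.intersection(chosen))


# Toy values for illustration only (not taken from the study).
sbv = [0.9, 0.1, 1.4, -0.3, 0.7, 1.1]
ebv = [0.5, 0.6, 0.9, -0.1, 0.2, 0.8]
mean_sbv, n_common = elite_summary(sbv, ebv, k=3)
print(round(mean_sbv, 3), n_common)
```

Running `elite_summary` once with AM EBV and once with AMQ EBV gives the two quantities that are compared across models: the mean SBV of the selected cows and the number of cows in common with the true elite group.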

Table 5
Mean simulated breeding values of the 20 daughters with the highest simulated breeding value (SIM), and the mean simulated breeding values of the 20 daughters with the highest estimated breeding value by the AM and AMQ, for the different probabilities of the type I error (α)

| a | PQ | SIM | AM | AMQ (α = 5%) | AMQ (α = 10%) | AMQ (α = 20%) | AMQ^a/AM |
|---|---|---|---|---|---|---|---|
| 0.3 | 0.5 | 1.539 | 1.012 | 1.033 | 1.037 | 1.029 | 1.025 |
| 0.3 | 0.3 | 1.414 | 0.869 | 0.892 | 0.915 | 0.908 | 1.053 |
| 0.5 | 0.5 | 1.410 | 0.924 | 0.957 | 0.957 | 0.951 | 1.036 |
| 0.5 | 0.3 | 1.323 | 0.752 | 0.807 | 0.808 | 0.821 | 1.074 |

Results are the mean of 10 independent simulations for each set of parameters.
^a AMQ for α = 10%.


Table 6
Mean transmitting ability of the 20 daughters with the highest simulated breeding value (SIM), and the mean transmitting ability of the 20 daughters with the highest estimated breeding value by the AM and AMQ, for the different probabilities of the type I error (α)

| a | PQ | SIM | AM | AMQ (α = 5%) | AMQ (α = 10%) | AMQ (α = 20%) | AMQ^a/AM |
|---|---|---|---|---|---|---|---|
| 0.3 | 0.5 | 0.799 | 0.506 | 0.524 | 0.528 | 0.527 | 1.043 |
| 0.3 | 0.3 | 0.764 | 0.435 | 0.480 | 0.502 | 0.495 | 1.154 |
| 0.5 | 0.5 | 0.713 | 0.462 | 0.510 | 0.510 | 0.507 | 1.104 |
| 0.5 | 0.3 | 0.699 | 0.376 | 0.465 | 0.461 | 0.460 | 1.226 |

Results are the mean of 10 independent simulations for each set of parameters.
^a AMQ for α = 10%.
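The AMQ/AM columns of Tables 5 and 6 can be recomputed from the AM and AMQ (α = 10%) columns. The short Python check below uses the values transcribed from the two tables, keyed by (a, PQ); the dictionary and function names are hypothetical.

```python
# (AM, AMQ at alpha = 10%) transcribed from Tables 5 and 6, keyed by (a, PQ).
TABLE5_SBV = {
    (0.3, 0.5): (1.012, 1.037),
    (0.3, 0.3): (0.869, 0.915),
    (0.5, 0.5): (0.924, 0.957),
    (0.5, 0.3): (0.752, 0.808),
}
TABLE6_TA = {
    (0.3, 0.5): (0.506, 0.528),
    (0.3, 0.3): (0.435, 0.502),
    (0.5, 0.5): (0.462, 0.510),
    (0.5, 0.3): (0.376, 0.461),
}


def gain_ratios(table):
    """Ratio of AMQ to AM for each (a, PQ) combination."""
    return {key: amq / am for key, (am, amq) in table.items()}


for name, table in (("SBV", TABLE5_SBV), ("TA", TABLE6_TA)):
    for (a, p_q), ratio in gain_ratios(table).items():
        print(name, a, p_q, round(ratio, 3), f"{100.0 * (ratio - 1.0):.1f}% over AM")
```

The ratios are larger for transmitting ability than for breeding value in every parameter combination, which is the extra gain attributed to marker-based selection among sons.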

The mean TA of the 20 daughters with the highest SBV, and the mean TA of the 20 daughters with the highest EBV according to the AM and AMQ, are presented in Table 6 for the different values of α, PQ, and a. The ratio of AMQ to AM for α = 10% is also presented. For a = 0.5, maximum TA was obtained with α = 5%, and minimum with α = 20%. For a = 0.3, maximum TA was obtained with α = 10% for both values of PQ. For small effects, higher values of α result in more efficient selection. However, differences in TA for the three values of α are small for the four parameter combinations. Mean TA for AMQ with α = 10% was between 4% and 23% higher than for AM. The ratios of AMQ to AM were greater for PQ = 0.3 for both values of a. Thus the gain obtained by MAS is greater if the favorable allele is at a lower frequency. The increase of genetic gain due to MAS of young sires, as compared to dam selection, was between 2% and 15%.

The mean numbers of common elite cows selected by AM, AMQ, and SBV are presented in Table 7. The numbers of common cows selected by SBV and either AM or AMQ were very similar. The type I error rate also had virtually no effect. From 13 to 18 common cows were selected by AM and AMQ. Thus, implementation of the AMQ will affect about a quarter of the bull dams chosen.

Table 7
Mean number of common cows among the twenty with the highest estimated breeding values as determined by simulated breeding values (SIM), AM and AMQ for the different probabilities of the type I error (α)

| a | PQ | SIM-AM | SIM-AMQ (α = 5%) | AM-AMQ (α = 5%) | SIM-AMQ (α = 10%) | AM-AMQ (α = 10%) | SIM-AMQ (α = 20%) | AM-AMQ (α = 20%) |
|---|---|---|---|---|---|---|---|---|
| 0.3 | 0.5 | 2.9 | 3 | 17.5 | 3.1 | 17.4 | 3 | 17.1 |
| 0.3 | 0.3 | 3.3 | 3.5 | 15.8 | 3.2 | 14.5 | 3.2 | 14.3 |
| 0.5 | 0.5 | 2.6 | 2.6 | 13.4 | 2.6 | 13.4 | 2.4 | 13.5 |
| 0.5 | 0.3 | 2.7 | 3 | 13.5 | 2.9 | 13.6 | 3.1 | 14.2 |

Results are the mean of 10 independent simulations for each set of parameters.

Ten populations were also simulated with a = 0. Four replicates at α = 5% and 10%, and five at α = 20%, gave regression coefficients of opposite signs, indicating that the data are incompatible with a QTL segregating between the two markers. In addition, QTL effects could not be estimated in two populations with α = 5%, and one population with both α = 10% and 20%, because regression coefficients were equal to zero. In the remaining populations, correlations between AM and AMQ cow EBV were 0.998 for α = 5%, and 0.996 for α = 20%. With α = 5%, the same 20 elite cows were selected by AM and AMQ. Differences were minimal for higher values of α. The mean numbers of sires determined to be heterozygous for the QTL were 1.3, 2.2, and 3.8 for α = 5%, 10% and 20%, respectively. These values were similar to the expected numbers of type I errors with 20 sires: 1, 2, and 4, for the corresponding values of α. None of the observed values was significantly different from the expected number of type I errors, as determined by a χ² test for 'goodness-of-fit'.
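The expected numbers of type I errors under a = 0, and one way a goodness-of-fit comparison could be set up, are sketched below in Python. The text does not say how the test was computed, so pooling the 10 replicates of 20 sires into 200 sires per value of α is an assumption made here for illustration; the function names are hypothetical, and 3.841 is the standard 5% critical value of χ² with 1 degree of freedom.

```python
def expected_type_i(n_sires, alpha):
    """Expected number of false 'heterozygous' calls when no QTL segregates."""
    return n_sires * alpha


def chi_square_gof(observed, expected):
    """Pearson goodness-of-fit statistic over matching categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


N_SIRES, N_REPLICATES = 20, 10          # pooling over replicates is an assumption
CRITICAL_1DF_5PCT = 3.841
observed_means = {0.05: 1.3, 0.10: 2.2, 0.20: 3.8}   # reported means per population

for alpha, mean_het in observed_means.items():
    total = N_SIRES * N_REPLICATES
    obs_het = mean_het * N_REPLICATES
    exp_het = expected_type_i(total, alpha)
    stat = chi_square_gof([obs_het, total - obs_het], [exp_het, total - exp_het])
    print(alpha, round(expected_type_i(N_SIRES, alpha), 1), round(stat, 3),
          stat < CRITICAL_1DF_5PCT)
```

Under this pooling assumption all three statistics fall well below the critical value, in line with the reported absence of significant differences.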

4. Discussion

Sire QTL genotype and linkage phase were determined by genotyping the sires' daughters for two markers bracketing the QTL. The fewest errors in sire QTL genotype identification were found at α = 10% for a = 0.3 and at α = 5% for a = 0.5. If a sire was correctly determined to be heterozygous for the QTL, linkage phase was always determined correctly. Recombination distances between the markers and the QTL will vary, and should also affect the power to detect segregating QTL.

The method of Whittaker et al. (1996) was able to derive accurate estimates of QTL location and effect with marker effects included in an AM analysis model. The type I error influenced the estimates of the QTL effect: the larger the type I error, the greater the underestimate. This was expected, since for higher values of α, more homozygous sires are assumed to be heterozygous, and the mean estimated effect is decreased.

In the current study all the daughters were assumed to be genotyped, but none of the dams. In analysis of a commercial population, only a very small fraction of the population will be genotyped, potentially including animals of several generations. Individuals that were not genotyped can be included in the analysis, as described by Israel and Weller (1998).

Most studies that have considered MAS in dairy cattle have assumed that the major contribution of marker information will be via preselection of young sires with the desired marker genotypes. Parent EBV are computed based on pedigree and phenotypic information, and then marker information is used for within-family selection. Spelman and Garrick (1998) evaluated the effect of different threshold levels in identifying QTL genotype of sires of sons for ‘bottom-up’ MAS. A sire was deemed to be heterozygous if the difference between the two haplotype progeny groups was greater than a given threshold. They concluded that lower thresholds (< 0.8σa) were genetically and economically optimal when reproductive
technologies were used in conjunction with MAS. This corresponds to high values for α. They also found that over the range of 60 to 150 daughters, the number of daughters genotyped to determine the QTL genotype of their sire did not affect the expected rate of genetic gain due to MAS.

Kashi et al. (1990) evaluated the effects of an increase or decrease in number of daughters scored, recombination rate, and type I error on the accuracy of sire QTL genotype determination. The range of QTL effects considered, a = 0.15 to 0.3, was lower than in the present study, but they assumed that more daughters per sire were genotyped. They found that a type I error of approximately 0.10 was usually optimal. Increasing type I error to 0.20 always increased the number of incorrect QTL genotype determinations. Expected gains per generation if 500 daughters per sire were genotyped were greatest for type I error of 0.20, but gains were generally greater for type I error of 0.05 to 0.10 if more daughters were genotyped.

Mackinnon and Georges (1998) also found that for a ‘bottom-up’ scheme, genetic gains decreased linearly with increasing thresholds, and a low criterion for accepting estimated QTL effects as real was optimal. On the other hand, top-down schemes are unsuitable at low thresholds, because of the enormous costs. They assumed that only young sires with favorable marker genotypes for all segregating QTL would be selected for progeny testing. If several QTL are considered, they proposed increasing the number of candidate bull calves so that the required number of calves with the favorable marker genotype for all segregating QTL can be obtained. This is not a viable option if the number of segregating QTL is large. Furthermore, some candidate calves without the desired marker genotypes may have overall higher EBV. The methods presented in the current study solve these problems, because candidates for selection are ranked on EBV based on all sources of information.
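A rough calculation shows why enlarging the candidate pool scales poorly with the number of QTL. The sketch below is illustrative only and rests on assumptions not stated in the source: the sire is heterozygous at every QTL considered, the QTL are unlinked, and each calf inherits the favorable paternal marker haplotype with probability 1/2 per QTL, so that a fraction (1/2)^k of calves is favorable at all k QTL. The function name and the example numbers are hypothetical.

```python
def candidates_needed(required_calves, n_qtl, p_favorable=0.5):
    """Expected number of calves to screen so that `required_calves`
    carry the favorable paternal haplotype at all `n_qtl` QTL."""
    fraction_all_favorable = p_favorable ** n_qtl
    return required_calves / fraction_all_favorable

# The pool doubles with every additional QTL under these assumptions.
for k in (1, 2, 5, 10):
    print(k, candidates_needed(10, k))
```

Under these assumptions the required pool grows as 2^k, which is consistent with the statement that the approach is not viable when many QTL segregate.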
Thus it is not necessary to increase the number of candidate bulls to apply MAS, as proposed by Mackinnon and Georges (1998), even if many QTL are included in the breeding program.

For the range of α checked, type I error does not appreciably affect selection decisions. The differences in Tables 3–6 are small for the different values of α. Even with a relatively large α, AMQ was never inferior to a traditional selection index. Also, when the simulated
value for the QTL was zero, EBV from AM and AMQ were nearly identical. Therefore, using the methods presented in the current analysis, MAS is never inferior to phenotypic selection over the range of type I errors tested, and there is virtually no loss due to selection on a nonexistent QTL.

The proposed method can be used to accurately compare young bulls with or without marker information, and to compare young bulls with differing pedigrees. In addition, if marker information is incorporated into the genetic evaluation model, this information can also be used to select bull dams more accurately, as demonstrated in Table 5. Increased gains were obtained by making use of the marker information, as shown in Table 6. Although the increase in the correlation between SBV and EBV with marker information was slight, this difference should increase if additional QTL are included in the analysis. The increase was greater for the larger QTL effect.

The model proposed was evaluated in a two-generation population. The more accurate evaluations obtained, as compared to a standard AM, suggest that extra genetic response will be obtained by adopting the model for routine genetic evaluation. Tracing the marker information across several generations is not trivial, because linkage phase relationships will break down. Recently, Weller et al. (2003) investigated a similar model in a multigenerational scheme using an actual QTL segregating in the Israeli Holstein population.
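The Discussion's earlier point, that a larger type I error leads to a greater underestimate of the QTL effect, can be illustrated with a simple expectation. The sketch below is a deliberate simplification and not the estimation procedure used in the study. It assumes that truly heterozygous sires contribute the full effect a, that homozygous sires wrongly declared heterozygous contribute zero on average, and that power is held fixed as α changes (in practice power also rises with α). The heterozygote frequency of 0.5 and the power of 0.8 are hypothetical values chosen for illustration.

```python
def expected_mean_effect(a, het_freq, power, alpha):
    """Expected mean QTL effect among sires declared heterozygous.

    Assumes true heterozygotes contribute effect `a` and falsely declared
    homozygotes contribute zero on average (a simplification).
    """
    true_pos = het_freq * power            # heterozygous and detected
    false_pos = (1.0 - het_freq) * alpha   # homozygous but declared heterozygous
    return a * true_pos / (true_pos + false_pos)

# Hypothetical inputs: a = 0.5, heterozygote frequency 0.5, power 0.8.
for alpha in (0.05, 0.10, 0.20):
    print(alpha, round(expected_mean_effect(0.5, 0.5, 0.8, alpha), 3))
```

Under these assumptions the expected mean effect falls as α rises, because a growing share of the sires declared heterozygous carry no real effect.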

5. Conclusions

Nearly unbiased estimates for the QTL location and effect were obtained by the AMQ, using the algorithm of Whittaker et al. (1996). Estimates for the QTL effect were affected by the type I error chosen. Genetic evaluations were more accurate by the model proposed than by a standard AM. The correlations between the simulated breeding value and the EBV were always higher for the AMQ than for the standard AM, especially for a = 0.5. Superior elite daughters were chosen by the AMQ, as compared to the standard AM. The advantage was increased if selection of candidate bulls based on marker genotype was also considered. Type I error does not appreciably affect the success of the system over a broad range. Virtually no loss resulted if a nonexistent QTL was included in the AMQ analysis.

Acknowledgements

This research was supported by a grant from the Israel Cattle Breeders Association (ICBA), and the U.S.–Israel Binational Agricultural Research and Development Fund (BARD). We thank M. Soller and E. Ezra for useful discussions.

References

Fernando, R., Grossman, M., 1989. Marker assisted selection using best linear unbiased prediction. Genet. Sel. Evol. 21, 467–477.
Heyen, D.W., Weller, J.I., Ron, M., Band, M., Beever, J.E., Feldmesser, E., Da, Y., Wiggans, G.R., VanRaden, P.M., Lewin, H.A., 1999. A genome scan for QTL influencing milk production and health traits in dairy cattle. Physiol. Genom. 1, 165–175.
Israel, C., Weller, J.I., 1998. Estimation of candidate gene effects in dairy cattle populations. J. Dairy Sci. 81, 1653–1662.
Israel, C., Weller, J.I., 2002. Estimation of quantitative trait loci effects in dairy cattle populations. J. Dairy Sci. 85, 1285–1297.
Kadarmideen, H.N., Dekkers, J.C.M., 1999. Regression on markers with uncertain allele transmission for QTL mapping in half-sib designs. Genet. Sel. Evol. 31, 437–455.
Kashi, Y., Hallerman, E., Soller, M., 1990. Marker assisted selection of candidate bulls for progeny testing programmes. Anim. Prod. 51, 63–74.
Mackinnon, M.J., Georges, M.A.J., 1998. Marker-assisted preselection of young dairy sires prior to progeny-testing. Livest. Prod. Sci. 54, 229–250.
Ron, M., Band, M., Wyler, A., Weller, J.I., 1993. Unequivocal determination of sire allele origin for multiallelic microsatellites when only the sire and progeny are genotyped. Anim. Genet. 24, 171–176.
Spelman, R.J., Garrick, D.J., 1998. Genetic and economic responses for within-family marker assisted selection in dairy cattle breeding schemes. J. Dairy Sci. 81, 2942–2950.
Spelman, R.J., Garrick, D.J., van Arendonk, J.A.M., 1999. Utilization of genetic variation by marker assisted selection in commercial dairy cattle populations. Livest. Prod. Sci. 59, 51–60.
Weller, J.I., 2001. Quantitative Trait Loci Analysis in Animals. CABI Publishing, London.
Weller, J.I., Fernando, R.L., 1991. Strategies for the improvement of animal production using marker assisted selection. In: Schook, L.B., Lewin, H.A., McLaren, D.G. (Eds.), Gene Mapping: Strategies, Techniques and Applications. Marcel Dekker, New York, pp. 305–328.
Weller, J.I., Golik, M., Seroussi, E., Ezra, E., Ron, M., 2003. Population-wide analysis of a QTL affecting milk-fat production in the Israeli Holstein population. J. Dairy Sci. 86, 2219–2227.
Weller, J.I., Kashi, Y., Soller, M., 1990. Power of daughter and granddaughter designs for genetic mapping of quantitative traits in dairy cattle using genetic markers. J. Dairy Sci. 73, 2525–2537.
Whittaker, J.C., Thompson, R., Visscher, P.M., 1996. On the mapping of QTL by regression of phenotype on marker-type. Heredity 77, 23–32.
Zhang, Q., Boichard, D., Hoeschele, I., Ernst, C., Eggen, A., Murkve, B., Pfister-Genskow, M., Witte, L.A., Grignola, F.E., Uimari, P., Thaller, G., Bishop, M.D., 1998. Mapping quantitative trait loci for milk production and health of dairy cattle in a large outbred pedigree. Genetics 149, 1959–1973.