Bias in Sire Evaluation Due to Selection


L. D. VAN VLECK AND C. R. HENDERSON
Department of Animal Husbandry, Cornell University, Ithaca, New York

SUMMARY

The first and second lactation records of New York artificially sired Holstein cows were analyzed to determine the effect of culling after the first lactation on sire evaluation based on both first and second lactation records. Results indicated that weighting first and second records according to number of records per cow, repeatability, and heritability evaluated sires almost identically with the method which uses the average of a daughter's first and second records. Even with a pronounced differential culling rate after the first lactation, there was no evidence of a differential bias in evaluating sires of different genetic merit based on first and second lactation records.

There are many possible sources of bias in sire evaluation. One bias may result from culling of daughters based on their previous records. The future records of those selected will not be a representative measure of the sire's true genetic value. If there is a differential selection intensity among daughter groups, then a differential bias will result in the evaluation of the sires. The purpose of this study was to examine the extent of this differential bias.

Several restrictions and assumptions were necessarily made. The first was that no selection was practiced before initiation of the first lactation. This assumption makes available a base from which to judge the effects of later culling. Implicit also is the assumption that extension factors in current use are suitable for extending first records terminated before completion.

The inferences which can be made from the study will be restricted, due to the inclusion of only first and second lactation records. It could be readily supposed that inclusion of later records would intensify any differential bias in sire evaluation. In practice, however, sire evaluation is very seldom based on more than first or second lactation records. If evaluation is not made before the third, fourth, or fifth lactation records are available, the sire probably will not have much influence on the dairy population. In fact, most decisions to accept or reject a sire from service in artificial insemination (A.I.) are the result of daughter performance in first lactations only. There is some evidence, from Robertson and Khishin (8) and Hickman and Henderson (6), which suggests that this procedure does not have a negative effect on later, or even lifetime, production.

Received for publication March 7, 1963.

A more specific purpose of this study was to compare evaluation of A.I. sires on the basis of large numbers of first lactation daughter records with evaluation including second lactation records.

DATA AND METHODS

The records of Holstein daughters included in the A.I. sire file of the New York Dairy Records Processing Center as of January 1, 1961 (this includes records of cows calving before December, 1959), were available for this analysis. All records were put on a 2×, 305-day, M.E. basis with standard adjustment factors used by the Records Center. The records were classified into various types as described in Table 1. It will be noticed that rather arbitrary definitions of first and second lactations were necessary. A category of extra interest was established for first records which did not have a subsequent reported record started within 20 months of the first.

TABLE 1
Definitions of symbols used to describe the classification of various types of daughter records

1 = Records begun prior to 36 months of age and thus considered to be first records.
2 = Records begun less than 20 months after a first record or, if not associated with a first record, begun between 36 and 48 months of age and thus considered to be second records.
11 = First records which had a second record begin within 20 months of initiation of the first record.
10 = First records which did not have a second record and which were from cows old enough to have had a second record at the time of this study (January, 1961; cows born before 1956 were assumed to fall in this category).
13 = First records with a subsequent record which did not begin within 20 months of the first record. (The extra-long interval may mean the cow's record was not centrally processed if another record actually began within 20 months of the first record, or that the calving interval was actually greater than 20 months.)
21 = All second records with first records in the file.
20 = All second records not associated with first records from this study. (These could be first records begun after 35 months of age or a second record not having a previous centrally processed record.)

Sires were put into six classes, depending upon their first lactation daughter-herd-mate milk production differences. Sire Level One included those whose daughters exceeded their herd-mates by more than 800 lb of milk; Level Two, between 350 and 800 lb of milk; Level Three, between 0 and 350 lb; Level Four, between 0 and -350 lb; Level Five, between -350 and -800 lb; and Level Six, below -800 lb of milk.

Future daughter superiority for the sires was estimated from first lactation milk records, from second lactation milk records, from first and second lactation records, and from various subsets of these. The procedure is that used by the New York Dairy Records Center until September, 1962, and has been described in detail by Henderson (2) and Heidhues, Van Vleck, and Henderson (1). The computing formula is:

$$\text{Estimated future daughter superiority} = \text{A.I.} + \frac{n}{n + 15}\,[D - .9\,(SM - \text{DHIA}) - \text{A.I.}]$$

where
A.I. = A.I. breed average for the current evaluation = 12,668 lb milk for the January 1, 1961, report,
D = sire daughter average of 2×, 305-day M.E. records,
n = number of daughters,
SM = adjusted herd-mate average (Henderson, Carter, and Godfrey, 5),
.9 = intra-sire regression of daughter records on their adjusted herd-mate averages, and
DHIA = DHIA 5-yr breed average for the current evaluation = 12,277 lb milk for the January 1, 1961, report.
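As a minimal sketch, the computing formula above can be written as a short Python function. The function and argument names are illustrative assumptions and not part of the Records Center procedure; the breed averages are the January 1, 1961, values quoted above, and the example daughter and herd-mate averages are hypothetical.

```python
# Illustrative sketch of the computing formula above; all names are hypothetical.
AI_AVERAGE = 12_668         # A.I. breed average, lb milk (January 1, 1961, report)
DHIA_AVERAGE = 12_277       # DHIA 5-yr breed average, lb milk (same report)
HERD_MATE_REGRESSION = 0.9  # intra-sire regression on adjusted herd-mate average


def estimated_future_daughter_level(n, daughter_avg, herd_mate_avg,
                                    ai_avg=AI_AVERAGE, dhia_avg=DHIA_AVERAGE):
    """Return A.I. + n/(n + 15) * [D - .9 (SM - DHIA) - A.I.]."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Daughter average corrected for herd-mate level, as a deviation from A.I.
    deviation = (daughter_avg
                 - HERD_MATE_REGRESSION * (herd_mate_avg - dhia_avg)
                 - ai_avg)
    # The deviation is regressed toward the A.I. average by n / (n + 15).
    return ai_avg + n / (n + 15) * deviation


# Hypothetical sire: 45 daughters averaging 13,400 lb in herds whose
# adjusted herd-mate average is 12,900 lb.
print(round(estimated_future_daughter_level(45, 13_400, 12_900), 1))
```

With few daughters the estimate stays close to the A.I. average; as n grows, the factor n/(n + 15) approaches 1 and the corrected daughter average dominates.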
There are several possible ways to combine first and second lactation records for sire evaluation. Two ways would be to use only first records or, alternatively, only second records. The latter would seem highly impractical. Another would be to use the direct average of the two records. The average would be the value used for each daughter in the sire daughter average. This method was used at the New York center, although it was known that under such a procedure the average of more than one record receives too little weight, i.e., the same weight as a single record. At the initiation of the New York system the computing facilities were insufficient to do otherwise. Another method with even more computational ease is to weight each record equally, not distinguishing between repeated records and single records. This allows too much weight to the daughters with repeated records.

The selection index procedure is to weight the average of a daughter's records according to the number of records, repeatability, and heritability. The procedure now being used by the New York Dairy Records Processing Center weights records in this way, but also uses a maximum-likelihood estimate for the A.I. average rather than an arithmetic average. As a part of a larger program to estimate breeding values, Henderson (3) has given the optimum weighting. The weight given to the $i$th daughter with $n_i$ records is $U_i / \sum U_i$, where

$$U_i = \frac{n_i}{1 + (n_i - 1)\,r - \tfrac{1}{4}\,n_i h^2}$$

($h^2$ = heritability and $r$ = repeatability). When $h^2 = .25$ and $r = .45$,

$$U_i = \frac{80\,n_i}{31\,n_i + 44}.$$

The computing procedure given previously is changed correspondingly. This method of weighting first and second records is derived from general selection index theory as described by Henderson (4). The properties of this index are, therefore, the same as those for the selection index procedure. One important property is that the selection index maximizes the probability of correct ranking.

Correlations were computed between estimated daughter levels based on the types of records shown in Table 1 and the index of weighted first and second records for sires with at least 20 daughters with first records and for sires with at least 200 daughters with first records. The numbers of sires and daughters are given in Table 2 by level of the sire group. The total number of daughters and the total number of sires would be the sum of those for the six levels.
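The selection index weighting described above can be sketched in Python. This is only an illustration under the stated values of heritability (.25) and repeatability (.45); the function names and the example daughters are hypothetical. The weight is computed from the general expression and then checked against the closed form 80n/(31n + 44).

```python
from fractions import Fraction


def record_weight(n_records, h2=Fraction(1, 4), r=Fraction(9, 20)):
    """U_i = n_i / [1 + (n_i - 1) r - n_i h^2 / 4] for a daughter with n_i records."""
    return Fraction(n_records) / (1 + (n_records - 1) * r - n_records * h2 / 4)


def weighted_daughter_average(daughters):
    """Weighted sire daughter average.

    `daughters` is a list of (n_records, mean_of_records) pairs;
    each daughter's mean receives weight U_i / sum(U_i).
    """
    weights = [record_weight(n) for n, _ in daughters]
    total = sum(weights)
    weighted_sum = sum(w * mean for w, (_, mean) in zip(weights, daughters))
    return float(weighted_sum / total)


# The general expression agrees with the closed form for one and two records.
for n in (1, 2):
    assert record_weight(n) == Fraction(80 * n, 31 * n + 44)

# Hypothetical daughters: (number of records, average of those records in lb).
print(weighted_daughter_average([(1, 12_000), (2, 13_000), (2, 12_500)]))
```

At these parameter values a daughter with two records receives about 1.4 times the weight of a daughter with one record, which lies between the equal weighting of daughter averages (ratio 1) and the equal weighting of individual records (ratio 2).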


TABLE 2
Number of sires and number of daughter records included in average differences (a)

                               Type of daughter records
Level of sire   1            2            11 and 21 (b)   13         10          20

Sires with number of daughters with first records ≥ 20
1               2,150(12)    947(12)      620(12)         29(3)      146(7)      327(11)
2               8,851(22)    7,750(22)    5,069(22)       409(15)    1,549(18)   2,681(22)
3               5,604(29)    4,441(29)    2,888(29)       288(20)    1,033(27)   1,553(27)
4               12,615(39)   10,374(39)   6,627(39)       411(26)    2,526(35)   3,747(39)
5               9,344(38)    9,271(38)    5,700(38)       402(29)    2,417(36)   3,571(38)
6               5,487(32)    5,013(32)    3,215(32)       208(22)    1,399(29)   1,798(32)

Sires with number of daughters with first records ≥ 200
1               1,502(3)     769(3)       505(3)          28(2)      124(3)      264(3)
2               7,916(9)     7,117(9)     4,672(9)        376(9)     1,423(9)    2,445(9)
3               3,731(8)     3,218(8)     2,116(8)        181(7)     741(8)      1,102(8)
4               10,989(15)   9,320(15)    6,002(15)       373(13)    2,286(14)   3,318(15)
5               7,733(11)    7,288(11)    4,801(11)       330(11)    2,072(11)   2,487(11)
6               4,125(7)     3,874(7)     2,556(7)        164(7)     1,099(7)    1,318(7)

(a) The first value is the number of daughter records and the value in brackets is the number of sire groups represented.
(b) The number of 11 and 21 records was the same.

RESULTS AND DISCUSSION

The estimated correlations among the estimated daughter levels from six kinds of first or second lactation records, and including as a seventh procedure the estimate derived from the index of weighted first and second records, are shown in Table 3 for sires with 20 or more daughters with first records and in Table 4 for sires with 200 or more first record daughters. Some of these correlations are rather academic, because of the extreme disparity in number of records included in the various groups.

TABLE 3
Correlations between estimated daughter superiority based on various classifications of daughter milk records: sires with number of daughters with first records ≥ 20

                                                         Type of record
Type of record                                           2     11    13    10    21    20    I (a)
First records (1)                                        .73   .86   .56   .71   .67   .69   .98
Second records (2)                                             .79   .42   .58   .94   .90   .82
Firsts with seconds (11)                                             .51   .61   .76   .70   .85
Firsts with seconds with intervals > 20 months (13)                        .40   .38   .49   .52
Firsts without seconds (10)                                                      .54   .64   .69
Seconds of the (11)'s (21)                                                             .72   .78
Seconds not having a previous record in the file (20)                                        .72

(a) I is the index estimate of daughter superiority obtained by weighting all first records (1's) and second records with firsts (21's).

TABLE 4
Correlations between estimated daughter superiority based on various classifications of daughter milk records: sires with number of daughters with first records ≥ 200

                                                         Type of record
Type of record                                           2     11    13    10    21    20    I (a)
First records (1)                                        .88   .97   .71   .88   .82   .92   .98
Second records (2)                                             .85   .62   .79   .99   .94   .95
Firsts with seconds (11)                                             .70   .80   .81   .87   .95
Firsts with seconds with intervals > 20 months (13)                        .63   .58   .68   .69
Firsts without seconds (10)                                                      .74   .84   .86
Seconds of the (11)'s (21)                                                             .88   .92
Seconds not having a previous record in the file (20)                                        .94

(a) I is the index estimate of daughter superiority obtained by weighting all first records (1's) and second records with firsts (21's).

The estimates of particular interest are those for first records with the index of weighted first and second records and for first records with second records. The first relationship is satisfactorily high (near unity) even when as few as 20 daughter records are included in the daughter level estimates. The correlations between various types of first and second records are somewhat confusing. As the number of daughters per group becomes large, the correlation between all firsts and seconds without firsts becomes an estimate of the genetic correlation between first and second records. It will be a slight underestimate, since the denominator components of the correlation estimates contain something more than the genetic variances corresponding to the genetic covariance in the numerator. This correlation of .92 (Table 4) suggests that the genetic correlation between first and second records probably is in the range .90 to 1.00.

The correlation in Table 4 between first and second records of daughters having two records is smaller than the correlation between first records of daughters having two records and second records of other daughters not having first records in the file. This result is surprising and not easily explained. The reverse, however, is true for Table 3, for which fewer daughters per sire were required.

The lowest correlations are those for first records which did not have a second record start within 20 months of the first with all other groups. The small number of these records per sire group may partially explain the relatively low correlations, but there appears to be something about the records which makes them different, as will be seen in a later table. They seem to be much higher than other first records. On the other hand, the first records without second records would be expected to be less than other first records, due to the likelihood that a high fraction of these cows was culled following one lactation. The estimated correlation between these two extreme types of records is quite low, 0.63 (Table 4).
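The remark that the correlation between sire averages of first records and of second records from other daughters slightly underestimates the genetic correlation can be illustrated with a simple model. The sketch below is not the authors' computation. It assumes a paternal half-sib model in which the sire variance is one-fourth of the heritability times the total variance, takes heritability as .25, and ignores herd-mate adjustment and unequal daughter numbers; the function name is hypothetical. Under these assumptions the expected correlation between sire averages based on n1 first records and n2 second records of different daughters is the genetic correlation multiplied by sqrt[n1/(n1 + k)] sqrt[n2/(n2 + k)], with k = (4 - h²)/h² = 15.

```python
import math


def expected_sire_mean_correlation(r_g, n1, n2, h2=0.25):
    """Expected correlation between sire averages of two traits measured on
    different daughters, under a simple half-sib model (an assumption of this
    sketch): each average has variance sigma_s^2 * (1 + k / n)."""
    k = (4 - h2) / h2  # ratio of within-sire to sire variance; 15 when h2 = .25
    return r_g * math.sqrt(n1 / (n1 + k)) * math.sqrt(n2 / (n2 + k))


# Attenuation for a genetic correlation of 1.0 at several daughter numbers.
for n in (20, 200, 2000):
    print(n, round(expected_sire_mean_correlation(1.0, n, n), 3))
```

Under this model the attenuation factor stays below 1 for any finite number of daughters and approaches 1 as daughter numbers increase, in line with the statement that the observed correlation is a slight underestimate.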

More important are the correlations between three systems of using first and second records. These are given in Table 5.

TABLE 5
Estimated correlations between estimated daughter superiority of sires based on (1) first milk records of daughters, (2) average of average first and second records, and (3) index of weighted averages of first and second records

No. sires   No. first record daughters per sire   r1,2   r1,3   r2,3
186         ≥ 20                                  .99    .98    1.00
52          ≥ 200                                 .98    .98    1.00

Contrary to expectation, there seems to be little apparent difference between using only first records, the averages of first and second records as the daughter records, or the index of weighted first and second records. Although some bias due to selection probably exists when using the average of records, it is so small that ranking of sires on that basis is essentially the same as ranking on the basis of the correct procedure. If it can be assumed that the procedure using weighted averages is truly correct, then there seems to be only a slight advantage in using average records or even the index of weighted average records over the use of first records only. Thus, it appears that first records, presumably unselected, can be used for sire evaluation without the added trouble of collecting additional records. The reports of Robertson and Khishin (8), Hickman and Henderson (6), and Parker et al. (7) agree that selection on the basis of first records does not appear to have a detrimental effect on selection for later performance. The high correlation between daughter level estimates based on first records and on second records also substantiates this argument, if only through the second lactation.

TABLE 6
Comparison of average differences of various daughter milk records from their herd-mate averages for sires of different levels of merit: sires with number of daughters with first records ≥ 20 (a)

                                                         Level of sire
Type of daughter records                                 1 (High)  2        3        4        5        6 (Low)
First records (1)                                        1,344     561      165      -161     -542     -1,097
Second records (2)                                       1,337     543      125      95       -167     -566
Firsts with seconds (11)                                 1,501     652      45       -37      -371     -879
Firsts with seconds with intervals > 20 months (13)      1,980     1,064    55       97       -35      -271
Average per cent (13)'s of (1)'s                         1.0       4.1      4.2      3.4      4.4      3.6
Firsts without seconds (10)                              859       16       -234     -598     -1,116   -1,949
Average per cent (10)'s of (1)'s                         11.9      18.8     22.7     24.0     25.2     26.8
Seconds of the (11)'s (21)                               1,656     418      110      79       -131     -414
Seconds not having a previous record in the file (20)    1,138     652      305      96       -192     -707

(a) The averages given in the table are averages of sire group average differences of daughter milk records from their herd-mate averages.

TABLE 7
Comparison of average differences of various daughter milk records from their herd-mate averages for sires of different levels of merit: sires with number of daughters with first records ≥ 200 (a)

                                                         Level of sire
Type of daughter records                                 1         2        3        4        5        6
First records (1)                                        1,050     501      176      -189     -541     -958
Second records (2)                                       1,273     689      367      48       -514     -767
Firsts with seconds (11)                                 1,001     633      323      17       -340     -644
Firsts with seconds with intervals > 20 months (13)      826       990      804      162      -158     -262
Average per cent (13)'s of (1)'s                         3.6       5.7      5.6      3.6      4.1      4.5
Firsts without seconds (10)                              1,983     -200     -282     -657     -1,107   -1,712
Average per cent (10)'s of (1)'s                         16.0      21.9     24.1     24.4     27.7     29.6
Seconds of the (11)'s (21)                               1,293     715      354      57       -217     -744
Seconds not having a previous record in the file (20)    1,231     635      441      45       -231     -795

(a) The averages given in the table are averages of sire group average differences of daughter milk records from their herd-mate averages.

Tables 6, 7, and 8 present in a different form the results indicated by the previously discussed correlations. The sires were divided into six levels according to their daughter first record deviations from herd-mate averages. Tables 6 and 7 represent averages of sire averages, whereas Table 8 describes the averages of all daughters by sires falling into any of the six levels. For example, the average percentages (13)'s of (1)'s for Tables 6 and 7 were computed by determining for each sire group the fraction of (13) records included in all (1) first records and then by averaging the sire group averages. The average percentages (10)'s of (1)'s were computed the same way. The values in Table 8 were computed similarly (except that sire group averages were not determined) as the fraction of (13) records of the total number of first records by daughters of sires in each sire level group.
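The two ways of summarizing the percentages described above can be contrasted in a short sketch. The counts are hypothetical and serve only to show the arithmetic, and the function names are illustrative: the first function averages the per-sire fractions (the approach described for Tables 6 and 7), whereas the second pools the records of all daughters within a sire level (the approach described for Table 8).

```python
def mean_of_sire_percentages(sire_counts):
    """Average of per-sire percentages: compute the percentage within each
    sire group, then average the sire group values."""
    percentages = [100 * subset / total for subset, total in sire_counts]
    return sum(percentages) / len(percentages)


def pooled_percentage(sire_counts):
    """Percentage over all daughters in the level, ignoring sire groups."""
    subset_sum = sum(subset for subset, _ in sire_counts)
    total_sum = sum(total for _, total in sire_counts)
    return 100 * subset_sum / total_sum


# Hypothetical sire level with three sires:
# (number of (13) records, number of all (1) first records) per sire.
level = [(2, 40), (30, 500), (12, 300)]
print(mean_of_sire_percentages(level))
print(pooled_percentage(level))
```

The two summaries are guaranteed to agree when every sire has the same number of first records; otherwise sires with many daughters dominate the pooled figure, which is one reason corresponding entries of Tables 7 and 8 need not be identical.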


Several points are of interest in these summaries. The first comparison is between all first and all second records. Although the levels vary considerably in average difference between daughter averages and stable-mate averages, the differences between the first and second records are relatively constant from sire level to sire level. The second records appear to be about 200 lb of milk higher than first records (on an M.E. basis) in all levels. Two reasons are readily available: the mature equivalent factors may undercorrect first lactation records relative to second records, or selection may be raising the average second records above the average first records. Both factors are probably in operation.

Another contrast is between first records with subsequent second records and the corresponding second records. This comparison would indicate whether the increase in M.E. production from first to second lactation is different for high-level than for low-level sires. Inspection of Table 8 (Rows 11 and 21) reveals no pattern except that the second records are lower than the first records only for daughters of the lowest-level sires. At all other sire levels the second records average higher than the first records.

TABLE 8
Comparison of differences of various daughter milk averages from their herd-mate averages within different levels of estimated merit of their sires: sires with number of daughters with first records ≥ 200 (a)

                                                         Level of sire
Type of daughter records                                 1         2        3        4        5        6
First records (1)                                        1,052     464      154      -197     -597     -940
Second records (2)                                       1,204     684      361      7        -252     -769
Firsts with seconds (11)                                 1,040     586      303      6        -371     -626
Firsts with seconds with intervals > 20 months (13)      1,228     982      820      312      -46      -319
Average per cent (13)'s of (1)'s                         4.3       5.8      6.0      4.3      4.6      4.3
Firsts without seconds (10)                              698       -89      -349     -719     -1,234   -1,728
Average per cent (10)'s of (1)'s                         18.9      22.0     24.4     26.4     28.8     28.8
Seconds of the (11)'s (21)                               1,160     727      343      18       -242     -765
Seconds not having a previous record in the file (20)    1,290     600      394      -14      -270     -776

(a) The values given in this table are average milk record differences from herd-mate averages of daughters having sires in the sire level.

Of the two special comparisons, the one between first records having subsequent records initiated more than 20 months after the first and all firsts is the most puzzling. The per cent of this type of record is relatively constant for every sire level. In almost all cases, these records are substantially larger than other first records. The differences are nearly the same for all sire levels. Although this difference can be discounted as a possible cause of differential bias in sire selection, something must occur in these 4-6% of first lactations which prevents their completion of a 12- to 13-month calving interval. The theory that some high producers burn themselves out in a first lactation and are not easy to settle would seem to be substantiated if only the high-level sires are considered. The argument fails to hold in the present data, because in the low level of sire merit the average of these cows is below their stable-mates. Thus, it would seem that due to either management practices or chance a small fraction of cows does not produce a second calf within 20 months of their first. The advantage of not diverting part of their nutrient intake to the process of growing a calf could easily account for the difference in production. It would be interesting to compare production in the first five months of the lactation of these cows with that of other first lactation cows. Perhaps, in such cases, five-month daughter production would provide a better guide to the true genetic value of a sire than does 305-day performance.

The final comparison of interest involves the first records of cows not having subsequent records with all first records. Due to the nature of the data, some of these cows probably did have additional records which were not processed at the records center. This difficulty, however, should not affect one group of sires more than any other. Depending on the table, the per cent of this type of record varies from 11-18 in the high-level sire groups to 27-30 in the low-level groups. This result is expected, since the lower producers would be culled more heavily than higher producers. There is also a corresponding increase in the difference between these first records and all first records as the sire level decreases. This pattern would be expected to create a noticeable bias when second records are included in sire evaluation. The estimated daughter level of poorer sires could be biased upward more than that of the better sires. The comparison of first and second records does not substantiate the expected result. A tendency of daughters of high-level sires to increase in production more with age than daughters of low-level sires may be balancing the added advantage the low-level sires received from a higher selection pressure among their daughters.

CONCLUSIONS

The most general conclusion (and the most important from a computing standpoint) drawn from these results is that weighting the average of first and second daughter records according to number of records, repeatability, and heritability does not noticeably improve sire evaluation over the use of the average of a daughter's first and second records. The inclusion of more than the first two lactations might result in a different conclusion. The differential bias due to differential culling among daughters of bulls of various levels would probably be intensified with the consideration of additional lactations. The pronounced differential culling rate on the basis of first records among daughters of different sire levels apparently has little effect on sire evaluation. A further conclusion is that the fraction of cows having extra-long calving intervals is apparently independent of the genetic value of sires.

REFERENCES

(1) HEIDHUES, T., VAN VLECK, L. D., AND HENDERSON, C. R. Actual and Expected Accuracy of Sire Proofs Under the New York System of Sampling Bulls. Z. Tierzücht. Züchtungsbiol., 75: 323. 1961.
(2) HENDERSON, C. R. Cornell Research on Methods of Selecting Dairy Bulls. Proc. New Zealand Soc. Animal Production, 16: 69. 1956.
(3) HENDERSON, C. R. Program to Estimate Breeding Values. (Mimeo.) 1960.
(4) HENDERSON, C. R. Selection Index and Expected Genetic Advance. Sympos. Statistical Genetics and Plant Breeding. Raleigh, North Carolina. 1961.
(5) HENDERSON, C. R., CARTER, H. W., AND GODFREY, J. T. Use of the Contemporary Herd Average in Appraising Progeny Tests of Dairy Bulls. J. Animal Sci., 13: 959. 1954.
(6) HICKMAN, C. G., AND HENDERSON, C. R. Components of the Relationship Between Level of Production and Rate of Maturity in Dairy Cattle. J. Dairy Sci., 38: 883. 1955.
(7) PARKER, J. B., BAYLEY, N. D., FOHRMAN, M. H., AND PLOWMAN, R. D. Factors Influencing Dairy Cattle Longevity. J. Dairy Sci., 43: 401. 1960.
(8) ROBERTSON, A., AND KHISHIN, S. S. The Effect of Selection for Heifer Milk Yield on the Production Level of Mature Cows. J. Agr. Sci., 50: 12. 1958.