Inferred HLA Haplotype Information for Donors From Hematopoietic Stem Cells Donor Registries


Pierre-Antoine Gourraud, Phillipe Lamiraux, Nabil El-Kadhi, Colette Raffoux, and Anne Cambon-Thomsen

ABSTRACT: Human leukocyte antigen (HLA) matching remains a key issue in the outcome of transplantation. In hematopoietic stem cell transplantation with unrelated donors, the matching for compatible donors is based on the HLA phenotype information. In familial transplantation, the matching is achieved at the haplotype level because donor and recipient share the block-transmitted major histocompatibility complex region. We present a statistical method based on HLA haplotype inference to refine the HLA information available in an unrelated situation. We implement a systematic statistical inference of the haplotype combinations at the individual level. It computes the most likely haplotype pair given the phenotype, together with its probability. The method is validated on 301 phase-known phenotypes from CEPH families (Centre d'Etude du Polymorphisme Humain). The method is further applied to 85,933 HLA-A, -B, -DR typed unrelated donors from the French Registry of hematopoietic stem cell donors (France Greffe de Moëlle). The average value of the prediction probability is 0.761 (SD 0.199), ranging from 0.26 to 1. Correlations between phenotype characteristics and predictions are also given. Homozygosity (OR = 2.08 [2.02–2.14]; p < 10⁻³) and linkage disequilibrium (p < 10⁻³) are the major factors influencing the quality of prediction. Limits and relevance of the method are related to the limits of haplotype estimation. Relevance of the method is discussed in the context of HLA matching refinement. Human Immunology 66, 563–570 (2005). © American Society for Histocompatibility and Immunogenetics, 2005. Published by Elsevier Inc.

KEYWORDS: Donor registry; HLA haplotypes; population immunogenetics; statistical application; transplantation

ABBREVIATIONS
BMD    bone marrow donor
CEPH   Centre d'Etude du Polymorphisme Humain
HLA    human leukocyte antigen
HSC    hematopoietic stem cell
OR     odds ratio

From the Faculty of Medicine, INSERM, Toulouse, France (P.A.G., A.C.-T.), Laboratoire Epitech de Recherche en Informatique Appliquée, Le Kremlin Bicêtre, France (P.L., N.E.-K.), and FGM France Greffe de Moëlle, French Registry of Hematopoïetic cells Donors, Paris, France (C.R.). Address reprint requests to: Pierre-Antoine Gourraud, INSERM Unit 558, Faculty of Medicine, 37 allées Jules Guesde, F-31000, Toulouse, France; E-mail: [email protected]. Supported by EFG (Etablissement Français des Greffes) Grant 2003, EU contract MADO No: QLG7-CT-2001-00065. Received December 10, 2004; revised January 3, 2005; accepted January 11, 2005. 0198-8859/05/$–see front matter doi:10.1016/j.humimm.2005.01.011

INTRODUCTION

Allogeneic hematopoietic stem cell (HSC) transplantation is now a well-established curative therapy for an increasing number of hematologic diseases [1–3]. The role of human leukocyte antigen (HLA) matching between donor and recipient has been studied by many groups over the past years, but its optimal level remains unclear [4, 5]. The development of molecular typing techniques allows refined matching and thus helps reduce the risk of immunologic graft failure from host-versus-graft and graft-versus-host allorecognition. The best donor remains an HLA-matched relative, but such a donor is not always available. In 70% of the cases, a search for an unrelated HLA-matched donor is performed among the 9.1 million bone marrow donors (BMDs) gathered in 54 stem cell donor registries from 40 countries and 37 cord blood registries from 21 countries, as listed by Bone Marrow Donors Worldwide (http://www.bmdw.org) and the World Marrow Donor Association (http://www.worldmarrow.org/).

Nevertheless, the amount of HLA information taken into account differs between the familial and unrelated situations. Through typing of the patient's relatives, the level of HLA information actually used in familial HSC transplantation is the HLA haplotype: matching is thus for two haplotypes (genoidentical situation) or only one (semi- or haplo-identical situation) segregating in the family. In contrast, in unrelated situations, the haplotype information is known in the patient but not in the donor: that is, there is an asymmetry in the information available. This is usually resolved by using only the information that both share, namely the phenotype.

The large content of the BMD registries enables the estimation of HLA frequencies in a given population [6–10]. HLA population genetics data have always been a relevant field in which to apply maximum likelihood estimation of haplotype frequencies [11–13]. Because of the structure of the major histocompatibility complex region, such methods have successfully overcome the lack of phase information at the individual level to produce haplotype frequencies in populations. Besides their interest from the population point of view, we investigate here their possible use for the selection of unrelated donors from BMD registries in individual cases. The aim is to study how far population frequency information can be used to upgrade the donor information to the haplotype level for the individual decision, rather than downgrading the patient information to the phenotype level. Knowing the genetic background of donors throughout the registries, we implemented a systematic statistical inference of haplotype pairs at the individual level. It computes the most likely haplotype pair given the phenotype, using haplotype frequencies in the donor population as additional information. Incomplete phenotypes and the use of HLA nomenclature codes are allowed. Genetic properties influencing the accuracy of the prediction are discussed and may be of interest in genetic epidemiology as an example of an individual haplotype inference procedure.

POPULATION AND METHODS

As shown in the following sections, a diploid phenotype at three contiguous loci can result in a maximum of four distinct phase configurations on the chromosomes.

P.A. Gourraud et al.

HLA phenotype (Locus A)-(Locus B)-(Locus DR): (1, 2)-(8, 44)-(4, 3)
Possible pairs of haplotypes:
  1-8-3, 2-44-4
  1-44-3, 2-8-4
  1-8-4, 2-44-3
  1-44-4, 2-8-3



For a K-ploid phenotype of R contiguous loci, the number n of possible pairs of haplotypes is $n = K^{H_R - 1}$, where $H_R$ is the number of heterozygous loci. There is only one possible pair if only one locus is heterozygous. The proposed algorithm deals with this issue.

Algorithm

Given haplotype frequencies, the algorithm computes the likelihood for each possible phase. Then, it selects the one with the maximum value:

HLA phenotype (Locus 1)-(Locus 2)-(Locus 3): (A, a)-(B, b)-(C, c)
  Haplotype pair     Likelihood
  A-B-C, a-b-c       $L_1 = 2 \times f_{ABC} \times f_{abc}$
  A-B-c, a-b-C       $L_2 = 2 \times f_{ABc} \times f_{abC}$
  A-b-C, a-B-c       $L_3 = 2 \times f_{AbC} \times f_{aBc}$
  A-b-c, a-B-C       $L_4 = 2 \times f_{Abc} \times f_{aBC}$
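The enumeration and likelihood steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' C implementation: the function names `enumerate_haplotype_pairs` and `pair_likelihood` are hypothetical, and apart from the two frequencies taken from Table 2 (1-8-3 and 2-44-4), the haplotype frequencies are invented placeholders.

```python
from itertools import product


def enumerate_haplotype_pairs(phenotype):
    """Return every unordered haplotype pair compatible with a phenotype.

    `phenotype` is a sequence of (allele_1, allele_2) tuples, one per locus.
    """
    first_locus, *other_loci = phenotype
    pairs = set()
    # Keep the first locus fixed and try both orientations of every other
    # locus; homozygous loci yield duplicates that the set removes.
    for flips in product((False, True), repeat=len(other_loci)):
        hap_1, hap_2 = [first_locus[0]], [first_locus[1]]
        for (a, b), flip in zip(other_loci, flips):
            hap_1.append(b if flip else a)
            hap_2.append(a if flip else b)
        pairs.add(frozenset((tuple(hap_1), tuple(hap_2))))
    return pairs


def pair_likelihood(pair, freq):
    """Likelihood of one haplotype pair under Hardy-Weinberg equilibrium."""
    haplotypes = tuple(pair)
    if len(haplotypes) == 1:  # homozygous pair: squared frequency
        return freq.get(haplotypes[0], 0.0) ** 2
    return 2.0 * freq.get(haplotypes[0], 0.0) * freq.get(haplotypes[1], 0.0)


# Phenotype used in the text: (1, 2)-(8, 44)-(4, 3).
phenotype = [(1, 2), (8, 44), (4, 3)]
# 1-8-3 and 2-44-4 come from Table 2; the six others are invented placeholders.
freq = {
    (1, 8, 3): 0.0227, (2, 44, 4): 0.0207,
    (1, 44, 3): 0.0004, (2, 8, 4): 0.0003,
    (1, 8, 4): 0.0010, (2, 44, 3): 0.0008,
    (1, 44, 4): 0.0006, (2, 8, 3): 0.0005,
}

pairs = enumerate_haplotype_pairs(phenotype)
likelihoods = {pair: pair_likelihood(pair, freq) for pair in pairs}
best_pair = max(likelihoods, key=likelihoods.get)
print(len(pairs), sorted(best_pair))
```

Storing each pair as a `frozenset` makes the pair unordered, so a fully homozygous phenotype collapses to a single one-element pair, which is the case where the likelihood is the squared haplotype frequency.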

If the obtained pair of haplotypes is homozygous, the likelihood of such an (unambiguous) pair is the squared value of the haplotype frequency estimate. The probability P of the most likely pair of haplotypes is:

$$P = \frac{\max\{L_i : i \in \mathbb{N}^*,\ i \le I\}}{\sum_{i=1}^{I} L_i} \qquad (1)$$

where P is the prediction probability of the most likely haplotype pair; i is a natural integer used to enumerate the different haplotype pairs; $L_i$ is the likelihood of haplotype pair i as defined previously, given haplotype frequencies and Hardy-Weinberg equilibrium; and I is the overall number of possible haplotype pairs indexed by i. The ability of the method to find the most likely haplotype pair is summarized by the mean and median (measures of central tendency) and by percentiles (a percentile is the value at or below which a given percentage of the distribution of the phase prediction value falls) of the distribution of the probability P defined in Equation 1 over the considered sample. Several alternative estimations can be provided:

Inferred Haplotype Information for Donor Selection

1. Phenotypes sometimes include ambiguous codes. If those are specified, their handling is implemented in the algorithm. For example, if A9 must be resolved considering A23 and A24 as possible alleles, the algorithm can produce the corresponding possible pairs and compute the corresponding likelihoods. An example follows for the case in which DR3 must be resolved considering DR17 and DR18 as possible alleles:

HLA phenotype (Locus A)-(Locus B)-(Locus DR): (1, 2)-(8, 44)-(4, 3), with HLA-DR3 = HLA-DR17 or HLA-DR18
Possible pairs of haplotypes:
  1-8-18, 2-44-4
  1-44-18, 2-8-4
  1-8-4, 2-44-18
  1-44-4, 2-8-18
  1-8-17, 2-44-4
  1-44-17, 2-8-4
  1-8-4, 2-44-17
  1-44-4, 2-8-17

The haplotype prediction software performs the same computations over the set of possible phases deduced from the nomenclature codes.

2. Phenotypes are sometimes incomplete. To predict the possible haplotypes in such cases, the algorithm produces all possible haplotype pairs corresponding to the incomplete phenotype and computes the corresponding likelihoods.

During the phase prediction, a set of options manages the implementation of the nomenclature and the replacement of missing values in the phenotype. These features are implemented in software named "haplopred" (available on request from the authors). Computations are easily performed on a personal computer. The software is written in C and was developed with corresponding libraries so that it can be integrated flexibly into the computer systems of BMD registries.

The algorithm presented requires a set of haplotype frequencies. It has been applied to two sets of HLA data. The first consists of individuals with haplotypes known from family segregation and serves to validate the statistically predicted haplotype pairs. The second consists of unrelated phase-unknown individuals from the French BMD Registry and serves to describe the outcome of the method.
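The handling of ambiguous codes and missing values, followed by the normalization of Equation 1, might look like the following Python sketch. It is not the "haplopred" code: the names `BROAD_CODES`, `expand`, `candidate_pairs`, and `predict` are hypothetical, a missing value is assumed to stand for any allele in a caller-supplied list, and the frequencies other than those of 1-8-17 and 2-44-4 (taken from Table 2) are invented placeholders.

```python
from itertools import product

# Broad codes and their splits; A9 and DR3 are the two examples from the text.
BROAD_CODES = {("A", 9): (23, 24), ("DR", 3): (17, 18)}


def expand(locus, allele, known_alleles):
    """Alleles that one reported value may stand for."""
    if allele is None:  # missing value (assumption: any known allele fits)
        return tuple(known_alleles[locus])
    return BROAD_CODES.get((locus, allele), (allele,))


def candidate_pairs(phenotype, known_alleles=None):
    """All unordered haplotype pairs compatible with a phenotype.

    `phenotype` maps each locus name to its two reported alleles.
    """
    known_alleles = known_alleles or {}
    per_locus = []
    for locus, (a, b) in phenotype.items():
        options = set()
        for x in expand(locus, a, known_alleles):
            for y in expand(locus, b, known_alleles):
                options.update({(x, y), (y, x)})  # both phase orientations
        per_locus.append(options)
    pairs = set()
    for combo in product(*per_locus):
        hap_1 = tuple(x for x, _ in combo)
        hap_2 = tuple(y for _, y in combo)
        pairs.add(frozenset((hap_1, hap_2)))
    return pairs


def predict(phenotype, freq, known_alleles=None):
    """Most likely haplotype pair and its probability P (Equation 1)."""
    likelihoods = {}
    for pair in candidate_pairs(phenotype, known_alleles):
        haps = tuple(pair)
        if len(haps) == 1:  # homozygous pair
            likelihoods[pair] = freq.get(haps[0], 0.0) ** 2
        else:
            likelihoods[pair] = 2.0 * freq.get(haps[0], 0.0) * freq.get(haps[1], 0.0)
    total = sum(likelihoods.values())
    if total == 0.0:  # no required haplotype has an estimated frequency
        return None, None
    best = max(likelihoods, key=likelihoods.get)
    return best, likelihoods[best] / total


phenotype = {"A": (1, 2), "B": (8, 44), "DR": (4, 3)}  # DR3 = DR17 or DR18
# 1-8-17 and 2-44-4 come from Table 2; the other two values are placeholders.
freq = {(1, 8, 17): 0.0227, (2, 44, 4): 0.0207,
        (1, 44, 4): 0.0010, (2, 8, 17): 0.0010}
best, p = predict(phenotype, freq)
print(sorted(best), round(p, 4))
```

Haplotypes absent from the frequency table are given a frequency of zero here, so a phenotype that cannot be explained by any estimated haplotype returns no prediction rather than a division by zero.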


Application on Phase-Known Data

Centre d'Etude du Polymorphisme Humain (CEPH) families were used to apply the algorithm to phase-known data (data available on request). HLA-A, -B, -DR haplotypes were deduced from the familial study of HLA segregation [14]: 301 different pairs of HLA-A, -B, -DR haplotypes were obtained from 39 families. The prediction algorithm is applied to each phenotype, and the outcome is compared with the actual phase as defined by the study of segregation. The χ² test assesses the statistical significance of the predicted accuracy of the method.

Application on Phase-Unknown Population Data

Potential donors from the French BMD Registry typed for HLA-A, -B, and -DRB1 were used (N = 85,933). Haplotype frequencies were estimated by maximum likelihood with an expectation-maximization algorithm, following previously implemented procedures [6, 7]. As an approximation, all individuals with only one allele at a given locus were analyzed as homozygous at this locus. The description of the population can be found in the annual report of the French Registry (http://www.fgm.fr). The use of BMD registries to infer HLA haplotype frequencies has been widely discussed, as have the potential biases (such as selection on HLA-DR typing) [6–10].

The distribution and properties of the prediction statistics are presented for the results obtained on the BMD dataset. A prediction probability is given to each most likely haplotype pair assigned in the context of the search for an unrelated hematopoietic stem cell donor. The likelihood of each haplotype pair is based on the phenotype of the individual and on the population haplotype frequencies. A priori, each haplotype pair has the same chance of occurring, which defines a minimal prediction value. For example, in a phenotype with three heterozygous loci, four pairs of haplotypes are possible. In this case, the minimal prediction value is 25%.
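The 25% floor can be checked directly against Equation 1. The short derivation below is a sketch for the phenotype (A, a)-(B, b)-(C, c) with three heterozygous loci. It assumes that each haplotype frequency is the product of its allele frequencies; the allele-frequency symbols $f_A$, $f_a$, and so on are notation introduced here for illustration and are not used by the authors.

```latex
\begin{align*}
f_{XYZ} &= f_X \, f_Y \, f_Z
  && \text{(no association between loci)} \\
L_i &= 2 \, f_A f_a \, f_B f_b \, f_C f_c
  && \text{for each of the four pairs, } i = 1, \dots, 4 \\
P &= \frac{\max_i L_i}{\sum_{i=1}^{4} L_i} = \frac{L_1}{4 L_1} = 0.25
\end{align*}
```

More generally, the largest of I likelihoods is never smaller than their mean, so $P \ge 1/I$ whatever the haplotype frequencies, with equality only when all pairs are equally likely.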
This minimal value would be the one obtained in the absence of gametic disequilibrium. The detailed outcome of this prediction is given for the set of HLA-A, -B, -DR phenotypes of the French BMD Registry. The influence of several factors was evaluated. Key factors correlated with the quality of the prediction outcome were quantified by odds ratio (OR) and tested using the χ² test.

RESULTS

Of the 301 phase-known phenotypes from CEPH families, 69.4% were correctly predicted (n = 209/301). According to the prediction probabilities, the average ability of the algorithm to correctly predict an individual haplotype pair is expected to be 76.64% (standard deviation = 20.4%). The observed number of correct predictions is not significantly different from the expected number (χ² test, p = 0.17).

Among the 85,933 phenotypes typed for HLA-A, -B, -DR, the average value of the HLA-ABDR haplotype pair prediction probability is 0.761 (standard deviation 0.199), ranging from 0.26 to 1. Table 1 shows the distribution of the prediction probability according to its 10th, 25th, 50th, 75th, and 90th percentiles for HLA-ABDR and HLA-AB haplotype pair prediction. The distribution of the prediction probability in HLA-A, -B, -DR phenotypes and HLA-A, -B phenotypes is given in Figure 1.

TABLE 1 The 10th, 25th, 50th, 75th, and 90th percentiles of the haplotype inference probability for 85,933 French unrelated bone marrow donors: HLA-A, -B phenotypes (top) and HLA-A, -B, -DR phenotypes (bottom)

Phenotype A, B
Percentile (%)    Value    CI (95%)
10                0.592    0.590–0.594
25                0.733    0.730–0.737
50 (Median)       0.937    0.937–0.937
75                0.997    0.996–0.997
90                1        X

Phenotype A, B, DR
Percentile (%)    Value    CI (95%)
10                0.481    0.479–0.484
25                0.588    0.586–0.590
50 (Median)       0.794    0.792–0.797
75                0.956    0.955–0.957
90                0.994    0.993–0.994

Abbreviation: HLA = human leukocyte antigen. The last column reports the 95% confidence intervals (CIs) of the estimated percentiles.

FIGURE 1 Distribution of haplotype prediction given phenotypes in 85,933 French unrelated bone marrow donors. The distribution of predictions obtained for human leukocyte antigen (HLA)-A, HLA-B phenotypes is given in white. The distribution of predictions obtained for HLA-A, HLA-B, HLA-DR phenotypes is given in black.

Large differences exist between phenotypes, suggesting the interplay of various parameters: allele and haplotype frequency, homozygosity, and linkage disequilibrium. To clarify the different factors influencing the outcome of the prediction, examples are given in Table 2. This table is divided into four parts according to the prediction reliability. Individuals with at least two loci considered as homozygous represent roughly three quarters (4760/6223) of the predicted haplotype pairs given as nonambiguous (P = 1) (Table 2, Part 1). The remaining ones (1463/6223) correspond to different situations, as shown in Table 2, Part 2. They include the association of a frequent haplotype with a rarer one, or of two quite rare haplotypes in strong linkage disequilibrium. This also applies to predictions close to 1, including some phenotypes made of two frequent haplotypes (Table 2, Part 3). As expected, the prediction level is low when phenotypes include several frequent alleles in low linkage disequilibrium. Examples are given in Table 2, Part 4.

A few factors seem to be the major ones influencing the likelihood of the prediction: the degree of homozygosity, the frequency of the alleles, the frequency of the predicted haplotypes, and the linkage disequilibrium (the nonrandom association of alleles at two physically linked loci). Their consequences have been quantified by OR and underscore the following facts:


TABLE 2 Examples of haplotype pair prediction in 85,933 French unrelated bone marrow donors

Part 1: Phenotype with at least two homozygous loci
Haplotype 1   Frequency 1   Haplotype 2   Frequency 2   Prediction value
3-7-15        0.0257        3-7-15        0.0257        1
3-7-1         0.0032        3-40-1        0.0001        1
2-35-11       0.0032        2-35-13       0.0021        1
2-44-1        0.0056        2-27-1        0.0035        1
2-62-4        0.0091        2-62-13       0.0046        1
1-8-15        0.0035        1-8-17        0.0227        1

Part 2: Nonambiguous haplotype pair prediction
Haplotype 1   Frequency 1   Haplotype 2   Frequency 2   Prediction value
2-60-11       0.0012        29-44-11      0.0026        1
1-8-7         0.0023        31-44-7       0.0007        1
1-8-17        0.0227        3-53-17       1.4e-5        1
3-64-13       0.0002        3-7-8         0.0011        1
1-37-7        0.0003        25-44-7       0.0001        1
30-18-17      0.0040        30-39-1       4.0e-5        1
2-39-14       0.0004        25-18-14      0.0003        1
2-7-17        0.0005        25-8-17       0.0002        1
26-38-13      0.0032        66-41-13      0.0003        1
2-41-13       0.0008        11-35-103     0.0007        1

Part 3: Haplotype pair prediction >95%
Haplotype 1   Frequency 1   Haplotype 2   Frequency 2   Prediction value
1-8-3         0.0227        3-35-1        0.0125        0.9921
1-8-17        0.0227        2-62-4        0.0091        0.9855
3-7-15        0.0257        30-13-7       0.0044        0.9938
2-13-7        0.0035        29-44-7       0.0273        0.9844
2-12-4        0.0015        29-12-07      0.0016        0.9511
3-14-7        0.0006        03-35-11      0.0034        0.9961
1-8-17        0.0227        1-44-16       8.4e-5        0.9985
2-44-16       0.0022        68-65-13      0.00085       0.9782
1-8-13        0.0030        3-18-13       0.0005        0.9875
1-8-3         0.0227        11-5-15       3.5e-5        0.9697
2-60-13       0.0046        23-44-7       0.0082        0.9649
1-17-7        0.0018        3-35-1        0.0125        0.9868
2-57-7        0.0045        30-18-3       0.0039        0.9789
1-8-3         0.0227        11-56-1       0.0004        0.9964
1-8-3         0.0227        2-62-2        0.0002        0.9566
24-57-7       0.0011        24-62-4       0.0016        0.9795

Part 4: Haplotype pair prediction <35%
Haplotype 1   Frequency 1   Haplotype 2   Frequency 2   Prediction value
2-51-4        0.0039        24-62-13      0.0031        0.3468
1-51-4        0.0008        2-44-7        0.0104        0.3448
2-44-4        0.0207        24-51-1       0.0003        0.3166
2-44-4        0.0207        24-60-13      0.0003        0.3273
2-51-13       0.0058        28-44-11      0.0010        0.3339
2-44-11       0.0074        24-27-1       0.0007        0.3150
2-60-13       0.0046        3-51-4        0.0010        0.3488
3-51-11       0.0013        11-35-13      0.0022        0.3335
24-62-11      0.0018        32-35-1       0.0007        0.3253
2-18-1        0.0010        32-39-4       0.0001        0.3096
3-56-11       0.0001        24-51-13      0.0009        0.2915
11-18-13      0.0002        32-44-1       0.0003        0.3235
2-51-14       0.0017        3-18-10       2.5e-5        0.3295
2-21-13       0.0002        3-40-4        0.0002        0.3314
2-62-13       0.0046        3-27-4        0.0007        0.3215
2-51-7        0.0028        24-49-13      0.0005        0.3473

Part 1 presents the prediction of the most likely haplotype pair for HLA-A, -B, -DR phenotypes that have at least two homozygous loci. Part 2 presents the prediction of the most likely haplotype pair for HLA-A, -B, -DR phenotypes for which the result is nonambiguous despite fewer than two homozygous loci. Part 3 presents the prediction of the most likely haplotype pair for HLA-A, -B, -DR phenotypes for which the prediction value is between 95% and 1. Finally, Part 4 presents the prediction of the most likely haplotype pair for HLA-A, -B, -DR phenotypes for which the prediction value is below 35%.

1. The presence of at least one homozygous locus is associated with an increased prediction probability (OR = 2.08 [2.02–2.14]; p < 10⁻³).
2. The presence of the most frequent HLA alleles in the phenotype is not correlated with a high prediction value (p > 0.05). In the absence of significant linkage disequilibrium, allele frequency does not play a key role in the prediction outcome (data not shown).
3. Linkage disequilibrium also increases the prediction probability. Having at least one standardized pairwise linkage disequilibrium $|D'_{X-Y}| > 0.5$ between two loci X and Y is positively correlated with a higher prediction probability ($|D'_{\text{HLA*A}-\text{HLA*B}}|$: OR = 1.74 [1.69–1.79], p < 10⁻³; $|D'_{\text{HLA*B}-\text{HLA*DR}}|$: OR = 1.12 [1.09–1.16], p < 10⁻³). Interestingly, further analysis (not shown) indicates that positive linkage disequilibrium measures (association) are correlated with increased phase prediction probability (p < 10⁻³), whereas negative linkage disequilibrium measures (repulsion) are correlated with decreased phase prediction probability (p < 10⁻³). Moreover, positive linkage disequilibrium among the three loci is strongly associated with a higher prediction probability ($D'_{\text{HLA*A}-\text{HLA*B}-\text{HLA*DR}} > 0.01$; OR = 6.7 [6.46–6.95], p < 10⁻³).

Homozygosity and positive linkage disequilibrium are the major factors influencing the prediction quality.

DISCUSSION

In this article, we addressed the possibility of predicting haplotype phase from individual HLA phenotypes. We described the outcome of the proposed method on 301 phase-known individuals of the CEPH panel and on 85,933 HLA-ABDR phenotypes of the French BMD Registry. The key factors correlated with the quality of the prediction outcome were homozygosity and linkage disequilibrium, both of which clearly increase the prediction probability. This study on individual phase inference is dedicated to HLA phenotypes and shows that part of the information available in familial situations can be reached by a statistical method in an unrelated situation. It also helps show that individual haplotype prediction could be of practical use. The method is a way to incorporate the current knowledge of linkage disequilibrium in the HLA region, gathered through the registries, into the interpretation of their phenotypes.

Other studies focusing on haplotype inference in the HLA region deal with association studies rather than transplantation genetics [15, 16]. In other genetic regions, several studies have demonstrated the power of haplotype prediction. Examples are available in Xu et al. [17] for five single nucleotide polymorphisms (SNPs) in the N-acetyl transferase 2 gene (NAT2, 8p22) and for five SNPs on the X chromosome (Xp11.4), or in Orzack et al. [18] for nine apolipoprotein E sites (APOE, 19q13.22). The relevance of such an approach is always discussed regarding the phase predicted rather than the prediction value. Thus the relevance of haplotype prediction methods must be assessed depending on the kind of markers used and on the genomic region considered. Validation of the predicted haplotype phase probability in itself is difficult in practice. For each phenotype, it would require a significant sample of phase-known HLA data from unrelated individuals. The haplotype approach is powerful because it reduces the number of theoretically possible phenotypes analyzed.

The method assumes that the haplotype frequency estimations and the phenotypes under analysis are drawn from the same population. The individual HLA data from CEPH families and the potential BMD phenotype data were recruited within the French population. The French BMD Registry provides haplotype frequency estimations. The use of registry haplotype frequencies as approximations of those of the source population [6–10] has been previously discussed [19]. In the French Registry, the DR-typing bias has been reduced in recent years. Nevertheless, at the individual level, haplotype phase inference is limited by the origins of the donors. The phase prediction for individuals in non-Caucasian CEPH families (Amish and Venezuelan, for example) provides evidence that applications should be restricted to the population used for the haplotype frequency estimation. Thus different population haplotype frequency estimations should be used in cases of different genetic background.

In the context of HLA and transplantation, haplotype analysis has been mainly used at the population level, especially to model the likelihood of finding a donor [20, 21]. However, the results presented here show that an application at the individual level may also help to assess the degree of haplotype matching for unrelated transplantation. No matter the sample size or the number of haplotype frequency estimations used to compute the phase prediction, the method is limited: no prediction probability can be lower than 0.25 (the threshold for random attribution of phase in phenotypes with three heterozygous loci). It is possible that none of the haplotypes required to explain a phenotype were estimated in the reference sample. In this case, the prediction probability cannot be computed. Even if confidence intervals may be computed by bootstrap methods of haplotype frequency estimation, sampling error on very rare haplotypes remains the main source of variability of the estimation [22].

Haplotype pair predictions may be routinely given to clinicians by the prediction tools described here. They have been implemented by France Greffe de Moëlle. On request, haplotype phase prediction is performed using the latest estimation of HLA-ABDR haplotype frequencies in the source population. The predicted pair of haplotypes is indicated in the sorted output, whereas the Registry data themselves remain unchanged. In some cases, the user may explore the likelihood of all possible haplotype pairs. By implementing simple statistics, the method presented provides more HLA information to help decide which donors to solicit.

Such a tool implemented in the BMD registries may be of practical interest in several respects. It would evaluate the chances for a phenotype to match a given haplotype. The results presented here suggest that most (about 76%) of the HLA-A, -B, -DR phenotype-matched individuals are matched at the haplotype level in the French BMD Registry. Some of them are compatible at the phenotype level only. This confirms, from the statistical point of view, previous findings showing that matching for ancestral haplotypes (strong linkage or positive linkage disequilibrium) increased survival in unrelated transplantation [23, 24]. A similar study using HLA haplotype frequencies at a higher resolution level could investigate to what extent a higher level of HLA typing influences matching at the haplotype level in the unrelated situation. Handling other HLA loci or genetic markers in the HLA region may also help to define compatibility at the haplotype level.

The findings presented here can be applied to assess the degree of HLA matching in any kind of transplantation or when typing relatives is not possible. For example, the phase prediction probabilities assessing HLA-ABDR haplotype matching may indicate further HLA-typing requirements. They can also characterize the identity for one haplotype in unrelated situations when only partially incompatible donors are available. We demonstrated here that taking advantage of the genetic structure of HLA data gives access to more information than expected. This has general relevance as a decision element in the assessment of compatibility in transplantation. Thoughtful statistical considerations on immunogenetics and populations may allow the development of practical tools of clinical relevance.

ACKNOWLEDGMENT

The authors wish to gratefully acknowledge the help of the France Greffe de Moelle staff and the bioinformatics platform of Genopole Toulouse Midi-Pyrénées.

REFERENCES

1. Mughal TI, Goldman JM: Chronic myeloid leukaemia: current status and controversies. Oncology (Huntingt) 18:837, 2004.
2. Petersdorf EW, Anasetti C, Martin PJ, Gooley T, Radich J, Malkki M, Woolfrey A, Smith A, Mickelson E, Hansen JA: Limits of HLA mismatching in unrelated hematopoietic cell transplantation. Blood 104:2976, 2004.
3. Petersdorf EW, Anasetti C, Martin PJ, Hansen JA: Tissue typing in support of unrelated hematopoietic cell transplantation. Tissue Antigens 61:1, 2003.
4. Hansen JA, Yamamoto K, Petersdorf E, Sasazuki T: The role of HLA matching in hematopoietic cell transplantation. Rev Immunogenet 1:359, 1999.
5. Petersdorf EW, Mickelson EM, Anasetti C, Martin PJ, Woolfrey AE, Hansen JA: Effect of HLA mismatches on the outcome of hematopoietic transplants. Curr Opin Immunol 11:521, 1999.
6. Gourraud PA, Genin E, Cambon-Thomsen A: Handling missing values in population data: consequences for maximum likelihood estimation of haplotype frequencies. Eur J Hum Genet 12:805, 2004.
7. Lonjou C, Clayton J, Cambon-Thomsen A, Raffoux C: HLA-A, -B, -DR haplotype frequencies in France: implications for recruitment of potential bone marrow donors. Transplantation 60:375, 1995.
8. Martinetti M, Degioanni A, D'Aronzo AM, Benazzi E, Carpanelli R, Castellani L, Cenzuales S, De Biase U, De Filippo C, De Giuli A, Gerosa A, Fare M, Ferrioli G, Galvani G, Lombardo C, Malagoli A, Marchesi S, Mascaretti L, Motta F, Sioli V, Rinaldini C, Rizzolo L, Pascutto C, Bernardinelli L, Salvaneschi L: An immunogenetic map of Lombardy (Northern Italy). Ann Hum Genet 66:37, 2002.
9. Muller CR, Ehninger G, Goldmann SF: Gene and haplotype frequencies for the loci HLA-A, HLA-B, and HLA-DR based on over 13,000 German blood donors. Hum Immunol 64:137, 2003.
10. Rendine S, Borelli I, Barbanti M, Sacchi N, Roggero S, Curtoni ES: HLA polymorphisms in Italian bone marrow donors: a regional analysis. Tissue Antigens 52:135, 1998.
11. Piazza A: Haplotypes and linkage disequilibrium from three-locus phenotypes. Histocompatibility Testing, Munksgaard: Kissmeyer-Nielsen, eds, Copenhagen, 923, 1975.
12. Yasuda N: Estimation of haplotype frequency and linkage disequilibrium parameter in the HLA system. Tissue Antigens 12:315, 1978.
13. Morton NE, Simpson SP, Lew R, Yee S: Estimation of haplotype frequencies. Tissue Antigens 22:257, 1983.
14. Bugawan TL, Klitz W, Blair A, Erlich HA: High-resolution HLA class I typing in the CEPH families: analysis of linkage disequilibrium among HLA loci. Tissue Antigens 56:392, 2000.
15. Cordell HJ, Clayton DG: A unified stepwise regression procedure for evaluating the relative effects of polymorphisms within a gene using case/control or family data: application to HLA in type 1 diabetes. Am J Hum Genet 70:124, 2002.
16. Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA: Score tests for association between traits and haplotypes when linkage phase is ambiguous. Am J Hum Genet 70:425, 2002.
17. Xu CF, Lewis K, Cantone KL, Khan P, Donnelly C, White N, Crocker N, Boyd PR, Zaykin DV, Purvis IJ: Effectiveness of computational methods in haplotype prediction. Hum Genet 110:148, 2002.
18. Orzack SH, Gusfield D, Olson J, Nesbitt S, Subrahmanyan L, Stanton VP Jr: Analysis and exploration of the use of rule-based algorithms and consensus methods for the inferral of haplotypes. Genetics 165:915, 2003.
19. Schipper RF, Oudshoorn M, D'Amaro J, van der Zanden HG, de Lange P, Bakker JT, Bakker J, van Rood JJ: Validation of large data sets, an essential prerequisite for data analysis: an analytical survey of the Bone Marrow Donors Worldwide. Tissue Antigens 47:169, 1996.
20. Mori M, Graves M, Milford EL, Beatty PG: Computer program to predict likelihood of finding an HLA-matched donor: methodology, validation, and application. Biol Blood Marrow Transplant 2:134, 1996.
21. Kollman C, Abella E, Baitty RL, Beatty PG, Chakraborty R, Christiansen CL, Hartzman RJ, Hurley CK, Milford E, Nyman JA, Smith TJ, Switzer GE, Wada RK, Setterholm M: Assessment of optimal size and composition of the U.S. National Registry of hematopoietic stem cell donors. Transplantation 78:89, 2004.
22. Fallin D, Schork NJ: Accuracy of haplotype frequency estimation for biallelic loci, via the expectation-maximization algorithm for unphased diploid genotype data. Am J Hum Genet 67:947, 2000.
23. Tay GK, Witt CS, Christiansen FT, Charron D, Baker D, Herrmann R, Smith LK, Diepeveen D, Mallal S, McCluskey J, et al: Matching for MHC haplotypes results in improved survival following unrelated bone marrow transplantation. Bone Marrow Transplant 15:381, 1995.
24. Christiansen FT, Witt CS, Dawkins RL: Questions in marrow matching: the implications of ancestral haplotypes for routine practice. Bone Marrow Transplant 8:83, 1991.