Sequence-Based Typing of HLA Class I Alleles in Alaskan Yupik Eskimo

Mary E. Ellexson-Turner, Mary S. Leffell, Andrea A. Zachary, Seán Turner, Tara Bennett, David A. Sidebottom, Kai Cao, Marcelo Fernández-Viña, and William H. Hildebrand

ABSTRACT: In comparison to natives of South America, native North Americans tend to be less diverse in their repertoire of HLA class I alleles. Based upon this observation, we hypothesized that the Yupik Eskimo would exhibit a limited number of previously identified class I HLA alleles. To test this hypothesis, sequence-based typing was performed at the HLA-A, -B and -C loci for 99 Central Yupik individuals from southwestern Alaska. Two new class I alleles, A*2423 and Cw*0806, were identified. While A*2423 was observed in only one sample, Cw*0806 was present in 26 of the 99 individuals and all of the Cw*0806 samples contained B*4801. Allele Cw*0806 differs from Cw*0803 by a single nucleotide substitution such that Cw*0803 may be the progenitor of Cw*0806. Allele Cw*0803 was originally characterized as unique to South America, but detection of Cw*0803 in the Yupik indicates that Cw*0803 was a founding allele of the Americas. The presence of new alleles and previously unrecognized founding alleles in the Yupik population shows that natives of North America are more diverse than previously envisioned.

KEYWORDS: Yupik; Eskimo; Amerindian; HLA; Polymorphism

ABBREVIATIONS
HLA human leukocyte antigen
SBT sequence-based typing
TCR T cell receptor

From the University of Oklahoma Health Sciences Center, 975 NE 10th Street, Oklahoma City, OK, 73104 (M.E.E.-T., S.T., T.B., D.A.S., W.H.H.); American Red Cross, 22 Green Street South, Baltimore, MD, 21201 (K.C., M.F.-V.); and Johns Hopkins University, 2041 E. Monument Street, Baltimore, MD, 21205 (M.S.L., A.A.Z.).
Address reprint requests to: William H. Hildebrand, Department of Microbiology and Immunology, The University of Oklahoma Health Sciences Center, 975 NE 10th Street, BRC 317, Oklahoma City, Oklahoma 73104; E-mail: [email protected]
Received November 9, 2000; revised February 12, 2001; accepted March 16, 2001.
Human Immunology 62, 639–644 (2001). © American Society for Histocompatibility and Immunogenetics, 2001. Published by Elsevier Science Inc.
0198-8859/01/$–see front matter PII S0198-8859(01)00243-9

INTRODUCTION

The Alaska Natives known as the Eskimo can be divided into two main linguistic groups: the Inupiat of the far north, and the Yupik of the southwestern coast and Yukon-Kuskokwim Delta. The Yupik live within a region of mostly flat, marshy plains crisscrossed by many waterways where milder temperatures of -80 to 80 degrees Fahrenheit yield a variety of vegetation and wildlife, although the sea is the principal focus of activities [1].

It has been suggested that the Eskimo originated in Asia. One premise among anthropologists is that people migrating from Asia across the Bering land bridge to Alaska populated the Americas [2]. However, the exact geographic source, number of migrations, and timing of these population movements (10,000 to 40,000 years ago) remain controversial [1–6]. Further migrations from Alaska to the south are thought to have peopled much of North and South America. Exploration of archeological, linguistic, and genetic data from Amerindian populations located within North and South America has yielded evidence to support migrations from Asia across a land bridge to Alaska [7].

HLA studies of Amerindian populations have provided novel insights into the putative origins of Native
American tribes and the evolution of HLA class I polymorphism [2, 8]. Previous investigations of Native American populations have revealed fewer HLA-A, -B, and -C alleles than are found in other ethnic groups [9–11]. While a small number of previously uncharacterized HLA alleles have been detected within these populations, few characteristics of the HLA in Northern Amerindians are unique to this population [12, 13]. For the most part, HLA alleles detected within modern Northern Amerindian populations can consistently be matched to proposed founding alleles detected in Asian populations [10].

Previous data have shown conspicuous differences between the North and South Amerindian populations. While most North Amerindian HLA alleles match proposed founding alleles, South Amerindian populations possess numerous unique HLA alleles that apparently have arisen from proposed founding alleles [12, 14, 15]. Unique alleles in South Amerindians are concentrated at the HLA-B locus, with fewer new alleles seen at the HLA-A and -C loci.

On the basis of these previous observations we hypothesized that class I HLA alleles in the Yupik Eskimo would match founding alleles as seen in earlier studies of North Amerindians. To test this hypothesis we performed DNA sequence-based typing on 99 individuals from two Yupik villages. Eighty-five of these individuals had no first-degree relatives in the study group. The predominant alleles and putative impact of polymorphism among the class I alleles in this population are discussed.

MATERIALS AND METHODS

DNA Samples
Samples in the form of frozen buffy coats from 99 Alaskan Eskimos were supplied by the ASHI/NIH Minority HLA Workshop. Genomic DNA was extracted from 200 µL of each sample using the QIAamp blood tissue kit (Qiagen, Valencia, CA) (96-well-plate format) following the manufacturer’s instructions.
PCR and Direct Sequencing of PCR Products
Exons 2 and 3 were amplified using HLA-A, -B, or -C locus-specific primers, and bidirectional sequencing of each exon was performed as previously described [16].

Cloning and Sequencing
Putative new HLA alleles were reamplified from genomic DNA using HLA locus-specific primers as previously described and cloned into the TA vector using the Topo TA cloning kit (Invitrogen, Carlsbad, CA) according to the manufacturer’s instructions. Ten white colonies from each sample were picked and grown overnight in 10 mL of LB media containing 50 mg/mL of ampicillin. Plasmid DNA was isolated using the Promega Wizard kit (Promega, Madison, WI) following the manufacturer’s protocol. Each clone underwent EcoRI digestion and was run on a 0.8% agarose gel to screen for insert. Screen sequencing of clones containing insert was achieved with a -21mer M13 Universal primer and an AutoRead sequencing kit (Amersham Pharmacia Biotech, Piscataway, NJ). Bidirectional sequencing of clones from the desired populations was performed as before in combination with an M13 reverse primer. Sequencing reactions were again loaded onto a 6% Page Plus gel, which was run on an Amersham Pharmacia ALFexpress automated DNA sequencer. Sequence data were analyzed using the Wisconsin GCG (Genetics Computer Group, Madison, WI) sequence analysis system on a Digital Equipment Corporation VAX 6610.

Accession Numbers and Nomenclature
The new HLA alleles detected here were submitted to GenBank and assigned accession numbers (A*2423: AF128537, AF128538 and Cw*0806: AF082800, AF082801). The names A*2423 and Cw*0806 were officially assigned by the World Health Organization (WHO) Nomenclature Committee. This follows the agreed policy that, subject to the conditions stated in the Nomenclature Report, names will be assigned to new sequences as they are identified. Lists of such new names will be published in the next WHO Nomenclature Report.

RESULTS

Data for the HLA-A locus revealed that eight of the 21 HLA-A lineages are represented within this population (Figure 1 [17]). The number of HLA-A lineages in the Yupik is high compared to other Northern Amerindian populations, in which three to four of the 21 lineages are represented [2, 10]. The most prevalent alleles found in this Yupik cohort are A*2402101, A*68012 and A*0206, with A*2402101 and A*68012 previously proposed as founding alleles for Native American populations (Table 1) [2].
Whilst A*0206 is not listed as a founding allele in previous studies, its high frequency within Asian ethnicities and the Yupik leads us to propose A*0206 as a founding allele for natives of North America [18, 19]. Thirty-seven of the Yupik individuals were found to be homozygous at the HLA-A locus, with 29 of the 37 homozygotes being HLA-A*2402101. Of interest was the characterization of a new A*24 allele, A*2423, which shows the greatest homology to the proposed founding allele A*2402101. Allele A*2423 contains a single G-to-T nucleotide substitution at position 571, which changes amino acid 167 from glycine (G) to tryptophan (W) (Figure 2). This new allele was detected in only one Yupik individual.

FIGURE 1 Phylogenetic trees of HLA-A, -B and -C lineages. Lineages that were observed in this Yupik population are bolded.

Data for the HLA-B locus show 12 of the 34 HLA-B lineages represented within this Yupik Eskimo cohort (Figure 1), with B*4002, B*27052, B*4801, and B*3501 demonstrating the highest frequencies. Once again the number of HLA-B alleles and lineages represented in this population is considerably higher than numbers previously observed in North Amerindian groups [10]. Of nine proposed HLA-B founding alleles for North and South Amerindians, seven are represented within the Yupik (Table 1). Unlike the HLA-A locus, only 11 individuals were homozygotes, with seven homozygous for B*4002. No new HLA-B alleles were detected while sequencing exons 2 and 3.

TABLE 1 A compilation of Asian HLA class I alleles considered to be originating alleles for current North and South Amerindian populations [2]. Alleles detected within this Yupik Eskimo population have been bolded and newly proposed founding alleles have been underlined.

HLA-A        HLA-B        HLA-C
A*0201       B*1501       Cw*0102
A*0206       B*27052      Cw*02022
A*2402101    B*3501       Cw*03031
A*31012      B*39011      Cw*03041
A*68012      B*40012      Cw*04011
             B*4002       Cw*0702
             B*4801       Cw*0801
             B*51011      Cw*0803
             B*5102       Cw*1502

Data for the HLA-C locus demonstrated 10 of the 15 HLA-C lineages present in this population (Figure 1). Previous studies found fewer HLA-C lineages within North and South Native Americans [9, 10]. Eight of the nine putative founders for Amerindian HLA-C alleles were detected in this study (Table 1). The predominant -C locus alleles seen in the Alaskan Yupik were Cw*03041, Cw*02022, and Cw*0806. Eighteen of the 99 individuals were found to be homozygotes, with 13 of these 18 people being homozygous for Cw*03041. Although previous studies did not list Cw*03031 as a founding allele, its presence among Asian ethnicities and its high frequency in the Yupik lead us to suggest Cw*03031 as a founding allele for Amerindians [19, 20].

The most substantial class I HLA finding in this population was the new allele Cw*0806. This new Cw*08 allele was found in 26 of the 99 individuals, demonstrating a high frequency within the Central Yupik Eskimo. Allele Cw*0806 is most homologous to Cw*0803, differing by a single nucleotide at position 559 (Table 2a). This nucleotide change of A to G translates to a unique residue change of threonine to alanine at amino acid 163 atop the alpha 2 alpha-helix (Table 2b). In this sample group, the new Cw*0806 allele demonstrated a complete association with B*4801; whenever Cw*0806 was detected, B*4801 was present at the -B locus. A comparison of the Cw*08 subtypes indicates that Cw*0806 is a single nucleotide removed from Cw*0803.
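The subtype comparison described above can be illustrated with a short Python sketch. It transcribes the polymorphic positions listed in Table 2a, lists the sites at which two alleles differ, and converts a nucleotide position to a mature-protein residue number. The function names, the dictionary layout, and the 24-codon leader offset are assumptions made for this illustration and are not part of the typing protocol; the offset is used only because it reproduces the nucleotide/residue pairs quoted in the text and in Table 2 (for example 559 and 163, 571 and 167).

```python
"""Compare Cw*08 subtypes at the polymorphic positions listed in Table 2a."""

# Nucleotide positions from Table 2a (289 lies in exon 2, the rest in exon 3).
POSITIONS = (289, 485, 526, 527, 539, 559, 595)

# Bases at those positions, transcribed from Table 2a with every dash
# expanded to the Cw*0801 base it stands for.
CW08_SUBTYPES = {
    "Cw*0801": "ACACTAG",
    "Cw*0802": "AAGAGAG",
    "Cw*0803": "ACACTAA",
    "Cw*0804": "AAGATAG",
    "Cw*0805": "GAGAGAG",
    "Cw*0806": "ACACTGA",
}

# ASSUMPTION (not stated in the paper): positions count from the first base
# of the start codon, and the mature protein begins after a 24-codon leader.
LEADER_CODONS = 24


def differences(allele_a, allele_b):
    """Return (position, base_a, base_b) for each listed site where two alleles differ."""
    seq_a = CW08_SUBTYPES[allele_a]
    seq_b = CW08_SUBTYPES[allele_b]
    return [(pos, a, b) for pos, a, b in zip(POSITIONS, seq_a, seq_b) if a != b]


def residue_number(nucleotide_position):
    """Map a coding-sequence position to a mature-protein residue number."""
    codon_index = (nucleotide_position - 1) // 3 + 1
    return codon_index - LEADER_CODONS


if __name__ == "__main__":
    for pos, old, new in differences("Cw*0803", "Cw*0806"):
        print(f"Cw*0803 -> Cw*0806: {old}{pos}{new}, residue {residue_number(pos)}")
    print("A*2423 position 571 -> residue", residue_number(571))
```

Only the sites that vary among the listed subtypes are compared, so the sketch checks the relationships shown in Table 2a rather than re-deriving them from full-length sequences.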
The unique positioning of the Cw*0806 substitution indicates no putative donors for gene conversion, suggesting that Cw*0806 arose from Cw*0803 via point mutation. Allele Cw*0803 was initially found in South America and proposed to have arisen from the founder Cw*0801 in South America [12]. However, Cw*0803 has now been found in Asia [20] and in our characterization of Yupik individuals. We therefore propose that Cw*0803 is an Asian founder of class I molecules in the Americas.

TABLE 2 All nucleotide (a) and amino acid (b) differences between Cw*0801 and other Cw*08 subtypes are shown here with positions numbered within the corresponding exon or alpha domain. Dashes represent positions of homology to Cw*0801.

(a) Nucleotide differences
           EXON 2   EXON 3
Allele     289      485   526   527   539   559   595
Cw*0801    A        C     A     C     T     A     G
Cw*0802    -        A     G     A     G     -     -
Cw*0803    -        -     -     -     -     -     A
Cw*0804    -        A     G     A     -     -     -
Cw*0805    G        A     G     A     G     -     -
Cw*0806    -        -     -     -     -     G     A

(b) Amino acid differences
           Alpha 1  Alpha 2
Allele     73       138   152   156   163   175
Cw*0801    T        T     T     L     T     G
Cw*0802    -        K     E     R     -     -
Cw*0803    -        -     -     -     -     R
Cw*0804    -        K     E     -     -     -
Cw*0805    A        K     E     R     -     -
Cw*0806    -        -     -     -     A     R

FIGURE 2 Ribbon diagrams depicting the location of amino acid substitutions for new alleles A*2423 and Cw*0806. New positions are shaded black.

DISCUSSION

Sequence-based typing was performed at the HLA-A, -B and -C loci for 99 Central Yupik individuals from southwestern Alaska. The hypothesis was that the Yupik would most resemble natives of North America, and data were compared to previous Amerindian population studies. We applied DNA sequence-based typing to scrutinize all of exons 2 and 3, and the resulting data provide new insights into the number, ancestry, and putative functional impact of class I alleles in the Central Yupik population.

The HLA typing data presented here and in previous studies indicate that various factors have contributed to HLA diversity within Native American populations. For example, in South Amerindian populations, HLA-B genes are more divergent from their proposed founders than HLA-A and -C genes [15, 18]. Many new HLA-B alleles have been detected within various tribes, several of which have replaced proposed founding alleles originating in Eastern Asia. This South American HLA-B allelic turnover may be explained by environmental stresses such as parasitic infection and other climate-dependent pathogens, which may have led to natural selection for HLA genes within these populations [2, 21]. Indeed, investigation into the HLA pocket supermotifs of South American HLA-B molecules indicated a dominance of heterozygosity resulting in an increased peptide repertoire, further suggesting natural selection [21].

Unlike South American Indians, North American Indian populations have retained many of the proposed Asian founding alleles and possess fewer new alleles [2, 10]. Less divergence has been interpreted to mean that the founding alleles have been successful in protecting these people against pathogens encountered within Northern America. Alternatively, detection of fewer alleles in Northern Amerindians may result from a genetic bottleneck that limited the number of founding alleles. Finally, epidemics of influenza and TB introduced by Europeans in the 1850s may have eliminated or selected against particular HLA alleles or haplotypes.

The detection of new Yupik HLA-A and HLA-C alleles, and the presence of multiple families of founding alleles in the Yupik, contrast with Amerindian studies in North America. We are now faced with a scheme where there are multiple founders and several unique individual alleles in Alaska, few founders and little diversification reported in North America, and an intermediate number of founders with many new alleles in South America. Integration of these Yupik data with those of North and South America raises a few questions: Are there more founding alleles than previously thought? Are the new alleles detected within the Yupik unique to this population?

All discussion of founding alleles, new alleles, and allelic HLA diversity must be qualified by the quality of the HLA typing data itself. Prior Native American studies tended to rely upon cost-efficient, low-resolution HLA typing methods for initial characterization of the study population. Putative variants were then flagged for DNA sequence analysis.
Using this approach, HLA allelic variants differing at multiple sites become most readily apparent; both serologic and low-to-intermediate resolution molecular typing are well suited for detecting multiple nucleotide/amino acid substitutions. In contrast, SBT equally weights all polymorphisms (i.e., point substitutions and gene conversion events). Here, we directly apply high-resolution SBT, detecting two new alleles that each differ by only a single nucleotide/amino acid substitution from previously described alleles. By extrapolation, it is conceivable that HLA-A and HLA-C point substitutions leading to allelic diversity in North American Natives are widespread but have gone largely undetected by methods less sensitive to point substitutions. Indeed, the high frequency of Cw*0806 among Yupiks would indicate that prevalent HLA alleles escape detection in such populations, making discussion of founding alleles and allelic diversity somewhat circular. Thus, until significant numbers of natives in the Americas are subjected to SBT, it is difficult to reach conclusions concerning trends in divergence, the number of founders, the number of alleles at a particular locus, or the strength of selective pressure operating at the class I HLA-A, -B, and -C loci.

One thing that can be commented on is the nature of new alleles found in the Yupik. It is noteworthy that both the new A*2423 and Cw*0806 molecules contain polymorphisms positioned to interact with the N-terminus of presented ligands. The G to W polymorphism at residue 167 in A*2423 has previously been shown to have a “gatekeeper” function whereby the side chain of 167 controls access of P1 in peptide ligands to the charged side chain of amino acid 166 [22]. In this instance the bulky Trp at 167 in α2 of A*2423 will restrict P1 access to the negatively charged side chain of glutamic acid at 166, whereas the smaller Gly side chain of A*2402101 will allow P1 of ligands to interact with the side chain at position 166. It is therefore predicted that a subset of the ligands bound by A*2402101 will be distinguished by a positive charge at P1 when compared to ligands bound by A*2423. Fewer data are available on how the Cw*0806 threonine to alanine polymorphism at 163 impacts ligand presentation (pooled Edman data cannot call P1 of ligands). However, the side chain of 163 falls into the A, B, and D specificity pockets, suggesting that there would be a difference in the peptide repertoire of Cw*0803 versus Cw*0806 [23–25]. Taken together, the polymorphisms which distinguish A*2423 and Cw*0806 are positioned to modify the N-terminus of the ligands presented by these class I molecules.

In summary, we found more putative “founding” and “new” HLA class I alleles in the Yupik population than studies of North American Indians would have led us to predict. Also, we did not expect the diversity generated at the HLA-A and HLA-C loci in the Yupik population to result from point substitutions.
Whether the alleles Cw*0806 and A*2423 are unique to the Yupik population or found elsewhere in Asia or the Americas is unclear at this time, but the detection of two point substitutions providing for Cw*0806 and A*2423 in the Yupik demonstrates that new variants continue to await detection. Given that the Cw*0806 and A*2423 polymorphisms are positioned to interact with the N-terminus of bound peptides as well as with the TCR [26], the importance of thoroughly screening for class I polymorphisms arising via point substitution cannot be overstated.

ACKNOWLEDGMENTS
This work was supported, in part, by NIH Contract No. 1 AI82514. The data reported in this study are part of a larger project, organized by the American Society for Histocompatibility and Immunogenetics, to better define the HLA system among Native Americans and Alaska Natives.
REFERENCES

1. Harper AB: Origins and divergence of Aleuts, Eskimos, and American Indians. Ann Hum Biol 7(6):547, 1980.
2. Parham P, Ohta T: Population biology of antigen presentation by MHC class I molecules. Science 272:67, 1996.
3. Ossenberg NS: Congruence of distance matrices based on cranial discrete traits, cranial measurements, and linguistic-geographic criteria in five Alaskan populations. Am J Phys Anthropol 47(1):93, 1977.
4. Petersen GM, et al.: Genetic polymorphisms in southwest Alaskan Eskimos. Hum Hered 41(4):236, 1991.
5. Scott EM, Wright RC: Genetic diversity of Central Yupik Eskimos. Hum Biol 55(2):409, 1983.
6. Gibbons A: Geneticists trace the DNA trail of the first Americans [news]. Science 259(5093):312, 1993.
7. Bailliet G, et al.: Founder mitochondrial haplotypes in Amerindian populations. Am J Hum Genet 55(1):27, 1994.
8. Watkins DI, et al.: New recombinant HLA-B alleles in a tribe of South American Amerindians indicate rapid evolution of MHC class I loci. Nature 357:329, 1992.
9. Martinez-Arends A, et al.: Characterization of the HLA class I genotypes of a Venezuelan Amerindian group by molecular methods. Tissue Antigens 52(1):51, 1998.
10. Cadavid LF, Watkins DI: Heirs of the jaguar and the anaconda: HLA, conquest and disease in the indigenous populations of the Americas [corrected and republished article originally printed in Tissue Antigens 1997 Sep; 50(3):209]. Tissue Antigens 50(6):702, 1997.
11. Kostyu DD, Amos DB: Mysteries of the Amerindians. Tissue Antigens 16/17:111, 1981.
12. Belich MP, et al.: Unusual HLA-B alleles in two tribes of Brazilian Indians. Nature 357:326, 1992.
13. Hildebrand WH, et al.: Serologic cross-reactivities poorly reflect allelic relationships in the HLA-B12 and HLA-B21 groups: dominant epitopes of the α2 helix. J Immunol 149:3563, 1992.
14. Garber TL, et al.: HLA-B alleles of the Navajo: No evidence for rapid evolution in the Nadene. Tissue Antigens 47(2):143, 1996.
15. Parham P, et al.: Episodic evolution and turnover of HLA-B in the indigenous human populations of the Americas. Tissue Antigens 50(3):219, 1997.
16. Ellexson-Turner M, et al.: Precision Genotyping of Human Leukocyte Antigen-A, -B and -C Loci Via Direct DNA Sequencing, in Methods in Molecular Biology, Antigen Processing and Presentation Protocols, J. Solheim (ed), Totowa, New Jersey: Humana Press. pp. 129–142.
17. Madrigal JA, et al.: Structural diversity in the HLA-A10 family of alleles: correlations with serology. Tissue Antigens 41:72, 1993.
18. Ezquerra A, et al.: Molecular analysis of an HLA-A2 functional variant CLA defined by cytolytic T lymphocytes. J Immunol 137(5):1642, 1986.
19. Parham P, et al.: Diversity and diversification of HLA-A, B, C alleles. J Immunol 142:3937, 1989.
20. Wang H, et al.: Identification of HLA-C alleles using PCR-single-strand-conformation polymorphism and direct sequencing. Tissue Antigens 49(2):134, 1997.
21. Fernandez-Vina MA, et al.: Dissimilar evolution of B-locus versus A-locus and class II loci of the HLA region in South American Indian tribes. Tissue Antigens 50(3):233, 1997.
22. Prilliman KR, et al.: Alpha-2 domain polymorphism and HLA class I peptide loading. Tissue Antigens 54(5):450, 1999.
23. Saper MA, Bjorkman PJ, Wiley DC: Refined structure of the human histocompatibility antigen HLA-A2 at 2.6 Å resolution. J Mol Biol 219:277, 1991.
24. Matsumura M, et al.: Emerging principles for the recognition of peptide antigens by MHC class I molecules. Science 257:927, 1992.
25. Chelvanayagam G: A roadmap for HLA-A, HLA-B, and HLA-C peptide binding specificities. Immunogenetics 45:15, 1996.
26. Bjorkman PJ, Parham P: Structure, function and diversity of class I major histocompatibility molecules. Annu Rev Biochem 59:253, 1990.