Genomics 85 (2005) 201 – 207 www.elsevier.com/locate/ygeno
Straightening out the LINEs: LINE-1 orthologous loci
Huei Jin Ho (a), David A. Ray (a), Abdel-Halim Salem (a,b), Jeremy S. Myers (a,c), Mark A. Batzer (a,*)
(a) Department of Biological Sciences, Biological Computation and Visualization Center, Louisiana State University, 202 Life Sciences Building, Baton Rouge, LA 70803, USA
(b) Department of Anatomy, Faculty of Medicine, Suez Canal University, Ismailia, Egypt
(c) Department of Biochemistry, 607 Light Hall, Vanderbilt University, Nashville, TN 37232, USA
Received 23 August 2004; accepted 29 October 2004
Available online 2 December 2004
Abstract
The L1Hs preTa subfamily of long interspersed elements (LINEs) originated after the divergence of human and chimpanzee and is therefore found only in the human genome. Thirty-three of the 254 L1Hs preTa elements are polymorphic for the absence/presence of the insertion, making them useful markers for studying human population genetics. The problem of homoplasy, however, can diminish the value of LINEs as phylogenetic and population genetic markers. We examined anomalous orthologous sites in a range of nonhuman primates. Only two cases of other mobile elements inserting near the preintegration sites of L1Hs preTa elements were observed: an AluY insertion in Chlorocebus and an L1PA8 insertion in Aotus. Sequence analysis showed that both elements were clearly distinguishable from their human counterparts. We conclude that L1 elements can continue to be regarded as essentially homoplasy-free genetic characters.
© 2004 Elsevier Inc. All rights reserved.
Keywords: LINE; Long interspersed element; Homoplasy; Mobile element
Long interspersed elements (LINEs) are found abundantly in mammalian genomes and comprise ~21% of the human genome [1]. They are autonomous elements, meaning they encode the enzymatic machinery required for their own transposition. A fully functional LINE is about 6 kb in length and consists of a 5′ untranslated region (UTR) with an internal RNA polymerase II promoter, two nonoverlapping open reading frames (ORF1 and ORF2), which are separated by an intergenic spacer, and a 3′ UTR with a polyadenylation signal and ends with a poly(A) tail [2]. LINEs are usually flanked by target site duplications (TSDs) of 7 to 20 bp at each end as a result of their mode of retrotransposition, target primed reverse transcription (TPRT) [3,4].
Sequence data from this article have been deposited with the GenBank Data Library under Accession Nos. AY705214–AY705231. * Corresponding author. Fax: +1 225 578 7113. E-mail address:
[email protected] (M.A. Batzer).
0888-7543/$ - see front matter © 2004 Elsevier Inc. All rights reserved. doi:10.1016/j.ygeno.2004.10.016
LINE-1 (L1) is the youngest and the only known actively retrotransposing subfamily of LINEs in primates. The L1 subfamily emerged around 120 mya and is found in all mammals [5]. There are over 500,000 copies of L1 in the human genome, making up 17% of the sequence, but most of these L1s have lost their ability to mobilize due to 5′ truncations, 5′ inversions, and accumulation of deleterious point mutations [2,6–8]. The use of LINE and SINE (short interspersed element) insertions as phylogenetic and population genetic markers is increasing. Some recent applications of mobile elements to phylogeny include the elucidation of hominid phylogeny [9], an examination of interfamilial relationships in turtles [10], and clarification of cichlid phylogenetics [11]. Mobile elements make excellent phylogenetic and population genetic markers primarily because they have two major advantages over more traditional molecular data such as mitochondrial and nuclear sequences. First, the presence of a mobile element in an individual is thought to represent identity by descent, since the probability that two different mobile elements would integrate independently in the same
H.J. Ho et al. / Genomics 85 (2005) 201–207
chromosomal location is small [12–15]. A second advantage of these genetic markers is that the ancestral state of an insertion polymorphism is known to be the absence of the element at a particular genomic location [12,13,16]. Precise knowledge of the ancestral state of a genomic polymorphism allows us to draw trees of population and phylogenetic relationships without making unnecessary assumptions [13,15–20]. This does not mean that mobile elements are without problems with regard to phylogenetic analysis. It is known that insertion homoplasy can occur across distantly related taxa as a function of evolutionary time and variable retroposition rates among species [21–23]. This can limit the application of mobile elements in examinations of more diverse taxa. Random sorting of the ancestral allelic lineages, sequence convergence, and sequence exchanges between alleles or duplicated loci have also been identified as likely factors confounding the interpretation of the interrelationships among species. The purpose of the present study is to investigate these concerns. It is important to determine the frequency of potential homoplasy events since they can affect the accuracy of phylogeny inference. The preTa subfamily of L1 elements, characterized by diagnostic mutations at positions 5930–5932, began expanding in the human genome ~2.34 mya. Recently, Salem et al. [24] surveyed primate genomic variation at LINE-1 preTa loci. In the course of their study, they noted several instances of potential homoplasy as evidenced by PCR analyses. Such examples have also been noted in a recent examination of the Ta subfamily [25]. In that work, no instances of PCR amplification patterns with the potential to be interpreted as homoplasy were due to secondary LINE insertions. Instead, most anomalous amplifications were due to insertions of Alu elements in the same 100-bp preintegration site. The present study expands on the work of Salem et al. 
[24] by examining L1 preTa orthologous loci in a larger nonhuman primate phylogenetic panel.
Results
Of the estimated ~400 preTa LINEs in the human genome, 254 were found to be amenable to evolutionary PCR analysis [24]. Two hundred thirty-five loci amplified empty sites in one or more nonhuman primates. The positions and sequences of the empty sites were determined by computationally removing the L1 element along with one of the target site duplications [25]. PCR was conducted on the remaining 19 loci using the internal subfamily-specific primer (5′-CCTAATGCTAGATGACACG-3′) and the 3′ flanking primer to investigate the possibility that the L1Hs preTa subfamily is much older than suspected and therefore would have the L1 insert in the nonhuman primate orthologous sites. Twelve of those loci showed the presence of an L1 insertion in humans but no PCR product was observed in any of the nonhuman primate genomes. The other 7 loci showed no PCR amplification in either human or nonhuman primates. The absence of an empty site in all nonhuman primate samples at the 12 loci may suggest that some genomic sequence was deleted upon the insertion of the L1 element [26,27]. If this were the case, we would underestimate the size of the empty site in the nonhuman primates. The absence of PCR products could also be due to technical reasons caused by mutations in the primer binding sequence, preventing PCR amplification. As would be expected, successful amplification of orthologous loci decreased as estimated divergence times increased. For example, the numbers of orthologous loci amplified in the hominid taxa, common chimpanzee (222), pygmy chimpanzee (224), gorilla (224), and orangutan (199), were substantially higher than the number of successful amplifications in green monkey (132), owl monkey (91), and galago (42). Of a total of 1134 loci amplified, 9 of the empty sites did not match their predicted sizes. Sequence analysis of the PCR products revealed the precise nature of the events that contributed to those anomalies. There were four instances (L1AD3, L1AD54, L1AD138, and L1AD207) of larger than expected fragment sizes at the orthologous sites in nonhuman primates (Table 1). 
At locus L1AD3, a simple sequence repeat expansion increased the size of the amplicon in green monkey by 111 bp. Segments of the L1AD54 and L1AD138 loci were duplicated in the green monkey and gorilla, respectively. Finally, a 17-bp insertion occurred at locus L1AD207 in orangutan. Three deletion events were observed in various orthologous empty sites (L1AD9, L1AD44, and L1AD207). Fig. 1 illustrates
Table 1
Simple sequence insertions and genomic deletions
L1Hs preTa loci (rows): L1AD3, L1AD9, L1AD44, L1AD54, L1AD138, L1AD207
Species (columns): Human, Common chimp, Pygmy chimp, Gorilla, Orangutan, Green monkey, Owl monkey, Galago
Size differences recorded: +111 bp; 137 bp (b); 39 bp (b); +40 bp (a); +76 bp (a); +17 bp (a); 177 bp (b)
"0" denotes no amplification or, at locus 207, amplification of a nonorthologous product. " " denotes presence of the expected PCR product.
(a) Simple sequence insertions at orthologous loci.
(b) Genomic deletions at orthologous loci.
Fig. 1. Anomalous events in the L1AD207 locus. A 17-bp insertion occurred in the green monkey locus and a 177-bp deletion in the owl monkey locus independent of the L1AD207 element insertion. The red arrow denotes the size of the expected empty site (511 bp) determined by computationally removing the L1 insertion.
some of the different types of anomalous orthologous sites at the L1AD207 locus. It is important to note that the deletions mentioned above do not appear to have been facilitated by LINE insertion events. LINE-mediated deletions occur during TPRT and result in larger than expected empty sites in the orthologous loci of nonhuman primate genomes. The mechanism that leads to such deletions remains unclear. Only one such event, a 26-bp deletion at locus L1AD361, was recovered from our analyses. This apparent LINE-mediated deletion occurred in a noncoding region of the genome, deleting a portion of a LIMC3 element. It is possible that additional LINE-mediated deletions occurred but have gone undetected by gel electrophoresis because they were too small (1–20 bp) to be detected or too large (>1.5 kb) to be amplified in a standard PCR. Only two independent, near-parallel insertion events were detected among all the 1134 amplified sites. First, the L1AD216 orthologous locus in owl monkey had a truncated L1PA8 insertion just 1 base away from the L1AD216 insertion site. In the second case, an AluY element inserted into the green monkey genome 78 bp from the L1AD273 insertion site (Fig. 2). These near-parallel insertions do not qualify as authentic homoplasy events for several reasons. Both amplicons were of different sizes compared to the filled sites of the loci in the human lineage. In addition, DNA sequence analysis of the loci revealed that they did not have the same insertion sites as their human counterparts or identical TSDs upon their insertion. Thus we conclude that they are not authentic homoplasious mobile element insertions.
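The criteria applied to the two near-parallel insertions can be phrased as a simple check. The following Python sketch is illustrative only: the `Insertion` record, the `classify` function, the coordinate values, and the TSD strings are hypothetical and are not taken from the sequence data; only the 1-bp and 78-bp offsets come from the text. It treats a secondary insertion as an authentic parallel event only if it lies at exactly the same site as the human element and carries identical TSDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insertion:
    """A mobile element insertion on shared locus coordinates (hypothetical model)."""
    element: str   # element type, e.g. "L1Hs preTa", "AluY", "L1PA8"
    position: int  # insertion point in bp on the aligned orthologous locus
    tsd: str       # target site duplication sequence


def classify(human: Insertion, other: Insertion) -> str:
    """Label a secondary insertion relative to the human element.

    'authentic parallel' requires the same insertion site AND identical TSDs;
    anything else found near the site is 'near-parallel'.
    """
    same_site = human.position == other.position
    same_tsd = human.tsd.upper() == other.tsd.upper()
    if same_site and same_tsd:
        return "authentic parallel"
    return "near-parallel"


# Placeholder coordinates and TSDs (assumptions); offsets of 1 bp and 78 bp
# follow the L1AD216 and L1AD273 observations described in the text.
human_216 = Insertion("L1Hs preTa", 1000, "AAGAATGT")
owl_216 = Insertion("L1PA8", 1001, "TTCTGGAA")
human_273 = Insertion("L1Hs preTa", 1000, "AAGAATGT")
green_273 = Insertion("AluY", 1078, "GGCATTAC")

for human, other in [(human_216, owl_216), (human_273, green_273)]:
    offset = abs(other.position - human.position)
    print(f"{other.element}: {offset} bp away -> {classify(human, other)}")
```

Under these assumptions both observed events are labeled near-parallel, which matches the conclusion drawn from sequence analysis. In practice the amplicon size difference noted above would be an additional, independent line of evidence.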
Discussion The expansion of mobile elements in the mammalian genome has provided a method to trace the evolutionary history of related taxa. After initial insertion into the genome, a mobile element typically drifts toward being fixed for presence or is lost from the population. Fixed LINEs and SINEs remain in the genome and are passed down to all descendants; therefore, mobile elements that are shared by a group of organisms indicate that they share a common ancestor. This makes LINEs and SINEs ideal markers to examine the evolutionary history of closely related organisms [9,28–30]. However, the reliability of mobile elements as markers has been questioned because they may be susceptible to insertion homoplasy, especially in distantly related taxa as a function of evolutionary time [22]. The present data strongly support the hypothesis that individual mobile element insertions are unique events in the evolutionary history of a genome. None of the anomalous preTa amplification patterns resulted from authentic parallel forward insertions. In addition, there was no evidence for the clean deletion of a preTa LINE observed in the human lineage. The chances of a true parallel insertion are so small as to be ignored for most practical purposes. A true parallel L1 insertion is defined as a secondary insertion of an L1 in exactly the same locus in a different genome. The insertion would also have to create identical TSDs. To date, no parallel mobile element insertions in nonhuman primate taxa fit these criteria, thus supporting the idea that mobile elements are homoplasy-free markers
Fig. 2. Near-parallel insertion of an AluY in the L1AD273 green monkey ortholog. A 313-bp AluY inserted 78 bp away from the L1AD273 preTa element insertion site. The predicted empty sites were amplified in all orthologous sites except orangutan, owl monkey, and galago. No band is visible in human due to the large size of the filled site (>6 kb).
[9,25,31]. Several examples of apparent mobile element homoplasy have been reported in the past for Alu elements [9,21,32,33], but none have ever been reported for L1 elements. Detailed DNA sequence analysis showed that all of the events within the primate lineage involving Alu repeats were not truly homoplasious insertions. Because it is often impractical to sequence all of the amplified PCR products to determine their authenticity, apparent homoplasy (the occurrence of a PCR product larger than the expected empty site due to an independent insertion event) can pose a problem when mobile elements are used as phylogenetic markers. It is therefore necessary to estimate the frequency of these events. The frequencies of observed anomalous orthologous sites in the gorilla, orangutan, green monkey, and owl monkey
are 0.446 (1/224), 0.503 (1/199), 3.788 (5/132) and 2.198% (2/91), respectively. No anomalous orthologous sites were recovered from the chimpanzee and galago genomes. Humans diverged from chimpanzees relatively recently, about 4–6 mya [34], giving them little time to accumulate new insertions in their genome [35]. Therefore, the lack of apparent parallel insertions is not surprising. Owl monkey and galago, having greater divergences from the human lineage, would be expected to have more anomalous orthologous sites since the chances of parallel and near-parallel insertions increase across more distantly related taxa. Our observations were probably skewed because many of the owl monkey and galago loci could not be amplified by PCR due to increased sequence divergence. Thus, the number of anomalous events may be underestimated,
Table 2
List of DNA sources for all species studied
Species                Common name                Origin        ID No.
Homo sapiens           Human                      ATCC (a)      CCL2
Pan troglodytes        Common chimpanzee          Coriell (b)   AG06939
Pan paniscus           Bonobo                     Coriell (b)   AG05253
Gorilla gorilla        Lowland gorilla            Coriell (b)   AG05251
Pongo pygmaeus         Orangutan                  Coriell (b)   NG12256
Chlorocebus aethiops   Green monkey               ATCC          CCL-70
Aotus trivirgatus      Three-striped owl monkey   ATCC          CRL1556
Galago senegalensis    Senegal galago             Cell line     (c)
(a) From cell lines provided by the American Type Culture Collection, P.O. Box 1549, Manassas, VA 20108, USA.
(b) Coriell Institute for Medical Research, 403 Haddon Avenue, Camden, NJ 08103, USA.
(c) Adenovirus 12 SV40-transformed fibroblast cell line maintained in the laboratory of Dr. Mark Batzer, Louisiana State University.
especially in owl monkey and galago. The individual near-parallel insertion rates for green monkey and owl monkey are 0.758 (1/132) and 1.099% (1/91), respectively. We also
calculated an overall regional near-parallel insertion frequency of 0.176% (2/1134). Similar frequencies were previously reported by Vincent et al. [25]. Taking their data
Table 3
Redesigned preTa L1 oligonucleotide primers
                                                                 PCR product size
Name      Forward primer              Reverse primer             Filled  Empty    Subfamily  AT
L1AD1     TTCCCTTCCTTTGTGAATGTCT      ATATGGCCATCTTGACTCAGTG     6778    191      574        54
L1AD5     TCATCTCACAGAGCTCACAG        CTAGGAATCCTTCTGTCTGG       749     326      150        55
L1AD8     TGGTTTCAATCCCTACTTCTGG      TCTGGGTGGAATGATAAAGTCA     985     No TSDs  353        54
L1AD44    ACATTGGTGTCTGAGTGTCTGG      GTTGCTCCAAAGGAACTTTGTT     6394    334      314        55
L1AD59    GCAGTGGGACATTGACTCCTAC      TGTGGCATAGGTTTCTGGAAGT     767     No TSDs  286        55
L1AD63    GCCCACTAGTCTGTTTTGTGCT      ATGCCTTGGACATGGTGAATAG     6389    329      254        54
L1AD64    TCTGGAGGCCACTGCTAATC        AGAAAGGCATGACAGCCAGT       6290    236      179        54
L1AD65    TCCTCAAAGTTGATGCTCCTC       CCTTCCCTTGTTCCCTCATT       6808    292      573        60
L1AD68    GCATGCATACTGGACAAAACAT      CTTACTTCATCCCATGCACCAC     788     605      201        54
L1AD70    TGTGTTCAGTATGCGGGTTC        ACAATTTGTGGGCCTAGCAC       959     No TSDs  309        NA
L1AD74    TTTGTTTCAAGCCAATGCTG        TGGCATACTCGTATTCTAAGTGC    1237    No TSDs  310        54
L1AD96    GGGAAAGCTCATTTGTTTGC        ACATCCAGACCACCAGGAAG       1003    169      303        54
L1AD123   TCATTACAGACCATTGACATGC      TAAATGCATTGGCACCACAC       555     243      282        54
L1AD138   CCAGACAAGTTTGCCTTATGACT     TGCTCCCTTATGACCACTATCAT    1109    404      408        54
L1AD149   AGGCCGAAACACAGATAAGC        TGGTGGTGCCCATATTTGTA       2193    541      277        55
L1AD174   CCATTGGACTTCCCTCTTGA        AGAGCTGCACACCCAGAGAT       341     159      127        54
L1AD177   CTGCCTTTCTGATTGCTCGT        CCTGGGAATTCCATCGTAAC       6713    599      684        53
L1AD185   CTAGGGCCACTGGTAGGGTA        ATCACCGGGCTGTAACAAAC       6402    No TSDs  326        55
L1AD186   GTCAGCAGGCCAATGTGAC         GGAGAACTTGCGCATTAGGA       791     138      247        54
L1AD190   TTGGGGAATAATCATGCACTC       AAGCAGCATCTACAGGCAAAG      440     228      337        54
L1AD192   AGCAACAGTAAGTCCCCATTT       TGACTTTAGTGACTCCTGCTCTTTT  1034    227      363        54
L1AD193   TCTTTTACTCCCAAAAGGAA        TTGGGTAGATGAAGATGACC       1833    193      236        60
L1AD195   TGTCCACCAGTCCTGTGATG        TGCCTCTTGGTAACCGTAGC       6449    341      402        NA
L1AD197   GATTCACGGAAGTTAGGTAGCC      ACCCCAGGTACACACACCTAGT     6262    185      300        54
L1AD209   AATGAGTTCACGCATTGTGTT       AGCAAAGCAAGGCAGGTATG       1653    200      248        54
L1AD214   AAGTGACGCACCTTCTGCTT        TGGAGGTTGACTCCGATGTA       6325    193      344        55
L1AD219   GCAGGGAATATTTGGGACAT        CAGTCCCCACCACACTAGAA       6424    360      395        54
L1AD226   GCCCCTAGAGGCATTTGAGT        CAACAAGTTTACGCAGAACACTG    733     No TSDs  283        54
L1AD242   TGAGAGGGGGATTATTTTGA        CCTAACAGTCAGGAAAGCTGA      6245    186      197        54
L1AD243   GTGAACATGACTGTGATATTTTAAGG  TTGTGGTGTTGACTGCATGA       1755    131      282        54
L1AD244   CCCCTGTGGTCTTCCTTCA         CCAGAGTCTGATGCGTTTGA       591     No TSDs  243        54
L1AD254   CCAAACTTTAAGAACGCCATGT      GTGGGGGAGGTTTAGGGTAG       904     No TSDs  355        54
L1AD261   CTATGGACCCATCTGACTGT        AGTTATTAAACCGGCCACTA       6269    222      245        55
L1AD276   CCTAGCCATATTGAACCGTGA       TGGATTTCAAGAGGAGACCAA      727     No TSDs  293        54
L1AD287   TGCCTAAGCCCAAATCTGAA        TTCAAATTCTCCTCACCTATGG     369     128      177        55
L1AD290   CTCCCATGCCTCAACATCTC        GAACCCACGAGGTTGTTAGC       779     204      240        54
L1AD291   TGGAAAAATATCCCATAATGA       TTTCAGATGGTTTTTCAACA       6277    180      311        53
L1AD293   AGCACTTTCTTTGCCTTGGA        TTCACATTCCAGTAGGGGAGA      2184    182      292        54
L1AD295   CCATTCCGCATGGAAAATTA        GCAGCTTTGTACCGAAGTCC       991     No TSDs  327        53
L1AD299   AGAGTTTCCCAGCTGCACTC        AGATCAATGGCTCTGCGAAC       6344    No TSDs  280        54
L1AD301   AAATTCCTGAGCGCTAACCT        TGGAATGTGAGGATGAAGGA       1279    157      230        54
L1AD318   ACCTTGACCATGGGATGAAC        ATGCCTGTGGACTTGCTACA       634     179      258        54
L1AD325   AGAATGGATGTTGGGTGCTC        TGTCCCCCATGAACAAATTC       1099    No TSDs  358        53
L1AD327   AAAACATATTTGGAGGAGCA        GTGACCTGGTGTTTTTGTCT       6315    202      314        55
L1AD328   TGGCCATTCTCATGTTCTCA        AGCATCACCAACACAACCTG       2431    233      240        54
L1AD334   TTGACTTGTTTAGAAAGGGATT      GGATAAAGCTGAAAGCTCAA       6322    233      215        53
L1AD338   TCCAATTTGCAACAGCTACA        CTGCACATTGCTTTGGACAT       6454    186      440        54
L1AD339   GTTAAAATGCCAGGCTGAT         TGAGAAATGTGTTCTCCAAA       1169    136      349        53
L1AD348   AGGAAGATTCCACAGAATGTGA      AGAGTTTGACAACTGGCTGGT      1379    No TSDs  224        54
L1AD349   ACAAGCTGCAATTGTGTTGG        ACTGCCTTGCTCTCCTTTCA       1805    177      235        54
L1AD355   CTGAGTGCCTGCAATCCTTT        GAAACTGGGTAAACCCCAAG       1692    217      308        54
L1AD359   AAGGGCATATAAAACTGGTG        GCACCCATTAACTCATCATT       6460    356      328        NA
L1AD372   TCGAAATACACTTACGCCTCAA      GGATAAACCACAATAGTGACCATC   2217    172      279        55
L1AD373   GGAGAGGCAAGAAACTCCAA        CTGCACTGTGTTGTCATTGGT      850     192      236        55
L1AD383   TGGTGGTCTCAGAGTAAACA        ACCCAAAACATCATTAGTGC       1642    117      1026       54
L1ADY8    TCACACGTATCCCTTTGCAG        GCGCTTTGTGTCCTATGTTG       2041    343      432        NA
into account, a total of seven Alu and one LINE near-parallel insertion events were recovered from an analysis of 2470 orthologous loci. It is possible for L1-mediated deletions to cause disease if parts of a gene are removed. Since most of the genome consists of noncoding sequences, while less than 5% of the sequences make up genes, the probability of a mobile element inserting within a gene and causing a deletion is low. Only one (7.69%) of the anomalies we found was an L1-mediated deletion. This is lower than the estimate of 21.62% made by Gilbert, Lutz-Prigge, and Moran [26]. However, because small deviations in PCR product sizes are difficult to detect by our methods, there may be other short L1-mediated deletions (<20 bp different from the predicted empty site size) that have gone undetected. L1-mediated deletions can cause homoplasy if the size of the deletion is similar to the size of the L1 insertion. PCR amplification would result in filled and empty sites of the same size, resembling a parallel insertion in each primate orthologous locus. No such events were noted in our study, suggesting that such events are rare. The lack of homoplasy observed in LINEs can be explained in several ways. Since LINEs have a slower amplification rate than Alu elements, the probability of a parallel insertion is low. In addition, new L1 insertions are variable in length due to 5′ truncations, which are easily distinguished by PCR as different-sized L1 insertions. In contrast, only a fraction (~10%) of Alu elements is truncated during the TPRT process, making parallel insertions of the same size more common. Alternatively, parallel L1 insertions could be harder to detect by PCR if they are too large to be amplified in a standard PCR. Most L1s are larger than 1 kb in size, while Alu elements are only about 300 bp. In conclusion, our examination of L1 preTa orthologous loci in nonhuman primates produced no evidence of authentic insertion homoplasy. 
Rates of near-parallel insertion were similar to those reported in other studies, ~1%. While we continue to recommend detailed sequence analysis of any loci exhibiting anomalous amplification patterns, the low frequency of authentic secondary insertions suggests that such analysis of all loci in all organisms may not be necessary.
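The frequencies quoted in this Discussion follow directly from the counts of amplified loci reported in the Results. As a minimal sketch for checking that arithmetic, the Python snippet below recomputes them; the dictionary names and the `percent` helper are illustrative choices, not part of the original analysis.

```python
# Orthologous loci successfully amplified per species (counts from the Results).
amplified = {
    "common chimpanzee": 222,
    "pygmy chimpanzee": 224,
    "gorilla": 224,
    "orangutan": 199,
    "green monkey": 132,
    "owl monkey": 91,
    "galago": 42,
}

# Anomalous orthologous sites and near-parallel insertions per species,
# as stated in the Discussion.
anomalous = {"gorilla": 1, "orangutan": 1, "green monkey": 5, "owl monkey": 2}
near_parallel = {"green monkey": 1, "owl monkey": 1}


def percent(count: int, total: int) -> float:
    """Express count/total as a percentage."""
    return 100.0 * count / total


total_loci = sum(amplified.values())
print(f"total amplified loci: {total_loci}")

for species, n in anomalous.items():
    print(f"{species}: anomalous {percent(n, amplified[species]):.3f}% ({n}/{amplified[species]})")

for species, n in near_parallel.items():
    print(f"{species}: near-parallel {percent(n, amplified[species]):.3f}% ({n}/{amplified[species]})")

overall = percent(sum(near_parallel.values()), total_loci)
print(f"overall near-parallel: {overall:.3f}% ({sum(near_parallel.values())}/{total_loci})")
```

The per-species counts sum to the 1134 amplified loci, and the per-species and overall percentages agree with the values given in the text to three decimal places.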
Materials and methods Primate DNA samples were isolated from cell lines from Coriell Cell Repositories: HeLa (ATCC No. CCL-2), common chimpanzee (Pan troglodytes) Wes (Repository No. AG06939), pygmy chimpanzee (Pan paniscus) (Repository No. AG05253), gorilla (Gorilla gorilla) lowland gorilla (Repository No. AG05251), orangutan (Pongo pygmaeus) (Repository No. NG12256), green monkey (Chlorocebus aethiops) (ATCC No. CCL-70), owl monkey (Aotus trivirgatus) (ATCC No. CRL-1556), and galago
(adenovirus 12 SV40-transformed Galago senegalensis fibroblasts). Sources for the cell lines are listed in Table 2. Two hundred fifty-four individual preTa L1 insertion loci were amplified in a range of primate taxa to test for the presence or absence of the elements. The panel tested included all species listed in Table 2. Twenty-five-microliter PCR amplifications were performed under the following conditions: 10–100 ng of template DNA, 40 pM each oligonucleotide primer, 50 mM KCl, 10 mM Tris–HCl (pH 8.4), 1.5 mM MgCl2, 0.2 mM dNTPs, and Taq DNA polymerase (1 U). All reactions were subjected to an initial denaturation step at 94°C for 150 s, followed by 32 cycles of a 150-s denaturation step at 94°C, a 15-s primer annealing step at the specified annealing temperature, and a 60-s extension step at 72°C. This was followed by a final 180-s extension step at 72°C. Gel electrophoresis was performed on the PCR products in 2% agarose gels stained with ethidium bromide. PCR bands were detected using UV fluorescence. Primer sequences and annealing temperatures used for PCR amplification have been previously reported by Salem et al. [24]. Lack of consistent amplification in some nonhuman primate taxa prompted the redesign of some primers used by Salem et al. [24]. All of the redesigned primer sequences are indicated in Table 3. PCR products purified directly or from the agarose gel were cloned with the TOPO TA Cloning Kit (Invitrogen, Carlsbad, CA, USA) according to the manufacturer's directions and clones were sequenced using chain termination DNA sequencing on an ABI 3100 automated DNA sequencer [36]. DNA sequences were aligned using MegAlign v5.00. All the sequences generated for the project have been deposited with GenBank under Accession Nos. AY705214–AY705231.
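The Results describe predicting each empty site by computationally removing the L1 element along with one of the target site duplications. A minimal Python sketch of that idea is given below. It is not the authors' pipeline: the function name `predict_empty_site`, its parameters, and the toy sequences are assumptions made for illustration, and the filled site is assumed to have the layout 5′ flank + TSD + element + TSD + 3′ flank.

```python
def predict_empty_site(filled: str, element_start: int, element_end: int, tsd_len: int) -> str:
    """Predict the preintegration (empty) site from a filled-site sequence.

    Assumed layout: 5' flank + TSD + element + TSD + 3' flank, where the
    element occupies filled[element_start:element_end]. The element and the
    second TSD copy are removed; one TSD copy is kept.
    """
    if tsd_len <= 0 or not (tsd_len <= element_start <= element_end <= len(filled)):
        raise ValueError("invalid element coordinates or TSD length")
    first_tsd = filled[element_start - tsd_len:element_start]
    second_tsd = filled[element_end:element_end + tsd_len]
    if first_tsd != second_tsd:
        # Without matching TSDs the insertion boundaries are ambiguous.
        raise ValueError("no matching TSDs flank the element")
    return filled[:element_start] + filled[element_end + tsd_len:]


# Toy sequences only (hypothetical, not real L1 or flanking sequence).
flank5 = "GATTACAGG"
tsd = "AAGCTTA"
element = "GGGAGGAGGAGCCAAGATG" + "A" * 10  # stand-in for an L1 with a poly(A) tail
flank3 = "CCTGATTC"

filled = flank5 + tsd + element + tsd + flank3
start = len(flank5) + len(tsd)
end = start + len(element)

empty = predict_empty_site(filled, start, end, len(tsd))
print(empty)
print(len(filled) - len(empty))  # size difference = element length + one TSD
```

Under these assumptions the predicted empty-site length equals the filled-site length minus the element and one TSD copy. Loci listed as "No TSDs" in Table 3 would presumably fail the TSD check here, which is consistent with no predicted empty-site size being given for them.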
Acknowledgments Drs. D. Donze and M. Noor contributed comments to earlier drafts of the manuscript. This research was supported by Louisiana Board of Regents Millennium Trust Health Excellence Fund Grants (2000-05)-05, (2000-05)-01, and (2001-06)-02 (to M.A.B.) and National Science Foundation Grant BCS-0218338 (M.A.B.).
References
[1] E.S. Lander, L.M. Linton, B. Birren, C. Nusbaum, M.C. Zody, J. Baldwin, K. Devon, K. Dewar, M. Doyle, W. FitzHugh, R. Funke, D. Gage, K. Harris, A. Heaford, J. Howland, L. Kann, J. Lehoczky, R. LeVine, P. McEwan, K. McKernan, J. Meldrim, J.P. Mesirov, C. Miranda, W. Morris, J. Naylor, C. Raymond, M. Rosetti, R. Santos, A. Sheridan, C. Sougnez, N. Stange-Thomann, N. Stojanovic, A. Subramanian, D. Wyman, J. Rogers, J. Sulston, R. Ainscough, S. Beck, D. Bentley, J. Burton, C. Clee, N. Carter, A. Coulson, R. Deadman, P. Deloukas, A. Dunham, I. Dunham, R. Durbin, L. French, D. Grafham, S. Gregory, T. Hubbard, S. Humphray, A. Hunt, M. Jones, C. Lloyd, A. McMurray, L. Matthews, S. Mercer, S. Milne, J.C. Mullikin, A. Mungall, R. Plumb, M. Ross, R. Shownkeen, S. Sims, R.H. Waterston, R.K. Wilson, L.W. Hillier, J.D. McPherson, M.A. Marra, E.R. Mardis, L.A. Fulton, A.T. Chinwalla, K.H. Pepin, W.R. Gish, S.L. Chissoe, M.C. Wendl, K.D. Delehaunty, T.L. Miner, A. Delehaunty, J.B. Kramer, L.L. Cook, R.S. Fulton, D.L. Johnson, P.J. Minx, S.W. Clifton, T. Hawkins, E. Branscomb, P. Predki, P. Richardson, S. Wenning, T. Slezak, N. Doggett, J.F. Cheng, A. Olsen, S. Lucas, C. Elkin, E. Uberbacher, M. Frazier, et al., Initial sequencing and analysis of the human genome, Nature 409 (2001) 860–921.
[2] H.H. Kazazian Jr., J.V. Moran, The impact of L1 retrotransposons on the human genome, Nat. Genet. 19 (1998) 19–24.
[3] T.G. Fanning, M.F. Singer, LINE-1: a mammalian transposable element, Biochim. Biophys. Acta 910 (1987) 203–212.
[4] D.D. Luan, M.H. Korman, J.L. Jakubczak, T.H. Eickbush, Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition, Cell 72 (1993) 595–605.
[5] A.F. Smit, G. Toth, A.D. Riggs, J. Jurka, Ancestral, mammalian-wide subfamilies of LINE-1 repetitive sequences, J. Mol. Biol. 246 (1995) 401–417.
[6] E.M. Ostertag, H.H. Kazazian Jr., Twin priming: a proposed mechanism for the creation of inversions in L1 retrotransposition, Genome Res. 11 (2001) 2059–2065.
[7] J.S. Myers, B.J. Vincent, H. Udall, W.S. Watkins, T.A. Morrish, G.E. Kilroy, G.D. Swergold, J. Henke, L. Henke, J.V. Moran, L.B. Jorde, M.A. Batzer, A comprehensive analysis of recently integrated human Ta L1 elements, Am. J. Hum. Genet. 71 (2002) 312–326.
[8] B. Brouha, J. Schustak, R.M. Badge, S. Lutz-Prigge, A.H. Farley, J.V. Moran, H.H. Kazazian Jr., Hot L1s account for the bulk of retrotransposition in the human population, Proc. Natl. Acad. Sci. USA 100 (2003) 5280–5285.
[9] A.H. Salem, D.A. Ray, J. Xing, P.A. Callinan, J.S. Myers, D.J. Hedges, R.K. Garber, D.J. Witherspoon, L.B. Jorde, M.A. Batzer, Alu elements and hominid phylogenetics, Proc. Natl. Acad. Sci. USA 100 (2003) 12787–12791.
[10] T. Sasaki, K. Takahashi, M. Nikaido, S. Miura, Y. Yasukawa, N. Okada, First application of the SINE (short interspersed repetitive element) method to infer phylogenetic relationships in reptiles: an example from the turtle superfamily Testudinoidea, Mol. Biol. Evol. 21 (2004) 705–715.
[11] Y. Terai, N. Takezaki, W.E. Mayer, H. Tichy, N. Takahata, J. Klein, N. Okada, Phylogenetic relationships among East African haplochromine fish as revealed by short interspersed elements (SINEs), J. Mol. Evol. 58 (2004) 64–78.
[12] M.A. Batzer, P.L. Deininger, A human-specific subfamily of Alu sequences, Genomics 9 (1991) 481–487.
[13] M.A. Batzer, M. Stoneking, M. Alegria-Hartman, H. Bazan, D.H. Kass, T.H. Shaikh, G.E. Novick, P.A. Ioannou, W.D. Scheer, R.J. Herrera, et al., African origin of human-specific polymorphic Alu insertions, Proc. Natl. Acad. Sci. USA 91 (1994) 12288–12292.
[14] N. Okada, M. Hamada, I. Ogiwara, K. Ohshima, SINEs and LINEs share common 3′ sequences: a review, Gene 205 (1997) 229–243.
[15] M.A. Batzer, P.L. Deininger, Alu repeats and human genomic diversity, Nat. Rev. Genet. 3 (2002) 370–379.
[16] N.T. Perna, M.A. Batzer, P.L. Deininger, M. Stoneking, Alu insertion polymorphism: a new type of marker for human population studies, Hum. Biol. 64 (1992) 641–648.
[17] M.A. Batzer, S.S. Arcot, J.W. Phinney, M. Alegria-Hartman, D.H. Kass, S.M. Milligan, C. Kimpton, P. Gill, M. Hochmeister, P.A. Ioannou, R.J. Herrera, D.A. Boudreau, W.D. Scheer, B.J. Keats, P.L. Deininger, M. Stoneking, Genetic variation of recent Alu insertions in human populations, J. Mol. Evol. 42 (1996) 22–29.
[18] M. Stoneking, J.J. Fontius, S.L. Clifford, H. Soodyall, S.S. Arcot, N. Saha, T. Jenkins, M.A. Tahir, P.L. Deininger, M.A. Batzer, Alu insertion polymorphisms and human evolution: evidence for a larger population size in Africa, Genome Res. 7 (1997) 1061–1071.
[19] W.S. Watkins, C.E. Ricker, M.J. Bamshad, M.L. Carroll, S.V. Nguyen, M.A. Batzer, H.C. Harpending, A.R. Rogers, L.B. Jorde, Patterns of ancestral human diversity: an analysis of Alu-insertion and restriction-site polymorphisms, Am. J. Hum. Genet. 68 (2001) 738–752.
[20] W.S. Watkins, A.R. Rogers, C.T. Ostler, S. Wooding, M.J. Bamshad, A.M. Brassington, M.L. Carroll, S.V. Nguyen, J.A. Walker, B.V. Prasad, P.G. Reddy, P.K. Das, M.A. Batzer, L.B. Jorde, Genetic variation among world populations: inferences from 100 Alu insertion polymorphisms, Genome Res. 13 (2003) 1607–1618.
[21] M.A. Cantrell, B.J. Filanoski, A.R. Ingermann, K. Olsson, N. DiLuglio, Z. Lister, H.A. Wichman, An ancient retrovirus-like element contains hot spots for SINE insertion, Genetics 158 (2001) 769–777.
[22] D.M. Hillis, SINEs of the perfect character, Proc. Natl. Acad. Sci. USA 96 (1999) 9979–9981.
[23] M.M. Miyamoto, Molecular systematics: perfect SINEs of evolutionary history? Curr. Biol. 9 (1999) R816–R819.
[24] A.H. Salem, J.S. Myers, A.C. Otieno, W.S. Watkins, L.B. Jorde, M.A. Batzer, LINE-1 preTa elements in the human genome, J. Mol. Biol. 326 (2003) 1127–1146.
[25] B.J. Vincent, J.S. Myers, H.J. Ho, G.E. Kilroy, J.A. Walker, W.S. Watkins, L.B. Jorde, M.A. Batzer, Following the LINEs: an analysis of primate genomic variation at human-specific LINE-1 insertion sites, Mol. Biol. Evol. 20 (2003) 1338–1348.
[26] N. Gilbert, S. Lutz-Prigge, J.V. Moran, Genomic deletions created upon LINE-1 retrotransposition, Cell 110 (2002) 315–325.
[27] D.E. Symer, C. Connelly, S.T. Szak, E.M. Caputo, G.J. Cost, G. Parmigiani, J.D. Boeke, Human L1 retrotransposition is associated with genetic instability in vivo, Cell 110 (2002) 327–338.
[28] M. Nikaido, A.P. Rooney, N. Okada, Phylogenetic relationships among cetartiodactyls based on insertions of short and long interspersed elements: hippopotamuses are the closest extant relatives of whales, Proc. Natl. Acad. Sci. USA 96 (1999) 10261–10266.
[29] K. Kawai, M. Nikaido, M. Harada, S. Matsumura, L.K. Lin, Y. Wu, M. Hasegawa, N. Okada, Intra- and interfamily relationships of Vespertilionidae inferred by various molecular markers including SINE insertion data, J. Mol. Evol. 55 (2002) 284–301.
[30] A.M. Shedlock, N. Okada, SINE insertions: powerful tools for molecular systematics, Bioessays 22 (2000) 148–160.
[31] A.-H. Salem, D.A. Ray, M.A. Batzer, Identity by descent and DNA sequence variation of human SINE and LINE elements, Cytogenetic Genome Res. 108 (2005) 63–72.
[32] A.M. Roy-Engel, M.L. Carroll, M. El-Sawy, A.H. Salem, R.K. Garber, S.V. Nguyen, P.L. Deininger, M.A. Batzer, Non-traditional Alu evolution and primate genomic diversity, J. Mol. Biol. 316 (2002) 1033–1040.
[33] A.H. Salem, G.E. Kilroy, W.S. Watkins, L.B. Jorde, M.A. Batzer, Recently integrated Alu elements and human genomic diversity, Mol. Biol. Evol. 20 (2003) 1349–1361.
[34] M.M. Miyamoto, J.L. Slightom, M. Goodman, Phylogenetic relations of humans and African apes from DNA sequences in the psi eta-globin region, Science 238 (1987) 369–373.
[35] D.J. Hedges, P.A. Callinan, R. Cordaux, J. Xing, E. Barnes, M.A. Batzer, Differential Alu mobilization and polymorphism among the human and chimpanzee lineages, Genome Res. 14 (2004) 1068–1075.
[36] F. Sanger, S. Nicklen, A.R. Coulson, DNA sequencing with chain-terminating inhibitors, Proc. Natl. Acad. Sci. USA 74 (1977) 5463–5467.