Vertebrate evolution reflected in the evolution of nuclear ribosomal internal transcribed spacer 2




Gene 508 (2012) 85–91

Contents lists available at SciVerse ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Natalia Kupriyanova ⁎, Dmitrii Shibalev, Alexander Voronov, Kirill Netchvolodov, Tatiana Kurako, Alexei Ryskov

Institute of Gene Biology, Laboratory of Genome Organization, Russian Academy of Sciences, Vavilov Street 34/5, Moscow 119334, Russian Federation

Article info

Article history: Accepted 17 July 2012; available online 1 August 2012

Keywords: rRNA processing; ITS2 composition; Reptiles; Evolutionary relationship

Abstract

In eukaryotes, mature rRNA sequences are produced from a single large (45S) precursor (pre-rRNA) through the successive removal of spacers by a series of rapid and intricate endo- and exonuclease actions. The excision of internal transcribed spacer 2 (ITS2), a eukaryote-specific insertion, remains the most elusive processing step. ITS2 is mandatory for all eukaryotic pre-rRNAs and contains at least three processing cleavage sites required for precise 5.8S and 28S formation. Conserved core sequences (cis-elements) that bind trans-factors ensure precise rRNA processing, whereas the rapidly diverging regions between the core sequences preserve internal complementarity, which guarantees the spatial integrity of ITS2. Characteristic differences in how such insertions formed during evolution should reflect the relationships between taxa. The phylogeny of reptiles and the proposed relationships between their taxa remain controversial. To delineate the structural and functional features preserved among reptilian ITS2s, we cloned and sequenced 58 ITS2s belonging to four reptile orders: Squamata, Crocodilia, Aves, and Testudines. We then studied their alignment and the folding of the variable regions. The sizes and packing of the loop–stems between the conserved consensus segments vary considerably between reptile taxa. Our phylogenetic trees, constructed from the primary-structure alignments of the reptile ITS2s, revealed a split between the Iguania clade and all other taxa. True lizards (suborder Scleroglossa) and snakes (suborder Serpentes) show sister relationships, as do the two other reptilian orders, Crocodilia + Aves and Testudines. In summary, our phylogenetic trees exhibit a mix of specific features that were either deduced or, on the contrary, rejected earlier by other authors. © 2012 Elsevier B.V. All rights reserved.

Abbreviations: rDNA, ribosomal DNA; rRNA, ribosomal RNA; ITS, internal transcribed spacer; D. arm, Darevskia armeniaca; D. val, Darevskia valentini; D. por, Darevskia portschinskii; L. med, Lacerta media; L. str, Lacerta strigata; G. maj, Gerrhosaurus major; E. mur, Eulamprus murrayi; E. sch, Eumeces schneideri; E. mac, Eublepharis macularius; V. exa, Varanus exanthematicus; H. hor, Heloderma horridum; A. fra, Anguis fragilis; C. oed, Ctenosaura oedirhina; P. mar, Polychrus marmoratus; B. plu, Basiliscus plumifrons; P. vit, Pogona vitticeps; P. coc, Physignathus cocincinus; F. par, Furcifer pardalis; R. tig, Rhabdophis tigrinus; N. nat, Natrix natrix; N. kao, Naja kaouthia; B. con, Boa constrictor; C. ruf, Cylindrophis ruffus; C. nil, Crocodylus niloticus; C. sia, Crocodylus siamensis; Gal. g, Gallus gallus; T. gra, Testudo graeca; A. hor, Agrionemys horsfieldii; C. pic, Chrysemys picta; P. pla, Platemys platycephala; N. fuz, Nothobranchius furzeri; C. fab, Centroscyllium fabricii; X. lae, Xenopus laevis; R. nor, Rattus norvegicus; D. rad, Darevskia raddei; D. ros, Darevskia rostombekovi; D. mix, Darevskia mixta; L. agi, Lacerta agilis; G. gal, Gallotia gallotia; C. cat, Cordylus cataphractus; C. oce, Chalcides ocellatus; T. sci, Tiliqua scincoides; H. cau, Hemitheconyx caudicinctus; V. pra, Varanus prasinus; I. igu, Iguana iguana; U. sta, Uta stansburiana; A. car, Anolis carolinensis; L. cau, Laudakia caucasia; U. aeg, Uromastyx aegyptia; C. cha, Chamaeleo chamaeleon; C. par, Calumma parsonii; P. lin, Psammophis lineolatus; M. mon, Malpolon monspessulanum; A. lub, Aspidelaps lubricus; V. ren, Vipera renardi; T. mue, Typhlops muelleri; C. por, Crocodylus porosus; C. cro, Caiman crocodilus; T. gut, Taeniopygia guttata; M. tor, Malacochersus tornieri; E. orb, Emys orbicularis; H. spi, Heosemys spinosa; C. mil, Callorhinchus milii; A. pel, Alopias pelagicus; T. khu, Tor khudree; X. bor, Xenopus borealis; H. sap, Homo sapiens.
⁎ Corresponding author.
Tel.: +7 8 499 1359864; fax: +7 8 499 1354105.
0378-1119/$ – see front matter © 2012 Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.gene.2012.07.024

1. Introduction

The study of reptilian genomes, like that of the genomes of ancient vertebrate ancestors, has progressed vigorously in the last few years. Four orders of extant reptiles are currently recognized: Sphenodontia (tuatara; 2 species), Crocodilia (23 species), Testudines (about 300 species), and Squamata, which numbers approximately 8000 living species and is a major component of the world's terrestrial vertebrate diversity. The classical phylogeny of living reptiles pairs crocodilians with birds and tuataras with squamates, and places turtles at the base of the tree. Analyses of mitochondrial DNA and 22 nuclear genes instead join crocodilians with turtles and place squamates at the base of the tree. Molecular time estimates support a Triassic origin for the major groups of living reptiles (Hedges and Poling, 1999; Macey et al., 2004). The morphological classification of Squamata by Estes et al. (1988) places snakes close to the other limbless forms, Dibamidae and Amphisbaenia, and proposes that Iguania and Scleroglossa split in the late Triassic. In some other classifications, snakes were considered the sister taxon of varanids. This opinion was supported by placing snakes within the Anguimorpha clade, thereby disputing their separation from other Squamata as an individual clade (Fuller et al., 1998; Gorr et al., 1998; Tatarinov, 2006).


N. Kupriyanova et al. / Gene 508 (2012) 85–91

Molecular phylogeny does not support the basal split of squamates into Iguania and Scleroglossa, which conflicts with the morphological evidence (Kumazawa, 2007; Townsend et al., 2004; Vidal and Hedges, 2005). Phylogenetic analyses based on molecular data from nine nuclear protein-coding genes place snakes in the same clade as lacertids and amphisbaenids. Furthermore, a more recent classification leads to the conclusion that iguanians, together with snakes and anguimorphs, form a clade, Toxicofera, characterized by the presence of toxin-secreting oral glands and demonstrating a single early origin of venom in squamates (Vidal and Hedges, 2005, 2009). It has been noted that the use of different gene markers often leads to different conclusions (Kupriyanova, 2009). This raises the question of using supplementary DNA regions to address the problem (Townsend et al., 2008). The ribosomal operon is one of the suitable candidates for this role. In eukaryotic organisms, the ribosomal RNA (rRNA) is transcribed in the nucleolus as a single large precursor in which the mature rRNA sequences are separated by internal transcribed spacers, ITS1 and ITS2. Studies using the yeast model have confirmed the importance of the spacers for the faithful production of mature rRNAs (van Nues et al., 1997; Cote et al., 2002). Therefore, ITS2 can be regarded as an important marker of the evolution of taxa. The ITS2 sequences of all vertebrates studied share conserved structural elements. ITS2 is organized in three or four main domains of secondary structure emerging from a preserved central core (Coleman, 2003; Joseph et al., 1999; Michot et al., 1999). Here, we have studied the ITS2 sequences of a set of reptile taxa and revealed highly variable, functionally neutral regions alternating with the functionally important conserved regions detected earlier. We based the ITS2 alignment on the ability of the conserved regions to form constant columns that separate orthologous stem–loop areas and permit a local comparison between them. The phylogenetic tree based on our alignment reveals a mix of features deduced or rejected earlier by other authors.

2. Materials and methods

2.1. Samples

Some of the DNA samples were kindly provided by colleagues from the Laboratory of Eukaryotic Genome Evolution (Engelhardt Institute of Molecular Biology, Moscow); others were isolated from blood or tissues of lizards obtained from Moscow Zoo and from Moscow and St. Petersburg Universities. A list of taxa is given in Table 1.

2.2. ITS2 PCR products cloning and sequencing

To isolate and sequence ITS2, we used primers complementary to conserved regions of human 5.8S and 28S rDNAs (P1 5′-aattaatgtgaattgcaggacaca-3′; P2 5′-gccgcgtctgatctgaggtc-3′), according to Gonzalez and Sylvester (1995). The PCR reaction mixture (30 μl) contained 100 ng of DNA, 0.5 μM of each primer, 200 μM of each dNTP, buffer with a final concentration of 67 mM Tris–HCl (pH 8.8), 16 mM (NH4)2SO4, 0.1% Tween 20, 2.5 mM MgCl2, and 1.0 unit of Thermus aquaticus polymerase (Fermentas). The PCR conditions typically included a preliminary heating step at 95 °C for 1 min, followed by 34 cycles of 95 °C for 40 s, 56 °C for 40 s, and 74 °C for 1 min in an automated Terzik thermal cycler (DNA Technology, Russia). The PCR products were fractionated in a 1.5% agarose gel and purified with the Illustra GFX PCR DNA and Gel Band Purification Kit (GE Healthcare). Ligation and transformation were performed using the pGEM-T Easy Vector System and High Efficiency Competent Cells (Promega), respectively, according to the manufacturer's protocol. Recombinant plasmids were isolated with the Illustra DNA Extraction Kit BACC2. Sequencing was performed on an Applied Biosystems 3100 instrument by the Syntol company. Three to five parallel clones containing ITS2 were sequenced for each species.
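As a quick sanity check on the primer pair and cycling program quoted in Section 2.2, the sketch below computes the length and GC content of P1 and P2 and the nominal hold time of the program. It is only an illustration: the function names are our own, the space inside the printed P1 sequence is assumed to be a typesetting artifact, and ramp times between temperature steps are ignored.

```python
# Illustrative helpers (hypothetical names, not part of the published protocol).

P1 = "aattaatgtgaattgcaggacaca"  # 5'->3'; the space in the printed sequence is assumed spurious
P2 = "gccgcgtctgatctgaggtc"      # 5'->3'


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.lower()
    return sum(base in "gc" for base in seq) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (returned in lowercase)."""
    return seq.lower().translate(str.maketrans("acgt", "tgca"))[::-1]


def cycling_seconds(initial: int, cycles: int, steps: tuple) -> int:
    """Nominal hold time of a PCR program in seconds, ignoring temperature ramps."""
    return initial + cycles * sum(steps)


if __name__ == "__main__":
    for name, primer in (("P1", P1), ("P2", P2)):
        print(name, len(primer), "nt, GC =", round(100 * gc_content(primer), 1), "%")
    # 95 C for 1 min, then 34 cycles of 40 s + 40 s + 1 min
    print("nominal cycling time:", cycling_seconds(60, 34, (40, 40, 60)), "s")
```

The reverse complement helper is included because P2 anneals to the opposite strand, so its binding site has to be looked up in a 28S sequence as `reverse_complement(P2)`.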

2.3. RNA folding

Optimal folding of the DI–DIII hairpin structures was predicted on a thermodynamic basis using the MFOLD program, version 3.2, on the mfold server (Zuker, 2003), http://mfold.rna.albany.edu/?q=mfold/RNA-Folding-Form.

2.4. ITS2 sequence alignment and phylogenetic tree construction

The ITS2 sequences currently known for vertebrates were aligned using the GeneBee Molecular Biology Server software, http://www.genebee.msu.su/services/malignreduced.html (Brodsky et al., 1995; Nikolaev et al., 1997). This program was chosen because of the complexity of the task. The currently available programs have been applied successfully to the rather short ITS2s of plants, protists, and many invertebrates. In our case, the ITS2 sequences belong to different vertebrate taxa and differ considerably in size as well as in the primary and secondary structures of their functionally neutral regions. The GeneBee Server software allows a synchronous, fully automated alignment of a group of sequences of any size, followed by construction of a prescribed tree type from that alignment. A phylogenetic tree was also constructed using the Bayesian inference method (with the MrBayes program).

3. Results and discussion

3.1. Composition of reptiles' ITS2s

We cloned and sequenced the ITS2s of 58 species from three reptile orders: Squamata, Crocodilia, and Testudines. The Squamata under study belong to the suborders Serpentes and Sauria (infraorders Iguania, Scincomorpha, Varanoidea, and Anguimorpha) (Table 1). Their ITS2s contain conserved regions (consensus sequences) analogous to those in all vertebrate ITS2s studied before (Coleman, 2003; Joseph et al., 1999; Michot et al., 1999) (Fig. 1). The alignment of the reptile ITS2s is shown in Fig. 2. Our scheme recognizes all major consensus segments, one of them being the very first 12 nucleotides of the ITS2, which represent a cis-element for the U3 small nucleolar RNA. It corresponds to the underlined ‘a’ element in Fig. 1 and is followed by the A stem with the apical variable DI stem. The second consensus (‘b1’ and ‘b2’) is divided in true lizards and some acrodonts by insertions that are able to form the stem–loop structure DII′. The B stems with the apical variable DII (between ‘b2’ and consensus ‘c’) (Fig. 2) are also present in all organisms under study. Consensus ‘c’ comprises the region providing the ITS2-specific cleavage for the 8S pre-rRNA. In the alignment of the reptile ITS2s we expanded the ‘d’ consensus region (containing the cleavage site for 12S pre-rRNA maturation) to about 40 nt, compared with the several base pairs loosely indicated on the previous scheme (compare Figs. 1 and 2). We therefore decided to treat the whole area between consensuses ‘c’ and ‘d’ as the stem DIII.

3.2. The consensus regions of reptiles' ITS2s

Consensus ‘a’ consists of 12 nucleotides subdivided into two parts: five nucleotides at the 5′-end, which often contain taxon-specific substitutions, and seven absolutely conserved bases at the 3′-end, with one exception for Caiman crocodilus (line 49, Fig. 2). The greatest number of variations at the 5′-end is found in birds, geckos, some snakes, Anguis fragilis, and Dibamus deharvengi (Fig. 2). The sizes of the DI stems vary considerably between species, from 14 nt in D. deharvengi (lizards) to 141 nt in Malpolon monspessulanum (snakes), and are usually similar in related taxa. The DII′ region separates consensuses b1 and b2 (Figs. 1, 2). It contains insertions (16–30 nt) in all true lizards' and in some


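Table 1 Reptiles' classification and accession numbers of the ITS2 sequences used in this work. The classification follows www.reptile-database.org (2007). The annotated sequences are deposited in GenBank. The ITS2 sequences of Anolis carolinensis and Taeniopygia guttata were found among unannotated full-sized genomic sequences (http://blast.ncbi.nlm.nih.gov/Blast.cgi).

Order | Suborder | Infraorder | Family | Genus, species | Abbr. | Acc. no.
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Darevskia armeniaca | D. arm | AY696809
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Darevskia raddei | D. rad | DQ184944
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Darevskia valentini | D. val | DQ184945
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Darevskia rostombekovi | D. ros | DQ184946
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Darevskia portschinskii | D. por | DQ184947
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Darevskia mixta | D. mix | DQ343131
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Lacerta media | L. med | DQ343129
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Lacerta agilis | L. agi | DQ343130
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Lacerta strigata | L. str | AY646810
Squamata | Sauria (Lacertilia) | Scincomorpha | Lacertidae | Gallotia sp. | G. gal | DQ184943
Squamata | Sauria (Lacertilia) | Scincomorpha | Gerrhosauridae | Gerrhosaurus major | G. maj | JQ272188
Squamata | Sauria (Lacertilia) | Scincomorpha | Cordylidae | Cordylus cataphractus | C. cat | JQ272189
Squamata | Sauria (Lacertilia) | Scincomorpha | Scincidae | Eulamprus murrayi | E. mur | DQ184940
Squamata | Sauria (Lacertilia) | Scincomorpha | Scincidae | Chalcides ocellatus | C. oce | HM636629
Squamata | Sauria (Lacertilia) | Scincomorpha | Scincidae | Eumeces schneideri | E. sch | HM636630
Squamata | Sauria (Lacertilia) | Scincomorpha | Scincidae | Tiliqua scincoides | T. sci | JQ272187
Squamata | Sauria (Lacertilia) | | Gekkonidae | Eublepharis macularius | E. mac | HM636631
Squamata | Sauria (Lacertilia) | | Gekkonidae | Hemitheconyx caudicinctus | H. cau | HM636632
Squamata | Sauria (Lacertilia) | | Varanidae | Varanus exanthematicus | V. exa | DQ184942
Squamata | Sauria (Lacertilia) | | Varanidae | Varanus prasinus | V. pra | HM744734
Squamata | Sauria (Lacertilia) | | Helodermatidae | Heloderma horridum | H. hor | JQ272190
Squamata | Sauria (Lacertilia) | | Dibamidae | Dibamus deharvengi | D. deh | JQ272191
Squamata | Sauria (Lacertilia) | | Anguidae | Anguis fragilis | A. fra | HM744735
Squamata | Sauria (Lacertilia) | Iguania | Iguanidae | Iguana iguana | I. igu | DQ184939
Squamata | Sauria (Lacertilia) | Iguania | Iguanidae | Ctenosaura oedirhina | C. oed | EU407543.1
Squamata | Sauria (Lacertilia) | Iguania | Phrynosomatidae | Uta stansburiana | U. sta | HM744736
Squamata | Sauria (Lacertilia) | Iguania | Polychrotidae | Polychrus marmoratus | P. mar | HM744737
Squamata | Sauria (Lacertilia) | Iguania | Polychrotidae | Anolis carolinensis | A. car | G889P67363RG3
Squamata | Sauria (Lacertilia) | Iguania | Corytophanidae | Basiliscus plumifrons | B. plu | HM803218
Squamata | Sauria (Lacertilia) | Iguania | Agamidae | Laudakia caucasia | L. cau | AY643402
Squamata | Sauria (Lacertilia) | Iguania | Agamidae | Pogona vitticeps | P. vit | HM803219
Squamata | Sauria (Lacertilia) | Iguania | Agamidae | Uromastyx aegyptia | U. aeg | HM803220
Squamata | Sauria (Lacertilia) | Iguania | Agamidae | Physignathus cocincinus | P. coc | HM803221
Squamata | Sauria (Lacertilia) | Iguania | Chamaeleonidae | Chamaeleo chamaeleon | C. cha | DQ184941
Squamata | Sauria (Lacertilia) | Iguania | Chamaeleonidae | Furcifer pardalis | F. par | HM803222
Squamata | Sauria (Lacertilia) | Iguania | Chamaeleonidae | Calumma parsonii | C. par | HM803223
Squamata | Serpentes | | Colubridae | Rhabdophis tigrinus | R. tig | HM803224
Squamata | Serpentes | | Colubridae | Psammophis lineolatus | P. lin | HM852519
Squamata | Serpentes | | Colubridae | Natrix natrix | N. nat | HM852520
Squamata | Serpentes | | Colubridae | Malpolon monspessulanum | M. mon | HM852521
Squamata | Serpentes | | Elapidae | Naja kaouthia | N. kao | HM852522
Squamata | Serpentes | | Elapidae | Aspidelaps lubricus | A. lub | HM852523
Squamata | Serpentes | | Boidae | Boa constrictor | B. con | HM852524
Squamata | Serpentes | | Viperidae | Vipera renardi | V. ren | HM852525
Squamata | Serpentes | | Cylindrophiidae | Cylindrophis ruffus | C. ruf | JQ272192
Squamata | Serpentes | | Typhlopidae | Typhlops muelleri | T. mue | JQ272193
Crocodilia | | | Crocodylidae | Crocodylus niloticus | C. nil | HM852526
Crocodilia | | | Crocodylidae | Crocodylus porosus | C. por | EU727191
Crocodilia | | | Crocodylidae | Crocodylus siamensis | C. sia | EU727190
Crocodilia | | | Alligatoridae | Caiman crocodilus | C. cro | HM852527
Aves | | | Phasianidae | Gallus gallus | Gal. g | DQ018755
Aves | | | Estrildidae | Taeniopygia guttata | T. gut | TGAM-0149O05
Testudines | | | Testudinidae | Testudo graeca | T. gra | HM852528
Testudines | | | Testudinidae | Malacochersus tornieri | M. tor | JQ272196
Testudines | | | Testudinidae | Agrionemys horsfieldii | A. hor | JQ272194
Testudines | | | Emydidae | Emys orbicularis | E. orb | HM852529
Testudines | | | Emydidae | Chrysemys picta | C. pic | AY859625.1
Testudines | | | Bataguridae | Heosemys spinosa | H. spi | HM852530
Testudines | | | Chelidae | Platemys platycephala | P. pla | JQ272195

Vertebrates that are not included in Table 1 but are present in the Supplements:
Fishes: Callorhinchus milii (GenBank: AY049811), Nothobranchius furzeri (GenBank: EU780557), Alopias pelagicus (GenBank: AY049806), Centroscyllium fabricii (GenBank: AY049818), Tor khudree (GenBank: GU568369)
Amphibians: Xenopus laevis (GenBank: X59734), Xenopus borealis (GenBank: X59733)
Mammals: Rattus norvegicus (GenBank: J00781), Homo sapiens (GenBank: U13369)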


Fig. 1. A scheme of the secondary structure of vertebrate ITS2 (Joseph et al., 1999). The consensuses are shown by the letters a–d. Consensus b is divided into parts b1 and b2 because of the DII′ insertion present in some lizards' ITS2s. Stem–loop regions are denoted by the letters A–D, and their variable parts by DI–DIV.

acrodonts' ITS2s. These insertions exhibit internal complementarity and can be folded into stable hairpins. There are no noticeable differences between the DII′ primary structures of Lacertidae. A related specificity is characteristic of the DII′ regions of Chamaeleo chamaeleon and Furcifer pardalis, as well as of the agama Laudakia caucasia. The DII′ hairpins are significantly less stable in acrodonts. It is difficult to trace the mechanisms by which the evolutionary material was transferred. These insertions (16–30 nt) could have invaded appropriate points of ancestral ITS2s and then been modified in the course of differentiation. The alternative is a random insertion of single nucleotides. The first suggestion seems more realistic, given the high similarity between the primary structures of the new genetic material in younger taxa. The sizes of the major DII stems vary considerably between species, from 8 bp in Basiliscus plumifrons to 109 bp in Testudo graeca (lines 6 and 52 in Fig. 2). Interestingly, the sizes of DI and DII in true lizards and of DI in snakes are among the largest among the reptiles under study, whereas DII is ten times shorter in snakes. Consensus ‘c’ consists of 22–25 nt with a high degree of similarity, but with only one absolutely conserved position, AGA, and a nearly conserved position, AAG, with two exceptions: C. crocodilus and the bird Taeniopygia guttata (lines 49 and 50 in Fig. 2). This segment contains the cleavage site (C)3–5(N)1–3AAG(N)3–4A^GA for 8S pre-rRNA processing, determined experimentally earlier for vertebrates (Joseph et al., 1999; Michot et al., 1999). Nucleotides in other positions of the

No. Species a DI b1–DII′–b2 DII c DIII d
1 I.igu gacgg--tcaatcg 26 cgcggctgggg-------------------------gcccttcgcag 33 ctacgcccccg-aagcg-cagaccc 78 cgcgcggctgtctgtggaga----cgcgcagggctgcccg
2 C.oed gacgg--tcaatcg 28 cgcggctgggg----------------------gccccctctcgcag 36 ctacgcccccg-aagcg-cagatcc 91 cgcgcggctgtctgtggaga------cgcagggctgcccg
3 U.sta gacgg--tcaatcg 36 cgcggctgggg--------------------------gccttcgcgg 35 -ccacgccccc--aagcg-cagaccc 84 cgcgcggctgtctgtggaga----cgcgcagggctgcccg
4 P.mar aacgaa-tcaatcg 30 cgcggcttgaggg-----------------------ttccctcgcag 14 ctacgtccacc-aagcg-aagacgc 116 cgcgcggctgtctgtggaga------cgcagggctgcccg
5 A.car gacgg--tcaattg 16 cggggctgggcggtggggg---------------ttcctcttagcag 38 cttcgcccct--aagcgc-agaacc 156 cgcgcggctgtctgtggaga------cgcagggctgcccg
6 B.plu gacgg--tcaatcg 26 cgcggctggggg--------------------------tcttcgcag 8 ctacgtccccc-aagcgc-agaccc 51 cgcgcggctgtctgtggaga------cgcagggctgcccg
7 U.aeg gacag--tcaatcg 18 cgcggctggggg----------------------tccctcgtcgcag 57 ctccgtccccccaagtcc-agacc 20 --cgcggccgtatgtggagt------cacagcgctgcccg
8 L.cau gacgg--tcaatcg 26 cgcggctgggggtttcgtttt--aggaaaaaaccaccctcgtcgcag 65 ctccgcccccg-aagtcc-agaccc 47 cgcgcggctgtctgtggaga----cacacagcgctgcccg
9 P.vit gacgg--tcaatcg 30 cgcggctggggg--------------------ccctcctcgtcgcgg 30 ccgcgtccccccaagtcc-agaccc 65 cgcgcggctgtctgtggaga----cacagcgc--tgaccg
10 P.coc gacgg--tcaatca 47 tggggctggggg-----------------------ctcttttcggag 38 ttcatctccc-aaagtct-agatgt 134 tgtgcggctgt--gtggaga----cacagtgcc-tgcccg
11 C.cha gacgg--tcaatcg 29 cgcggctggggg-taagggaagccctgcggggc-gtaccgctcgcag 35 ctgcgccccc--aagtcg-agaccc 130 cgcgcggctgtccgtggag----tcgcgcagggctgcccg
12 F.par gacgg--tcaatcg 29 cgcggctggggg-taagggaagccctgcggggc-gtaccgctcgcag 26 ccgcgcccccg-aagtcg-agaccc 130 cgcgtggctgtccgcggcactccccg-gcgctgctgcccg
13 C.par gacgg--tcaatcg 34 cgcggctgggggaaaaa-------------------gcccctcgcgg 30 ccgcgtccccccaagtcg-agaccc 90 cgcgcggctgtctgcgggacgcccag-----cgctgcccg
14 E.mac ca-gca-tcaatcg 31 cgcggctgggggggg-----------------------tcttcgcgg 33 cctcgccccccaaaggc--agatcg 203 cgcgcggctgtctgtggactcac------agcgctgcccg
15 H.cau ca-gca-tcaatcg 31 cgcggctgggg----------------------------cgtcgcag 33 cctcgcccccc-aaggc--agatgt 161 cgcgcggctgtctgtggacgcac------agcgctgcccg
16 E.mur gaagg--tcaatcg 29 cgcggctggggg--------------------------ccctcgcag 47 ctccgccccc--aaggcc-agaccc 132 cgcgcggctgtccgtggcga------cgcggggctgcccg
17 C.oce gaagg--tcaatcg 18 cgcggctgggag----------------------------ctcgcag 21 cttcgtcccc--aaggcc-agaccc 72 cgcgcggctgtccgtggaaa------cgcggggctgcccg
18 T.sci gaagg--tcaatcg 34 cgcggctggggg--------------------------ccctcgcag 38 -ctccgt-ccccccaaggccagactc 133 cgcgcggctgtccgtggga-------cgcggggctgcccg
19 E.sch gaagg--tcaatcg 28 cgcggctgggag----------------------------ctcgcag 41 cttcgtcccc--aaggcc-agaccc 72 cgcgcggctgtccgtggaaa------cgcggggctgcccg
20 V.exa gaagg--tcaatcg 31 cgcggctggggg----------------------ccgcccgtcgcag 69 cttcgccccc--aaggcgcagaccc 192 cgcgcggctgtctgtggagagagacccacagcgctgcccg
21 V.pra gaagg--tcaatcg 32 cgcggctggggg----------------------ccgcccgtcgcag 62 cttcgtccccacaag-cgcagaccc 193 cgcgcggctgtctgtggagagagacacacagcgctgcccg
22 D.deh attcg-ttcaatcg 14 cgcggctggggg---------------------cttttccttcgcag 18 cttcgtccccctaaggtg-agaccg 141 cccgtgcgcggctgtctgtggagacaca--gtgctgcccg
23 A.fra taacg--tcaatcg 61 cgcggctggggg------------------------tcccctcgcag 31 cttcgtccccctaagtcg-agaccc 171 cgcgcggctgtctgtggaga------cacagcgctgcccg
24 G.maj gacgg--tcaatcg 21 cgcggctggggg--------------------------ccctcccag 19 cctgggt-ccccccaagtccagacacc 141 cgcgcggctgtccgtggaaa------cgcggggctgcccg
25 C.cat gacgg--tcaatcg 27 cgcggctggggg--------------------------ccctcccgg 18 cc-gggt-ccccc-aagtccagacccc 142 cgcgcggctgtctgtggg---tccctcacagagctgcccg
26 G.gal gaagg--tcaatca 70 tgcggctgggggaa-gc--gt--ggttcgc-ag--------tcgcag 72 ctacgccccc--aagtcc-agaccc 141 cgcgcggctgtctgtggaga------cacagcgctgcccg
27 D.arm gaagg--tcaatcg 80 cgcggctggggaaa-gccggtccggtccgcccg-ccggcgttcgcag 72 ctacgccccc--aagtcc-agaccc 212 cgcgcggctgtctgtggaga------cacagggctgcccg
28 D.mix gaagg--tcaatcg 79 cgcggctgggggaa-gccggtccggtccgcccggccggcgttcgcag 72 ctacgccccc--aagtcc-agaccc 217 cgcgcggctgtctgtggaga------cacagggctgcccg
29 D.rad gaagg--tcaatcg 83 cgcggctggggggaagccggtccggtccgcccggccggcgttcgcag 72 ctacgccccc--aagtcc-agaccc 214 cgcgcggctgtctgtggaga------cacagggctgcccg
30 D.val gaagg--tcaatcg 77 cgcggctgggggaa-gccggtccggtccgcccggccggcgttcgcag 71 ctacgccccc--aagtcc-agaccc 215 cgcgcggctgtctgtggaga------cacagggctgcccg
31 D.ros gaagg--tcaatcg 82 cgcggctgggggaa-gccggtccggtccgcccggccggcgttcgcag 68 ctacgccccc--aagtcc-agaccc 216 cgcgcggctgtctgtggaga------cacagggctgcccg
32 D.por gaagg--tcaatcg 82 cgcggctgggggaa-gccggtccggtccgcccggccggcgttcgcag 71 ctacgccccc--aagtcc-agaccc 218 cgcgcggctgtctgtggaga------cacagggctgcccg
33 L.str gaagg--tcaatcg 97 cgcggctgggggaa-gccgggcccggccggccggc----gttcgcag 74 ctacgccccc--aagtcc-agaccc 217 cgcgcggctgtctgtggagg------cacagggctgcccg
34 L.med gaagg--tcaatcg 97 cgcagctgggggaa-gccgggcccggccggtcggc----gttcgcag 74 ctacgccccc--aagtcc-agaccc 218 cgcgcggctgtctgtggagg------cacagggctgcccg
35 L.agi gaagg--tcaatcg 97 cgcagctgggggaa-gccgggcccggccggtcggc----gttcgcag 74 ctacgccccc--aagtcc-agaccc 218 cgcgcggctgtctgtggagg------cacagggctgcccg
36 T.mue --cgggttcaatcg 37 cgcggctgggg-----------------------------ttcg-ag 7 ccttgtccccccaagcccagacccc 118 cgctcggctgtct---gtct-aga---ggagctctgcccg
37 R.tig -acgg-ttcaatcg 96 cgcggctgggg-----------------------------ttcttgg 11 ccccgtccccc-aagtcc-agaccc 215 cgcgcggctgtctgcggtgtcacc-----gttgctgcccg
38 C.ruf -acgg-ttcaatcg 97 cgcggctgggg-----------------------------ttcg-ag 9 ccttgtccccccaagcccagacccc 180 cgctcggctgtctgtctagagga------gct-ctgcccg
39 N.nat gacggattcaatcg 102 cgcggctgggg-----------------------------ttctcgg 19 ctccgcccccc-aagtcc-agaccc 273 tgtgcggctgtctgcggtgtgacc-----gttgctgcccg
40 B.con gaagg-ttcaatcg 104 cgcggctgggg-----------------------------ttcgcgg 19 ctccgcctccc-aagtcc-agaccc 254 cgcgcggctgtctgcggtgtcccc-----gtcgctgcccg
41 N.kao -acgg-ttcaatcg 122 cgcggatgggg-----------------------------ttctcgg 11 ctcgtcccccc-aagtcc-agaccc 278 cgcgcggctgtctgcggtgtcacc-----gttgctgcccg
42 A.lub -acgg-ttcaatcg 115 cgcggatgggg-----------------------------ttctcgg 11 ctcgtcccccc-aagtcc-agaccc 281 cgcgcggctgtctgcggtgtcgcc-----gttgctgcccg
43 V.ren gacggattcaatcg 119 cgcggctgggg-----------------------------ttctggg 11 ctccgcccccc-aagtcc-agaccc 275 cgcgcggctgtctgcggtgtcacc-----gtcgctgcccg
44 M.mon gacgg-ttcaatcg 141 cgcggctgggg-----------------------------ttctcgg 19 ctccgcccccc-aagtcc-agaccc 221 cgcgcggctgtctgcggtgtcacc-----gttgctgcccg
45 P.lin gacggattcaatcg 102 cgcggctgggg-----------------------------ttctcgg 19 ctccgcccccc-aagtcc-agaccc 258 cgcgcggctgtcttcggtgtcacc-----gttgctgcccg
46 C.nil gacg--atcaatcg 70 cgcggctgggg--------------------------tgcctcgcag 96 cttcgccccctaaggtc--agactc 135 cgcgcggctgtcggtggcgacaca------gggctgaccg
47 C.por gaagg--tcaatca 58 tgcagctagggggg-----------------------tgtctcgcag 83 cttcgcccc-t-aaggtc-agacat 132 cgtgcggctgtctcagg-------------------cccg
48 C.sia gaagg--tcaatca 58 tgcagctagggggg-----------------------tgtctcgcag 82 cttcgcccc-t-aaggtc-agacat 132 cgtgcggctgtctcagg-------------------cccg
49 C.cro gaa---ttcgattg 84 tgc-----ggg----------------------------tg-gccag 89 cgaggcccaatgcaag-cgagcgag 82 agcggggcagt--gcacag------ctccggt---cactg
50 T.gut gacg--atcaatcg 29 cgcggctgggg-------------------------tcgcatcgcag 45 cttcgccccctaaatg-c-agactc 221 cgggcggctgcgggtactcgt------gccgtgctg-ccg
51 Gal.g --tgccatcaatcg 16 cgcggctgggg---------------------------cagtcgcag 12 cttcgccccct-aagtgc-agactc 135 ggcgcggctgccggtggaccactcgtctccgcgctgaccg
52 T.gra gaagg--tcaatcg 109 cgcggctgggg----------------------------cgtcgcag 109 cttcgcccc-t-aagttc-agaccc 219 cgcgtggctgtctgtggcgaca--------cggctgcccg
53 C.pic gaagg--tcaatcg 103 cgcggctgggg----------------------------tgtcgcag 78 cttcgcccc-t-aagttc-agaccc 243 cgcgcggctgtctgtggcgaca--------cggctgcccg
54 E.orb gaagg--tcaatcg 90 cgcggctgggg----------------------------tgtcgcag 55 cttgtccccct-aagttc-agaccc 239 cgcgcggctgtctgtggcaaca--------cggctgcctg
55 H.spi gaagg--tcaatcg 99 cgcggctgggg----------------------------cgtcgcag 95 cttcgtccccctaagttc-agaccc 322 gccgcgtctgtct-----------ccctccctcctgcccg
56 A.hor ga-gg--tcaatcg 89 cgcggctgggg----------------------------cgttgcag 142 ct-cgtcccgttcaagcggaggagg 158 cgaacggctgcccctgtttt----ggcaccg--ct-cccg
57 P.pla gaagg--tcaatcg 94 cgcagctgggg----------------------------cgtcgcag 82 tt-cgtccccctaagttc-agaccc 158 cgcgcggctg-tctgcg-------ggcgc-tggct-cccg
58 M.tor gaagg--tcaatcg 87 cgcggctgggg----------------------------cgtcgcag 81 tt-cgtccccctaggttc-agaccc 290 cgc-gggctg-tct-------gtggcgacccggctgcccg

Fig. 2. The aligned consensus ITS2 nucleotide sequences of the lizards and some other vertebrates. The sizes of the variable parts of the stems A–D (DI–DIII) are given in nucleotides. The region containing lizard-specific sequences (DII′) is also shown on a light gray background. Substitutions in consensus regions are given in red.
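The split of consensus ‘a’ into a variable 5′ part and a conserved 3′ part can be checked column by column. The sketch below is a minimal illustration under stated assumptions: it uses only six ‘a’ rows copied from Fig. 2 (a hand-picked subset, so it says nothing about the remaining 52 rows, some of which carry substitutions), and the function names are our own.

```python
from collections import Counter

# Consensus 'a' rows copied from Fig. 2 for six taxa (an arbitrary illustrative subset).
ROWS = {
    "I.igu": "gacgg--tcaatcg",
    "P.mar": "aacgaa-tcaatcg",
    "E.mac": "ca-gca-tcaatcg",
    "D.deh": "attcg-ttcaatcg",
    "T.mue": "--cgggttcaatcg",
    "Gal.g": "--tgccatcaatcg",
}


def column_conservation(rows):
    """For every alignment column return (most common symbol, its frequency)."""
    seqs = list(rows.values())
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned rows must have equal length")
    result = []
    for column in zip(*seqs):
        symbol, count = Counter(column).most_common(1)[0]
        result.append((symbol, count / len(seqs)))
    return result


def invariant_columns(rows):
    """0-based indices of columns where all rows share the same nucleotide."""
    return [i for i, (symbol, freq) in enumerate(column_conservation(rows))
            if freq == 1.0 and symbol != "-"]


if __name__ == "__main__":
    print(invariant_columns(ROWS))
```

For this subset the invariant columns are the last seven, which correspond to the 3′-terminal part of the consensus; running the same function on all rows of Fig. 2 would show where individual taxa depart from it.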


C (Liz) 184-GCCCCgGGGAAGGCCGGGCGC-203
            + +++    +++ ++ + + +
C (Liz) 149-GACCC-CCCAAGTCCAGACCC-170

Fig. 3. Alignment of the “c” consensus with the “pseudo c” upstream subrepeat from the ITS2 of Darevskia armeniaca. The results are similar for all 10 true lizards in our collection. The similarity is about 44.5%. The distance between the fragments is about 13 nt.

consensus ‘c’ reveal substitutions specific to particular taxa. The greatest variability in this region is found in Physignathus cocincinus and Caiman crocodilus (the lines 10 and 49 in Fig. 2). The stem DIII is the most variable of all the ITS2 segments presented in Fig. 2. It is 20 bp in B. plumifrons and 322 bp in Heosemys spinosa (the lines 10 and 55 in Fig. 2). Consensus ‘d’ is the longest of all the ITS2 consensuses studied here. It consists of 34–40 nt with a high degree of similarity (Fig. 2) and includes the most conserved segment, CGGCTGTC^TGTGGA, which harbors the cleavage site for 12S pre-rRNA processing determined experimentally earlier for the Xenopus laevis pre-rRNA (Labhart and Reeder, 1986). The conserved segment that includes the corresponding cleavage site for 12S pre-rRNA in Muridae is CGTCCG^TGCGCCGA (Michot et al., 1999). Interestingly, a homologous site was found in DIII of all turtles examined, with 75% similarity in Testudo graeca and Chrysemys picta, 69% in Emys orbicularis, and 67% in Heosemys spinosa. The site is located 63 to 128 nt from consensus ‘c’. In ITS2s of rat,


mouse, and human this site is found 45 to 115 nt downstream of consensus ‘c’. These findings suggest the existence of a rather recent common ancestor of turtles, mammals, and crocodiles. Homologies of this kind are absent in Lacertidae and Serpentes.
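The 8S cleavage motif quoted in this section for consensus ‘c’, (C)3–5(N)1–3AAG(N)3–4A^GA, can be written as a regular expression and searched for in an ungapped sequence. The sketch below is only an illustration: it assumes that N stands for any of the four nucleotides and that the caret marks the cleaved bond, and the names `MOTIF_8S` and `find_cleavage` are our own.

```python
import re

# (C)3-5 (N)1-3 AAG (N)3-4 A^GA written as a regular expression over DNA letters.
# The named group 'cut' marks the GA that follows the cleaved bond.
MOTIF_8S = re.compile(r"c{3,5}[acgt]{1,3}aag[acgt]{3,4}a(?P<cut>ga)")


def find_cleavage(aligned: str):
    """Return (motif text, 0-based index of the base 3' of the cut) or None.

    Gap characters are removed first, so the index refers to the ungapped sequence.
    """
    seq = aligned.replace("-", "").lower()
    match = MOTIF_8S.search(seq)
    if match is None:
        return None
    return match.group(0), match.start("cut")


if __name__ == "__main__":
    # Consensus 'c' of D. armeniaca as printed in Fig. 2 (line 27).
    print(find_cleavage("ctacgccccc--aagtcc-agaccc"))
```

A fixed-string search would be enough for the strictly conserved ‘d’ segment, but the variable-length runs in the ‘c’ motif are easier to express with bounded quantifiers.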

3.3. Variable regions in the reptiles' ITS2s

In our preceding work (Voronov et al., 2011) we made a total alignment of all DI, DII, and DIII variable regions using the program on the GeneBee Molecular Biology Server (Brodsky et al., 1995; Nikolaev et al., 1997). All the unrooted trees correlated with the traditional morphological classification, which infers a deep split between Iguania and Scleroglossa (Estes et al., 1988; Kumazawa, 2007; Townsend et al., 2004; Vidal and Hedges, 2005), in contradiction with molecular studies on mitochondrial DNA and some coding genes (Gorr et al., 1998; Kumazawa, 2007). Here, the supplementary material is divided into three files: Supplements 1, 2 and 3. Supplements 1 and 2 contain unrooted phylogenetic trees based on vertebrate DII and DIII ITS2 sequences, built with the GeneBee Molecular Biology Server program. Supplement 3 contains the phylogenetic tree constructed using the Bayesian inference method (with the MrBayes program) from an alignment of the sequences from the first nucleotide of consensus ‘a’ to the last nucleotide of consensus ‘d’.
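To give a feel for how an alignment of this kind turns into pairwise distances, the sketch below computes simple p-distances (the proportion of differing sites) between aligned rows. This is not the GeneBee or MrBayes procedure used for the trees described here; it is a deliberately simple stand-in, the function names are our own, and the three ‘d’ rows are copied from Fig. 2 purely as sample input.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites among columns where neither row has a gap."""
    if len(a) != len(b):
        raise ValueError("rows must come from the same alignment")
    compared = [(x, y) for x, y in zip(a.lower(), b.lower()) if x != "-" and y != "-"]
    if not compared:
        raise ValueError("no ungapped columns in common")
    return sum(x != y for x, y in compared) / len(compared)


def distance_matrix(rows: dict) -> dict:
    """Pairwise p-distances keyed by (name1, name2), one entry per unordered pair."""
    return {(m, n): p_distance(rows[m], rows[n]) for m, n in combinations(rows, 2)}


if __name__ == "__main__":
    # Consensus 'd' rows copied from Fig. 2 (lines 1, 27 and 39).
    rows = {
        "I.igu": "cgcgcggctgtctgtggaga----cgcgcagggctgcccg",
        "D.arm": "cgcgcggctgtctgtggaga------cacagggctgcccg",
        "N.nat": "tgtgcggctgtctgcggtgtgacc-----gttgctgcccg",
    }
    for pair, dist in distance_matrix(rows).items():
        print(pair, round(dist, 3))
```

Columns with a gap in either row are skipped rather than scored, which is one common convention; treating gaps as a fifth state would give different values.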

Fig. 4. Schematic organization of the most conserved 5′-part of the reptiles' ITS2s. One or two species are chosen for each taxon, taking into account the similarity of their internal patterns.


3.4. Distances between consensus regions in different taxa

A wide scatter in the DI and DII lengths (i.e., the distances between the ‘a’, ‘b’, and ‘c’ consensuses) was observed. Most surprisingly, the shortest DI and DII are found in the ITS2s of Iguania, Dibamidae, and Mammalia, whereas they are substantially longer in Lacertidae, Serpentes, Testudines, and Crocodilia (Figs. 2–4). It is worth noting that the incertae sedis taxa (Amphisbaenia–Dibamidae) have many unique features that distinguish them from other reptiles. Internally, the right lung in amphisbaenians (Dibamidae's sister group, according to some classifications) is reduced in size to fit their narrow bodies, whereas in snakes it is always the left lung that is reduced. Their skeletal structure and skin also differ from those of other squamates (Gans, 1998).

Analysis of ITS2 composition shows that in some taxa the functionally active elements are brought into close proximity (Fig. 2). All new stretches inserted between them in other taxa are able to form stable stems that do not prevent the consensus segments from being brought together. The appearance of DII′ in the ITS2s of lacertids and some acrodonts is a vivid example of such an insertion. Subsequent evolutionary changes can lead to elongation of the new hairpins, the appearance of single nucleotide polymorphisms (SNPs) followed by compensatory substitutions, branching of stems, and so on. Most likely, the opposite processes also take place. This is seen most clearly in the ITS2s of all snakes, where DI is very long whereas DII is as short as DII in D. deharvengi. All the true lizards in our collection contain a subrepeat (cφ) upstream of the ‘c’ consensus that is lacking in snakes (Fig. 3). Interestingly, the ITS2 of the blind snake Typhlops muelleri, despite its short total length, is organized similarly to those of all true snakes, i.e., DI = 37 bp while DII is practically absent. This can be an argument in favor of the monophyly of Serpentes.

3.5. Phylogenetic tree on the basis of manual alignment

We also attempted to align the ITS2 sequences manually, from the first nucleotide of consensus ‘a’ to the last nucleotide of consensus ‘d’, superimposing identical consensuses in a row (four columns, as in Fig. 2). The sequences with a high potential for secondary structure formation (DI–DIII stems) were also subjected to thorough alignment. The phylogenetic tree was constructed using the Bayesian inference method (with the MrBayes program). The tree constructed on the basis of this alignment does not contradict the trees for DI–DIII (Voronov et al., 2011). It is formed by a separate clade of Iguania (Iguanidae, Agamidae, Chamaeleonidae) and by the suborder clades Crocodilians + Testudines and Snakes + True lizards. Infraorders Anguimorpha, Varanidae, Gekkota, and Scincomorpha form intermediate clades.

It is known that reptiles, since their appearance on Earth, have experienced at least two mass extinctions: (1) the Permian mass extinction, which resulted in the disappearance of many species, orders, and even classes; and (2) the mass extinction at the end of the Mesozoic (65.5 million years ago), widely known as the extinction of the dinosaurs (Barry, 2002; Evans, 2003; Ivakhnenko and Kurochkina, 2008). Besides turtles and crocodiles, this extinction was survived by lepidosaurs (snakes, lizards, and legless lizards), the descendants of which still exist. The acceleration of evolution in some turtle, snake, and lizard taxa after the extinction of the dinosaurs was expressed as a rapid increase in genome size (including ITS2) without remarkable morphological changes. It remains a mystery why turtles, crocodiles, snakes, and true lizards appear as “young species” on phylogenetic trees although their ancestors existed even before the Permian extinction.

On the basis of a comparison of phylogenetic trees, we venture to hypothesize that all vertebrates originated from a common ancient ancestor, which may be a primitive animal species that has received little attention from modern researchers. A rapid increase in genome size could sometimes be accompanied by long local deletions, as occurred in the sudden shortening of DII in the ITS2 of snakes, as well as of DI and DII in mammals. Testing this hypothesis requires more reliable phylogenetic trees. For example, we plan to use longer segments of the rDNA for the alignment.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2012.07.024.
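The DI and DII distances discussed in Section 3.4 are simply the numbers of nucleotides separating consecutive consensus segments. A minimal Python sketch of that measurement is given below for illustration only. The function name `spacer_lengths`, the motif strings, and the toy sequence are hypothetical placeholders, not the actual ‘a’, ‘b’, and ‘c’ consensuses or any sequence from this study. The sketch also assumes exact motif matches, whereas real consensuses would need tolerant matching.

```python
def spacer_lengths(sequence, motifs):
    """Return the lengths of the stretches separating consecutive motifs.

    Motifs are searched for in the given order, each one starting from the
    end of the previous match. Exact matching is an assumption made for
    this illustration.
    """
    lengths = []
    prev_end = None   # index just past the previous motif
    cursor = 0        # where the next search starts
    for motif in motifs:
        start = sequence.find(motif, cursor)
        if start == -1:
            raise ValueError(f"motif {motif!r} not found after position {cursor}")
        if prev_end is not None:
            lengths.append(start - prev_end)
        prev_end = start + len(motif)
        cursor = prev_end
    return lengths


# Hypothetical placeholder motifs standing in for consensuses 'a', 'b', 'c'.
motifs = ["AAGGCC", "CCGGAA", "GGTTCC"]
# Toy sequence: 5 nt between 'a' and 'b' (DI), 2 nt between 'b' and 'c' (DII).
toy = "AAGGCC" + "TTTTT" + "CCGGAA" + "TT" + "GGTTCC"

print(spacer_lengths(toy, motifs))  # [5, 2]
```

With three motifs the function returns two values, corresponding here to DI and DII. Passing a fourth motif for consensus ‘d’ would add a third value, under the assumption that DIII lies between ‘c’ and ‘d’.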

Acknowledgments

We are grateful to D.B. Vasil'ev, V.V. Grechko, A.N. Kosushkin, N.B. Anannjeva, and N. Poyarkov, who kindly provided us with animal blood or DNA samples. This study was supported by the Russian Foundation for Basic Research (project no. 10-04-00651-а), the Russian Academy of Sciences Programs (Biodiversity and Dynamics of Gene Pools and Molecular and Cell Biology) of the Presidium of the Russian Academy of Sciences, and President Grants for Government Support of Young Russian Scientists.

References

Barry, P.L., 2002. The Great Dying: Science NASA. Science and Technology Directorate, Marshall Space Flight Center, NASA, Wikipedia.
Brodsky, L.I., et al., 1995. GeneBee-NET: internet-based server for analyzing biopolymers structure. Biochemistry 60 (8), 923–928.
Coleman, A.W., 2003. ITS2 is a double-edged tool for eukaryote evolutionary comparisons. Trends Genet. 19, 370–375.
Cote, C., Greer, C., Peculis, B., 2002. Dynamic conformation model for the role of ITS2 in pre-rRNA processing in yeast. RNA 8, 786–797.
Estes, R., de Queiroz, K., Gauthier, J., 1988. Phylogenetic relationships within Squamata. In: Estes, R., Pregill, G. (Eds.), The Phylogenetic Relationships of the Lizard Families. Stanford University Press, Palo Alto, pp. 119–281.
Evans, S.E., 2003. At the feet of the dinosaurs: the early history and radiation of lizards. Biol. Rev. 78, 513–551.
Fuller, S., Baverstock, P., King, D., 1998. Biogeographic origins of Goannas (Varanidae): a molecular perspective. Mol. Phylogenet. Evol. 9, 294–307.
Gans, C., 1998. In: Cogger, H.G., Zweifel, R.G. (Eds.), Encyclopedia of Reptiles and Amphibians. Academic Press, San Diego, pp. 212–215.
Gonzalez, I.L., Sylvester, J.E., 1995. Complete sequence of the 43-kb human ribosomal DNA repeat: analysis of the intergenic spacer. Genomics 27, 320–328.
Gorr, T.A., Mable, B.K., Kleinschmidt, T., 1998. Phylogenetic analysis of reptilian hemoglobins: trees, rates, and divergences. J. Mol. Evol. 47, 471–485.
Hedges, S.B., Poling, L.L., 1999. A molecular phylogeny of reptiles. Science 283 (5404), 998–1001.
Ivakhnenko, F., Kurochkina, E., 2008. In: E.N. (Ed.), Iskopaemye reptilii i ptitsy (Fossil Reptiles and Birds). GEOS, Moscow, pp. 88–104. part 1.
Joseph, N., Krauskopf, E., Vera, M., Michot, B., 1999. Ribosomal internal transcribed spacer 2 (ITS2) exhibits a common core of secondary structure in vertebrates and yeast. Nucleic Acids Res. 27, 4533–4553.
Kumazawa, Y., 2007. Mitochondrial genomes from major lizard families suggest their phylogenetic relationships and ancient radiations. Gene 15 (388), 19–26.
Kupriyanova, N.S., 2009. Current views of the origin and diversification of tetrapods. Mol. Biol. 43 (5), 819–833 (Russian).
Labhart, P., Reeder, R.H., 1986. Characterization of three sites of RNA 3′ end formation in the Xenopus ribosomal gene spacer. Cell 45 (3), 431–433.
Macey, J.R., Papenfuss, T.J., Kuehl, J.V., Fourcade, H.M., Boore, J.L., 2004. Phylogenetic relationships among amphisbaenian reptiles based on complete mitochondrial genomic sequences. Mol. Phylogenet. Evol. 33 (1), 22–31.
Michot, B., Joseph, N., Mazan, S., Bachellerie, J.P., 1999. Evolutionary conserved structural features in the ITS-2 of mammalian pre-rRNAs and potential interactions with the snoRNA U8 detected by comparative analysis of new mouse sequences. Nucleic Acids Res. 27, 2271–2284.
Nikolaev, V.K., Leontovich, A.M., Drachev, V.A., Brodsky, L.I., 1997. Building multiple alignments using iterative analyzing biopolymers structure dynamic improvement of the initial motif alignment. Biochemistry 62 (6), 578–582.
Tatarinov, L.P., 2006. In: Paleontol, Tr. (Ed.), Essays on the Evolution of Reptiles. Inst. Ross. Akad. Nauk. GEOS, Moscow.
Townsend, T., Larson, A., Louis, E., Macey, J.R., 2004. Molecular phylogenetics of squamata: the position of snakes, amphisbaenians, and dibamids, and the root of the squamate tree. Syst. Biol. 53 (5), 735–757.
Townsend, T.M., Alegre, R.E., Kelley, S.T., Wiens, J.J., Reeder, T.W., 2008. Rapid development of multiple nuclear loci for phylogenetic analysis using genomic resources: an example from squamate reptiles. Mol. Phylogenet. Evol. 47 (1), 129–142.
van Nues, R.W., Rientjes, J.M., Morre, S.A., 1997. Evolutionarily conserved structural elements are critical for processing of Internal Transcribed Spacer 2 from Saccharomyces cerevisiae precursor ribosomal RNA. J. Mol. Biol. 250, 24–36.
Vidal, N., Hedges, S.B., 2005. The phylogeny of squamate reptiles (lizards, snakes, and amphisbaenians) inferred from nine nuclear protein-coding genes. C. R. Biol. 328, 1000–1008.
Vidal, N., Hedges, S.B., 2009. The molecular evolutionary tree of lizards, snakes, and amphisbaenians. C. R. Biol. 332 (2–3), 129–139.
Voronov, A.S., Shibalev, D.V., Kupriyanova, N.S., 2011. Evolutionary relationships between reptiles inferred from the comparison of their ITS2 sequences. Genetika 47 (7), 975–985.
Zuker, M., 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31 (13), 3406–3415.