Molecular Phylogenetics and Evolution 68 (2013) 373–379
Non-concerted ITS evolution in fungi, as revealed from the important medicinal fungus Ophiocordyceps sinensis
Yi Li a,b, Lei Jiao a, Yi-Jian Yao a,⇑
a State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
b Graduate University of Chinese Academy of Sciences, Beijing 100049, China
Article info
Article history: Received 10 September 2012; Revised 4 April 2013; Accepted 12 April 2013; Available online 22 April 2013
Keywords: Non-concerted evolution; ITS; Pseudogene; Ophiocordyceps sinensis
Abstract
The internal transcribed spacer (ITS) of nuclear ribosomal DNA (nrDNA) has been widely used as a molecular marker in phylogenetic studies and has been selected as a DNA barcode for fungi. It is generally believed that nrDNA conforms to concerted evolution in most eukaryotes; however, intraindividual–intraspecific polymorphisms of this region have been reported in various organisms, suggesting a non-concerted evolutionary process. In Ophiocordyceps sinensis, one of the most valuable medicinal fungi, a remarkable variation of the ITS region has been revealed. Some highly divergent sequences were thought to represent cryptic species, different species or genotypes in previous studies. To clarify the unusual ITS polymorphisms observed in O. sinensis, specific primers were designed in this study to amplify ITS paralogs from pure cultures of both single-ascospore and tissue isolates. All of the available ITS sequences, including those generated by this group and those in GenBank, were analyzed. Several AT-biased ITS paralogs were classified as pseudogenes based on their nucleotide compositions, the secondary structures and minimum free energies of their 5.8S rRNAs, substitution rates, phylogenetic positions and gene expression analyses. Furthermore, ITS pseudogenes were amplified with specific primers from 10 of the 28 strains tested, including eight single-ascospore and two tissue isolates. Divergent ITS paralogs were shown to coexist in individual genomes, suggesting a non-concerted mechanism of evolution in the ITS region of O. sinensis. The hypotheses that divergent ITS paralogs represent cryptic or other species or different genotypes were thus rejected. © 2013 Elsevier Inc. All rights reserved.
⇑ Corresponding author. Fax: +86 10 64807468. E-mail address: [email protected] (Y.-J. Yao).
1055-7903/$ - see front matter © 2013 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.ympev.2013.04.010
1. Introduction
The internal transcribed spacer (ITS) region of nuclear ribosomal DNA (nrDNA) is one of the most widely used molecular markers for species identification and phylogenetic inference of fungi (James et al., 2006; Schoch et al., 2012). The nrDNA region was thought to conform to the paradigm of concerted evolution (Coen et al., 1982; Liao, 1999; Nei and Rooney, 2005; Ganley and Kobayashi, 2007), but exceptions such as pseudogenes have been identified in the ITS regions from different organisms. The ITS pseudogenes were first reported in Zea (Buckler and Holtsford, 1996a,b) and then extended to Gossypium, Nicotiana, Tripsacum and Winteraceae (Buckler et al., 1997). More ITS pseudogenes were discovered in various plants, including angiosperms (e.g., Hřibová et al., 2011; Muir et al., 2001; Razafimandimbison et al., 2004) and gymnosperms (Won and Renner, 2005; Xiao et al., 2010), in animals, e.g., hard coral Acropora (Márquez et al., 2003) and lagoon cockle Cerastoderma glaucum (Freire et al., 2010), and in the flagellate protist Symbiodinium (Thornhill et al., 2007). However, ITS
pseudogenes have rarely been reported in fungi, although intragenomic ITS variations have been found in Fusarium (O’Donnell and Cigelnik, 1997), Scutellospora (Hijri et al., 1999), Ganoderma (Wang and Yao, 2005), Xanthophyllomyces (Fell et al., 2007) and several other plant pathogens (Simon and Weiß, 2008). Recently, Lindner and Banik (2011) presented a remarkable case in Laetiporus in which significant intragenomic ITS variations were observed, and several cloned ITS paralogs with a divergent level greater than 5% were considered to be putative pseudogenes. Additionally, pseudogenes of another nrDNA gene family, 5S, have been found in fungi (Rooney and Ward, 2005). As a type of ITS polymorphism, ITS pseudogenes can be distinguished from intraindividual–intraspecific and even interspecific variations. Various methods have been developed to identify ITS pseudogenes. Nucleotide substitution combined with the GC-content, secondary structure and minimum free energy of RNA transcripts, phylogenetic analyses and relative-rate tests are usually used as the primary criteria for ITS pseudogene identification (summarized in Bailey et al., 2003). Gene expression is sometimes used as an additional criterion (e.g., Hartmann et al., 2001; Muir et al., 2001; Xiao et al., 2010). Despite their wide occurrence, pseudogenes are usually detected when multiple bands are amplified or
multiple sequence signals are obtained in direct sequencing (Razafimandimbison et al., 2004). Several studies (e.g., Buckler and Holtsford, 1996a; Buckler et al., 1997; Zheng et al., 2008) indicate that less stable paralogs (pseudogenes) are amplified well under standard PCR conditions, while additives, e.g., dimethylsulfoxide (DMSO), could increase the specificity for paralogs with high stability (functional genes). Pseudogenes with large indels can even be separated from functional copies by agarose gel electrophoresis (Hartmann et al., 2001).
Ophiocordyceps sinensis (Berk.) G.H. Sung, J.M. Sung, Hywel-Jones & Spatafora (synonym Cordyceps sinensis (Berk.) Sacc.) has gained significant scientific attention as a valuable Traditional Chinese Medicine. ITS sequences have been widely used for species identification, establishing anamorph–teleomorph connections (e.g., Liu et al., 2001), genetic diversity (e.g., Zhang et al., 2009) and phylogenetic analyses of this fungus and related species (Jiang and Yao, 2004; Stensrud et al., 2005, 2007). Although the ITS sequence has been used as an important genetic marker in molecular studies of O. sinensis, abnormal variations have also been found, e.g., two subgroups of sequences that are strongly supported by neighbor-joining analyses but not strictly represented by geographical divergence (Kinjo and Zang, 2001) and a long branch among O. sinensis sequences due to a number of unique C → T transitions (Stensrud et al., 2005). Stensrud et al. (2007) further analyzed the intraspecific ITS variation among 71 sequences of O. sinensis downloaded from EMBL/GenBank and observed three significantly divergent O. sinensis lineages (referred to as groups A–C). The lineages were considered to be cryptic (phylogenetic) species ascribed to O. sinensis; these species were hypothesized to be caused by a shift in life-history attributes or ecology (Stensrud et al., 2007). The group A (GC-rich) lineage was categorized as true O. sinensis, and the two AT-rich lineages (groups B and C) were categorized as other species (Xiao et al., 2009); however, the GC- and AT-biased ITS sequences were considered to be two genotypes rather than two fungal species (Zhu et al., 2010). It was also reported (Zhu et al., 2010) that the AT-biased genotypes are not found in the sclerotium of O. sinensis, which is covered by the exoskeleton of the host larva, but predominate in the premature stroma, whereas the GC-biased genotypes exist in the opposite manner, indicating that the proportions of the two genotypes alternate during maturation.
In the present study, various ITS paralogs from O. sinensis were investigated. The secondary structures, minimum free energies and substitution rates of 5.8S rRNAs and the GC contents of ITS1, 5.8S and ITS2 were compared. A phylogeny was constructed using all of the available ITS haplotypes of O. sinensis. Reverse transcription PCR was also performed. Non-functional ITS pseudogenes were identified and applied to explain the unusual diversity of ITS sequences in this species. Both functional and non-functional ITS sequences were amplified using specific primers from both single-ascospore and tissue isolates. Different types of ITS sequences were thus shown to coexist in a single genome.
2. Materials and methods
2.1. Fungal strains
Strains of O. sinensis used in this study were isolated from single ascospores and from sclerotium tissue. The sclerotium-derived strains were isolated from fresh specimens collected from various O. sinensis production regions within China, and the single-ascospore-derived strains were isolated from mature fruiting bodies collected from Guoluo, Qinghai, following the method described in Jiao (2010). The stock strains were maintained at 4 °C on Potato Dextrose Agar (PDA) supplemented with 5% wheat bran and 0.5% peptone before incubation on a rotary shaker (100 rpm) at 18 °C with the same culture medium without agar. The fresh mycelia were harvested after 25 d of incubation. The details of strains used in this study are listed in Table S1.
2.2. DNA extraction, primer design, amplification and sequencing of ITS regions
Total genomic DNA was extracted from cultured mycelia using the modified CTAB method described by Yao et al. (1999), omitting the phenol/chloroform step to reduce the use of hazardous chemicals. The DNA used for PCR reactions was diluted 5- to 10-fold. The universal fungal primer pair ITS5/ITS4 (White et al., 1990) was used for ITS amplification. Three pairs of primers were designed for this study, i.e., GAF (5′-TCCCAAACCCCCTGCGAACACC-3′)/GAR (5′-AGGTCAACTGGAGGGTGTGGTGGTTTC-3′), GCF (5′-TAGCAGTTGCCTTAGCGGGACCGCCCTA-3′)/GCR (5′-AATCCGAGGTTAACTAAAAGGCGTA-3′) and 5.8S-F (5′-ACTTTTAACAACGGATCTCTT-3′)/5.8S-R (5′-AAGATAACGCTCGGATAAGCAT-3′), based on the alignment of ITS sequences from O. sinensis including putative pseudogenes. The primer pair GAF/GAR was specific for functional copies, and the GCF/GCR primer pair was designed for AT-biased sequences of group C as defined in Stensrud et al. (2007). Primers 5.8S-F and 5.8S-R were designed using the conserved 5.8S region, which should be effective for all of the paralogs, and were used to amplify 5.8S cDNA. Amplification was performed using a thermal cycler (Eppendorf) in a 25 μl PCR reaction that contained 12.5 μl 2× Taq PCR Master Mix (Tiangen Biotech Co., Ltd., China), 0.25 μl of each primer (10 μM) and 1 μl diluted DNA template using the following conditions: 2 min at 94 °C; 30 cycles of 94 °C for 30 s, 53 °C for 30 s (or 55 °C for primers GAF/GAR and 60 °C for primers GCF/GCR) and 72 °C for 45 s; and a final extension at 72 °C for 10 min. Direct PCR and nested PCR were both performed with primers GAF/GAR and GCF/GCR. A 100-fold diluted PCR product amplified with the ITS5/ITS4 primers was used as the template for nested PCR. PCR products were directly sequenced using PCR primers on a capillary sequencer (Applied Biosystems 3730 Analyzer, Foster City, California) by the Beijing Genomics Institute (Beijing, China).
2.3. Sequence dataset, alignment and GC-content
GC-biased ITS sequences of O. sinensis used in published articles and all of the available AT-biased sequences of this species were retrieved from GenBank and included in the analyses (Table S2). The boundaries of the ITS1, 5.8S and ITS2 regions were determined according to previously published ITS sequences. Sequences downloaded from GenBank were aligned with all of the sequences obtained in this study using ClustalW (Thompson et al., 1994). After excluding the 18S and 28S regions, sequences were analyzed using DAMBE v.4.2.13 (Xia and Xie, 2001). All of the haplotypes were kept for the following analyses. BioEdit version 7.0.9.0 (Hall, 1999) was used to refine the alignment manually and to calculate the GC-content of ITS1, 5.8S, ITS2 and the entire region.
2.4. Phylogenetic analyses
For phylogenetic reconstructions, both maximum parsimony (MP) and Bayesian analyses were conducted based on the ITS region of nrDNA. MP analyses were performed using PAUP 4.0b8 (Swofford, 2002) with the following settings: 1000 replicates of random sequence addition and tree bisection-reconnection branch swapping. All characters were equally weighted, and gaps were treated as missing data. Bootstrap proportions (BP) were calculated using analyses of 1000 replicates with five replicates of random sequence addition. Bayesian analyses were implemented with MrBa-
yes v.3.1.2 (Ronquist and Huelsenbeck, 2003) using a GTR model of DNA substitution with gamma-distributed rate variation across invariant sites. Two independent analyses of two parallel runs and four chains were carried out for 5,000,000 generations. Trees were sampled every 1000 generations. The first 10% of the sampling trees was discarded as burn in. A consensus tree and posterior probabilities (PP) were calculated in MrBayes. Hirsutella rhossiliensis, H. uncinata, O. coccidiicola, O. cochlidiicola, O. emeiensis, and O. robertsii were selected as outgroup taxa according to previous studies (e.g., Stensrud et al., 2005, 2007; Jiao, 2010). 2.5. Secondary structure and minimum free energy Secondary structures of the 5.8S region at the optimal growth temperature (18 °C; Dong and Yao, 2011) of the fungus O. sinensis were predicted using Mfold version 2.3 on a web server (http://mfold.rna.albany.edu/?q=mfold/RNA-Folding-Form2.3, Mathews et al., 1999; Zuker, 2003) using a universal model as a guide (Vaughn et al., 1984). RNA structures were displayed using RnaViz 2 (De Rijk et al., 2003). The free energies of the presumptive secondary structures were recorded for stability comparison. 2.6. Relative-rate test A relative-rate test was applied to test the functional status of nrDNA genes according to Muir et al. (2001). The constancy of the evolutionary rate among the divergent clades defined in the phylogenetic analyses was tested by a two-cluster relative-rate test (Takezaki et al., 1995) using the software PHYLTEST version 2.0 (Kumar, 1996). The analyses of the ITS1, 5.8S and ITS2 regions were performed separately. Rate constancy was examined for pairs of linearized trees using Kimura’s distance model (Kimura, 1980) with rate heterogeneity. The Gamma distribution shape parameters for ITS1 (1.7680), 5.8S (1.1950) and ITS2 (0.9010) were estimated using jModelTest 0.1.1 (Posada, 2008) with the respective regions. 2.7. 
Reverse transcription PCR (RT-PCR) Total RNA was extracted from strains 1220 and 1773 (Table S1) using the E.Z.N.A.™ Fungal RNA Kit (Omega Bio-tek), following the manufacturer’s instructions, based on detection of AT-biased sequences in the former but not in the latter. The RNA was treated with DNase I (GenStar Biosolutions Co., Ltd., Beijing, China) until no DNA contamination could be detected using PCR amplification with primers ITS5 and ITS4. RT-PCR was performed using the StarScript II First-strand cDNA Synthesis Kit (GenStar Biosolutions) with provided random primers. The cDNA was used to amplify the partial 5.8S region with primers 5.8S-F and 5.8S-R. Amplification was performed for 5 min at 95 °C, followed by 30 cycles of 1 min at 94 °C, 30 s at 50 °C, 45 s at 72 °C and a final extension for 10 min at 72 °C. PCR products were purified using the TIANgel Midi Purification Kit (Tiangen Biotech, Beijing, China) and then cloned into the EN-T™ Vector (GenStar Biosolution, Beijing, China). Sequencing was conducted on an Applied Biosystems (ABI) 3730 DNA sequencer by the Beijing Genomics Institute (Beijing, China). 3. Results 3.1. ITS amplification and sequencing A total of 38 ITS sequences were obtained in this study, including 28 GC-biased sequences (15 single-ascospore isolates and 13 tissue isolates) and 10 AT-biased sequences from eight single-ascospore isolates and two tissue isolates (Table S1). GC-biased ITS
sequences were amplified by the universal ITS5/ITS4 primers and the specific GAF/GAR primers designed in this study using either direct PCR or nested PCR, whereas AT-biased sequences were amplified only by the specific primers GCF/GCR using nested PCR. The sequences obtained in this study were aligned with 158 ITS sequences (47 were previously submitted by this laboratory) retrieved from GenBank (142 GC-biased and 16 AT-biased, Table S2). Some 56 ITS haplotypes, including 12 AT-biased haplotypes, were identified from the total 196 sequences. This 56-haplotype dataset was used for subsequent analyses including phylogenetic construction, GC content, secondary structure, free energy and a relative-rate test. 3.2. Phylogenetic analyses The complete dataset for phylogenetic analyses consisted of 56 ITS haplotypes from O. sinensis and six sequences of the same region from the species selected as outgroups. Among the 530 characters, 326 were constant and 204 were variable, 137 of which were parsimony-informative. A total of 313 most parsimonious trees were obtained from the MP analyses. Nearly all of the trees produced similar topologies to the major clades. The consensus phylogeny inferred from the Bayesian analyses revealed similar topology to MP analyses. One of the 313 equally parsimonious trees is shown in Fig. 1. All of the ITS sequences of O. sinensis were recognized as distinctly monophyletic with strong support (BP = 100%, PP = 100%, Fig. 1). Three divergent clades (marked as groups A–C) of O. sinensis sequences were resolved, which is similar to the previous molecular study by Stensrud et al. (2007). Two long-branched clades (groups B and C) were also highly supported (BP = 100%, PP = 100%, Fig. 1).
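The character counts reported in Section 3.2 (constant, variable and parsimony-informative sites) were produced by PAUP. As an illustration only, the following minimal Python sketch shows how such counts can be derived from an alignment, assuming equal-length aligned sequences and treating gaps as missing data as in the MP settings. The function name `classify_sites` and the toy alignment are hypothetical and are not taken from the study.

```python
from collections import Counter

# Gaps and ambiguity symbols are treated as missing data (an assumption
# mirroring the "gaps treated as missing data" setting of the MP analyses).
MISSING = set("-?N")


def classify_sites(alignment):
    """Return (constant, variable, parsimony_informative) column counts.

    A column is parsimony-informative when at least two character states
    each occur in at least two sequences.
    """
    constant = variable = informative = 0
    for column in zip(*alignment):
        states = Counter(
            c for c in (x.upper() for x in column) if c not in MISSING
        )
        if len(states) <= 1:
            constant += 1
            continue
        variable += 1
        if sum(1 for n in states.values() if n >= 2) >= 2:
            informative += 1
    return constant, variable, informative


# Hypothetical toy alignment, for illustration only.
toy_alignment = ["ACGTA", "ACGTT", "ATGCT", "ATG-A"]
constant, variable, informative = classify_sites(toy_alignment)
print(constant, variable, informative)
```

In this sketch the parsimony-informative columns are a subset of the variable columns, which is how the 137 informative characters relate to the 204 variable ones in the dataset described above.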
3.3. GC content, secondary structure and free energy
As shown in Table S3, the GC content of group A ranged from 64.24% to 67.92% in ITS1, 49.68% to 50.32% in 5.8S, 75.14% to 76.84% in ITS2, and 63.77% to 65.38% for the entire ITS1-5.8S-ITS2 region. In contrast, the GC content of groups B and C (marked as ‘W’ in Table S3) ranged from 47.17% to 59.12%, 36.13% to 38.71%, 63.28% to 68.93% and 51.12% to 54.58% in ITS1, 5.8S, ITS2 and the whole region, respectively. The ITS sequences in groups B and C had significantly lower GC contents in all three regions compared with that of group A. The group B sequences had a higher GC content than group C (Table S3). The GC contents of ITS1, ITS2 and the entire region for group C sequences acquired in this study were not calculated because the specific primers designed did not amplify the entire ITS1 and ITS2 regions. Group B and C sequences accumulated a great number of G:C to A:T transition mutations (mainly C → T, G → A) throughout the ITS region.
One or more secondary structures of 5.8S rRNA were obtained using Mfold under the designated settings for each sequence, while only structures with minimum free energies were compared. Structures of all group A haplotypes (Fig. 2) were similar and consistent with the universal model developed by Vaughn et al. (1984). The secondary structure of 5.8S rRNA showed five paired regions (helices 1 to 5; Fig. 2), an AT-rich multibranched loop, an AT-rich single strand, a 5′-end (A, 5.8S-LSU interaction) and a 3′-end (B). The AT-rich multibranched loop was disrupted in all group B and C haplotypes (Figs. S1 and S2). The minimum free energy of the 5.8S rRNAs in group A varied from −41.50 to −34.48 kcal/mol, whereas the values for those in groups B and C varied from −34.82 to −25.91 kcal/mol (Table S3). Additional paired regions formed in group C 5.8S rRNAs due to large numbers of G:C to A:T transitions (Fig. S2), leading to a relatively lower minimum free energy.
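The GC contents in Table S3 were calculated with BioEdit. Purely as a sketch of the underlying arithmetic, the per-region GC content of a sequence can be computed as below. The helper names `gc_content` and `region_gc`, the toy sequence and the region boundaries are all hypothetical; real boundaries would come from the alignment against published ITS sequences.

```python
def gc_content(seq):
    """Percent G+C among unambiguous bases (gaps and ambiguity codes ignored)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def region_gc(seq, boundaries):
    """GC content per named region; boundaries are 0-based half-open (start, end)."""
    return {
        name: round(gc_content(seq[start:end]), 2)
        for name, (start, end) in boundaries.items()
    }


# Hypothetical toy sequence and boundaries, for illustration only.
toy_seq = "GGCC" + "ATGC" + "GGGA"
toy_boundaries = {
    "ITS1": (0, 4),
    "5.8S": (4, 8),
    "ITS2": (8, 12),
    "entire": (0, 12),
}
gc_by_region = region_gc(toy_seq, toy_boundaries)
print(gc_by_region)
```

Ignoring gaps and ambiguity codes in the denominator is a design choice of this sketch; other tools may count them differently, so values need not match BioEdit output exactly.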
Fig. 1. One of the 313 most parsimonious trees based on ITS sequences from Ophiocordyceps sinensis. Parsimony bootstrap values and posterior probabilities are given above and below the nodes, respectively. Sequences in bold were obtained in this study. The number of ITS sequences represented by the same haplotype is in brackets after the accession number. Three major clades, marked as groups A to C, represent the divergent ITS sequences of this species.
3.4. Relative-rate test The evolutionary rates of the three groups (A–C) revealed in the phylogenetic analyses (Fig. 1) as calculated by the relative-rate test
(Table 1) showed a significantly lower substitution rate in group A (denoted as functional) than that in groups B and C (denoted as pseudogenes). Rate constancy was rejected at the 5% level between group A and groups B and C for all three regions (ITS1, 5.8S and
Fig. 2. Secondary structure of 5.8S rRNA of Ophiocordyceps sinensis as illustrated with the 5.8S region of JQ900148 from 1206/CS 70 (Tables S1 and S3).
ITS2). Rate constancy was not rejected between groups B and C for the coding 5.8S and the noncoding ITS2 but was rejected for ITS1 (Table 1). 3.5. cDNA A total of 42 positive 5.8S rRNA cDNA clones from the two strains (32 from strain 1220 and 10 from strain 1773) were sequenced. Of these clones, 28 showed identical sequences (104 bp excluding the primers) to the most frequent sequence in group A (JQ900148). The remaining 14 sequences (10 from strain 1220 and four from strain 1773) had single substitutions at various positions, likely caused by mismatches during reverse transcription or PCR amplification. No group B or C sequences were detected in the positive clones from either strain, despite that genomic DNA from the single-ascospore isolate, 1220, was successfully amplified with primers designed for AT-biased sequences (GCF/GCR) (Table S1). 4. Discussion
4.1. ITS pseudogenes and non-concerted evolution
Three sequence clusters (groups A, B and C) were highly supported by the ITS phylogeny of O. sinensis, but group B and C sequences caused long branches in the analyses (Fig. 1). A number of substitutions (mainly C → T, G → A), which greatly decreased the GC content (Table S3), accumulated throughout the entire region, including the conserved 5.8S rRNA region, in the group B and C ITS sequences. The relative-rate test showed an accelerated evolution of ITS1, 5.8S and ITS2 regions (Table 1) for these two groups. The secondary structures of the 5.8S rRNAs for group A sequences were highly conserved and produced similar structures consistent with the universal model described by Vaughn et al. (1984), while the presumptive 5.8S transcripts for groups B and C produced divergent structures with relatively higher minimum free energies (Table S3; Figs. S1 and S2). Furthermore, the 5.8S nrDNA from groups B and C was not expressed according to the cDNA analyses performed in this study. The combination of all of these data suggests that the sequences of group A were functional and that the sequences of groups B and C were pseudogenes.
Because ITS pseudogenes and functional copies could be amplified from the same strains, especially single-ascospore isolates (Table S1), multiple paralogs of ITS were shown to coexist within individual genomes of the species. Various copies of ITS regions were not thoroughly homogenized, suggesting that the ITS region in O. sinensis evolved in a non-concerted way rather than a strictly concerted manner. Escape from concerted evolution of the ITS region has been observed in various fungal species (O’Donnell and Cigelnik, 1997; Hijri et al., 1999; Wang and Yao, 2005; Fell et al., 2007; Simon and Weiß, 2008; Lindner and Banik, 2011). nrDNA may therefore not always evolve in a highly concerted fashion in fungi.
Although different hypotheses have been proposed for the AT-biased ITS sequences in O. sinensis, they are questionable because those sequences have been identified as pseudogenes in the present study. Apparently, the two subgroups in the ITS analyses by Kinjo and Zang (2001) were in fact represented by both functional and pseudogenic ITS copies. The long branch in the ITS analyses of O. sinensis discovered in Stensrud et al. (2005) was also a result of pseudogenes. The hypotheses that AT-biased ITS sequences represent cryptic (phylogenetic) species (Stensrud et al., 2007), different species (Xiao et al., 2009), or different genotypes (Zhu et al., 2010) are thus inaccurate because AT-biased sequences (ITS pseudogenes) coexist with functional copies in a single O. sinensis genome. As for the differential proliferation of two ‘genotypes’ during O. sinensis maturation demonstrated by Zhu et al. (2010), more in-depth studies are required to understand the reason for these differences.
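The predominance of C → T and G → A substitutions in the putative pseudogenes can be tallied from a pairwise alignment. The sketch below is a minimal illustration, not the procedure used in the study: it assumes that a functional (group A) sequence serves as the reference, so that each difference is read as a change from the reference base to the pseudogene base. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def substitution_spectrum(reference, other):
    """Count substitution types (e.g. 'C>T') between two aligned sequences.

    Positions holding a gap or an ambiguity code in either sequence are skipped.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must be aligned to the same length")
    spectrum = Counter()
    for ref_base, alt_base in zip(reference.upper(), other.upper()):
        if ref_base in "ACGT" and alt_base in "ACGT" and ref_base != alt_base:
            spectrum[f"{ref_base}>{alt_base}"] += 1
    return spectrum


def gc_to_at_fraction(spectrum):
    """Fraction of all substitutions that are C>T or G>A transitions."""
    total = sum(spectrum.values())
    return (spectrum["C>T"] + spectrum["G>A"]) / total if total else 0.0


# Hypothetical toy sequences, for illustration only.
functional = "GCCGATCGA"
pseudogene = "ATCAATTGG"
spectrum = substitution_spectrum(functional, pseudogene)
fraction = gc_to_at_fraction(spectrum)
print(dict(spectrum), fraction)
```

Polarizing substitutions against a single reference is a simplification; a tree-based ancestral reconstruction would be needed to infer the direction of each change rigorously.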
4.2. Mutation pattern of pseudogenes
C → T and G → A were the two most frequent mutation types in all of the ITS pseudogenes found in O. sinensis, while very few
Table 1. Relative-rate tests for Ophiocordyceps sinensis nrDNA lineages.

Region   Cluster I   Cluster II    LI (a)      LII (a)    d (b)        z-Score (c)
ITS1     Group A     Group B, C    0.02638     0.164      0.137653     2.54245
ITS1     Group A     Group B       0.02017     0.09071    0.0705395    2.16934
ITS1     Group A     Group C       0.03258     0.2373     0.204766     2.70999
ITS1     Group B     Group C       0.01601     0.1182     0.134227     3.15685
5.8S     Group A     Group B, C    0.005426    0.1889     0.183453     3.93224
5.8S     Group A     Group B       0.003297    0.201      0.197656     3.56425
5.8S     Group A     Group C       0.007556    0.1768     0.169251     3.65624
5.8S     Group B     Group C       0.06773     0.03932    0.0284048    0.682917
ITS2     Group A     Group B, C    0.01099     0.13       0.119012     3.13228
ITS2     Group A     Group B       0.002052    0.1445     0.146543     2.7442
ITS2     Group A     Group C       0.02403     0.1155     0.0914808    2.28199
ITS2     Group B     Group C       0.108       0.05294    0.0550624    0.982182

(a) LI and LII represent the average number of substitutions per site of clusters I and II, respectively.
(b) d = LI − LII.
(c) z-Scores are the values from the standardized normal distribution. Rate constancy is rejected at the 5% level when the z-score is greater than 1.96.
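The decision rule in the table notes (rate constancy rejected at the 5% level when the z-score exceeds 1.96) can be written out as a short check. This is a minimal sketch of the threshold logic only and does not reproduce the two-cluster test implemented in PHYLTEST. The z-scores are copied from Table 1, while the function names are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value of a z-score under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


def rate_constancy_rejected(z, critical=1.96):
    """True when |z| exceeds the critical value (1.96 for the 5% level)."""
    return abs(z) > critical


# z-scores taken from Table 1 (a subset, for illustration).
z_scores = {
    "ITS1: A vs B": 2.16934,
    "ITS1: B vs C": 3.15685,
    "5.8S: B vs C": 0.682917,
    "ITS2: B vs C": 0.982182,
}
decisions = {name: rate_constancy_rejected(z) for name, z in z_scores.items()}
for name, z in z_scores.items():
    print(name, round(two_sided_p(z), 4), decisions[name])
```

Under this rule the group B versus group C comparison is rejected for ITS1 but not for 5.8S or ITS2, in agreement with the results reported in Section 3.4.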
A:T → G:C transition mutations were observed, and the transversion mutation (A → T) was even less frequent. Similar substitution patterns of pseudogenes have been reported in bacteria (Andersson and Andersson, 1999; Hershberg and Petrov, 2010), animals (Petrov and Hartl, 1999) and plants (Márquez et al., 2003; Zheng et al., 2008). Furthermore, most of the parsimony-informative variable sites in group A were located in unpaired regions, where they would not disrupt the secondary structures, but those in groups B and C were located in both paired and unpaired regions (randomly), damaging the secondary structures as seen in Figs. S1 and S2. These findings regarding the mutation pattern of O. sinensis support the hypothesis that the mutation pattern is fairly conserved in all organisms (Mitchell and Graur, 2005). The high frequency of G:C to A:T transition mutations was thought to result from a mechanism called repeat-induced point mutation, which is unique to fungi and can cause numerous G:C to A:T point mutations within duplicated sequences (Selker et al., 1993).
4.3. Proportion of ITS pseudogenes
The proportion of ITS pseudogenic copies varies in different organisms. In the cactus genus Mammillaria, less than 3% of the ITS copies were functional, whereas no pseudogenic ITS sequences were detected in the plant model species Arabidopsis thaliana (Harpke and Peterson, 2007). In O. sinensis, ITS pseudogenes were amplified from only some of the tested strains using the specific primers designed. This observation suggests a low proportion and differentiation of pseudogenic copies in O. sinensis individuals. It may be possible that pseudogenic copies in fungal genomes are far less abundant than those in plants and animals, explaining why they have been identified less often than in plant and animal genomes.
4.4. Detection of ITS pseudogenes
In O. sinensis, pseudogenic ITS copies frequently could not be amplified under normal PCR conditions because of their low proportion.
Most ITS pseudogenes from O. sinensis previously deposited in GenBank were occasionally discovered by cloning and sequencing, e.g., AB067739–AB067749 submitted by Kinjo and Zang (2001). Xiao et al. (2009) designed specific primers for the ITS sequences of groups A, B and C but submitted only those of groups A and C to GenBank, EU555436 and EU555438, respectively. Attempts to amplify the sequences of group B using several specific primers designed in this study were also unsuccessful (data not shown). Direct sequencing of PCR products sometimes produces sequences with undeterminable sites, especially multiple peaks of accordant substitution of G:C ? A:T at various sites, which may indicate the presence of pseudogenes. Moreover, Zhu et al. (2010) developed a method called MassARRAY SNP genotyping to detect ITS mutants in O. sinensis, producing data consisting of single nucleotide positions but not the complete sequence. As stated above, it is an interesting lead requiring further studies. 4.5. The impact of non-concerted ITS evolution on phylogeny The impact of pseudogenic ITS sequences on phylogenetic and phylogeographic analyses has been widely explored since they were discovered. ITS pseudogenes provided an excellent outgroup in reconstructing the phylogeny of the genus Zea (Buckler and Holtsford, 1996a). Some types of ITS pseudogenes were thought to be of great phylogenetic utility in reconstructing well-supported phylogenetic trees in Pyrus (Zheng et al., 2008). Pseudogenic ITS sequences were also used in a phylogenetic analysis of Naucleeae when no sequences of their functional counterparts were available (Razafimandimbison et al., 2004) and in a phylogeographic study
of Carapichea ipecacuanha (Queiroz et al., 2011). In most cases, however, ITS pseudogenes caused confusion in phylogenetic reconstructions. AT-biased substitutions of ITS pseudogenes resulted in long branches (e.g., Buckler et al., 1997; Wei et al., 2003; Won and Renner, 2005), and the long-branch attraction could have caused phylogenetic errors (Wei et al., 2003). Putative ITS pseudogenes were sometimes not clustered with conspecific functional sequences (e.g., Muir et al., 2001), which would disturb the phylogenetic estimation. For O. sinensis in this study, although clustered with functional genes (group A) in the phylogenetic tree, pseudogenic ITS sequences (groups B and C) caused long branches (Fig. 1). Molecular studies based on ITS sequences therefore carry a risk of error when pseudogenes are present. However, the impact of pseudogenes can be accounted for, as done by Freire et al. (2012). Investigation of individual ITS polymorphisms and identification of pseudogenes are necessary when using ITS sequences for phylogenetic reconstruction. In addition, when using ITS sequences as the standard barcode markers for DNA barcoding of fungi (Seifert, 2009; Schoch et al., 2012), pseudogenes should also be considered.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (31170017 and 30025002) and the Chinese Academy of Sciences (KSCX2-YW-G-076, KSCX2-YW-G-074-04, KSCX2-SW101C and the scheme of Introduction of Overseas Outstanding Talents).
Appendix A. Supplementary material
Supplementary data associated with this article can be found, in the online version, at http://dx.doi.org/10.1016/j.ympev.2013.04.010.
References
Andersson, J.O., Andersson, S.G.E., 1999. Genome degradation is an ongoing process in Rickettsia. Mol. Biol. Evol. 16, 1178–1191. Bailey, C.D., Carr, T.G., Harris, S.A., Hughes, C.E., 2003. Characterization of angiosperm nrDNA polymorphism, paralogy, and pseudogenes. Mol. Phylogenet. Evol. 29, 435–455. Buckler, E.S., Holtsford, T.P., 1996a.
Zea systematics: ribosomal ITS evidence. Mol. Biol. Evol. 13, 612–622. Buckler, E.S., Holtsford, T.P., 1996b. Zea ribosomal repeat evolution and substitution patterns. Mol. Biol. Evol. 13, 623–632. Buckler, E.S., Ippolito, A., Holtsford, T.P., 1997. The evolution of ribosomal DNA: divergent paralogues and phylogenetic implications. Genetics 145, 821–832. Coen, E., Strachan, T., Dover, G., 1982. Dynamics of concerted evolution of ribosomal DNA and histone gene families in the melanogaster species subgroup of Drosophila. J. Mol. Biol. 158, 17–35. De Rijk, P., Wuyts, J., De Wachter, R., 2003. RnaViz 2: an improved representation of RNA secondary structure. Bioinformatics 19, 299–300. Dong, C.H., Yao, Y.J., 2011. On the reliability of fungal materials used in studies on Ophiocordyceps sinensis. J. Ind. Microbiol. Biot. 38, 1027–1035. Fell, J.W., Scorzetti, G., Statzell-Tallman, A., Boundy-Mills, K., 2007. Molecular diversity and intragenomic variability in the yeast genus Xanthophyllomyces: the origin of Phaffia rhodozyma? FEMS Yeast Res. 7, 1399–1408. Freire, R., Arias, A., Méndez, J., Insua, A., 2010. Sequence variation of the internal transcribed spacer region of the ribosomal DNA in Cerastoderma species. J. Mollus. Stud. 76, 77–86. Freire, M.C.M., da Silva, M.R., Zhang, X.C., Almeida, A.M.R., Stacey, G., de Oliveira, L.O., 2012. Nucleotide polymorphism in the 5.8S nrDNA gene and internal transcribed spacers in Phakopsora pachyrhizi viewed from structural models. Fungal Genet. Biol. 49, 95–100. Ganley, A.R.D., Kobayashi, T., 2007. Highly efficient concerted evolution in the ribosomal DNA repeats: total rDNA repeat variation revealed by whole-genome shotgun sequence data. Genome Res. 17, 184–191. Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 41, 95–98. Harpke, D., Peterson, A., 2007. 
Quantitative PCR revealed a minority of ITS copies to be functional in Mammillaria (Cactaceae). Int. J. Plant Sci. 168, 1157–1160. Hartmann, S., Nason, J.D., Bhattacharya, D., 2001. Extensive ribosomal DNA genic variation in the columnar cactus Lophocereus. J. Mol. Evol. 53, 124–134. Hershberg, R., Petrov, D.A., 2010. Evidence that mutation is universally biased towards AT in bacteria. PLoS Genet. 6, e1001115.
Hijri, M., Hosny, M., van Tuinen, D., Dulieu, H., 1999. Intraspecific ITS polymorphism in Scutellospora castanea (Glomales, Zygomycota) is structured within multinucleate spores. Fungal Genet. Biol. 26, 141–151. Hřibová, E., Čížková, J., Christelová, P., Taudien, S., de Langhe, E., Doležel, J., 2011. The ITS1-5.8S-ITS2 sequence region in the Musaceae: structure, diversity and use in molecular phylogeny. PLoS One 6, e17863. James, T.Y., Kauff, F., Schoch, C.L., Matheny, P.B., Hofstetter, V., Cox, C.J., Celio, G., Gueidan, C., Fraker, E., Miadlikowska, J., Lumbsch, H.T., Rauhut, A., Reeb, V., Arnold, A.E., Amtoft, A., Stajich, J.E., Hosaka, K., Sung, G.H., Johnson, D., O’Rourke, B., Crockett, M., Binder, M., Curtis, J.M., Slot, J.C., Wang, Z., Wilson, A.W., Schüßler, A., Longcore, J.E., O’Donnell, K., Mozley-Standridge, S., Porter, D., Letcher, P.M., Powell, M.J., Taylor, J.W., White, M.M., Griffith, G.W., Davies, D.R., Humber, R.A., Morton, J.B., Sugiyama, J., Rossman, A.Y., Rogers, J.D., Pfister, D.H., Hewitt, D., Hansen, K., Hambleton, S., Shoemaker, R.A., Kohlmeyer, J., Volkmann-Kohlmeyer, B., Spotts, R.A., Serdani, M., Crous, P.W., Hughes, K.W., Matsuura, K., Langer, E., Langer, G., Untereiner, W.A., Lücking, R., Büdel, B., Geiser, D.M., Aptroot, A., Diederich, P., Schmitt, I., Schultz, M., Yahr, R., Hibbett, D.S., Lutzoni, F., McLaughlin, D.J., Spatafora, J.W., Vilgalys, R., 2006. Reconstructing the early evolution of Fungi using a six-gene phylogeny. Nature 443, 818–822. Jiang, Y., Yao, Y.J., 2004. Current understanding of molecular systematics of Cordyceps. J. Fungal Res. 2, 58–67 (in Chinese). Jiao, L., 2010. Phylogeographic Study on Ophiocordyceps sinensis. Beijing: Thesis Submitted for the Doctoral Degree, Graduate School of Chinese Academy of Sciences. Kimura, M., 1980.
A simple method for estimating evolutionary rate of base substitutions through comparative studies of nucleotide sequences. J. Mol. Evol. 16, 111–120. Kinjo, N., Zang, M., 2001. Morphological and phylogenetic studies on Cordyceps sinensis distributed in southwestern China. Mycoscience 42, 567–574. Kumar, S., 1996. PHYLTEST: Phylogenetic Hypothesis Testing, Version 2.0. Pennsylvania State University, University Park. Liao, D.Q., 1999. Concerted evolution: molecular mechanism and biological implications. Am. J. Hum. Genet. 64, 24–30. Lindner, D., Banik, M., 2011. Intra-genomic variation in the ITS rDNA region obscures phylogenetic relationships and inflates estimates of operational taxonomic units in genus Laetiporus. Mycologia 103, 731–740. Liu, Z.Y., Yao, Y.J., Liang, Z.Q., Liu, A.Y., Pegler, D.N., Chase, M.W., 2001. Molecular evidence for the anamorph–teleomorph connection in Cordyceps sinensis. Mycol. Res. 105, 827–832. Márquez, L.M., Miller, D.J., MacKenzie, J.B., van Oppen, M.J.H., 2003. Pseudogenes contribute to the extreme diversity of nuclear ribosomal DNA in the hard coral Acropora. Mol. Biol. Evol. 20, 1077–1086. Mathews, D.H., Sabina, J., Zuker, M., Turner, D.H., 1999. Expanded sequence dependence of thermodynamic parameters improves prediction of RNA secondary structure. J. Mol. Biol. 288, 911–940. Mitchell, A., Graur, D., 2005. Inferring the pattern of spontaneous mutation from the pattern of substitution in unitary pseudogenes of Mycobacterium leprae and a comparison of mutation patterns among distantly related organisms. J. Mol. Evol. 61, 795–803. Muir, G., Fleming, C.C., Schlötterer, C., 2001. Three divergent rDNA clusters predate the species divergence in Quercus petraea (Matt.) Liebl. and Quercus robur L.. Mol. Biol. Evol. 18, 112–119. Nei, M., Rooney, A.P., 2005. Concerted and birth-and-death evolution of multigene families. Annu. Rev. Genet. 39, 121–152. O’Donnell, K., Cigelnik, E., 1997. 
Two divergent intragenomic rDNA ITS2 types within a monophyletic lineage of the fungus Fusarium are nonorthologous. Mol. Phylogenet. Evol. 7, 103–116. Petrov, D.A., Hartl, D.L., 1999. Patterns of nucleotide substitution in Drosophila and mammalian genomes. Proc. Natl. Acad. Sci. USA 96, 1475–1479. Posada, D., 2008. jModelTest: phylogenetic model averaging. Mol. Biol. Evol. 25, 1253–1256. Queiroz, C.S., Batista, F.R.C., de Oliveira, L., 2011. Evolution of the 5.8S nrDNA gene and internal transcribed spacers in Carapichea ipecacuanha (Rubiaceae) within a phylogeographic context. Mol. Phylogenet. Evol. 59, 293–302. Razafimandimbison, S.G., Kellogg, E.A., Bremer, B., 2004. Recent origin and phylogenetic utility of divergent ITS putative pseudogenes: a case study from Naucleeae (Rubiaceae). Syst. Biol. 53, 177–192. Ronquist, F., Huelsenbeck, J., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Rooney, A.P., Ward, T.J., 2005. Evolution of a large ribosomal RNA multigene family in filamentous fungi: birth and death of a concerted evolution paradigm. Proc. Natl. Acad. Sci. USA 102, 5084–5089. Schoch, C.L., Seifert, K.A., Huhndorf, S., Robert, V., Spouge, J.L., Levesque, C.A., Chen, W., Fungal Barcoding Consortium, 2012. Nuclear ribosomal internal transcribed spacer region as a universal DNA barcode marker for Fungi. Proc. Natl. Acad. Sci. USA.
Seifert, K.A., 2009. Progress towards DNA barcoding of fungi. Mol. Ecol. Resour. 9 (Suppl. 1), 83–89. Selker, E.U., Fritz, D.Y., Singer, M.J., 1993. Dense nonsymmetrical DNA methylation resulting from repeat-induced point mutation in Neurospora. Science 262, 1724–1728. Simon, U.K., Weiß, M., 2008. Intragenomic variation of fungal ribosomal genes is higher than previously thought. Mol. Biol. Evol. 25, 2251–2254. Stensrud, Ø., Hywel-Jones, N.L., Schumacher, T., 2005. Towards a phylogenetic classification of Cordyceps: ITS nrDNA sequence data confirm divergent lineages and paraphyly. Mycol. Res. 109, 41–56. Stensrud, Ø., Schumacher, T., Shalchian-Tabrizi, K., Svegården, I.B., Kauserud, H., 2007. Accelerated nrDNA evolution and profound AT bias in the medicinal fungus Cordyceps sinensis. Mycol. Res. 111, 409–415. Swofford, D.L., 2002. PAUP* Beta Version. Phylogenetic Analysis Using Parsimony (*and Other Methods). Sinauer Associates, Sunderland, MA. Takezaki, N., Rzhetsky, A., Nei, M., 1995. Phylogenetic test of the molecular clock and linearized trees. Mol. Biol. Evol. 12, 823–833. Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680. Thornhill, D.J., Lajeunesse, T.C., Santos, S.R., 2007. Measuring rDNA diversity in eukaryotic microbial systems: how intragenomic variation, pseudogenes, and PCR artifacts confound biodiversity estimates. Mol. Ecol. 16, 5326–5340. Vaughn, J.C., Sperbeck, S., Ramsey, W.J., Lawrence, C.B., 1984. A universal model for the secondary structure of 5.8S ribosomal RNA molecules, their contact sites with 28S ribosomal RNAs, and their prokaryotic equivalent. Nucleic Acids Res. 12, 7479–7502. Wang, D.M., Yao, Y.J., 2005. Intrastrain internal transcribed spacer heterogeneity in Ganoderma species. Can. J. Microbiol. 51, 113–121.
Wei, X.X., Wang, X.Q., Hong, D.Y., 2003. Marked intragenomic heterogeneity and geographical differentiation of nrDNA ITS in Larix potaninii (Pinaceae). J. Mol. Evol. 57, 623–635. White, T.J., Bruns, T., Lee, S., Taylor, J.W., 1990. Amplification and direct sequencing of fungal ribosomal RNA genes for phylogenetics. In: Innis, M.A., Gelfand, D.H., Sninsky, J.J., White, T.J. (Eds.), PCR Protocols, A Guide to Methods and Applications. Academic Press, London, pp. 315–322. Won, H., Renner, S.S., 2005. The internal transcribed spacer of nuclear ribosomal DNA in the gymnosperm Gnetum. Mol. Phylogenet. Evol. 36, 581–597. Xia, X., Xie, Z., 2001. DAMBE: data analysis in molecular biology and evolution. J. Hered. 92, 371–373. Xiao, W., Yang, J.L., Zhu, P., Cheng, K.D., He, H.X., Zhu, H.X., Wang, Q., 2009. Nonsupport of species complex hypothesis of Cordyceps sinensis by targeted rDNA-ITS sequence analysis. Mycosystema 28, 724–730. Xiao, L.Q., Möller, M., Zhu, H., 2010. High nrDNA ITS polymorphism in the ancient extant seed plant Cycas: incomplete concerted evolution and the origin of pseudogenes. Mol. Phylogenet. Evol. 55, 168–177. Yao, Y.J., Pegler, D.N., Chase, M.W., 1999. Application of ITS (nrDNA) sequences in the phylogenetic study of Tyromyces s.l. Mycol. Res. 103, 219–229. Zhang, Y.J., Xu, L.L., Zhang, S., Liu, X.Z., An, Z.Q., Wang, M., Guo, Y.L., 2009. Genetic diversity of Ophiocordyceps sinensis, a medicinal fungus endemic to the Tibetan Plateau: implications for its evolution and conservation. BMC Evol. Biol. 9, 290. Zheng, X.Y., Cai, D.Y., Yao, L.H., Teng, Y.W., 2008. Non-concerted ITS evolution, early origin and phylogenetic utility of ITS pseudogenes in Pyrus. Mol. Phylogenet. Evol. 48, 892–903. Zhu, J.S., Gao, L., Li, X.H., Yao, Y.S., Zhao, J.Q., Zhou, Y.J., Lu, J.H., 2010.
Maturational alteration of oppositely orientated rDNA and differential proliferation of GC- and AT-biased genotypes of Ophiocordyceps sinensis and Paecilomyces hepiali in natural Cordyceps sinensis. Am. J. Biomed. Sci. 2, 217–238. Zuker, M., 2003. Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Res. 31, 3406–3415.