Journal Pre-proof

Mitochondrial genomes of four satyrine butterflies and phylogenetic relationships of the family Nymphalidae (Lepidoptera: Papilionoidea)

Mingsheng Yang, Lu Song, Lin Zhou, Yuxia Shi, Nan Song, Yalin Zhang

PII: S0141-8130(19)37136-3
DOI: https://doi.org/10.1016/j.ijbiomac.2019.12.008
Reference: BIOMAC 14035
To appear in: International Journal of Biological Macromolecules
Received date: 4 September 2019
Revised date: 1 December 2019
Accepted date: 2 December 2019

Please cite this article as: M. Yang, L. Song, L. Zhou, et al., Mitochondrial genomes of four satyrine butterflies and phylogenetic relationships of the family Nymphalidae (Lepidoptera: Papilionoidea), International Journal of Biological Macromolecules (2019), https://doi.org/10.1016/j.ijbiomac.2019.12.008

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier.
Mingsheng Yang a, b, Lu Song b, Lin Zhou a, Yuxia Shi b, Nan Song c, Yalin Zhang a

a Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, Entomological Museum, Northwest A&F University, Yangling, Shaanxi 712100, China;
b College of Life Science and Agronomy, Zhoukou Normal University, Zhoukou, Henan 466000, China;
c College of Plant Protection, Henan Agricultural University, Zhengzhou, Henan 450002, China.

*Correspondence: Key Laboratory of Plant Protection Resources and Pest Management of Ministry of Education, Entomological Museum, Northwest A&F University, Yangling, Shaanxi 712100, China. E-mail: [email protected]
ABSTRACT
The complete mitochondrial genomes (mitogenomes) of four Satyrini butterflies are newly determined and comparatively analyzed. These mitogenomes are all circular, double-stranded molecules, with lengths of 15,194 bp (Minois dryas), 15,232 bp (Ypthima motschulskyi), 15,217 bp (Neope muirheadi) and 15,279 bp (Mycalesis francisca). Gene content and arrangement of the newly sequenced mitogenomes are highly conserved and typical of Lepidoptera. Interestingly, in M. francisca, a 48-bp microsatellite insertion, (TA)24, is present at the trnE and trnF junction, which is rare in Lepidoptera. Among the 13 protein-coding genes (PCGs) of reported Satyrinae mitogenomes, atp8 is a comparatively fast-evolving gene, and most PCGs of the four species sequenced show significant codon usage bias. Phylogenetic analyses based on the mitogenomes placed the four species sequenced in this study in Satyrini, confirming the result of morphological phylogeny. Moreover, phylogenetic analyses of the family Nymphalidae, based on expanded taxon sampling and gene data from GenBank and the present study, show that several subtribe-level relationships in the speciose Satyrini are well supported, in agreement with those previously defined by multiple-locus investigations. However, the subfamily-level relationships are not fully consistent across inference methods, and this needs further investigation based on mitogenome sequences from increased taxon sampling.
Keywords: Insect mitogenome; next-generation sequencing; Satyrinae
1. Introduction
The Nymphalidae (brush-footed butterflies) is the most speciose family of butterflies, with about 6,000 species distributed on all continents except Antarctica [1, 2]. Owing to their species richness and ecological diversification, nymphalids have been intensely investigated as model taxa in ecological, conservation, evolutionary and developmental studies [3–5]. Regarding the classification of Nymphalidae, twelve subfamilies have been defined, and their relationships have been inferred based on comprehensive data consisting of morphological characters and molecular sequences (e.g., Wahlberg et al. [6]) as well as a phylogenomic dataset [7]. However, Nymphalidae phylogeny remains far from satisfactorily resolved, mainly because of weakly supported or unstable nodes bearing some subfamilies. Satyrinae and Satyrini, including about 2,800 and 2,200 species, respectively, are the most diverse subfamily and tribe in Nymphalidae. Nine tribes (including Satyrini) have been delimited within Satyrinae, and 13 subtribes within Satyrini [2, 6, 8–11]. However, the phylogeny among them remains largely unresolved [2, 6, 8, 10, 11], and the key factor responsible for this is regarded as long-branch attraction associated with the rapid radiation of this group [10, 11].
The mitochondrial genome (mitogenome) is a circular, double-stranded molecule that usually encodes 37 genes, including 13 protein-coding genes (PCGs), two ribosomal RNA genes (rRNAs) and 22 transfer RNA genes (tRNAs), and also contains a non-coding A + T-rich region [12]. Due to cellular abundance, an absence of introns, rapid evolution, and a lack of extensive recombination, mitochondrial sequences can be easily amplified. Moreover, in recent years, the number of sequenced animal mitogenomes has been increasing with the development of sequencing technology, which has in turn provided effective data for studies on systematics, population genetics and evolutionary biology (reviewed by Cameron [13]; e.g., Timmermans et al. [14]; Song et al. [15]; Li et al. [16]; Nie et al. [17]; Song et al. [18]).
In recent years, several mitogenome-based investigations focusing on Nymphalidae phylogeny have also been performed. One landmark study was conducted by Wu et al. [19], in which 30 nymphalid mitogenomes were sequenced and a phylogenetic analysis was performed based on these mitogenomes together with others from GenBank. The phylogenetic pattern recovered for the ten nymphalid subfamilies involved largely conformed to previous morphological, multiple-locus and phylogenomic studies [6, 7, 20]. One difference is that the basal group of Nymphalidae is Danainae in Wu et al. [19], rather than Libytheinae or Libytheinae plus Danainae as recovered by the latter studies [6, 7, 20]. Recently, subfamily-level relationships similar to those of Wu et al. [19] (including the basal position of Danainae) were recovered by two other mitogenomic studies [21, 22]. However, the relationships among the Apaturinae, Biblidinae and Cyrestinae were inconsistent between the two studies, and the Biblidinae and Cyrestinae were not sampled in Wu et al. [19].
In previous mitogenome-based investigations of Nymphalidae, including the Satyrinae, only partial genome sequences or limited taxon sampling were used. Therefore, a comprehensive phylogenetic investigation of this family based on denser taxon sampling and mitochondrial sequence coverage is needed. Satyrinae is the most diverse group of Nymphalidae, but mitogenomes of only 18 species have been sequenced (Table 1). In the present study, we newly determined and comparatively analyzed the complete mitogenomes of four Satyrini species: Minois dryas, Ypthima motschulskyi, Neope muirheadi and Mycalesis francisca. In addition, a comprehensive phylogenetic investigation of Nymphalidae was performed with expanded mitochondrial gene data and taxon sampling (70 species representing all 64 nymphalid genera with mitogenomes available). The aim of this study is to use all available satyrine mitogenomes to improve the understanding of Satyrinae and Nymphalidae phylogeny.
2. Materials and methods
2.1. Sample collection, identification and DNA extraction
Adult specimens were sampled at the Jigongshan mountain range, Henan Province, China, in June 2018. Specimens were identified according to morphological descriptions and illustrations (especially of the genitalia) in Chou [31, 32]. Thorax muscle tissues were isolated to extract genomic DNA using the DNeasy tissue kit (Qiagen, Hilden, Germany). Voucher specimens are deposited in the Biology Laboratory of Zhoukou Normal University, China.
2.2. Mitogenome sequencing, assembly, annotation and sequence analysis
Raw mitogenome sequences of the four Satyrinae species were obtained by next-generation sequencing. After the extracted total DNA was quantified, the whole-genome shotgun method was used to construct a library with the TruSeq DNA PCR-Free Sample Preparation Kit (Illumina, United States). The Illumina MiSeq platform was then employed for sequencing with a 250-bp paired-end strategy. FastQC was used for quality control. After processing with AdapterRemoval v. 2 [33] and SOAPec v. 2.01 [34], raw paired reads were filtered into high-quality reads. Then, A5-miseq v. 20150522 [35] and SPAdes v. 3.9.0 [36] were employed in de novo assembly, generating contig and scaffold sequences. Lastly, mitochondrial sequences were identified using the BLASTn method, and MUMmer v. 3.1 [37] was used to establish positional relationships among contig sequences and to fill in possible gaps.
The MITOS webserver was employed to annotate the complete mitogenome sequences with the invertebrate genetic code [38]. Furthermore, the tRNAscan-SE server v. 1.21 [39] was used to re-identify the 22 tRNAs and to reconfirm their secondary structures. MEGA v. 6.06 [40] was used to reconfirm gene boundaries by aligning the newly sequenced mitogenomes with previously reported satyrine mitogenomes. In the A + T-rich region, the Tandem Repeats Finder program (http://tandem.bu.edu/trf/trf.html) [41] was used to detect possible tandem repeat elements. Strand asymmetry was calculated according to the following formulas: AT skew = [A – T]/[A + T] and GC skew = [G – C]/[G + C] [42]. Base composition and relative synonymous codon usage (RSCU) were calculated using MEGA v. 6.06 [40]. Nucleotide diversity and the ratio of nonsynonymous substitutions (Ka) to synonymous substitutions (Ks) for all PCGs were calculated using DnaSP v. 5.0 [43]. In addition, the effective number of codons (Nc) was measured using CodonW 1.4.2 [44].
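The strand-asymmetry formulas above, AT skew = [A – T]/[A + T] and GC skew = [G – C]/[G + C], can be illustrated with a minimal Python sketch. The function name `base_skews` and the toy sequence are hypothetical and serve illustration only; this is not the software used in the study, and ambiguous bases are simply ignored here.

```python
from collections import Counter


def base_skews(seq: str) -> tuple[float, float]:
    """Return (AT skew, GC skew) for one strand of a nucleotide sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Characters other than A, T, G and C are ignored (an assumption of this sketch).
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    # Guard against empty denominators so the sketch never divides by zero.
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_skew, gc_skew


# Toy sequence (not real data): 3 A, 4 T, 1 G, 2 C.
toy_sequence = "AAATTTTGCC"
at_skew, gc_skew = base_skews(toy_sequence)
print(f"AT skew = {at_skew:.3f}, GC skew = {gc_skew:.3f}")
```

A negative AT skew means T outnumbers A on the analyzed strand, and a negative GC skew means C outnumbers G.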
2.3. Phylogenetic analyses
Phylogenetic analyses were performed based on extensive taxon sampling consisting of all 64 nymphalid genera and all 22 species of 16 Satyrinae genera with sequenced mitogenomes. Accordingly, the ingroup included 70 nymphalids, and three Lycaenidae species were selected as outgroup taxa (Table S1) because the Lycaenidae was recovered as sister to the Nymphalidae by Wu et al. [19]. Among the mitochondrial genes, the 13 PCGs have been widely used in previous mitogenomic studies, and the rRNA genes might also be informative in resolving Satyrinae phylogeny, as demonstrated by our previous studies [45, 46]. Therefore, two datasets were compiled in this study: one consisting of all 37 mitochondrial genes (PCG-rRNA-tRNA) and the other consisting of the 13 PCGs and two rRNAs (PCG-rRNA). Sequence alignments were performed on the TranslatorX online platform [47] for the 13 PCGs, and on the MAFFT online platform under the Q-INS-i algorithm [48] for the two rRNAs and 22 tRNAs. Nucleotide substitution models were selected using PartitionFinder v. 1.1.1 [49], with the Bayesian Information Criterion (BIC) and the greedy search algorithm. The best partition scheme and corresponding nucleotide substitution models are shown in Table S2. Maximum likelihood (ML) analysis was performed using IQ-TREE 1.6.7.1 [50] with the models determined by PartitionFinder. Bootstrap support (BS) was assessed using 1,000 ultrafast bootstrap replicates. Bayesian inference (BI) analysis was performed using MrBayes 3.1.2 [51] based on the same models as used in the ML analysis. In this analysis, two independent Markov chain Monte Carlo (MCMC) runs were performed for 10,000,000 generations, sampling every 100 generations. Convergence between the two runs was assessed with Tracer version 1.6 (effective sample sizes > 200) [52]. After the first 25% of the sampled trees were discarded as burn-in, a 50% majority-rule consensus tree with posterior probabilities (PP) was generated from the remaining trees.
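As a rough check on the MCMC settings described above, the following Python sketch works out how many samples each run would retain after burn-in. The function name `retained_samples` is hypothetical, and the arithmetic assumes one sample per sampling interval and ignores the initial state of the chain; the exact count reported by the software may differ by one.

```python
def retained_samples(generations: int, sample_every: int, burnin_fraction: float):
    """Return (total, discarded, retained) sample counts for one MCMC run.

    Assumes one sample per `sample_every` generations and that the first
    `burnin_fraction` of samples is discarded (an idealized sketch).
    """
    total = generations // sample_every
    discarded = int(total * burnin_fraction)
    return total, discarded, total - discarded


# Settings stated in the text: 10,000,000 generations, sampling every 100, 25% burn-in.
total, discarded, retained = retained_samples(10_000_000, 100, 0.25)
print(f"per run: {total} sampled, {discarded} discarded, {retained} retained")
```

Under these assumptions, each run contributes 75,000 post-burn-in trees to the majority-rule consensus.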
3. Results and discussion
3.1. General features of the newly sequenced mitogenomes
The four newly sequenced mitogenomes are all circular, double-stranded molecules, with lengths of 15,194 bp (M. dryas), 15,232 bp (Y. motschulskyi), 15,217 bp (N. muirheadi) and 15,279 bp (M. francisca) (Tables 2–3). These lengths are comparable to those of other completely sequenced Satyrinae mitogenomes, which range from 15,122 bp in Melanitis leda to 15,721 bp in Stichophthalma louisa. Gene content and arrangement of the four Satyrinae mitogenomes are highly conserved and typical of Lepidoptera. The typical 37 mitochondrial genes (13 PCGs, 22 tRNAs, and two rRNAs) and an A + T-rich region are included; 23 of the genes (nine PCGs and 14 tRNAs) are encoded on the majority strand (J-strand), and the remaining genes are located on the minority strand (N-strand). Interestingly, in the Lethe albolineata mitogenome [25], two tRNA-like pseudo-genes were recognized, in the rrnL gene and the A + T-rich region, respectively. The existence of a pseudo-gene in the A + T-rich region has also been reported in other butterflies such as Spindasis takanonis and Protantigius superans of Lycaenidae [53].
Similar to other insect mitogenomes [12], the A/T content of Satyrinae mitogenomes is highly biased, ranging from 79.9% in M. francisca to 81.8% in Y. motschulskyi (Table 3). AT skew and GC skew are routinely used to describe the base composition of mitogenomes [42, 54]. The negligible T skew and moderate C skew in the newly sequenced mitogenomes are similar to those of other Lepidoptera and most insect species [55].
3.2. Protein-coding genes
The total lengths of the 13 PCGs of M. dryas, Y. motschulskyi, N. muirheadi and M. francisca are 11,225 bp, 11,215 bp, 11,230 bp and 11,228 bp, encoding 3,741, 3,738, 3,743 and 3,742 amino acids, respectively (Table 3). In all sequenced mitogenomes, nine of the 13 PCGs are encoded on the J-strand, while the other four are located on the N-strand. The A + T content of the 13 PCGs varies from 78.1% in M. francisca to 80.5% in Y. motschulskyi. Regarding start and stop codons, most PCGs use the conventional ATN as the start codon (Table 2). However, in cox1, the unconventional CGA is consistently found. TAA is employed as the stop codon in most PCGs, but the incomplete termination codon T is consistently used in cox2 of three species and nad4 of all four species. Incomplete termination codons are commonly recognized across arthropod mitogenomes, which may be related to post-transcriptional modification during the mRNA maturation process [56].
To characterize the evolutionary pattern of the 13 PCGs, nucleotide diversity and the Ka/Ks ratio across all Satyrinae mitogenomes were calculated for each aligned PCG. As shown in Fig. 1, nad6 and atp8 show the highest nucleotide diversity, whereas cox1 shows the lowest. The Ka/Ks value for atp8 is the highest, followed by nad4l, nad5 and nad6, and the lowest value is for cox1, mirroring the pattern of nucleotide diversity. Notably, Ka/Ks values for all PCGs are lower than one, indicating that they are evolving under purifying selection. From these analyses, we conclude that atp8 is under the weakest selection pressure and is the fastest-evolving gene among the mitochondrial PCGs in Satyrinae.
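To make the nucleotide diversity statistic used above more concrete, here is a minimal Python sketch that computes the average proportion of pairwise differences per site over a set of aligned sequences. The function name `nucleotide_diversity` and the toy alignment are hypothetical; the sketch assumes equal-length, gap-free sequences and does not reproduce the gap handling or corrections that a dedicated program such as DnaSP may apply.

```python
from itertools import combinations


def nucleotide_diversity(aligned: list[str]) -> float:
    """Average pairwise proportion of differing sites across aligned sequences.

    Assumes at least two sequences of equal length with no gaps
    (a simplification for illustration).
    """
    if len(aligned) < 2:
        raise ValueError("need at least two aligned sequences")
    length = len(aligned[0])
    if length == 0 or any(len(s) != length for s in aligned):
        raise ValueError("sequences must be non-empty and of equal length")

    pair_values = []
    for seq1, seq2 in combinations(aligned, 2):
        # Count mismatching sites for this pair and normalize by alignment length.
        differences = sum(1 for x, y in zip(seq1, seq2) if x != y)
        pair_values.append(differences / length)
    return sum(pair_values) / len(pair_values)


# Toy alignment (not real data): pairwise differences are 1, 2 and 1 over 8 sites.
toy_alignment = ["ATGCATGC", "ATGCATGA", "ATGTATGA"]
pi = nucleotide_diversity(toy_alignment)
print(f"nucleotide diversity = {pi:.4f}")
```

Higher values indicate that the aligned gene varies more among the sampled species, which is the sense in which nad6 and atp8 are contrasted with cox1.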
To understand the codon usage bias of the newly sequenced mitogenomes, the RSCU and Nc were measured. As shown in Fig. 2, the usage of synonymous codons is biased for most amino acids. Moreover, the synonymous codon preferences are conserved among the four species, which may be ascribed to their close relationship as members of the same butterfly tribe, and these preferences have also been recognized in some other lepidopterans [57, 58]. In detail, the three most used codons for the four species sequenced are consistently AUU, UUA and UUU. These codons encode Ile, Leu and Phe, which are also the most frequently used amino acids in these species. The codon usage bias was further evaluated by the Nc values. Nc values range from 20 to 61 and are negatively correlated with codon usage bias: Nc = 20 indicates absolute bias toward one synonymous codon, whereas Nc = 61 indicates neutral codon usage [59]. Nc values in our analyses ranged from 30.1 to 33.78, indicating a tendency toward codon usage bias among the four species sequenced.
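The RSCU statistic discussed above is commonly defined as the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid. The Python sketch below illustrates that definition for a single synonymous family. The function name `rscu`, the toy coding sequence and the choice of the two-codon Phe family (TTT/TTC, written in the DNA alphabet) are assumptions for illustration; the caller supplies the family, so no particular genetic code table is hard-wired.

```python
from collections import Counter


def rscu(cds: str, family: list[str]) -> dict[str, float]:
    """Relative synonymous codon usage for one synonymous codon family.

    RSCU = observed codon count / mean count over the codons in `family`.
    `cds` is read in frame from its first base; trailing partial codons are dropped.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    counts = Counter(codon for codon in codons if codon in family)
    total = sum(counts.values())
    if total == 0:
        return {codon: 0.0 for codon in family}
    expected = total / len(family)  # count expected under equal usage
    return {codon: counts[codon] / expected for codon in family}


# Toy in-frame sequence (not real data): three TTT, one TTC, one ATT.
toy_cds = "TTT" * 3 + "TTC" + "ATT"
values = rscu(toy_cds, ["TTT", "TTC"])
print(values)
```

An RSCU above 1 means the codon is used more often than expected under equal usage of its synonyms, and a value below 1 means it is used less often.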
3.3. Transfer and ribosomal RNA genes
For all newly sequenced Satyrinae mitogenomes, the typical 22 tRNAs are recognized as expected (Table 2). Among them, eight tRNAs are encoded by the N-strand and the remaining 14 by the J-strand. The lengths range from 62 bp (trnR) to 71 bp (trnK) in M. dryas, Y. motschulskyi and M. francisca, and from 60 bp (trnS1) to 71 bp (trnK) in N. muirheadi. As shown in Fig. 3, all tRNAs exhibit the typical clover-leaf secondary structure, except that trnS1 (AGN) lacks the DHU arm, a feature common to Lepidoptera and to other metazoan mitogenomes [60, 61]. In the four sequenced mitogenomes, 23, 18, 23 and 27 unmatched base pairs are present in M. dryas, Y. motschulskyi, N. muirheadi and M. francisca, respectively, most of which in each species are the overrepresented noncanonical G-U pairs. This overrepresentation is commonly present in tRNAs of insect mitogenomes [62–65]. Further, comparative tRNA analyses among the four newly sequenced mitogenomes found that substantial nucleotide variation exists in several tRNAs, and most of this variation occurs in the DHU loops, TψC arms and TψC loops (Fig. 3).
Two rRNA genes, rrnS and rrnL, were recognized in all newly sequenced mitogenomes; they are located between trnV and the A + T-rich region and between trnV and trnL1, respectively (Table 2). The rrnS lengths range from 772 bp in N. muirheadi to 815 bp in Y. motschulskyi, and the rrnL lengths range from 1,333 bp in N. muirheadi to 1,341 bp in M. francisca.
3.4. Gene overlapping and intergenic regions
Four gene-overlapping regions (longer than 4 nucleotides) are conserved and consistently present in the four newly sequenced mitogenomes (Table 2; Fig. 4). One is composed of “AAGCCTTA” at the trnW and trnC junction (Fig. 4A); the second is a shorter sequence, “TCTAA”, located at the cox1 and trnL2 junction (Fig. 4C). The gene-overlapping region between atp8 and atp6, a 7-bp motif of “ATGATAA”, is regarded as a common feature in Lepidoptera and even in insects generally (Fig. 4B). The fourth gene-overlapping region, at the trnF and nad5 junction, is not completely identical in nucleotide composition across the four mitogenomes (Fig. 4D); the one in Y. motschulskyi is longer than that of the other three species.
In addition to the A + T-rich region, 84, 119, 76 and 124 intergenic nucleotides were recognized in M. dryas, Y. motschulskyi, N. muirheadi and M. francisca, respectively. The long intergenic region, comprising 45–53 intergenic nucleotides, was detected as expected and is consistently present between trnQ and nad2 (Fig. 4E). This region, characterized by high A/T content, is widely present in Lepidoptera and is even regarded as a synapomorphy of lepidopteran species [66, 67]. The intergenic region at the trnS2 and nad1 junction has been widely reported in insect mitogenomes and is characterized by the motif “ATACTAA”, which is responsible for mitochondrial transcription [55, 67, 68]. However, our study reveals that the motif “ATACTAA” is putatively located in the intergenic region only in Y. motschulskyi, whereas it lies at the 3′ end of nad1 in the other three species (Fig. 4F). Accordingly, the predicted nad1 length of Y. motschulskyi is shorter than that of the other three species. However, this result needs to be further confirmed based on nad1 mRNA expression data [65, 69], mainly because the 3′ end of nad1 was predicted only by the position of the routinely used stop codon “TAA”. In addition, an interesting result regarding the intergenic regions is that, at the trnE and trnF junction, a 48-bp microsatellite insertion, (TA)24, was detected only in M. francisca, which to our knowledge is rare in reported butterfly mitogenomes.
The A + T-rich regions of all four sequenced mitogenomes are located between rrnS and trnM (Table 4; Fig. 4G), and the A + T content of this region ranges from 92.7% in M. francisca to 94.6% in M. dryas. The insect mitochondrial A + T-rich region usually contains several conserved sequence blocks responsible for mitogenome replication and transcription [70]. These blocks include (from the 5′ to the 3′ end) the motif “ATAGA” and the subsequent poly-T structure, the motif “ATTTA”, a microsatellite (AT)n or (TA)n element, and an “A”-rich 3′ end upstream of the trnM gene. The insect A + T-rich region is generally characterized by the presence of multiple tandem repeat elements [71]. However, no tandem repeat element was detected in any of the four sequenced mitogenomes.
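A dinucleotide repeat such as the (TA)24 insertion described above can be located with a simple pattern search. The following Python sketch is one possible way to do so, using only the standard `re` module. The function name `find_unit_repeats`, the `min_copies` threshold and the toy sequence are hypothetical, and this is not the Tandem Repeats Finder procedure used in the study; it only finds perfect, uninterrupted repeats.

```python
import re


def find_unit_repeats(seq: str, unit: str = "TA", min_copies: int = 5):
    """Find perfect tandem runs of `unit` with at least `min_copies` copies.

    Returns a list of (start, end, copies) tuples using 0-based,
    end-exclusive coordinates.
    """
    # Build a pattern such as (?:TA){5,} for the requested unit and threshold.
    pattern = re.compile(f"(?:{re.escape(unit.upper())}){{{min_copies},}}")
    hits = []
    for match in pattern.finditer(seq.upper()):
        copies = (match.end() - match.start()) // len(unit)
        hits.append((match.start(), match.end(), copies))
    return hits


# Toy sequence (not real data): 24 copies of TA flanked by three bases on each side.
toy_junction = "GGC" + "TA" * 24 + "CCG"
hits = find_unit_repeats(toy_junction)
for start, end, copies in hits:
    print(f"(TA){copies} at {start}-{end}, length {end - start} bp")
```

For the toy input, the single hit spans 48 bp, matching the length expected for 24 copies of a 2-bp unit.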
3.5. Phylogenetic analyses
The two datasets, when analyzed with the same inference method, yielded identical topologies in terms of subfamily-level relationships. The relationships of the five previously defined Nymphalidae clades obtained from the ML and BI analyses (Figs. 5–6) are mostly identical, but differences exist. In the ML analyses, danaine + (satyrine + (libytheine + (nymphaline + heliconiine))) was recovered, reinforcing the results of previous mitogenomic studies focusing on the Nymphalidae [19, 21, 22]. In the BI analyses, the relationships are libytheine + ((danaine + satyrine) + (nymphaline + heliconiine)). In detail, the Libytheinae is basal to the rest of Nymphalidae, which supports several morphological, multiple-locus and phylogenomic studies [6, 7, 20] rather than the mitogenomic studies [19, 21, 22]. Moreover, a sister relationship between the danaine and satyrine clades was unexpectedly detected, but this relationship may be unstable because of the low posterior probabilities (< 0.95). In the nymphaline clade, the Nymphalinae + Heliconiinae is defined with strong support by both ML and BI analyses, a result consistently recovered by other studies based on various data as well (e.g., Freitas and Brown [20]; Peña and Wahlberg [2]; Wahlberg et al. [6]; Wu et al. [19]; Shi et al. [21]; Liu et al. [22]; Espeland et al. [7]). The heliconiine clade contains five subfamilies, and their relationships recovered herein are Pseudergolinae + (Nymphalinae + (Cyrestinae + (Apaturinae + Biblidinae))), which is completely identical to the result of Liu et al. [22] based on 13 mitochondrial PCGs but differs slightly from that of Shi et al. [21], who employed 13 PCGs plus two rRNAs. In the latter study, the Cyrestinae, instead of Biblidinae, forms a sister group with Apaturinae. In addition, another proposal regarding the phylogeny of the heliconiine clade exists, in which Cyrestinae first clusters with Nymphalinae, and this group is then sister to Apaturinae + Biblidinae [6, 7]. The satyrine clade, including Calinaginae, Charaxinae and Satyrinae, is a well-defined group recovered by various studies [2, 6, 7, 19], and the Calinaginae is always recovered as sister to Charaxinae + Satyrinae, except in our ML analysis based on the dataset including 13 PCGs and two rRNAs. Collectively, our BI analyses provide the first mitogenomic evidence supporting the morphological, multiple-locus and phylogenomic studies [6, 7, 20] regarding the phylogenetic position of Libytheinae in Nymphalidae. However, the results of our ML analyses are consistent with those of previous mitogenomic studies. This discrepancy may be resolved by more comprehensive mitogenome sampling in future studies.
In Nymphalidae, Satyrinae is the most speciose subfamily. However, its phylogenetic relationships at both the tribe and subtribe levels remain largely unresolved [2, 6, 8, 45, 46]. In an attempt to better understand these evolutionary relationships, a preliminary investigation based on mitogenomic data was performed here. At the tribe level, our ML and BI analyses recover the relationships among the four satyrine tribes analyzed as (Amathusiini + Elymniini) + (Melanitini + Satyrini). The close relationship between Amathusiini and Elymniini is in accordance with the BI analysis of Peña and Wahlberg [2], but some studies, including our previous investigation [6, 45], regarded the Elymniini as closer to the Melanitini. Thus, the tribe-level relationships obtained herein need further verification based on more taxa. Satyrini represents the most diverse tribe (about 2,200 species) in Satyrinae, and 13 subtribes are defined [11]. In the present study, mitogenomes of four Satyrini species are newly sequenced, which increases the number of reported Satyrini mitogenomes to 17. Although the taxon sampling is still limited relative to the species richness of Satyrini, the close relationships among some subtribes, such as the grouping of Parargina, Lethina and Mycalesina, and the grouping of Melanargiina, Satyrina and Ypthimina, as recovered in our previous study [46] and other multiple-locus investigations [2, 6, 10], are well supported, especially in our BI analysis. However, it should be noted that only seven of the 13 subtribes are analyzed in the present study, and further effort is needed to improve the understanding of the whole Satyrini phylogeny based on mitogenome sequences from increased sampling.
Selecting suitable genetic markers is of great importance in studies of molecular systematics. Among mitochondrial genes, the 13 PCGs, which are readily alignable and lack secondary structure, have been widely used and proven informative in insect systematics. Furthermore, our previous studies [45, 46] have verified that the rRNA genes are also informative for phylogenetic analysis of the Satyrinae. In the present study, to test the effectiveness of tRNAs in reconstructing Nymphalidae phylogeny, two datasets were compiled: all 37 mitochondrial genes, and the 13 PCGs plus two rRNAs. Our results show that the inclusion of tRNAs, under the same inference method, generally does not affect the topology. However, the support values of most nodes, especially in the ML analysis, are increased, indicating that tRNAs contribute positively to the resolution of these nodes in the present analyses.

Conflicts of Interest
All authors have read and approved the final manuscript. The authors declare no conflict of interest.
Acknowledgements
This work was funded by the Key Laboratory of Plant Protection Resources and Pest Management, Ministry of Education of China (A115020002), the National Natural Science Foundation of China (31702046), and the Project of Scientific Research Innovation Fund for College Student (ZKNUD2019019 and ZKNUD2019076).
References
[1] P.R. Ackery, R. de Jong, R.I. Vane-Wright, The Butterflies: Hedyloidea, Hesperioidea and Papilionoidea. In N.P. Kristensen (Ed.), Handbook of Zoology, vol. IV Arthropoda: Insecta. Lepidoptera, Moths and Butterflies, vol. 1: Evolution, Systematics and Biogeography, Walter de Gruyter, Berlin, 1999, pp. 263–300.
[2] C. Peña, N. Wahlberg, Prehistorical climate change increased diversification of a group of butterflies, Biology Letters 4 (2008) 274–278.
[3] P.M. Sheppard, J. Turner, K. Brown, W. Benson, M. Singer, Genetics and the evolution of muellerian mimicry in Heliconius butterflies, Philos. Trans. R. Soc. Lond. B Biol. Sci. 308 (1985) 433–610.
[4] E. Pollard, T.J. Yates, Monitoring butterflies for ecology and conservation, The British Butterfly Monitoring Scheme, Chapman & Hall, London, 1993.
[5] P.R. Ehrlich, I. Hanski, On the wings of checkerspots: a model system for population biology, Oxford University Press, New York, 2004.
[6] N. Wahlberg, J. Leneveu, U. Kodandaramaiah, C. Peña, S. Nylin, A.V.L. Freitas, A.V.Z. Brower, Nymphalid butterflies diversify following near demise at the Cretaceous/Tertiary boundary, Proceedings of the Royal Society B: Biological Science 276 (2009) 4295–4302.
[7] M. Espeland, J. Breinholt, K.R. Willmott, A.D. Warren, R. Vila, E.F.A. Toussaint, S.C. Maunsell, K. Aduse-Poku, G. Talavera, R. Eastwood, M.A. Jarzyna, R. Guralnick, D.J. Lohman, N.E. Pierce, A.Y. Kawahara, A comprehensive and dated phylogenomic analysis of butterflies, Current Biology 28 (2018) 770–778.
[8] C. Peña, S. Nylin, A.V.L. Freitas, Higher level phylogeny of Satyrinae butterflies (Lepidoptera: Nymphalidae) based on DNA sequence data, Molecular Phylogenetics and Evolution 40 (2006) 29–49.
[9] U. Kodandaramaiah, C. Peña, M.F. Braby, R. Grund, C.J. Müller, S. Nylin, N. Wahlberg, Phylogenetics of Coenonymphina (Nymphalidae: Satyrinae) and the problem of rooting rapid radiations, Molecular Phylogenetics and Evolution 54 (2010) 386–394.
[10] C. Peña, S. Nylin, N. Wahlberg, The radiation of Satyrini butterflies (Nymphalidae: Satyrinae): a challenge for phylogenetic methods, Zoological Journal of the Linnean Society 161 (2011) 64–87.
[11] M.A. Marín, C. Peña, A.V.L. Freitas, N. Wahlberg, S.I. Uribe, From the phylogeny of the Satyrinae butterflies to the systematics of Euptychiina (Lepidoptera: Nymphalidae): history, progress and prospects, Neotropical Entomology 40 (2011) 1–13.
[12] J.L. Boore, Animal mitochondrial genomes, Nucleic Acids Res. 27 (1999) 1767–1780.
[13] S.L. Cameron, Insect mitochondrial genomics: implications for evolution and phylogeny, Annu. Rev. Entomol. 59 (2014) 95–117.
[14] M.J.T.N. Timmermans, D.C. Lees, T.J. Simonsen, Towards a mitogenomic phylogeny of Lepidoptera, Molecular Phylogenetics and Evolution 79 (2014) 169–178.
[15] F. Song, H. Li, P. Jiang, X.G. Zhou, J.P. Liu, C.H. Sun, A.P. Vogler, W.Z. Cai, Capturing the phylogeny of Holometabola with mitochondrial genome data and Bayesian site-heterogeneous mixture models, Genome Biology and Evolution 8 (2016) 1411–1426.
[16] H. Li, J.M. Leavengood Jr., E.G. Chapman, D. Burkhardt, F. Song, P. Jiang, J. Liu, X. Zhou, W. Cai, Mitochondrial phylogenomics of Hemiptera reveals adaptive innovations driving the diversification of true bugs, Proc. R. Soc. B 284 (2017) 20171223.
[17] R.E. Nie, T. Breeschoten, M.J.T.N. Timmermans, K. Nadein, H.J. Xue, M. Bai, Y. Huang, X.K. Yang, A.P. Vogler, The phylogeny of Galerucinae (Coleoptera: Chrysomelidae) and the performance of mitochondrial genomes in phylogenetic inference compared to nuclear rRNA genes, Cladistics 33 (2017) 1–18.
[18] N. Song, H. Zhang, T. Zhao, Insights into the phylogeny of Hemiptera from increased mitogenomic taxon sampling, Molecular Phylogenetics and Evolution 137 (2019) 236–249.
[19] L.W. Wu, L.H. Lin, D.C. Lee, Y.F. Hsu, Mitogenomic sequences effectively recover relationships within brush-footed butterflies (Lepidoptera: Nymphalidae), BMC Genomics 15 (2014) 1–17.
[20] A.V.L. Freitas, K. Brown, Phylogeny of the Nymphalidae (Lepidoptera), Syst. Biol. 53 (2004) 363–383.
[21] Q.H. Shi, X.Y. Sun, Y.L. Wang, J.S. Hao, Q. Yang, Morphological characters are compatible with mitogenomic data in resolving the phylogeny of nymphalid butterflies (Lepidoptera: Papilionoidea: Nymphalidae), PLoS One 10 (2015) e0124349.
[22] N. Liu, N. Li, P. Yang, C. Sun, J. Fang, S. Wang, The complete mitochondrial genome of Damora sagana and phylogenetic analyses of the family Nymphalidae, Genes Genom. 40 (2018) 109–122.
[23] Q. Shi, F. Zhao, J. Hao, Complete mitochondrial genome of the Common Evening Brown, Melanitis leda Linnaeus (Lepidoptera: Nymphalidae: Satyrinae), Mitochondrial DNA 24 (2013) 492–494.
[24] L.F.T. da Costa, The complete mitochondrial genome of Parage aegeria (Insecta: Lepidoptera: Papilionidae), Mitochondrial DNA Part A 27 (2016) 551–552.
[25] J. Li, C. Xu, Y. Lei, C. Fan, Y. Gao, C. Xu, R. Wang, Complete mitochondrial genome of a satyrid butterfly, Lethe albolineata (Lepidoptera: Nymphalidae), Mitochondrial DNA Part A 27 (2016) 4195–4196.
[26] C. Fan, C. Xu, J. Li, Y. Lei, Y. Gao, C. Xu, R. Wang, Complete mitochondrial genome of a satyrid butterfly, Ninguta schrenkii (Lepidoptera: Nymphalidae), Mitochondrial DNA Part A 27 (2016) 80–100.
[27] M. Tang, M. Tan, G. Meng, S. Yang, X. Su, S. Liu, W. Song, Y. Li, Q. Wu, A. Zhang, X. Zhou, Multiplex sequencing of pooled mitochondrial genomes-a crucial step toward biodiversity analysis using mito-metagenomics, Nucleic Acids Research 42 (2014) e166.
[28] W. Zhang, S. Gan, N. Zuo, C. Chen, Y. Wang, J. Hao, The complete mitochondrial genome of Triphysa phryne (Lepidoptera: Nymphalidae: Satyrinae), Mitochondrial DNA Part A 27 (2016) 474–475.
[29] D. Huang, J. Hao, W. Zhang, T. Su, Y. Wang, X. Xu, The complete mitochondrial genome of Melanargia asiatica (Lepidoptera: Nymphalidae: Satyrinae), Mitochondrial DNA Part A 27 (2016) 806–808.
[30] M.J. Kim, X. Wan, K.G. Kim, J.S. Hwang, I. Kim, Complete nucleotide sequence and organization of the mitogenome of endangered Eumenis autonoe (Lepidoptera: Nymphalidae), African Journal of Biotechnology 9 (2010) 735–754.
[31] I. Chou, Classification and identification of Chinese butterflies, Henan Scientific and Technological Publishing House, Zhengzhou, 1998.
[32] I. Chou, Monograph of Chinese butterflies, first volume (revised edition), Henan Scientific and Technological Publishing House, Zhengzhou, 1999.
[33] M. Schubert, S. Lindgreen, L. Orlando, AdapterRemoval v2: rapid adapter trimming, identification, and read merging, BMC Research Notes 9 (2016) 88.
[34] R. Luo, B. Liu, Y. Xie, Z. Li, W. Huang, J. Yuan, G. He, Y. Chen, Q. Pan, Y. Liu, J. Tang, G. Wu, H. Zhang, Y. Shi, Y. Liu, C. Yu, B. Wang, Y. Lu, C. Han, D.W. Cheung, S.M. Yiu, S. Peng, Z. Xiaoqian, G. Liu, X. Liao, Y. Li, H. Yang, J. Wang, T.W. Lam, J. Wang, SOAPdenovo2: an empirically improved memory-efficient short-read de novo assembler, Gigascience 1 (2012) 18.
[35] D. Coil, G. Jospin, A.E. Darling, A5-miseq: an updated pipeline to assemble microbial genomes from Illumina MiSeq data, Bioinformatics 31 (2015) 587–589.
[36] A. Bankevich, S. Nurk, D. Antipov, A.A. Gurevich, M. Dvorkin, A.S. Kulikov, V.M. Lesin, S.I. Nikolenko, S. Pham, A.D. Prjibelski, A.V. Pyshkin, A.V. Sirotkin, N. Vyahhi, G. Tesler, M.A. Alekseyev, P.A. Pevzner, SPAdes: a new genome assembly algorithm and its applications to single-cell sequencing, J. Comput. Biol. 19 (2012) 455–477.
[37] S. Kurtz, A. Phillippy, A.L. Delcher, M. Smoot, M. Shumway, C. Antonescu, S.L. Salzberg, Versatile and open software for comparing large genomes, Genome Biol. 5 (2004) R12.
[38] M. Bernt, A. Donath, F. Jühling, F. Externbrink, C. Florentz, G. Fritzsch, J. Pütz, M. Middendorf, P.F. Stadler, MITOS: Improved de novo metazoan mitochondrial genome annotation, Mol. Phylogenet. Evol. 69 (2013) 313–319.
[39] T.M. Lowe, S.R. Eddy, tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence, Nucleic Acids Res. 25 (1997) 955–964.
[40] K. Tamura, G. Stecher, D. Peterson, A. Filipski, S. Kumar, MEGA6: Molecular evolutionary genetics analysis version 6.0, Molecular Biology and Evolution 30 (2013) 2725–2729.
[41] G. Benson, Tandem repeats finder: a program to analyze DNA sequences, Nucleic Acids Res. 27 (1999) 573–580.
[42] N.T. Perna, T.D. Kocher, Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes, Journal of Molecular Evolution 41 (1995) 353–358.
[43] P. Librado, J. Rozas, DnaSP v5: a software for comprehensive analysis of DNA polymorphism data, Bioinformatics 25 (2009) 1451–1452.
[44] J.F. Peden, Analysis of codon usage, University of Nottingham 90 (2000) 73–74.
[45] M. Yang, Y. Zhang, Phylogenetic utility of ribosomal genes for reconstructing the phylogeny of five Chinese satyrine tribes (Lepidoptera: Nymphalidae), ZooKeys 488 (2015) 105–120.
[46] M. Yang, Y. Zhang, Molecular phylogeny of the butterfly tribe Satyrini (Nymphalidae: Satyrinae) with emphasis on the utility of ribosomal genes mitochondrial 16s rDNA and nuclear 28s rDNA, Zootaxa 3985 (2015) 125–141.
[47] F. Abascal, R. Zardoya, M.J. Telford, TranslatorX: Multiple alignment of nucleotide sequences guided by amino acid translations, Nucleic Acids Res. 38 (2010) W7–W13.
[48] K. Katoh, J. Rozewicki, K.D. Yamada, MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization, Briefings in Bioinformatics 2017 doi.org/10.1093/bib/bbx108.
[49] R. Lanfear, B. Calcott, S.Y. Ho, S. Guindon, Partitionfinder: Combined selection of partitioning schemes and substitution models for phylogenetic analyses, Mol. Biol. Evol. 29 (2012) 1695–1701.
[50] L.T. Nguyen, H.A. Schmidt, A. von Haeseler, B.Q. Minh, IQ-TREE: a fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies, Mol. Biol. Evol. 32 (2015) 268–274.
[51] F. Ronquist, J.P. Huelsenbeck, MrBayes3: Bayesian phylogenetic inference under mixed models, Bioinformatics 19 (2003) 1572–1574.
[52] A. Rambaut, M.A. Suchard, D. Xie, A.J. Drummond, Tracer v1.6, (2014) Available from http://tree.bio.ed.ac.uk/software/tracer/
[53] M.J. Kim, A.R. Kang, H.C. Jeong, K.G. Kim, I. Kim, Reconstructing intraordinal relationships in Lepidoptera using mitochondrial genome data with the description of two newly sequenced lycaenids, Spindasis takanonis and Protantigius superans (Lepidoptera: Lycaenidae), Mol. Phylogenet. Evol. 61 (2011) 436–445.
[54] S.J. Wei, M. Shi, X.X. Chen, M.J. Sharkey, C. van Achterberg, G.Y. Ye, J.H. He, New views on strand asymmetry in insect mitochondrial genomes, PLoS One 5 (2010) e12708.
[55] S.L. Cameron, M.F. Whiting, The complete mitochondrial genome of the tobacco hornworm, Manduca sexta, (Insecta: Lepidoptera: Sphingidae), and an examination of mitochondrial gene variability within butterflies and moths, Gene 408 (2008) 112–123.
[56] D. Ojala, J. Montoya, G. Attardi, tRNA punctuation model of RNA processing in human mitochondria, Nature 290 (1981) 470–474.
[57] Y.-P. Wu, J.-L. Zhao, T.-J. Su, A.-R. Luo, C.-D. Zhu, The complete mitochondrial genome of Choristoneura longicellana (Lepidoptera: Tortricidae) and phylogenetic analysis of Lepidoptera, Gene 591 (2016) 161–176.
[58] M. Yang, L. Song, Y. Shi, Y. Yin, Y. Wang, P. Zhang, J. Chen, L. Lou, X. Liu, The complete mitochondrial genome of a medicinal insect, Hydrillodes repugnalis (Lepidoptera: Noctuoidea: Erebidae), and related phylogenetic analysis, International Journal of Biological Macromolecules 123 (2019) 485–493.
[59] F. Wright, The ‘effective number of codons’ used in a gene, Gene 87 (1990) 23–29.
[60] J.R. Garey, D.R. Wolstenholme, Platyhelminth mitochondrial DNA: evidence for early evolutionary origin of a tRNAserAGN that contains a dihydrouridine arm replacement loop, and of serine-specifying AGA and AGG codons, J. Mol. Evol. 28 (1989) 374–387.
[61] D.V. Lavrov, W.M. Brown, J.L. Boore, A novel type of RNA editing occurs in the mitochondrial tRNAs of the centipede Lithobius forficatus, Proc. Natl. Acad. Sci. USA 97 (2000) 13738–13742.
[62] P. Salvato, M. Simonato, A. Battisti, E. Negrisolo, The complete mitochondrial genome of the bag-shelter moth Ochrogaster lunifer (Lepidoptera, Notodontidae), BMC Genomics 9 (2008) 331.
[63] S. Chen, F.-H. Li, X.-E. Lan, P. You, The complete mitochondrial genome of Pycnarmon lactiferalis (Lepidoptera: Crambidae), Mitochondrial DNA Part B 1 (2016) 638–639.
[64] Z.-T. Chen, Y.-Z. Du, The first two mitochondrial genomes from Taeniopterygidae (Insecta: Plecoptera): Structural features and phylogenetic implications, International Journal of Biological Macromolecules 111 (2017) 70–76.
[65] M. Yang, L. Song, Y. Shi, J. Li, Y. Zhang, N. Song, The first mitochondrial genome of the family Epicopeiidae and higher-level phylogeny of Macroheterocera (Lepidoptera: Ditrysia), International Journal of Biological Macromolecules 136 (2019) 123–132.
[66] S.S. Cao, W.W. Yu, M. Sun, Y.Z. Du, Characterization of the complete mitochondrial genome of Tryporyza incertulas, in comparison with seven other Pyraloidea moths, Gene 533 (2014) 356–365.
[67] M. Yang, S. Shi, P. Dai, L. Song, X. Liu, Complete mitochondrial genome of Palpita hypohomalia (Lepidoptera: Pyraloidea: Crambidae) and its phylogenetic implications, Eur. J. Entomol. 115 (2018) 708–717.
[68] J.W. Taanman, The mitochondrial genome: structure, transcription, translation and replication, Biochim Biophys Acta 1410 (1999) 103–123.
[69] M.J. Kim, J.S. Jeong, J.S. Kim, S.Y. Jeong, I. Kim, Complete mitochondrial genome of the lappet moth, Kunugia undans (Lepidoptera: Lasiocampidae): genomic comparisons among macroheteroceran superfamilies, Genet. Mol. Biol. 40 (2017) 717–723.
[70] D.-X. Zhang, G.M. Hewitt, Insect mitochondrial control region: a review of its structure, evolution and usefulness in evolutionary studies, Biochem. Syst. Ecol. 25 (1997) 99–120.
[71] M. Vila, M. Björklund, The utility of the neglected mitochondrial control region for evolutionary studies in Lepidoptera (Insecta), J. Mol. Evol. 58 (2004) 280–290.
Table 1 The Satyrinae species with available mitogenome on GenBank.
Taxa                          Mitogenome size (bp)  GenBank accession no.  Reference
Satyrinae
  Amathusiini
    Stichophthalma louisa     15,721                KP247523               Unpublished
    Stichophthalma howqua     14,020                KF990129               Shi et al. [21]
  Elymniini
    Elymnias hypermnestra     15,167                KF906484               Shi et al. [21]
  Melanitini
    Melanitis phedima         15,142                KF590538               Wu et al. [19]
    Melanitis leda            15,122                JF905446               Shi et al. [23]
  Satyrini
    Parargina
      Pararge aegeria         15,240                KJ547676               da Costa [24]
      Lasiommata deidamia     15,244                MG880214               Unpublished
    Lethina
      Lethe dura              15,259                KF906485               Shi et al. [21]
      Lethe albolineata       15,248                KF881051               Li et al. [25]
      Neope pulaha            15,209                KF590543               Wu et al. [19]
      Neope muirheadi         15,217                MN242789               This study
      Ninguta schrenckii      15,261                KF881052               Fan et al. [26]
    Mycalesina
      Mycalesis mineus        15,267                KM244676               Tang et al. [27]
      Mycalesis francisca     15,279                MN242790               This study
    Coenonymphina
      Triphysa phryne         15,143                KF906487               Zhang et al. [28]
    Melanargiina
      Melanargia asiatica     15,142                KF906486               Huang et al. [29]
    Satyrina
      Davidina armandi        15,214                KF881046               Unpublished
      Hipparchia autonoe      15,489                GQ868707               Kim et al. [30]
      Minois dryas            15,194                MN242787               This study
    Ypthimina
      Ypthima akragas         15,227                KF590553               Wu et al. [19]
      Ypthima motschulskyi    15,232                MN242788               This study
      Callerebia suroia       15,208                KF906483               Unpublished
Note: * the mitochondrial genome of the indicated species is incomplete
Table 2 The four newly determined Satyrinae mitogenomes.
Feature          Strand  Position (from)              Position (to)                Anticodon  Initiation codon  Stop codon       Intergenic nucleotides
trnM             J       1/1/1/1                      67/67/68/66                  CAT                                            0/-1/0/1
trnI             J       68/67/69/68                  132/132/136/134              GAT                                            -3/-3/-3/-3
trnQ             N       130/130/134/132              198/198/202/200              TTG                                            53/51/45/51
nad2             J       252/250/248/252              1,265/1,263/1,261/1,265                 ATT/ATT/ATT/ATT   TAA/TAA/TAA/TAA  -2/-2/-2/-2
trnW             J       1,264/1,262/1,260/1,264      1,330/1,330/1,326/1,330      TCA                                            -8/-8/-8/-8
trnC             N       1,323/1,323/1,319/1,323      1,386/1,387/1,381/1,386      GCA                                            0/-1/0/-1
trnY             N       1,387/1,387/1,382/1,386      1,451/1,450/1,444/1,452      GTA                                            5/3/2/5
cox1             J       1,457/1,454/1,447/1,458      2,992/2,989/2,982/2,993                 CGA/CGA/CGA/CGA   TAA/TAA/TAA/TAA  -5/-5/-5/-5
trnL2            J       2,988/2,985/2,978/2,989      3,054/3,051/3,044/3,055      TAA                                            0/0/1/0
cox2             J       3,055/3,052/3,046/3,056      3,733/3,726/3,724/3,734                 ATG/ATG/ATG/ATG   T/TAA/T/T        -3/1/-3/-3
trnK             J       3,731/3,728/3,722/3,732      3,801/3,798/3,792/3,802      CTT                                            1/6/1/0
trnD             J       3,803/3,805/3,794/3,803      3,868/3,873/3,859/3,869      GTC                                            0/0/0/0
atp8             J       3,869/3,874/3,860/3,870      4,030/4,038/4,018/4,031                 ATT/ATT/ATC/ATC   TAA/TAA/TAA/TAA  -7/-7/-7/-7
atp6             J       4,024/4,032/4,012/4,025      4,701/4,709/4,689/4,702                 ATG/ATG/ATG/ATG   TAA/TAA/TAA/TAA  0/-1/3/-1
cox3             J       4,702/4,709/4,693/4,702      5,490/5,497/5,481/5,490                 ATG/ATG/ATG/ATG   TAA/TAA/TAA/TAA  2/9/2/2
trnG             J       5,493/5,507/5,484/5,493      5,559/5,572/5,549/5,558      TCC                                            0/0/0/0
nad3             J       5,560/5,573/5,550/5,559      5,913/5,926/5,903/5,912                 ATT/ATT/ATT/ATT   TAA/TAA/TAA/TAA  1/2/1/-1
trnA             J       5,915/5,929/5,905/5,912      5,979/5,993/5,970/5,979      TGC                                            1/-1/1/4
trnR             J       5,981/5,993/5,972/5,984      6,042/6,054/6,034/6,045      TCG                                            4/1/2/6
trnN             J       6,047/6,056/6,037/6,052      6,113/6,121/6,103/6,117      GTT                                            -3/-3/-3/-3
trnS1            J       6,111/6,119/6,101/6,115      6,170/6,178/6,160/6,174      GCT                                            1/1/8/1
trnE             J       6,172/6,180/6,169/6,176      6,236/6,246/6,233/6,247      TTC                                            -2/3/-2/48
trnF             N       6,235/6,250/6,232/6,296      6,300/6,315/6,299/6,361      GAA                                            -17/-29/-17/-17
nad5             N       6,284/6,287/6,283/6,345      8,035/8,053/8,037/8,099                 ATT/ATG/ATT/ATT   TAG/TAA/TAA/TAA  0/-3/-3/-3
trnH             N       8,036/8,051/8,035/8,097      8,101/8,117/8,098/8,163      GTG                                            3/3/3/3
nad4             N       8,105/8,121/8,102/8,167      9,440/9,456/9,437/9,502                 ATG/ATG/ATG/ATG   T/T/T/T          -1/1/-1/-1
nad4l            N       9,440/9,458/9,437/9,502      9,727/9,745/9,727/9,789                 ATG/ATG/ATG/ATG   TAA/TAA/TAA/TAA  2/2/2/3
trnT             J       9,730/9,748/9,730/9,793      9,793/9,811/9,794/9,856      TGT                                            0/0/0/0
trnP             N       9,794/9,812/9,795/9,857      9,858/9,876/9,859/9,922      TGG                                            2/2/2/-2
nad6             J       9,861/9,879/9,862/9,925      10,388/10,403/10,395/10,449             ATC/ATT/ATC/ATA   TAA/TAA/TAA/TAA  -1/-1/0/-1
cob              J       10,388/10,403/10,395/10,449  11,539/11,554/11,546/11,603             ATG/ATG/ATG/ATG   TAA/TAA/TAA/TAA  8/-2/-2/-2
trnS2            J       11,548/11,553/11,545/11,602  11,612/11,618/11,612/11,667  TGA                                            1/34/-1/-2
nad1             N       11,614/11,653/11,612/11,666  12,570/12,588/12,565/12,622             ATA/ATG/ATA/ATG   TAA/TAA/TAA/TAA  1/1/3/0
trnL1            N       12,572/12,590/12,569/12,623  12,638/12,659/12,636/12,689  TAG                                            0/0/0/0
rrnL             N       12,639/12,635/12,637/12,690  13,976/13,995/13,969/14,030                                                 0/0/0/0
trnV             N       13,977/13,994/13,970/14,031  14,040/14,058/14,032/14,094  TAC                                            0/0/0/0
rrnS             N       14,041/14,058/14,033/14,095  14,819/14,871/14,804/14,869                                                 0/0/0/0
A+T-rich region          14,820/14,872/14,805/14,870  15,194/15,232/15,217/15,279
Note: “J” indicates the majority strand and “N” indicates the minority strand; The characters divided by the “/” correspond to the M. dryas/Y. motschulskyi/N. muirheadi/M. francisca.
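For most adjacent features in Table 2, the "Intergenic nucleotides" value equals the start of the downstream feature minus the end of the upstream feature minus one, with negative values indicating overlap. The following Python sketch illustrates that arithmetic for the first four features of the M. dryas mitogenome; the function name and data layout are hypothetical and are not taken from the annotation software used in the study.

```python
def intergenic_lengths(features):
    """Return (upstream, downstream, gap) for consecutive features.

    features: list of (name, start, end) tuples, 1-based inclusive,
    listed in genome order. A negative gap means the features overlap.
    """
    gaps = []
    for (name_a, _, end_a), (name_b, start_b, _) in zip(features, features[1:]):
        gaps.append((name_a, name_b, start_b - end_a - 1))
    return gaps


# Coordinates of the first four M. dryas features as listed in Table 2.
m_dryas = [("trnM", 1, 67), ("trnI", 68, 132), ("trnQ", 130, 198), ("nad2", 252, 1265)]

for upstream, downstream, gap in intergenic_lengths(m_dryas):
    if gap < 0:
        kind = "overlap"
    elif gap > 0:
        kind = "spacer"
    else:
        kind = "adjacent"
    print(f"{upstream}-{downstream}: {gap} ({kind})")
```

The three gaps computed here (trnM-trnI, trnI-trnQ, trnQ-nad2) can be compared directly with the first value of each slash-separated entry in the table.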
Table 3 Nucleotide composition of the four newly determined Satyrinae mitogenomes.
Feature          Size (bp)                    AT content (%)       AT-skew                       GC-skew
Genome           15,194/15,232/15,217/15,279  80.2/81.8/80/79.9    -0.026/-0.051/-0.055/-0.036   -0.222/-0.209/-0.236/-0.214
PCGs             11,225/11,215/11,230/11,228  78.6/80.5/78.5/78.1  -0.16/-0.15/-0.157/-0.16      -0.009/0.026/-0.009/0.014
tRNAs            1,444/1,454/1,450/1,459      80.9/81.8/80.7/81.3  0.028/0.017/0.016/0.031       0.173/0.127/0.161/0.176
rRNAs            2,117/2,175/2,105/2,116      85.2/85.9/84.7/85.3  0.086/0.099/0.096/0.086       0.347/0.348/0.359/0.329
A+T-rich region  315/361/413/410              94.6/93.9/93/92.7    0.021/-0.086/-0.088/-0.115    -0.296/-0.377/-0.029/0
Note: The characters divided by the “/” correspond to the M. dryas/Y. motschulskyi/N. muirheadi/M. francisca.
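Skew statistics of the kind reported in Table 3 are conventionally computed as AT-skew = (A − T)/(A + T) and GC-skew = (G − C)/(G + C), following Perna and Kocher [42]. The Python sketch below shows that calculation on a plain nucleotide string; the function name and the toy sequence are illustrative assumptions, not the software or data used in the study.

```python
from collections import Counter


def composition_stats(seq):
    """Return (AT content in %, AT-skew, GC-skew) for a nucleotide string."""
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    total = a + t + g + c  # ambiguous symbols such as N are ignored
    if total == 0:
        raise ValueError("sequence contains no A, T, G or C")
    at_content = 100.0 * (a + t) / total
    # Guard against division by zero when a base pair class is absent.
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return at_content, at_skew, gc_skew


# Toy sequence for illustration only (not from any of the four mitogenomes).
toy = "ATTATAATTTAGCCATTAAT"
at_pct, at_skew, gc_skew = composition_stats(toy)
print(f"AT% = {at_pct:.1f}, AT-skew = {at_skew:.3f}, GC-skew = {gc_skew:.3f}")
```

A negative AT-skew means T outnumbers A on the analysed strand, and a negative GC-skew means C outnumbers G, which is how the signs in Table 3 should be read.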
Figure legends:
Fig. 1. Nucleotide diversity (A) and the ratio of Ka/Ks (B) of PCGs from 22 reported Satyrinae mitogenomes.
Fig. 2. Relative synonymous codon usage (RSCU) in PCGs of the four newly determined mitogenomes. Codon families are indicated below the X-axis.
Fig. 3. Putative secondary structures of tRNAs from the Y. motschulskyi mitogenome. The tRNAs are labeled with the abbreviations of their corresponding amino acids. Dashes indicate Watson-Crick base pairs; dots indicate wobble GU pairs; other non-canonical pairs are not marked. Nucleotides that vary among the four newly sequenced mitogenomes are marked.
Fig. 4. Gene overlaps and intergenic regions among the four sequenced Satyrinae mitogenomes. Nucleotides colored red indicate the sequence of overlapping or intergenic regions unless otherwise explained. A. The overlapping region between trnW and trnC. B. The overlapping region between atp8 and atp6. C. The overlapping region between cox1 and trnL2. D. The overlapping region between trnF and nad5. E. The intergenic region between trnQ and nad2. F. The intergenic sequence between trnS2 and nad1; the motif “ATACTAA” routinely found in insect mitogenomes is colored red. G. Schematic illustration of the A + T-rich region from the four newly sequenced mitogenomes. The conserved motifs ATAGA and ATTTA are colored green. Dots indicate omitted sequences, and the number of dots is not proportional to the number of nucleotides in the corresponding part.
Fig. 5. ML trees inferred from IQ-TREE analyses. The species with newly sequenced mitogenomes are emphasized in bold. Numbers separated by a slash (/) at each node represent the bootstrap supports based on the PCG-rRNA-tRNA and PCG-rRNA datasets, respectively. A “-” indicates a node not recovered from the PCG-rRNA dataset.
Fig. 6. BI trees inferred from MrBayes analyses. The species with newly sequenced mitogenomes are emphasized in bold. Numbers separated by a slash (/) at each node represent the posterior probabilities based on the PCG-rRNA-tRNA and PCG-rRNA datasets, respectively. A “-” indicates a node not recovered from the PCG-rRNA dataset.