GENE-39163; No. of pages: 9; 4C: Gene xxx (2013) xxx–xxx
Contents lists available at ScienceDirect
Gene journal homepage: www.elsevier.com/locate/gene
Integrated evolutionary analysis of human miRNA gene clusters and families implicated evolutionary relationships
Li Guo, Yang Zhao, Hui Zhang, Sheng Yang, Feng Chen ⁎
Department of Epidemiology and Biostatistics, Ministry of Education Key Lab for Modern Toxicology, School of Public Health, Nanjing Medical University, 211166, China
Article info
Article history: Accepted 15 October 2013; Available online xxxx
Abstract
Many microRNAs (miRNAs) are clustered on chromosomes and co-transcribed as polycistronic transcripts. Here, an integrated evolutionary analysis of human miRNA gene clusters and families was performed. Most miRNA gene clusters include 2–8 members, but some larger clusters contain over 40 miRNAs. Of the clusters, 62.22% contain homologous miRNA genes, including multicopy pre-miRNAs and sense/antisense homologous miRNAs. Multicopy pre-miRNAs enrich the distribution of, and the relationships between, miRNA clusters and families. An miRNA family may be located in one or more clusters, and a cluster may contain members of one or more families. Members of different families are prone to appear in clusters, and vice versa. Reconstructed phylogenetic trees and networks indicate potential evolutionary relationships and the duplication history of specific related gene clusters and families. Related miRNA families are always found to share common target mRNAs and biological pathways. Clustered non-homologous miRNAs also tend to group together in phylogenetic networks, as homologous miRNAs do. The present work shows that homologous miRNAs are prone to appear in clusters as a result of functional and evolutionary pressures. miRNA clusters whose members are homologous or genetically related are quite common. This integrative evolutionary analysis points to further potential evolutionary and functional relationships between homologous and clustered miRNAs. © 2013 Published by Elsevier B.V.
Keywords: Integrated analysis; microRNA (miRNA); miRNA gene cluster/family; Evolution
1. Introduction
1. Introduction
MicroRNAs (miRNAs) are a class of small non-coding RNA (ncRNA) molecules that act as negative regulators. They play important biological roles by negatively regulating gene expression and translation (Bartel, 2004). These short ncRNAs tend to be extremely well conserved and can be found in genomic sequence datasets (Berezikov et al., 2005). They have played an evolutionary role in shaping animal development (Grimson et al., 2008; Liu et al., 2008; Niwa and Slack, 2007; Sempere et al., 2006). miRNAs are widely studied, especially because of their contributions to cancer development, and they may be novel biomarkers for the diagnosis of cancer and other diseases (Cho, 2010; Wang et al., 2009). miRNAs are not randomly distributed on chromosomes, and they are prone to cluster in a single polycistronic transcript (Lagos-Quintana et al., 2003; Lai et al., 2003; Lee et al., 2002; Mourelatos et al., 2002). Clustered miRNAs may be co-expressed and may play similar roles in the same biological processes, often through coordinate regulation of those processes (Bashirullah et al., 2003; Baskerville and Bartel, 2005; Seitz et al., 2004). However, they always have different levels of enrichment because of complex maturation and degradation processes (Guo and Lu, 2010; Viswanathan et al., 2009; Yu et al., 2006), even when they are transcribed at equal rates via co-transcription. Some clustered miRNAs share very similar sequences, and they are identified as members of the same miRNA gene family (Aravin et al., 2003). This phenomenon further complicates the distribution of miRNA gene clusters and families. Clustered and homologous miRNAs may have evolved from the same ancestral gene through complex duplication processes, perhaps even genome-wide duplication (Guo et al., 2009; Heimberg et al., 2008; Hertel et al., 2006; Sun et al., 2013; Zhang et al., 2007). The potential functional relationships among these miRNAs have attracted considerable attention. An increasing number of reports indicate that many miRNA gene clusters and families have important roles in cancer development. For example, the miR-17-92 cluster has been shown to contain potential oncomiRs and to contribute to the occurrence and development of multiple human cancers (Cho, 2007; Concepcion et al., 2012; Olive et al., 2010). Although the expression and evolutionary patterns of these miRNAs are widely studied, the evolutionary relationships between miRNA gene clusters and families remain largely unexamined. An miRNA gene cluster is defined as two or more miRNAs in close physical proximity (for example, less than 10 kb apart), and an miRNA gene family is defined as two or more miRNAs with high sequence similarity. Given their location distribution and sequence similarity, these miRNA groups may also have evolutionary and functional relationships. It is therefore necessary to assess the distribution and evolutionary patterns of these different miRNAs.
Abbreviations: miRNA, microRNA; ncRNA, non-coding RNA; chr, chromosome; pre-miRNA, precursor miRNA; NJ, neighbor-joining; MJ, median-joining; GO, gene ontology. ⁎ Corresponding author. E-mail addresses: [email protected], [email protected] (L. Guo), [email protected] (F. Chen).
0378-1119/$ – see front matter © 2013 Published by Elsevier B.V. http://dx.doi.org/10.1016/j.gene.2013.10.037
Please cite this article as: Guo, L., et al., Integrated evolutionary analysis of human miRNA gene clusters and families implicated evolutionary relationships, Gene (2013), http://dx.doi.org/10.1016/j.gene.2013.10.037
L. Guo et al. / Gene xxx (2013) xxx–xxx
In order to identify the functional and evolutionary relationships among these miRNAs, an integrated analysis of human miRNA gene clusters and families was performed. These results provide data regarding the potential genetic, evolutionary, and functional relationships between miRNA gene clusters and families.
2. Results
2.1. Overview of human gene clusters and families
One hundred and thirty-five human miRNA gene clusters, comprising 422 pre-miRNA sequences, were identified (http://www.mirbase.org/cgi-bin/mirna_summary.pl?org=hsa&cluster=10000). These miRNAs were more prone to cluster on chromosomes (chr) 8, 17 and X than most miRNAs (Figs. 1A and B). Although the numbers of clusters and of known miRNAs per chromosome were moderately related, their distributions differed. These miRNA clusters typically had 2–8 members, and 68.49% of them were composed of two miRNA genes (Fig. 1B). However, two exceptionally large clusters (the mir-379 cluster and the mir-512-1 cluster), located on chromosomes 14 and 19, were found to contain 42 and 46 members, respectively. We found that 62.22% of the clusters contained homologous miRNAs (Fig. 1C). Of these, 19.05% were composed of multicopy miRNA genes. The multicopy pre-miRNAs could yield the same mature miRNAs, although they might have different sequences and might be located in different genomic regions. Generally, these multicopy clusters
Fig. 1. Human miRNA gene clusters and related gene families. (A) Distributions of all known miRNAs (miRNAs), miRNA gene clusters (miRNA clusters), and members of those families (miRNA members) on human chromosomes. Different distributions are detected. (B) The details of the distributions of the clusters and the number of miRNAs on each chromosome. The larger miRNA gene clusters (containing at least 40 miRNA genes) are indicated with red bars. (C) Pie chart of clusters and related gene families. Out of all the gene clusters, 62.22% are found to contain homologous miRNAs (gene family), and members of 37.78% of the clusters are not in any gene families (not family). In the clusters with homologous miRNAs, 19.05% are found to contain multicopy pre-miRNAs (multicopy pre-miRNAs), but others are not found to be in any gene family (other gene family). (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
2.2. Evolutionary analysis of related miRNA gene clusters based on gene family
Because many clusters contained homologous miRNAs, further evolutionary analysis was performed based on gene family. For example, the large let-7 family was found in six gene clusters together with other miRNA families (Fig. 3). Some let-7 members were also prone to cluster on chromosomes at various physical distances (Fig. 3A). Phylogenetic networks of related families built with the neighbor-net method indicated that the genes could be split into three groups corresponding to family units (Fig. 3B). Homologous miRNAs clustered together, and the let-7 family showed large genetic distances from other families. Common target mRNAs were detected between the let-7 group and other groups, especially between the let-7 and mir-125 families. Evolutionary networks of let-7-5p and let-7-3p were reconstructed using the median-joining method. Let-7-3p sequences carried more nucleotide substitutions than let-7-5p sequences, including substitutions between multicopy pre-miRNAs. The more pronounced nucleotide divergence led to a more complex evolutionary pattern with more median vectors (Fig. 3C).
2.3. Evolutionary analysis of related miRNA gene families based on gene cluster
We also analyzed related miRNA gene families based on specific clusters. For example, the mir-17, mir-106a and mir-106b clusters showed close relationships because they contain homologous miRNAs between and within clusters (Fig. 4). Homologous miRNAs clustered together, and the mir-19 family formed the outer clade (Fig. 4A). The three miRNA gene families shared over 1/3 target
tended to be located on different strands of specific genomic regions and tended to be composed of sense and antisense miRNA genes. For example, mir-3116-1 and mir-3116-2 were clustered on chr1 as sense and antisense miRNAs (Fig. 2). A total of 54 clusters were identified as sense and antisense miRNAs, and 47 of these contained homologous miRNAs (Fig. 2A). Sense and antisense clusters were more likely to be located on chromosomes 2, 8, and 17 than miRNA clusters in general. They and their mature products were reverse complements and could form duplexes (Figs. 2B and C). Apart from these multicopy pre-miRNAs, some members of clusters were identified as homologous miRNAs and showed considerable sequence similarity, even sharing the same or similar “seed sequences” (nucleotides 2–8, Fig. S1C). They were always members of specific miRNA gene families. For example, the mir-34b cluster included mir-34b and mir-34c on chr11, and the two miRNAs were found to be homologous members of the mir-34 family.
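The two sequence relationships described here, a shared seed (nucleotides 2–8) and reverse complementarity between sense and antisense products, can be illustrated with a minimal Python sketch. The sequences and function names below are invented for illustration and are not taken from miRBase or from the analysis in this paper; only canonical Watson-Crick pairs are considered, so G:U wobble pairing is ignored.

```python
# Illustrative sketch only: sequences are invented, not real miRNAs.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mature: str) -> str:
    """Return nucleotides 2-8 (1-based) of a mature miRNA sequence."""
    return mature[1:8]


def share_seed(mirna_a: str, mirna_b: str) -> bool:
    """True if two mature miRNAs have identical seed sequences."""
    return seed(mirna_a) == seed(mirna_b)


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA string (Watson-Crick pairs only)."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def forms_perfect_duplex(sense: str, antisense: str) -> bool:
    """True if the antisense product is the exact reverse complement."""
    return antisense == reverse_complement(sense)


# Hypothetical homologous pair: different 5' base and body, same seed.
mir_a = "UAACGGCUAGCUAGGAUCCAUG"
mir_b = "CAACGGCUUUCGAGGAACCAUG"

print(seed(mir_a), seed(mir_b), share_seed(mir_a, mir_b))
print(forms_perfect_duplex(mir_a, reverse_complement(mir_a)))
```

Partial duplexes, such as those shown in Fig. 2C, would need an alignment step rather than this exact-match check.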
Fig. 2. Examples of sense and antisense miRNAs in human miRNA gene clusters. (A) The left figure shows cross-distributions of sense and antisense miRNAs in human gene clusters. The term “human miRNA cluster” here refers to all the human miRNA gene clusters. The term “gene family” refers to clusters that contain homologous miRNAs. “Multicopy” indicates that the members of a cluster are multicopy pre-miRNAs. “Sense” and “antisense” indicate that the members of a cluster are sense and antisense miRNAs. A total of 54 clusters are identified here as sense and antisense miRNAs. The figure on the right shows the distributions of sense and antisense clusters and other clusters. Sense and antisense miRNA clusters are found to be more likely to be located on chromosomes 2, 8, or 17. (B) Examples of the stem–loop structures of sense and antisense miRNA clusters (mir-103a-1 and mir-103b-1 on chr5). miR-103a and miR-103b are highlighted in red. (C) Examples of sense and antisense miRNAs that can be inactivated through complementary binding. mir-103a-1, mir-103b-1 and their mature miRNAs are found to form complete and partial duplexes through reverse complementary binding events. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
Fig. 3. Examples of let-7 gene family and related clusters. (A) Phylogenetic tree of the let-7 gene family based on the neighbor-joining (NJ) method, and related gene clusters. The six related gene clusters are labeled, and two of them contain homologous members of let-7. Another five miRNAs (members of the mir-99, mir-125a, and mir-4763 families) are found in the related clusters. (B) Based on all the miRNA genes involved, a phylogenetic network is reconstructed using the neighbor-net method. These genes are split into three clusters based on family units. Common target mRNAs can be found across the three gene families, especially between the let-7 and mir-125 families. (C) The evolutionary networks of the products (let-7-5p and let-7-3p) of the let-7 family are reconstructed using the median-joining (MJ) method. The trees of the two arms show different evolutionary patterns. Let-7-3p shows more pronounced nucleotide divergence and has a complex evolutionary history. The red circles show hypothesized sequences within the network. (For interpretation of the references to color in this figure, the reader is referred to the web version of this article.)
Fig. 4. Example of three related clusters and families. (A) Phylogenetic relationships of three gene families and distributions of the related clusters. These clusters contain members of different families. The NJ tree shows that the three phylogenetic groups are clustered around the three gene families. The mir-19 family is the outer cluster and may contain the older miRNA genes. (B) Distributions of shared and private target mRNAs. Over 1/3 of targets are shared by the three families. (C) Evolutionary networks of miR-5p and miR-3p in the three families. The two products show different nucleotide divergence and evolutionary patterns.
mRNAs (Fig. 4B), although they showed considerable genetic distances. Evolutionary networks of miR-#-5p and miR-#-3p suggested that the two products followed different evolutionary patterns because of differing nucleotide divergence (Fig. 4C). The evolutionary pattern of miR-#-3p (annotated as mature miRNA) was similar to that of the pre-miRNAs, while miR-#-5p (annotated as miRNA* or non-dominant miRNA) showed an inconsistent pattern (Fig. 4).
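The shared and private target counts summarized in Fig. 4B amount to set operations on per-family target lists. The sketch below shows one way to compute them in Python. The gene identifiers are placeholders, not TargetScan predictions, and the function name is an assumption introduced for illustration.

```python
from typing import Dict, Set, Tuple


def shared_and_private(
    targets: Dict[str, Set[str]]
) -> Tuple[Set[str], Dict[str, Set[str]]]:
    """Split predicted targets into those shared by every family and
    those private to a single family."""
    shared = set.intersection(*targets.values())
    private: Dict[str, Set[str]] = {}
    for family, genes in targets.items():
        # Union of the targets of all other families.
        others = set().union(
            *(g for name, g in targets.items() if name != family)
        )
        private[family] = genes - others
    return shared, private


# Placeholder target sets (hypothetical gene labels G1..G6).
example_targets = {
    "mir-17": {"G1", "G2", "G3", "G4"},
    "mir-19": {"G2", "G3", "G5"},
    "mir-25": {"G2", "G3", "G6"},
}

shared, private = shared_and_private(example_targets)
all_targets = set().union(*example_targets.values())
print(sorted(shared), len(shared) / len(all_targets))
print({family: sorted(genes) for family, genes in private.items()})
```

Whether the "over 1/3" figure is measured against the union of all targets or against each family's own target list is not stated in the text; the sketch uses the union.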
2.4. Evolutionary analysis of different miRNA gene clusters
Clusters containing independent and/or homologous members were reconstructed to allow assessment of the phylogenetic relationships among non-homologous clusters. Interestingly, we found that many miRNAs with close physical distances also tended to be clustered together, although they did not share enough sequence similarity to
Fig. 5. Examples of miRNA clusters prone to be clustered together in phylogenetic networks. (A) Ten miRNA clusters are selected for the analysis. Three clusters are characterized as homologous miRNAs, two as sense and antisense, and the others are identified as independent miRNA species. (B) Phylogenetic relationships indicate that members of clusters are prone to be clustered together although they may not be homologous species, as in the mir-182-183, mir-193b-365a and mir-298-296 gene clusters.
3. Discussion
According to location distributions and physical distances, clustered miRNAs are quite common. Some clustered miRNAs share sequence similarity, and these homologous miRNAs are considered miRNA gene families. These clustered and homologous small ncRNAs have potential functional relationships involving coordinate regulation of biological processes or contribution to the same biological processes. The results of the integrated evolutionary analysis of human miRNA gene clusters and families suggest that clustered miRNAs and homologous miRNAs have moderately close relationships, although they are characterized by close physical distribution and pronounced sequence similarity, respectively. The distributions of clustered miRNAs, numbers of miRNA genes and homologous miRNAs involved show specific patterns on the chromosomes that differ from those of known miRNAs (Figs. 1 and 2). The main reason for this is the large difference in the number of miRNAs in different clusters. Although miRNA gene clusters with two members are quite numerous, larger clusters may have more than 40 members (Fig. 1B). The distribution analyses indicate that miRNA clusters are not randomly distributed. For example, chromosome X is found to contain 12.59% of miRNA clusters, although miRNAs
Table 1 Top 10 enriched GO terms of the six miRNA gene families that are prone to cluster together.
Enriched GO term
Prone to be clustered together mir-17
MAPK signaling pathway; Axon guidance; Colorectal cancer; Focal adhesion; Glioma; Regulation of actin cytoskeleton; Wnt signaling pathway; Melanogenesis; Pancreatic cancer; Calcium signaling pathway; Chronic myeloid leukemia; Insulin signaling pathway; mTOR signaling pathway; TGF-beta signaling pathway; Basal cell carcinoma; ECM-receptor interaction; ErbB signaling pathway; Heparan sulfate biosynthesis; Long-term potentiation; Melanoma; p53 signaling pathway; Phosphatidylinositol signaling system; Prostate cancer; Small cell lung cancer; Ubiquitin mediated proteolysis
2.5. Gene ontology enrichment analysis
Finally, we queried enriched GO terms for the six miRNA gene families above through bioinformatic analysis. These families could contribute to the same essential biological processes, such as MAPK signaling, axon guidance, and colorectal cancer (Table 1). miRNA families that were clustered together, such as the mir-17, mir-25 and mir-19 families or the let-7, mir-125 and mir-99 families, always shared more terms with each other.
located on chromosome X are not the most abundant (Fig. 1A). The enrichment distribution implies that these miRNAs may have potential functional relationships and different biological roles. Of these clusters, 62.22% are found to contain homologous miRNAs, some of which can be characterized as clustered homologous miRNAs, including sense/antisense and multicopy pre-miRNA clusters (Figs. 1 and 2). The existence of these multicopy pre-miRNAs further complicates the distributions and relationships of clusters and families (Figs. S1A and B). The multicopy pre-miRNAs provide potential production bases for adaptation to changing levels of miRNA expression. However, these pre-miRNAs may show nucleotide divergence in other regions (Figs. 3, 4, and S1C), and may be located in different genomic regions and clusters (Fig. 3B). miRNAs identified as sense/antisense pairs may interact with each other and restrict final expression via reverse complementary binding events (Guo et al., 2011a, 2012; Hongay et al., 2006; Lai et al., 2004; Shearwin et al., 2005; Stark et al., 2008). Neither of these distributions is random. These distributions may have been produced through a complex set of duplication processes influenced by functional and evolutionary pressures. Complex distribution patterns are observed in both miRNA gene clusters and families, even though the two types of classification refer to physical position and sequence similarity, respectively (Figs. 3 and 4). An miRNA family may be located in one or more clusters, and a cluster may contain members of one or more families. Members of different families tend to be located in related clusters, and vice versa. These related miRNA gene clusters and families always have close functional relationships (Table 1, Figs. 3 and 4).
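The many-to-many relationship described above (one family spread over several clusters, one cluster holding several families) can be captured with two simple lookup tables. The following Python sketch uses placeholder records; the miRNA, cluster and family labels are hypothetical and do not correspond to miRBase annotations.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# Each record is (miRNA gene, cluster label, family label).
Record = Tuple[str, str, str]


def build_maps(
    records: Iterable[Record],
) -> Tuple[Dict[str, Set[str]], Dict[str, Set[str]]]:
    """Build family -> clusters and cluster -> families lookups."""
    family_to_clusters: Dict[str, Set[str]] = defaultdict(set)
    cluster_to_families: Dict[str, Set[str]] = defaultdict(set)
    for _mirna, cluster, family in records:
        family_to_clusters[family].add(cluster)
        cluster_to_families[cluster].add(family)
    return family_to_clusters, cluster_to_families


# Hypothetical annotations for illustration only.
records = [
    ("mir-x1", "cluster-1", "fam-X"),
    ("mir-y1", "cluster-1", "fam-Y"),
    ("mir-x2", "cluster-2", "fam-X"),
]

family_to_clusters, cluster_to_families = build_maps(records)
print(sorted(family_to_clusters["fam-X"]))
print(sorted(cluster_to_families["cluster-1"]))
```

Clusters whose family set has more than one element, and families whose cluster set has more than one element, are the cases the integrated analysis focuses on.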
Phylogenetic relationships indicate that complex duplication processes or gene expansion events may have taken place within and between these miRNA families (Figs. 3 and 4). Even non-homologous miRNAs were prone to cluster together (Fig. 5). This result suggests that clustered miRNAs may have evolved from common ancestral miRNA genes. They may have been subject to more rapid evolutionary processes than clustered homologous miRNAs. These findings indicate that miRNA gene clustering is not a random event but is rather attributable to functional and evolutionary pressures. The relationships among physical
be considered homologous (Fig. 5). According to their physical distances (<10 kb), they were characterized as miRNA gene clusters. For example, the mir-298-296 and mir-193b-365a clusters were clustered together in the network, but their members were identified in different miRNA families (Fig. 5).
44/2.93E-27 30/7.60E-24 17/4.13E-13 26/3.57E-14 15/1.29E-12 29/3.31E-16 23/2.30E-14
Prone to be clustered together
mir-25
mir-19
let-7
mir-125
mir-99
22/3.66E-11 12/2.22E-07
35/8.08E-19 23/5.23E-16 17/3.38E-13 30/4.85E-18 14/1.68E-11
41/6.81E-27 17/5.12E-11 16/4.97E-13 24/6.78E-14 13/3.99E-11
43/1.08E-28 15/5.21E-09 13/1.43E-09 21/4.55E-11
3/1.32E-3
15/1.54E-07 19/1.85E-10 11/1.77E-07
24/1.78E-15 20/7.89E-15
16/5.20E-09
22/8.47E-12
16/5.43E-13
19/5.73E-09 17/8.16E-10 14/1.29E-11
19/1.98E-16
3/4.07E-05 2/1.21E-3 3/6.76E-04 3/2.36E-04 3/7.69E-05
12/3.00E-09
15/1.26E-12 13/5.49E-08 8/4.68E-07
20/2.00E-13 13/1.16E-11
18/5.62E-14
14/1.95E-10 2/8.69E-04 14/9.41E-11 13/2.23E-09 2/2.24E-04 3/2.90E-05 14/8.68E-12 14/5.74E-12 15/1.22E-11 2/2.20E-3 10/2.62E-07 12/4.60E-07
These miRNA families were prone to cluster together within the genome (Figs. 3 and 4). The number of target genes and P values are given.
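Each entry in Table 1 pairs a count of target genes with a P value. The source does not describe the statistic that the MAS tool applies, so the Python sketch below is only a generic illustration of how such a pair could arise: a one-sided hypergeometric over-representation test, which is a common choice for term enrichment. The function name and the numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(
    total_genes: int, term_genes: int, target_genes: int, overlap: int
) -> float:
    """One-sided hypergeometric P value, P(X >= overlap).

    total_genes  : size of the annotated background
    term_genes   : background genes annotated with the term
    target_genes : predicted targets of the miRNA family
    overlap      : targets that carry the term
    """
    upper = min(term_genes, target_genes)
    favourable = sum(
        comb(term_genes, k) * comb(total_genes - term_genes, target_genes - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total_genes, target_genes)


# Hypothetical counts, chosen only to exercise the function.
p_value = hypergeom_enrichment_p(
    total_genes=20_000, term_genes=250, target_genes=1_000, overlap=44
)
print(f"44 targets annotated with the term, P = {p_value:.3g}")
```

Python integers are arbitrary precision, so the binomial coefficients are exact; only the final division is done in floating point. A real analysis would also correct for testing many terms at once.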
4. Materials and methods
The miRNAs, pre-miRNAs (precursor miRNAs), and clustered and homologous miRNAs were obtained from the miRBase database (Release 19.0, http://www.mirbase.org/) (Kozomara and Griffiths-Jones, 2011). Gene clusters were defined here as miRNAs sharing relationships based on location distributions, and gene families were defined as miRNAs sharing relationships based on sequence similarity. For miRNA gene clusters, the default inter-miRNA distance was set at 10 kb. The members of each cluster were identified based on the locations of the miRNA genes (pre-miRNAs). Multiple sequence alignments of related miRNAs and pre-miRNAs were built using Clustal X 2.0 (Larkin et al., 2007). To further assess the phylogenetic relationships among pre-miRNAs and miRNAs in related gene families and clusters, phylogenetic trees and networks were constructed using MEGA 5.10 based on the neighbor-joining (NJ) method, SplitsTree 4.10 based on the neighbor-net method, and Network 4.6.1.0 (http://www.fluxus-engineering.com/) based on the median-joining (MJ) method, respectively (Tamura et al., 2011). The functional relationships among these miRNAs were further investigated by collecting the predicted target mRNAs of miRNA gene families using TargetScan (Lewis et al., 2003). A gene ontology (GO) enrichment analysis was performed using the CapitalBio Molecule Annotation System V4.0 (MAS, http://bioinfo.capitalbio.com/mas3/).
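The distance-based cluster definition (inter-miRNA distance below 10 kb) can be sketched in Python as follows. This is a minimal illustration rather than the miRBase procedure: it assumes the distance is measured from the end of one pre-miRNA to the start of the next on the same chromosome, it ignores strand so that sense and antisense genes fall into the same cluster, and the `Locus` type, gene names and coordinates are all hypothetical.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass(frozen=True)
class Locus:
    name: str
    chrom: str
    start: int
    end: int


def find_clusters(loci: List[Locus], max_gap: int = 10_000) -> List[List[str]]:
    """Group pre-miRNA loci whose neighbours lie less than max_gap apart."""
    clusters: List[List[str]] = []
    ordered = sorted(loci, key=lambda locus: (locus.chrom, locus.start))
    for _, group in groupby(ordered, key=lambda locus: locus.chrom):
        current: List[Locus] = []
        current_end = 0
        for locus in group:
            # Close the running cluster when the gap reaches max_gap.
            if current and locus.start - current_end >= max_gap:
                if len(current) >= 2:
                    clusters.append([member.name for member in current])
                current = []
            current.append(locus)
            if len(current) == 1:
                current_end = locus.end
            else:
                current_end = max(current_end, locus.end)
        if len(current) >= 2:
            clusters.append([member.name for member in current])
    return clusters


# Hypothetical coordinates for illustration only.
example = [
    Locus("mir-a", "chr1", 1_000, 1_080),
    Locus("mir-b", "chr1", 5_000, 5_080),
    Locus("mir-c", "chr1", 50_000, 50_080),
    Locus("mir-d", "chr2", 100, 180),
    Locus("mir-e", "chr2", 200, 280),
]
print(find_clusters(example))
```

Raising `max_gap` to 50,000 would merge more loci into each cluster, which is the effect of the larger window discussed by Baskerville and Bartel (2005).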
Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2013.10.037.
Acknowledgments
The work was supported by the National Natural Science Foundation of China (no. 61301251, 81072389 and 81373102), the Research Fund for the Doctoral Program of Higher Education of China (no. 211323411002), the China Postdoctoral Science Foundation funded project (no. 2012M521100), key grant of the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (no. 10KJA33034), the National Natural Science Foundation of Jiangsu (No. BK20130885), the Natural Science Foundation of the Jiangsu Higher Education Institutions (no. 12KJB310003 and 13KJB330003), the Jiangsu Planned Projects for Postdoctoral Research Funds (no. 1201022B), the Science and Technology Development Fund Key Project of Nanjing Medical University (no. 2012NJMU001), and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD).
References
Aravin, A.A., et al., 2003. The small RNA profile during Drosophila melanogaster development. Dev. Cell 5, 337–350.
Bartel, D.P., 2004. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell 116, 281–297.
Bashirullah, A., Pasquinelli, A.E., Kiger, A.A., Perrimon, N., Ruvkun, G., Thummel, C.S., 2003. Coordinate regulation of small temporal RNAs at the onset of Drosophila metamorphosis. Dev. Biol. 259, 1–8.
Baskerville, S., Bartel, D.P., 2005. Microarray profiling of microRNAs reveals frequent coexpression with neighboring miRNAs and host genes. RNA 11, 241–247.
Berezikov, E., Guryev, V., van de Belt, J., Wienholds, E., Plasterk, R.H.A., Cuppen, E., 2005. Phylogenetic shadowing and computational identification of human microRNA genes. Cell 120, 21–24.
Burroughs, A.M., et al., 2010. A comprehensive survey of 3′ animal miRNA modification events and a possible role for 3′ adenylation in modulating miRNA targeting effectiveness. Genome Res. 20, 1398–1410.
Cho, W.C., 2007. OncomiRs: the discovery and progress of microRNAs in cancers. Mol. Cancer 6, 60.
Cho, W.C.S., 2010. MicroRNAs: potential biomarkers for cancer diagnosis, prognosis and targets for therapy. Int. J. Biochem. Cell Biol. 42, 1273–1281.
Concepcion, C.P., Bonetti, C., Ventura, A., 2012. The MicroRNA-17-92 family of MicroRNA clusters in development and disease. Cancer J. 18, 262–267.
Devor, E.J., Peek, A.S., Lanier, W., Samollow, P.B., 2009. Marsupial-specific microRNAs evolved from marsupial-specific transposable elements. Gene 448, 187–191.
Grimson, A., et al., 2008. Early origins and evolution of microRNAs and Piwi-interacting RNAs in animals. Nature 455, 1193–1197.
distribution, sequence and function facilitate co-regulation and coordinate regulation of biological processes. Of these, functional 231 pressures may be the most important. Coordinated interactions may 232 further enrich and complicate the coding-non-coding RNA network. 233 The cross-talk between miRNAs contributes to the robust regulatory 234 network. Many clusters are co-transcribed from the genomic DNA 235 sequence as a single polycistronic transcript (Lagos-Quintana et al., 236 2003; Lai et al., 2003; Lee et al., 2002; Mourelatos et al., 2002), and 237 the process contributes to the cross-talk between different miRNAs. 238 The distribution and relationships between miRNA gene clusters and 239 families are mainly derived from duplication processes within and 240 between species. Indeed, genomic duplication event of ancestral miRNA 241 genes may be a major driving force in origin and evolution of miRNA 242 gene clusters (Sun et al., 2013). The duplication dynamic may be 243 attributable to functional pressure. miRNAs are crucial negative 244 regulatory molecules, and they contribute to many different biological 245 processes, including cancer development. The new or young miRNA 246 genes are mainly evolved from ancestral genes via tandem duplication 247 (local duplication) (Tanzer and Stadler, 2004; Zhang et al., 2007), 248 segmental duplication (Zhang et al., 2007), and duplication of repetitive 249 elements (especially transposable elements) (Devor et al., 2009; Hertel 250 et al., 2006; Liang et al., 2012; Yuan et al., 2011). The presence of clustered 251 homologous miRNAs indicates that tandem duplication contributes 252 to the emergence of clusters (Marco et al., 2012). The evolutionary 253 processes may provide a potential production base of miRNAs 254 (multicopy miRNA genes or multicopy pre-miRNAs) and partner 255 miRNAs (homologous miRNA genes). These novel characteristics 256 should provide sufficient production bases for dynamic expression. 
257 Simultaneously, evolutionary processes exert their effects both 258 within and across species. For this reason, multicopy pre-miRNAs 259 may be diverged even though they produce the same mature 260 miRNAs. If the evolutionary process is involved in nucleotide 261 divergence in the miRNA region, homologous miRNA genes may be 262 generated. The duplication process largely contributes to the 263 phenomenon of clustered miRNAs with homologous members. 264 Functional shifting events can be widely detected across animal 265 species (Guo et al., 2009; Wheeler et al., 2009), but they also exist in 266 homologous miRNAs in specific species (Fig. S1C). This phenomenon 267 is common based on the annotated or canonical miRNA sequences, as 268 indicated by high-throughput sequencing datasets. The miRNA locus 269 always yields a series of miRNA variants (also termed isomiRs) mainly 270 via imprecise cleavage of Drosha and Dicer (Burroughs et al., 2010; 271 Guo et al., 2011b; Landgraf et al., 2007; Lee et al., 2010; Morin et al., 272 2008). These isomiRs are involved in various 5′ and 3′ ends, and some 273 Q11 are even detected by varied nucleotides, especially 3′ additional non274 template nucleotides. For a given miRNA locus, seed shifting events 275 may be detected especially if the sequence has novel 5′ ends. In this 276 way, functional shifts are often found between homologous miRNAs at 277 the isomiR level. This phenomenon may be exploited to enrich small 278 non-coding RNA regulatory networks. 279 Q12 Collectively, miRNA clusters and families are found here to have close 280 functional and evolutionary relationships. This closeness becomes even 281 more pronounced if the arbitrary inter-miRNA distance is set at 50 kb 282 (Baskerville and Bartel, 2005). The distributions of homologous miRNAs 283 are not occasional, and some of them experience rapid expansion within 284 and between species. 
Duplicated miRNAs are always well conserved, which can lead to multicopy pre-miRNAs or homologous miRNAs. Expansion may also involve nucleotide substitution (transition or transversion) and insertion/deletion, which may lead to novel young miRNA genes. However, all duplication processes increase nucleotide divergence in non-miRNA regions of pre-miRNAs, including their passenger strands (Fig. S1C). Therefore, miR-#-5p and miR-#-3p, or miRNA and miRNA*, including the two products from the multicopy pre-miRNAs, always show different evolutionary patterns (Figs. 3 and 4). Evolutionary processes further enrich the distributions and compositions of miRNA gene families and clusters.
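
As an illustration of the distance-based definition of clusters mentioned above, the following minimal Python sketch chains neighbouring miRNA loci on the same chromosome into one cluster whenever the gap between them does not exceed a chosen threshold (50 kb here, the value cited from Baskerville and Bartel, 2005). The function name `cluster_by_distance`, the tuple layout, and the toy loci with their coordinates are hypothetical and are not taken from this study. Strand is ignored, which is an assumption, and the procedure actually used to define clusters in this work may differ.

```python
from collections import defaultdict


def cluster_by_distance(mirnas, max_gap=50_000):
    """Chain miRNA loci into clusters when neighbours lie within max_gap bp.

    mirnas: iterable of (name, chrom, start, end) tuples (hypothetical layout).
    Returns a list of clusters, each a list of names with at least two members.
    Strand is ignored here (an assumption of this sketch).
    """
    by_chrom = defaultdict(list)
    for name, chrom, start, end in mirnas:
        by_chrom[chrom].append((start, end, name))

    clusters = []
    for chrom in sorted(by_chrom):
        loci = sorted(by_chrom[chrom])  # order loci by start coordinate
        members = [loci[0][2]]
        reach = loci[0][1]  # furthest end coordinate in the current chain
        for start, end, name in loci[1:]:
            if start - reach <= max_gap:
                members.append(name)
                reach = max(reach, end)
            else:
                if len(members) > 1:
                    clusters.append(members)
                members = [name]
                reach = end
        if len(members) > 1:
            clusters.append(members)
    return clusters


# Invented loci and coordinates, for illustration only.
toy_loci = [
    ("mir-a", "chr1", 1_000, 1_080),
    ("mir-b", "chr1", 1_300, 1_385),
    ("mir-c", "chr1", 40_000, 40_090),
    ("mir-d", "chr1", 200_000, 200_070),
    ("mir-e", "chr2", 500, 590),
]

print(cluster_by_distance(toy_loci, max_gap=50_000))  # [['mir-a', 'mir-b', 'mir-c']]
print(cluster_by_distance(toy_loci, max_gap=3_000))   # [['mir-a', 'mir-b']]
```

In this toy input, relaxing the threshold from 3 kb to 50 kb pulls a third locus into the same cluster, which is the kind of effect the choice of an arbitrary inter-miRNA distance has on cluster membership.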
Please cite this article as: Guo, L., et al., Integrated evolutionary analysis of human miRNA gene clusters and families implicated evolutionary relationships, Gene (2013), http://dx.doi.org/10.1016/j.gene.2013.10.037
L. Guo et al. / Gene xxx (2013) xxx–xxx
Guo, L., Lu, Z., 2010. Global expression analysis of miRNA gene cluster and family based on isomiRs from deep sequencing data. Comput. Biol. Chem. 34, 165–171.
Guo, L., Sun, B.L., Sang, F., Wang, W., Lu, Z.H., 2009. Haplotype distribution and evolutionary pattern of miR-17 and miR-124 families based on population analysis. PLoS One 4, e7944.
Guo, L., Liang, T., Gu, W., Xu, Y., Bai, Y., Lu, Z., 2011a. Cross-mapping events in miRNAs reveal potential miRNA-mimics and evolutionary implications. PLoS One 6, e20517.
Guo, L., et al., 2011b. A comprehensive survey of miRNA repertoire and 3′ addition events in the placentas of patients with pre-eclampsia from high-throughput sequencing. PLoS One 6, e21072.
Guo, L., Sun, B., Wu, Q., Yang, S., Chen, F., 2012. miRNA–miRNA interaction implicates for potential mutual regulatory pattern. Gene 511, 187–194.
Heimberg, A.M., Sempere, L.F., Moy, V.N., Donoghue, P.C.J., Peterson, K.J., 2008. MicroRNAs and the advent of vertebrate morphological complexity. Proc. Natl. Acad. Sci. U. S. A. 105, 2946–2950.
Hertel, J., et al., 2006. The expansion of the metazoan microRNA repertoire. BMC Genomics 7, 25.
Hongay, C.F., Grisafi, P.L., Galitski, T., Fink, G.R., 2006. Antisense transcription controls cell fate in Saccharomyces cerevisiae. Cell 127, 735–745.
Kozomara, A., Griffiths-Jones, S., 2011. miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 39, D152–D157.
Lagos-Quintana, M., Rauhut, R., Meyer, J., Borkhardt, A., Tuschl, T., 2003. New microRNAs from mouse and human. RNA 9, 175–179.
Lai, E.C., Tomancak, P., Williams, R.W., Rubin, G.M., 2003. Computational identification of Drosophila microRNA genes. Genome Biol. 4, R42.
Lai, E.C., Wiel, C., Rubin, G.M., 2004. Complementary miRNA pairs suggest a regulatory role for miRNA:miRNA duplexes. RNA 10, 171–175.
Landgraf, P., et al., 2007. A mammalian microRNA expression atlas based on small RNA library sequencing. Cell 129, 1401–1414.
Larkin, M.A., et al., 2007. Clustal W and Clustal X version 2.0. Bioinformatics 23, 2947–2948.
Lee, Y., Jeon, K., Lee, J.T., Kim, S., Kim, V.N., 2002. MicroRNA maturation: stepwise processing and subcellular localization. EMBO J. 21, 4663–4670.
Lee, L.W., et al., 2010. Complexity of the microRNA repertoire revealed by next generation sequencing. RNA 16, 2170–2180.
Lewis, B.P., Shih, I.H., Jones-Rhoades, M.W., Bartel, D.P., Burge, C.B., 2003. Prediction of mammalian microRNA targets. Cell 115, 787–798.
Liang, T., Guo, L., Liu, C., 2012. Genome-wide analysis of mir-548 gene family reveals evolutionary and functional implications. J. Biomed. Biotechnol. 2012, 679563.
Liu, N., Okamura, K., Tyler, D.M., Phillips, M.D., Chung, W.J., Lai, E.C., 2008. The evolution and functional diversification of animal microRNA genes. Cell Res. 18, 985–996.
Marco, A., Hooks, K., Griffiths-Jones, S., 2012. Evolution and function of the extended miR-2 microRNA family. RNA Biol. 9, 242–248.
Morin, R.D., et al., 2008. Application of massively parallel sequencing to microRNA profiling and discovery in human embryonic stem cells. Genome Res. 18, 610–621.
Mourelatos, Z., et al., 2002. miRNPs: a novel class of ribonucleoproteins containing numerous microRNAs. Genes Dev. 16, 720–728.
Niwa, R., Slack, F.J., 2007. The evolution of animal microRNA function. Curr. Opin. Genet. Dev. 17, 145–150.
Olive, V., Jiang, I., He, L., 2010. mir-17-92, a cluster of miRNAs in the midst of the cancer network. Int. J. Biochem. Cell Biol. 42, 1348–1354.
Seitz, H., Royo, H., Bortolin, M.L., Lin, S.P., Ferguson-Smith, A.C., Cavaille, J., 2004. A large imprinted microRNA gene cluster at the mouse Dlk1-Gtl2 domain. Genome Res. 14, 1741–1748.
Sempere, L.F., Cole, C.N., McPeek, M.A., Peterson, K.J., 2006. The phylogenetic distribution of metazoan microRNAs: insights into evolutionary complexity and constraint. J. Exp. Zool. B Mol. Dev. Evol. 306, 575–588.
Shearwin, K.E., Callen, B.P., Egan, J.B., 2005. Transcriptional interference — a crash course. Trends Genet. 21, 339–345.
Stark, A., et al., 2008. A single Hox locus in Drosophila produces functional microRNAs from opposite DNA strands. Genes Dev. 22, 8–13.
Sun, J., et al., 2013. Comparative genomic analysis reveals evolutionary characteristics and patterns of microRNA clusters in vertebrates. Gene 512, 383–391.
Tamura, K., Peterson, D., Peterson, N., Stecher, G., Nei, M., Kumar, S., 2011. MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods. Mol. Biol. Evol. 28, 2731–2739.
Tanzer, A., Stadler, P.F., 2004. Molecular evolution of a microRNA cluster. J. Mol. Biol. 339, 327–335.
Viswanathan, S.R., Mermel, C.H., Lu, J., Lu, C.W., Golub, T.R., Daley, G.Q., 2009. microRNA expression during trophectoderm specification. PLoS One 4, e6143.
Wang, K., et al., 2009. Circulating microRNAs, potential biomarkers for drug-induced liver injury. Proc. Natl. Acad. Sci. U. S. A. 106, 4402–4407.
Wheeler, B.M., et al., 2009. The deep evolution of metazoan microRNAs. Evol. Dev. 11, 50–68.
Yu, J., et al., 2006. Human microRNA clusters: genomic organization and expression profile in leukemia cell lines. Biochem. Biophys. Res. Commun. 349, 59–68.
Yuan, Z., Sun, X., Liu, H., Xie, J., 2011. MicroRNA genes derived from repetitive elements and expanded by segmental duplication events in mammalian genomes. PLoS One 6, e17666.
Zhang, R., Peng, Y., Wang, W., Su, B., 2007. Rapid evolution of an X-linked microRNA cluster in primates. Genome Res. 17, 612–617.