Identification of milRNAs and their target genes in Ganoderma lucidum by high-throughput sequencing and degradome analysis

Identification of milRNAs and their target genes in Ganoderma lucidum by high-throughput sequencing and degradome analysis

Journal Pre-proofs Identification of milRNAs and their target genes in Ganoderma lucidum by high-throughput sequencing and degradome analysis Junjie S...

2MB Sizes 0 Downloads 17 Views

Journal Pre-proofs Identification of milRNAs and their target genes in Ganoderma lucidum by high-throughput sequencing and degradome analysis Junjie Shao, Liqiang Wang, Yang Liu, Qianru Qi, Bin Wang, Shanfa Lu, Chang Liu PII: DOI: Reference:

S1087-1845(18)30180-4 https://doi.org/10.1016/j.fgb.2019.103313 YFGBI 103313

To appear in:

Fungal Genetics and Biology

Received Date: Revised Date: Accepted Date:

5 September 2018 9 August 2019 15 November 2019

Please cite this article as: Shao, J., Wang, L., Liu, Y., Qi, Q., Wang, B., Lu, S., Liu, C., Identification of milRNAs and their target genes in Ganoderma lucidum by high-throughput sequencing and degradome analysis, Fungal Genetics and Biology (2019), doi: https://doi.org/10.1016/j.fgb.2019.103313

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2019 Published by Elsevier Inc.

Title Identification of milRNAs and their target genes in Ganoderma lucidum by high-throughput sequencing and degradome analysis Junjie Shao&1, Liqiang Wang&1, Yang Liu2, Qianru Qi2, Bin Wang2, Shanfa Lu1*, Chang Liu1* 1Key

Laboratory of Bioactive Substances and Resource Utilization of Chinese Herbal

Medicine from Ministry of Education, Institute of Medicinal Plant Development, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100193, P.R.China; 2School

of Pharmaceutical Sciences, Xiangnan University, Chenzhou, Hunan Province,

P.R.China.

E-mails: JJS: [email protected] LQW: [email protected] YL: [email protected] QRQ: [email protected] BW: [email protected] SFL: [email protected] CL: [email protected] &Contributed

equally

*Correspondence: [email protected] (SFL), Tel.: +86-010-5783-3366 (SFL); [email protected] (CL), Tel.: +86-010-5783-3111 (CL)

1 / 35

Abstract MicroRNAs (miRNAs in animals and plants or milRNAs in fungi) are endogenous noncoding RNAs that can regulate gene expression. However, little information is known about milRNAs and their target genes in Ganoderma lucidum. Here, we systematically predicted and characterised the milRNAs and their target genes across the three developmental stages of G. lucidum. A total of 168 unique milRNAs were predicted using a small RNA sequencing method. For them, 1,612 target sequences corresponding to 1,311 unique genes were predicted by degradome sequencing. We selected 42 predicted milRNAs and performed RTPCR amplification and Sanger sequencing of the products. Five products were found to have sequences similar to those predicted, confirming the presence of milRNAs in G. lucidum, but demonstrating the difficulty in their validation. Among the 168 milRNAs, 111 were found to be significantly differentially expressed across the three developmental stages (q ≤ 0.05). The expression levels of 12 milRNAs were measured by stem-loop quantitative real-time polymerase chain reaction. Eight of them were in line with the sequencing results (r ≥ 0.9, p ≤ 0.05). These 12 milRNAs and their target genes form 16 milRNA-target gene pairs. The expression profiles of 8 of these 16 miRNA-target pairs were negatively correlated, according to real-time quantitative analysis, whereas the other eight pairs were positively correlated. Furthermore, the results of functional enrichment analysis showed that the target genes of milRNAs mapped to the Gene Ontology terms ‘GTP binding’ and ‘FAD binding’ were enriched in specific developmental stages. These target genes were related to the biosynthesis of triterpenes and polysaccharides and lignin degradation pathway in G. lucidum. In summary, this study has indicated that milRNAs may play crucial regulatory roles in various biological processes of G. lucidum for the first time and open up new avenues for research on milRNAs’ biosyntheses and functions in basidiomycetes.

Keywords: Ganoderma lucidum; milRNAs; small RNA sequencing; degradome sequencing; stem-loop RT-qPCR

2 / 35

Introduction Ganoderma lucidum (Leyss. ex Fr.) Karst is one of the most famous medicinal fungi belonging to the family Ganodermataceae and order Polyporales (Richter et al. 2015). G. lucidum has received worldwide attention because it can serve as plant pathogens (Roozbeh et al. 2013) and bioreactor of lignin-modifying enzymes (Kaur & Kaur 2016); (Zhou et al. 2013). In addition, G. lucidum has been used widely as functional food and herbal drug, because it contains various natural active medicinal components, such as triterpenes, polysaccharides, steroids and terpenes. Among them, triterpenes and polysaccharides are considered to be the main bioactive compounds of G. lucidum (Xia et al. 2014). The significant medicinal values and lignin degradation capabilities and the availability of wealthy genomic information (Chen et al. 2012) enabled G. lucidum an ideal model species in studying the biology of basidiomycetes. Previously, we sequenced and characterised the complete genome of G. lucidum and then identified all the protein coding genes participating in the biosynthetic pathways of the secondary metabolites, growth and development through subsequent transcriptomic studies (Chen et al. 2012). To determine if noncoding RNAs play important roles in any biological process, we have identified and characterised long intergenic noncoding RNA (lncRNAs) and natural antisense transcripts by using nonstrand-specific RNA-Seq data or strand specific RNA-Seq data, respectively (Li et al. 2014; Shao et al. 2017). Additional noncoding RNAs need to be identified and characterized in order to gain complete understanding of the regulatory network in G. lucidum. Small RNAs (sRNA) are a group of RNA molecules with 19–28 nucleotides (nt) in length. Several types of sRNAs have been identified, including: microRNA (miRNA) (Lee & Ambros 2001); (Lau et al. 2001); (Lagosquintana et al. 2001), endogenous trans-acting siRNA (transiRNA), repeat-associated siRNA (rasiRNA), small scan RNA (scnRNA) (Mello & Conte 2004) and piwi-interacting RNAs (piRNAs) (Aravin et al. 2007). Among them, microRNAs (miRNAs) are endogenous noncoding sRNAs and they are derived from primary transcripts (pri-miRNA) with a characteristic stem-loop structure (Bartel 2004). miRNAs are involved in gene expression regulation in a diverse range of biological processes, such as cell differentiation or proliferation and oxidative stress resistance (Moss et al. 1997) in plants (Jonesrhoades et al. 2006; Mao et al. 2012), animals (Lewis et al. 2003) and fungi (Lee et al. 2010). In animals and plants, they were produced by Dicer-like enzymes or Drosha proteins from pri-miRNA having a stem-loop structure (Bartel 2004). In fungi, miRNAs were produced by different combinations of proteins including Dicers (Caterina et al. 2004), QDE-2 (Maiti et al. 2007), the exonuclease QIP (Maiti et al. 2007) and MRPL3 (an RNase III domaincontaining protein) (Lee et al. 2010), suggesting a biosynthesis mechanism different from those in animals and plants. As a result, the miRNAs in fungi are usually called miRNA-like RNAs (milRNAs) (Lee et al. 2010). Since the first milRNA in fungi was reported in Neurospora crassain (Lee et al. 2010), the milRNA were identified in other fungal species, such as Sclerotinias clerotiorum, Cryptococcus neoformans, Fusarium oxysporum, Metarhizium 3 / 35

anisopliae, Trichoderma reesei, Aspergillus fumigatus, Aspergillus flavus, Ophiocordyceps sinensis, Coprinopsis cinerea, Penicillium marneffei, and so on (Shao et al. 2019; Wang et al. 2018). The significant roles of miRNA have been reported in plants and animals. The exact mechanism of miRNA-mediated translational repression is yet to be fully determined, which regulate gene expression post-transcriptionally. They generally bind to the 3'-UTR (untranslated region) of their target mRNAs and repress protein production by destabilizing the mRNA and translational silencing (Cannell et al. 2007). However, the regulating roles of milRNA and how they participate in the regulation of gene expression in fungi was relatively lacking compared those for animals and plants. The identification of milRNA and their targets is the prerequisite for understanding the milRNAs’ biological functions and regulatory mechanism. Several methods have been developed to identify the target genes of milRNAs, including computational target prediction, degradome sequencing and 5′ RACE (Liang et al. 2010). In general, the method of computational prediction suffers high rate of false positive, and the method of 5′ RACE is low throughput and time consuming. By contrast, the method of degradome sequencing is with higher throughput and predicted rate than the methods of computational prediction and 5′ RACE. It has become a powerful tool to identify miRNA targets. Moreover, it has been widely used in other species. Degradome sequencing is an efficient way to identify miRNA targets as it can sequence the 5’ ends of uncapped RNA fragments through a high-throughput way (Addo-Quaye et al. 2008; Jiang et al. 2014). Previously, 89 potential milRNAs were predicted from the database of expressed sequence tags of G. lucidum, and 28 potential targets were predicted computationally (Mu et al. 2015). However, no experimental validation of the predicted milRNAs has been performed. Considering the roles of miRNAs played in a wide range of organisms, we hypothesized that milRNAs might also play an important role in G. lucidum’s biological processes. In this study, we performed small RNA sequencing on samples from mycelia, primordia and fruiting bodies and degradome sequencing on samples mixed from these three developmental stages. The milRNAs and their targets were systematically identified. Our study for the first time painted a global picture of the milRNAs and their target genes in G. lucidum by using experimental data. The results suggested that milRNAs might mediate the biological processes, which is responsible for the biosynthesis of triterpenoids and polysaccharides and the degradation of woody materials in G. lucidum.

Materials and Methods Strain and culture conditions The dikaryotic strain G. lucidum CGMCC5.0026 was obtained from China General Microbiological Culture Collection Center (Beijing, China). The mycelia, primordia and fruiting bodies were cultured as described previously (Chen et al. 2012).

Identification of Dicer-like and Argonaute protein homologs in G. lucidum To identify the homologues of Dicer-like and Argonaute proteins in G. lucidum, Dicer-like and Argonaute protein sequences of Trichoderma reesei (Kang et al. 2013), Metarhizium 4 / 35

anisopliae (Zhou et al. 2012b) and Neurospora crassa (Lee et al. 2010) were downloaded from GenBank on June 2018. These sequences were then compared with the protein sequences of G. lucidum by BLASTP by using a cutoff e-value of 1e-10. The best hits of BLASTP result were conducted to domain identification by NCBI-conserved domain search subsequently (https://www.ncbi.nlm.nih.gov/).

Construction of sRNA libraries and high-throughput sequencing Total RNAs were extracted from samples of mycelia, primordia and fruiting bodies of G. lucidum by using Trizol Reagent (Invitrogen, CA, USA). Each sample has two biological replicates. Genomic DNA contaminations were eliminated by RNase-free DNase I (Tiangen, Beijing, China). The quality and integrity of total RNAs were detected by Agilent 2100 (Agilent, CA, USA). The small RNA libraries were constructed using the procedures described below. The purified total RNAs were reverse-transcribed by Superscript II reverse transcriptase (Invitrogen, USA) with random primers, and the cDNAs were then ligated with 5′ and 3′ adapters. The ligation products were purified and were then subjected to polymerase chain reaction (PCR) amplification and gel selection for products having sizes between 400 and 500 bp. The selected products were sequenced with Illumina Hiseq2500 (BerryGenomics, Beijing, China).

Identification of milRNAs The resulting raw sequences of each sample from mycelia, primordia and fruiting bodies were subjected to pre-processing by using FASTX-Toolkit (v0.0.14, http://hannonlab.cshl.edu/fastx_toolkit/) as described below, respectively. Firstly, adaptor sequences were identified and removed. Secondly, the sequences were removed if the reads contain at least 3% of Ns, or 50% of the bases in the reads have a Phred base quality score less than 33. Lastly, the sequences shorter than 18 nt or longer than 30 nt were removed. The resulting reads were compared against the Rfam (http://rfam.janelia.org/) and Repeat database (http://www.girinst.org/repbase/) by using script rfam_scan.pl (Sarah W et al. 2013). Those reads-matching entries of Rfam including mRNA, rRNA, tRNA, snoRNA, snRNA, other noncoding RNAs and repeat sequences were filtered out. The resulting sequences were then searched against miRBase 21.0 (Griffithsjones et al. 2008) (ftp://mirbase.org/pub/mirbase/CURRENT/), and only the sequences with three or few mismatches compared with mature miRNAs in miRBase were considered conserved miRNAs (Mu et al. 2015). The remaining sequences were further analysed with miRDeep2 software (FriedlanNder 2012)). A miRNA is considered novel if it meets the following criteria. (1) At least two reads were mapped to the miRNA, and the read counts of miRNA are more than those of the miRNA*, which is the complementary strand of the hairpin (Blake C. Meyers 2008). (2) The asymmetry between mature milRNAs and milRNAs* should have five or few mismatches (Jiang et al. 2007). In addition, mature milRNAs and milRNAs* present no more than two nucleotides at the 3′ overhangs (Lee et al. 2010). (3) The number of asymmetric bulges in miRNA/miRNA* duplex should not exceed 2 (Zhang et al. 2006). (4) The minimum free energy of the predicted 5 / 35

hairpin precursor should be less than -20 kcal/mol (Tong et al. 2016).

Degradome analysis and target gene identification To construct the degradome analysis library, total RNAs were extracted as described above. Then, 20 µg of total RNAs from each sample were mixed together, and the quality and integrity of total RNA were examined using Agilent 2100. The degradome library was prepared according to the procedure described previously (Addo-Quaye et al. 2008); (Ma et al. 2010). The library was then subjected to DNA sequencing on Illumina Hiseq2500 platform (LC-BIO, Hangzhou, China). The sequencing data were analysed using software CleaveLand (v4.4.4) to identify milRNA-target pairs (Addoquaye et al. 2009). The degradome sequencing reads were mapped to the transcriptomes of G. lucidum (Shao et al. 2017). For each transcript, the mapped reads were examined for their abundance and location. All of the potential cleaved transcripts were classified into five categories based on the signature abundance at each occupied transcript position as described previously (Xu et al., 2013; Yang et al., 2013). In category 0, over one raw read mapped is observed at the position. The abundance at the position is equal to the maximum on the transcript, and only one maximum on the transcript is found. In category 1, over one raw read is noted at the position. Abundance at the position is equal to the maximum on the transcript, and more than one maximum position is found on the transcript. In category 2, over one raw read is observed at the position, and abundance at the position is less than the maximum but higher than the median for the transcript. In category 3, over one raw read is noted at the position, and abundance at the position is equal to or less than the median for the transcript. In category 4, only one raw read is observed at the position (Addoquaye et al. 2009). The sequence alignment between each candidate milRNA and its potential target was evaluated using a score called Allen score (Allen et al. 2005a). The match between the milRNA and its target gene was considered significant if the alignment of the pair has an Allen score less than 7 and the mismatches between milRNA and its target genes were located at the 10-11th base from the 5′ end of the mature milRNA sequence (Allen et al. 2005b). These significant pairs of milRNA and target genes were selected for further analyses.

Functional analysis of target genes To identify the potential functions of milRNAs, all target genes of milRNA identified in G. lucidum undergone Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) Pathway analyses. The G. lucidum genes and their associated GO terms and KEGG pathway information were retrieved from the previous study (Chen et al. 2012). To determine whether the enrichment is statistically significant, the hypergeometric probability was calculated for each GO term or KEGG pathway by using formula (1) as described previously (Shao et al. 2017).

( )( ) ( ) 𝑛 𝑚 𝑖 𝑁―𝑖

𝑃(x = i) =

𝑚+𝑛 𝑁

6 / 35

,

𝑚!𝑛!𝑁!(𝑚 + 𝑛 ― 𝑁)!

= 𝑖!(𝑛 ― 𝑖)!(𝑚 + 𝑖 ― 𝑁)!(𝑁 ― 𝑖)!(𝑚 + 𝑛)!,. where ‘P’ is the hypergeometric probability. ‘n’ is the number of all genes in G. lucidum mapped to particular GO terms or KEGG pathways. ‘m’ is the number of all genes in G. lucidum that are not related to the GO terms or KEGG pathways of interest. ‘N’ is the number of all target genes, and ‘x’ is the number of target genes related to particular GO terms or KEGG pathways. The p-value was then subjected to multiple test correlation. The False Discovery Rate (FDR, q-value) was calculated and used to represent the significance of the enrichment for particular GO term or KEGG pathway.

Validation of the predicted milRNAs using stem-loop RT-PCR and Sanger sequencing To determine if the milRNAs predicted are actually present in the cell. We selected 42 predicted milRNAs. Primers were designed to amplify the expected milRNAs using cDNA, or total RNAs as template. The total RNAs were pre-treated with DNAse I to remove DNA contaminations. The forward and reverse primers were designed according to those described previously (Chen et al. 2005) using software primer 6.0 (Primer-E Ltd., Plymouth, UK). All primer sequences are listed in Table S1.The RNA samples used for RT-PCR validation were similar to those used for high-throughput DNA sequencing experiments. The reverse transcription was performed using RT-PCR Kit (AT301-02, TransGen, Beijing, China). The conditions of the reverse transcription reaction were as follows: 45 °C for 35 min, 85 °C for 5 min. The cDNA was diluted and then PCR-amplified in triplicates by using the Bio-Rad CFX96 Real-Time PCR System (Bio-Rad, CA, USA). Each PCR reaction was conducted in a final volume of 20 µl containing 1 µl diluted cDNA, 10 µM forward primer, 10 µM reverse primer, and 10 µl 2 × TransStart Top Green qPCR SuperMix (AQ131, TransGen, Beijing, China). The PCR products were sent for Sanger sequencing directly.

Reverse Transcription and Quantitative PCR (RT-qPCR) analysis A stem-loop RT-qPCR method was applied to quantitate the expression levels of milRNA and their target genes across the three developmental stages (Chen et al. 2005). Sixteen milRNAtarget pairs were chosen according to their functional annotation (Chen et al. 2012). The 18S rRNA and RPL4 were selected as endogenous references for milRNAs and mRNAs, respectively, as described previously (Mu et al. 2015; Xu et al. 2014). The RNA samples used for this experiment were the same as those described above. The cDNA synthesis was carried out as described above. The RT-qPCR procedure was as follows: 94 °C for 30 s, 35 cycles of 94 °C and 60 °C for 34 s. The endogenous reference of milRNA was reverse-transcribed from total RNA samples by using random primers. The relative expression levels of milRNAs and target genes were calculated by 2−△△Ct method (Livak & Schmittgen 2001).

Statistical analysis Statistical significance of Pearson’s correlation coefficient was calculated using t-test as described previously (Shao et al. 2017). Briefly, the t-value of Pearson’s correlation coefficient was calculated using the following formula (2): 7 / 35

𝑡=

𝑟 𝑠𝑞𝑟𝑡[(1 ― 𝑟2) (𝑁 ― 2)]

where r indicates the value of Pearson’s correlation coefficient, and N indicates the sample size.

Results Analysis of sRNA sequencing data in G. lucidum A total of 32,405,494, 33,608,256 and 51,407,302 raw reads were obtained from duplicated samples of mycelia, primordia and fruiting bodies, respectively. After filtering, 24,955,930, 26,161,471 and 40,245,715 of total reads were mapped to G. lucidum reference genome. After removing redundant reads, 2,813,028, 6,141,444 and 3,136,501 unique reads were identified from data of mycelia, primordia and fruiting bodies respectively. Mapping of these unique reads revealed that 1,826,715, 3,996,818 and 1,898,669 reads were mapped to the reference genome (Table S2). Length distribution of total sequences showed that most sRNAs have length ranging from 19 nt to 23 nt, representing 70.01%, 78.14% and 71.87% of the total sRNAs identified in mycelia, primordia and fruiting bodies, respectively (Figure 1A). In terms of unique sequences, most sRNAs also have lengths ranging from 19 nt to 23 nt, representing 71.83%, 76.98% and 75.18% of the unique reads of sRNAs identified in mycelia, primordia and fruiting bodies, respectively (Figure 1B). Among the total sequences, most sRNAs have sizes of 22 nt (15.47%), 21 nt (18.65%) and 21 nt (16.25%) in mycelia, primordia and fruiting body libraries, respectively (Figure 1A), whereas sRNAs having length of 20 nt (26.14%), 20 nt (23.59%) and 20 nt (26.01%) are the most abundant types in mycelia, primordia and fruiting bodies, respectively (Figure 1B).

Identification of Argonaute and Dicer-like protein homologs in G. lucidum If G. ludicum has a functional milRNA machinery, G. lucidum genome would encode the proteins that are involved in the milRNA biogenesis and functioning. As a result, we searched G. lucidum genome for Argonaute and Dicer-like proteins by using BLASTP. The best hits of sequence comparison include five Argonaute homologues (GL15827, GL20838, GL21067, GL22457 and GL31293) in G. lucidum genome (Figure S1). To ensure that these proteins are indeed Argonaute protein homologues, we searched these sequences for conserved Argonaute protein domains, which include PIWI, PAZ and N-terminal domains (Poulsen et al. 2013). The results of domain identification showed that all five Argonaute homologues contain PIWI, PAZ, N-terminal domains and Argonaute liker 1/2 domain. Besides, GL22457 contains Mid domains of Argonaute. Through similar sequence analyses, three Dicer-like protein homologues (GL16791, GL22869 and GL26805) were found in G. lucidum (Figure S2). Dicer-like proteins have two conserved RNase III domains (Macrae et al. 2006). Domain analysis found that both GL16791 and GL22869 contain two RNase III domains. In addition, GL16791 contains another conserved Dicer dimerisation domain. However, GL26805 contains only one RNase III domain. Overall, these results indicate that G. lucidum contains functional milRNA 8 / 35

machinery.

Prediction of milRNAs Initially, we searched the miRBase (version 21.0) with all sRNA reads from mycelia, primordia and fruiting bodies by using BLASTN with a threshold of less than three mismatches (Mu et al. 2015). Unfortunately, no hits were identified under this condition. Then, we applied the miRDeep2 program to predict milRNAs in G. lucidum with default parameters because no specific parameters have been reported to be useful to predict milRNAs in fungi. The hairpin structures of milRNAs predicted in G. lucidum were then manually checked according to the parameters described above. A total of 92, 149 and 134 milRNAs were identified in both replicate samples from mycelia, primordia and fruiting bodies, respectively. We retrieved those milRNAs identified in both replicate samples, thereby resulting in 168 unique milRNAs across the three developmental stages. Details for these milRNAs are shown in Table S3. The distribution and the heatmap of expression level of each milRNA among the three developmental stages are shown in Figure 2 and Figure S3, respectively. The volcano plot showing the expression changes of the milRNAs in different stage of G. lucidum is shown in Figure S4. Seventy seven out of 168 (45.83%) milRNAs were shared across all the three developmental stages. By contrast, 2, 22 and 14 out of 168 milRNAs (1.19%, 13.10% and 8.33%) were found only in the mycelia, primordia and fruiting bodies, respectively.

Prediction of the target genes of milRNAs As described above, degradome sequencing was conducted to identify potential target genes of milRNAs in G. lucidum. A total of 16,633,927 raw reads were identified, and these corresponded to 3,057,338 unique raw reads. A total of 13,439,330 reads (80%) could be mapped to the reference genome of G. lucidum by using BLASTN with an e-value cutoff of 1e-10. For the 168 milRNAs identified above, 130 targeted 2,394 sequences corresponding to 1,804 specific genes, wherein 14, 15, 544, 459 and 1414 sequences belong to categories 0, 1, 2, 3 and 4, respectively (Table S4). This result suggested that milRNA can regulate several target genes, and more than one milRNA can regulate one target gene, consistent with those described previously (Zhou et al. 2016). For example, milRNA159 targeted 202 sequences of 202 specific genes, and milRNA116 targeted 181 sequences of 178 specific genes. The gene of GL25035 was targeted by six milRNAs. According to the functions of targeted genes, 17 target genes belonged to the CAZy family, seven target genes were involved in the lignin degradation pathway, two target genes were related to the key enzymes in the mevalonate (MVA) pathway, and four target genes participated in the polysaccharide biosynthesis (Table S5).

Characterisation of milRNAs and their precursors Length distribution showed that mature milRNAs ranged from 18 nt to 25 nt in length. Most of these milRNAs have lengths ranging from 21 nt to 22 nt, accounting for 52.38% of the total mature milRNAs (Figure 3A). Interestingly, the milRNAs with lengths of 18, 21–23 and 25 nt showed a strong preference for a 5′ uracil in the first position similar to those found in both animals and plants (Angélique et al. 2006; Bartel 2004; Lin et al. 2016a; Wang et al. 2015). Meanwhile, those having lengths of 19, 20 and 24 nt were enriched with cytosine, guanine 9 / 35

and guanine at the 5′ end, respectively (Figure 3B). The length of milRNA precursors varied from 36 nt to 93 nt, with an average of 56 nt (Figure 3C). These values were similar to those obtained for the milRNA precursors found in animals (Ambros et al. 2003; Bartel 2004) . Among these 168 mature milRNAs, 87 (51.8%) were at the 5′ end of their precursor sequences, whereas 81 (48.2%) were at the 3′ end. 74 (44%) of the mature milRNAs originated from the exon regions of protein coding genes, 70 (41.7%) originated from intergenic regions and 24 (14.3%) originated from the intron regions (Figure 3D). The number of mature milRNAs mapped to the exon and intron regions exceeded that mapped to the intergenic regions (Table S3), which were significantly different from those observed in plants (Lu et al. 2005). The precursor genes were those from which the milRNAs originate. They contain the sequences for the stem-loop structure producing the milRNAs (Bartel 2004) . Using MiDeep2 software, 154 precursor genes were identified for 168 milRNAs. Based on the sequence similarities, the 154 milRNA precursors were classified into 147 families. In general, we found that some members of a family generated distinct mature milRNA sequences. Family milRNA012 had five members. Families of milRNA002 and milRNA008 had four members. Families of milRNA007 and milRNA009 had three members. The other milRNA families had one or two members (Table S3). Interestingly, multiple milRNAs can originate from the same precursors. For example, the precursors of milRNA117 and milRNA130, milRNA101 and milRNA115 and milRNA023 and milRNA113 were mapped to three precursor genes. This phenomenon was consistent with that found in animals and plants (Altuvia et al. 2005; Frazier et al. 2010; Tanzer et al. 2005; Tanzer & Stadler 2004). Both milRNA117 and milRNA130 were at the 3′ arm of hairpin structures and were unidirectional in the unigene. milRNA 101 and milRNA113 were at the 5′ arm of hairpin structures, whereas milRNA115 and milRNA023 were at the 3′ arm of hairpin structures. The presence of two milRNA precursors in a unigene suggested that the generated milRNAs might be co-regulated and are probably involved in the same network of biological process (Altuvia et al. 2005; Tanzer et al. 2005; Tanzer & Stadler 2004).

Validation of milRNAs using RT-PCR and Sanger Sequencing We used RT-PCR and Sanger Sequencing in addition to the degradome analysis to validate these milRNAs. From target genes that are involved in the biosynthesis of triterpenes and polysaccharides and lignin degradation pathway, we selected 42 milRNAs for validation using RT-PCR. The primer sequences are listed in (Table S1). Twenty-three of them had amplified products. Examination of the results showed that eighteen of them contained only the primer sequences. The other five (milRNA024, milRNA033, milRNA038, milRNA067 and milRNA137) contained sequences similar to those predicted (see below for details). Primers specific for RPL4 were also used to amplify the cDNAs. It served as positive control to ensure that our PCR reactions worked properly. The gel electrophoresis results of PCR products 10 / 35

were shown Figure 4A. Lane 1 shows the results using cDNA as template; lane 2 shows the results using water as template; and lane 3 shows the results using Total RNA treated with DNAase as template. Using water as template is to ensure that the PCR products are specific to the input cDNA template. Using Total RNAs as template is to ensure that the PCR products are not amplified from contaminating genomic DNA. As expected, there are PCR products only in lane 1, where the cDNAs were used as template. The PCR products amplified from the cDNA templates were then sent for Sanger sequencing. Figure 4B showed the chromatograms of Sanger-sequencing results. The comparison of the five predicted sequences and those sequenced in chromatograms showed high sequence similarity (Figure 4C). Three of the five have sequences identical to those predicted: milRNA024, milRNA033 and milRNA067. The predicted sequences of milRNA038 and milRNA137 had 4 and 9 base deletions compared with those obtained experimentally. The reasons remain to be identified.

Validation of milRNAs by Bioinformatic Analyses The low validation rate obtained above suggests that experimental validation of milRNAs is difficult due to reasons currently unknown. As a result, we conducted additional bioinformatic analysis in order to determine the likelihood of these predicted milRNAs being real. We selected twelve milRNAs whose putative target genes are involved in the lignin degradation pathway, MVA pathway and polysaccharide biosynthesis pathway: including milRNA008, milRNA020, milRNA042, milRNA061, milRNA063, milRNA120, milRNA139, milRNA143, milRNA152, milRNA154, milRNA159 and milRNA160. The hairpin structures of these 12 selected milRNAs sequences were shown in Figure 5. Firstly, we determined the copy number of each milRNA in the genome of G. lucidum by comparing the milRNA with the genome using the BLASTN program. All the 12 milRNA genes had only one copy in the genome. The alignment and location information of the milRNA were shown in Table 1. The sequence identity between the 12 milRNAs and the aligned regions of the genome was all 100%. Six of the 12 milRNA located in the exons, 4 milRNA located in intergenic regions, and 2 milRNA located in introns. Secondly, we determined the expression levels of the 12 consensus precursor genes of milRNA genes by using the lncRNA sequencing data reported before (Shao et al. 2017) (Table 2). The expression levels of precursor genes in primordia and fruit body were higher than that in mycelia. The total number of reads mapped to the milRNA139 precursor gene in the three developmental stages was 1,837, the largest number among those for the 12 precursor genes. Interestingly, the expression level of precursors of milRNA042, milRNA063, milRNA139, milRNA152 and milRNA160 weren’t detected in mycelia, suggesting that the expression of these precursor genes might be regulated across different developmental stages. Thirdly, we determined the expression levels of the 12 milRNA by using the sRNA sequencing data. As shown in Table 3, the majority of the 12 selected milRNAs exhibited 11 / 35

tissue-specific expression. For example, milRNA139, milRNA160, milRNA152 and milRNA120 showed higher expression in fruiting bodies than in mycelia and primordia, whereas milRNA159, milRNA063, milRNA020, milRNA008, milRNA143 and milRNA042 showed higher expression in primordia than that in mycelia and fruiting bodies. On the contrary, the expression levels of milRNA061 and milRNA154 were relatively higher than those in mycelia than in primordia and fruiting bodies. Stage-specific expression of milRNAs suggest that they might play species-specific functions (Xia et al. 2016). Furthermore, the expression levels of the 12 selected milRNA obtained from the small RNA sequencing and stem-loop RT-qPCR experiments at particular stages were logtransformed and normalised against the average expression levels among the three developmental stages. The Pearson’s coefficients (r) between the expression profiles obtained from the two experiment technology platforms were calculated (Table S6). The stem-loop RT-qPCR results for 8 out of 12 (67%) milRNAs were consistent with those obtained from the small sequencing data (r ≥ 0.90, p ≤ 0.05) (Figure 6). Fourthly, the correlation between the expression profiles of the 12 milRNAs and their target genes were analyzed. The 12 selected milRNA and their target genes formed 16 milRNA-target pairs. The 16 paires were classified into three groups. Group one has nine pairs, which are closely related to lignin degradation pathway. Group two has four pairs, which participate in the MVA pathway. Group three has three pairs, which participate in polysaccharide biosynthesis pathway. In terms of the expression correlation, four pairs (Figures 7A, C, D and I) in group one, two pairs (Figures 7J and K) in group two and two pairs (Figures 7O and P) in group three exhibited a negative relationship at the expression level, suggesting that a transcriptional repression may be mediated on these target genes through their milRNAs (Wang et al. 2015). By contrast, we found that five pairs (Figures 7B, E–H) in group one, two pairs (Figures 7 L and M) in group two and one pair in group three (Figure 7N) presented a positive relationship at the expression level. Detailed information for these 16 milRNA-target gene pairs can be found in Table S7. Fifthly, we analysed the cleaved transcript categories of the 12 selected milRNA targetgenes (Table 4). All the target genes had reads which were mapped to the cleaved positions. The target genes mainly belonged to categories 2, 3 and 4. Only one and two target genes of milRNA042 and milRNA152 belonged to category 0. The T-plots of the target genes were shown in Figure S5. The regions and splice positions recognized by milRNAs were shown in Table S7. Taken together, these 12 milRNAs are likely to represent true milRNAs.

Function enrichment analysis of target genes of milRNA across three developmental stages To understand the potential functions of milRNAs, we conducted functional enrichment analysis of targeted genes in different developmental stages. Among the 1,804 target genes, 903 were mapped to 629 GO terms. The q-values for the enrichment of 38 GO terms were less than 0.05. Thirty-three GO terms had more than 10 target genes (Table S8). The GO terms with the largest number of target genes are catalytic activity (GO:0003824, 138), ATP 12 / 35

binding (GO:0005524, 123), metabolic process (GO:0008152, 119) and oxidation reduction (GO:0055114, 97) (Table S8). Enrichment analysis was carried out for the KEGG pathways similar to the enrichment analysis for GO terms. Among the 1,804 target genes, 491 of them were mapped to 240 KEGG pathways. The q-values for the enrichment of 35 KEGG pathways were less than 0.05. Twenty-two KEGG pathways displayed more than 10 target genes mapped to them (Table S9). The KEGG pathways with the largest numbers of target genes were mapped to the metabolic pathways (ko01100, 138), biosynthesis of secondary metabolites (ko01110, 54), biosynthesis of antibiotics (ko01130, 44) and microbial metabolism in diverse environments (ko01120, 40) (Table S9). To find out if milRNAs functioned in a developmental stage-specific manner, we conducted functional enrichment analysis of the target genes in each of the three developmental stages. A total of 10, 18 and 27 GO terms were enriched in mycelia, primordia and fruiting bodies, respectively, with q-value < 0.05 (Table S10). Enrichment analysis results for the KEGG pathways are shown in Table S11. A total of 6, 6 and 15 KEGG pathways were enriched in mycelia, primordia and fruiting bodies, respectively, with q-value < 0.05. The most significantly enriched GO terms and KEGG pathways in each of the three developmental stages are listed in Table S12. For example, GO term nucleotide binding (GO:0000166), protein binding (GO:0005515) and transcription regulator activity (GO:0030528) were found to be enriched in both mycelia and primordia. FAD binding (GO:0050660) was only enriched in mycelia. GTP cyclohydrolase II activity (GO:0003935) and GTP binding (GO:0005525) were only enriched in primordia. NAD or NADH binding (GO:0051287) and phospholipid biosynthetic process (GO:0008654) were only enriched in fruiting bodies. KEGG pathway biosynthesis of secondary metabolites (ko01110) was found to be enriched in the mycelia and primordia. Oxidation phosphorylation (ko00190) and nitrogen metabolism (ko00910) were only significantly enriched in fruiting bodies. These results suggested that milRNAs widely participated in many biological processes in a developmental-stage specific manner in G. lucidum.

Correlation analysis of expression profiles of milRNAs and their target genes To explore the variable trend of the transcriptome at three developmental stages in G. lucidum, milRNAs identified in G. lucidum were applied to differential expression analysis by using t-test followed by multiple comparison correction with FDR (q). In total, 111 milRNAs were significantly differentially expressed (q < 0.05) across the three developmental stages (Table S13). A total of 13, 62 and 36 milRNAs exhibited the highest expression in the mycelia, primordia and fruiting bodies, respectively. The expression profiles of 183 of 424 (43.8%) milRNA-target pairs were significantly positively correlated (r ≥ 0.9, q < 0.01), whereas those of 241 of 424 (56.2%) milRNA-target pairs were negatively correlated (r ≤ −0.9, q < 0.01) (Table S5). Furthermore, the expression levels of milRNA-target pairs were found significantly positively and negatively correlated, consistent with the results of previous findings (Han et al. 2016).

13 / 35

Discussion Integrated approach for identifying milRNAs in G. lucidum Compared with milRNAs identified from plants, animals and viruses (Griffiths-Jones et al. 2006; Zhang et al. 2009; Zhang et al. 2008b; Zhu et al. 2009), less attention has been paid on the fungi, especially in basidiomycetes. In this study, we analysed the transcriptome, small RNA and degradome sequencing data to identify milRNAs and their targets in G. lucidum. A total of 168 unique milRNAs were identified. However, these G. lucidum milRNAs had no homology with those in the miRBase database (Griffithsjones et al. 2008), even with those from N. crassa (Lee et al. 2010) and G. sinense (Zhu et al. 2015). This result suggested that milRNAs may have evolved independently in basidiomycetes. Identification of milRNA targets is a critical way to uncover the regulatory roles of milRNAs. Through degradome analysis, 130 milRNAs were found to target 2,394 sequences of 1,804 specific genes. The identified target genes were mainly involved in ‘metabolic process’ (Behringer et al. 2015), ‘oxidation reduction’ (Bel et al. 2013), ‘FAD binding’ (Manavalan et al. 2015), ‘GTP binding’ (Reyes-Medina & Macías-Sánchez 2015; Richthammer et al. 2012; Macías-Sánchez et al. 2011) and ‘biosynthesis of secondary metabolites’ (Hošťálek et al. 1969), which are closely related to the triterpenoid, polysaccharide biosyntheses and lignin degradation pathway in G. lucidum. These results, for the first time, demonstrated that milRNAs might play important roles in the biological processes of G. lucidum.

Validation of the predicted milRNA by bioinformatics analyses We use the stem-loop RT-PCR to analyze 42 milRNA, only three of them were validated (Figure 4). This might suggest that the stem-loop RT-qPCR might not be appropriate to validate the milRNAs predicted in G. lucidum. As a result, additional bioinformatics analyses were conducted to determine the reliability of the predicted milRNA. We selected 12 milRNA for further analyses. We detected the different expression level of the selected milRNA in different developmental stage by using the sRNA sequencing data. The same trend of expression level was also validated by stem-loop RT-qPCR results (Figure 7). For the selected milRNA in the genome of G. lucidum, only one copy was detected. Furthermore, the consensus precursors were also detected in the lncRNA sequencing data. This indicated the high abundance levels of the predicted milRNA might result from the highly expression levels of the milRNA genes. The expression levels of most milRNAs had the significantly positive or negative relationship with the expression level of their target genes. For the cleaved transcripts analysis, all the target genes had reads which were mapped to the cleaved positions. Overall, these results suggested that these predicted milRNAs might represent truly functional small RNAs.

Features of milRNAs in G. lucidum In terms of length, miRNAs are nearly 21 nt in length and are generated from a precursor with the characteristic hairpin or stem-loop structure in plants and animals (Bartel 2004) . In our study, the length of milRNAs in G. lucidum peaked range from 21 nt to 22 nt, consistent with those reported previously in plants, animals and fungi (Wu et al. 2012; Zhou et al. 2016). 14 / 35

In terms of 5′ base bias, 30.5% milRNAs had a strong preference for 5′ uracil in the first position, consistent with those described for Arabidopsis (Mi et al. 2008). This bias suggested a loading affinity for AGO1 in G. lucidum. Furthermore, 15.48% and 28.57% of the first 5′ nucleotides of milRNAs in G. lucidum are adenine or cytosine, consistent with those described in Arabidopsis (Mi et al. 2008). This bias suggested a preferential binding for AGO2, AGO4 and AGO5. Additionally, 25% of milRNAs have guanine. This finding was also consistent with those reported previously (Lin et al. 2015). Interestingly, two intergenic regions in G. lucidum genome were found to be bidirectionally transcribed (Table S3). Precursors GaLu96scf_6_825138_825188 and GaLu96scf_3_1794605_1794660 can generate milRNA007 and milRNA002, respectively, whereas their antisense transcripts produce milRNA008 and milRNA003, respectively. Although bidirectionally transcribed milRNAs have been reported in animals and plants (Alexander et al. 2008; David M et al. 2008; Xie et al. 2010; Zhang et al. 2008a), they have not been reported to our knowledge in fungi, especially in basidiomycetes.

Potential biosynthesis mechanism of milRNAs in G. lucidum The biosynthesis pathways of milRNA had been studied in the model fungus N. crassa in detail (Lee et al. 2010). Surprisingly, milRNAs were generated by at least four different pathways that required a distinct combination of factors, including Dicers, the exonuclease QIP, QDE2 (an Argonaute protein) and MRPL3 (an RNase III domain-containing protein). However, information is lacking on the components of the milRNA production in basidiomycetes, especially in Ganoderma. In this study, we identified three Dicer-like and five Argonaute protein homologues in G. lucidum based on sequence similarity. As a result, the synthesis pathway of milRNA in G. lucidum might be similar to those found in T. reesei (Kang et al. 2013), N. crassa (Lee et al. 2010), S. sclerotiorum (Zhou et al. 2012a) and M. anisopliae (Zhou et al. 2012b). Our finding provides valuable information for the further characterisation of homologous Dicer-like and Argonaute proteins in G. lucidum.

Potential functions of milRNA in fungi biology Several earlier reports suggest that milRNAs function generally in post-transcriptional gene silencing in filamentous fungi Sclerotinia sclerotiorum (Zhou et al. 2012a), Trichoderma reesei (Kang et al. 2013), Metarhizium anisopliae (Zhou et al. 2012b), Penicillium marneffei (Lau et al. 2013) and basidiomycetous fungus Cryptococcus neoformans (Dumesic et al. 2013); (Jiang et al. 2012). Later reports further suggest that miRNAs are involved in a variety of essential cellular processes, such as the regulation of development and morphology, toxin generation, pathogenesis and etc. In the development study of Antrodia cinnamomea, 4 predicted conserved miRNAs and 63 novel predicted milRNAs from two states of A. cinnamomea were identified. Through target prediction, several important key enzymes or regulatory factors were predicted to be regulated by the miRNAs. The target genes may be candidates for improvement in triterpenoids and secondary metabolite synthesis, as well as regulatory 15 / 35

factors of mycelium growth and the sexual regulation (Lin et al. 2015). The milRNAs also play vital roles in sexual development in Cordyceps militaris. For example, C. militaris could not form fruiting bodies after the disruption of milR4, while the perithecium was formed in advance after over-expression of the milR4. Abnormal pale yellow fruiting body primordium, covered with abnormal primordium, was formed in the strain with miR16 disruption (Shao et al. 2019). The pathogenic mechanisms of milRNAs in the rice sheath blight pathogen have been studies. Four potential mechanisms have been proposed. Firstly, ABC-G transporter and hydrolase involved in drug resistance and translocation might be regulated by milRNAs. Secondly, virulence-associated factors might be regulated by milRNAs. Thirdly, transcription factors that regulate pathogenic factors during infection might be regulated by milRNAs. Lastly, cell wall-degrading enzymes involving in fungal pathogenicity might be regulated by milRNAs (Lin et al. 2016b). For the regulation of toxin generation, the milRNA of Fon-miR7696a-3p and FonmiR6108a in Fusarium oxysporum f. sp. niveum (Fon) can down-regulate two interesting genes involved in trichothecene production, necrosis and ethylene-inducing peptide 1 (NEP1) biosynthesis. The negative correlation of the expression levels between these two milRNAs and their potential target genes imply that they play a role in trichothecene and NEP1 biosynthesis (Jiang et al. 2017).

Potential functions of milRNAs in G. lucidum In this study, we identified 111 milRNAs that were significantly differentially expressed across the three developmental stages. Stem-loop RT-PCR analysis for most of the expression profiles of the differentially expressed milRNAs and their targets confirmed those obtained from high-throughput sequencing results. Interestingly, the correlation between milRNAs and their putative target genes can be either positive or negative. For example, eight milRNAtarget pairs exhibited negative relationship at the expression level. For example, milRNA139GL21192 and milRNA061-GL22248 were related to the copper radical oxidase and the benzoquinone reductase in lignin degradation pathway (Kersten & Dan 2014; Mori et al. 2016), respectively. The milRNA154-GL23376 was associated with squalene monooxygenase in MVA pathway (Shiao et al. 1994), and milRNA152-GL23023 was related to RHO1 GDP-GTP exchange protein in polysaccharide biosynthesis pathway (MacíasSánchez et al. 2011; Reyes-Medina & Macías-Sánchez 2015; Richthammer et al. 2012). By contrast, the expression profiles of the other eight milRNA-target pairs were positively correlated. For instance, milRNA020-GL22248, milRNA159-GL24922 and mlRNA160GL20535 were associated with benzoquinone reductase in lignin degradation pathway (Mori et al. 2016), glutaryl-CoA synthase 3-hydroxy-3-methyl in MVA pathway (Ren et al. 2013) and 1,3-beta-glucan synthase in polysaccharide pathway (Moradali et al. 2007), respectively. Whether or not these observations represent novel mechanisms of milRNAs or are simply the results of non-causative association will be the subject of future study. These findings suggest that milRNAs may play crucial regulatory roles in the biological process of G. lucidum through complex networks. 16 / 35

Technical considerations for this study Several technical limitations should be considered when utilizing the data reported here. Firstly, the validation success rate is relative low. One possible reason is that milRNAs might be expressed in a temporal or developmental stage-specific manner (Bartel 2004; Carrington & Ambros 2003; Kaufman & Miska 2010; Reinhart et al. 2002). As a result, they might not be identified from the conditions under which we extracted RNA sample. Another possible reason is that the methods to detect milRNAs are not optimized to detect milRNAs in G. ludicum. Secondly, the functions of these milRNAs have not been studied in details due to the lack of transformation system for G. lucidum. The construction of milRNA overexpression or knockout transgenic lines for particular milRNAs as well as proteins such as Dicer, Agonaute identified in G. lucidum are needed to determine their functions in the future. Thirdly, we have not been able to find any of the milRNAs being conserved. There are two possible reasons. One is that some of these 168 milRNA might not be real. Another reason is possible that there are not enough milRNAs being identified in fungi until now. With the increase of milRNAs identified from fungi, we might be able to identify some of these milRNAs being conserved in the future. Many of the twelve milRNAs are not shared in the Mycelia stage, these milRNAs, if real, are likely to play roles in the primordia and fruiting bodies. These suggest that lignin degradation, biosynthesis of polysaccharides and MVA pathway are more active in the later two stages than those in the first stages. However, the exact roles need to be clarified by knocking-down knocking-out of these genes, which will be subject of future studies.

Acknowledgements The work was supported by the CAMS Innovation Fund for Medical Sciences (2016-I2M-3016) and fund from National Science Foundation of China (81872966).

Author Contributions CL and SL conceived and designed this study. JS extracted RNA for next-generation sequencing and performed reference-based small RNA-Seq and degradome data analysis. LW, YL and QQ performed the stem-loop RT-PCR experiments. JS and LW carried out the differential expression analysis for the candidate milRNA. JS, LW and CL wrote the manuscript. SL and BW critically reviewed the manuscript. All authors have read the work and agreed with its contents.

References Addo-Quaye, C., Eshoo, T. W. et al., 2008. Endogenous siRNA and miRNA targets identified by sequencing of the Arabidopsis degradome. Current Biology. 18, 758-762. Addoquaye, C., Miller, W. et al., 2009. CleaveLand: a pipeline for using degradome data to find 17 / 35

cleaved small RNA targets. Bioinformatics. 25, 130-131. Alexander, S., Natascha, B. et al., 2008. A single Hox locus in Drosophila produces functional microRNAs from opposite DNA strands. Genes & Development. 22, 8-13. Allen, E., Xie, Z. et al., 2005a. microRNA-directed phasing during trans-acting siRNA biogenesis in plants. Cell. 121, 207-221. Allen, E., Xie, Z. et al., 2005b. microRNA-directed phasing during trans-acting siRNA biogenesis in plants. Cell. 121, 207-221. Altuvia, Y., Landgraf, P. et al., 2005. Clustering and conservation patterns of human microRNAs. Nucleic Acids Research. 33, 2697-2706. Ambros, V., Bartel, B. et al., 2003. A uniform system for microRNA annotation. RNA. 9, 277. Angélique, G., Ravi, S. et al., 2006. A germline-specific class of small RNAs binds mammalian Piwi proteins. Nature. 442, 199-202. Aravin, A. A., Hannon, G. J. et al., 2007. The Piwi-piRNA pathway provides an adaptive defense in the transposon arms race. Science. 318, 761-764. Bartel, D. P., 2004. MicroRNAs: genomics, biogenesis, mechanism, and function. Cell. 116, 281-297. Behringer, D., Zimmermann, H. et al., 2015. Differential gene expression reveals candidate genes for drought stress response in Abies alba (Pinaceae). Plos One. 10, e0124564. Bel, Y., Jakubowska, A. K. et al., 2013. Comprehensive analysis of gene expression profiles of the beet armyworm Spodoptera exigua larvae challenged with Bacillus thuringiensis Vip3Aa toxin. Plos One. 8, e81927. Blake C. Meyers, M. J. A., et, al, 2008. Criteria for annotation of plant microRNAs. Plant Cell. 20, 31863190. Cannell, I. G., Wen, K. Y. et al., 2007. How do microRNAs regulate gene expression? Sciences Stke Signal Transduction Knowledge Environment. 2007, re1. Carrington, J. C., and Ambros, V., 2003. Role of microRNAs in plant and animal development. Science. 301, 336-338. Caterina, C., Massimiliano, P. et al., 2004. Redundancy of the two dicer genes in transgene-induced posttranscriptional gene silencing in Neurospora crassa. Molecular & Cellular Biology. 24, 2536-2545. Chen, C., Ridzon, D. A. et al., 2005. Real-time quantification of microRNAs by stem-loop RT-PCR. Nucleic Acids Research. 33, e179. Chen, S., Xu, J. et al., 2012. Genome sequence of the model medicinal mushroom Ganoderma lucidum. Nature Communications. 3, 913. David M, T., Katsutomo, O. et al., 2008. Functionally distinct regulatory RNAs generated by bidirectional transcription and processing of microRNA loci. Genes & Development. 22, 2636. Dumesic, P. A., Natarajan, P. et al., 2013. Stalled spliceosomes are a signal for RNAi-mediated genome defense. Cell. 152, 957. Frazier, T. P., Xie, F. et al., 2010. Identification and characterization of microRNAs and their target genes in tobacco (Nicotiana tabacum). Planta. 232, 1289-1308. FriedlanNder, M. R. M. S., Li N, Chen W, Rajewsky N, 2012. miRDeep2 accurately indentifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic Acids Research. 40, 37-52. Griffiths-Jones, S., Grocock, R. J. et al., 2006. miRBase: microRNA sequences, targets and gene 18 / 35

nomenclature. Nucleic Acids Research. 34, 140-144. Griffithsjones, S., Saini, H. K. et al., 2008. miRBase: tools for microRNA genomics. Nucleic acids research. 36, D154. Han, X., Yin, H. et al., 2016. Integration of small RNAs, degradome and transcriptome sequencing in hyperaccumulator Sedum alfredii uncovers a complex regulatory network and provides insights into cadmium phytoremediation. Plant Biotechnology Journal. 14, 1470-1483. Hošťálek, Z., Ryabushko, T. A. et al., 1969. Regulation of biosynthesis of secondary metabolites. Folia Microbiologica. 14, 121. Jiang, J., Lv, M. et al., 2014. Identification of novel and conserved miRNAs involved in pollen development in Brassica campestris ssp. chinensis by high-throughput sequencing and degradome analysis. BMC Genomics. 15, 146. Jiang, N., Yang, Y. et al., 2012. Identification and functional demonstration of miRNAs in the fungus Cryptococcus neoformans. Plos One. 7, e52734. Jiang, P., Wu, H. et al., 2007. MiPred: classification of real and pseudo microRNA precursors using random forest prediction model with combined features. Nucleic Acids Research. 35, W339. Jiang, X., Qiao, F. et al., 2017. MicroRNA-like RNAs in plant pathogenic fungus Fusarium oxysporum f. sp. niveum are involved in toxin gene expression fine tuning. 3 Biotech. 7, 354. Jonesrhoades, M. W., Bartel, D. P. et al., 2006. MicroRNAS and their regulatory roles in plants. Annual Review of Plant Biology. 57, 19-53. Kang, K., Zhong, J. et al., 2013. Identification of microRNA-Like RNAs in the filamentous fungus Trichoderma reesei by solexa sequencing. Plos One. 8, e76288. Kaufman, E. J., and Miska, E. A., 2010. The microRNAs of Caenorhabditis elegans. Seminars in Cell & Developmental Biology. 21, 728. Kaur, H., and Kaur, G., 2016. Application of ligninolytic potentials of a white-rot fungus Ganoderma lucidum for degradation of lindane. Environmental Monitoring & Assessment. 188, 588. Kersten, P., and Dan, C., 2014. Copper radical oxidases and related extracellular oxidoreductases of wood-decay Agaricomycetes. Fungal Genetics & Biology. 72, 124-130. Lagosquintana, M., Rauhut, R. et al., 2001. Identification of novel genes coding for small expressed RNAs. Science. 294, 853-858. Lau, N. C., Lim, L. P. et al., 2001. An abundant class of tiny RNAs with probable regulatory roles in Caenorhabditis elegans. Science. 858-862. Lau, S. K. P., Chow, W. N. et al., 2013. Identification of microRNA-Like RNAs in mycelial and yeast phases of the thermal dimorphic fungus Penicillium marneffei. Plos Neglected Tropical Diseases. 7, e2398. Lee, H. C., Li, L. et al., 2010. Diverse pathways generate microRNA-like RNAs and Dicer-independent small interfering RNAs in fungi. Molecular Cell. 38, 803-814. Lee, R. C., and Ambros, V., 2001. An extensive class of small RNAs in Caenorhabditis elegans. Science. 294, 862. Lewis, B. P., Shih, I. H. et al., 2003. Prediction of mammalian microRNA targets. Cell. 115, 787-798. Li, J., Wu, B. et al., 2014. Genome-wide identification and characterization of long intergenic noncoding RNAs in Ganoderma lucidum. Plos One. 9, e99442. Liang, C., Zhang, X. et al., 2010. Identification of miRNA from Porphyra yezoensisby high-throughput sequencing and bioinformatics analysis. Plos One. 5, e10698. Lin, P. C., Lu, C. W. et al., 2016a. Identification of miRNAs and their targets in the liverwort Marchantia 19 / 35

polymorpha by integrating RNA-seq and degradome analyses. Plant & Cell Physiology. 57, 339-358. Lin, R., He, L. et al., 2016b. Comprehensive analysis of microRNA-seq and target mRNAs of rice sheath blight pathogen provides new insights into pathogenic regulatory mechanisms. DNA Research. Lin, Y. L., Ma, L. T. et al., 2015. MicroRNA-like small RNAs prediction in the development of Antrodia cinnamomea. Plos One. 10, e0123245. Livak, K. J., and Schmittgen, T. D., 2001. Analysis of relative gene expression data using real-time quantitative PCR and the 2(-Delta Delta C(T)) Method. Methods. 25, 402-408. Lu, C., Tej, S. S. et al., 2005. Elucidation of the small RNA component of the transcriptome. Science. 309, 1567-1569. Ma, Z. R., Coruh, C. et al., 2010. Arabidopsis lyrata small RNAs: transient mIRNA and small interfering RNA loci within the Arabidopsis Genus. Plant Cell. 20, 1090-1103. Macías-Sánchez, K., García-Soto, J. et al., 2011. Rho1 and other GTP-binding proteins are associated with vesicles carrying glucose oxidase activity from Fusarium oxysporum f. sp. lycopersici. Antonie Van Leeuwenhoek. 99, 671-680. Macrae, I. J., Zhou, K. et al., 2006. Structural basis for double-stranded RNA processing by Dicer. Science. 311, 195-198. Maiti, M., Lee, H. C. et al., 2007. QIP, a putative exonuclease, interacts with the Neurospora Argonaute protein and facilitates conversion of duplex siRNA into single strands. Genes Dev. 21, 590-600. Manavalan, T., Manavalan, A. et al., 2015. Characterization of lignocellulolytic enzymes from white-rot fungi. Current Microbiology. 70, 485-498. Mao, W., Li, Z. et al., 2012. A combined approach of high-throughput sequencing and degradome analysis reveals tissue specific expression of microRNAs and their targets in cucumber. Plos One. 7, e33040. Mello, C. C., and Conte, D., 2004. Revealing the world of RNA interference. Nature. 431, 338-342. Mi, S., Cai, T. et al., 2008. Sorting of small RNAs into Arabidopsis argonaute complexes is directed by the 5'-terminal nucleotide. Cell. 133, 116-127. Moradali, M. F., Mostafavi, H. et al., 2007. Immunomodulating and anticancer agents in the realm of macromycetes fungi (macrofungi). International Immunopharmacology. 7, 701-724. Mori, T., Koyama, G. et al., 2016. Effects of homologous expression of 1,4-benzoquinone reductase and homogentisate 1,2-dioxygenase genes on wood decay in hyper-lignin-degrading fungus phanerochaete sordida YK-624. Current Microbiology. 73, 512-518. Moss, E. G., Lee, R. C. et al., 1997. The cold shock domain protein LIN-28 controls developmental timing in C. elegans and is regulated by the lin-4 RNA. Cell. 88, 637-646. Mu, D. S., Li, C. et al., 2015. Bioinformatic identification of potential microRNAs and their targets in the Lingzhi or Reishi medicinal mushroom Ganoderma lucidum (higher Basidiomycetes). International Journal of Medicinal Mushrooms. 17, 783-797. Poulsen, C., Vaucheret, H. et al., 2013. Lessons on RNA silencing mechanisms in plants from eukaryotic argonaute structures. Plant Cell. 25, 22-37. Reinhart, B. J., Weinstein, E. G. et al., 2002. MicroRNAs in plants. Genes & Development. 16, 1616. Ren, A., Ouyang, X. et al., 2013. Molecular characterization and expression analysis of GlHMGS, a gene encoding hydroxymethylglutaryl-CoA synthase from Ganoderma lucidum (Ling-zhi) in 20 / 35

ganoderic acid biosynthesis pathway. World Journal of Microbiology & Biotechnology. 29, 523-531. Reyes-Medina, M. A., and Macías-Sánchez, K. L., 2015. GTPase Rho1 regulates the expression of xyl3 and laccase genes in Fusarium oxysporum. Biotechnology Letters. 37, 679-683. Richter, C., Wittstein, K. et al., 2015. An assessment of the taxonomy and chemotaxonomy of Ganoderma. Fungal Diversity. 71, 1-15. Richthammer, C., Enseleit, M. et al., 2012. RHO1 and RHO2 share partially overlapping functions in the regulation of cell wall integrity and hyphal polarity in Neurospora crassa. Molecular Microbiology. 85, 716-733. Roozbeh, H., Nor Azah, Y. et al., 2013. Detection and control of Ganoderma boninense: strategies and perspectives. Springerplus. 2, 555. Sarah W, B., Jennifer, D. et al., 2013. Rfam 11.0: 10 years of RNA families. Nucleic Acids Research. 41, 226-232. Shao, J., Chen, H. et al., 2017. Genome-wide identification and characterization of natural antisense transcripts by strand-specifc RNA sequencing in G. lucidum. Scientific Reports. 7:5711, Shao, Y., Tang, J. et al., 2019. milR4 and milR16 mediated fruiting body development in the medicinal fungus Cordyceps militaris. Frontiers in Microbiology. 10, 83. Shiao, M. S., Lee, K. R. et al. 1994. Natural products and biological activities of the Chinese medicinal fungus Ganoderma lucidum. Tanzer, A., Amemiya, C. T. et al., 2005. Evolution of microRNAs located within Hox gene clusters. Journal of Experimental Zoology Part B Molecular & Developmental Evolution. 304, 75-85. Tanzer, A., and Stadler, P. F., 2004. Molecular evolution of a microRNA cluster. Journal of Molecular Biology. 339, 327-335. Tong, L., Hu, J. et al., 2016. Identification of microRNA-like RNAs from Curvularia lunata associated with maize leaf spot by bioinformation analysis and deep sequencing. Molecular Genetics & Genomics Mgg. 291, 587-596. Wang, L., Xu, X. et al., 2018. Integrated microRNA and mRNA analysis in the pathogenic filamentous fungus Trichophyton rubrum. BMC Genomics. 19, 933. Wang, M., Wu, B. et al., 2015. Identification of m RNA-like non-coding RNAs and validation of a mighty one named MAR in Panax ginseng. Journal of Integrative Plant Biology. 57, 256-270. Wu, B., Wang, M. et al., 2012. High-throughput sequencing and characterization of the small RNA transcriptome reveal features of novel and conserved microRNAs in Panax ginseng. Plos One. 7, e44385. Xia, H., Zhang, L. et al., 2016. Genome-wide identification and characterization of microRNAs and target genes in Lonicera japonica. Plos One. 11, e0164140. Xia, Q., Zhang, H. et al., 2014. A comprehensive review of the structure elucidation and biological activity of triterpenoids from Ganoderma spp. Molecules. 19, 17478-17535. Xie, F., Frazier, T. P. et al., 2010. Identification and characterization of microRNAs and their targets in the bioenergy plant switchgrass (Panicum virgatum). Planta. 232, 417-434. Xu, J., Xu, Z. et al., 2014. Identification and evaluation of reference genes for qRT-PCR normalization in Ganoderma lucidum. Current Microbiology. 68, 120-126. Zhang, B., Pan, X. et al., 2008a. Identification of soybean microRNAs and their targets. Planta. 229, 161-182. Zhang, B. H., Pan, X. P. et al., 2006. Evidence that miRNAs are different from other RNAs. Cellular & 21 / 35

Molecular Life Sciences Cmls. 63, 246-254. Zhang, H., Yang, J. H. et al., 2009. Genome-wide analysis of small RNA and novel microRNA discovery in human acute lymphoblastic leukemia based on extensive sequencing approach. Plos One. 4, e6849. Zhang, W., Zheng, Y. et al., 2008b. Identification of novel and candidate miRNAs in rice by high throughput sequencing. BMC Plant Biology. 8, 25. Zhou, J., Fu, Y. et al., 2012a. Identification of microRNA-like RNAs in a plant pathogenic fungus Sclerotinia sclerotiorum by high-throughput sequencing. Molecular Genetics & Genomics. 287, 275-282. Zhou, Q., Wang, Z. et al., 2012b. Genome-wide identification and profiling of microRNA-like RNAs from Metarhizium anisopliae during development. Fungal Biology. 116, 1156-1162. Zhou, R., Wang, Q. et al., 2016. Identification of miRNAs and their targets in wild tomato at moderately and acutely elevated temperatures by high-throughput sequencing and degradome analysis. Scientific Reports. 6, 33777. Zhou, X. W., Cong, W. R. et al., 2013. Ligninolytic enzymes from Ganoderma spp: current status and potential applications. Critical Reviews in Microbiology. 39, 416-426. Zhu, J. Y., Pfuhl, T. N. et al., 2009. Identification of novel Epstein-Barr virus microRNA genes from nasopharyngeal carcinomas. Journal of Virology. 83, 3333. Zhu, Y., Xu, J. et al., 2015. Chromosome-level genome map provides insights into diverse defense mechanisms in the medicinal fungus Ganoderma sinense. Scientific Reports. 5, 11087.

22 / 35

Tables and Figure legends Table 1 Location of the 12 validated milRNA identified by aligning the genome of G. lucidum milRNA ID

Scaffolda ID

milRNA Start

End

milRNA008 GaLu96scf_6 21 1 milRNA020 GaLu96scf_6 19 1 milRNA042 GaLu96scf_32 19 1 milRNA061 GaLu96scf_3 1 19 milRNA063 GaLu96scf_7 1 18 milRNA120 GaLu96scf_1 20 1 milRNA139 GaLu96scf_29 1 20 milRNA143 GaLu96scf_6 1 21 milRNA152 GaLu96scf_15 18 1 milRNA154 GaLu96scf_5 1 19 milRNA159 GaLu96scf_29 18 1 milRNA160 GaLu96scf_4 1 20 GaLu: Ganoderma lucidum a: the name of the assembled scaffold of G. lucidum genome

Scaffold Start

End

825,165 597,420 163,593 1,802,849 1,207,581 3,782,218 176,977 1,114,695 956,058 18,169 148,760 977,419

825,185 597,438 163,611 1,802,867 1,207,598 3,782,237 176,996 1,114,715 956,075 18,187 148,777 977,438

23 / 35

milRNA length (bp) 21 19 19 19 18 20 20 21 18 19 18 20

Alignment length (bp) 21 19 19 19 18 20 20 21 18 19 18 20

Identity (%)

Location

100 100 100 100 100 100 100 100 100 100 100 100

intergenic exon exon intergenic exon intergenic exon intergenic exon intron exon intron

Table 2 The numbers of reads mapped to the milRNA precursors in different developmental stages of G. lucidum. Precursors of milRNA

Total reads number of precursors in different developmental stages M P FB milRNA008 precursor 5 113 88 milRNA020 precursor 22 105 447 milRNA042 precursor 0 219 145 milRNA061 precursor 15 465 180 milRNA063 precursor 0 19 26 milRNA120 precursor 6 102 71 milRNA139 precursor 0 965 872 milRNA143 precursor 12 143 173 milRNA152 precursor 0 681 416 milRNA154 precursor 5 128 141 milRNA159 precursor 40 747 640 milRNA160 precursor 0 53 45 M: Mycelia, P: Primordia, FB: Fruiting Bodies

Total reads number 206 574 364 660 45 179 1,837 328 1,097 274 1,427 98

Table 3 The expression levels of the 12 milRNAs in different developmental stages of G. lucidum. milRNA ID

Total number of milRNA reads in different developmental stages M P milRNA008 133 262 milRNA020 1 26 milRNA042 0 16 milRNA061 15 32 milRNA063 0 37 milRNA120 0 3 milRNA139 0 12 milRNA143 0 12 milRNA152 0 6 milRNA154 0 10 milRNA159 0 18 milRNA160 0 2 M: Mycelia, P: Primordia, FB: Fruiting Bodies 24 / 35

FB 141 70 14 65 31 5 38 27 20 13 0 14

Total number of reads 536 97 30 112 68 8 50 39 26 23 18 16

Table 4 The classification of cleaved transcripts of milRNA target genes. milRNA ID

Total No. of cleaved positions

No. of cleaved No. of cleaved No. of cleaved No. of cleaved No. of cleaved positions in positions in positions in positions in positions in category 0 category 1 category 2 category 3 category 4 milRNA008 2 0 0 0 1 1 milRNA020 15 0 0 4 2 9 milRNA042 44 1 0 13 11 19 milRNA061 33 0 0 8 5 19 milRNA063 24 0 0 2 5 17 milRNA120 21 0 0 6 4 11 milRNA139 29 0 0 6 6 17 milRNA143 74 0 0 22 12 40 milRNA152 75 2 1 18 11 43 milRNA154 118 0 1 18 19 80 milRNA159 202 0 2 49 43 108 milRNA160 132 0 4 31 24 73 Note: Category 0, over one raw read mapped to the cleaved position. The abundance at the position is equal to the maximum on the transcript, and only one maximum on the transcript is found. Category 1, over one raw read mapped to the cleaved position. Abundance at the position is equal to the maximum on the transcript, and more than one maximum position is found on the transcript. Category 2, over one raw read mapped to the cleaved position. Abundance at the position is less than the maximum but higher than the median for the transcript. Category 3, over one raw read mapped to the cleaved position. Abundance at the position is equal to or less than the median for the transcript. Category 4, only one raw read mapped to the cleaved position (Addoquaye et al. 2009).

25 / 35

Figure 1 Length distribution of sRNAs across the three developmental stages in G. lucidum. A: Total sequences; B: Unique sequences. Figure 2 Venn diagrams of milRNAs identified across the three developmental stages of G. lucidum. M: mycelia; P: primordia; FB: fruiting bodies. Figure 3 Characterisation of milRNAs identified across the three developmental stages of G. lucidum. A: Length distribution of milRNA identified across the three developmental stages of G. lucidum. B: First nucleotide bias of milRNAs identified in G. lucidum. C: Length distribution of milRNA precursors identified in G. lucidum. D: Distribution of milRNA precursors in the genome of G. lucidum.

Figure 4 Validation results of five milRNAs by stem-loop RT-PCR and Sanger Sequencing methods. (A) The gel electrophoresis results of PCR products using Total RNA treated with DNAase I and cDNA as template for five predicted milRNAs. Six gels were shown, five of them are for each of the predicted milRNAs, and the last one is for an internal control gene RPL4 as described previously (Xu et al. 2014). In each gel, the names of the predicted milRNAs (milRNA024, milRNA033, milRNA038, milRNA067 and milRNA137) are shown on the top of each gel. For each gel picture, four lanes are shown: lane ‘M’ shows the results of DNA Marker (DL1000, Takara Bio Inc.); lane 1 shows the results using cDNA as template; lane 2 shows the results using water as template; and lane 3 shows the results using Total RNA as template. (B) The chromatograms of Sanger-sequencing results of the PCR products amplified from cDNA template. The four colours represent different nucleotides, green: Adenine (A); blue: Cytidine (C); black: Guanine (G); red: Thymine (T). (C) Comparison of the predicted sequences and those shown in (B) for the five milRNAs. The number in parentheses represented the length of the predicted and those obtained from Sanger sequencing.

Figure 5 Hairpin structure of 12 predicted milRNAs analysed further in this study. The mature milRNAs sequence is highlighted in red, whereas the milRNAs* sequence is highlighted in purple, and the stem-loop region is shown in yellow. (A) milRNA120; (B) milRNA152; (C) milRNA139; (D) milRNA159; (E) milRNA061; (F) milRNA042; (G) milRNA160; (H) milRNA154; (I) milRNA143; (J) milRNA020; (K) milRNA008 and (L) milRNA063. ‘ ’: Lignin degradation pathway; ‘ ’: Polysaccharide biosynthesis pathway; ‘ ’: Mevalonate pathway.

Figure 6 Validation of small RNA-Seq results with stem-loop RT-qPCR experiments. Twelve milRNAs were selected and subjected to stem-loop RT-qPCR analysis. The expression levels obtained from the small RNA-Seq and stem-loop RT-qPCR experiments for these milRNAs are shown in panels A–L. For results from both experiments, the expression levels at each developmental stage were normalised to the mean expression levels across the three stages. In each panel, the name milRNAs is shown at the bottom. The X axis indicates the conditions under which the expression levels were measured, including the developmental stages and technologies used to generate the data. The Y axis indicates the relative expression levels obtained from the small RNA-Seq and stem-loop RT-qPCR experiments. The error bar for the stem-loop RT-qPCR results corresponds to the variations among the three biological replicates, each of which has three technical replicates. M: mycelia; P: primordia; FB: fruiting bodies.

Figure 7 Correlation of expression profiles for milRNAs and their target genes. Panels A–P show the expression files of milRNAs and their target genes from 16 milRNAtarget pairs. For each panel, the X axis indicates the corresponding developmental stage. The Y axis indicates the relative expression levels. The names of the milRNA-target pairs and the 26 / 35

Pearson’s correlation coefficient between their expression profiles are shown at the top. The error bar denotes the standard deviations among the three biological replicates, each of which has three technical replicates in the RT-qPCR experiments. ‘ ’: Lignin degradation pathway; ‘ ’: Polysaccharide biosynthesis pathway; ‘ ’: Mevalonate pathway; M: mycelia; P: primordia; FB: fruiting bodies.

27 / 35

Figures

Figure 1

28 / 35

Figure 2

29 / 35

Figure 3

30 / 35

Figure 4

31 / 35

Figure 5

32 / 35

61

Figure 6

33 / 35

Figure 7

34 / 35

Highlights: 1) For the first time, Identied the milRNAs and their targets experimentally in 2) 3) 4) 5)

G. lucidum. Identified many miRNA features that may be unique to basidiomycetes. The milRNAs’ functions are developmental stages-dependent. Different mechanism of regulation for miRNA might be present. Conserved miRNA biosynthesis and processing machineries are present in basdiomycetes.

35 / 35