Testes transcriptome profiles of the anadromous fish Coilia nasus during the onset of spermatogenesis


MARGEN-00348; No of Pages 3 Marine Genomics xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Marine Genomics journal homepage: www.elsevier.com/locate/margen

Genomics/Technical resources

Yan-Feng Zhou a, Jin-Rong Duan a, Kai Liu a, Dong-Po Xu a, Min-Ying Zhang a, Di-An Fang a,b,⁎, Pao Xu a,b,⁎

a Freshwater Fisheries Research Center, Chinese Academy of Fishery Sciences, Wuxi, Xuejiali 69, 214128, China
b Wuxi Fisheries College, Nanjing Agricultural University, Wuxi, Xuejiali 69, 214128, China

Article info

Article history: Received 10 April 2015; Received in revised form 26 May 2015; Accepted 8 June 2015; Available online xxxx

Keywords: Illumina sequencing; Transcriptome; Coilia nasus; Spermatogenesis

Abstract

RNA-Seq technology has been widely applied in transcriptomics, genomics and functional gene studies. Here, we performed de novo transcriptome sequencing of the testis of Coilia nasus, producing 23,842,172 clean reads representing a total of 4,815,798,404 (4.8 Gb) nucleotides. These Illumina reads were assembled into 194,636 unigenes, of which 42,642 were annotated using Blastx and ESTScan. By applying Blast analysis and functional annotation (e.g., GO, COG, SwissProt and KEGG) against gene catalogs from other species, we obtained an extensive and diverse catalog of expressed genes for C. nasus, representing a large proportion of the genes involved in the onset of spermatogenesis. The results provide general clues to the potential molecular mechanisms of spermatogenesis in this species.

© 2015 Elsevier B.V. All rights reserved.

1. Introduction

Coilia nasus is an anadromous fish of the family Engraulidae, order Clupeiformes (Jiang et al., 2012). It is an important fishery resource, valued for its nutritional quality and as a delicacy (Xu et al., 2011). The fish migrates annually from the sea to the Yangtze River and its affiliated lakes in China (Zhong et al., 2007; Yang et al., 2011). It reaches sexual maturity at 2–3 years of age and migrates between April and October, spawning once a year (Liu et al., 2014). Mature individuals migrate hundreds of miles upstream in rivers such as the Yangtze and spawn in the lower and middle reaches (Li et al., 2007). The Nantong section of the Yangtze River is generally regarded as the starting point of the freshwater migration, and when the fish arrive there the testis is at the onset of spermatogenesis (Li et al., 2007; Jiang et al., 2012). Study of the testis transcriptome is regarded as one of the most valid approaches to addressing the exhaustion of fish germplasm resources (Chalmel et al., 2007; Bissonnette et al., 2009). In this study we describe the de novo assembly and annotation of the C. nasus testis transcriptome and provide general clues to the potential molecular mechanisms of spermatogenesis in this species.

2. Data description

2.1. Fish collection

Healthy fish (five individuals, testis in stage I) were collected from the Nantong (NT) section during the onset of spermatogenesis. Testes were removed surgically, immediately placed in liquid nitrogen and stored at −80 °C until use. RNA was extracted from the testis tissue of each individual and then pooled into a single sample for transcriptome construction.

2.2. Sample preparation

Total RNA was extracted using Trizol Lysis Reagent and then purified with the RNA easy kit (Invitrogen, Beijing, China) according to the manufacturer's instructions. Equal amounts of total RNA purified from each testis were pooled, and mRNA was isolated using the Oligotex mRNA Kit (Invitrogen, Beijing, China). The paired-end library was synthesized using the Genomic Sample Prep kit (Illumina, Beijing, China). RNA quantity and integrity (RNA integrity score, 7.8) were estimated by spectrophotometry (absorbance at 260 nm) and agarose gel electrophoresis, respectively.

2.3. Testes transcriptome sequencing and assembling

⁎ Corresponding authors at: Freshwater Fisheries Research Center, Chinese Academy of Fishery Sciences, Wuxi, Xuejiali 69, 214128, China. Tel./fax: +86 510 85390025. E-mail addresses: [email protected] (D.-A. Fang), [email protected] (P. Xu).

Before constructing the transcriptome, NEBNext Poly(A) mRNA Magnetic Isolation Module (NEB, E7490) was used to concentrate

http://dx.doi.org/10.1016/j.margen.2015.06.007 1874-7787/© 2015 Elsevier B.V. All rights reserved.

Please cite this article as: Zhou, Y.-F., et al., Testes transcriptome profiles of the anadromous fish Coilia nasus during the onset of spermatogenesis, Mar. Genomics (2015), http://dx.doi.org/10.1016/j.margen.2015.06.007


Table 1. Summary of the transcriptome.

Name                        Number of sequences   Mean length (bp)
Total reads                 23,842,172            203
Total nucleotides (nt)      4,815,798,404         –
Total number of unigenes    194,636               688

Annotated databases   All sequences   ≥300 bp   ≥1000 bp
COG                   8,087           2,883     4,469
GO                    26,152          10,807    11,510
KEGG                  13,260          5,380     5,950
SwissProt             24,143          9,728     10,975
nr                    42,291          18,926    14,188
All                   42,642          19,068    14,210

mRNA, and MICROBExpress Bacterial mRNA Enrichment Kit (Invitrogen, AM1905) was used to remove rRNA. Using the enriched mRNA as template, the NEBNext mRNA Library Prep Master Mix Set for Illumina (NEB, E6110) and NEBNext Multiplex Oligos for Illumina (NEB, E7500) were used to construct the sequencing library. A mixed cDNA sample library representing the onset of spermatogenesis was prepared and sequenced using the Illumina HiSeq™ 2000 sequencing technology. De novo transcriptome assembly was carried out with the Short Read Assembling Program (SRAP) de novo (Metzker, 2010). Reads with a certain length of overlap and no uncalled bases (N) were combined to form longer fragments (Cox et al., 2010). These fragments were then connected into scaffolds, with N representing unknown sequence (Simpson et al., 2009). Paired-end reads were used for gap filling of the scaffolds to obtain sequences with the smallest number of Ns. After removal of low-quality sequences (quality score less than 10), the remaining clean reads were assembled using Trinity, as described for de novo transcriptome assembly without a reference genome (Grabherr et al., 2011). In the final step, Blastx alignments between unigenes and sequences in protein databases, including the National Center for Biotechnology Information (NCBI) non-redundant (nr) database, Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG) and Clusters of Orthologous Groups (COG), were performed to determine the sequence direction of the unigenes (Tatusov et al., 2000; Kanehisa et al., 2004; Conesa et al., 2005). When results from different databases conflicted, the priority order nr, Swiss-Prot, KEGG and COG was followed to decide the sequence direction (Metzker, 2010).
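The priority rule for resolving conflicting alignments can be illustrated with a minimal Python sketch. This is not the pipeline used in the study; the function name `choose_direction`, the strand labels '+' and '-', and the example unigenes and hits are all hypothetical and serve only to show how a fixed database order (nr, Swiss-Prot, KEGG, COG) settles a conflict.

```python
from typing import Dict, Optional

# Priority order stated in the text: nr first, then Swiss-Prot, KEGG, COG.
DATABASE_PRIORITY = ("nr", "Swiss-Prot", "KEGG", "COG")


def choose_direction(hits: Dict[str, str]) -> Optional[str]:
    """Return the strand from the highest-priority database that has a hit.

    `hits` maps a database name to the strand ('+' or '-') implied by its
    best Blastx alignment. Databases without a hit are absent from the
    mapping. Returns None when no database produced a hit.
    """
    for database in DATABASE_PRIORITY:
        strand = hits.get(database)
        if strand is not None:
            return strand
    return None


# Hypothetical unigenes with made-up hits, for illustration only.
example_hits = {
    "unigene_1": {"nr": "+", "KEGG": "-"},          # conflict: nr decides
    "unigene_2": {"Swiss-Prot": "-", "COG": "+"},   # conflict: Swiss-Prot decides
    "unigene_3": {},                                # no hit in any database
}

directions = {name: choose_direction(hits) for name, hits in example_hits.items()}
```

In this sketch a unigene with no hit in any database is returned as `None`; such sequences would need a separate orientation step, which the sketch does not attempt.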

Illumina high-throughput second-generation sequencing produced 23,842,172 clean reads representing a total of 4,815,798,404 (4.8 Gb) nucleotides (Table 1). Average read size, Q20 percentage, Q30 percentage and GC content were 98 bp, 100%, 79.93% and 50.05%, respectively. As shown in Table 1, 194,636 unigenes were obtained, with a median length of 547 bp. Altogether, we obtained a fundamental transcriptome resource for further studies of comparative genomics, gene function, spermatogenesis and the mechanisms of anadromous migration.

2.4. Functional unigene annotation

Distinct gene sequences were first searched with Blastx against the NCBI nr, SwissProt, GO, COG and KEGG databases (Conesa et al., 2005; Metzker, 2010), and a total of 42,642 unigenes (21.9% of all unigenes) were functionally annotated via ESTScan analysis. The remaining 78.1% of the unigenes could not be matched to known genes. In the GO classification, 110,856 (57.0%) unigenes were assigned to biological processes, 84,365 (43.3%) to cellular components and 34,664 (17.8%) to molecular functions (Fig. 1 & Table S1). Furthermore, functional annotation using the COG database classified 11,015 unigenes into 25 categories and 230 biological pathways (Fig. 2 & Table S2). In the testis transcriptome of C. nasus described here, both gene annotation and pathway analysis helped predict the potential roles of these genes at the onset of spermatogenesis. Enrichment analyses of GO functions and KEGG pathways lend support to the biological significance of transcriptome profiles derived from short-read sequencing technology. These transcriptome data thus expand the sequence resources available for this commercially important anadromous fish species and for researchers in anadromous fish reproductive biology.

2.5. Data deposition

The raw sequence data from C. nasus testes were deposited in the Sequence Read Archive (SRA) database under accession number SRP040708. The full dataset is also available from Di-an Fang on request ([email protected]).

Competing interests

The authors declare that they have no competing interests.
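The annotation percentages quoted in Section 2.4 are ratios of reported counts to the 194,636 assembled unigenes, so they can be recomputed directly. The short Python sketch below does this; the counts are taken from the text and Table 1, while the helper `percent` and the variable names are hypothetical conveniences rather than part of the study's workflow.

```python
# Counts as reported in Table 1 and Section 2.4.
TOTAL_UNIGENES = 194_636
ANNOTATED_UNIGENES = 42_642
GO_COUNTS = {
    "biological process": 110_856,
    "cellular component": 84_365,
    "molecular function": 34_664,
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Share of unigenes with and without a functional annotation.
annotated_pct = percent(ANNOTATED_UNIGENES, TOTAL_UNIGENES)
unannotated_pct = percent(TOTAL_UNIGENES - ANNOTATED_UNIGENES, TOTAL_UNIGENES)

# Share of all unigenes counted under each GO main category.
go_pct = {name: percent(count, TOTAL_UNIGENES) for name, count in GO_COUNTS.items()}

for name, value in go_pct.items():
    print(f"{name}: {value}%")
```

Because all percentages share the same denominator, a value that does not match its count is easy to spot with a check of this kind.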

Fig. 1. Unigene GO Cluster Distribution. Unigenes were classified into three main categories: biological process, cellular component and molecular function.


Fig. 2. Clusters of Orthologous Group (COG) classification of consensus sequence.

Authors' contributions

Profs. Pao Xu and Di-An Fang were responsible for the experiment design and conception. Yan-Feng Zhou was responsible for data mining and writing the manuscript. Jin-Rong Duan, Dong-Po Xu, Min-Ying Zhang and Kai Liu helped in selecting the fish sample, RNA extraction and testis transcriptome analysis during manuscript preparation. All authors have read and approved the final manuscript.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.margen.2015.06.007.

Acknowledgments

This work was supported by the National Natural Science Foundation of China for Young Scientists (31302169), the China Postdoctoral Science Foundation (2014M561675), the Jiangsu Postdoctoral Science Foundation (1302001B) and the Fundamental Research Funds from the FFRC (2013JFBR02).

References

Bissonnette, N., Lévesque-Sergerie, J.-P., Thibault, C., Boissonneault, G., 2009. Spermatozoal transcriptome profiling for bull sperm motility: a potential tool to evaluate semen quality. Reproduction 138, 65–80.

Chalmel, F., Rolland, A.D., Niederhauser-Wiederkehr, C., Chung, S.S.W., Demougin, P., Gattiker, A., Moore, J., Patard, J.-J., Wolgemuth, D.J., Jégou, B., et al., 2007. The conserved transcriptome in human and rodent male gametogenesis. Proc. Natl. Acad. Sci. 104, 8346–8351.

Conesa, A., Götz, S., García-Gómez, J.M., Terol, J., Talón, M., Robles, M., 2005. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674–3676.

Cox, M., Peterson, D., Biggs, P., 2010. SolexaQA: at-a-glance quality assessment of Illumina second-generation sequencing data. BMC Bioinformatics 11, 485.

Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., et al., 2011. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat. Biotechnol. 29, 644–652.

Jiang, T., Yang, J., Liu, H., X-q, S., 2012. Life history of Coilia nasus from the Yellow Sea inferred from otolith Sr:Ca ratios. Environ. Biol. Fish 95, 503–508.

Kanehisa, M., Goto, S., Kawashima, S., Okuno, Y., Hattori, M., 2004. The KEGG resource for deciphering the genome. Nucleic Acids Res. 32, D277–D280.

Li, Y., Xie, S., Li, Z., Gong, W., He, W., 2007. Gonad development of an anadromous fish Coilia ectenes (Engraulidae) in lower reach of Yangtze River, China. Fish. Sci. 73, 1224–1230.

Liu, D., Li, Y., Tang, W., Yang, J., Guo, H., Zhu, G., Li, H., 2014. Population structure of Coilia nasus in the Yangtze River revealed by insertion of short interspersed elements. Biochem. Syst. Ecol. 54, 103–112.

Metzker, M., 2010. Sequencing technologies — the next generation. Nat. Rev. Genet. 11, 31–46.

Simpson, J.T., Wong, K., Jackman, S.D., Schein, J.E., Jones, S.J.M., Birol, İ., 2009. ABySS: a parallel assembler for short read sequence data. Genome Res. 19, 1117–1123.

Tatusov, R.L., Galperin, M.Y., Natale, D.A., Koonin, E.V., 2000. The COG database: a tool for genome-scale analysis of protein functions and evolution. Nucleic Acids Res. 28, 33–36.

Xu, G., Xu, P., Gu, R., Zhang, C., Zheng, J., 2011. Feeding and growth in pond Coilia nasus juveniles. Chin. J. Ecol. 2014–2018.

Yang, Q., Gao, T., Miao, Z., 2011. Differentiation between populations of Japanese grenadier anchovy (Coilia nasus) in Northwestern Pacific based on ISSR markers: implications for biogeography. Biochem. Syst. Ecol. 39, 286–296.

Zhong, L., Guo, H., Shen, H., Li, X., Tang, W., Liu, J., Jin, J., Mi, Y., 2007. Preliminary results of Sr:Ca ratios of Coilia nasus in otoliths by micro-PIXE. Nucl. Inst. Methods Phys. Res. B 260, 349–352.
