Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity

Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity

    Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity Xiangya...

1MB Sizes 2 Downloads 58 Views

    Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity Xiangyang Miao, Qingmiao Luo, Xiaoyu Qin PII: DOI: Reference:

S0378-1119(16)30017-8 doi: 10.1016/j.gene.2016.01.047 GENE 41150

To appear in:

Gene

Received date: Revised date: Accepted date:

11 September 2015 6 January 2016 28 January 2016

Please cite this article as: Miao, Xiangyang, Luo, Qingmiao, Qin, Xiaoyu, Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity, Gene (2016), doi: 10.1016/j.gene.2016.01.047

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT Genome-wide transcriptome analysis in the ovaries of two goats identifies differentially expressed genes related to fecundity

IP

T

Xiangyang Miao Qingmiao Luo Xiaoyu Qin

SC R

Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing, 100193, China

Corresponding author: Xiangyang Miao; Tel: 86-10-62895663(China),

MA

NU

E-mail: [email protected]; [email protected]

D

Genome-wide analysis of two species of goats

AC

CE P

TE

Running title:

ACCEPTED MANUSCRIPT Abstract The goats are widely kept as livestock throughout the world. Two excellent domestic breeds in China, the Laiwu Black and Jining Grey goats, have different

T

fecundities and prolificacies. Although the goat genome sequences have been

IP

resolved recently, little is known about the gene regulations at the transcriptional level in goat. To understand the molecular and genetic mechanisms related to the

SC R

fecundities and prolificacies, we performed genome-wide sequencing of the mRNAs from two breeds of goat using the next-generation RNA-Seq technology and used functional annotation to identify pathways of interest. Digital gene expression

NU

analysis showed 338 genes were up-regulated in the Jining Grey goats and 404 were up-regulated in the Laiwu Black goats. Quantitative real-time PCR verified the

MA

reliability of the RNA-Seq data. This study suggests that multiple genes responsible for various biological functions and signaling pathways are differentially expressed in the two different goat breeds, and these genes might be involved in the regulation of

D

goat fecundity and prolificacy. Taken together, our study provides insight into the

TE

transcriptional regulation in the ovaries of 2 species of goats that might serve as a key resource for understanding goat fecundity, prolificacy and genetic diversity between

CE P

species.

AC

Key words: Goat, High prolificacy, mRNA, RNA-Seq

ACCEPTED MANUSCRIPT Introduction Previously, the most widely used method of transcriptome analysis was the microarray (Eklund et al. 2006). The data obtained from microarray analyses showed

T

that transcriptome analysis is an effective way to analyze biological processes at the

IP

molecular level. Thus, research efforts aimed at improving the technology arose. Recently, a new whole transcriptome shotgun sequencing method was developed.

SC R

This new technology, the RNA-Seq approach, uses next-generation deep-sequencing and is described as a powerful tool for RNA analysis (Wang et al. 2009). The resulting sequence reads are individually mapped to the source genome and counted

NU

to obtain the number and density of reads corresponding to the RNA from each known exon, splice event or new candidate gene (Mortazavi et al. 2008). This technique has

MA

been successfully used for genome-wide analysis of mRNAs in multiple organisms, such as yeast (Nagalakshmi et al. 2008; Wilhelm et al. 2008) , sheep (Miao et al. 2015 ) and human (Peng et al. 2012).

D

The domestic goat (Capra hircus) is one of the oldest species and is raised all over

TE

the world for meat, milk, skin and fiber. Over 90% of the world’s goat population is kept in small herds by farmers in developing countries (Mak 2013). Recently, goats

CE P

have been recognized for their value as animal models in biomedical research and the genome sequence of a female Yunnan Black goat was recently published (Ko et al. 2000; Dong et al. 2013). These types of molecular analyses have been performed in

AC

other domestic animals to examine fecundity. For example, previous studies indicated that the bone morphogenetic protein receptor-1B (BMPR1B) gene regulates the fecundity and ovulation rate of sheep (Wilson et al. 2001; Davis 2005; Mulsant et al. 2001). In addition, a variety of molecular studies regarding fecundity have been conducted in sheep. Sheep fecundity is dependent on multiple genes, including the bone morphogenetic protein 15 gene (BMP15) (Shimasaki et al. 1999) and the BMP15 receptor gene in the Booroola breed (BMPR1B) (Paradis et al. 2009). The Booroola (FecB) phenotype, which is associated with a mutation in BMPR1B (Souza et al. 2001), is highly related to sheep fecundity and prolificacy. However, little is understood about the genetic determinants for fecundity in goat. Goat fecundity, which is important for agriculture, is said to vary between species of different genetic

ACCEPTED MANUSCRIPT backgrounds. The Jining Grey goat is an excellent local breed in China that possesses the characteristics of high prolificacy and year-round estrus (Huang et al. 2012). The

IP

T

mean litter size of the Jining Grey has been reports to be 2.94 (Tu et al .1989). The fecundity of the Laiwu Black goat is much lower than that of the Jining Grey goat.

SC R

Limited studies have identified genes that might regulate fecundity in goats. The BMPR1B gene might be a major gene that influences prolificacy of Black Bengal goats (Polley et al. 2009) and also associated with the reproductive differences

NU

between the Lezhi Black and Tibetan goats (Yang et al. 2012). Previous studies indicated that bone morphogenetic protein receptor-1B (BMPR1B) gene regulates the

MA

fecundity and ovulation rate of sheep (Wilson et al. 2001; Davis et al. 2005; Mulsant et al. 2001). These preliminary data, together with what has been shown in sheep,

D

suggest that through genome-wide analyses of mRNAs, these two goat breeds with

of fecundity in goats.

TE

different prolificacies should be examined to uncover potential molecular mechanisms

CE P

In this study, using RNA-Seq technology, we performed genome-wide sequencing of the mRNAs prepared from the ovaries of the Jining Grey goat and Laiwu Black goat. In total, 3,493 candidates were predicted as new genes, and 742 genes were

AC

differentially expressed between these two goats. The functional annotation of the differentially regulated mRNAs points to multiple cellular functions and pathways that might be involved in regulating the different fecundities of these two goat species.

Materials and Methods Ethics statement All of the experiments were performed in accordance with the relevant guidelines and regulations and approved by the Institutional Animal Care and Use Committee of Institute of Animal Sciences, Chinese Academy of Agricultural Sciences.

ACCEPTED MANUSCRIPT Goat sample preparation The goats used in this study were obtained from the Qingdao Aote Farm (Shandong, China). Five female Laiwu Black goats and five female Jining Grey goats

IP

T

of similar ages and good status were selected. All of the 10 goats were treated with intravaginal sponges followed by injection of pregnant mare serum gonadotropin

SC R

(PMSG) intramuscularly to synchronize estrus, as previously described (Miao et al. 2013). The estrus status was verified afterwards and the goats were killed 4-5 hours after estrus. Whole ovaries were excised and were immediately snap-frozen in liquid

NU

nitrogen and stored at -70°C for total RNA extraction (Miao et al. 2013).

MA

Construction of mRNA libraries and sequencing

Total RNA was extracted with Trizol (Invitrogen Inc., California, USA) from the

D

ovaries of the Laiwu Black and Jining Grey goats according to the manufacturer’s

TE

instructions. The quality and quantity of the RNA samples were assessed on a Bioanalyzer 2100 system using an RNA 6000 Nano kit (Agilent Technologies, Palo

CE P

Alto, CA). The RNA from the 5 goats in each group was pooled to generate two libraries. The recently developed deep-sequencing technology was used for RNA-Seq (Wang et al. 2009). RNase-free DNase I (Ambion Inc., Texas, USA) was used to

AC

eliminate potential genomic DNA contamination. Approximately 1 mg of RNA was used to generate RNA-Seq cDNA libraries for sequencing using the TruSeq RNA Sample Prep Kit v2 (Illumina Inc., San Diego, CA) according to the manufacturer's instructions. Briefly, the workflow included the isolation of polyadenylated RNA molecules

using

poly-T

oligo-attached

magnetic

beads,

enzymatic

RNA

fragmentation, cDNA synthesis, ligation of bar-coded adapters, and PCR amplification. DNA size and purity of the cDNA library were checked using a high sensitivity DNA 1000 kit on a Bioanalyzer 2100 system (Agilent Technologies, Santa Clara, CA). The quantification of the cDNA libraries was performed with Qubit™dsDNA HS kit on Qubit® 2.0 Fluorometer (Life Technologies, Carlsbad, CA). The cDNA libraries were then diluted to 4 nM, and a 120 ml aliquot was used to generate clusters on a paired-end flow cell using the cBOT (Illumina) and sequenced

ACCEPTED MANUSCRIPT on the Illumina Genome Analyzer IIx (GAIIx) using the SBS 36-cycle Sequencing Kit (v5) at Shanghai Biotechnology Corporation (Shanghai, China) according to manufacturer-recommended cycling parameters. One lane for the samples from each

IP

T

species was sequenced as 100-bp reads. The image analysis and base calling were performed with SCS2.8/RTA1.8 (Illumina). The FASTQ file generation and the

SC R

removal of failed reads were performed using CASAVA ver.1.8.2 (Illumina).

Mapping Reads to the Reference Genome and Transcript Reconstruction. clean

reads

were

mapped

to

goat

NU

The

reference

gene

sequences

(http://goat.kiz.ac.cn/GGD/download.htm) and to goat reference genome sequences

MA

(http://goat.kiz.ac.cn/GGD/download.htm) set using SOAPaligner/SOAP2 (Li et al. 2009). No more than two mismatches were permitted in the alignment. The alignment

D

of the RNA-Seq reads and the assembly of the alignments into a parsimonious set of

TE

transcripts was done using Cufflinks (http://cole-trapnell-lab.github.io/cufflinks/).

CE P

Alternative Splicing Analysis

To analyze the alternative splicing, the program AS detector was used (ASD can be

AC

found at http://www.novelbio.com/asd/ASD.html). The different transcriptomes were reconstructed and were merged to yield comprehensive reannotated transcripts for the subsequent alternative splicing (AS) analysis. After mapping, an ‘accepted_hits.bam’ file was generated that contained information regarding the chromosome position for exonic reads and exon–exon junction reads. A program AS detector is utilized to: (i) reconstruct exon-clusters so as to identify common modes of AS events for each exon-cluster; (ii) count the number of junction reads that align either to the inclusion or exclusion isoforms in all of the samples and finally calculating a P-value; (iii) calculate the read coverage for the alternative exon and its corresponding gene in all

ACCEPTED MANUSCRIPT of the samples and calculate a second P-value by a Fisher’s exact test based on the alternative exon read coverage relative to its gene read coverage between the samples

IP

T

and (iv) combine the P-values to get an adjusted P-value for assessing the statistical

SC R

difference of the AS between the two samples.

Workflow of bioinformatics analysis

The workflow of the bioinformatics analysis of the RNA-Seq results is shown

NU

(Figure 1). The summary of the workflow figure is that the reference genome and the gene model annotation files were downloaded from genome website directly. The

MA

index of the reference genome was built using Bowtie v2.0.6 and the clean reads were aligned to the reference genome using TopHat v2.0.9 The Cufflinks v2.1.1 Reference

D

Annotation Based Transcript (RABT) assembly method was used to construct and

TE

identify both the known and novel transcripts from TopHat alignment results. We selected TopHat as the mapping tool. Then, HTSeq v0.6.1 was used to count the reads

CE P

numbers mapped to each, and the RPKM of each gene was calculated based on the length of the gene and the reads count mapped to this gene. In addition, the differential expression analysis of the two conditions/groups was performed using the

AC

IDEG6 software. The resulting LOG2FC were adjusted using the EBSeq’s approach for controlling the false discovery rate. The genes with an adjusted LOG2FC>1 or <-1 (FDR<0.05) found by IDEG6 software were assigned as differentially expressed.

Expression level analysis The original mRNA sequencing reads were filtered and only the unique mapping tags were used for gene expression analysis (Zhou et al. 2010). To enable transparent comparison of transcript levels, the mapped read counts for each gene were normalized for RNA length and for the total read number in the lane according to reads per kilobase of exon model per million mapped reads (RPKM) (Miao et al. 2013; Robinson et al. 2010; Li et al. 2010). IDEG6 software was used to identify the differentially expressed genes (Romualdi et al. 2003).

ACCEPTED MANUSCRIPT

GO and KEGG pathway analysis Gene Ontology (GO) is an international standardized gene functional

IP

T

classification system, which offers a dynamic-updated controlled vocabulary and a strictly defined concept to comprehensively describe properties of genes and their

SC R

products in any organism (Barrell et al. 2009; Young et al. 2010). A hypergeometric test was applied to map all of the differentially expressed genes to terms in GO. Three structured controlled vocabularies (ontologies), including cellular component,

NU

molecular function and biological process, were used (Harris et al. 2004). Like GO enrichment, clusters of orthologous groups of proteins (COGs) were analyzed

MA

(Tatusov et al. 2003; Tatusov et al. 1997) and the association of the genes with different pathways was computed using the Kyoto Encyclopedia of Genes and

TE

Aoki-Kinoshita et al. 2007).

D

Genomes (KEGG) database (http://www.genome.jp/kegg) (Ogata et al. 1998;

CE P

Validation of RNA-Seq data

To validate the RNA-Seq results, the expressions of several mRNAs were confirmed by quantitative real-time PCR. In brief, cDNA was synthesized from the

AC

mRNA by reverse transcription using the PrimerScript RT reagent Kit (TaKara, Japan) in a GeneAmp® PCR System 9700 (Applied Biosystems, USA). The synthesized cDNA was used as the template for qRT-PCR using SYBR Green I Master (Roche, Swiss). The primer sequences were designed in the laboratory and synthesized by Generay Biotech (Generay, PRC) based on the mRNA sequences obtained from the NCBI database. ACTIN was used as internal control and the expression levels were calculated according to 2-ΔΔCt values. The primer sequences can be found in Table S1.

Results

Quality control of the RNA-Seq reads and gene structure analysis The original RNA-Seq reads from the Jining Grey and Laiwu Blackgoats were

ACCEPTED MANUSCRIPT analyzed for quality control before continuing with further analysis (Table 1). More than 25 million reads were obtained for each species of goat. The data showed that the 13,454,559 and 13,588,179 reads uniquely mapped to the Jining Grey and the Laiwu

IP

T

Black goats, respectively. As expected, the insert size of the RNA-Seq reads ranged from 100 bp to 150 bp.

SC R

During gene transcription, a variety mRNAs may be generated from the same DNA template by the process of alternative splicing, and these alternatively spliced isoforms can be translated into different proteins. Seven types of alternative splicing

NU

are generally recognized (Matlin et al. 2005): exon skipping, intron retention, mutually exclusive exon, alternative first exon, alternative last exon, alternative 5'

MA

splice site and alternative 3' splice site. With this, 3,493 new genes were predicted

D

from the RNA-Seq reads of the two different goats (Table S2).

TE

Gene expression profiling of the two species of goats The expression of 21,322 genes in the Jining Grey goat and the Laiwu Black goat

CE P

was determined using RNA-Seq technology (Table S3). Here we quantified the transcript levels of all of the identified genes in fragments per kilobase of transcript per million mapped reads (FPKM) to account for the dependency between paired-end

AC

reads in the reads per kilobase of the exon model per million mapped reads (RPKM) (Mortazavi et al. 2008; Garber et al. 2011). The TopHat program was used to map the RNA-Seq data to the reference genome (Trapnell et al. 2009). Gene structure analyses were done as indicated (Figure 2A), and the distribution of the reads on the chromosomes are shown (Figure 2B). Next, IDEG6 software was used to identify the differentially expressed genes (Romualdi et al. 2003). The mRNAs were considered to be differentially expressed between the different sheep species if the |log2(FPKM ratio)| > 1 (False Discovery Rate, FDR < 0.05). In total, 742 genes were found to be differentially expressed between the Jining Grey and Laiwu Black goats (Table 2). Among the differentially expressed genes, 338 were up-regulated in the Jining Grey goat and 404 were up-regulated in the Laiwu Black goat. These data identify differential regulation between the two species of goats that requires further

ACCEPTED MANUSCRIPT investigation.

Annotation of the differentially expressed genes

IP

T

The Gene Ontology (GO) terms were further assigned to goat differential expression genes based on their sequence similarities to known proteins in the

SC R

UniProt database annotated with GO terms as well as the InterPro and Pfam domains they contain. The GO annotation classified the differentially expressed genes by the categories biological process, cellular component and molecular function (Figure 3A,

NU

Table S4). The terms annotated under the biological processes category include cell proliferation, death, metabolic process, reproduction, signaling, etc. The annotation

MA

under the cellular component category includes the terms extracellular matrix, membrane, cell part and organelle. The differentially expressed genes are involved in

D

various molecular functions, including binding, catalytic activity, molecular

TE

transducer activity, receptor activity, structural molecule activity, etc. A GO graphic tree of the annotation was constructed (Figure 3B).

CE P

Next, the differentially expressed genes were annotated using KEGG to identify pathway enrichments (Figure 4A and B, Table S5) (Kanehisa et al. 2004). A variety of pathways were enriched, including ribosome, complement and coagulation cascades,

AC

antigen processing and presentation, focal adhesion, ECM-receptor interaction, etc. A graphic of the network showing the interaction between the different pathways was generated (Figure 4C). The results of the GO categories and KEGG pathway analysis suggest that one gene may be involved in several pathways or interact with several other genes. To explore this, the differentially expressed genes were pooled and a Gene-Act network of the interactions between the differentially expressed genes was generated. (Figure 5).

RT-PCR validation of RNA sequencing To validate the RNA-Seq data, six differentially expressed mRNAs were further examined by RT-PCR and the expression levels were calculated according to 2-ΔΔCt values (Figure 6). The RNA-Seq results indicated that the expression of IGF1R,

ACCEPTED MANUSCRIPT BMPR1B, TGFBR1, SMAD3, SRD5A2 and BMPR2 were up-regulated in the Jining Grey goat by two to three fold. Accordingly, the RT-PCR data also indicated that these genes were up-regulated in the Jining Grey goat. These data validate the findings by

IP

T

RNA-Seq.

SC R

Discussion

In this study, we performed a genome-wide analysis of mRNAs from two domestic species of goat raised in China using RNA-Seq technology. Recent studies

NU

show that a prolific local goat in India, the Assam hill goat, has a nucleotide polymorphism in the BMPR1B gene (Dutta et al. 2014). In addition, BMPR1B is

MA

suggested to be associated with the reproductive differences between the Lezhi Black and Tibetan goats (Yang et al. 2012). The RNA-Seq results presented here found

D

BMPR1B to be one of the differentially regulated genes between the two species of

TE

goats, indicating that BMPR1B might also be important in goat fecundity. The identification of genes known to regulate fecundity in other species lays the

CE P

foundation for understanding this process in goats, which is lacking in the literature. In addition to candidates already described to play a role in fecundity, we identified many other genes that are differentially regulated between the two species of goats.

AC

Our data is based on RNA from the ovaries of goats from 2 different species. Therefore, we propose that these data will serve as a powerful tool and resource for future research related to goat genetics on a variety of levels. Through the GO and KEGG pathway analyses, our data identified various genes involved in cell adhesion as being differentially expressed between the two different goats. In a effort to aid the search for relevant pathways for future studies we presented our annotation data in several graphical formats that provide clues as to which pathways might be the most interesting. These results are supported by data in sheep, in which a number of important cell adhesion regulators, such as cadherin-2 (Rieger et al. 2009), dystonin, claudin 10 (Man et al. 2013), periostin, embigin (Huang et al. 1993) and adenosine deaminase (Ginés et al. 2002), have been shown to be differentially expressed. Several molecules related to transcription, translation,

ACCEPTED MANUSCRIPT immune response, proteasome binding, IGF-1 binding, cilium and metabolism (e.g., metabolism of lipid, steroid and cholesterol) were also differentially expressed between the two species of goats. Previous studies have linked energy balance and

IP

T

metabolism with fecundity and body size (Valeggia et al. 2009; Briegel et al. 1990); thus, our study may provide clues to establish a link between goat fecundity variation

SC R

and metabolism as well as other biological processes.

Using Caenorhabditis elegans as a model system, the TGF-ß/Sma/Mab pathway was identified as a regulator of reproductive aging and reproductive span independent

NU

of longevity (Luo et al. 2009). Additionally, in Neurospora crassa, the MAPK pathway is related to asexual development and female fertility (Jones et al. 2007).

MA

Here, the KEGG pathway annotation identified multiple signaling pathways that were enriched in the differentially expressed genes, including the TGF-β and the

D

mitogen-activated protein kinase (MAPK) signaling pathways. These data suggest

TE

that these pathways might play a role in goat fecundity and the genes identified here should be further explored to assess these relationships.

CE P

In addition, other pathways were enriched in our data set. These pathways included the ECM-receptor interaction pathway, pathways in cancer, the complement and coagulation cascades and the ribosome pathway. These might provide new clues,

AC

for uncovering important functions of signaling pathways that play a role in genetic diversity and goat development and should be investigated further. In summary, RNA-Seq technology gives us the capacity to produce an extraordinary global view of the transcriptome and its organization for a variety of species and cell types (Wang et al. 2009). Using RNA-Seq technology, we characterized the differentially expressed genes between the Jining Grey and Laiwu Black goats. A functional annotation of these genes identified various biological functions and signaling pathways that hold clues as to which genes might be involved in regulating goat fecundity and prolificacy. The tendency toward dizygotic twinning is inherited in both humans and sheep, but human twinning is not linked to the chromosome region syntenic with the sheep gene FecB (Duffy et al. 1999) and no evidence suggests the involvement of TGFBR1 and BMPR2 in dizygotic twinning in

ACCEPTED MANUSCRIPT humans (Luong et al. 2011). Polymorphisms of the major ovine fecundity genes FecB and FecX widely exist in sheep, but no polymorphism was observed in goats (Hua et al. 2008). This suggests a great degree of diversity in different species when it comes

IP

T

to the molecular mechanisms related to fecundity. Thus, our data is important because it not only provides clues that might reveal a molecular mechanism related to goat

SC R

prolificacy, but also is a great resource for research on goats and other related species to better understand species’ diversity. Importantly, further biochemical and physiological analyses should be conducted in the future using genes found in our These genes may represent not only fecundity differences, but also

NU

study.

reproductive differences and general species differences that will be important for

MA

goat breeders. Based on our data, future transcriptome analyses, like those conducted here, should be performed in other tissues and goat breeds and deeper insights in

CE P

Competing interests

TE

diversity and prolificacy.

D

specific genes should be explored in order to gain a better understanding of goat

AC

The authors declare that they have no competing interests.

Acknowledgements This work was supported by a grant from The Major Science and Technology Project of

New

Variety

Breeding

of

Genetically

Modified

Organisms

(Nos.

2009ZX08008-004B and 2008ZX08008-003) , the National High Technology Research Development Program of China (863 Program No. 2008AA10Z140), the National Natural Science Foundation of China (No. 30571339) and the Innovation Research Foundation of CAAS (No. 2004-CAAS-1),Basic Research Fund for Central Public Research Institutes of CAAS (2013ywf-yb-5, 2013ywf-zd-2),The Agricultural Science and Technology Innovation Program.

ACCEPTED MANUSCRIPT References Aoki-Kinoshita KF, Kanehisa M. 2007. Gene annotation and pathway mapping in KEGG. Methods Mol Biol 396: 71-91.

T

Barrell D, Dimmer E, Huntley RP, Binns D, O'Donovan C, et al. 2009. The GOA

IP

database in 2009--an integrated Gene Ontology Annotation resource. Nucleic Acids Res 37: D396-403.

SC R

Briegel H. 1990. Fecundity, Metabolism, and Body Size in Anopheles (Diptera, Culicidae), Vectors of Malaria. Journal of Medical Entomology 27: 839-850. Davis GH. 2005. Major genes affecting ovulation rate in sheep. Genet Sel Evol 37

NU

Suppl 1: S11-23.

Dong Y, Xie M, Jiang Y, Xiao N, Du X, et al. 2013. Sequencing and automated

MA

whole-genome optical mapping of the genome of a domestic goat (Capra hircus). Nat Biotechnol 31: 135-141.

Dutta R, Laskar S, Borah P, Kalita D, Das B, et al. 2014. Polymorphism and

TE

Rep 41: 3677-3681.

D

nucleotide sequencing of BMPR1B gene in prolific Assam hill goat. Mol Biol

Eklund AC, Turner LR, Chen P, Jensen RV, deFeo G, et al. 2006. Replacing cRNA

CE P

targets with cDNA reduces microarray cross-hybridization. Nat Biotechnol 24: 1071-1073.

Garber M, Grabherr MG, Guttman M, Trapnell C. 2011. Computational methods for

AC

transcriptome annotation and quantification using RNA-Seq. Nat Methods 8: 469-477.

Gines S, Marino M, Mallol J, Canela EI, Morimoto C, et al. 2002. Regulation of epithelial and lymphocyte cell adhesion by adenosine deaminase-CD26 interaction. Biochem J 361: 203-209. Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, et al. 2004. The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res 32: D258-261. Huang DW, Di R, Wang JX, Chu MX, He JN, et al. 2012. Analysis on DNA sequence of goat RFRP gene and its possible association with average daily sunshine duration. Mol Biol Rep 39: 9167-9177. Huang RP, Ozawa M, Kadomatsu K, Muramatsu T. 1993. Embigin, a member of the immunoglobulin superfamily expressed in embryonic cells, enhances

ACCEPTED MANUSCRIPT cell-substratum adhesion. Dev Biol 155: 307-314. Hua GH, Chen SL, Ai JT, Yang LG. 2008. None of polymorphism of ovine fecundity major genes FecB and FecX was tested in goat. Animal Reproduction Science

T

108: 279-286.

IP

Jones CA, Greer-Phillips SE, Borkovich KA. 2007. The response regulator RRG-1 functions upstream of a mitogen-activated protein kinase pathway impacting

SC R

asexual development, female fertility, osmotic stress, and fungicide resistance in Neurospora crassa. Mol Biol Cell 18: 2123-2136.

Kanehisa M, Goto S, Kawashima S, Okuno Y, Hattori M. 2004. The KEGG resource

NU

for deciphering the genome. Nucleic Acids Res 32: D277-280. Ko JH, Lee CS, Kim KH, Pang MG, Koo JS, et al. 2000. Production of biologically

MA

active human granulocyte colony stimulating factor in the milk of transgenic goat. Transgenic Res 9: 215-222.

Li B, Ruotti V, Stewart RM, Thomson JA, Dewey CN. 2010 RNA-Seq gene

D

expression estimation with read mapping uncertainty. Bioinformatics 26:

TE

493-500.

CE P

Li XL, Zhao JW, Tang CJ, Zhou RY, Zheng G, Li LH, Guo XL. 2009. Sequencing of part of the goat agouti gene and SNP identification. Biochem Genet 48:152-156. Luong HTT, Chaplin J, McRae AF, Medland SE, Willemsen G, et al. 2011. Variation in BMPR1B, TGFRB1 and BMPR2 and Control of Dizygotic Twinning. Twin

AC

Research and Human Genetics 14: 408-416. Luo S, Shaw WM, Ashraf J, Murphy CT. 2009. TGF-beta Sma/Mab signaling mutations uncouple reproductive aging from somatic aging. PLoS Genet 5: e1000789. Mak HC. 2013. Goat genome sequence by optical mapping. Nat Biotechnol 31: 123. Man G, Wei L, Haiming W, Guanjun W. 2013. The distinct expression patterns of claudin-10, -14, -17 and E-cadherin between adjacent non-neoplastic tissues and gastric cancer tissues. Diagn Pathol 8: 205. Martin NG, Duffy DL, Hall J, Mayne C, Healey S, et al. 1999. The human twinning gene is not syntenic with the sheep twinning gene (FecB) on human chromosome 4. American Journal of Human Genetics 65: A389-A389. Matlin AJ, Clark F, Smith CW. 2005. Understanding alternative splicing: towards a cellular code. Nat Rev Mol Cell Biol 6: 386-398.

ACCEPTED MANUSCRIPT Miao X, Luo Q. 2013. Genome-wide transcriptome analysis between small-tail Han sheep and the Surabaya fur sheep using high-throughput RNA sequencing. Reproduction 145: 587-596.

T

Miao XY, Luo QM, Qin XY. 2015. Genome-wide transcriptome analysis of mRNAs

IP

and microRNAs in Dorset and Small Tail Han sheep to explore the regulation of fecundity. Molecular and Cellular Endocrinology 402: 32–42.

SC R

Miao XY, Luo QM, Qin XY. 2015. Genome-wide analysis reveals the differential regulations of mRNAs and miRNAs in Dorset and Small Tail Han sheep muscles. Gene 562(2): 188–196.

NU

Miao, X., Luo, Q., Qin, X., Guo, Y. & Zhao, H. Genome-wide mRNA-seq profiling reveals predominant down-regulation of lipid metabolic processes in adipose

MA

tissues of Small Tail Han than Dorset sheep. Biochem Biophys Res Commun 467, 413-420.

Miao, X., Luo, Q., Qin, X. & Guo, Y. Genome-wide analysis of microRNAs identifies

D

the lipid metabolism pathway to be a defining factor in adipose tissue from

TE

different sheep. Scientific reports 5, 18470, doi:10.1038/srep18470 (2015). Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. 2008. Mapping and

CE P

quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5: 621-628.

Mulsant P, Lecerf F, Fabre S, Schibler L, Monget P, et al. 2001. Mutation in bone morphogenetic protein receptor-IB is associated with increased ovulation rate

AC

in Booroola Merino ewes. Proceedings of the National Academy of Sciences of the United States of America 98: 5104-5109.

Nagalakshmi U, Wang Z, Waern K, Shou C, Raha D, et al. 2008. The transcriptional landscape of the yeast genome defined by RNA sequencing. Science 320: 1344-1349. Ogata H, Goto S, Fujibuchi W, Kanehisa M. 1998. Computation with the KEGG pathway database. Biosystems 47: 119-128. Paradis F, Novak S, Murdoch GK, Dyck MK, Dixon WT, et al. 2009. Temporal regulation of BMP2, BMP6, BMP15, GDF9, BMPR1A, BMPR1B, BMPR2 and TGFBR1 mRNA expression in the oocyte, granulosa and theca cells of developing preovulatory follicles in the pig. Reproduction 138: 115-129. Peng Z, Cheng Y, Tan BC, Kang L, Tian Z, et al. 2012. Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Nat

ACCEPTED MANUSCRIPT Biotechnol 30: 253-260. Polley S, De S, Batabyal S, Kaushik R, Yadav P, et al. 2009. Polymorphism of fecundity genes ( BMPR1B, BMP15 and GDF9 ) in the Indian prolific Black

T

Bengal goat. Small Ruminant Res 85: 122–129.

IP

Rieger S, Senghaas N, Walch A, Koster RW. 2009. Cadherin-2 controls directional chain migration of cerebellar granule neurons. PLoS Biol 7: e1000240.

SC R

Robinson MD, Oshlack A. 2010. A scaling normalization method for differential expression analysis of RNA-Seq data. Genome Biol 11: R25. Romualdi C, Bortoluzzi S, D'Alessi F, Danieli GA. 2003. IDEG6: a web tool for

NU

detection of differentially expressed genes in multiple tag sampling experiments. Physiol Genomics 12: 159-162.

MA

Shimasaki S, Zachow RJ, Li D, Kim H, Iemura S, et al. 1999. A functional bone morphogenetic protein system in the ovary. Proc Natl Acad Sci U S A 96: 7282-7287.

D

Souza CJ, MacDougall C, Campbell BK, McNeilly AS, Baird DT. 2001. The

TE

Booroola (FecB) phenotype is associated with a mutation in the bone morphogenetic receptor type 1 B (BMPR1B) gene. J Endocrinol 169: R1-6.

CE P

Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, et al. 2003. The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4: 41. Tatusov RL, Koonin EV, Lipman DJ. 1997. A genomic perspective on protein families. Science 278: 631-637.

AC

Trapnell C, Pachter L, Salzberg SL. 2009. TopHat: discovering splice junctions with RNA-Seq. Bioinformatics 25: 1105-1111.

Tu YR .1989. The sheep and goat breeds in China. Shanghai Science and Technology Press, Shanghai, pp 88–90, 98–101. Valeggia C, Ellison PT. 2009. Interactions Between Metabolic and Reproductive Functions in the Resumption of Postpartum Fecundity. American Journal of Human Biology 21: 559-566. Wang Z, Gerstein M, Snyder M. 2009. RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet 10: 57-63. Wilhelm BT, Marguerat S, Watt S, Schubert F, Wood V, et al. 2008. Dynamic repertoire of a eukaryotic transcriptome surveyed at single-nucleotide resolution. Nature 453: 1239-1243. Wilson T, Wu XY, Juengel JL, Ross IK, Lumsden JM, et al. 2001. Highly prolific

ACCEPTED MANUSCRIPT Booroola sheep have a mutation in the intracellular kinase domain of bone morphogenetic protein IB receptor (ALK-6) that is expressed in both oocytes and granulosa cells. Biology of Reproduction 64: 1225-1235.

T

Yang CX, Zi XD, Wang Y, Yang DQ, Ma L, et al. 2012. Cloning and mRNA

non-prolific goat breeds. Mol Reprod Dev 79: 2.

IP

expression levels of GDF9, BMP15, and BMPR1B genes in prolific and

SC R

Young MD, Wakefield MJ, Smyth GK, Oshlack A. 2010. Gene ontology analysis for RNA-Seq: accounting for selection bias. Genome Biol 11: R14. Zhou L, Chen J, Li Z, Li X, Hu X, et al. 2010. Integrated profiling of microRNAs and

AC

CE P

TE

D

MA

carcinoma. PLoS One 5: e15224.

NU

mRNAs: microRNAs located on Xq27.3 associate with clear cell renal cell

ACCEPTED MANUSCRIPT Figure legends

Figure 1. Workflow of the bioinformatics analysis of the RNA-Seq results. Quality

IP

T

control was performed followed by basic and advanced analyses.

SC R

Figure 2. Genome-wide profiling of goat mRNAs. (A) Gene structure analysis of mRNA reads in the Laiwu Black and Jining Grey goats. UTR5, UTR3, Exon, Intron, Tss, Tes, InterGenic and IntraGenic types are shown. (B) Mapping of the mRNA reads

NU

with the genome. The distributions at different chromosomes are shown.

MA

Figure 3. Gene ontology annotation of the differentially expressed genes. (A) The GO categories biological process (BP), molecular function (MF) and cellular component

D

(CC) are shown. (B) A graphic tree displaying the different GO terms. Red is the GO

TE

terms for the up-regulated genes, green is for down-regulated genes, and yellow indicates the GO terms that came from both the up-regulated and the down-regulated

CE P

genes.

Figure 4. KEGG pathway annotation of the differentially expressed genes. (A and B)

AC

KEGG pathway enrichment values are shown. Red is for the significantly enriched pathways and blue indicates those that were not significantly enriched. (C) A KEGG pathway network graphic was constructed. Red indicates the pathways for the up-regulated genes, green indicates the pathways for the down-regulated genes, and yellow indicates the pathways that came from both the up-regulated and down-regulated genes.

Figure 5. A Gene-Act network of the interactions between the differentially expressed genes . Red indicates up-regulation and green represents down-regulation.

Figure 6. Validation of the RNA-Seq results using RT-PCR. RNA-Seq data from the Jining Grey goat are presented as FPKM values followed by normalization. RT-PCR

ACCEPTED MANUSCRIPT data are presented as means with standard deviations (SD). ACTIN was used as the

AC

CE P

TE

D

MA

NU

SC R

IP

T

reference gene.

ACCEPTED MANUSCRIPT Table 1 Quality control of RNA-Seq data. Goat sample

Read

Total

Total

Q20

GC

Length(bp)

reads

nucleotides(b

percentage

percentage

25596027

3890596104

Laiwu Black 2 x 76

25769481

3916961112

IP

2 x 76

86.48

52.63

87.92

51.51

SC R

Jining Grey

T

p)

AC

CE P

TE

D

MA

NU

Q20 percentage, percentage of nucleotides with more than 20 bases.

AC

CE P

TE

D

MA

NU

SC R

IP

T

ACCEPTED MANUSCRIPT

AC

CE P

TE

D

MA

NU

SC R

IP

T

ACCEPTED MANUSCRIPT

AC

CE P

TE

D

MA

NU

SC R

IP

T

ACCEPTED MANUSCRIPT

AC

CE P

TE

D

MA

NU

SC R

IP

T

ACCEPTED MANUSCRIPT

AC

CE P

TE

D

MA

NU

SC R

IP

T

ACCEPTED MANUSCRIPT

MA

NU

SC R

IP

T

ACCEPTED MANUSCRIPT

AC

CE P

TE

D

Figure 6

ACCEPTED MANUSCRIPT Abbreviations list

AC

CE P

TE

D

MA

NU

SC R

IP

T

(BMPR1B) gene BMPR1B, bone morphogenetic protein receptor-1B; BMP15, bone morphogenetic protein 15 gene; BMPR2, bone morphogenetic protein receptor, type II; FDR, COGs clusters of orthologous groups; False Discovery Rate; FecB , Fecundity gene, Boorla, of sheep; FPKM, fragments per kilobase of transcript per million mapped reads; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; IGF1R, insulin-like growth factor1 receptor; MAPK, mitogen-activated protein kinase; PMSG, pregnant mare serum gonadotropin; RPKM, reads per kilobase of the exon model per million mapped reads; SMAD3, SMAD family member 3; TGFBR1, transforming growth factor, beta receptor 1.

ACCEPTED MANUSCRIPT

AC

CE P

TE

D

MA

NU

SC R

IP

T

Highlights ·Fecundity differences between 2 groups of goats were examined at the genomic level by RNA-seq. ·mRNA profiles are presented for 2 groups of goats. ·The differentially expressed mRNA uniquely define each group of goats. ·Quantitative real-time PCR verified the reliability of the RNA-seq data. ·These data may lead to the identification of novel regulators of goat fecundity.