Accepted Manuscript Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV) Shanshan Liu, Guanxing Chen, Haidong Xu, Weibin Zou, Wenrui Yan, Qianqian Wang, Hengwei Deng, Heqian Zhang, Guojiao Yu, Jianguo He, Shaoping Weng PII:
S1050-4648(16)30467-3
DOI:
10.1016/j.fsi.2016.07.033
Reference:
YFSIM 4092
To appear in:
Fish and Shellfish Immunology
Received Date: 7 May 2016 Revised Date:
22 July 2016
Accepted Date: 31 July 2016
Please cite this article as: Liu S, Chen G, Xu H, Zou W, Yan W, Wang Q, Deng H, Zhang H, Yu G, He J, Weng S, Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV), Fish and Shellfish Immunology (2016), doi: 10.1016/j.fsi.2016.07.033. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
ACCEPTED MANUSCRIPT 1
Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to
2
Mud crab reovirus (MCRV)
3
Shanshan Liu
4
Qianqian Wang a,b, Hengwei Deng a,b, Heqian Zhang a,b, Guojiao Yu a,b, Jianguo He a,b,c,,
5
Shaoping Weng a,b*
6
a
7
Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China
8
b
9
Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China c
SC
Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for
School of Marine Sciences, Sun Yat-sen University, Guangzhou, PR China
TE D
12 13 14
EP
15
19
AC C
16
18
, Haidong Xu c, Weibin Zou a,b, Wenrui Yan a,b,
MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for
11
17
a,b
RI PT
, Guanxing Chen
M AN U
10
a,b
* Corresponding author: MOE Key Laboratory of Aquatic Product Safety/State
20
Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, 135
21
Xingang Road West, Guangzhou 510275, People's Republic of China. Fax: + 86 20
22
84113229.
23
E-mail address:
[email protected] (S. Weng).
ACCEPTED MANUSCRIPT 24
25
Abstract
Mud crab (Scylla paramamosain) is an economically important marine cultured species in China’s coastal area. Mud crab reovirus (MCRV) is the most important
27
pathogen of mud crab, resulting in large economic losses in crab farming. In this
28
paper, next-generation sequencing technology and bioinformatics analysis are used to
29
study transcriptome differences between MCRV-infected mud crab and normal
30
control. A total of 104.3 million clean reads were obtained, including 52.7 million and
31
51.6 million clean reads from MCRV-infected (CA) and controlled (HA) mud crabs
32
respectively. 81,901, 70,059 and 67,279 unigenes were gained respectively from HA
33
reads, CA reads and HA&CA reads. A total of 32,547 unigenes from HA&CA reads
34
called All-Unigenes were matched to at least one database among Nr, Nt, Swiss-prot,
35
COG, GO and KEGG databases. Among these, 13,039, 20,260 and 11,866 unigenes
36
belonged to the 3, 258 and 25 categories of GO, KEGG pathway, and COG databases,
37
respectively. Solexa/Illumina’s DGE platform was also used, and about 13,856
38
differentially expressed genes (DEGs), including 4,444 significantly upregulated and
39
9,412 downregulated DEGs were detected in diseased crabs compared with the
40
control. KEGG pathway analysis revealed that DEGs were obviously enriched in the
41
pathways related to different diseases or infections. This transcriptome analysis
42
provided valuable information on gene functions associated with the response to
43
MCRV in mud crab, as well as detail information for identifying novel genes in the
44
absence of the mud crab genome database.
45
AC C
EP
TE D
M AN U
SC
RI PT
26
ACCEPTED MANUSCRIPT 46
Key words: mud crab, MCRV, transcriptome, immune, KEGG pathway
47
Introduction
48
Mud crab is an economically important marine cultured species that is widely distributed in southeast China coasts [1]. However, Mud crab reovirus (MCRV) is the
50
main causative agent of a “sleeping disease” with about 70% mortality, which results
51
in large economic losses [2, 3]. MCRV is composed of 12 dsRNA segments with a
52
total length of 24.464 kb, and the majority of MCRV genes share low homology with
53
the counterpart genes of other reoviruses [3]. The innate immune response is the first
54
line of defense against microbial infections in crustaceans, this response is consisted
55
of humoral and cellular responses against viral infections [4-6]. The humoral immune
56
includes recognition to microbes, signal transduction and production of immune
57
effectors in fat body such as Drosophila, and the cellular immune includes
58
encapsulation, phagocytosis, coagulation and melanization [7]. The defense molecules
59
include phenoloxidases, clotting factors, complement factors, lectins, protease
60
inhibitors, antimicrobial peptides and Toll receptors [8]. Recently, more
61
immuno-related genes of mud crab have been cloned and characterized. For instance,
62
antioxidant enzyme gene CAT is cloned from mud crab and involved in immune
63
response after bacterial infection [9]. Lin et al [10] cloned and characterized a Toll
64
gene from mud crab, they also suggested a novel Toll homolog in crab and identified
65
a SNP with potential pathogen-resistant activities. However, little is known about the
66
global molecule mechanisms underlying the immune response to such dsRNA virus
67
infection in mud crab.
AC C
EP
TE D
M AN U
SC
RI PT
49
ACCEPTED MANUSCRIPT To date, next-generation sequencing technologies, such as the Solexa/Illumina
69
technology have developed rapidly and offered significant advantages in analyzing
70
the functional complexity of the transcriptome [11]. Next-generation sequencing
71
technology has provided data on sequencing polymorphisms and the levels of gene
72
expression involved in cellular development, cancer, immune responses and
73
reproduction [12-14]. For instance, in a transcriptome data obtained from the 454 GS
74
FLX sequencing system, a total of 4,021 gonad differentially, 10,522 ovary
75
specificially, and 19,013 testis-specifically expressed genes were identified after
76
comparing mud crab libraries [12]. In another study, Illumina paired-end sequencing
77
platform was used to investigate the molecular mechanisms of Scylla paramamosain
78
underlying the immune response to Vibrio parahaemolyticus. A total of 52,934,042
79
clean reads were obtained from the hemocytes of V. parahaemolyticus-infected mud
80
crabs and controls; these clean reads were assembled into 186,193 contigs and a total
81
of 538 significantly upregulated and 675 downregulated genes were identified from
82
pathogen-infected crabs [13]. Although the transcriptome information of mud crabs
83
mentioned above has been published, the genes expressing information about their
84
virus disease should still be studied. The gills of crustaceans play well-recognized
85
roles in respiration and acid–base balance [15], but they also play a less
86
well-understood role in immune response. In this paper, we have provided the
87
transcriptome profile of mud crab gills challenged with dsRNA virus MCRV.
88 89
AC C
EP
TE D
M AN U
SC
RI PT
68
In the present study, we used Illumina Hiseq 2000 to conduct a comparative transcriptome profiling analysis between the gills of MCRV-infected mud crab and
ACCEPTED MANUSCRIPT uninfected controls. This study aims to excavate potential immune-related genes in
91
mud crab and better understand the virus-host interaction. Moreover, the
92
high-throughput sequencing produced a large number of transcripts, thereby providing
93
a strong basis for future genomic research on crab.
94
Materials and Methods
95
Mud crab sample
Mud crabs (100±10 g) purchased from the mud crab cultivating area of Nansha in
SC
96
RI PT
90
Guangdong province were acclimated in the laboratory for a week in salinity 8‰ at
98
25°C. During the whole experimental period, all crabs were fed with oyster meat and
99
the water was changed every day. Mud crabs were detected free of MCRV by using
M AN U
97
PCR amplification [16]. Forty MCRV-free crabs were used in the experiment. A total
101
of twenty mud crabs were injected intramuscularly with 0.1 mL of MCRV at 104
102
copies/g body weight, and the remaining twenty other mud crabs were injected with
103
PBS only as control.
104
Gill collection and RNA extraction
EP
Five mud crabs were randomly selected from the experimental groups at 2, 4, 8
AC C
105
TE D
100
106
and 12h after challenge. Twenty crabs were all anesthetized on ice and dissected to
107
collect gill samples. Simultaneously, five mud crabs in the control group were
108
selected at 2, 4, 8, and 12h after PBS injection, and gill samples were subsequently
109
collected. All of the samples were immediately immersed in RNAlater (Ambion, USA)
110
at 4°C overnight and at -20°C until RNA extraction within 2 weeks. The total RNAs
111
of each group were obtained by using TRIzol reagent (Invitrogen, USA), following
ACCEPTED MANUSCRIPT the manufacturer’s instructions. An Agilent 2100 bioanalyzer was used to determine
113
the integrity and quality of the total RNA. The RNA with a RNA integrity number
114
value higher than seven was considered qualified for RNA-seq. For RNA library
115
construction and deep sequencing, equal quantities of RNA of each crab from control
116
and virus group were dissolved in RNase-free water and pooled in equal quantities to
117
generate the control and virus group samples.
118
cDNA library preparation and Illumina sequencing
SC
Poly(A) RNA was purified from total RNA using poly(dT) oligo-attached
M AN U
119
RI PT
112
magnetic beads, and full-length cDNAs were synthesized with a TruSeq RNA Sample
121
Preparation Kit (Illumina Inc., USA), according to the manufacturer's protocol. The
122
following procedures including RNA fragmentation, cDNA synthesis, size selection,
123
PCR amplification and RNA-seq were performed at Beijing Genome Institute
124
(Shenzhen, China). In a typical procedure, mRNAs were fragmented into small pieces
125
(200-700 nt) in the fragmentation buffer (Ambion, USA). Subsequently, random
126
hexamer primer was used to synthesize the first-strand cDNA using the cDNA
127
Synthesis Kit (Stratagene, USA), following the manufacturer’s protocol. The short
128
fragments were purified using the QiaQuick PCR extraction kit (Qiagen, Germany) to
129
repair the end by adding a poly(A) tail. Fifteen rounds of PCR amplification were
130
carried out to enrich the purified cDNA. The cDNA libraries were sequenced on a
131
high-throughput sequencing platform of Illumina HiSeqTM 2000.
132
Analysis of Illumina sequencing results
133
AC C
EP
TE D
120
A total of 57,062,102 and 58,517,020 raw reads for PBS (HA) and virus groups
ACCEPTED MANUSCRIPT (CA) were sequenced respectively. The Q20 value (representing the accuracy of the
135
sequencing) was higher than 96% for each sample. To filter the low-quality reads, the
136
reads with adapters were removed. The reads in which unknown bases were more
137
than 10% and with a percentage of low-quality (Q value < 20) bases higher than 30%
138
were also removed. After filtering, the remaining reads were called “clean reads” and
139
used for downstream bioinformatics analysis. The mean length of the clean reads was
140
approximately 90 nt with paired ends [17]. Transcriptome denovo assembly was
141
carried out using the short read assembling program (SOAPdenovo) -Trinity with
142
default parameters for the k-mer value (k=25) [18]. First, the clean reads were
143
combined with a certain length of overlap through SOAPdenovo to form longer
144
fragments without N. These formed fragments were called contigs. Subsequently, the
145
reads were mapped back to the contigs. This program could detect contigs from the
146
same transcript, as well as the distances between these contigs with paired-end reads.
147
Afterwards, scaffolds were constructed by using SOAPdenovo through connecting the
148
contigs with N to present unknown sequences between each two contigs in the same
149
transcript. The paired-end reads were used again for filling the gap between scaffolds
150
to obtain sequences that had least Ns and could not be extended on either end. Such
151
sequences are defined as unigenes [19-21]. Therefore, unigenes from HA or CA were
152
obtained, and the above two groups of unigenes (HA&CA) were merged to produce
153
long unigenes, which were called All-Unigenes.
154 155
AC C
EP
TE D
M AN U
SC
RI PT
134
All-Unigenes were subjected to blastx similarity search with E-value threshold of 1 × 10-5 against different databases, including Nr, Swiss-Prot, KEGG, COG and GO
ACCEPTED MANUSCRIPT databases [22-25]. When a unigene could not be aligned with any of the above
157
databases, ESTScan was used to predict the coding regions and determine the
158
sequence direction [26]. Based on Nr annotation, Blast2GO program was used to
159
obtain the GO annotation of unigenes [27]. Moreover, the GO functional
160
classification of All-Unigenes was performed using WEGO software so as to
161
understand the distribution of gene functions of the species at macro level [28]. To
162
determine the functions of the gene products in different processes, KEGG pathway
163
analysis was also performed [22].
164
Differential gene expression analysis
M AN U
SC
RI PT
156
For differential gene expression analysis, the reads per-kb per-million reads
166
(RPKM) method was used as the value of normalized gene expression levels [29].
167
The cutoff value to determine the gene transcriptional activity was determined based
168
on a 95% confidence interval for all of the RPKM values of each gene. The up- or
169
downregulation of a gene was decided according to its log2 (CA_RPKM/HA_RPKM)
170
value. When log2 (CA_RPKM/HA_RPKM) >0, then a gene was upregulated in CA.
171
When log2 (CA_RPKM/HA_RPKM) <0, then a gene was downregulated in CA. A
172
GO classification and KEGG pathway analysis of the differentially expressed
173
unigenes (DEG) were performed to obtain an overall understanding of the
174
transcriptome differences between these two samples. DEGs with at least three
175
database annotations were selected for further analysis. The proportions of up- and
176
downregulated unigenes of several important pathways were analyzed to obtain a
177
clear overview of the biological processes.
AC C
EP
TE D
165
ACCEPTED MANUSCRIPT 178 179
Confirmation using quantitative real-time PCR (qRT-PCR) To validate our Hiseq2000 sequencing data, 10 differentially expressed mud crab genes were selected randomly for qRT-PCR analysis. Primers were designed
181
using the Primer6 software (Premier Biosoft International) (S1 Table). The prepared
182
total RNA used in RT-PCR analysis was isolated from the same treated crab gills as
183
that in Illumina sequencing. Reverse transcription was performed with PrimeScript®
184
RT Reagent Kit (TaKaRa, Japan), according to the manufacturer’s instructions. qRT
185
-PCR was conducted with SYBR Premix Ex Taq (TaKaRa, Japan) on the LightCycle
186
480 System (Roche, Germany), according to the manufacturer’s instructions. 18s
187
rRNA (GenBank accession number: FJ646616.1) of mud crab was used as an internal
188
control [12] to normalize the expression level and all experiments were performed in
189
triplicate. PCR amplification was performed under the following conditions: 95°C for
190
2 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. To confirm that
191
only one PCR product was amplified, dissociation curve analysis of amplification
192
products was performed at the end of each PCR reaction. After the PCR program, data
193
were analyzed with LightCycler 480 software (Roche, Germany). Furthermore, the
194
comparative CT method (2-∆∆CT method) [30] was used to analyze the expression
195
level of different genes.
196
Results
197
RNA-seq Using Illumina Platform and Assembly of Unigenes
AC C
EP
TE D
M AN U
SC
RI PT
180
198
To reveal the differences in gene expression between the PBS and virus groups, a
199
pooled cDNA sample from each crab after 2, 4, 8, 12h injection, was sequenced using
ACCEPTED MANUSCRIPT the Illumina Hiseq 2000 sequencing platform. About 51.6 and 52.7 million clean
201
reads of 90 bp were obtained from HA and CA after filtering out the dirty raw reads
202
respectively. Finally, a total of 125,599 contigs with a mean length of 388 bp in HA,
203
and 106,587 contigs with an average length of 408 bp in CA were obtained (Table 1).
204
With paired-end read joining and gap-filling, the above contigs were further
205
assembled into scaffolds. With gap-filling and long sequence clustering, the scaffolds
206
were assembled into 81,901 HA- and 70,059 CA-Unigenes with a mean length of 640
207
and 700 bp, respectively. In addition, a total of 67,279 All-Unigenes were obtained
208
using the same strategy; these unigenes had a mean size of 878 bp from the HA clean
209
reads and CA clean reads combined (All-Unigenes originated from both HA and CA
210
reads; HA-Unigenes from HA reads; CA-Unigenes from CA reads). The unigene
211
sequence of could not be extended on either end and had least Ns. HA- and
212
CA-Unigenes had a similar distribution in length, with 200-400 bp sequences
213
representing the highest proportion, followed by 400-600 bp of sequences.
SC
M AN U
TE D
In total, All-Unigenes were better than HA- and CA-Unigenes in quality.
EP
214
RI PT
200
Altogether, 14,781 HA-Unigenes (18.05% of 81,901 Unigenes), 14,655 CA-Unigenes
216
(20.92% of 70,059 Unigenes), 18,539 All-Unigenes (27.56% of 67,279 Unigenes)
217
were higher than 1000 nt.
218
Annotation of All-Unigenes
219
AC C
215
Unigene annotation provides information on the expression and functional
220
annotation of the unigene. First, unigene sequences were aligned to protein databases
221
such as Nr, Swiss-Prot, KEGG, and COG, as well as nucleotide database Nt.
ACCEPTED MANUSCRIPT Consequently, proteins with the highest sequence similarity were obtained with the
223
given unigenes, along with their protein functional annotations. Finally, a total of
224
32,547 unigenes (48.38% of the total) were matched with at least one database. (Table
225
2).
226
RI PT
222
Among these unigenes, 27,532 (40.92% of all 67,279 All-Unigenes) unigenes
were annotated using blastx against Nr, with a cut-off E-value of 0.00001 (Figure 1A).
228
The E-value distributions of the unigenes in the Nr database showed that
229
approximately 17% of the unigenes had a strong similarity (smaller than 1e-100),
230
whereas the remaining 83% of the sequences ranged from 1e-5 to 1e-100. The ratios
231
of the similarity distributions demonstrated that 23.1% and 76.9% of the sequences
232
had a similarity over 60% and ranging within 11% to 60%, respectively (Figure 1B).
233
The species distributions for the most suitable match from each sequence are shown
234
in Figure 1C. A total of 2,146 unigenes were matched with Daphnia pulex (7.79% of
235
27,352 Nr-matched hits), which had the highest blast matched ratio of the matched
236
species. Unigenes were also aligned to other databases, including 22,936 (34.09% of
237
All-Unigenes) sequences in Swiss-Prot. In addition, approximately 20,260 (30.11% of
238
All-Unigenes), 13,039 (19.58%% of All-Unigenes) and 11,866 (17.64% of
239
All-Unigenes) sequences in KEGG, GO and COG sequences, respectively; these
240
sequences had the same identical cutoff E-value to supplement the annotations and
241
functions. Furthermore, about 15,824 (23.52% of All-Unigenes) sequences were
242
aligned in Nt through blastx.
243
AC C
EP
TE D
M AN U
SC
227
In the COG classification, 11,866 were categorized in 25 functional COG
ACCEPTED MANUSCRIPT clusters. The top five enriched categories were R (general function prediction,
245
15.95%), J (translation, ribosomal structure and biogenesis 10.40%), K (transcription
246
7.85%), L (replication, recombination, and repair 7.62%) and D (cell cycle control,
247
cell division, and chromosome partitioning, 6.51%) (Figure 2).
RI PT
244
In the GO classification, 13,039 unigenes from All-Unigenes were categorized
249
into 56 GO terms consisting of three domains: biological process, cellular component
250
and molecular function. The top five clustered classes in GO were “cellular process”
251
(63.05%), “catalytic activity” (51.14%), “binding” (49.80%), “metabolic process”
252
(49.59%) and “cell part” (49.39%).
M AN U
253
SC
248
In the KEGG pathway analysis, 20,260 unigenes were categorized into 258 KEGG pathway (level 3). These unigenes were included in 42 KEGG pathways of
255
level 2 and 6 KEGG pathways of level 1. The top five clustered level 3 pathways
256
were “metabolic pathways” (12.68%), “amoebiasis” (4.38%), “regulation of actin
257
cytoskeleton” (4.25%), “Vibrio cholerae infection” (3.88%) and “focal adhesion”
258
(3.55%).
259
Analysis of the DEGs between HA and CA
EP
AC C
260
TE D
254
Among All-Unigenes, 13,856 unigenes with a false discovery rate (FDR) ≤0.001
261
and |log2Ratio|≥1 were identified as DEGs [31], which accounted for 20.59% of
262
All-Unigenes. Otherwise, 8,001 DEGs (57.74% of total DEGs) were annotated by
263
matching with at least one database. Among the DEGs, 4,444 unigenes (32.07%) were
264
upregulated in CA, whereas 9,412 (67.93%) were downregulated (S1 Fig.). The
265
number of downregulated genes was more than twofold that of upregulated genes. To
ACCEPTED MANUSCRIPT further understand the transcriptional differences between HA and CA, the GO
267
classifications and KEGG pathway analysis were also performed for the DEGs.
268
According to the GO classification, 3,110 DEGs were categorized into 54 GO terms
269
(Figure 3). The top five clustered classes in GO were “catalytic activity” (62.70%),
270
“cellular process” (53.38%), “metabolic process” (45.79%), “binding” (45.69%), and
271
“cell part” (41.00%). The top five clustered classes in “molecular function
272
classification” of GO were “catalytic activity” (62.70%), “binding” (45.69%),
273
“transporter activity” (4.92%), “structural molecule activity” (4.73%), and “molecular
274
transducer activity” (1.48%). KEGG pathway analysis was performed to explore the
275
biological function of the significantly DEGs, and 20,260 DEGs were mapped to 255
276
level 3 KEGG pathways. These 255 level 3 KEGG pathways were included in 42
277
level 2 KEGG pathways, which could be clustered into six categories (Figure 4). The
278
top ten mapped pathways among these level 3 KEGG pathways are shown in Table 3.
279
“Metabolic pathways” enriched most DEGs (13.25%), which is probably because
280
metabolism is one of the most important pathways on the whole transcriptome
281
background. In addition, DEGs were obviously enriched in the pathways related to
282
different diseases or infections, such as “Amoebiasis”, “V. cholerae infection”,
283
“Huntington's disease”, “Epstein-Barr virus infection” and “Pathways in cancer”.
284
Interestingly, most of the unigenes mapped to the above pathways were dramatically
285
downregulated in CA. Except for the “ribosome” pathway, the number of
286
downregulated unigenes in top nine other DEG enriched pathways was more than that
287
of upregulated unigenes; this number for “ribosome” pathway was 27%.
AC C
EP
TE D
M AN U
SC
RI PT
266
ACCEPTED MANUSCRIPT 288
For further analysis, the DEG ratio of All-Unigenes was calculated in different pathways. After filtering out the pathways that enriched less than 20 DEGs, the
290
pathways with the significantly highest DEG ratio were observed (Table 4).
291
Remarkably, these top 10 pathways observed were divided into two categories. The
292
first category is related to disease resistance and immune, such as “antigen processing
293
and presentation”, “regulation of autophagy”, “proteasome”, “rheumatoid arthritis”,
294
“complement and coagulation cascades”. The second one is related to metabolism,
295
such as “aldosterone-regulated sodium reabsorption”, “proximal tubule bicarbonate
296
reclamation”, “drug metabolism-cytochrome P450, “metabolism of xenobiotics by
297
cytochrome P450”, and “steroid hormone biosynthesis”. In 8/10 of these top pathways,
298
the number of downregulated unigenes was more than 2.5-fold that of upregulated
299
unigenes. Furthermore, the DEGs in “proteasome” pathway were dramatically
300
downregulated. Otherwise, the number of downregulated and upregulated DEGs were
301
approximated in the pathway of “complement and coagulation cascades”. These
302
results might indicate that the immune system and metabolism of crabs were
303
influenced by the virus infection.
SC
M AN U
TE D
EP
AC C
304
RI PT
289
Among DEGs, the key genes involved in Wnt signaling pathway,
305
mitogen-activated protein kinase (MAPK) signaling pathway and apoptosis pathway
306
were focused. The components of these immune pathways are described in S2 Table.
307
Validation of Illumina sequencing via qRT-PCR
308 309
Ten unigenes were used for sequencing and computational analysis to confirm the characterizations of expression by using qRT-PCR analysis. The melting-curve
ACCEPTED MANUSCRIPT analysis of qRT-PCR demonstrated a single product for all tested genes. The
311
qRT-PCR results confirmed that the data obtained from Hiseq2000 sequencing
312
analysis showed similar trends in the up- or downregulation of host genes (Figure 5).
313
For example, based on Hiseq2000 sequencing analysis, serine protease,
314
prophenoloxidase and peroxinectin were regulated by 2.96, -2.71 and 3.11 log2-fold,
315
and they showed 3.29, -2.10 and 3.06 log2-fold changes, respectively in qRT-PCR
316
analyses (Figure 5). These results showed that qRT-PCR and Illumina RNA-seq
317
analyses of these had a good agreement, thereby indicating that the transcriptome data
318
can reflect an actual gene expression profile in MCRV-infected mud crabs.
319
Discussion
SC
M AN U
320
RI PT
310
Recently, high-throughput sequencing techniques, such as Solexa/Illumina (Illumina), 454 (Roche), and SOLID (ABI) have developed rapidly [32]. They
322
provide a rapid and high-throughput method to identify DEGs and their express
323
profile, particularly in species whose genomic information is unavailable [33-35].
324
These techniques are suitable for genomic research, such as denovo and re-sequencing
325
of genome, mRNA, and microRNA [21, 36-39]. In transcriptome analysis, the usage
326
of the next-generation sequencing techniques causes the easy establishment of cDNA
327
libraries without bacteria cells as carriers, which could introduce DNA fragment
328
deletion [40]. This denovo transcriptome sequencing technology has been used for
329
many crustaceans, such as Parhyale hawaiensis [41], Euphausia superb [42],
330
Macrobrachium rosenbergii [43] using 454 sequencing, Eriocheir sinensis [44],
331
Portunus trituberculatus [45], and Litopenaeus vannamei [46] using Illumina
AC C
EP
TE D
321
ACCEPTED MANUSCRIPT sequencing. In the present study, we used gills of mud crab to perform a genome-wide
333
investigation through transcriptional sequencing and gene expression profile analysis,
334
and the reliability and accuracy of qRT-PCR was confirmed. This study is the first
335
transcriptomic analysis on mud crab involved in MCRV infection.
RI PT
332
A total of 104.3 million clean reads from the gills of MCRV-infected mud crabs
337
and controls were obtained and assembled into 67,279 unigenes, and about 32,547 of
338
these unigenes were annotated as known proteins according to their homology. Xie et
339
al. performed a transcriptomic study in the hemocytes of mud crabs challenged with V.
340
parahaemolyticus and obtained 81,709 assembled unigenes [13]. In another study, the
341
hepatopancreas transcriptome of E. sinensis infected with a mixture of three pathogen
342
strains was analyzed by Illumina technique [6]. These studies have provided a source
343
for identification of novel genes and largely enriched the transcriptome data of crabs.
M AN U
TE D
344
SC
336
Previously, MCRV has caused serious problems in cultured mud crab [2]. Diseased crabs are sluggish and show loss of appetite and grey colouration. Moribund
346
crabs show an atrophied hepatopancreas and loose gills prominently [2]. In the current
347
transcriptomic analysis, a total of 13,856 DEGs, including 4,444 significantly
348
upregulated and 9,412 downregulated DEGs, were detected in diseased crabs. More
349
downregulated genes were observed than the upregulated ones, which implied that
350
gene expression was largely inhibited by MCRV infection, and biological functions
351
were impaired. According to the KEGG pathway analysis of DEGs, the top five
352
pathways were metabolic pathways, amoebiasis, V. cholerae infection, Huntington's
353
disease, and ribosome. Among these pathways, many unigenes were classified into
AC C
EP
345
ACCEPTED MANUSCRIPT the amoebiasis and “V. cholerae infection” KEGG pathways. Therefore, as an animal
355
living in water, mud crabs are vulnerable to bacterial and parasitic infection and may
356
have involved in complicated systems and signal pathway against infection when they
357
become infected with MCRV.
358
RI PT
354
As crab lack an acquired immune system, their defense depends entirely on an innate, non-adaptive mechanism to defend invasion of pathogens, such as virus,
360
bacteria, and parasite [47]. MAPK signaling is activated during virus infection and
361
contributes to virus replication in animal cells [48]. In the present study, many genes
362
involved in MAPK pathway were found to undergo expression changes in the gills
363
following MCRV injection. For instance, ATF2 (cyclic AMP-dependent transcription
364
factor), HGK (mitogen-activated protein kinase kinase kinase kinase 4) and ECSIT
365
(evolutionarily conserved signaling intermediate in Toll pathway) were up regulated
366
significantly, hence indicating that they might play an important role in response to
367
MCRV infection. However, the underlying molecular mechanism also remains unclear.
368
In mammals, ATF2 has both tumor-promoting as well as tumor-suppressive roles, as
369
well as crucial roles in oncogenic transformation and tumorigenesis [49]. To date, this
370
study is the first report showing the changes in the ATF2 level of MAPK pathway in
371
MCRV-infected crabs.
M AN U
TE D
EP
AC C
372
SC
359
Melanization, which is performed by PO and controlled by the proPO, is an
373
important immune response in invertebrates [50]. In our transcriptome data, unigenes
374
annotated as expressed PO and proPO were exhibited significantly after MCRV
375
infection. For instance, CL1697.Contig1_All, annotated as peroxinectin, was
ACCEPTED MANUSCRIPT upregulated in the MCRV-infected crab with the 3.52-fold changes (log2 ratio). PO is
377
a multifunctional immune component with both cell adhesive and peroxidase
378
activities, and it plays a crucial role in clearing the invading bacteria or parasites in
379
crustaceans [51, 52]. This component is also involved in the host response to exterior
380
stimulus, such as beads, lipopolysaccharide, and Aeromonas hydrophila [53].
381
Furthermore, the expression level of peroxinectin gene was upregulated in L.
382
vannamei after being challenged with V.alginolyticus at 6 h [54], and at 48 h post
383
WSSV challenge in S. paramamosain [55]. The upregulation of peroxinectin is
384
associated with enhanced encapsulation and phagocytic activity to boost pathogen
385
resistance [56]. Invading pathogens may initiate the proPO-activating system that
386
catalyzes proPO to active form PO, which is involved in wound healing and
387
sequestration of pathogens [57-61]. Unigene19990_All has been annotated as proPO,
388
which is downregulated with the 2.71-fold changes (log2 ratio) in mud crab after
389
being challenged with MCRV. proPO-b is also downregulated in WSSV-infected L.
390
vannamei [62]. A similar significant downregulation of proPO was observed in the
391
gills of MCRV-infected mud crab, which suggested that virus infection might inhibit
392
the expression of proPO and weaken the immune system of the crab.
SC
M AN U
TE D
EP
AC C
393
RI PT
376
In summary, this study focused on the difference of the transcriptome between
394
MCRV-infected and non-infected crabs. 13,856 differentially expressed genes
395
including 4,444 significantly upregulated and 9,412 downregulated DEGs were
396
obtained, which enriched in the pathways related to different diseases or infections.
397
Although the molecular functions of most genes and their associated pathways remain
ACCEPTED MANUSCRIPT 398
largely unknown, this study provides a valuable information on the antiviral
399
mechanism in crab. Furthermore, the result helps identify these novel genes in the
400
gills of mud crab in the absence of the genome database of crab.
RI PT
401
Support information
403
S1 Table. Primers used in qRT-PCR
404
S2 Table. Candidate genes involved in mud crab immune response
405
S1 Fig. Differences in DEGs expression profile between HA and CA. “DEGs”
406
indicates those unigenes with FDR≤0.001 and |log2Ratio|≥1.
407
Acknowledgments
408 409
This research was supported by the National Natural Science Foundation of China under Grant Nos. C190602.
M AN U
SC
402
References:
1. Ma H, Ma C, Li S, Jiang W, Li X, Liu Y, et al. Transcriptome Analysis of the Mud Crab (Scylla paramamosain) by 454 Deep Sequencing: Assembly, Annotation, and Marker Discovery. PLOS ONE. 2014 9.
2. Weng S, Guo Z, Sun J, Chan S, He J. A reovirus disease in cultured mud crab, Scylla serrata, in
EP
southern China. J FISH DIS. 2007 30:133-9.
3. Deng XX, Lu L, Ou YJ, Su HJ, Li G, Guo ZX, et al. Sequence analysis of 12 genome segments of mud crab reovirus (MCRV). VIROLOGY. 2012 422:185-94. 4. Hoffmann JA, Kafatos FC, Janeway CA, Ezekowitz RA. Phylogenetic perspectives in innate
AC C
411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430
TE D
410
immunity. SCIENCE. 1999 284:1313-8. 5. Li F, Xiang J. Recent advances in researches on the innate immunity of shrimp in China. DEV COMP IMMUNOL. 2013 39:11-26. 6. Li X, Cui Z, Liu Y, Song C, Shi G. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Hepatopancreas of Microbial Challenged Mitten Crab Eriocheir sinensis. PLOS ONE. 2013 8. 7. Lemaitre B, Hoffmann J. The host defense of Drosophila melanogaster. ANNU REV IMMUNOL. 2007 25:697-743. 8. Iwanaga S, Lee BL. Recent advances in the innate immunity of invertebrate animals. J Biochem Mol Biol. 2005 38:128-50. 9. Liu HP, Chen FY, Gopalakrishnan S, Qiao K, Bo J, Wang KJ. Antioxidant enzymes from the crab Scylla paramamosain: gene cloning and gene/protein expression profiles against LPS challenge. Fish
ACCEPTED MANUSCRIPT Shellfish Immunol. 2010 28:862-71. 10. Lin Z, Qiao J, Zhang Y, Guo L, Huang H, Yan F, et al. Cloning and characterisation of the SpToll gene from green mud crab, Scylla paramamosain. DEV COMP IMMUNOL. 2012 37:164-75. 11. Czesny S, Epifanio J, Michalak P. Genetic Divergence between Freshwater and Marine Morphs of Alewife (Alosa pseudoharengus): A 'Next-Generation' Sequencing Analysis. PLOS ONE. 2012 7. 12. Gao J, Wang X, Zou Z, Jia X, Wang Y, Zhang Z. Transcriptome analysis of the differences in gene expression between testis and ovary in green mud crab (Scylla paramamosain). BMC
RI PT
GENOMICS. 2014 15.
13. Xie C, Chen Y, Sun W, Ding J, Zhou L, Wang S, et al. Transcriptome and Expression Profiling Analysis of the Hemocytes Reveals a Large Number of Immune-Related Genes in Mud Crab Scylla paramamosain during Vibrio parahaemolyticus Infection. PLOS ONE. 2014 9.
14. Zhenzhen X, Ling X, Dengdong W, Chao F, Qiongyu L, Zihao L, et al. Transcriptome analysis of microsatellite markers. PLOS ONE. 2014 9:e109419.
SC
the Trachinotus ovatus: identification of reproduction, growth and immune-related genes and 15. Henry RP, Lucu C, Onken H, Weihrauch D. Multiple functions of the crustacean gill: FRONTIERS IN PHYSIOLOGY. 2012 3.
M AN U
osmotic/ionic regulation, acid-base balance, ammonia excretion, and bioaccumulation of toxic metals. 16. Guo Z, Weng S, Li G, Chan S, He J. Development of an RT-PCR detection method for mud crab reovirus. J VIROL METHODS. 2008 151:237-41.
17. Zhang YJ, Wang XJ, Wu JX, Chen SY, Chen H, Chai LJ, et al. Comparative transcriptome analyses between a spontaneous late-ripening sweet orange mutant and its wild type suggest the functions of ABA, sucrose and JA during citrus fruit ripening. PLOS ONE. 2014 9:e116056. 18. Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, et al. De novo assembly of human genomes with
TE D
massively parallel short read sequencing. GENOME RES. 2010 20:265-72. 19. Hao DC, Ge G, Xiao P, Zhang Y, Yang L. The first insight into the tissue specific taxus transcriptome via Illumina second generation sequencing. PLOS ONE. 2011 6:e21220. 20. Qin YF, Fang HM, Tian QN, Bao ZX, Lu P, Zhao JM, et al. Transcriptome profiling and digital gene expression by deep-sequencing in normal/regenerative tissues of planarian Dugesia japonica.
EP
GENOMICS. 2011 97:364-71.
21. Surget-Groba Y, Montoya-Burgos JI. Optimization of de novo transcriptome assembly from next-generation sequencing data. GENOME RES. 2010 20:1432-40. 22. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto Encyclopedia of
AC C
431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474
Genes and Genomes. NUCLEIC ACIDS RES. 1999 27:29-34. 23. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG database: an updated version includes eukaryotes. BMC BIOINFORMATICS. 2003 4:41. 24. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. NUCLEIC ACIDS RES. 2000 28:33-6. 25. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, et al. The COG database: new developments in phylogenetic classification of proteins from complete genomes. NUCLEIC ACIDS RES. 2001 29:22-8. 26. Iseli C, Jongeneel CV, Bucher P. ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc Int Conf Intell Syst Mol Biol. 1999:138-48. 27. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for
ACCEPTED MANUSCRIPT annotation, visualization and analysis in functional genomics research. BIOINFORMATICS. 2005 21:3674-6. 28. Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, et al. WEGO: a web tool for plotting GO annotations. NUCLEIC ACIDS RES. 2006 34:W293-7. 29. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. NAT METHODS. 2008 5:621-8. PCR and the 2(T)(-Delta Delta C) method. METHODS. 2001 25:402-8.
RI PT
30. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative 31. Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I. Controlling the false discovery rate in behavior genetics research. BEHAV BRAIN RES. 2001 125:279-84.
32. Metzker ML. Sequencing technologies - the next generation. NAT REV GENET. 2010 11:31-46. 33. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, et al. Genome
SC
sequencing in microfabricated high-density picolitre reactors. NATURE. 2005 437:376-80.
34. Liu L, Li Y, Li S, Hu N, He Y, Pong R, et al. Comparison of Next-Generation Sequencing Systems. JOURNAL OF BIOMEDICINE AND BIOTECHNOLOGY. 2012.
35. Guryev V, Cuppen E. Next-generation sequencing approaches in genetic rodent model systems to
M AN U
study functional effects of human genetic variation. FEBS LETT. 2009 583:1668-73. 36. Morozova O, Marra MA. Applications of next-generation sequencing technologies in functional genomics. GENOMICS. 2008 92:255-64.
37. Git A, Dvinge H, Salmon-Divon M, Osborne M, Kutter C, Hadfield J, et al. Systematic comparison of microarray profiling, real-time PCR, and next-generation sequencing technologies for measuring differential microRNA expression. RNA. 2010 16:991-1006.
38. Hashimoto S, Qu W, Ahsan B, Ogoshi K, Sasaki A, Nakatani Y, et al. High-resolution analysis of sequencer. PLOS ONE. 2009 4:e4108.
TE D
the 5'-end transcriptome using a next generation DNA
39. Li R, Fan W, Tian G, Zhu H, He L, Cai J, et al. The sequence and de novo assembly of the giant panda genome. NATURE. 2010 463:311-7.
40. Wall PK, Leebens-Mack J, Chanderbali AS, Barakat A, Wolcott E, Liang H, et al. Comparison of next generation sequencing technologies for transcriptome characterization. BMC GENOMICS. 2009
EP
10:347.
41. Zeng V, Villanueva KE, Ewen-Campen BS, Alwes F, Browne WE, Extavour CG. De novo assembly and characterization of a maternal and developmental transcriptome for the emerging model crustacean Parhyale hawaiensis. BMC GENOMICS. 2011 12:581.
AC C
475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518
42. Clark MS, Thorne MA, Toullec JY, Meng Y, Guan LL, Peck LS, et al. Antarctic krill 454 pyrosequencing reveals chaperone and stress transcriptome. PLOS ONE. 2011 6:e15919. 43. Jung H, Lyons RE, Dinh H, Hurwood DA, McWilliam S, Mather PB. Transcriptomics of a giant freshwater prawn (Macrobrachium rosenbergii): de novo
assembly, annotation and marker discovery.
PLOS ONE. 2011 6:e27938. 44. He L, Wang Q, Jin X, Wang Y, Chen L, Liu L, et al. Transcriptome profiling of testis during sexual maturation stages in Eriocheir sinensis using Illumina sequencing. PLOS ONE. 2012 7:e33735. 45. Lv J, Liu P, Wang Y, Gao B, Chen P, Li J. Transcriptome analysis of Portunus trituberculatus in response to salinity stress provides insights into the molecular basis of osmoregulation. PLOS ONE. 2013 8:e82155. 46. Li C, Weng S, Chen Y, Yu X, Lu L, Zhang H, et al. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique. PLOS ONE. 2012 7:e47442.
ACCEPTED MANUSCRIPT
557 558 559
47. Kurata S, Ariki S, Kawabata S. Recognition of pathogens and activation of immune responses in Drosophila and horseshoe crab innate immunity. IMMUNOBIOLOGY. 2006 211:237-49. 48. Andrade AA, Silva PN, Pereira AC, De Sousa LP, Ferreira PC, Gazzinelli RT, et al. The vaccinia virus-stimulated mitogen-activated protein kinase (MAPK) pathway is
required for virus
multiplication. BIOCHEM J. 2004 381:437-46. 49. Gozdecka M, Breitwieser W. The roles of ATF2 (activating transcription factor 2) in tumorigenesis. Biochem Soc Trans. 2012 40:230-4.
RI PT
50. Cerenius L, Lee BL, Soderhall K. The proPO-system: pros and cons for its role in invertebrate immunity. TRENDS IMMUNOL. 2008 29:263-71.
51. Johansson MW. Cell adhesion molecules in invertebrate immunity. DEV COMP IMMUNOL. 1999 23:303-15.
52. Jiravanichpaisal P, Lee BL, Soderhall K. Cell-mediated immunity in arthropods: hematopoiesis,
SC
coagulation, melanization and opsonization. IMMUNOBIOLOGY. 2006 211:213-36.
53. Lv S, Lu B, Xu J, Xu H, Zhao J, Li S, et al. Immune response of peroxinectin of Chinese mitten crab Eriocheir sinensis to exterior stimulation. DEV COMP IMMUNOL. 2015 51:56-64. 54. Liu CH, Cheng W, Chen JC. The peroxinectin of white shrimp Litopenaeus vannamei is
M AN U
synthesised in the semi-granular and granular cells, and its transcription is up-regulated with Vibrio alginolyticus infection. FISH SHELLFISH IMMUN. 2005 18:431-44.
55. Du Z, Ren Q, Huang A, Fang W, Zhou J, Gao L, et al. A novel peroxinectin involved in antiviral and antibacterial immunity of mud crab, Scylla paramamosain. MOL BIOL REP. 2013 40:6873-81. 56. Kobayashi M, Johansson MW, Ll KSD. The 76 kD cell-adhesion factor from crayfish haemocytes promotes encapsulation in vitro. CELL TISSUE RES. 1990 260:13-8.
57. Barrett FM. Wound-healing phenoloxidase in larval cuticle of Calpodes ethlius (Lepidoptera:
TE D
Hesperiidae). Canadian Journal of Zoology. 1984 62:834-8.
58. T L Hopkins A, Kramer KJ. Insect Cuticle Sclerotization. ANNU REV ENTOMOL. 2003 37:273-302.
59. Lai SC, Chen CC, Hou RF. Immunolocalization of prophenoloxidase in the process of wound 39:266-74.
EP
healing in the mosquito Armigeres subalbatus (Diptera : Culicidae). J MED ENTOMOL. 2002 60. Ratcliffe NA, Brookman JL, Rowley AF. Activation of the prophenoloxidase cascade and initiation of nodule formation in locusts by bacterial lipopolysaccharides. DEV COMP IMMUNOL. 1991 15:33-9.
AC C
519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556
61. Aspan A, Huang T, Cerenius L, Soderhall K. CDNA cloning of prophenoloxidase from the freshwater crayfish Pacifastacus leniusculus and its activation. P NATL ACAD SCI USA. 1995 92:939-43.
62. Ai H, Huang Y, Li S, Weng S, Yu X, He J. Characterization of a prophenoloxidase from hemocytes of the shrimp Litopenaeus vannamei that is down-regulated by white spot syndrome virus. FISH SHELLFISH IMMUN. 2008 25:28-39.
ACCEPTED MANUSCRIPT 560
Table legends
561
Table 1 Summary of the transcriptome assembly in HA and CA.
562
HA: PBS group
563
(HA&CA) were merged to produce long unigenes, called ALL-Unigenes.
564
Table 2 Summary of the annotations of S. paramamosain unigenes.
ALL: the above two groups of unigenes
RI PT
CA: virus group
565
Table 3 Summary of the top 10 DEGs enriched pathway by KEGG Pathway analysis.
SC
566
M AN U
567 568
Table 4 Summary of the pathway with top 10 DEGs ratio.
569
DEGs ratio= (the number of DEGs in the pathway/the number of mapped unigenes in
570
the pathway) %
TE D
571 572 573
Figure legends
575
Figure 1 Figures of Nr classification. (A) E-value distribution. (B) Similarity
576
distribution. (C) Species distribution.
AC C
577
EP
574
578
Figure 2 COG function classification of unigenes. Unigenes annotated by COG were
579
classified into 25 categories.
580 581
Figure 3 Histogram presentation of Gene Ontology classification of the DEGs
ACCEPTED MANUSCRIPT (HA-VS-CA). Major GO categories terms are shown in the x-axis and grouped into
583
three main ontologies: biological process, cellular component and molecular function.
584
The righty-axis indicates the number of genes in each category, while the left y-axis
585
indicates the percentage of total genes in that category.
586
A. biological process
587
B. cellular component
588
C. molecular function
SC
RI PT
582
M AN U
589
Figure 4 Histogram presentation of KEGG pathway (lv2) analysis of the DEGs
591
(HA-VS-CA). These KEGG pathway (level 2) are clustered into 6 KEGG pathway
592
(level 1).
593
A. Organismal Systems;
594
B. Metabolism;
595
C. Human Diseases;
596
D. Genetic Information Processing;
597
E. Environmental Information Processing;
598
F. Cellular Processes.
EP
AC C
599
TE D
590
600
Figure 5 Comparison of the expression profiles of 10 selected genes as determined by
601
Hiseq2000 sequencing and qRT-PCR. Gene abbreviations are as follows: Cu/Zn SOD,
602
copper/zinc
603
antilipopolysaccharide factor; HSP90, heat shock protein 90
superoxide;
FABP,
fatty
acid
binding
protein;
ALF,
ACCEPTED MANUSCRIPT
Table 1
HA
81,901
Total Length(nt) 52,431,015
CA
70,059
49,072,111
ALL
67,279
59,103,964
HA
125,599
48,695,656
CA
106,587
43,491,909
Mean Length(nt)
N50
640
1,239
700
1,442
878
1,601
388
708
408
812
AC C
EP
TE D
M AN U
SC
Contigs
Total Number
RI PT
Unigenes
Sample
ACCEPTED MANUSCRIPT
Table 2 Number of
Ratio of
Annotation
blasted Unigenes
ALL-Unigenes
NR
27,532
40.92%
NT
15,824
23.52%
Swiss-Prot
22,936
34.09%
KEGG
20,260
30.11%
COG
11,866
17.64%
GO
13,039
total
32,547
SC
M AN U
TE D EP AC C
RI PT
Database for
19.38%
48.38%
ACCEPTED MANUSCRIPT
#Pathway
Up
Down
DEGs
RATIO in ALL DEGs
Metabolic pathways
164
490
654
13.25%
Amoebiasis
87
147
234
4.74%
Vibrio cholerae infection
77
147
224
Huntington's disease
29
193
222
Ribosome
96
122
218
RI PT
Table 3
Regulation of actin cytoskeleton
50
134
184
3.73%
Phagosome
17
143
160
3.24%
Epstein-Barr virus infection
34
Focal adhesion
35
Pathways in cancer
42
4.50%
M AN U
SC
4.42%
126
160
3.24%
120
155
3.14%
108
150
3.04%
TE D EP AC C
4.54%
ACCEPTED MANUSCRIPT
Table 4 Down/ #Pathway
Up
Down
DEGs
All-Unigenes
DEG/ALL Up
69
80
156
1
35
36
73
0
46
46
101
5
29
7
54
and presentation Regulation of
Proteasome Aldosterone-regulate d sodium reabsorption
Proximal tubule
reclamation
5
22
EP
bicarbonate
6.27
49.32%
35.00
45.54%
--
34
75
45.33%
5.80
61
135
45.19%
7.71
27
61
44.26%
4.40
TE D
Rheumatoid arthritis
M AN U
autophagy
51.28%
SC
11
RI PT
Antigen processing
AC C
Drug metabolism -
6
24
30
69
43.48%
4.00
24
22
46
108
42.59%
0.92
7
24
31
73
42.47%
3.43
cytochrome P450 Complement and
coagulation cascades Metabolism of xenobiotics by cytochrome P450
ACCEPTED MANUSCRIPT Steroid hormone 6
15
21
51
41.18%
2.50
AC C
EP
TE D
M AN U
SC
RI PT
biosynthesis
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
AC C
EP
TE D
M AN U
SC
RI PT
ACCEPTED MANUSCRIPT
ACCEPTED MANUSCRIPT Highlights:
RI PT
SC
M AN U
TE D
EP
First transcriptome of Mud crab gills response to Mud crab reovirus infection using Illumina Hiseq 2000. A total of 104.3 million clean reads were obtained, and 81,901, 70,059 and 67,279 unigenes were gained respectively from HA reads, CA reads and HA&CA reads. About 13,856 DEGs were detected in diseased crabs compared with the control, including 4,444 significantly upregulated and 9,412 downregulated DEGs. Ten of differently expressed genes were selected for validation with RT-qPCR. Immune-related genes involved in Wnt signaling pathway, MAPK signaling pathway and apoptosis pathway were focused.
AC C