Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV)

Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV)

Accepted Manuscript Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV) Shanshan Liu, Guanxing Chen...

1MB Sizes 67 Downloads 190 Views

Accepted Manuscript Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV) Shanshan Liu, Guanxing Chen, Haidong Xu, Weibin Zou, Wenrui Yan, Qianqian Wang, Hengwei Deng, Heqian Zhang, Guojiao Yu, Jianguo He, Shaoping Weng PII:

S1050-4648(16)30467-3

DOI:

10.1016/j.fsi.2016.07.033

Reference:

YFSIM 4092

To appear in:

Fish and Shellfish Immunology

Received Date: 7 May 2016 Revised Date:

22 July 2016

Accepted Date: 31 July 2016

Please cite this article as: Liu S, Chen G, Xu H, Zou W, Yan W, Wang Q, Deng H, Zhang H, Yu G, He J, Weng S, Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to Mud crab reovirus (MCRV), Fish and Shellfish Immunology (2016), doi: 10.1016/j.fsi.2016.07.033. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

ACCEPTED MANUSCRIPT 1

Transcriptome analysis of mud crab (Scylla paramamosain) gills in response to

2

Mud crab reovirus (MCRV)

3

Shanshan Liu

4

Qianqian Wang a,b, Hengwei Deng a,b, Heqian Zhang a,b, Guojiao Yu a,b, Jianguo He a,b,c,,

5

Shaoping Weng a,b*

6

a

7

Biocontrol, School of Life Sciences, Sun Yat-sen University, Guangzhou, PR China

8

b

9

Aquatic Economic Animals, Sun Yat-sen University, Guangzhou, PR China c

SC

Institute of Aquatic Economic Animals and Guangdong Province Key Laboratory for

School of Marine Sciences, Sun Yat-sen University, Guangzhou, PR China

TE D

12 13 14

EP

15

19

AC C

16

18

, Haidong Xu c, Weibin Zou a,b, Wenrui Yan a,b,

MOE Key Laboratory of Aquatic Product Safety/State Key Laboratory for

11

17

a,b

RI PT

, Guanxing Chen

M AN U

10

a,b

* Corresponding author: MOE Key Laboratory of Aquatic Product Safety/State

20

Key Laboratory for Biocontrol, School of Life Sciences, Sun Yat-sen University, 135

21

Xingang Road West, Guangzhou 510275, People's Republic of China. Fax: + 86 20

22

84113229.

23

E-mail address: [email protected] (S. Weng).

ACCEPTED MANUSCRIPT 24

25

Abstract

Mud crab (Scylla paramamosain) is an economically important marine cultured species in China’s coastal area. Mud crab reovirus (MCRV) is the most important

27

pathogen of mud crab, resulting in large economic losses in crab farming. In this

28

paper, next-generation sequencing technology and bioinformatics analysis are used to

29

study transcriptome differences between MCRV-infected mud crab and normal

30

control. A total of 104.3 million clean reads were obtained, including 52.7 million and

31

51.6 million clean reads from MCRV-infected (CA) and controlled (HA) mud crabs

32

respectively. 81,901, 70,059 and 67,279 unigenes were gained respectively from HA

33

reads, CA reads and HA&CA reads. A total of 32,547 unigenes from HA&CA reads

34

called All-Unigenes were matched to at least one database among Nr, Nt, Swiss-prot,

35

COG, GO and KEGG databases. Among these, 13,039, 20,260 and 11,866 unigenes

36

belonged to the 3, 258 and 25 categories of GO, KEGG pathway, and COG databases,

37

respectively. Solexa/Illumina’s DGE platform was also used, and about 13,856

38

differentially expressed genes (DEGs), including 4,444 significantly upregulated and

39

9,412 downregulated DEGs were detected in diseased crabs compared with the

40

control. KEGG pathway analysis revealed that DEGs were obviously enriched in the

41

pathways related to different diseases or infections. This transcriptome analysis

42

provided valuable information on gene functions associated with the response to

43

MCRV in mud crab, as well as detail information for identifying novel genes in the

44

absence of the mud crab genome database.

45

AC C

EP

TE D

M AN U

SC

RI PT

26

ACCEPTED MANUSCRIPT 46

Key words: mud crab, MCRV, transcriptome, immune, KEGG pathway

47

Introduction

48

Mud crab is an economically important marine cultured species that is widely distributed in southeast China coasts [1]. However, Mud crab reovirus (MCRV) is the

50

main causative agent of a “sleeping disease” with about 70% mortality, which results

51

in large economic losses [2, 3]. MCRV is composed of 12 dsRNA segments with a

52

total length of 24.464 kb, and the majority of MCRV genes share low homology with

53

the counterpart genes of other reoviruses [3]. The innate immune response is the first

54

line of defense against microbial infections in crustaceans, this response is consisted

55

of humoral and cellular responses against viral infections [4-6]. The humoral immune

56

includes recognition to microbes, signal transduction and production of immune

57

effectors in fat body such as Drosophila, and the cellular immune includes

58

encapsulation, phagocytosis, coagulation and melanization [7]. The defense molecules

59

include phenoloxidases, clotting factors, complement factors, lectins, protease

60

inhibitors, antimicrobial peptides and Toll receptors [8]. Recently, more

61

immuno-related genes of mud crab have been cloned and characterized. For instance,

62

antioxidant enzyme gene CAT is cloned from mud crab and involved in immune

63

response after bacterial infection [9]. Lin et al [10] cloned and characterized a Toll

64

gene from mud crab, they also suggested a novel Toll homolog in crab and identified

65

a SNP with potential pathogen-resistant activities. However, little is known about the

66

global molecule mechanisms underlying the immune response to such dsRNA virus

67

infection in mud crab.

AC C

EP

TE D

M AN U

SC

RI PT

49

ACCEPTED MANUSCRIPT To date, next-generation sequencing technologies, such as the Solexa/Illumina

69

technology have developed rapidly and offered significant advantages in analyzing

70

the functional complexity of the transcriptome [11]. Next-generation sequencing

71

technology has provided data on sequencing polymorphisms and the levels of gene

72

expression involved in cellular development, cancer, immune responses and

73

reproduction [12-14]. For instance, in a transcriptome data obtained from the 454 GS

74

FLX sequencing system, a total of 4,021 gonad differentially, 10,522 ovary

75

specificially, and 19,013 testis-specifically expressed genes were identified after

76

comparing mud crab libraries [12]. In another study, Illumina paired-end sequencing

77

platform was used to investigate the molecular mechanisms of Scylla paramamosain

78

underlying the immune response to Vibrio parahaemolyticus. A total of 52,934,042

79

clean reads were obtained from the hemocytes of V. parahaemolyticus-infected mud

80

crabs and controls; these clean reads were assembled into 186,193 contigs and a total

81

of 538 significantly upregulated and 675 downregulated genes were identified from

82

pathogen-infected crabs [13]. Although the transcriptome information of mud crabs

83

mentioned above has been published, the genes expressing information about their

84

virus disease should still be studied. The gills of crustaceans play well-recognized

85

roles in respiration and acid–base balance [15], but they also play a less

86

well-understood role in immune response. In this paper, we have provided the

87

transcriptome profile of mud crab gills challenged with dsRNA virus MCRV.

88 89

AC C

EP

TE D

M AN U

SC

RI PT

68

In the present study, we used Illumina Hiseq 2000 to conduct a comparative transcriptome profiling analysis between the gills of MCRV-infected mud crab and

ACCEPTED MANUSCRIPT uninfected controls. This study aims to excavate potential immune-related genes in

91

mud crab and better understand the virus-host interaction. Moreover, the

92

high-throughput sequencing produced a large number of transcripts, thereby providing

93

a strong basis for future genomic research on crab.

94

Materials and Methods

95

Mud crab sample

Mud crabs (100±10 g) purchased from the mud crab cultivating area of Nansha in

SC

96

RI PT

90

Guangdong province were acclimated in the laboratory for a week in salinity 8‰ at

98

25°C. During the whole experimental period, all crabs were fed with oyster meat and

99

the water was changed every day. Mud crabs were detected free of MCRV by using

M AN U

97

PCR amplification [16]. Forty MCRV-free crabs were used in the experiment. A total

101

of twenty mud crabs were injected intramuscularly with 0.1 mL of MCRV at 104

102

copies/g body weight, and the remaining twenty other mud crabs were injected with

103

PBS only as control.

104

Gill collection and RNA extraction

EP

Five mud crabs were randomly selected from the experimental groups at 2, 4, 8

AC C

105

TE D

100

106

and 12h after challenge. Twenty crabs were all anesthetized on ice and dissected to

107

collect gill samples. Simultaneously, five mud crabs in the control group were

108

selected at 2, 4, 8, and 12h after PBS injection, and gill samples were subsequently

109

collected. All of the samples were immediately immersed in RNAlater (Ambion, USA)

110

at 4°C overnight and at -20°C until RNA extraction within 2 weeks. The total RNAs

111

of each group were obtained by using TRIzol reagent (Invitrogen, USA), following

ACCEPTED MANUSCRIPT the manufacturer’s instructions. An Agilent 2100 bioanalyzer was used to determine

113

the integrity and quality of the total RNA. The RNA with a RNA integrity number

114

value higher than seven was considered qualified for RNA-seq. For RNA library

115

construction and deep sequencing, equal quantities of RNA of each crab from control

116

and virus group were dissolved in RNase-free water and pooled in equal quantities to

117

generate the control and virus group samples.

118

cDNA library preparation and Illumina sequencing

SC

Poly(A) RNA was purified from total RNA using poly(dT) oligo-attached

M AN U

119

RI PT

112

magnetic beads, and full-length cDNAs were synthesized with a TruSeq RNA Sample

121

Preparation Kit (Illumina Inc., USA), according to the manufacturer's protocol. The

122

following procedures including RNA fragmentation, cDNA synthesis, size selection,

123

PCR amplification and RNA-seq were performed at Beijing Genome Institute

124

(Shenzhen, China). In a typical procedure, mRNAs were fragmented into small pieces

125

(200-700 nt) in the fragmentation buffer (Ambion, USA). Subsequently, random

126

hexamer primer was used to synthesize the first-strand cDNA using the cDNA

127

Synthesis Kit (Stratagene, USA), following the manufacturer’s protocol. The short

128

fragments were purified using the QiaQuick PCR extraction kit (Qiagen, Germany) to

129

repair the end by adding a poly(A) tail. Fifteen rounds of PCR amplification were

130

carried out to enrich the purified cDNA. The cDNA libraries were sequenced on a

131

high-throughput sequencing platform of Illumina HiSeqTM 2000.

132

Analysis of Illumina sequencing results

133

AC C

EP

TE D

120

A total of 57,062,102 and 58,517,020 raw reads for PBS (HA) and virus groups

ACCEPTED MANUSCRIPT (CA) were sequenced respectively. The Q20 value (representing the accuracy of the

135

sequencing) was higher than 96% for each sample. To filter the low-quality reads, the

136

reads with adapters were removed. The reads in which unknown bases were more

137

than 10% and with a percentage of low-quality (Q value < 20) bases higher than 30%

138

were also removed. After filtering, the remaining reads were called “clean reads” and

139

used for downstream bioinformatics analysis. The mean length of the clean reads was

140

approximately 90 nt with paired ends [17]. Transcriptome denovo assembly was

141

carried out using the short read assembling program (SOAPdenovo) -Trinity with

142

default parameters for the k-mer value (k=25) [18]. First, the clean reads were

143

combined with a certain length of overlap through SOAPdenovo to form longer

144

fragments without N. These formed fragments were called contigs. Subsequently, the

145

reads were mapped back to the contigs. This program could detect contigs from the

146

same transcript, as well as the distances between these contigs with paired-end reads.

147

Afterwards, scaffolds were constructed by using SOAPdenovo through connecting the

148

contigs with N to present unknown sequences between each two contigs in the same

149

transcript. The paired-end reads were used again for filling the gap between scaffolds

150

to obtain sequences that had least Ns and could not be extended on either end. Such

151

sequences are defined as unigenes [19-21]. Therefore, unigenes from HA or CA were

152

obtained, and the above two groups of unigenes (HA&CA) were merged to produce

153

long unigenes, which were called All-Unigenes.

154 155

AC C

EP

TE D

M AN U

SC

RI PT

134

All-Unigenes were subjected to blastx similarity search with E-value threshold of 1 × 10-5 against different databases, including Nr, Swiss-Prot, KEGG, COG and GO

ACCEPTED MANUSCRIPT databases [22-25]. When a unigene could not be aligned with any of the above

157

databases, ESTScan was used to predict the coding regions and determine the

158

sequence direction [26]. Based on Nr annotation, Blast2GO program was used to

159

obtain the GO annotation of unigenes [27]. Moreover, the GO functional

160

classification of All-Unigenes was performed using WEGO software so as to

161

understand the distribution of gene functions of the species at macro level [28]. To

162

determine the functions of the gene products in different processes, KEGG pathway

163

analysis was also performed [22].

164

Differential gene expression analysis

M AN U

SC

RI PT

156

For differential gene expression analysis, the reads per-kb per-million reads

166

(RPKM) method was used as the value of normalized gene expression levels [29].

167

The cutoff value to determine the gene transcriptional activity was determined based

168

on a 95% confidence interval for all of the RPKM values of each gene. The up- or

169

downregulation of a gene was decided according to its log2 (CA_RPKM/HA_RPKM)

170

value. When log2 (CA_RPKM/HA_RPKM) >0, then a gene was upregulated in CA.

171

When log2 (CA_RPKM/HA_RPKM) <0, then a gene was downregulated in CA. A

172

GO classification and KEGG pathway analysis of the differentially expressed

173

unigenes (DEG) were performed to obtain an overall understanding of the

174

transcriptome differences between these two samples. DEGs with at least three

175

database annotations were selected for further analysis. The proportions of up- and

176

downregulated unigenes of several important pathways were analyzed to obtain a

177

clear overview of the biological processes.

AC C

EP

TE D

165

ACCEPTED MANUSCRIPT 178 179

Confirmation using quantitative real-time PCR (qRT-PCR) To validate our Hiseq2000 sequencing data, 10 differentially expressed mud crab genes were selected randomly for qRT-PCR analysis. Primers were designed

181

using the Primer6 software (Premier Biosoft International) (S1 Table). The prepared

182

total RNA used in RT-PCR analysis was isolated from the same treated crab gills as

183

that in Illumina sequencing. Reverse transcription was performed with PrimeScript®

184

RT Reagent Kit (TaKaRa, Japan), according to the manufacturer’s instructions. qRT

185

-PCR was conducted with SYBR Premix Ex Taq (TaKaRa, Japan) on the LightCycle

186

480 System (Roche, Germany), according to the manufacturer’s instructions. 18s

187

rRNA (GenBank accession number: FJ646616.1) of mud crab was used as an internal

188

control [12] to normalize the expression level and all experiments were performed in

189

triplicate. PCR amplification was performed under the following conditions: 95°C for

190

2 min, followed by 40 cycles of 95°C for 15 s and 60°C for 1 min. To confirm that

191

only one PCR product was amplified, dissociation curve analysis of amplification

192

products was performed at the end of each PCR reaction. After the PCR program, data

193

were analyzed with LightCycler 480 software (Roche, Germany). Furthermore, the

194

comparative CT method (2-∆∆CT method) [30] was used to analyze the expression

195

level of different genes.

196

Results

197

RNA-seq Using Illumina Platform and Assembly of Unigenes

AC C

EP

TE D

M AN U

SC

RI PT

180

198

To reveal the differences in gene expression between the PBS and virus groups, a

199

pooled cDNA sample from each crab after 2, 4, 8, 12h injection, was sequenced using

ACCEPTED MANUSCRIPT the Illumina Hiseq 2000 sequencing platform. About 51.6 and 52.7 million clean

201

reads of 90 bp were obtained from HA and CA after filtering out the dirty raw reads

202

respectively. Finally, a total of 125,599 contigs with a mean length of 388 bp in HA,

203

and 106,587 contigs with an average length of 408 bp in CA were obtained (Table 1).

204

With paired-end read joining and gap-filling, the above contigs were further

205

assembled into scaffolds. With gap-filling and long sequence clustering, the scaffolds

206

were assembled into 81,901 HA- and 70,059 CA-Unigenes with a mean length of 640

207

and 700 bp, respectively. In addition, a total of 67,279 All-Unigenes were obtained

208

using the same strategy; these unigenes had a mean size of 878 bp from the HA clean

209

reads and CA clean reads combined (All-Unigenes originated from both HA and CA

210

reads; HA-Unigenes from HA reads; CA-Unigenes from CA reads). The unigene

211

sequence of could not be extended on either end and had least Ns. HA- and

212

CA-Unigenes had a similar distribution in length, with 200-400 bp sequences

213

representing the highest proportion, followed by 400-600 bp of sequences.

SC

M AN U

TE D

In total, All-Unigenes were better than HA- and CA-Unigenes in quality.

EP

214

RI PT

200

Altogether, 14,781 HA-Unigenes (18.05% of 81,901 Unigenes), 14,655 CA-Unigenes

216

(20.92% of 70,059 Unigenes), 18,539 All-Unigenes (27.56% of 67,279 Unigenes)

217

were higher than 1000 nt.

218

Annotation of All-Unigenes

219

AC C

215

Unigene annotation provides information on the expression and functional

220

annotation of the unigene. First, unigene sequences were aligned to protein databases

221

such as Nr, Swiss-Prot, KEGG, and COG, as well as nucleotide database Nt.

ACCEPTED MANUSCRIPT Consequently, proteins with the highest sequence similarity were obtained with the

223

given unigenes, along with their protein functional annotations. Finally, a total of

224

32,547 unigenes (48.38% of the total) were matched with at least one database. (Table

225

2).

226

RI PT

222

Among these unigenes, 27,532 (40.92% of all 67,279 All-Unigenes) unigenes

were annotated using blastx against Nr, with a cut-off E-value of 0.00001 (Figure 1A).

228

The E-value distributions of the unigenes in the Nr database showed that

229

approximately 17% of the unigenes had a strong similarity (smaller than 1e-100),

230

whereas the remaining 83% of the sequences ranged from 1e-5 to 1e-100. The ratios

231

of the similarity distributions demonstrated that 23.1% and 76.9% of the sequences

232

had a similarity over 60% and ranging within 11% to 60%, respectively (Figure 1B).

233

The species distributions for the most suitable match from each sequence are shown

234

in Figure 1C. A total of 2,146 unigenes were matched with Daphnia pulex (7.79% of

235

27,352 Nr-matched hits), which had the highest blast matched ratio of the matched

236

species. Unigenes were also aligned to other databases, including 22,936 (34.09% of

237

All-Unigenes) sequences in Swiss-Prot. In addition, approximately 20,260 (30.11% of

238

All-Unigenes), 13,039 (19.58%% of All-Unigenes) and 11,866 (17.64% of

239

All-Unigenes) sequences in KEGG, GO and COG sequences, respectively; these

240

sequences had the same identical cutoff E-value to supplement the annotations and

241

functions. Furthermore, about 15,824 (23.52% of All-Unigenes) sequences were

242

aligned in Nt through blastx.

243

AC C

EP

TE D

M AN U

SC

227

In the COG classification, 11,866 were categorized in 25 functional COG

ACCEPTED MANUSCRIPT clusters. The top five enriched categories were R (general function prediction,

245

15.95%), J (translation, ribosomal structure and biogenesis 10.40%), K (transcription

246

7.85%), L (replication, recombination, and repair 7.62%) and D (cell cycle control,

247

cell division, and chromosome partitioning, 6.51%) (Figure 2).

RI PT

244

In the GO classification, 13,039 unigenes from All-Unigenes were categorized

249

into 56 GO terms consisting of three domains: biological process, cellular component

250

and molecular function. The top five clustered classes in GO were “cellular process”

251

(63.05%), “catalytic activity” (51.14%), “binding” (49.80%), “metabolic process”

252

(49.59%) and “cell part” (49.39%).

M AN U

253

SC

248

In the KEGG pathway analysis, 20,260 unigenes were categorized into 258 KEGG pathway (level 3). These unigenes were included in 42 KEGG pathways of

255

level 2 and 6 KEGG pathways of level 1. The top five clustered level 3 pathways

256

were “metabolic pathways” (12.68%), “amoebiasis” (4.38%), “regulation of actin

257

cytoskeleton” (4.25%), “Vibrio cholerae infection” (3.88%) and “focal adhesion”

258

(3.55%).

259

Analysis of the DEGs between HA and CA

EP

AC C

260

TE D

254

Among All-Unigenes, 13,856 unigenes with a false discovery rate (FDR) ≤0.001

261

and |log2Ratio|≥1 were identified as DEGs [31], which accounted for 20.59% of

262

All-Unigenes. Otherwise, 8,001 DEGs (57.74% of total DEGs) were annotated by

263

matching with at least one database. Among the DEGs, 4,444 unigenes (32.07%) were

264

upregulated in CA, whereas 9,412 (67.93%) were downregulated (S1 Fig.). The

265

number of downregulated genes was more than twofold that of upregulated genes. To

ACCEPTED MANUSCRIPT further understand the transcriptional differences between HA and CA, the GO

267

classifications and KEGG pathway analysis were also performed for the DEGs.

268

According to the GO classification, 3,110 DEGs were categorized into 54 GO terms

269

(Figure 3). The top five clustered classes in GO were “catalytic activity” (62.70%),

270

“cellular process” (53.38%), “metabolic process” (45.79%), “binding” (45.69%), and

271

“cell part” (41.00%). The top five clustered classes in “molecular function

272

classification” of GO were “catalytic activity” (62.70%), “binding” (45.69%),

273

“transporter activity” (4.92%), “structural molecule activity” (4.73%), and “molecular

274

transducer activity” (1.48%). KEGG pathway analysis was performed to explore the

275

biological function of the significantly DEGs, and 20,260 DEGs were mapped to 255

276

level 3 KEGG pathways. These 255 level 3 KEGG pathways were included in 42

277

level 2 KEGG pathways, which could be clustered into six categories (Figure 4). The

278

top ten mapped pathways among these level 3 KEGG pathways are shown in Table 3.

279

“Metabolic pathways” enriched most DEGs (13.25%), which is probably because

280

metabolism is one of the most important pathways on the whole transcriptome

281

background. In addition, DEGs were obviously enriched in the pathways related to

282

different diseases or infections, such as “Amoebiasis”, “V. cholerae infection”,

283

“Huntington's disease”, “Epstein-Barr virus infection” and “Pathways in cancer”.

284

Interestingly, most of the unigenes mapped to the above pathways were dramatically

285

downregulated in CA. Except for the “ribosome” pathway, the number of

286

downregulated unigenes in top nine other DEG enriched pathways was more than that

287

of upregulated unigenes; this number for “ribosome” pathway was 27%.

AC C

EP

TE D

M AN U

SC

RI PT

266

ACCEPTED MANUSCRIPT 288

For further analysis, the DEG ratio of All-Unigenes was calculated in different pathways. After filtering out the pathways that enriched less than 20 DEGs, the

290

pathways with the significantly highest DEG ratio were observed (Table 4).

291

Remarkably, these top 10 pathways observed were divided into two categories. The

292

first category is related to disease resistance and immune, such as “antigen processing

293

and presentation”, “regulation of autophagy”, “proteasome”, “rheumatoid arthritis”,

294

“complement and coagulation cascades”. The second one is related to metabolism,

295

such as “aldosterone-regulated sodium reabsorption”, “proximal tubule bicarbonate

296

reclamation”, “drug metabolism-cytochrome P450, “metabolism of xenobiotics by

297

cytochrome P450”, and “steroid hormone biosynthesis”. In 8/10 of these top pathways,

298

the number of downregulated unigenes was more than 2.5-fold that of upregulated

299

unigenes. Furthermore, the DEGs in “proteasome” pathway were dramatically

300

downregulated. Otherwise, the number of downregulated and upregulated DEGs were

301

approximated in the pathway of “complement and coagulation cascades”. These

302

results might indicate that the immune system and metabolism of crabs were

303

influenced by the virus infection.

SC

M AN U

TE D

EP

AC C

304

RI PT

289

Among DEGs, the key genes involved in Wnt signaling pathway,

305

mitogen-activated protein kinase (MAPK) signaling pathway and apoptosis pathway

306

were focused. The components of these immune pathways are described in S2 Table.

307

Validation of Illumina sequencing via qRT-PCR

308 309

Ten unigenes were used for sequencing and computational analysis to confirm the characterizations of expression by using qRT-PCR analysis. The melting-curve

ACCEPTED MANUSCRIPT analysis of qRT-PCR demonstrated a single product for all tested genes. The

311

qRT-PCR results confirmed that the data obtained from Hiseq2000 sequencing

312

analysis showed similar trends in the up- or downregulation of host genes (Figure 5).

313

For example, based on Hiseq2000 sequencing analysis, serine protease,

314

prophenoloxidase and peroxinectin were regulated by 2.96, -2.71 and 3.11 log2-fold,

315

and they showed 3.29, -2.10 and 3.06 log2-fold changes, respectively in qRT-PCR

316

analyses (Figure 5). These results showed that qRT-PCR and Illumina RNA-seq

317

analyses of these had a good agreement, thereby indicating that the transcriptome data

318

can reflect an actual gene expression profile in MCRV-infected mud crabs.

319

Discussion

SC

M AN U

320

RI PT

310

Recently, high-throughput sequencing techniques, such as Solexa/Illumina (Illumina), 454 (Roche), and SOLID (ABI) have developed rapidly [32]. They

322

provide a rapid and high-throughput method to identify DEGs and their express

323

profile, particularly in species whose genomic information is unavailable [33-35].

324

These techniques are suitable for genomic research, such as denovo and re-sequencing

325

of genome, mRNA, and microRNA [21, 36-39]. In transcriptome analysis, the usage

326

of the next-generation sequencing techniques causes the easy establishment of cDNA

327

libraries without bacteria cells as carriers, which could introduce DNA fragment

328

deletion [40]. This denovo transcriptome sequencing technology has been used for

329

many crustaceans, such as Parhyale hawaiensis [41], Euphausia superb [42],

330

Macrobrachium rosenbergii [43] using 454 sequencing, Eriocheir sinensis [44],

331

Portunus trituberculatus [45], and Litopenaeus vannamei [46] using Illumina

AC C

EP

TE D

321

ACCEPTED MANUSCRIPT sequencing. In the present study, we used gills of mud crab to perform a genome-wide

333

investigation through transcriptional sequencing and gene expression profile analysis,

334

and the reliability and accuracy of qRT-PCR was confirmed. This study is the first

335

transcriptomic analysis on mud crab involved in MCRV infection.

RI PT

332

A total of 104.3 million clean reads from the gills of MCRV-infected mud crabs

337

and controls were obtained and assembled into 67,279 unigenes, and about 32,547 of

338

these unigenes were annotated as known proteins according to their homology. Xie et

339

al. performed a transcriptomic study in the hemocytes of mud crabs challenged with V.

340

parahaemolyticus and obtained 81,709 assembled unigenes [13]. In another study, the

341

hepatopancreas transcriptome of E. sinensis infected with a mixture of three pathogen

342

strains was analyzed by Illumina technique [6]. These studies have provided a source

343

for identification of novel genes and largely enriched the transcriptome data of crabs.

M AN U

TE D

344

SC

336

Previously, MCRV has caused serious problems in cultured mud crab [2]. Diseased crabs are sluggish and show loss of appetite and grey colouration. Moribund

346

crabs show an atrophied hepatopancreas and loose gills prominently [2]. In the current

347

transcriptomic analysis, a total of 13,856 DEGs, including 4,444 significantly

348

upregulated and 9,412 downregulated DEGs, were detected in diseased crabs. More

349

downregulated genes were observed than the upregulated ones, which implied that

350

gene expression was largely inhibited by MCRV infection, and biological functions

351

were impaired. According to the KEGG pathway analysis of DEGs, the top five

352

pathways were metabolic pathways, amoebiasis, V. cholerae infection, Huntington's

353

disease, and ribosome. Among these pathways, many unigenes were classified into

AC C

EP

345

ACCEPTED MANUSCRIPT the amoebiasis and “V. cholerae infection” KEGG pathways. Therefore, as an animal

355

living in water, mud crabs are vulnerable to bacterial and parasitic infection and may

356

have involved in complicated systems and signal pathway against infection when they

357

become infected with MCRV.

358

RI PT

354

As crab lack an acquired immune system, their defense depends entirely on an innate, non-adaptive mechanism to defend invasion of pathogens, such as virus,

360

bacteria, and parasite [47]. MAPK signaling is activated during virus infection and

361

contributes to virus replication in animal cells [48]. In the present study, many genes

362

involved in MAPK pathway were found to undergo expression changes in the gills

363

following MCRV injection. For instance, ATF2 (cyclic AMP-dependent transcription

364

factor), HGK (mitogen-activated protein kinase kinase kinase kinase 4) and ECSIT

365

(evolutionarily conserved signaling intermediate in Toll pathway) were up regulated

366

significantly, hence indicating that they might play an important role in response to

367

MCRV infection. However, the underlying molecular mechanism also remains unclear.

368

In mammals, ATF2 has both tumor-promoting as well as tumor-suppressive roles, as

369

well as crucial roles in oncogenic transformation and tumorigenesis [49]. To date, this

370

study is the first report showing the changes in the ATF2 level of MAPK pathway in

371

MCRV-infected crabs.

M AN U

TE D

EP

AC C

372

SC

359

Melanization, which is performed by PO and controlled by the proPO, is an

373

important immune response in invertebrates [50]. In our transcriptome data, unigenes

374

annotated as expressed PO and proPO were exhibited significantly after MCRV

375

infection. For instance, CL1697.Contig1_All, annotated as peroxinectin, was

ACCEPTED MANUSCRIPT upregulated in the MCRV-infected crab with the 3.52-fold changes (log2 ratio). PO is

377

a multifunctional immune component with both cell adhesive and peroxidase

378

activities, and it plays a crucial role in clearing the invading bacteria or parasites in

379

crustaceans [51, 52]. This component is also involved in the host response to exterior

380

stimulus, such as beads, lipopolysaccharide, and Aeromonas hydrophila [53].

381

Furthermore, the expression level of peroxinectin gene was upregulated in L.

382

vannamei after being challenged with V.alginolyticus at 6 h [54], and at 48 h post

383

WSSV challenge in S. paramamosain [55]. The upregulation of peroxinectin is

384

associated with enhanced encapsulation and phagocytic activity to boost pathogen

385

resistance [56]. Invading pathogens may initiate the proPO-activating system that

386

catalyzes proPO to active form PO, which is involved in wound healing and

387

sequestration of pathogens [57-61]. Unigene19990_All has been annotated as proPO,

388

which is downregulated with the 2.71-fold changes (log2 ratio) in mud crab after

389

being challenged with MCRV. proPO-b is also downregulated in WSSV-infected L.

390

vannamei [62]. A similar significant downregulation of proPO was observed in the

391

gills of MCRV-infected mud crab, which suggested that virus infection might inhibit

392

the expression of proPO and weaken the immune system of the crab.

SC

M AN U

TE D

EP

AC C

393

RI PT

376

In summary, this study focused on the difference of the transcriptome between

394

MCRV-infected and non-infected crabs. 13,856 differentially expressed genes

395

including 4,444 significantly upregulated and 9,412 downregulated DEGs were

396

obtained, which enriched in the pathways related to different diseases or infections.

397

Although the molecular functions of most genes and their associated pathways remain

ACCEPTED MANUSCRIPT 398

largely unknown, this study provides a valuable information on the antiviral

399

mechanism in crab. Furthermore, the result helps identify these novel genes in the

400

gills of mud crab in the absence of the genome database of crab.

RI PT

401

Support information

403

S1 Table. Primers used in qRT-PCR

404

S2 Table. Candidate genes involved in mud crab immune response

405

S1 Fig. Differences in DEGs expression profile between HA and CA. “DEGs”

406

indicates those unigenes with FDR≤0.001 and |log2Ratio|≥1.

407

Acknowledgments

408 409

This research was supported by the National Natural Science Foundation of China under Grant Nos. C190602.

M AN U

SC

402

References:

1. Ma H, Ma C, Li S, Jiang W, Li X, Liu Y, et al. Transcriptome Analysis of the Mud Crab (Scylla paramamosain) by 454 Deep Sequencing: Assembly, Annotation, and Marker Discovery. PLOS ONE. 2014 9.

2. Weng S, Guo Z, Sun J, Chan S, He J. A reovirus disease in cultured mud crab, Scylla serrata, in

EP

southern China. J FISH DIS. 2007 30:133-9.

3. Deng XX, Lu L, Ou YJ, Su HJ, Li G, Guo ZX, et al. Sequence analysis of 12 genome segments of mud crab reovirus (MCRV). VIROLOGY. 2012 422:185-94. 4. Hoffmann JA, Kafatos FC, Janeway CA, Ezekowitz RA. Phylogenetic perspectives in innate

AC C

411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430

TE D

410

immunity. SCIENCE. 1999 284:1313-8. 5. Li F, Xiang J. Recent advances in researches on the innate immunity of shrimp in China. DEV COMP IMMUNOL. 2013 39:11-26. 6. Li X, Cui Z, Liu Y, Song C, Shi G. Transcriptome Analysis and Discovery of Genes Involved in Immune Pathways from Hepatopancreas of Microbial Challenged Mitten Crab Eriocheir sinensis. PLOS ONE. 2013 8. 7. Lemaitre B, Hoffmann J. The host defense of Drosophila melanogaster. ANNU REV IMMUNOL. 2007 25:697-743. 8. Iwanaga S, Lee BL. Recent advances in the innate immunity of invertebrate animals. J Biochem Mol Biol. 2005 38:128-50. 9. Liu HP, Chen FY, Gopalakrishnan S, Qiao K, Bo J, Wang KJ. Antioxidant enzymes from the crab Scylla paramamosain: gene cloning and gene/protein expression profiles against LPS challenge. Fish

ACCEPTED MANUSCRIPT Shellfish Immunol. 2010 28:862-71. 10. Lin Z, Qiao J, Zhang Y, Guo L, Huang H, Yan F, et al. Cloning and characterisation of the SpToll gene from green mud crab, Scylla paramamosain. DEV COMP IMMUNOL. 2012 37:164-75. 11. Czesny S, Epifanio J, Michalak P. Genetic Divergence between Freshwater and Marine Morphs of Alewife (Alosa pseudoharengus): A 'Next-Generation' Sequencing Analysis. PLOS ONE. 2012 7. 12. Gao J, Wang X, Zou Z, Jia X, Wang Y, Zhang Z. Transcriptome analysis of the differences in gene expression between testis and ovary in green mud crab (Scylla paramamosain). BMC

RI PT

GENOMICS. 2014 15.

13. Xie C, Chen Y, Sun W, Ding J, Zhou L, Wang S, et al. Transcriptome and Expression Profiling Analysis of the Hemocytes Reveals a Large Number of Immune-Related Genes in Mud Crab Scylla paramamosain during Vibrio parahaemolyticus Infection. PLOS ONE. 2014 9.

14. Zhenzhen X, Ling X, Dengdong W, Chao F, Qiongyu L, Zihao L, et al. Transcriptome analysis of microsatellite markers. PLOS ONE. 2014 9:e109419.

SC

the Trachinotus ovatus: identification of reproduction, growth and immune-related genes and 15. Henry RP, Lucu C, Onken H, Weihrauch D. Multiple functions of the crustacean gill: FRONTIERS IN PHYSIOLOGY. 2012 3.

M AN U

osmotic/ionic regulation, acid-base balance, ammonia excretion, and bioaccumulation of toxic metals. 16. Guo Z, Weng S, Li G, Chan S, He J. Development of an RT-PCR detection method for mud crab reovirus. J VIROL METHODS. 2008 151:237-41.

17. Zhang YJ, Wang XJ, Wu JX, Chen SY, Chen H, Chai LJ, et al. Comparative transcriptome analyses between a spontaneous late-ripening sweet orange mutant and its wild type suggest the functions of ABA, sucrose and JA during citrus fruit ripening. PLOS ONE. 2014 9:e116056. 18. Li R, Zhu H, Ruan J, Qian W, Fang X, Shi Z, et al. De novo assembly of human genomes with

TE D

massively parallel short read sequencing. GENOME RES. 2010 20:265-72. 19. Hao DC, Ge G, Xiao P, Zhang Y, Yang L. The first insight into the tissue specific taxus transcriptome via Illumina second generation sequencing. PLOS ONE. 2011 6:e21220. 20. Qin YF, Fang HM, Tian QN, Bao ZX, Lu P, Zhao JM, et al. Transcriptome profiling and digital gene expression by deep-sequencing in normal/regenerative tissues of planarian Dugesia japonica.

EP

GENOMICS. 2011 97:364-71.

21. Surget-Groba Y, Montoya-Burgos JI. Optimization of de novo transcriptome assembly from next-generation sequencing data. GENOME RES. 2010 20:1432-40. 22. Ogata H, Goto S, Sato K, Fujibuchi W, Bono H, Kanehisa M. KEGG: Kyoto Encyclopedia of

AC C

431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474

Genes and Genomes. NUCLEIC ACIDS RES. 1999 27:29-34. 23. Tatusov RL, Fedorova ND, Jackson JD, Jacobs AR, Kiryutin B, Koonin EV, et al. The COG database: an updated version includes eukaryotes. BMC BIOINFORMATICS. 2003 4:41. 24. Tatusov RL, Galperin MY, Natale DA, Koonin EV. The COG database: a tool for genome-scale analysis of protein functions and evolution. NUCLEIC ACIDS RES. 2000 28:33-6. 25. Tatusov RL, Natale DA, Garkavtsev IV, Tatusova TA, Shankavaram UT, Rao BS, et al. The COG database: new developments in phylogenetic classification of proteins from complete genomes. NUCLEIC ACIDS RES. 2001 29:22-8. 26. Iseli C, Jongeneel CV, Bucher P. ESTScan: a program for detecting, evaluating, and reconstructing potential coding regions in EST sequences. Proc Int Conf Intell Syst Mol Biol. 1999:138-48. 27. Conesa A, Gotz S, Garcia-Gomez JM, Terol J, Talon M, Robles M. Blast2GO: a universal tool for

ACCEPTED MANUSCRIPT annotation, visualization and analysis in functional genomics research. BIOINFORMATICS. 2005 21:3674-6. 28. Ye J, Fang L, Zheng H, Zhang Y, Chen J, Zhang Z, et al. WEGO: a web tool for plotting GO annotations. NUCLEIC ACIDS RES. 2006 34:W293-7. 29. Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. NAT METHODS. 2008 5:621-8. PCR and the 2(T)(-Delta Delta C) method. METHODS. 2001 25:402-8.

RI PT

30. Livak KJ, Schmittgen TD. Analysis of relative gene expression data using real-time quantitative 31. Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I. Controlling the false discovery rate in behavior genetics research. BEHAV BRAIN RES. 2001 125:279-84.

32. Metzker ML. Sequencing technologies - the next generation. NAT REV GENET. 2010 11:31-46. 33. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, et al. Genome

SC

sequencing in microfabricated high-density picolitre reactors. NATURE. 2005 437:376-80.

34. Liu L, Li Y, Li S, Hu N, He Y, Pong R, et al. Comparison of Next-Generation Sequencing Systems. JOURNAL OF BIOMEDICINE AND BIOTECHNOLOGY. 2012.

35. Guryev V, Cuppen E. Next-generation sequencing approaches in genetic rodent model systems to

M AN U

study functional effects of human genetic variation. FEBS LETT. 2009 583:1668-73. 36. Morozova O, Marra MA. Applications of next-generation sequencing technologies in functional genomics. GENOMICS. 2008 92:255-64.

37. Git A, Dvinge H, Salmon-Divon M, Osborne M, Kutter C, Hadfield J, et al. Systematic comparison of microarray profiling, real-time PCR, and next-generation sequencing technologies for measuring differential microRNA expression. RNA. 2010 16:991-1006.

38. Hashimoto S, Qu W, Ahsan B, Ogoshi K, Sasaki A, Nakatani Y, et al. High-resolution analysis of sequencer. PLOS ONE. 2009 4:e4108.

TE D

the 5'-end transcriptome using a next generation DNA

39. Li R, Fan W, Tian G, Zhu H, He L, Cai J, et al. The sequence and de novo assembly of the giant panda genome. NATURE. 2010 463:311-7.

40. Wall PK, Leebens-Mack J, Chanderbali AS, Barakat A, Wolcott E, Liang H, et al. Comparison of next generation sequencing technologies for transcriptome characterization. BMC GENOMICS. 2009

EP

10:347.

41. Zeng V, Villanueva KE, Ewen-Campen BS, Alwes F, Browne WE, Extavour CG. De novo assembly and characterization of a maternal and developmental transcriptome for the emerging model crustacean Parhyale hawaiensis. BMC GENOMICS. 2011 12:581.

AC C

475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518

42. Clark MS, Thorne MA, Toullec JY, Meng Y, Guan LL, Peck LS, et al. Antarctic krill 454 pyrosequencing reveals chaperone and stress transcriptome. PLOS ONE. 2011 6:e15919. 43. Jung H, Lyons RE, Dinh H, Hurwood DA, McWilliam S, Mather PB. Transcriptomics of a giant freshwater prawn (Macrobrachium rosenbergii): de novo

assembly, annotation and marker discovery.

PLOS ONE. 2011 6:e27938. 44. He L, Wang Q, Jin X, Wang Y, Chen L, Liu L, et al. Transcriptome profiling of testis during sexual maturation stages in Eriocheir sinensis using Illumina sequencing. PLOS ONE. 2012 7:e33735. 45. Lv J, Liu P, Wang Y, Gao B, Chen P, Li J. Transcriptome analysis of Portunus trituberculatus in response to salinity stress provides insights into the molecular basis of osmoregulation. PLOS ONE. 2013 8:e82155. 46. Li C, Weng S, Chen Y, Yu X, Lu L, Zhang H, et al. Analysis of Litopenaeus vannamei transcriptome using the next-generation DNA sequencing technique. PLOS ONE. 2012 7:e47442.

ACCEPTED MANUSCRIPT

557 558 559

47. Kurata S, Ariki S, Kawabata S. Recognition of pathogens and activation of immune responses in Drosophila and horseshoe crab innate immunity. IMMUNOBIOLOGY. 2006 211:237-49. 48. Andrade AA, Silva PN, Pereira AC, De Sousa LP, Ferreira PC, Gazzinelli RT, et al. The vaccinia virus-stimulated mitogen-activated protein kinase (MAPK) pathway is

required for virus

multiplication. BIOCHEM J. 2004 381:437-46. 49. Gozdecka M, Breitwieser W. The roles of ATF2 (activating transcription factor 2) in tumorigenesis. Biochem Soc Trans. 2012 40:230-4.

RI PT

50. Cerenius L, Lee BL, Soderhall K. The proPO-system: pros and cons for its role in invertebrate immunity. TRENDS IMMUNOL. 2008 29:263-71.

51. Johansson MW. Cell adhesion molecules in invertebrate immunity. DEV COMP IMMUNOL. 1999 23:303-15.

52. Jiravanichpaisal P, Lee BL, Soderhall K. Cell-mediated immunity in arthropods: hematopoiesis,

SC

coagulation, melanization and opsonization. IMMUNOBIOLOGY. 2006 211:213-36.

53. Lv S, Lu B, Xu J, Xu H, Zhao J, Li S, et al. Immune response of peroxinectin of Chinese mitten crab Eriocheir sinensis to exterior stimulation. DEV COMP IMMUNOL. 2015 51:56-64. 54. Liu CH, Cheng W, Chen JC. The peroxinectin of white shrimp Litopenaeus vannamei is

M AN U

synthesised in the semi-granular and granular cells, and its transcription is up-regulated with Vibrio alginolyticus infection. FISH SHELLFISH IMMUN. 2005 18:431-44.

55. Du Z, Ren Q, Huang A, Fang W, Zhou J, Gao L, et al. A novel peroxinectin involved in antiviral and antibacterial immunity of mud crab, Scylla paramamosain. MOL BIOL REP. 2013 40:6873-81. 56. Kobayashi M, Johansson MW, Ll KSD. The 76 kD cell-adhesion factor from crayfish haemocytes promotes encapsulation in vitro. CELL TISSUE RES. 1990 260:13-8.

57. Barrett FM. Wound-healing phenoloxidase in larval cuticle of Calpodes ethlius (Lepidoptera:

TE D

Hesperiidae). Canadian Journal of Zoology. 1984 62:834-8.

58. T L Hopkins A, Kramer KJ. Insect Cuticle Sclerotization. ANNU REV ENTOMOL. 2003 37:273-302.

59. Lai SC, Chen CC, Hou RF. Immunolocalization of prophenoloxidase in the process of wound 39:266-74.

EP

healing in the mosquito Armigeres subalbatus (Diptera : Culicidae). J MED ENTOMOL. 2002 60. Ratcliffe NA, Brookman JL, Rowley AF. Activation of the prophenoloxidase cascade and initiation of nodule formation in locusts by bacterial lipopolysaccharides. DEV COMP IMMUNOL. 1991 15:33-9.

AC C

519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556

61. Aspan A, Huang T, Cerenius L, Soderhall K. CDNA cloning of prophenoloxidase from the freshwater crayfish Pacifastacus leniusculus and its activation. P NATL ACAD SCI USA. 1995 92:939-43.

62. Ai H, Huang Y, Li S, Weng S, Yu X, He J. Characterization of a prophenoloxidase from hemocytes of the shrimp Litopenaeus vannamei that is down-regulated by white spot syndrome virus. FISH SHELLFISH IMMUN. 2008 25:28-39.

ACCEPTED MANUSCRIPT 560

Table legends

561

Table 1 Summary of the transcriptome assembly in HA and CA.

562

HA: PBS group

563

(HA&CA) were merged to produce long unigenes, called ALL-Unigenes.

564

Table 2 Summary of the annotations of S. paramamosain unigenes.

ALL: the above two groups of unigenes

RI PT

CA: virus group

565

Table 3 Summary of the top 10 DEGs enriched pathway by KEGG Pathway analysis.

SC

566

M AN U

567 568

Table 4 Summary of the pathway with top 10 DEGs ratio.

569

DEGs ratio= (the number of DEGs in the pathway/the number of mapped unigenes in

570

the pathway) %

TE D

571 572 573

Figure legends

575

Figure 1 Figures of Nr classification. (A) E-value distribution. (B) Similarity

576

distribution. (C) Species distribution.

AC C

577

EP

574

578

Figure 2 COG function classification of unigenes. Unigenes annotated by COG were

579

classified into 25 categories.

580 581

Figure 3 Histogram presentation of Gene Ontology classification of the DEGs

ACCEPTED MANUSCRIPT (HA-VS-CA). Major GO categories terms are shown in the x-axis and grouped into

583

three main ontologies: biological process, cellular component and molecular function.

584

The righty-axis indicates the number of genes in each category, while the left y-axis

585

indicates the percentage of total genes in that category.

586

A. biological process

587

B. cellular component

588

C. molecular function

SC

RI PT

582

M AN U

589

Figure 4 Histogram presentation of KEGG pathway (lv2) analysis of the DEGs

591

(HA-VS-CA). These KEGG pathway (level 2) are clustered into 6 KEGG pathway

592

(level 1).

593

A. Organismal Systems;

594

B. Metabolism;

595

C. Human Diseases;

596

D. Genetic Information Processing;

597

E. Environmental Information Processing;

598

F. Cellular Processes.

EP

AC C

599

TE D

590

600

Figure 5 Comparison of the expression profiles of 10 selected genes as determined by

601

Hiseq2000 sequencing and qRT-PCR. Gene abbreviations are as follows: Cu/Zn SOD,

602

copper/zinc

603

antilipopolysaccharide factor; HSP90, heat shock protein 90

superoxide;

FABP,

fatty

acid

binding

protein;

ALF,

ACCEPTED MANUSCRIPT

Table 1

HA

81,901

Total Length(nt) 52,431,015

CA

70,059

49,072,111

ALL

67,279

59,103,964

HA

125,599

48,695,656

CA

106,587

43,491,909

Mean Length(nt)

N50

640

1,239

700

1,442

878

1,601

388

708

408

812

AC C

EP

TE D

M AN U

SC

Contigs

Total Number

RI PT

Unigenes

Sample

ACCEPTED MANUSCRIPT

Table 2 Number of

Ratio of

Annotation

blasted Unigenes

ALL-Unigenes

NR

27,532

40.92%

NT

15,824

23.52%

Swiss-Prot

22,936

34.09%

KEGG

20,260

30.11%

COG

11,866

17.64%

GO

13,039

total

32,547

SC

M AN U

TE D EP AC C

RI PT

Database for

19.38%

48.38%

ACCEPTED MANUSCRIPT

#Pathway

Up

Down

DEGs

RATIO in ALL DEGs

Metabolic pathways

164

490

654

13.25%

Amoebiasis

87

147

234

4.74%

Vibrio cholerae infection

77

147

224

Huntington's disease

29

193

222

Ribosome

96

122

218

RI PT

Table 3

Regulation of actin cytoskeleton

50

134

184

3.73%

Phagosome

17

143

160

3.24%

Epstein-Barr virus infection

34

Focal adhesion

35

Pathways in cancer

42

4.50%

M AN U

SC

4.42%

126

160

3.24%

120

155

3.14%

108

150

3.04%

TE D EP AC C

4.54%

ACCEPTED MANUSCRIPT

Table 4 Down/ #Pathway

Up

Down

DEGs

All-Unigenes

DEG/ALL Up

69

80

156

1

35

36

73

0

46

46

101

5

29

7

54

and presentation Regulation of

Proteasome Aldosterone-regulate d sodium reabsorption

Proximal tubule

reclamation

5

22

EP

bicarbonate

6.27

49.32%

35.00

45.54%

--

34

75

45.33%

5.80

61

135

45.19%

7.71

27

61

44.26%

4.40

TE D

Rheumatoid arthritis

M AN U

autophagy

51.28%

SC

11

RI PT

Antigen processing

AC C

Drug metabolism -

6

24

30

69

43.48%

4.00

24

22

46

108

42.59%

0.92

7

24

31

73

42.47%

3.43

cytochrome P450 Complement and

coagulation cascades Metabolism of xenobiotics by cytochrome P450

ACCEPTED MANUSCRIPT Steroid hormone 6

15

21

51

41.18%

2.50

AC C

EP

TE D

M AN U

SC

RI PT

biosynthesis

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

AC C

EP

TE D

M AN U

SC

RI PT

ACCEPTED MANUSCRIPT

ACCEPTED MANUSCRIPT Highlights:

RI PT

SC



M AN U



TE D



EP



First transcriptome of Mud crab gills response to Mud crab reovirus infection using Illumina Hiseq 2000. A total of 104.3 million clean reads were obtained, and 81,901, 70,059 and 67,279 unigenes were gained respectively from HA reads, CA reads and HA&CA reads. About 13,856 DEGs were detected in diseased crabs compared with the control, including 4,444 significantly upregulated and 9,412 downregulated DEGs. Ten of differently expressed genes were selected for validation with RT-qPCR. Immune-related genes involved in Wnt signaling pathway, MAPK signaling pathway and apoptosis pathway were focused.

AC C