Identification of putative genes involved in parasitism in the anchor worm, Lernaea cyprinacea by de novo transcriptome analysis


Accepted Manuscript

Title: Identification of putative genes involved in parasitism in the anchor worm, Lernaea cyprinacea by de novo Transcriptome analysis.

Author: Pallavi B., Shankar K.M., Abhiman P.B., Iqlas Ahmed

PII: S0014-4894(15)00076-4

DOI: http://dx.doi.org/doi:10.1016/j.exppara.2015.03.014

Reference: YEXPR 7020

To appear in: Experimental Parasitology

Received date: 31-7-2014

Revised date: 19-3-2015

Accepted date: 20-3-2015

Please cite this article as: Pallavi B., Shankar K.M., Abhiman P.B., Iqlas Ahmed, Identification of putative genes involved in parasitism in the anchor worm, Lernaea cyprinacea by de novo Transcriptome analysis., Experimental Parasitology (2015), http://dx.doi.org/doi:10.1016/j.exppara.2015.03.014. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Pallavi B1,2, Shankar KM2*, Abhiman PB1,2 & Iqlas Ahmed1,2

1 Aquatic Animal Health Laboratory, Department of Aquaculture, College of Fisheries, Mangalore 575002, India.

2 Department of Aquaculture, College of Fisheries, Karnataka Veterinary, Animal and Fisheries Sciences University

*Corresponding author: Dean, College of Fisheries Mangalore, Tel: +91-824-2248936; Fax: +91-824-2248366

E-mail:[email protected]

Highlights:

• De novo transcriptome sequencing of the adult and free living stages of L. cyprinacea.

• 36,054 unigenes were annotated against the pfam database.

• Differences in transcription between the free-living and parasitic stages were established.

• These data could shorten the route towards vaccine/control strategies against the parasite.

ABSTRACT

There is little information on the genome sequence of Lernaea cyprinacea, a major ectoparasite of freshwater fish throughout the world. We subjected the L. cyprinacea transcriptome (adult and free living stages) to Illumina HiSeq 2000 sequencing. We obtained a total of 31,671,751 (31.67 million) reads for the adult parasitic stage and 33,840,446 (33.84 million) for the free living stage. The reads were assembled into 50,792 contigs for the adult stage and 69,378 for the free living stage. Using the pfam database, 41.91% of the transcriptome was annotated. The transcriptome was mined for genes associated with parasitism. To examine gene expression changes associated with the transition of L. cyprinacea from the free living to the parasitic stage, we studied the differentially expressed transcripts between the two stages. Microsatellite markers were also identified (9,843 for the adult stage; 16,813 for the free living stage), which would facilitate population genetic studies of various geographical isolates of Lernaea. Our data provide the most comprehensive sequence resource available for L. cyprinacea and demonstrate that Illumina sequencing allows de novo transcriptome assembly and gene expression analysis in a species lacking genome information. The data could open new avenues for a wide array of genetic, evolutionary, biological, ecological and epidemiological studies, and provide a solid foundation for the development of novel interventions against L. cyprinacea.

Keywords: Lernaea cyprinacea; Transcriptome; Parasitism

1. Introduction

Lernaea infection is a major disease problem encountered in carp culture in the Indian subcontinent and has been reported from the Indian major carps Catla (Catla catla), Rohu (Labeo rohita) and Mrigal (Cirrhinus mrigala), the exotic carps silver carp (Hypophthalmichthys molitrix) and grass carp (Ctenopharyngodon idella), and the indigenous carp Labeo fimbriatus (Nandeesha et al., 1984, 1985; Tamuli and Shanbhouge, 1996; Zafar et al., 2001). The three Indian major carps, viz. Catla, Rohu and Mrigal, contribute significantly to Indian aquaculture production with an output of over two million tonnes (Kurva et al., 2013). Ectoparasitic diseases in freshwater fish farms of India result in an annual loss of 300 crores INR due to disease-induced mortality and impaired growth (Sahoo and Kar, 2012; Lakra et al., 2006). Lernaea infections have been reported from Africa, Asia, Europe, North America (Hoffman, 1999) and South America (Plaul et al., 2010). Lernaea cyprinacea Linnaeus, 1758 is the only cosmopolitan species in the genus Lernaea (Piasecki et al., 2004).

The life cycle of L. cyprinacea is composed of nine stages: three free living naupliar stages are followed by five copepodite stages and one adult stage. The parasite feeds on fish mucus, blood and tissue debris. Invasion destroys the scales, skin and muscles of fish. Heavy parasitosis can cause mass mortality of fish and secondary bacterial or fungal infections (Bednarska et al., 2009).

Unfortunately, the only effective treatment against Lernaea spp. is the application of organophosphate and organochlorine pesticides. Their chemical stability, lipophilic nature and toxicity have led researchers to be concerned about their presence in the environment (Amaraneni, 2002). The persistence of therapeutic agents in the aquatic environment causes adverse effects on the ecosystem (Anon, 1988; Choo, 1994). Lernaea spp. have also been reported to develop resistance to certain pesticides (Hoole et al., 2001; Sandra, 2004). Thus, immunological protection of fishes against Lernaea spp. infestation is presently the practical and sustainable alternative to the current use of pesticides, which is riddled with serious limitations. Owing to the complex nature of parasites, the search for vaccine targets has proven difficult. Currently no EST records are available for Lernaea cyprinacea in the National Center for Biotechnology Information (NCBI) database. The lack of genomic data has hampered the use of molecular tools in developing control strategies for L. cyprinacea.

In order to predict and prioritize novel antigenic targets expressed across different developmental stages of L. cyprinacea, we employed Illumina sequencing and predictive algorithms to explore similarities and differences in the transcriptomes of the free living and adult parasitic stages of L. cyprinacea. This study involves the first de novo transcriptome analysis of L. cyprinacea. Bioinformatic analyses of the transcriptomic data allowed a detailed exploration of the molecular changes associated with the transition from the free-living to the parasitic stage and a prediction of the roles that key transcripts play in the metabolic pathways linked to parasitism. Overall, this study provides the first insight into the molecular biology of the important parasite L. cyprinacea.

2. Materials and Methods

2.1. Parasite Isolation

Catla heavily infected with L. cyprinacea were brought to the laboratory from the farm of the Fisheries College, Mangalore, India. The L. cyprinacea were carefully pulled out with forceps to avoid contamination with host tissues, snap frozen in liquid nitrogen and stored in it until used. Intact egg sacs were removed and incubated in flasks containing filtered tap water. The resulting nauplii were maintained until a majority had moulted to the first copepodite stage. The free living stages were recovered onto 47-mm cellulose acetate filter membranes with a pore size of 0.22 µm (Millipore). The membranes were flash frozen in liquid nitrogen and stored in it until use (Sutherland et al., 2012). The phylogenetic status of the parasite was also checked by partial sequencing of the 18S and 28S rDNA regions (communicated for publication elsewhere) and it was confirmed as L. cyprinacea.

2.2. RNA isolation

The pooled adult (50) and free living stages (500) of L. cyprinacea were separately homogenized using a TOMY homogenizer with steel beads. Total RNA was extracted with TRIzol (Invitrogen) according to the manufacturer's instructions.

2.3. Library preparation and sequencing

The transcriptome library for sequencing was constructed using the TruSeq RNA sample preparation kit according to the manufacturer's instructions. Total RNA (1 µg) was subjected to poly(A) purification of mRNA. Purified mRNA was fragmented for 4 min at elevated temperature (94 °C) in the presence of divalent cations and reverse transcribed with Superscript III reverse transcriptase (Invitrogen) by priming with random hexamers. Second strand cDNA was synthesized in the presence of DNA polymerase I and RNase H. The cDNA was cleaned up using Agencourt Ampure XP SPRI beads (Beckman Coulter). Illumina adapters were ligated to the cDNA molecules after end repair and addition of an A base. On completion of ligation, the library was cleaned up with SPRI beads. The library was amplified using 11 cycles of PCR for enrichment of adapter-ligated fragments. The prepared library was quantified using a Nanodrop spectrophotometer (Thermo Scientific, DE, USA) and validated for quality by running an aliquot on a High Sensitivity Bioanalyzer Chip (Agilent). The prepared library was sequenced on an Illumina HiSeq 2000 (Illumina) to generate reads of 2x100 bp. The sequencing reactions for the parasitic and free living stages were run at the same time to avoid confounding error profiles with real differences in transcription.

2.4. De novo assembly and sequence clustering

Raw reads were processed using a Perl script. Raw read processing involved adapter trimming, B-block removal and low quality base filtering. Read quality was assessed using the read quality check tool SeqQC. Contamination at the read level was checked by aligning the processed reads against the RNA sequences of the host fish using the Bowtie-0.12.8 tool, and reads aligning to the host fish were removed. The Velvet_1.2.10 tool (Zerbino and Birney, 2008) was used for the de novo assembly of high quality reads into contigs. The Oases_0.2.08 tool (Schulz et al., 2012) was used for transcript generation from the de novo assembled contigs. The CD-HIT tool (Limin et al., 2012) was used for clustering of combined transcripts to form unigenes with a minimum similarity cut-off of 95%.

2.5. Ontology and annotation

Assembled transcripts were mapped against the UniProt and associated GO, pfam and COG databases using the ncbi-BLAST-2.2.28 tool (Altschul et al., 1990).

2.6. SSRs identification

The MISA tool was used for identification and localization of perfect microsatellites as well as compound microsatellites, which are interrupted by a certain number of bases.

2.7. Differential Gene Expression Analysis

The transcripts from the adult and larval stages were combined and clustered using the CD-HIT tool with a minimum similarity cut-off of 95%. The reads were aligned using the Bowtie tool (version 0.12.8) and the read count profile was generated. Differential gene expression analysis was carried out with the DESeq tool (Anders and Huber, 2010). The P-value was calculated using the significance test incorporated within the DESeq package, and the corrected P-value was calculated using the Benjamini-Hochberg procedure. Transcripts with a corrected P-value <= 0.05 were considered significant. An absolute log2 ratio >= 1 was used as the threshold to judge a difference in gene expression: a transcript was considered upregulated if the log2 fold change was >= 1 and downregulated if it was <= -1.

2.8. Validation of gene expression in Lernaea cyprinacea by real time qPCR

Ten genes were selected at random from the differentially expressed genes for validation by quantitative real time PCR analysis. The chosen transcripts encoded NADH dehydrogenase, Aquaporin, Serpin, Kazal-like serine protease inhibitor, Cathepsin B, Cathepsin L, Vitellogenin, Glutathione S-transferase, Peritrophin and Trypsin. Total RNA was isolated from the adult L. cyprinacea sample using the Trizol reagent. Using the Affinity Script QPCR cDNA synthesis kit, 2000 ng of DNase-treated RNA was reverse transcribed to make 100 ng/µl of cDNA. The primers were manually designed using Gene Runner version 3.05. The primers were validated and the amplicon sizes were confirmed on a 2% agarose gel. Relative quantification by qPCR was then carried out using Brilliant II SYBR Green qPCR Master Mix. The experiment was conducted on the Stratagene Mx3005P (Agilent Technologies) platform. PCR consisted of an initial denaturation at 95 °C for 10 min followed by 40 cycles of 95 °C for 30 s, 58 °C for 1 min and 72 °C for 1 min. A melt curve was also performed after the assay to check the specificity of the reaction. Quantification of selected mRNA transcript abundance was performed using the comparative threshold cycle (CT) method.
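The comparative CT calculation named above can be sketched as follows. This is a minimal illustration, assuming the widely used 2^-ΔΔCT form with a reference (housekeeping) gene, a calibrator sample and roughly 100% amplification efficiency. The manuscript does not name the reference gene or the calibrator, so the function name, its arguments and the CT values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the comparative CT (2^-ddCT) method.

    Assumes the target and reference amplicons both amplify with ~100%
    efficiency (an assumption; the manuscript does not report efficiencies).
    """
    # Normalise the target CT to the reference gene within each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the normalised values between sample and calibrator.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: the target crosses the threshold 3 cycles earlier
# in the sample than in the calibrator, with an unchanged reference gene.
fold_change = relative_expression(22.0, 18.0, 25.0, 18.0)
print(fold_change)
```

A fold change above 1 indicates higher relative abundance in the sample than in the calibrator, and a value below 1 indicates lower abundance.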

3.0. Results and Discussion

3.1. Illumina sequencing and sequence assembly

De novo assembly of the parasite transcriptome is a challenging task due to the lack of sufficient reference genomes/gene sequences in public databases. A total of 31.67 million reads were generated for the adult stage and 33.84 million reads for the free living stage of L. cyprinacea on the Illumina HiSeq 2000 platform (Table 1). While this number of reads should capture a reasonable proportion of the genes present in the RNA sample, it will not provide a full characterisation of the transcriptome. Nevertheless, with the development of more efficient assembly tools in future, the raw data can be utilised for better assembly. The raw reads produced in the present study have been deposited in the NCBI Sequence Read Archive Database (Accession No. PRJNA232511). The length of transcript contigs ranged from 200 to 23,727 bp for the adult stage and from 200 to 27,996 bp for the free living stage. The adult stage had an average contig length of 1,071.2 ± 1,218.5 bp with an N50 value of 1,750 and the free living stage had an average contig length of 1,154.2 ± 1,408.8 bp with an N50 value of 2,040.
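For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is usually computed from a list of contig lengths: the contigs are sorted from longest to shortest, and N50 is the length of the contig at which the running total first reaches half of the total assembly length. The function name and the toy lengths are illustrative assumptions, not the assembly pipeline used in this study.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not contig_lengths:
        raise ValueError("at least one contig length is required")
    total = sum(contig_lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with hypothetical contig lengths.
lengths = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(lengths))
```

Because N50 weights long contigs more heavily than the mean does, it is typically larger than the average contig length, as is the case for both stages in Table 1.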

3.2. Annotation of predicted proteins

Distinct sequences were searched against the pfam database with a cut-off E-value of 1.0E-4 to annotate the unigenes. In total, 36,054 unigenes (41.91% of all distinct sequences) matched known genes; the other 49,970 unigenes (58.09%) failed to acquire annotation information in the pfam database.

The E-value distribution of the top hits in the pfam database showed that 58% of the mapped sequences had strong homology (1.0E-50 to 1.0E-150), whereas the E-values of the remaining 42% of the homologous sequences ranged between 1.0E-4 and 1.0E-49. The similarity distribution had a comparable pattern, with 19% of the sequences having a similarity higher than 80%, while 80% of the hits had a similarity ranging from 30 to 80%. The species distribution of the top pfam hits for each unique sequence is shown in Fig. 1.
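The E-value cut-off and the annotated fraction reported in this section can be illustrated with a short sketch. It assumes that hits are available as (query, subject, E-value) tuples and that the cut-off is inclusive; the function names, the identifiers and the example hits are hypothetical, and only the unigene counts are taken from the text.

```python
def best_hits(hits, evalue_cutoff=1.0e-4):
    """Keep the lowest-E-value hit per query among hits passing the cut-off.

    `hits` is an iterable of (query_id, subject_id, evalue) tuples.
    The inclusive comparison (<=) is an assumption.
    """
    best = {}
    for query, subject, evalue in hits:
        if evalue > evalue_cutoff:
            continue  # too weak to count as an annotation
        if query not in best or evalue < best[query][1]:
            best[query] = (subject, evalue)
    return best


def percent_of_total(count, total):
    """Percentage of `total` represented by `count`, rounded to 2 decimals."""
    return round(100.0 * count / total, 2)


# Hypothetical hits for three unigenes; u2 fails the cut-off.
example = [
    ("u1", "PF00001", 1e-60),
    ("u1", "PF00002", 1e-10),
    ("u2", "PF00003", 1e-3),
    ("u3", "PF00004", 1e-4),
]
print(best_hits(example))

# Counts reported in the text: 36,054 annotated out of 86,024 unigenes.
print(percent_of_total(36054, 86024), percent_of_total(49970, 86024))
```

Run on the counts from the text, the percentage helper reproduces the reported split between annotated and unannotated unigenes.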

3.3. Unigene functional annotation by Gene ontology (GO) and Classification of Clusters of Orthologous Groups (COG)

Gene Ontology (GO) is an international standardized gene functional classification system and covers three domains: cellular component, molecular function, and biological process. A total of 86,024 unigenes were assigned to 2,439 GO terms: 1,085 GO terms originated from the Biological Process domain, 1,052 from the Cellular Component domain and 302 from the Molecular Function domain (Fig. 2).

Out of the total of 86,024 unigenes, 28,498 unigenes were annotated in the biological process category, 49,762 in the molecular function category and 21,259 in the cellular component category. The most highly represented groups in the biological process category were translation (2,716 unigenes), proteolysis (2,114 unigenes) and small GTPase mediated signal transduction (716 unigenes). The molecular function classification showed a predominance of the ATP binding category (4,689 unigenes), followed by structural constituents of the ribosome (2,828 unigenes) and zinc ion binding (2,801 unigenes). Under the cellular component category, genes coding for proteins integral to membrane (3,966 unigenes), ribosome (2,578 unigenes) and nucleus (2,058 unigenes) were highly represented.

For more accurate annotation of their functions, we aligned the unigenes to the COG database to find homologous genes (Fig. 3). In total, 17,573 unigenes (20.43%) were annotated and formed 24 COG classifications. Among the functional classes, the cluster "translation, ribosomal structure and biogenesis" constituted the largest group (3,650; 20.77%), followed by "general function prediction" (2,790; 15.87%) and "post translational modification, protein turnover and chaperones" (1,833; 10.43%); the two smallest groups were "cell motility" (25; 0.142%) and "nuclear structure" (2; 0.011%).

3.4. Identification of short sequence repeats (SSRs)

Microsatellites are widely used genetic markers in population genetic and epidemiological studies (e.g. Schlotterer et al., 1991; Gilbert et al., 1998). The remarkable sequence conservation observed around microsatellite loci may be used for the development of host-fish species-specific probes for the study of L. cyprinacea populations and their epidemiology. The most prevalent SSR type in both the adult and free living samples was tri-nucleotides, followed by mono-nucleotides, then di-nucleotides, tetra-nucleotides, hexa-nucleotides and penta-nucleotides (Table 2).
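To make the search criteria of Table 2 concrete, the sketch below finds perfect SSRs with a regular expression, using the minimum repeat numbers listed there (10 for mono-, 6 for di-, and 5 for tri- to hexa-nucleotide units) and groups SSRs separated by at most 100 bases into compound microsatellites. It is not MISA and is not expected to reproduce MISA's counts exactly; the function names and the test sequence are illustrative assumptions.

```python
import re

# Minimum number of repeats per unit size, taken from Table 2.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
MAX_INTERRUPTION = 100  # max bases between two SSRs in a compound SSR


def _is_repeat_of_shorter_unit(motif):
    """True if the motif is itself a repetition of a shorter unit (e.g. 'AA')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(sequence):
    """Return sorted (start, end, motif, repeat_count) tuples, 0-based, end-exclusive."""
    sequence = sequence.upper()
    hits = []
    for unit_size, min_repeats in MIN_REPEATS.items():
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_size, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip e.g. 'AAA' runs when scanning for tri-nucleotide units;
            # they are reported once, as mono-nucleotide SSRs.
            if _is_repeat_of_shorter_unit(motif):
                continue
            repeats = (match.end() - match.start()) // unit_size
            hits.append((match.start(), match.end(), motif, repeats))
    return sorted(hits)


def group_compound(ssrs, max_gap=MAX_INTERRUPTION):
    """Group neighbouring SSRs whose gap is <= max_gap bases."""
    groups = []
    for ssr in ssrs:
        if groups and ssr[0] - groups[-1][-1][1] <= max_gap:
            groups[-1].append(ssr)
        else:
            groups.append([ssr])
    return groups


# Hypothetical sequence containing a mono-, a di- and a tri-nucleotide SSR.
seq = "GG" + "A" * 12 + "CCGT" + "AG" * 7 + "TTC" + "ACG" * 5 + "T"
ssrs = find_perfect_ssrs(seq)
print(ssrs)
print(len(group_compound(ssrs)))
```

In this toy sequence the three SSRs lie within a few bases of each other, so they fall into a single compound group under the 100-base interruption limit.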

3.5. Changes in gene expression profile between the adult and free living stages of L. cyprinacea

The free living and parasitic stages thrive in different environments and their living requirements also vary. Thus, some difference in gene expression between these two phases of development can be expected. Out of the total of 86,024 reference transcripts available, 38,280 transcripts were found to be expressed in both stages, 19,069 transcripts only in the adult parasitic stage and 28,505 transcripts only in the free living stage. Out of the 971 significant (corrected P-value <= 0.05) transcripts showing differential expression, 659 transcripts were found to be upregulated in the adult parasitic stage and 312 transcripts were found to be downregulated.
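The significance and fold-change criteria behind these counts can be sketched in a few lines. The example below applies a Benjamini-Hochberg correction to a list of P-values and then labels each transcript using a corrected P-value of at most 0.05 and an absolute log2 fold change of at least 1. It is a simplified stand-in for the DESeq output, assuming that positive log2 fold changes mean higher expression in the adult stage; the function names and input values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def classify(log2_fold_change, padj, lfc_cutoff=1.0, alpha=0.05):
    """Label a transcript using the thresholds described in Section 2.7."""
    if padj > alpha:
        return "not significant"
    if log2_fold_change >= lfc_cutoff:
        return "upregulated"
    if log2_fold_change <= -lfc_cutoff:
        return "downregulated"
    return "below fold-change threshold"


# Hypothetical P-values and log2 fold changes for four transcripts.
pvals = [0.01, 0.04, 0.03, 0.005]
lfcs = [2.3, -1.5, 0.4, -3.0]
padj = benjamini_hochberg(pvals)
print([classify(lfc, p) for lfc, p in zip(lfcs, padj)])
```

Counting the "upregulated" and "downregulated" labels over all transcripts would give totals of the kind reported above (659 and 312, together 971).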

After host penetration, growth in adult female L. cyprinacea involves sexual maturation for egg production (expansion of the genital segment) and an increased capacity for food (mainly blood) uptake (expansion of the abdomen). The female-specific vitellogenin genes were found to be upregulated in the egg-laying adult female L. cyprinacea (Fig. 4A). Vitellogenins are incorporated into the eggs and supply them with sufficient nutrients to ensure proper development and growth after hatching until external food can be ingested and utilised autonomously. Among the differentially expressed genes, genes involved in protein digestion such as digestive cysteine proteinases, trypsin, cathepsin B, cathepsin L, chymotrypsin and serine proteases were found to be upregulated in the adult parasitic stage (Fig. 4B, C, D, E, F, G). Blood feeding causes a protein overload in the parasite. Blood-induced expression of protease transcripts would therefore be expected. These proteolytic enzymes not only help in protein digestion, but also facilitate the establishment of parasite infection through proteolytic activation of enzymes. Protease inhibitors such as serpin and Kazal-type serine protease inhibitors (KTSPIs) were also upregulated in the adult parasitic stage. They may be actively involved in the inhibition of components of the host blood coagulation cascade to maintain fluidity in the mouth parts and midgut following blood-feeding on the fish. However, carboxypeptidases were found to be upregulated in the free living stage (Fig. 4H). They might function as digestive enzymes in the juvenile gut.

Sequences encoding detoxifying enzymes such as superoxide dismutase (SOD), glutathione S-transferases (GST) and peroxiredoxin were found in the transcriptomes of both the free living and adult stages of L. cyprinacea (Fig. 4K, L, M). SOD and GST were upregulated in the free living stage, stressing their relevance for immune evasion during the initial interaction with the host. Furthermore, early development is a life stage where oxidative stress levels are high due to the presumed link between the high metabolic activity required for growth and ROS generation (Monaghan et al., 2009). Barata et al. (2005) reported similar expression patterns of these genes in Daphnia magna. However, the genes encoding peroxiredoxin, another antioxidant protein, were upregulated only in the blood feeding parasitic stage and might be actively involved in detoxifying the ROS generated through blood meal digestion.

Aquaporin genes were also found to be upregulated in the parasitic blood feeding stage (Fig. 4N). The L. cyprinacea aquaporins might help the parasite cope with the osmotic stress resulting from blood feeding and excrete the excess water taken in with the blood meal.

The peritrophic matrix of the parasite rearranges itself during the course of blood digestion, which may explain why the genes encoding peritrophins were found to be enriched in the adult parasitic stage (Fig. 4O). The peritrophic matrix serves as a molecular sieve for partially digested protein and carbohydrate, as a scaffold for proteases, peptidases and glycosidases, as a sink for toxic substances, and as a barrier to ingested pathogens. The upregulation of chitin synthase in the adult stage (Fig. 4P) might be associated with the synthesis of chitin fibrils to aid in the remodelling of the peritrophic matrix following a blood meal.

Animals with an exoskeleton grow through molting and each instar typically shows a limited increase in size. In another parasitic copepod, Lernaeocera branchialis, substantial growth and metamorphosis in adult females after the final molt have been reported, resulting in a 20-fold size increase of the abdomen (Smith and Whitfield, 1988). Large scale cuticle secretion must account for this large size increase. This might explain the upregulation of cuticle proteins in the adult female stage of L. cyprinacea (Fig. 4Q). The venom allergen genes were found to be upregulated during the blood feeding stage of L. cyprinacea (Fig. 4R). Their products may be secreted during feeding and might be involved either in suppression of the host immune system or in the prevention of clotting to prolong feeding.

3.6. Validation of gene expression in L. cyprinacea by real time qPCR

Real-time RT-PCR is frequently used to confirm data obtained from high-throughput sequencing (Chen et al., 2010; Kalavacharla et al., 2011). The expression patterns of most of the genes obtained through qRT-PCR largely corroborated the RNA-seq data. The qRT-PCR analysis thus supports the reliability of the data provided by the RNA-seq approach.

4.0. Conclusion

In conclusion, the whole transcriptome of the adult and free living stages of L. cyprinacea was subjected to Illumina HiSeq 2000 sequencing. This study is the first to obtain fundamental molecular knowledge of L. cyprinacea. Noteworthy results of this study are that a significant number of putative genes involved in parasitism were identified within the derived sequences, and that a number of microsatellite markers were predicted, which upon validation could facilitate the identification of polymorphisms within L. cyprinacea populations. Given the shortcomings of the currently available pesticide treatment against Lernaea spp., our data have created an opportunity to shorten the route towards developing more efficient and sustainable control programs, such as vaccination, against the parasite.

Acknowledgements

The authors wish to thank the NAIP-ICAR, New Delhi, for funding the research and CSIR, New Delhi, for providing the Senior Research Fellowship to the first author. Thanks are due to Genotypic Technologies, Bangalore, for sequencing support services.

References:

Altschul, S., Gish, W., Miller, W., Myers, E., Lipman, D., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.

Amaraneni, S.R., 2002. Persistence of pesticides in water, sediment and fish from fish farms in Kolleru Lake, India. J. Sci. Food Agric. 82, 918–923.

Anders, S., Huber, W., 2010. Differential expression analysis for sequence count data. Genome Biol. 11, R106.

Anon, 1988. Norwegian aquaculture: controlling the antibiotic explosion. In: Animal Pharmacy, J.B. Publication Ltd., Surrey.

Barata, C., Navarro, J.C., Varo, I., Riva, M.C., Arun, S., Porte, C., 2005. Changes in antioxidant enzyme activities, fatty acid composition and lipid peroxidation in Daphnia magna during the aging process. Comp. Biochem. Physiol. B, Biochem. Mol. Biol. 140, 81–90.

Bednarska, M., Bednarski, M., Soltysiak, Z., Polechonski, R., 2009. Invasion of Lernaea cyprinacea in rainbow trout (Oncorhynchus mykiss). Acta Sci. Pol. Med. Vet. 8, 27–32.

Chen, S., Yang, P.C., Jiang, F., Wei, Y.Y., Ma, Z.Y., Kang, L., 2010. De novo analysis of transcriptome dynamics in the migratory locust during the development of phase traits. PLoS One 5, 1–15.

Choo, P.S., 1994. Degradation of oxytetracycline hydrochloride in fresh and seawater. Asian Fish. Sci. 7, 195–200.

Gilbert, S.C., Plebanski, M., Gupta, S., Morris, J., Cox, M., Aidoo, M., Kwiatkowski, D., Greenwood, B.M., Whittle, H.C., Hill, A.V., 1998. Association of malaria parasite population structure, HLA, and immunological antagonism. Science 279, 1173–1177.

Hoffman, G.L., 1999. Parasites of North American fresh water fishes, 2nd edition. Comstock Publishing Associates, Division of Cornell University Press, Ithaca and London.

Hoole, D., Bucke, D., Burgess, P., Wellby, I., 2001. Infectious Diseases-Parasitic Crustacea Lernaea cyprinacea. In: Textbook of Diseases of Carp and Other Cyprinid Fishes, 2nd Ed. Blackwell Science, 116–117.

Kalavacharla, V., Liu, Z., Meyers, B.C., Thimmapuram, J., Melmaiee, K., 2011. Identification and analysis of common bean (Phaseolus vulgaris L.) transcriptomes by massively parallel pyrosequencing. BMC Plant Biol. 11, 135.

Kurva, R.R., Gadadhar, D., Abraham, T.J., 2013. Parasitic study of Cirrhinus mrigala (Hamilton, 1822) in selected districts of West Bengal, India. Intl. J. of Adv. Biotec and Res. 4, 419–436.

Lakra, W.S., Abidi, R., Singh, A.K., Sood, N., Rathore, G., Swaminathan, T.R., 2006. Fish introductions and quarantine: Indian perspective. National Bureau of Fish Genetic Resources, Lucknow, India.

Limin, F., Beifang, N., Zhengwei, Z., Sitao, W., Weizhong, L., 2012. CD-HIT: accelerated for clustering the next generation sequencing data. Bioinformatics 28, 3150–3152.

Monaghan, P., Metcalfe, N.B., Torres, R., 2009. Oxidative stress as a mediator of life history trade-offs: mechanisms, measurements and interpretation. Ecol. Lett. 12, 75–92.

Nandeesha, M.C., Devaraj, K.V., Murthy, C.K., 1984. Incidence of crustacean parasite Lernaea bhadraensis on fingerlings of Labeo fimbriatus (Bloch). Curr. Res. 13, 80–82.

Nandeesha, M.C., Seenappa, D., Devaraj, K.V., Murthy, C.K., 1985. Incidence of anchor worm Lernaea on new hosts of fishes. Environ. Ecol. 3, 293–295.

Piasecki, W., Goodwin, A.E., Eiras, J.C., Nowak, B.F., 2004. Importance of Copepoda in Freshwater Aquaculture. Zool. Stud. 43, 193–205.

Plaul, S.E., Romero, N.G., Barbeito, C.G., 2010. Distribution of the exotic parasite, Lernaea cyprinacea (Copepoda, Lernaeidae) in Argentina. Bull. Eur. Assoc. Fish Pathol. 30, 65–73.

Sahoo, P.K., Kar, B., 2012. Argulosis: Current understanding of host-pathogen interaction and its control. In: Invited Lectures and Abstracts of Second National Conference on Fisheries Biotechnology, 2–3 November, CIFE, Mumbai. 52–60.

Sandra, Y.D.V., 2004. Koi Husbandry, Health Assessment and Health Maintenance. Koi Health Advisor Program of the AKCA. Ph D. www.nda.agric.com.

Schlotterer, C., Amos, B., Tautz, D., 1991. Conservation of polymorphic simple sequence loci in cetacean species. Nature 354, 63–65.

Schulz, M.H., Zerbino, D.R., Vingron, M., Birney, E., 2012. Oases: Robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics 28, 1086–1092.

Smith, J.A., Whitfield, P.J., 1988. Ultrastructural studies on the early cuticular metamorphosis of adult female Lernaeocera branchialis (L) (Copepoda, Pennellidae). Hydrobiologia 167, 607–616.

Sutherland B. J. G., Stuart, G. J., Motoshige, Y., Dan, S., Sanderson, Ben, F. K., and Simon R. M.J., 2012. Transcriptomics of coping strategies in free-swimming Lepeophtheirus salmonis (Copepoda) larvae responding to abiotic stress. Mol. Ecol. 21, 6000–6014.

Tamuli, K.K., Shanbhouge, S.L., 1996. Incidence and intensity of anchor worm (Lernaea bhadrensis) infection on cultivated carps. Environ. Ecol. 14, 282–288.

Zafar, I., Minhas, I.K., Naeem, K., 2001. Seasonal occurrence of lernaeosis in pond aquaculture in Punjab. Proc. Pak. Congr. Zool. 21, 159–168.

Zerbino, D.R., Birney, E., 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829.

Figure legends

Fig. 1: Characteristics of homology search of Illumina sequences against the pfam database: species distribution as a percentage of the total homologous sequences. We used the first hit of each sequence for analysis.

Fig. 2: Gene ontology classification of the unigenes.

Fig. 3: Clusters of orthologous groups (COG) classification of the unigenes.

Fig. 4: Analyses of differentially expressed genes during Lernaea cyprinacea development. The gene expression levels of (A) Vitellogenin; (B) Digestive cysteine protease; (C) Trypsin; (D) Cathepsin B; (E) Cathepsin L; (F) Chymotrypsin; (G) Carboxypeptidase; (H) Serine protease; (I) Serpin; (J) Kazal like serine protease inhibitor; (K) SOD; (L) Glutathione S transferase; (M) Peroxiredoxin; (N) Aquaporin; (O) Peritrophin; (P) Chitin Synthase; (Q) Cuticle protein; (R) Venom Allergen.

Table 1: Summary of sequence assembly of the Lernaea cyprinacea transcriptome

                                 Adult                       Free living
Total number of raw reads        31671751 (31.67 million)    33840446 (33.84 million)
Mean read length (bp)            100                         100
Contigs Generated                50792                       69378
Maximum Contig Length (bp)       23727                       27996
Minimum Contig Length (bp)       200                         200
Average Contig Length (bp)       1,071.2 ± 1,218.5           1,154.2 ± 1,408.8
Total Contigs Length (bp)        54409281 (54.4 Mb)          80075269 (80 Mb)
Contigs >= 200 bp                50792                       69378
Contigs >= 500 bp                30213                       40700
Contigs >= 1 Kbp                 17329                       23873
Contigs >= 10 Kbp                78                          146
N50 value                        1750                        2040

Table 2: SSR mining in L. cyprinacea transcriptome.

Statistics of Microsatellite Search

Sample                                              Adult       Free Living
Total number of sequences examined                  50792       69378
Total size of examined sequences (bp)               54409281    80075269
Total number of identified SSRs                     9843        16813
Number of SSR containing sequences                  7653        12317
Number of sequences containing more than 1 SSR      1557        1842
Number of SSRs present in compound formation        988         1711
Number of SSRs with 1 units                         2352        3280
Number of SSRs with 2 units                         1147        2057
Number of SSRs with 3 units                         5193        9405
Number of SSRs with 4 units                         145         257
Number of SSRs with 5 units                         9           36
Number of SSRs with 6 units                         9           67

Unit size of microsatellite                         Minimum number of repeats
Mono nucleotide repeats                             10
Di nucleotide repeats                               6
Tri, tetra, penta, hexa nucleotide repeats          5, 5, 5, 5

Maximal number of bases interrupting 2 SSRs in a compound microsatellite: 100