Accepted Manuscript
Title: Identification of putative genes involved in parasitism in the anchor worm, Lernaea cyprinacea by de novo Transcriptome analysis.
Author: Pallavi B., Shankar K.M., Abhiman P.B., Iqlas Ahmed
PII: S0014-4894(15)00076-4
DOI: http://dx.doi.org/doi:10.1016/j.exppara.2015.03.014
Reference: YEXPR 7020
To appear in: Experimental Parasitology
Received date: 31-7-2014
Revised date: 19-3-2015
Accepted date: 20-3-2015
Please cite this article as: Pallavi B., Shankar K.M., Abhiman P.B., Iqlas Ahmed, Identification of putative genes involved in parasitism in the anchor worm, Lernaea cyprinacea by de novo Transcriptome analysis., Experimental Parasitology (2015), http://dx.doi.org/doi:10.1016/j.exppara.2015.03.014. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Identification of putative genes involved in parasitism in the anchor worm, Lernaea cyprinacea by de novo Transcriptome analysis.
Pallavi B1,2, Shankar KM2*, Abhiman PB1,2 & Iqlas Ahmed1,2
1 Aquatic Animal Health Laboratory, Department of Aquaculture, College of Fisheries, Mangalore 575002, India.
2 Department of Aquaculture, College of Fisheries, Karnataka Veterinary, Animal and Fisheries Sciences University
*Corresponding author: Dean, College of Fisheries Mangalore, Tel: +91-824-2248936; Fax: +91-824-2248366
E-mail: [email protected]
Highlights:
De novo transcriptome sequencing of adult and free living stages of L. cyprinacea.
36,054 unigenes received defined annotations in the pfam database.
Differences in transcription between free-living and parasitic stages established.
This would shorten the route towards vaccine/control strategies against the parasite.
ABSTRACT
There is little information on the genome sequence of Lernaea cyprinacea, a major ectoparasite of freshwater fish throughout the world. We subjected the L. cyprinacea transcriptome (adult and free living stages) to Illumina HiSeq 2000 sequencing. We obtained a total of 31,671,751 (31.67 million) reads for the adult parasitic stage and 33,840,446 (33.84 million) for the free living stage. The reads were assembled into 50,792 contigs for the adult stage and 69,378 for the free living stage. Using the pfam database, 41.91 % of the transcriptome was annotated. The transcriptome was mined for genes associated with parasitism. To examine gene expression changes associated with parasitism during the transition of L. cyprinacea from the free living to the parasitic stage, we studied the differentially expressed transcripts between the two stages. Microsatellite markers were also identified (9,843 for the adult stage; 16,813 for the free living stage), which would facilitate population genetic studies in various geographical isolates of Lernaea. Our data provide the most comprehensive sequence resource available for L. cyprinacea and demonstrate that Illumina sequencing allows de novo transcriptome assembly and gene expression analysis in a species lacking genome information. The data could open new avenues for a wide array of genetic, evolutionary, biological, ecological and epidemiological studies, and provide a solid foundation for the development of novel interventions against L. cyprinacea.
Keywords: Lernaea cyprinacea; Transcriptome; Parasitism
1. Introduction
Lernaea infection is a major disease problem encountered in carp culture in the Indian subcontinent and has been reported from the Indian major carps Catla (Catla catla), Rohu (Labeo rohita) and Mrigal (Cirrhinus mrigala), the exotic carps silver carp (Hypophthalmichthys molitrix) and grass carp (Ctenopharyngodon idella), and the indigenous carp Labeo fimbriatus (Nandeesha et al., 1984, 1985; Tamuli and Shanbhouge, 1996; Zafar et al., 2001). The three Indian major carps, viz. Catla, Rohu and Mrigal, contribute significantly to Indian aquaculture production, with an output of over two million tonnes (Kurva et al., 2013). Ectoparasitic diseases in freshwater fish farms of India result in an annual loss of INR 300 crores due to disease-induced mortality and impaired growth (Sahoo and Kar, 2012; Lakra et al., 2006). Lernaea infections have been reported from Africa, Asia, Europe, North America (Hoffman, 1999) and South America (Plaul et al., 2010). Lernaea cyprinacea Linnaeus, 1758, is the only cosmopolitan species in the genus Lernaea (Piasecki et al., 2004).
The life cycle of L. cyprinacea is composed of nine stages: three free living naupliar stages are followed by five copepodite stages and one adult stage. The parasite feeds on fish mucus, blood and tissue debris. Invasion destroys the scales, skin and muscles of the fish. Heavy parasitosis can cause mass death of fish and secondary bacterial or fungal infections (Bednarska et al., 2009).
Unfortunately, the only effective treatment against Lernaea spp. is the application of organophosphate and organochlorine pesticides. Their chemical stability, lipophilic nature and toxicity have led researchers to be concerned about their presence in the environment (Amaraneni, 2002). The persistence of therapeutic agents in the aquatic environment causes adverse effects on the ecosystem (Anon, 1988; Choo, 1994). Lernaea spp. have also been reported to develop resistance to certain pesticides (Hoole et al., 2001; Sandra, 2004). Thus, immunological protection of fishes against Lernaea spp. infestation is presently the practically sustainable alternative to the current use of pesticides, which is riddled with serious limitations. Owing to the complex nature of parasites, the search for vaccine targets has proven difficult. Currently no EST records are available for Lernaea cyprinacea in the National Center for Biotechnology Information (NCBI) database. The lack of genomic data has hampered the use of molecular tools in developing control strategies for L. cyprinacea.
In order to predict and prioritize novel antigenic targets expressed across different developmental stages of L. cyprinacea, we employed Illumina sequencing and predictive algorithms to explore similarities and differences between the transcriptomes of the free living and adult parasitic stages of L. cyprinacea. This study involves the first de novo transcriptome analysis of L. cyprinacea. Bioinformatic analyses of the transcriptomic data allowed a detailed exploration of the molecular changes associated with the transition from the free-living to the parasitic stage and a prediction of the roles that key transcripts play in the metabolic pathways linked to parasitism. Overall, this study provides the first insight into the molecular biology of the important parasite L. cyprinacea.
2. Materials and Methods
2.1. Parasite Isolation
Catla heavily infected with L. cyprinacea were brought to the laboratory from the farm of the Fisheries College, Mangalore, India. The parasites were carefully pulled out with forceps to avoid contamination with host tissues, snap frozen in liquid nitrogen and stored in it until use. Intact egg sacs were removed and incubated in flasks containing filtered tap water. The resulting nauplii were maintained until a majority moulted to the first copepodite stage. The free living stages were recovered onto 47-mm cellulose acetate filter membranes with a pore size of 0.22 µm (Millipore). The membranes were flash frozen in liquid nitrogen and stored in it until use (Sutherland et al., 2012). The phylogenetic status of the parasite was also checked by partial sequencing of the 18S and 28S rDNA regions (communicated for publication elsewhere) and it was confirmed as L. cyprinacea.
2.2. RNA isolation
The pooled adult (50) and free living stages (500) of L. cyprinacea were separately homogenized using a TOMY homogenizer with steel beads. Total RNA was extracted with TRIzol (Invitrogen) according to the manufacturer’s instructions.
2.3. Library preparation and sequencing
The transcriptome library for sequencing was constructed using the TruSeq RNA sample preparation kit according to the manufacturer’s instructions. Total RNA (1 µg) was subjected to poly A purification of mRNA. Purified mRNA was fragmented for 4 min at elevated temperature (94 °C) in the presence of divalent cations and reverse transcribed with Superscript III Reverse transcriptase (Invitrogen) by priming with random hexamers. Second strand cDNA was synthesized in the presence of DNA polymerase I and RNase H. The cDNA was cleaned up using Agencourt Ampure XP SPRI beads (Beckman Coulter). Illumina adapters were ligated to the cDNA molecules after end repair and addition of an A base. On completion of ligation, an SPRI clean-up was performed. The library was amplified using 11 cycles of PCR for enrichment of adapter ligated fragments. The prepared library was quantified using a Nanodrop spectrophotometer (Thermo Scientific, DE, USA) and validated for quality by running an aliquot on a High Sensitivity Bioanalyzer Chip (Agilent). The prepared library was sequenced on an Illumina HiSeq 2000 (Illumina) to generate reads of 2 x 100 bp. The sequencing reactions for the parasitic and free living stages were run at the same time to avoid confounding error profiles with real differences in transcription.
2.4. De novo assembly and sequence clustering
Raw reads were processed using a Perl script. Raw read processing involved adapter trimming, B-block removal and low quality base filtering. Read quality was assessed using the read quality check tool SeqQC. Contamination at the read level was checked by aligning the processed reads to the RNA sequences of the host fish using the Bowtie-0.12.8 tool, and reads aligning to the host fish were removed. The Velvet_1.2.10 tool (Zerbino and Birney, 2008) was used for the de novo assembly of high quality reads into contigs. The Oases_0.2.08 tool (Schulz et al., 2012) was used for transcript generation from the de novo assembled contigs.
The CD-HIT tool (Limin et al., 2012) was used for clustering of combined transcripts to form unigenes with a minimum similarity cut-off of 95%.
2.5. Ontology and annotation
Assembled transcripts were mapped against the UniProt and associated GO, pfam and COG databases using the ncbi-BLAST-2.2.28 tool (Altschul et al., 1990).
2.6. SSRs identification
The MISA tool was used for identification and localization of perfect microsatellites as well as compound microsatellites, which are interrupted by a certain number of bases.
2.7. Differential Gene Expression Analysis
The transcripts from the adult and larval stages were combined and clustered using the CD-HIT tool with a minimum similarity cutoff of 95%. The reads were aligned using the Bowtie (version 0.12.8) tool and the read count profile was generated. The differential gene expression analysis was carried out with the DESeq tool (Anders and Huber, 2010). The P-value was calculated using the significance test incorporated within the DESeq package, and the corrected P-value was calculated using the Benjamini-Hochberg procedure. Transcripts with a corrected P-value <= 0.05 were considered significant. An absolute log2 ratio >= 1 was used as the threshold to judge a difference in gene expression: a transcript was considered upregulated if the log2 fold change was >= 1 and downregulated if it was <= -1.
2.8. Validation of gene expression in Lernaea cyprinacea by real time qPCR
Ten genes were selected at random from the differentially expressed genes for validation by quantitative real time PCR analysis. The chosen transcripts encoded NADH dehydrogenase, Aquaporin, Serpin, Kazal like serine protease inhibitor, Cathepsin B, Cathepsin L, Vitellogenin, Glutathione S Transferase, Peritrophin and Trypsin. Total RNA was isolated from the adult L. cyprinacea sample using the Trizol reagent. Using the AffinityScript QPCR cDNA synthesis kit, 2000 ng of DNase treated RNA was reverse transcribed to make 100 ng/µl of cDNA. The primers were manually designed using Gene Runner version 3.05. The primers were validated and the amplicon sizes were confirmed on a 2% agarose gel. Relative quantification by qPCR was then carried out using Brilliant II SYBR Green qPCR Master mix. The experiment was conducted on the Stratagene Mx3005P (Agilent Technologies) platform. PCR consisted of initial denaturation at 95°C for 10 min followed by 40 cycles of 95°C for 30 s, 58°C for 1 min and 72°C for 1 min. A melt curve was also performed after the assay to check the specificity of the reaction. Quantification of selected mRNA transcript abundance was performed using the comparative threshold cycle (CT) method.
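The comparative CT calculation mentioned above can be sketched as follows. This is a minimal illustration, not the authors' analysis script: it assumes roughly 100% amplification efficiency, and the reference gene and calibrator sample are hypothetical because the text does not name them. The function name `relative_expression` and the CT values are invented for the example.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change by the comparative CT (2^-ddCT) method.

    Assumes ~100% PCR efficiency for both target and reference gene
    (an assumption; the manuscript does not report efficiencies).
    """
    # Normalise the target gene to the reference gene within each sample.
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    # Compare the sample of interest with the calibrator sample.
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: target gene vs. an unnamed reference gene.
fold_change = relative_expression(
    ct_target_sample=22.0, ct_ref_sample=18.0,          # e.g. adult stage
    ct_target_calibrator=25.0, ct_ref_calibrator=18.0,  # e.g. free living stage
)
print(fold_change)  # 8.0, i.e. 8-fold higher in the sample than in the calibrator
```

A lower CT means more template, so a negative delta-delta-CT translates into a fold change greater than one.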
3.0. Results and Discussion
3.1. Illumina sequencing and sequence assembly
De novo assembly of a parasite transcriptome is a challenging task owing to the lack of sufficient reference genomes or gene sequences in public databases. A total of 31.67 million reads were generated for the adult stage and 33.84 million reads for the free living stage of L. cyprinacea on the Illumina HiSeq 2000 platform (Table 1). While this number of reads should capture a reasonable proportion of the genes present in the RNA sample, it will not provide a full characterisation of the transcriptome. Nevertheless, with the development of more efficient assembly tools in future, the raw data can be utilised for better assembly. The raw reads produced in the present study have been deposited in the NCBI Sequence Read Archive Database (Accession No. PRJNA232511). The length of transcript contigs ranged from 200 to 23,727 bp for the adult stage and from 200 to 27,996 bp for the free living stage. The adult stage had an average contig length of 1,071.2 ± 1,218.5 bp with an N50 value of 1,750 bp, and the free living stage had an average contig length of 1,154.2 ± 1,408.8 bp with an N50 value of 2,040 bp.
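For readers unfamiliar with the N50 statistic reported above, the following sketch shows one common way to compute it from a list of contig lengths. It is illustrative only: the function name `n50` and the toy contig lengths are assumptions, and the assembler's own reporting may differ in edge cases.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (bp).

    N50 is the length L such that contigs of length >= L together
    contain at least half of all assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from this study.
toy_contigs = [200, 300, 500, 1000, 2000]
print(n50(toy_contigs))  # 2000: the longest contig alone holds half of the 4000 bp
```

Because the statistic is weighted by length, it is always at least as large as the median contig length, which is why the N50 values in Table 1 exceed the average contig lengths.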
3.2. Annotation of predicted proteins
Distinct sequences were searched against the pfam database with a cut-off E-value of 1.0E-4 to annotate the unigenes. In total, 36,054 unigenes (41.91 % of all distinct sequences) matched known genes; the other 49,970 unigenes (58.09 %) failed to acquire annotation information in the pfam database.
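The annotation percentages quoted above follow directly from the unigene counts. The short check below reproduces them; the constant and function names are chosen for illustration only.

```python
TOTAL_UNIGENES = 86024   # all distinct sequences (from the text)
PFAM_ANNOTATED = 36054   # unigenes with a pfam match at E-value <= 1.0E-4


def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to 2 decimals."""
    return round(100.0 * part / whole, 2)


unannotated = TOTAL_UNIGENES - PFAM_ANNOTATED
print(unannotated)                               # 49970
print(percent(PFAM_ANNOTATED, TOTAL_UNIGENES))   # 41.91
print(percent(unannotated, TOTAL_UNIGENES))      # 58.09
```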
The E-value distribution of the top hits in the pfam database showed that 58 % of the mapped sequences had strong homology (1.0E-50 to 1.0E-150), whereas the E-values of the remaining 42 % of the homologous sequences ranged between 1.0E-4 and 1.0E-49. The similarity distribution had a comparable pattern, with 19 % of the sequences having a similarity higher than 80 %, while 80 % of the hits had a similarity ranging from 30 to 80 %. The species distribution of the top pfam hits for each unique sequence is shown in Fig. 1.
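Selecting the top hit per sequence under an E-value cut-off can be sketched as below. This assumes the BLAST results were written in the standard 12-column tabular format (BLAST+ `-outfmt 6`), which the manuscript does not state; the function name `top_hits` and the example records are hypothetical.

```python
def top_hits(tabular_lines, max_evalue=1e-4):
    """Keep the best-scoring hit per query from BLAST tabular output.

    Expects the default 12 tab-separated columns of BLAST+ outfmt 6:
    qseqid sseqid pident length mismatch gapopen qstart qend
    sstart send evalue bitscore.
    """
    best = {}
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue > max_evalue:
            continue  # hit is weaker than the cut-off
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical records; identifiers are invented for illustration.
example = [
    "unigene_1\tPF00089\t62.0\t210\t80\t0\t1\t210\t5\t214\t1e-60\t220.0",
    "unigene_1\tPF00112\t35.0\t150\t97\t1\t1\t150\t9\t158\t1e-10\t60.0",
    "unigene_2\tPF00014\t30.0\t50\t35\t0\t1\t50\t1\t50\t0.5\t20.0",
]
hits = top_hits(example)
print(sorted(hits))            # ['unigene_1']
print(hits["unigene_1"][0])    # PF00089
```

Under this sketch, `unigene_2` would count among the unannotated sequences because its only hit has an E-value above the cut-off.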
3.3. Unigene functional annotation by Gene ontology (GO) and Classification of Clusters of Orthologous Groups (COG)
Gene Ontology (GO) is an international standardized gene functional classification system and covers three domains: cellular component, molecular function, and biological process. A total of 86,024 unigenes were assigned to 2,439 GO terms. Of these, 1,085 GO terms originated from the Biological Process domain, 1,052 from the Cellular Component domain and 302 from the Molecular Function domain (Fig. 2).
Out of the total 86,024 unigenes, 28,498 unigenes received defined annotations in the biological process category, 49,762 in the molecular function category and 21,259 in the cellular component category. The most highly represented groups in the biological process category were translation (2,716 unigenes), proteolysis (2,114 unigenes) and small GTPase mediated signal transduction (716 unigenes). The molecular function classification showed a predominance of the ATP binding category (4,689 unigenes), followed by structural constituents of the ribosome (2,828 unigenes) and zinc ion binding (2,801 unigenes). Under the cellular component category, genes coding for proteins integral to membrane (3,966 unigenes), ribosome (2,578 unigenes) and nucleus (2,058 unigenes) were highly represented.
For more accurate annotation of their functions, we aligned the unigenes to the COG database to find homologous genes (Fig. 3). In total, 17,573 unigenes (20.43%) were annotated and formed 24 COG classifications. Among the functional classes, the cluster “translation, ribosomal structure and biogenesis” constituted the largest group (3,650; 20.77%), followed by “general function prediction” (2,790; 15.87%) and “post translational modification, protein turn over and chaperones” (1,833; 10.43%); the two smallest groups were “cell motility” (25; 0.142%) and “nuclear structure” (2; 0.011%).
3.4. Identification of short sequence repeats (SSRs)
Microsatellites are widely used genetic markers in population genetic and epidemiological studies (e.g. Schlotterer et al., 1991; Gilbert et al., 1998). The remarkable sequence conservation observed around microsatellite loci may be used for the development of host-fish species-specific probes for the study of L. cyprinacea populations and their epidemiology. The most prevalent SSR type in both the adult and free living samples was tri-nucleotides, followed by mono-nucleotides, then di-nucleotides, tetra-nucleotides, hexa-nucleotides and penta-nucleotides (Table 2).
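To make the notion of a perfect microsatellite concrete, the sketch below scans a sequence for tandem repeats using the minimum repeat numbers listed in Table 2 (10 for mono-, 6 for di- and 5 for tri- to hexa-nucleotide units). It is a simplified illustration and not the MISA code: compound microsatellites are not handled, and the function names and the example sequence are assumptions.

```python
import re

# Unit size (bp) -> minimum number of repeats, as listed in Table 2.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def is_redundant(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. 'AA', 'ATAT')."""
    n = len(motif)
    return any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_perfect_ssrs(seq):
    """Return (1-based start, motif, repeat count) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_redundant(motif):
                continue  # already reported under its shorter unit size
            hits.append((match.start() + 1, motif, len(match.group(0)) // unit))
    return sorted(hits)


# Hypothetical sequence: a poly-A run of 12 and five copies of CAG.
example_seq = "GGC" + "A" * 12 + "CGT" + "CAG" * 5 + "TT"
print(find_perfect_ssrs(example_seq))  # [(4, 'A', 12), (19, 'CAG', 5)]
```

The redundancy check prevents a mono-nucleotide run from being counted a second time as a di-nucleotide repeat of `AA`.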
3.5. Changes in gene expression profile between the adult and free living stages of L. cyprinacea
The free living and parasitic stages thrive in different environments and their living requirements also vary. Thus, some difference in gene expression between these two phases of development can be expected. Out of the total 86,024 reference transcripts available, 38,280 transcripts were found to be expressed in both stages, 19,069 transcripts only in the adult parasitic stage and 28,505 transcripts only in the free living stage. Out of the 971 significant (P-value <= 0.05) transcripts showing differential expression, 659 transcripts were found to be upregulated in the adult parasitic stage and 312 transcripts were found to be downregulated.
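The way transcripts are sorted into upregulated and downregulated sets can be sketched as follows. This is not the DESeq implementation: it only illustrates the Benjamini-Hochberg correction and the thresholds described in the Methods (corrected P-value <= 0.05, absolute log2 fold change >= 1). The function names are hypothetical, and treating a positive log2 fold change as higher expression in the adult stage is an assumption.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def classify(log2_fold_change, adjusted_p, alpha=0.05, lfc_cutoff=1.0):
    """Label a transcript using the significance and fold-change thresholds."""
    if adjusted_p > alpha or abs(log2_fold_change) < lfc_cutoff:
        return "not significant"
    return "upregulated" if log2_fold_change > 0 else "downregulated"


# Hypothetical raw P-values and log2 fold changes for four transcripts.
raw_p = [0.01, 0.04, 0.03, 0.005]
log2_fc = [2.3, -1.5, 0.4, -3.0]
adj_p = benjamini_hochberg(raw_p)
labels = [classify(fc, p) for fc, p in zip(log2_fc, adj_p)]
print(labels)
```

In this toy example the third transcript passes the significance filter but not the fold-change filter, so it is not counted as differentially expressed.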
After host penetration, growth in adult female L. cyprinacea involves sexual maturation for egg production (expansion of the genital segment) and an increased capacity for food (mainly blood) uptake (expansion of the abdomen). The female specific vitellogenin genes were found to be upregulated in the egg laying adult female L. cyprinacea (Fig. 4A). Vitellogenins are incorporated into the eggs and supply them with sufficient nutrients to ensure proper development and growth after hatching until external food can be ingested and utilised autonomously. Among the differentially expressed genes, genes involved in protein digestion, such as digestive cysteine proteinases, trypsin, cathepsin B, cathepsin L, chymotrypsin and serine proteases, were found to be upregulated in the adult parasitic stage (Fig. 4B, C, D, E, F, G). Blood feeding causes protein overload in the parasite; blood-induced expression of protease transcripts would therefore be expected. These proteolytic enzymes not only help in protein digestion, but also facilitate the establishment of parasite infection through proteolytic activation of enzymes. Protease inhibitors such as serpin and Kazal-type serine protease inhibitors (KTSPIs) were also upregulated in the adult parasitic stage. They may be actively involved in the inhibition of components of the host blood coagulation cascade to maintain fluidity in the mouth parts and midgut following blood feeding on the fish. However, carboxypeptidases were found to be upregulated in the free living stage (Fig. 4H). They might function as digestive enzymes in the juvenile gut.
Sequences encoding detoxifying enzymes such as superoxide dismutase (SOD), glutathione S-transferases (GST) and peroxiredoxin were found in the transcriptomes of both the free living and adult stages of L. cyprinacea (Fig. 4K, L, M). SOD and GST were upregulated in the free living stage, underlining their relevance for immune evasion during the initial interaction with the host. Furthermore, early development is a life stage in which oxidative stress levels are high due to the presumed link between the high metabolic activity required for growth and ROS generation (Monaghan et al., 2009). Barata et al. (2005) reported similar expression patterns of these genes in Daphnia magna. However, the genes encoding peroxiredoxin, another antioxidant protein, were upregulated only in the blood feeding parasitic stage and might be actively involved in detoxifying the ROS generated through blood meal digestion.
Aquaporin genes were also found to be upregulated in the parasitic blood feeding stage (Fig. 4N). The L. cyprinacea aquaporins might help cope with the osmotic stress resulting from blood feeding and excrete excess water coming in through the blood meal.
The peritrophic matrix of the parasite rearranges itself during the course of blood digestion and hence the genes encoding peritrophins were found to be enriched in the adult parasitic stage (Fig. 4O). The peritrophic matrix serves as a molecular sieve for partially digested protein and carbohydrate, as a scaffold for proteases, peptidases and glycosidases, as a sink for toxic substances, and as a barrier to ingested pathogens. The upregulation of chitin synthase in the adult stage (Fig. 4P) might be associated with the synthesis of chitin fibrils to aid in the remodelling of the peritrophic matrix following a blood meal.
Animals with an exoskeleton grow through molting and each instar typically shows a limited increase in size. In another parasitic copepod, Lernaeocera branchialis, substantial growth and metamorphosis in adult females after the final molt have been reported, resulting in a 20-fold size increase of the abdomen (Smith and Whitfield, 1988). Large scale cuticle secretion must account for this large size increase. This might explain the upregulation of cuticle proteins in the adult female stage of L. cyprinacea (Fig. 4Q). The venom allergen genes were found to be upregulated during the blood feeding stage of L. cyprinacea (Fig. 4R). These may be secreted during feeding and might be involved either in suppression of the host immune system or in the prevention of clotting to prolong feeding.
3.6. Validation of gene expression in L. cyprinacea by real time qPCR
Real-time RT-PCR is frequently used to confirm data obtained from high-throughput sequencing (Chen et al., 2010; Kalavacharla et al., 2011). The expression patterns of most of the genes obtained through qRT-PCR largely corroborated the RNA-seq data. The qRT-PCR analysis confirms that the RNA-seq approach has provided reliable data.
4.0. Conclusion
In conclusion, the whole transcriptome of the adult and free living stages of L. cyprinacea was subjected to Illumina HiSeq 2000 sequencing. This study is the first to obtain fundamental molecular knowledge of L. cyprinacea. Noteworthy results of this study are that a significant number of putative genes involved in parasitism were identified within the derived sequences, and that a number of microsatellite markers were predicted, which upon validation could facilitate the identification of polymorphisms within L. cyprinacea populations. Given the shortcomings of the currently available pesticide treatments against Lernaea spp., our data have created an opportunity to shorten the route towards developing more efficient and sustainable control programs, such as vaccination against the parasite.
Acknowledgements
The authors wish to thank the NAIP-ICAR, New Delhi, for funding the research and CSIR, New Delhi for providing the Senior Research Fellowship to the first author. Thanks are due to Genotypic Technologies, Bangalore for sequencing support services.
References:
Altschul, S., Gish, W., Miller, W., Myers, E., Lipman, D., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Amaraneni, S.R., 2002. Persistence of pesticides in water, sediment and fish from fish farms in Kolleru Lake, India. J. Sci. Food Agric. 82, 918–923.
Anders, S., Huber, W., 2010. Differential expression analysis for sequence count data. Genome Biol. 11, R106.
Anon, 1988. Norwegian aquaculture: controlling the antibiotic explosion. In: Animal Pharmacy, J.B. Publication Ltd., Surrey.
Barata, C., Navarro, J.C., Varo, I., Riva, M.C., Arun, S., Porte, C., 2005. Changes in antioxidant enzyme activities, fatty acid composition and lipid peroxidation in Daphnia magna during the aging process. Comp. Biochem. Physiol. B, Biochem. Mol. Biol. 140, 81–90.
Bednarska, M., Bednarski, M., Soltysiak, Z., Polechonski, R., 2009. Invasion of Lernaea cyprinacea in rainbow trout (Oncorhynchus mykiss). Acta Sci. Pol. Med. Vet. 8, 27–32.
Chen, S., Yang, P.C., Jiang, F., Wei, Y.Y., Ma, Z.Y., Kang, L., 2010. De novo analysis of transcriptome dynamics in the migratory locust during the development of phase traits. PLoS One. 5, 1–15.
Choo, P.S., 1994. Degradation of oxytetracycline hydrochloride in fresh and seawater. Asian Fish. Sci. 7, 195–200.
Gilbert, S.C., Plebanski, M., Gupta, S., Morris, J., Cox, M., Aidod, M., Kwiatkowski, D., Greenwood, B.M., Whittle, H.C., Hill, A.V., 1998. Association of malaria parasite population structure, HLA, and immunological antagonism. Science. 279, 1173–1177.
Hoffman, G.L., 1999. Parasites of North American fresh water fishes, 2nd edition, Comstock Publishing Associates, Division of Cornell University Press, Ithaca and London.
Hoole, D., Bucke, D., Burgess, P., Wellby, I., 2001. Infectious Diseases-Parasitic Crustacea Lernaea cyprinacea. In: Textbook of Diseases of Carp and Other Cyprinid Fishes, 2nd Ed. Blackwell Science. 116–117.
Kalavacharla, V., Liu, Z., Meyers, B.C., Thimmapuram, J., Melmaiee, K., 2011. Identification and analysis of common bean (Phaseolus vulgaris L.) transcriptomes by massively parallel pyrosequencing. BMC Plant Biol. 11, 135.
Kurva, R.R., Gadadhar, D., Abraham, T.J., 2013. Parasitic study of Cirrhinus mrigala (Hamilton, 1822) in selected districts of West Bengal, India. Intl. J. of Adv. Biotec and Res. 4, 419–436.
Lakra, W.S., Abidi, R., Singh, A.K., Sood, N., Rathore, G., Swaminathan, T.R., 2006. Fish introductions and quarantine: Indian perspective. National Bureau of Fish Genetic Resources, Lucknow, India.
Limin, F., Beifang, N., Zhengwei, Z., Sitao, W., Weizhong, L., 2012. CD-HIT: accelerated for clustering the next generation sequencing data. Bioinformatics. 28, 3150–3152.
Monaghan, P., Metcalfe, N.B., Torres, R., 2009. Oxidative stress as a mediator of life history trade-offs: mechanisms, measurements and interpretation. Ecol. Lett. 12, 75–92.
Nandeesha, M.C., Devaraj, K.V., Murthy, C.K., 1984. Incidence of crustacean parasite Lernaea bhadraensis on fingerlings of Labeo fimbriatus (Bloch). Curr. Res. 13, 80–82.
Nandeesha, M.C., Seenappa, D., Devaraj, K.V., Murthy, C.K., 1985. Incidence of anchor worm Lernaea on new hosts of fishes. Environ. Ecol. 3, 293–295.
Piasecki, W., Goodwin, A.E., Eiras, J.C., Nowak, B.F., 2004. Importance of Copepoda in Freshwater Aquaculture. Zool. Stud. 43, 193–205.
Plaul, S.E., Romero, N.G., Barbeito, C.G., 2010. Distribution of the exotic parasite, Lernaea cyprinacea (Copepoda, Lernaeidae) in Argentina. Bull. Eur. Assoc. Fish Pathol. 30, 65–73.
Sahoo, P.K., Kar, B., 2012. Argulosis: Current understanding of host-pathogen interaction and its control. In: Invited Lectures and Abstracts of Second National Conference on Fisheries Biotechnology, 2–3 November, CIFE, Mumbai. 52–60.
Sandra, Y.D.V., 2004. Koi Husbandry, Health Assessment and Health Maintenance. Koi Health Advisor Program of the AKCA. Ph D. www.nda.agric.com.
Schlotterer, C., Amos, B., Tautz, D., 1991. Conservation of polymorphic simple sequence in cetacean species. Nature. 354, 63–65.
Schulz, M.H., Zerbino, D.R., Vingron, M., Birney, E., 2012. Oases: Robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 28, 1086–1092.
Smith, J.A., Whitfield, P.J., 1988. Ultrastructural studies on the early cuticular metamorphosis of adult female Lernaeocera branchialis (L) (Copepoda, Pennellidae). Hydrobiologia. 167, 607–616.
Sutherland B. J. G., Stuart, G. J., Motoshige, Y., Dan, S., Sanderson, Ben, F. K., and Simon R. M.J., 2012. Transcriptomics of coping strategies in free-swimming Lepeophtheirus salmonis (Copepoda) larvae responding to abiotic stress. Mol. Ecol. 21, 6000–6014.
Tamuli, K.K., Shanbhouge, S.L., 1996. Incidence and intensity of anchor worm (Lernaea bhadrensis) infection on cultivated carps. Environ. Ecol. 14, 282–288.
Zafar, I., Minhas, I.K., Naeem, K., 2001. Seasonal occurrence of lernaeosis in pond aquaculture in Punjab. Proc. Pak. Congr. Zool. 21, 159–168.
Zerbino, D.R., Birney, E., 2008. Velvet: Algorithms for de novo short read assembly using de Bruijn graphs. Genome Res. 18, 821–829.
Figure legends
Fig. 1: Characteristics of the homology search of Illumina sequences against the pfam database: species distribution as a percentage of the total homologous sequences. We used the first hit of each sequence for analysis.
Fig. 2: Gene ontology classification of the unigenes.
Fig. 3: Clusters of orthologous groups (COG) classification of the unigenes.
Fig. 4: Analyses of differentially expressed genes during Lernaea cyprinacea development. The gene expression levels of A) Vitellogenin, B) Digestive cysteine protease, C) Trypsin, D) Cathepsin B, E) Cathepsin L, F) Chymotrypsin, G) Carboxypeptidase, H) Serine protease, I) Serpin, J) Kazal like serine protease inhibitor, K) SOD, L) Glutathione S transferase, M) Peroxiredoxin, N) Aquaporin, O) Peritrophin, P) Chitin Synthase, Q) Cuticle protein, R) Venom Allergen.
Table 1: Summary of sequence assembly of the Lernaea cyprinacea transcriptome
                               Adult                        Free living
Total number of raw reads      31671751 (31.67 million)     33840446 (33.84 million)
Mean read length (bp)          100                          100
Contigs Generated              50792                        69378
Maximum Contig Length (bp)     23727                        27996
Minimum Contig Length (bp)     200                          200
Average Contig Length (bp)     1,071.2 ± 1,218.5            1,154.2 ± 1,408.8
Total Contigs Length (bp)      54409281 (54.4 Mb)           80075269 (80 Mb)
Contigs >= 200 bp              50792                        69378
Contigs >= 500 bp              30213                        40700
Contigs >= 1 Kbp               17329                        23873
Contigs >= 10 Kbp              78                           146
N50 value (bp)                 1750                         2040
Table 2: SSR mining in L. cyprinacea transcriptome.
Statistics of Microsatellite Search
Sample                                            Adult       Free Living
Total number of sequences examined                50792       69378
Total size of examined sequences (bp)             54409281    80075269
Total number of identified SSRs                   9843        16813
Number of SSR containing sequences                7653        12317
Number of sequences containing more than 1 SSR    1557        1842
Number of SSRs with 1 units                       2352        3280
Number of SSRs with 2 units                       1147        2057
Number of SSRs with 3 units                       5193        9405
Number of SSRs with 4 units                       145         257
Number of SSRs with 5 units                       9           36
Number of SSRs with 6 units                       9           67
Number of SSRs present in compound formation      988         1711
Search parameters
Unit size of microsatellite                       Minimum number of repeats
Mono nucleotide repeats                           10
Di nucleotide repeats                             6
Tri, tetra, penta, hexa nucleotide repeats        5, 5, 5, 5
Maximal number of bases interrupting 2 SSRs in a compound microsatellite: 100