An expressed sequence tag analysis of the life-cycle of the parasitic nematode Strongyloides ratti


Molecular & Biochemical Parasitology 142 (2005) 32–46

Fiona J. Thompson a,∗,1, Makedonka Mitreva b,1, Gary L.A. Barker a, John Martin b, Robert H. Waterston b, James P. McCarter b,c, Mark E. Viney a

a School of Biological Sciences, University of Bristol, Woodland Road, Bristol BS8 1UG, UK
b Genome Sequencing Center, Department of Genetics, Washington University School of Medicine, 4444 Forest Park Boulevard, St. Louis, MO 63108, USA
c Divergence Inc., 893 North Warson Road, St. Louis, MO 63141, USA

Received 24 January 2005; received in revised form 17 March 2005; accepted 17 March 2005
Available online 18 April 2005

Abstract

14,761 expressed sequence tags (ESTs) were generated, representing five stages during the parasitic and free-living phases of the life-cycle of the parasitic nematode Strongyloides ratti. These ESTs formed 4152 clusters, of which 97% contained 10 or fewer ESTs and 66% were singletons. These 4152 clusters are likely to represent approximately 20% of S. ratti’s genes. The clusters’ consensus sequences were compared against three databases: (i) Caenorhabditis elegans and C. briggsae sequences; (ii) other nematode sequences; (iii) non-nematode sequences. This approach has identified putative nematode-specific genes that may be targets for developing approaches for parasitic nematode control. Approximately 25% of the clusters have no significant alignments and may therefore represent novel genes. The EST representation between the libraries was used to analyse stage-specific or -biased expression in silico. This showed that 81% of clusters are present in only one library and 12% are present in any two libraries, indicating substantial stage-specificity of gene expression. The 30 most abundantly expressed clusters were analysed in further detail. Many of these have significantly different parasitic- or free-living-specific or -biased expression. Many of the parasitic-specific genes are, as yet, uncharacterised: one of these represents 25% of all ESTs obtained from the parasitic stage.

© 2005 Elsevier B.V. All rights reserved.

Keywords: Strongyloides ratti; ESTs; Gene expression; Genomics; Nematode; Parasite; C. elegans

Abbreviations: iL3, infective third stage larvae; p.i., post-infection; EST, expressed sequence tag; RNAi, RNA interference; BLAST, basic local alignment search tool; L, larval stage; ORF, open reading frame; aa, amino acids; CI, confidence intervals

Note: DNA sequences reported in this paper have been deposited in GenBank, EMBL and DDBJ under the accession numbers detailed in Appendix A. Sequences are also available at http://www.nematode.net/.

∗ Corresponding author. Tel.: +44 117 928 7470; fax: +44 117 925 7374. E-mail address: [email protected] (F.J. Thompson).
1 These authors contributed equally to this work.
0166-6851/$ – see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.molbiopara.2005.03.006

1. Introduction

Parasitic nematodes are important pathogens of humans, other animals and plants [1] (http://www.who.int/wormcontrol/newsletter/en/ Issue 1). Members of the genus Strongyloides are parasites of vertebrates, with some 60 species described from mammals, birds, amphibians and reptiles [2]. Two species infect humans: S. stercoralis and S. fuelleborni [3]. Overall, it is estimated that some 50–100 million individuals are infected with Strongyloides spp. [1,4]. A number of Strongyloides spp. are used in experimental studies, such as S. stercoralis in mice (a non-natural host) and in dogs (a host from which S. stercoralis has been isolated, but the extent to which parasites are shared between humans and dogs is not known); S. ratti and S. venezuelensis in mice (a non-natural host, for both species) as well as the natural host, rats [5–7]. Thus, in addition to the advantages of working with laboratory rodents, the advantage of working


Fig. 1. Life-cycle of S. ratti with two discrete developmental switches, shown as grey boxes: (1) a sex determination event; (2) a female-only developmental switch [10]. L, denotes larval stages, as numbered. Stages that were used to make libraries are indicated in colour; blue, free-living L1 library; yellow, free-living L2 library; green, mixed free-living adult and iL3s library; red, parasitic female libraries from days 6 and 15 p.i.

with S. ratti and S. venezuelensis in rats is that the natural host–parasite relationship is maintained.

The life-cycle of Strongyloides spp. is unique among nematode parasites of vertebrates (Fig. 1). Infective third-stage larvae (iL3s) present in the environment encounter a host and enter by skin penetration. These larvae migrate through the host body, moulting via a fourth larval stage (L4) into adult parasites in the small intestine [8]. The parasitic stages are female only, which in S. ratti have been shown to reproduce by mitotic parthenogenesis [9]. These parasitic females lie embedded in the mucosa of the small intestine from where they produce eggs that are passed out of the host in faeces. These eggs give rise to a free-living generation. The control of the development of this free-living generation is complex and has been the subject of extensive study. This is best understood for S. ratti (Fig. 1). The eggs of S. ratti are male and female [10]. Male eggs hatch and moult through four larval stages into free-living adult males only. The female eggs have a developmental choice (Fig. 1). Firstly, they can develop in a manner similar to that of male eggs and develop into free-living adult females. These free-living adults mate by sexual reproduction [11] and the female lays eggs. These eggs hatch and develop through two larval stages into iL3s that persist in the environment until a suitable host becomes available. In the alternative developmental route of female eggs, they can develop through two larval stages directly into iL3s. Thus, in this life-cycle there are two points of control: the sex ratio of the eggs produced by parasitic females, and a developmental choice of female eggs [12]. Extensive investigation of the factors that affect these control points has shown that the temperature external to the hosts affects the female-only developmental choice and that the immune status of the host from which eggs are passed affects both the sex ratio and the female-only developmental choice [7,12–14]. This complex life-cycle is unique to Strongyloides. However, the genus Parastrongyloides, which is the genus most closely related to Strongyloides, has essentially the same free-living life-cycle, but has both parasitic males and females and multiple free-living adult generations [15,16].

The infective stage of many nematodes, including Strongyloides spp., is a third larval stage. This is thought to be analogous to the so-called dauer larvae of many free-living nematodes, including Caenorhabditis elegans [17,18]. Dauer larvae are ‘alternative’, arrested larval forms that develop at times of environmental stress. With respect to S. ratti, not only is there an analogy between its iL3s and the dauer larvae of C. elegans, but there is also an analogy between the developmental choice of young larvae of C. elegans (dauer or non-dauer development) and the female-only choice of S. ratti larvae [19]. Previous analysis of ESTs from S. stercoralis compared S. stercoralis clusters specific to L1s or specific to iL3s with C. elegans dauer-specific and nutrient-rich-specific (i.e. non-dauer) genes [20]. This found a significant number of BLAST matches between S. stercoralis L1-specific and C. elegans nutrient-rich-specific genes. However, it was also found that


S. stercoralis iL3-specific or -biased clusters were more likely to have significant matches to C. elegans dauer-specific genes than expected by chance [20]. Overall, this suggests that there is some conservation of stage-specific or -biased gene expression between S. stercoralis and C. elegans, which seems to be greater for S. stercoralis L1-specific and C. elegans nutrient-rich-specific genes. However, these analyses are particularly sensitive to the data sets used in the comparison and there is considerable movement in the genes whose expression is defined as specific or biased to one stage for each species (e.g. [21]). There is now a good phenomenological understanding of the S. ratti life-cycle, but the next challenge is to begin to investigate and understand its molecular and genetic control.

The parasitic female stages of S. ratti are also an excellent system for studying the host–parasite interaction. In S. ratti infections in rats, an anti-S. ratti immune response develops as an infection progresses [7]. As the immune response develops, the parasitic females become progressively shorter, their per capita fecundity is reduced and they move to a more posterior position in the intestine [22–25]. These effects are reversed if the rat host is immunosuppressed, and they do not occur in immunodeficient or immunosuppressed hosts [25,26]. Many of these effects are shared with the effects of the host immune response against other nematodes in other host–parasite systems (e.g. [27]). Despite the fact that these phenomena are widespread, the molecular basis underlying these changes, or indeed the effect of the host immune response on any molecular processes within parasitic nematodes, has not been investigated [28]. Our desire to understand the molecular control of the S. ratti life-cycle, and to investigate these effects of the host immune response on S. ratti parasitic females, was the impetus to undertake this EST analysis and to use this to build S. ratti microarrays for further expression studies [28].

C. elegans was the first metazoan whose complete genome sequence was determined [29], which has been followed by the genome sequence of C. briggsae [30]. There has also been extensive EST analysis of many other nematodes and there are over 400,000 sequences available from a range of free-living nematodes and parasites of humans, other animals and plants (some 19 genera of parasitic nematodes in total) from all of the five major Nematoda clades, except Clade II [31–33]. The first genome sequence of a parasitic nematode, Brugia malayi, is now becoming available [34]. This wealth of genomic information has therefore begun to develop a picture of the genomes of nematodes and of their detectably expressed genes. The phylogenetic relationships of the nematodes have shown that parasitism has evolved a number of times within this group [31]. A key area that is now ripe for investigation is to determine the extent to which the genomes of free-living nematodes and their parasitic relatives are shared and the extent to which the genomes of different parasitic nematodes are shared. Recent analyses of partial genomes of a range of parasitic nematodes and some free-living species have shown that there are substantial differences in the representation of expressed genes between different clades and species [33]. The significant consequence of this is that it may be difficult to infer gene, genetic or biological information between different clades or species of nematodes, which therefore puts in doubt the utility of studying free-living nematodes to understand the biology of parasitic nematodes.

The work presented here on S. ratti complements and extends the EST analysis of parasitic nematodes. In this S. ratti study, ESTs have been obtained from different life-cycle stages, which not only sought to ensure that life-cycle-wide sampling occurred, but also to allow an in silico analysis of gene expression between different life-cycle stages. This work has discovered some 4000 genes from S. ratti and has shown significant stage-specific or -biased expression of many of these. Gene expression in the parasitic female stages is very heavily biased towards the expression of a small number of parasitic female-specific genes that are, in essence, completely uncharacterised in any other organism. This work will now allow a full molecular analysis of the biology of this remarkable organism, which is already being undertaken by the use of S. ratti microarrays.

2. Materials and methods

2.1. Parasite material and library construction

The S. ratti isofemale line ED321 heterogenic was used throughout [13]. Infections were maintained in Wistar rats and the free-living generation was grown in faecal cultures at 19 °C, as previously described, unless otherwise stated [13,26]. cDNA libraries were made from five different stages of the life-cycle: (1) free-living L1s; (2) free-living L2s; (3) mixed free-living adults and iL3s; (4) parasitic females at day 6 post-infection (p.i.); (5) parasitic females at day 15 p.i. At day 6 p.i. the parasitic females are newly present in the intestine of infected hosts and are not yet subject to the effects of the host immune response; at day 15 p.i. the parasitic females are subject to the effects of the host immune response [25]. For some analyses, the ESTs from the two parasitic female libraries were combined.

L1s were collected from fresh faeces held in a Baermann funnel for 6–8 h at 25 °C. For L2 collection, faeces were cultured for 24 h at 19 °C; to collect mixed free-living adults and iL3s, faeces were cultured for 72 h at 19 °C, after which free-living stages were collected using a Baermann funnel. All stages were recovered by flotation on a 60% (w/w) sucrose solution followed by several washes in distilled water to remove contaminating debris, as previously described [13,35]. The free-living stages are not synchronous populations and thus the designations of libraries 1–3, inclusive, refer to the predominant stage present in the preparations [13]; the actual representation of stages in representative samples used in library preparation is as previously described [35], with the only difference being that the L1 preparation used here was held at 25 °C [35].


Parasitic females were recovered from the intestine of sacrificed rats at days 6 and 15 p.i., and cleaned of contaminating debris by centrifugation through a Percoll gradient, as previously described [25,36]. All these preparations of worms were finally pelleted by centrifugation, an equal volume of TRI® reagent (Sigma) added and the samples snap frozen in liquid nitrogen. RNA was extracted following the manufacturer’s protocol. There is no evidence to show what percent of S. ratti mRNA transcripts contain the nematode spliced-leader sequence (SL-1) [37]. Therefore, in order to maximise the representation of mRNA transcripts, two alternative libraries were synthesised for each life-cycle stage: an SL-1-based library with cDNA generated using oligo dT and SL-1 primer sequence [38] and a conventional library where cDNA was directionally cloned using the SMART cDNA library construction system (Clontech Laboratories) with some modifications [39].

2.2. Sequencing, clustering and annotation

Sequencing and EST processing were performed as previously described [40]. From 21,085 attempts, 14,761 (70%) passed the filtering process [41] and the resulting dbEST files were submitted to GenBank (http://www.ncbi.nlm.nih.gov/dbEST/) (Appendix A) between February 2001 and June 2003. Sequence trace files and information about clone requests are available at http://www.nematode.net. Clustering of the 14,701 EST sequences [20] enabled the production of contig consensus sequences; the complete cluster assembly (NemaGene Strongyloides ratti v. 2.0) is available for searching and FTP acquisition at http://www.nematode.net. WU-BLAST sequence comparisons [42] were performed using the 5237 contig sequences grouped into 4152 clusters. Clusters were used as queries against three databases: (i) ‘C. elegans and C. briggsae’, which contains C. elegans protein sequences (http://www.wormbase.org, Wormpep v. 119, 17/02/04), C. elegans mitochondrial nucleotide sequences (http://www.ncbi.nlm.nih.gov/genome/guide/nematode/) and C. briggsae nucleotide sequences (http://www.ncbi.nlm.nih.gov/genome/guide/nematode/); (ii) ‘other nematodes’, which contains all nucleotide sequences of nematode origin (e.g. 11,335 ESTs from S. stercoralis; 5323 from Parastrongyloides trichosuri; 7790 ESTs from the free-living nematode Pristionchus pacificus, etc. (http://www.nematode.net)) but excluding the C. elegans and C. briggsae data and S. ratti ESTs (GenBank nucleotide database and dbEST); (iii) ‘non-nematodes’, an amino acid database of all non-nematode derived sequences. Databases (ii) and (iii) were internally constructed. To determine the representation of each cluster among the databases, each cluster was queried against each database (i–iii) and its presence within that database was recorded if a BLAST alignment with an E-value of <1 × 10^-5 was achieved. In addition, the most significant BLAST alignment for each cluster with respect to the following six categories was determined: S. stercoralis; P. trichosuri; plant parasitic nematodes; animal parasitic nematodes; human parasitic nematodes and free-living nematodes. Fragmentation, defined as the representation of one gene by multiple non-overlapping clusters, was estimated by examining S. ratti clusters with homology to C. elegans. The fragmentation index was calculated from the number of best scoring BLAST alignments against Wormpep v. 119 [20].

2.3. Cluster representation through the S. ratti life-cycle

To investigate the representation of each cluster through the S. ratti life-cycle, the number of ESTs for that cluster in each library was determined. To determine whether there was any significant difference in the representation of a cluster between the parasitic female and free-living phases of the life-cycle, the number of ESTs for each cluster was separated into two groups: (a) those from the L1, L2 and free-living adult/iL3 libraries, combined, and (b) those from the two parasitic female libraries, combined. For clusters for which there was a difference of at least five ESTs between (a) and (b), the number of ESTs in the two groups was compared to the null hypothesis that there was an equal number of ESTs using Fisher’s exact test [43,44].

2.4. Gene ontology

The gene ontology (GO) protein database and GO slim terms were downloaded from http://www.geneontology.org. Each cluster consensus sequence was compared to this database by BLAST (E-value cut-off of <1 × 10^-5) and the resulting GO or GO slim term assigned to each EST of each cluster. To investigate the occurrence of GO terms in the free-living (a, above) and parasitic (b, above) libraries, the number of ESTs in each cluster (and associated GO term) in these libraries was determined. The sum of these values (a + b) for each GO term was used to determine the 20 most highly represented GO terms.
For ESTs that occurred in libraries of (a), above, for each GO term the number of ESTs was expressed as a percentage of the total number of ESTs for (a) encompassed within these 20 GO terms; the data for libraries (b) were treated in the same way. Predicted 95% confidence intervals were assigned to these percentages [45].

Studies with C. elegans have determined genes that have an RNAi embryonic lethal phenotype (e.g. [46], http://www.wormbase.org/db/searches/rnai_search). We wished to determine the occurrence of such phenotypes among C. elegans alignments of the 99 S. ratti clusters common to four libraries (Appendix E) compared to the 2638 (109 + 318 + 160 + 2051; Fig. 3) S. ratti clusters with significant C. elegans BLAST alignment. To do this we mapped the S. ratti clusters to C. elegans genes by BLAST, which were in turn mapped to the C. elegans embryonic lethal phenotype list (http://www.wormbase.org/db/searches/rnai_search). All S. ratti clusters with a significant BLAST alignment to a C. elegans gene were designated as having a C. elegans RNAi


embryonic lethal phenotype or not, which was used to determine the percent occurrence (and predicted 95% CI [45]) of this in these S. ratti cluster groups.
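The confidence intervals attached to percentages in this section can be illustrated with a short calculation. The sketch below assumes a normal-approximation (Wald) interval; this is an assumption made for illustration and is not necessarily the method of [45], and the function name `wald_ci` is hypothetical. The example input is the 56% of 1131 contigs reported later in Section 3.2.

```python
from math import sqrt

def wald_ci(proportion, n, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumed method, not from [45])."""
    half_width = z * sqrt(proportion * (1.0 - proportion) / n)
    # Clamp to the valid range for a proportion.
    return max(0.0, proportion - half_width), min(1.0, proportion + half_width)

# 56% of the 1131 contigs without a BLAST alignment had ORFs shorter than 80 aa.
low, high = wald_ci(0.56, 1131)
print(f"{low:.3f} to {high:.3f}")
```

Rounded to whole percentages, this gives 53 to 59, which agrees with the interval quoted for that group; other intervals in the paper may have been obtained differently.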

3. Results and discussion

3.1. Sequencing and annotation

14,701 S. ratti ESTs were generated, of which approximately half were from the parasitic female libraries (days 6 and 15 p.i.) and half from the three free-living stage libraries, combined (Fig. 1). Approximately 96% of the parasitic female ESTs were from the day 6 p.i. library with only 242 ESTs from the day 15 p.i. library. The number of ESTs from each life-cycle stage and each library is shown in Table 1. To date, this is one of the largest datasets available for an animal-parasitic nematode (e.g. http://www.nematode.net/cgi-bin/web totals.cgi).

Overlapping EST sequences were identified and used to construct 5237 contigs. Each contig therefore contains ESTs representing apparently identical transcripts. This contig building reduced the total number of nucleotides used for further analyses from 6,312,601 to 2,537,787. These contigs were further grouped into 4152 clusters, where each cluster represents apparent splice-variants of a gene, alleles, polymorphisms or highly similar gene-family members, indistinguishable from alleles [40,47]. Contig consensus sequences were used in subsequent analyses and results summarised at the cluster level. The average (±S.D., n) EST length was 427.6 (±148, n = 14,761) bases, but after contig building this increased to 484.6 (±206.7, n = 5237). The largest cluster contained 1822 ESTs. Based on an analysis of C. elegans, it was estimated that the frequency of fragmentation (i.e. the representation of one gene by multiple non-overlapping clusters) was 8.4% and therefore the actual number of genes represented by these clusters is approximately 3803. Assuming that the genome of S. ratti codes for approximately 22,239 proteins, as does C. elegans (http://www.wormbase.org, Wormpep v. 119, 17/02/04), the S. ratti clusters represent between 17 and 19% (3803–4152 clusters) of the products of its genome.
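The gene-number estimate above is simple arithmetic, and a short sketch makes the steps explicit. It assumes that the 8.4% fragmentation frequency is applied as a straight discount on the cluster count, which reproduces the figure of approximately 3803 genes; the function and constant names are illustrative only.

```python
CLUSTERS = 4152             # clusters reported in Section 3.1
FRAGMENTATION = 0.084       # estimated fragmentation frequency (8.4%)
C_ELEGANS_PROTEINS = 22239  # Wormpep v. 119 count, used as a proxy for S. ratti

def estimated_genes(clusters, fragmentation):
    """Discount the cluster count by the fraction assumed to be redundant fragments."""
    return round(clusters * (1.0 - fragmentation))

genes = estimated_genes(CLUSTERS, FRAGMENTATION)
low_fraction = genes / C_ELEGANS_PROTEINS      # lower bound: fragmentation-corrected
high_fraction = CLUSTERS / C_ELEGANS_PROTEINS  # upper bound: every cluster is a gene
print(genes, f"{low_fraction:.1%}", f"{high_fraction:.1%}")
```

The two fractions bracket the range of 17 to 19% quoted in the text.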

The distribution of ESTs between clusters is shown in Fig. 2. Sixty-six percent of clusters are singletons (i.e. they contain just one EST) and the majority (97%) of clusters contain 10 or fewer ESTs. In order to compare the degree of sequence diversity within each of the libraries, whilst taking into account the different sequencing efforts made on each, we used a random re-sampling approach. One thousand ESTs were drawn at random from each of the libraries and the number of singletons in this sub-sample was counted; this random sampling was repeated 1000 times. This showed that the L1 library had a mean of 537 singletons (95% CI 507–565), which was significantly greater than the L2 library (421, 391–450), the free-living adult/iL3 library (354, 335–373) or the parasitic female (day 6 p.i.) library (474, 443–504). This greater EST diversity in the L1 library may reflect the synthesis of many new transcripts required for the initiation of growth and development that begins at the L1 stage, compared with the other stages analysed here. This pattern has not been observed in analogous analyses of S. stercoralis and Trichinella spiralis (M. Mitreva, pers. comm., data not shown).

3.2. BLAST homologies

The cluster consensus sequences were queried against three databases: (i) C. elegans and C. briggsae, (ii) other nematodes (excluding (i)) and (iii) non-nematode (Fig. 3). In total, 74.3% (3087/4152) of the S. ratti clusters show significant alignment with a sequence from some other species; 25.7% (1065/4152) do not. These 1065 clusters are likely to represent transcripts that are novel to, or newly identified in, S. ratti. Alternatively, these 1065 clusters may have insufficient sequence available to allow identification of a significant BLAST alignment. Given that we estimate that we have identified c. 18% of S. ratti’s genes, these 1065 putatively S. ratti-unique genes suggest that c. 4.5% of S. ratti genes may be unique to this species.
Such a surprisingly high level of species-unique genes has also been identified in several other nematode species [33]: elucidating the function of these genes will remain a top priority for parasitologists [40]. However, we note that the mean length (±S.D.) of the 1065 clusters with no significant BLAST alignment is 451 (±201) bases, compared with 581 (±153) for the 3087

Table 1
The number of ESTs produced per life-cycle stage, per library

Stage | Library | Number of ESTs | Total | Percent^a | Number of recombinants^b
L1 | SL-1, pAMP1 | 328, 2322 | 2650 | 18 | 1.0 × 10^4, 9.6 × 10^4
L2 | SL-1, pAMP1 | 382, 2976 | 3358 | 23 | 2.1 × 10^3, 6.1 × 10^4
Free-living adult/iL3 | SL-1, pAMP1 | 318, 1057 | 1375 | 9 | 1 × 10^4, 5.8 × 10^4
Parasitic female (day 6 p.i.) | SL-1^c, SL-1, pAMP1 | 379, 176, 6521 | 7076 | 48 | 1.2 × 10^4, NK^d, 8.7 × 10^4
Parasitic female (day 15 p.i.) | SL-1 | 242 | 242 | 2 | 1.3 × 10^3 to 1.3 × 10^4
Total | | 14701 | 14701 | 100 |

a The number of ESTs per life-cycle stage as a percent of the total number of ESTs.
b Number of recombinants in the un-amplified libraries.
c Two SL-1 libraries were made for parasitic females at day 6 p.i.
d NK: not known.
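As a check on the Percent column of Table 1, the per-stage totals can be converted to whole-number percentages of all ESTs. This is only a sketch of the arithmetic; the dictionary and function names are illustrative, and the numbers are those printed in the table.

```python
# EST totals per life-cycle stage, taken from Table 1.
EST_TOTALS = {
    "L1": 2650,
    "L2": 3358,
    "Free-living adult/iL3": 1375,
    "Parasitic female (day 6 p.i.)": 7076,
    "Parasitic female (day 15 p.i.)": 242,
}

def stage_percentages(totals):
    """Each stage's ESTs as a rounded percentage of all ESTs."""
    grand_total = sum(totals.values())
    return {stage: round(100 * n / grand_total) for stage, n in totals.items()}

percentages = stage_percentages(EST_TOTALS)

# Combined totals for the free-living and parasitic phases.
free_living = sum(n for stage, n in EST_TOTALS.items()
                  if not stage.startswith("Parasitic"))
parasitic = sum(EST_TOTALS.values()) - free_living
print(percentages, free_living, parasitic)
```

The combined totals show the roughly even split between free-living and parasitic ESTs described in Section 3.1.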


Fig. 2. Histogram showing the number of ESTs per cluster against the total number of ESTs per cluster group. For example, there are 17 clusters with 9 EST members (x-axis) giving a total of 153 ESTs (y-axis). Lines indicate the break point between linear and non-linear scale.
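The random re-sampling comparison of library diversity described in Section 3.1 can be sketched as follows. This is a minimal illustration, not the authors' code: sampling without replacement, the percentile-based interval, and counting a singleton as a cluster with exactly one EST in the sub-sample are all assumptions, and the toy library is hypothetical.

```python
import random
from collections import Counter

def count_singletons(cluster_ids):
    """Number of clusters represented by exactly one EST in the sample."""
    counts = Counter(cluster_ids)
    return sum(1 for n in counts.values() if n == 1)

def resample_singletons(library, sample_size=1000, n_iter=1000, seed=0):
    """Mean singleton count and an approximate 95% percentile interval.

    `library` is a list holding one cluster identifier per EST.
    Sampling without replacement and the percentile interval are
    assumptions; the paper does not state either detail.
    """
    rng = random.Random(seed)
    results = sorted(
        count_singletons(rng.sample(library, sample_size))
        for _ in range(n_iter)
    )
    mean = sum(results) / n_iter
    lower = results[(25 * n_iter) // 1000]       # about the 2.5th percentile
    upper = results[(975 * n_iter) // 1000 - 1]  # about the 97.5th percentile
    return mean, lower, upper

# Hypothetical library: 600 single-EST clusters plus 100 clusters of 9 ESTs each.
toy_library = [f"single_{i}" for i in range(600)] + [
    f"multi_{i}" for i in range(100) for _ in range(9)
]
mean, lower, upper = resample_singletons(toy_library, sample_size=200, n_iter=200)
print(mean, lower, upper)
```

Drawing the same number of ESTs from every library is what allows libraries with very different sequencing depths to be compared on equal terms.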

clusters with a BLAST alignment. This shorter average length may suggest that some 3′ and 5′ untranslated regions (UTRs) may be included here; these have previously been shown to have very little alignment between species [48]. Furthermore, analysis of the predicted longest open reading frames (ORF) of the contig sequences showed that in those without a BLAST alignment the mean ORF length is 93.2 amino acids (aa) (±S.D. 59.9, n = 1131), which is shorter than the 147 aa (±64.9, n = 4105) ORF length of those with a BLAST alignment. This average shorter ORF may, at least in part, explain the absence of significant BLAST alignments for these 1131 contigs. However, the distribution of ORF lengths was

Fig. 3. Venn diagram of the 4152 S. ratti clusters, that had a significant BLAST alignment (3087) and their distribution (and in parentheses, expressed as a percentage of all clusters) between the three groups (‘C. elegans and C. briggsae’; ‘other nematodes’ or ‘non-nematodes’) and 1065 clusters that had no significant BLAST alignment.
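The sectors of Fig. 3 follow from the rule in Section 2.2 that a cluster is recorded as present in a database when its best BLAST alignment has an E-value below 1 × 10^-5. The sketch below applies that rule to made-up clusters and then tallies the sector sizes quoted in the text. The database labels, function name and toy E-values are hypothetical, and the assignment of the counts 109, 318 and 160 to particular sectors is inferred from the sums given in Sections 2.4 and 3.2.

```python
from collections import Counter

E_VALUE_CUTOFF = 1e-5  # significance threshold from Section 2.2
DATABASES = ("caenorhabditis", "other_nematodes", "non_nematodes")

def venn_sector(best_evalues):
    """Set of databases in which a cluster has a significant alignment.

    `best_evalues` maps a database name to the cluster's best E-value,
    or to None when no alignment was returned at all.
    """
    return frozenset(
        db for db in DATABASES
        if best_evalues.get(db) is not None and best_evalues[db] < E_VALUE_CUTOFF
    )

# Hypothetical best E-values for three made-up clusters.
toy_clusters = {
    "toy_cluster_1": {"caenorhabditis": 1e-40, "other_nematodes": 1e-55, "non_nematodes": 1e-20},
    "toy_cluster_2": {"caenorhabditis": 0.3, "other_nematodes": 1e-12, "non_nematodes": None},
    "toy_cluster_3": {"caenorhabditis": None, "other_nematodes": 2.0, "non_nematodes": 1e-4},
}
sector_counts = Counter(venn_sector(v) for v in toy_clusters.values())

# Sector sizes quoted in the text for Fig. 3 (sector assignment inferred).
C, O, N = DATABASES
REPORTED = {
    frozenset([C]): 109, frozenset([C, O]): 318, frozenset([O]): 228,
    frozenset([C, N]): 160, frozenset([C, O, N]): 2051,
    frozenset([O, N]): 75, frozenset([N]): 146,
}
with_alignment = sum(REPORTED.values())
caenorhabditis_total = sum(n for s, n in REPORTED.items() if C in s)
other_nematode_total = sum(n for s, n in REPORTED.items() if O in s)
nematode_specific = sum(n for s, n in REPORTED.items() if N not in s)
print(with_alignment, caenorhabditis_total, other_nematode_total, nematode_specific)
```

An empty set from `venn_sector` corresponds to a cluster with no significant alignment, as for the third toy cluster, whose only hit falls above the cut-off.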


bimodal in both cases: of the contigs without a significant BLAST alignment, 56% (95% CI 53–59) had ORFs of less than 80 aa, whereas of the contigs with a significant BLAST alignment, 19% (CI 16–21) had ORFs of less than 80 aa (Appendix B).

Of the 4152 clusters, 64.4% had significant alignment to sequences from nematode species other than C. elegans and C. briggsae; similarly 63.5% had significant alignment with a C. elegans or a C. briggsae sequence. There was substantial overlap (2369 clusters) between these groups (Fig. 3). This differs from a previous analysis of S. stercoralis ESTs in which there were more significant BLAST alignments to C. elegans sequences than to those of ‘other nematodes’ [20]. It is likely that this difference is due to the addition of two closely related Clade IVA nematodes, S. stercoralis and P. trichosuri, to the ‘other nematode’ group since the previous analysis [20].

228 (5.5%) of the S. ratti clusters are unique to nematodes other than C. elegans and C. briggsae and these may represent genes that are important and specific to a parasitic life-style. The distribution of these significant BLAST alignments (Table 2) shows that the majority of these S. ratti clusters correspond to other Clade IVA parasitic nematodes present in the database; 57% to S. stercoralis and 20% to P. trichosuri. The remainder of the alignments are to plant- (13%), animal- (6.5%) and human- (2%) parasitic nematodes and to free-living species other than C. elegans and C. briggsae (1.5%). This frequency distribution of these significant alignments is consistent with the proposed nematode molecular phylogeny [16,31]. An extension of this group (i.e. the 228 unique to nematodes other than C. elegans and C. briggsae) to include the 75 clusters that are also shared by non-nematodes, increases the representation of matches to plant-

Table 2
The percentage distribution of clusters with significant BLAST alignments between other nematode species and groups

Species/group | Clades^a | Other nematodes only (n = 228)^b | Other nematodes (n = 228) and non-nematodes (n = 75)^b
S. stercoralis | IVA | 57 | 36
P. trichosuri | IVA | 20 | 13
Plant parasites^c | IVB | 13 | 25
Animal parasites^d | I, III, V | 6.5 | 14
Human parasites^e | III, V | 2 | 9
Free-living nematodes^f | IVB, V | 1.5 | 3

a After [31].
b From Fig. 3.
c Parasites of plants: Globodera pallida; Heterodera glycines; Meloidogyne arenaria; M. artellia; M. chitwoodi; M. hapla; M. incognita; M. javanica; M. paranaensis; Pratylenchus penetrans.
d Parasites of animals: Ancylostoma caninum; Ascaris suum; Dirofilaria immitis; Litomosoides sigmodontis; Ostertagia ostertagi; Parelaphostrongylus tenuis; Teladorsagia circumcincta; Toxocara canis; Trichinella spiralis.
e Parasites of humans: Brugia malayi; Necator americanus; Onchocerca volvulus.
f Free-living nematodes: Pristionchus pacificus; Zeldia punctata.

and animal-parasitic nematodes, other than S. stercoralis and P. trichosuri (Table 2).

146 (3.5%) of the S. ratti clusters do not match any other nematode sequences and may represent the first identification of these genes in nematodes (Fig. 3). 655 (109 + 318 + 228) (15.7%) of clusters appear to be nematode specific. A priori, these may represent genes that could be anti-nematode therapeutic targets. However, this suggestion must be tempered with the caveat that these assignments (Fig. 3) are based on significant BLAST alignments within each database used here. Thus, it is possible that future comparison of the complete S. ratti gene sequences may reveal significant BLAST alignments and, possibly, homologues for some of these genes among sequences in the non-nematode sector (Fig. 3). Notwithstanding this, these genes and their products should be a priority for further investigation for their potential as targets for nematode control.

Of the 49.4% of clusters that were common to all databases, approximately 59% matched a generic GO slim term, distributed between 31 GO slim terms (http://geneontology.org/) (Appendix C). Approximately 30% of these clusters matched ‘unknown’ or ‘other’ processes or functions. The remainder of the clusters matched a broad range of terms including transcription, protein biosynthesis and signal transduction and there are also annotations that encompass growth, development and reproduction. This is consistent with these genes having a core, basic biological role in eukaryotic and/or animal life. The 41% of the 2051 clusters that do not match a GO slim term include clusters that have significant BLAST alignment to as yet un-annotated database entries or hypothetical proteins, such that the identity or function of the gene is not known. GO slim analysis of other clusters in the Venn diagram (Fig. 3) showed relatively few matches (Appendix D).
As above, this is likely to be because many of these clusters have significant BLAST alignments to un-annotated database entries or hypothetical proteins.

3.3. Large clusters

The S. ratti clusters were ranked by their number of EST members and the top 30 clusters, which represent 38% of the total ESTs obtained, were investigated in detail (Table 3). A number of clusters have significant alignment to known proteins, the majority of which are house-keeping genes, such as elongation factors, ribosomal proteins and actin. Most of these genes or their protein products have been identified in a number of other nematode species, and some have been characterised in some detail (see below). The largest cluster (SR00007) has greatest similarity to a C. elegans hypothetical protein; in fact, 10 of these 30 clusters have significant alignments to hypothetical proteins (eight in C. elegans, one in Homo sapiens and one in Plasmodium falciparum). The high level of representation of these 30 clusters in the various libraries attests to their likely biological significance in the life of S. ratti [20,49–51].

Table 3 The 30 largest S. ratti clusters and their representation among the four S. ratti libraries

Ce: C. elegans; Sr: S. ratti; Pt: Parastrongyloides trichosuri. a Number of ESTs and number of contigs per cluster, respectively. b The most significant BLAST alignment for each cluster is given with its GenBank accession number and its E-value. c The number (and as a percentage) of ESTs of each cluster in the representative libraries. d Data for the two parasitic female libraries are combined. e The probability that the number of ESTs in the parasitic female libraries (combined) and in the three free-living libraries (combined) is significantly different (see Section 2 for details). Shaded boxes indicate when the number of ESTs is greater in the parasitic female libraries (combined) than the free-living libraries (combined); un-shaded vice versa. f NS: no significant difference.

A priori, it is possible that a number of genes are differentially expressed between the different life-cycle stages of S. ratti, reflecting the very different niches and life-histories of these stages in the life-cycle. To investigate this, for these 30 most abundantly represented clusters, we determined whether they were significantly differentially represented (measured as the number of ESTs per library) in the free-living (combined) or the parasitic female (combined) libraries (Table 3). Twenty-six of the 30 clusters had a significantly different representation between the free-living and parasitic stage libraries: 14 are represented significantly more in the parasitic female libraries and 12 significantly more in the free-living stage libraries.

Of these 14 parasitic female library-enriched clusters, one cluster (SR00007) represented 25% of the ESTs sequenced from that library. The most significant alignment for this cluster is a C. elegans hypothetical protein (Table 4). Indeed, six of these 14 clusters have greatest alignment to C. elegans hypothetical proteins (Table 4). This relatively large number of such clusters is notable. Their occurrence may reflect both the paucity of EST data from parasitic stages of related species and/or the substantial differences between different clades and species of nematodes [33]. Of the remaining eight parasitic female library-enriched clusters, two (SR00449, SR01064) have significant alignments to hypothetical proteins from other species (Table 3). Two clusters (SR01048, SR00068) have no significant alignment and therefore may represent novel genes. The remaining four clusters have significant alignment to heat-shock proteins (SR00083, SR00984, SR01014) and a kinase (SR00015). Stage-specific expression of small heat shock proteins occurs in a number of other systems; these proteins are also thought to be constitutively involved in many cellular processes [52–55]. Observations of the expression of heat shock proteins throughout development have led to the hypothesis that they may be used for monitoring changes in an organism’s environment [56,57]. This is likely to be particularly important for parasitic nematodes as they move between free-living and parasitic phases or between different host species [58].

Twelve clusters are expressed at a significantly higher level in the free-living stage libraries compared with the parasitic female libraries. The majority (11/12) of these clusters have significant alignments to known proteins such as ribosomal proteins (SR00026, SR00393, SR01062, SR01068), elongation factors (SR00009, SR00615) and actin (SR00113). Two of these clusters have significant alignment to collagens (SR00008, SR00012), many of which have stage-specific expression in other organisms. For example, C. elegans col-12 is expressed after each larval moult when new cuticle is being secreted and deposited, and after the L4 to adult larval moult [59]. SR00013 has significant alignment to an aspartic protease precursor; C. elegans asp-1 is highly expressed in late embryonic and early larval stages [60]. Excretory/secretory (ES) products are often expressed in a stage-specific manner and have been shown to have many biologically important functions in nematode development [61]. The Ostertagia ostertagi F7 ES product (SR00605), which is a member of the fatty acid and retinol binding proteins (FARs), is expressed in the L4 stage [61]. FARs are nematode-specific and are thought to be involved in complex host–parasite interactions [62,63]. Only one of these 12 clusters (SR01067) has significant alignment to a hypothetical protein (Table 4 and above).

A number of recent studies have shown fluctuations in the level of ribosomal protein gene expression during growth and development [64–66]. For example, in S. stercoralis, ESTs of ribosomal genes were significantly more common among ESTs of L1s than among those from iL3s [20]. Evidence suggests that protein synthesis can be controlled indirectly, via changes in the number and/or function of ribosomes, which are in turn controlled or affected by the expression of ribosomal proteins [65]. Alternatively, it has been suggested that some ribosomal proteins are involved in a variety of biological functions independent of their role in the ribosome [67,68]. Therefore, the differences that we have observed in the expression level of several ribosomal protein genes during the life-cycle of S. ratti may reflect their role in controlling levels of protein synthesis and/or separate, stage-specific functions that remain to be identified.

Only four of these 30 clusters have no significant difference in their stage-specific expression profiles (Table 3). These include a ribosomal protein (SR00719), an LC3 gabarap protein (SR00369), a Riken protein (SR00041) and a hypothetical protein (SR01070). These four genes will provide good candidates for constitutively expressed controls in future expression studies [35].

Table 4 C. elegans hypothetical proteins with significant alignment to large S. ratti clusters (Table 3)

Cluster (a) | C. elegans (b) | C. elegans expression data (cDNAs) (c) | Amino acid sequence similarity (d) | RNAi phenotype (e) | Microarray topology map position (f)
SR00007 | F52E1.5 | Partially confirmed | Trypanosoma cruzi MUC.Y-1 protein; TR:P90601 | WT (g) | 8
SR00035 | E01A2.5 | Predicted | Pfam domain PF01902 (ATP-binding region) | ND (h) | ND
SR00976 | ZK829.4 | Confirmed | Pfam domains PF02812 and PF00208 (Glu/Leu/Phe/Val dehydrogenase, dimerisation domain) | Slow growth; unclassified | ND
SR01000 | B0244.8 | Partially confirmed | Pfam domain PF00057 (low-density lipoprotein receptor domain class A) | Embryonic lethal; sterile; sterile progeny; WT | 7
SR00034 | T05E11.3 | Confirmed | Pfam domains PF00183 (Hsp90 protein), PF02518 (histidine kinase-, DNA gyrase B-, and Hsp90-like ATPase) | Slow growth; larval arrest; uncoordinated; WT | ND
SR01070 | F42A10.1 | Confirmed | Pfam domain PF00005 (ABC transporter) | WT | 20
SR00066 | M04G12.2 | Confirmed | C. elegans CPZ-2 protein; Pfam domain PF00112 (Papain family cysteine protease) | WT | 31
SR01067 | Y47D7A.15 | Partially confirmed | Xenopus laevis Homer2-prov protein; TR:Q7ZMP9 | ND | ND

a S. ratti cluster name.
b C. elegans gene name, http://www.wormbase.org, Wormpep v. 119, 17/02/04.
c Confirmed, every base of every exon has transcript support; partially confirmed, partial transcript support; predicted, no transcript support.
d UniProt reference http://www.ebi.uniprot.org or the protein family database (Pfam) reference http://www.sanger.ac.uk/Software/Pfam/.
e Many data sources used; see http://www.wormbase.org for original references.
f http://www.wormbase.org, after [69].
g WT: wild-type.
h ND: no data.

In an attempt to gain further information about the possible role of these large S. ratti clusters that have greatest alignment with a hypothetical protein, we collated information about the C. elegans hypothetical proteins to which they align (Table 4). Three of the C. elegans genes have RNAi knock-down phenotypes, which include embryonic lethality, sterility, larval arrest and slow growth, whereas three maintain a wild-type phenotype following RNAi (Table 4). Analysis of the predicted amino acid sequence of these hypothetical proteins shows the existence of a number of identifiable domains. However, these do not suggest any particular protein function. In the three-dimensional map of C. elegans gene expression [69], four of the C. elegans hypothetical proteins shown in Table 4 have an assigned position (http://www.wormbase.org): two (B0244.8 (SR01000), F42A10.1 (SR01070)) are found in so-called mountains of expression that are enriched for germline-expressed genes; one (F52E1.5 (SR00007)) is in a mountain with intestinal association; and one (M04G12.2 (SR00066)) is in a mountain with no functional assignment.
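The comparison of EST counts between the combined parasitic female libraries and the combined free-living libraries (Table 3) can be illustrated with a short sketch. The test the authors actually used is described in Section 2 of the paper and is not reproduced here, so Fisher's exact test is substituted as a stand-in. The function name and all EST counts in the example are hypothetical and are not taken from Table 3.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Illustrative layout (an assumption, not the paper's method):
      a = ESTs of one cluster in the parasitic female libraries
      b = all other ESTs in the parasitic female libraries
      c = ESTs of the same cluster in the free-living libraries
      d = all other ESTs in the free-living libraries
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x cluster ESTs in row 1,
        # with all table margins held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table that is no more likely than
    # the observed one; the small factor absorbs floating-point noise.
    total = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_obs * (1 + 1e-9)
    )
    return min(1.0, total)


# Hypothetical counts: 120 of 5000 parasitic ESTs versus 30 of 9000
# free-living ESTs belong to the cluster of interest.
p_value = fisher_exact_two_sided(120, 5000 - 120, 30, 9000 - 30)
print(f"two-sided p = {p_value:.3g}")
```

When 30 clusters are tested in this way, a correction for multiple comparisons would normally be considered as well.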


3.4. Cluster representation through the S. ratti life-cycle

The distribution of the clusters (measured as their EST representation) in the four different cDNA libraries is shown in Fig. 4. Eighty-one percent (3374/4152) of the clusters are present in only one library and 12% (492/4152) of clusters are present in any two libraries. Thus, there is significant stage-specificity of expression in the S. ratti life-cycle as measured by EST representation. However, it is likely that with further EST sequencing some of the stage-specificity that we have observed may be lost and may instead prove to be stage-biased expression. In S. stercoralis and T. spiralis more than 80% of clusters were stage-specific [20,39]. Recent analyses of partial genomes, predominantly constructed from EST data, have shown substantial inter-specific differences among nematodes [33]. The high level of stage-specificity that we have observed with S. ratti, and that has been observed elsewhere [20,33,39], cautions that, in such inter-specific comparisons, the life-cycle stages that are the source of the data need to be considered carefully.

Only three clusters are present in all S. ratti libraries. These have significant alignments to ribosomal proteins (SR00026, SR01062) and to an ATP synthase (SR00705) (Appendix E). The putative identities of these genes are consistent with a housekeeping role and thus also consistent with expression in all life-cycle stages. Following exclusion of the day 15 p.i. parasitic female library data we were able to identify 99 clusters (2.4% of the total 4152) that are shared between the day 6 p.i. parasitic female library and all of the free-living libraries (Appendix E). We hypothesised that these 99 clusters are likely to represent genes whose products function in core biological roles. Supporting evidence for this hypothesis was found by comparing the RNAi embryonic lethal phenotypes of the C. elegans genes aligning to these 99 clusters with those of genes aligning to clusters not common to all libraries. Of the C. elegans genes that have significant alignments to these 99 S. ratti clusters, 53% (95% CI 42–63) have an embryonic lethal RNAi phenotype, whereas only 27% (95% CI 24–29) of those aligning to the other clusters, not shared between the four libraries, do. Furthermore, 83% of the 99 S. ratti library-common clusters match a GO term (data not shown) whereas only 44.3% of the other clusters have a match (Fig. 3 and Appendix D). This is consistent with these 99 clusters, common to all S. ratti libraries, having a core biological role.

It is notable that in excess of 100 clusters are shared between the L1 and L2 libraries (132), the L1 and parasitic female libraries (127) and the L2 and parasitic female libraries (124), whereas substantially fewer are shared between the free-living adult and L1 libraries (31), the free-living adult and L2 libraries (25) and the free-living adult and parasitic female libraries (53) (Fig. 4). Notwithstanding the different biology of the L1 and L2 stages (i.e. free-living) compared with the parasitic females, the comparatively high number of shared clusters may reflect the fact that these larval stages are the progeny of the parasitic females.
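The proportions and 95% confidence intervals quoted in this section can be sketched as follows. The text does not state which interval method was used, nor the number of C. elegans genes behind the 53% and 27% figures, so the Wilson score interval is shown purely as one common choice, and the counts passed to it in the example are hypothetical. Only the cluster counts 3374, 492 and 4152 come from the text.

```python
from math import sqrt


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z = 1.96 gives an approximate 95% interval. This is an assumed
    method; the paper's own interval calculation is not given here.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Figures stated in the text: clusters found in one library or in two.
print(f"one library:   {percent(3374, 4152):.1f}%")
print(f"two libraries: {percent(492, 4152):.1f}%")

# Hypothetical example: 50 embryonic-lethal genes out of 100 examined.
low, high = wilson_interval(50, 100)
print(f"95% CI: {100 * low:.0f}-{100 * high:.0f}%")
```

Intervals that do not overlap, such as 42–63 and 24–29, suggest that the two proportions differ, although a direct test of the difference is the more rigorous check.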


Fig. 4. The distribution of the 4152 clusters among four S. ratti cDNA libraries. The number of clusters (and in parentheses, expressed as a percentage of all clusters) specific to one library is shown in the large circles and those shared between any two libraries are shown in the small circles, with the position of those circles indicating the libraries between which those clusters are shared. The number of ESTs that were sequenced from each library (and in parentheses, expressed as a percentage of the 14,761 ESTs) is indicated in the square boxes. The data for the two parasitic female libraries were combined. A complete list of the distribution of all clusters in all library combinations is shown in Appendix E.

Table 5 The 20 most highly represented gene ontology terms

Shaded box indicates GO terms with a significantly different representation between the parasitic (combined) and the free-living stage libraries (combined). a GO reference number, http://www.geneontology.org. b GO term, http://www.geneontology.org. c Percent (95% CI) representation of this GO term in the parasitic libraries (combined). d Percent (95% CI) representation of this GO term in the free-living stage libraries (combined).


3.5. Gene ontology

The 20 most highly represented GO terms among the free-living libraries (combined) and the parasitic libraries (combined) are shown in Table 5. Because the gene ontology terms are non-hierarchical, these groupings are not mutually exclusive. The GO terms that are more highly represented in the free-living stage libraries include processes involved with growth and development, such as protein synthesis and cuticle synthesis. This is consistent with growth of early larval stages and reproduction of the free-living adults. The GO terms that are more highly represented in the parasitic females are those involved with the cell-cycle, including kinase activity, amino acid phosphorylation, heat shock proteins, glutamate catabolism and glutamate dehydrogenase. The representation of glutamate catabolism and glutamate dehydrogenase suggests the occurrence of substantial excretion of nitrogen from amino acid sources. This is consistent with the protein-rich diet of the parasitic females, which one can envisage to be the result of their feeding on host gut tissue and intestinal contents. GO terms representing basic core biological processes (ribosomes and protein biosynthesis; embryogenesis; basic physiology) were represented approximately equally in the free-living and parasitic female stages.
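The remark that the GO groupings are not mutually exclusive can be made concrete with a small sketch. Each cluster may carry several GO terms, so the percentage representations of the individual terms can add up to more than 100%. The cluster names, term assignments and function name below are invented for illustration and are not data from Table 5.

```python
from collections import Counter


def go_term_representation(cluster_terms):
    """Percentage of clusters annotated with each GO term.

    cluster_terms maps a cluster name to the set of GO terms assigned
    to it. A cluster with several terms is counted once per term, so
    the returned percentages need not sum to 100.
    """
    n_clusters = len(cluster_terms)
    counts = Counter(
        term for terms in cluster_terms.values() for term in set(terms)
    )
    return {term: 100.0 * n / n_clusters for term, n in counts.items()}


# Hypothetical annotations for four clusters from one library.
example = {
    "cluster_A": {"protein biosynthesis", "structural constituent of ribosome"},
    "cluster_B": {"protein biosynthesis"},
    "cluster_C": {"kinase activity", "protein amino acid phosphorylation"},
    "cluster_D": set(),  # no GO term assigned
}

representation = go_term_representation(example)
for term, pct in sorted(representation.items(), key=lambda kv: -kv[1]):
    print(f"{pct:5.1f}%  {term}")
```

The same tally, run separately on the combined parasitic and combined free-living libraries, gives the per-term percentages that a table such as Table 5 compares.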

4. Conclusions

This EST analysis of S. ratti has discovered c. 20% of its genes, of which approximately 25% have no apparent database alignments and thus may be newly discovered genes. We have identified groups of genes that may be specific to nematodes or to parasitic nematodes. These genes and their products may be targets for vaccination or chemotherapeutic approaches to nematode control. We have observed substantial levels of stage-specific expression, with the greatest diversity of genes expressed in the L1 stage. Many abundantly expressed genes are expressed at different levels between the parasitic and free-living stages. Among the abundant parasitic female-specific genes are many genes that are as yet uncharacterised. One gene from the parasitic stage represents 25% of all ESTs obtained from this stage.

Acknowledgments

We would like to thank Brandi Chiapelli, Louise Hughes, Claire Murphy, Deana Pape and Clare Wilkes for substantial technical support; Mike Dante and Todd Wylie for bioinformatic assistance and Mike Gardner for Fig. 1. This work was funded by an MRC grant awarded to MEV and an NIH-NIAID research grant, AI46593, awarded to Robert H. Waterston. MEV is also supported by BBSRC, NERC and The Wellcome Trust; GB is supported in part by the BBSRC; JPM was supported by a Helen Hay Whitney/Merck Fellowship.

Appendix A

GenBank accession numbers for S. ratti ESTs, ‘−’ indicates a range of numbers

BG301344–BG301979 BG893439–BG894299 BI073273–BI074239 BI142476–BI142713 BI322935–BI324322 BI397003–BI97381 BI450388–BI451029 BI502070–BI502585 BI703938–BI704069 BI741797–BI742589 BM879002–BM880232 BQ090556–BQ091407 BQ479120–BQ479466 BU582622–BU582653 CB097387–CB098637 CB274883–CB274989 CB274992–CB274994 CB274997–CB275003 CB275006–CB275007 CB275009–CB275011 CB275013–CB275014 CB275016–CB275018 CB275020 CB275022–CB275025 CB275029–CB275034 CB275037–CB275038 CB275041–CB275043 CB275045–CB275051 CB275054 CB275057 CB275059–CB275061 CB275063–CB275064 CB275066–CB275074 CB275076–CB275079 CB275082–CB275083 CB275085–CB275086 CB275088 CB275091–CB275099 CB275102–CB275104 CB275106–CB275109 CB275111–CB275112 CB275114 CB275116–CB275131 CB275136 CB275138–CB275140 CB275142–CB275143 CB275145 CB275147–CB275152 CB275154–CB275156 CB275158–CB275160 CB275162–CB275165 CB275167 CB275170–CB275175 CB275177–CB275179 CB275181–CB275182 CB275184 CB275186–CB275244 CB275246–CB275247 CB275249–CB275255 CB275257–CB275260 CB275262–CB275282 CB277333–CB277417 CD420523–CD421737 CD523474–CD524090 CD524092–CD526320

Appendix B

The frequency distribution of the ORF length (aa) of contigs with (●) and without (+) significant BLAST alignment (Fig. 3).


Appendix C

GO slim terms for the 2051 clusters that had a significant match in all three databases (Fig. 3)

Term | Number of entries | Percent of total
Other categories | 154 | 12.81
Molecular function unknown | 113 | 9.40
Biological process unknown | 90 | 7.49
Protein biosynthesis | 75 | 6.24
Reproduction | 58 | 4.83
Signal transduction | 53 | 4.40
Protein metabolism | 46 | 3.83
Protein binding | 45 | 3.74
RNA binding | 41 | 3.41
Nucleobase, nucleoside, nucleotide and nucleic acid metabolism | 40 | 3.33
Transport | 38 | 3.16
Physiological processes | 37 | 3.08
Metabolism | 33 | 2.75
Growth | 33 | 2.75
Development | 31 | 2.58
Cell cycle | 30 | 2.49
DNA binding | 29 | 2.41
Energy pathways | 28 | 2.33
Organelle organisation and biogenesis | 28 | 2.33
Transcription factor activity | 25 | 2.08
Cell proliferation | 19 | 1.58
Transporter activity | 18 | 1.50
Electron transporter activity | 17 | 1.41
Electron transport | 17 | 1.41
Carbohydrate metabolism | 16 | 1.33
Lipid metabolism | 16 | 1.33
Actin binding | 15 | 1.25
Transcription | 15 | 1.25
Response to stress | 15 | 1.25
Calcium ion binding | 14 | 1.17
Chaperone activity | 13 | 1.08
Total | 1202 | 100

Appendix D

For clusters in each Venn diagram sector (Fig. 3), the number of clusters that have a GO slim term (and as a percent of the clusters in the relevant sector) and the number of GO slim terms matched in the respective sector

Venn diagram sectors (number of clusters/sector) | Number of entries matched (%) | Number of GO slim terms
2051 | 1202 (59.0) | 31
318 | 45 (14.0) | 18
228 | 8 (3.5) | 6
75 | 22 (29.3) | 13
146 | 6 (4.1) | 6
160 | 62 (38.8) | 34
109 | 21 (19.2) | 16
3087 (total) | 1366 (44.3) | 31
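The totals in Appendices C and D can be cross-checked against figures quoted in the main text. The sketch below uses only numbers printed in the appendices and the 4152-cluster total; the variable names are illustrative. It confirms that the seven Venn sectors sum to 3087 clusters, that the matched entries sum to 1366, and that the clusters outside the Venn diagram correspond to the roughly 25% with no significant alignment.

```python
# (clusters in sector, clusters matching a GO slim term), from Appendix D.
SECTORS = [
    (2051, 1202),
    (318, 45),
    (228, 8),
    (75, 22),
    (146, 6),
    (160, 62),
    (109, 21),
]

# Per-term entry counts from Appendix C, in the order listed there.
APPENDIX_C_COUNTS = [
    154, 113, 90, 75, 58, 53, 46, 45, 41, 40, 38, 37, 33, 33, 31, 30,
    29, 28, 28, 25, 19, 18, 17, 17, 16, 16, 15, 15, 15, 14, 13,
]

ALL_CLUSTERS = 4152  # total number of S. ratti clusters

clusters_with_alignment = sum(n for n, _ in SECTORS)
clusters_with_go_slim = sum(m for _, m in SECTORS)
clusters_without_alignment = ALL_CLUSTERS - clusters_with_alignment

pct_go_slim = 100.0 * clusters_with_go_slim / clusters_with_alignment
pct_no_alignment = 100.0 * clusters_without_alignment / ALL_CLUSTERS

print(f"clusters with an alignment: {clusters_with_alignment}")
print(f"with a GO slim term:        {clusters_with_go_slim} ({pct_go_slim:.1f}%)")
print(f"no significant alignment:   {clusters_without_alignment} ({pct_no_alignment:.1f}%)")
print(f"Appendix C entries:         {sum(APPENDIX_C_COUNTS)}")
```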

Appendix E

The number of clusters in each library, or shared between different libraries. 1, L1 library; 2, L2 library; 3, free-living adult/iL3 library; 4, parasitic female library day 6 p.i.; 5, parasitic female library day 15 p.i. ‘—’ indicates a combination

Stage combinations | Number of clusters
1 | 802
2 | 696
3 | 275
4 | 1379
5 | 211
1—2 | 132
1—3 | 31
1—4 | 126
1—5 | 1
2—3 | 25
2—4 | 124
2—5 | 0
3—4 | 53
3—5 | 0
4—5 | 11
1—2—3 | 28
1—3—4 | 28
1—4—5 | 0
1—2—4 | 100
1—2—5 | 0
2—3—4 | 23
2—3—5 | 0
2—4—5 | 0
1—3—5 | 0
3—4—5 | 1
1—2—3—4 | 99
1—2—3—5 | 0
2—3—4—5 | 0
1—2—4—5 | 2
1—3—4—5 | 2
1—2—3—4—5 | 3
Total | 4152
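The library-level figures quoted in Section 3.4 can be recovered from Appendix E by merging the two parasitic female libraries (4 and 5) into one, as was done for Fig. 4. The sketch below is one way to do that. The counts are those listed in Appendix E, while the short library labels and the function name are illustrative choices.

```python
from collections import Counter

# Appendix E: library combination -> number of clusters.
APPENDIX_E = {
    (1,): 802, (2,): 696, (3,): 275, (4,): 1379, (5,): 211,
    (1, 2): 132, (1, 3): 31, (1, 4): 126, (1, 5): 1, (2, 3): 25,
    (2, 4): 124, (2, 5): 0, (3, 4): 53, (3, 5): 0, (4, 5): 11,
    (1, 2, 3): 28, (1, 3, 4): 28, (1, 4, 5): 0, (1, 2, 4): 100,
    (1, 2, 5): 0, (2, 3, 4): 23, (2, 3, 5): 0, (2, 4, 5): 0,
    (1, 3, 5): 0, (3, 4, 5): 1,
    (1, 2, 3, 4): 99, (1, 2, 3, 5): 0, (2, 3, 4, 5): 0,
    (1, 2, 4, 5): 2, (1, 3, 4, 5): 2, (1, 2, 3, 4, 5): 3,
}

# Libraries 4 and 5 are both parasitic female (PF) libraries.
LABELS = {1: "L1", 2: "L2", 3: "FL adult", 4: "PF", 5: "PF"}


def merge_parasitic_libraries(table):
    """Re-key each combination by its set of merged library labels."""
    merged = Counter()
    for combination, n_clusters in table.items():
        merged[frozenset(LABELS[i] for i in combination)] += n_clusters
    return merged


merged = merge_parasitic_libraries(APPENDIX_E)
total_clusters = sum(APPENDIX_E.values())
single_library = sum(n for libs, n in merged.items() if len(libs) == 1)

print(f"total clusters:           {total_clusters}")
print(f"in only one library:      {single_library}")
print(f"shared by L1 and L2 only: {merged[frozenset({'L1', 'L2'})]}")
print(f"shared by L1 and PF only: {merged[frozenset({'L1', 'PF'})]}")
print(f"shared by L2 and PF only: {merged[frozenset({'L2', 'PF'})]}")
```

Pairwise values that involve the parasitic female libraries depend on how three-way combinations such as 3—4—5 are treated once libraries 4 and 5 are merged.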

References

[1] Norhayati M, Fatmah MS, Yusof S, Edariah AB. Intestinal parasitic infections in man: a review. Med J Malays 2003;58:296–305.
[2] Speare R. Identification of species of Strongyloides. In: Grove DI, editor. Strongyloidiasis: a major roundworm infection of man. London: Taylor & Francis; 1989. p. 11–85.
[3] Ashford RW, Barnish G. Strongyloides fuelleborni and similar parasites in animals and man. In: Grove DI, editor. Strongyloidiasis: a major roundworm infection of man. London: Taylor & Francis; 1989. p. 271–81.
[4] Crompton DWT. Human helminthic populations. In: Pawlowski ZS, editor. Bailliere’s clinical tropical medicine and communicable diseases. London: Academic Press; 1987. p. 489–510.
[5] Sandground JH. Speciation and specificity in the nematode genus Strongyloides. J Parasitol 1925;12:59–81.
[6] Brumpt E. Precis de parasitologie. 6th ed. Paris: Masson et Cie; 1934. p. 1042.
[7] Dawkins HJS. Strongyloides ratti infections in rodents: value and limitations as a model for human strongyloidiasis. In: Grove DI, editor. Strongyloidiasis: a major roundworm infection of man. London: Taylor & Francis; 1989. p. 287–333.
[8] Tindall NR, Wilson PAG. Criteria for a proof of migration routes of immature parasites inside hosts exemplified by studies of Strongyloides ratti in the rat. Parasitology 1988;96:551–63.
[9] Viney ME. A genetic analysis of reproduction in Strongyloides ratti. Parasitology 1994;109:511–5.
[10] Harvey SC, Viney ME. Sex determination in the parasitic nematode Strongyloides ratti. Genetics 2001;158:1527–33.
[11] Viney ME, Matthews BE, Walliker D. Mating in the nematode parasite Strongyloides ratti: proof of genetic exchange. Proc Roy Soc Lond B 1993;254:213–9.
[12] Harvey SC, Gemmill AW, Read AF, Viney ME. The control of morph development in the parasitic nematode Strongyloides ratti. Proc Roy Soc Lond B 2000;267:2057–63.
[13] Viney ME. Developmental switching in the parasitic nematode Strongyloides ratti. Proc Roy Soc Lond B 1996;263:201–8.
[14] Gemmill AW, Viney ME, Read AF. The evolutionary ecology of host-specificity: experimental studies with Strongyloides ratti. Parasitology 2000;120:429–37.
[15] Morgan DO. Parastrongyloides winchesi gen. et sp. nov. A remarkable new nematode parasite of the mole and shrew. J Helminthol 1928;6:79–86.
[16] Dorris M, Viney ME, Blaxter ML. Molecular phylogenetic analysis of the genus Strongyloides and related nematodes. Int J Parasitol 2002;32:1507–17.
[17] Hotez PJ, Hawdon JM, Schad GA. Hookworm larval infectivity, arrest and amphiparatenesis: the Caenorhabditis elegans daf-c paradigm. Trends Parasitol 1993;9:23–6.
[18] Riddle DL, Albert PS. Genetic and environmental regulation of dauer larva development. In: Riddle DL, Blumenthal T, Meyer BJ, Priess JR, editors. C. elegans II. New York: Cold Spring Harbor Laboratory Press; 1997. p. 739–68.
[19] Viney ME. Environmental control of nematode life-cycles. In: Lewis E, Campbell JF, Sukhdeo M, editors. Behavioural ecology of parasites. Oxford: CAB International; 2002. p. 111–28.
[20] Mitreva MD, McCarter JP, Martin J, et al. Comparative genomics of gene expression in the parasitic and free-living nematodes Strongyloides stercoralis and Caenorhabditis elegans. Genome Res 2004;14:209–20.
[21] Wang J, Kim SK. Global analysis of dauer gene expression in Caenorhabditis elegans. Development 2003;130:1621–34.
[22] Moqbel RD, McLaren DJ. Strongyloides ratti: structural and functional characteristics of normal and immune-damaged worms. Exp Parasitol 1980;49:139–52.
[23] Moqbel RD, McLaren DJ, Wakelin D. Strongyloides ratti: reversibility of immune damage to adult worms. Exp Parasitol 1980;49:153–66.
[24] Kimura E, Shintoku Y, Kadosaka T, Fujiwara M, Kondo S, Itoh M. A second peak of egg excretion in Strongyloides ratti-infected rats: its origin and biological meaning. Parasitology 1999;119:221–6.
[25] Wilkes CP, Thompson FJ, Gardner MP, Paterson S, Viney ME. The effect of the host immune response on the parasitic nematode Strongyloides ratti. Parasitology 2004;128:661–9.
[26] Paterson S, Viney ME. Host immune responses are necessary for density-dependence in nematode infections. Parasitology 2002;125:1–8.
[27] Stear MJ, Bairden K, Duncan JL, et al. How hosts control worms. Nature 1997;389:27.
[28] Viney ME. How do host immune responses affect nematode infections? Trends Parasitol 2002;18:63–6.
[29] C. elegans Sequencing Consortium. Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 1998;282:2012–8.
[30] Stein LD, Bao Z, Blasiar D, et al. The genome sequence of Caenorhabditis briggsae: a platform for comparative genomics. PLoS Biol 2003;1:166–92.
[31] Blaxter ML, De Ley P, Garey JR, et al. A molecular evolutionary framework for the phylum Nematoda. Nature 1998;392:71–5.
[32] Parkinson J, Mitreva M, Hall N, Blaxter M, McCarter JP. 400,000 nematode ESTs on the net. Trends Parasitol 2003;19:283–6.
[33] Parkinson J, Mitreva M, Whitton C, et al. A transcriptomic analysis of the phylum Nematoda. Nat Genet 2004;36:1259–67.
[34] Whitton C, Daub J, Quail M, et al. A genome sequence survey of the filarial nematode Brugia malayi: repeats, gene discovery, and comparative genomics. Mol Biochem Parasitol 2004;137:215–27.
[35] Crook M, Thompson FJ, Grant WN, Viney ME. daf-7 and the development of Strongyloides ratti and Parastrongyloides trichosuri. Mol Biochem Parasitol 2005;132:213–23.
[36] Paterson S, Viney ME. Functional consequences of genetic diversity in Strongyloides ratti infections. Parasitology 2003;270:1023–32.

[37] Mitreva M, Elling AA, Dante M, et al. A survey of SL1-spliced transcripts from the root-lesion nematode Pratylenchus penetrans. Mol Genet Genom 2004;272:138–48.
[38] Krause M, Hirsh D. A trans-spliced leader sequence on actin mRNA in C. elegans. Cell 1987;29:753–61.
[39] Mitreva M, Jasmer DP, Appleton J, et al. Gene discovery in the adenophorean nematode Trichinella spiralis: an analysis of transcription from three life cycle stages. Mol Biochem Parasitol 2004;137:277–91.
[40] McCarter JP, Mitreva MD, Martin J, et al. Analysis and functional classification of transcripts from the nematode Meloidogyne incognita. Genome Biol 2003;4:26.
[41] Hillier LD, Lennon G, Becker M, et al. Generation and analysis of 280,000 human expressed sequence tags. Genome Res 1996;6:807–28.
[42] Altschul SF, Madden TL, Schaffer AA, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucl Acids Res 1997;25:3389–402.
[43] Siegel S. Nonparametric methods for the behavioral sciences. New York: McGraw-Hill; 1956.
[44] Agresti A. An introduction to categorical data analysis. New York: John Wiley; 1996.
[45] Rohlf JF. In: Rohlf JF, Sokal RR, editors. Statistical tables. 2nd ed. San Francisco: W.H. Freeman; 1981. p. 158–62.
[46] Kamath RS, Fraser AG, Dong Y, et al. Systematic functional analysis of the Caenorhabditis elegans genome using RNAi. Nature 2003;421:231–7.
[47] Wylie T, Martin JC, Dante M, et al. Nematode.net: a tool for navigating sequences from parasitic and free-living nematodes. Nucl Acids Res 2004;32:423–6.
[48] Ten Asbroek AL, Olsen J, Housman D, Baas F, Stanton V. Genetic variation in mRNA coding sequences of highly conserved genes. Physiol Genom 2001;5:113–8.
[49] Tetteh KK, Loukas A, Tripp C, Maizels RM. Identification of abundantly expressed novel and conserved genes from the infective larval stage of Toxocara canis by an expressed sequence tag strategy. Infect Immun 1999;67:4771–9.
[50] Daub J, Loukas A, Pritchard DI, Blaxter M. A survey of genes expressed in adults of the human hookworm, Necator americanus. Parasitology 2000;120:171–84.
[51] Choo KB, Chen HH, Cheng WT, Chang HS, Wang M. In silico mining of EST databases for novel pre-implantation embryo-specific zinc finger protein genes. Mol Reprod Dev 2001;59:249–55.
[52] Morimoto RI, Tissieres A, Georgopoulos G. Stress response in biology and medicine. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 1990.
[53] Arrigo A-P, Landry J. Expression and function of the low-molecular-weight heat shock proteins. In: Morimoto RI, Tissieres A, Georgopoulos G, editors. The biology of heat shock proteins and molecular chaperones. Cold Spring Harbor: Cold Spring Harbor Laboratory Press; 1994. p. 335–73.

[54] Thompson FJ, Martin SA, Devaney E. Brugia pahangi: characterisation of a small heat shock protein cDNA clone. Exp Parasitol 1996;83:259–66.
[55] Hartman D, Cottee PA, Savin KW, et al. Haemonchus contortus: molecular characterisation of a small heat shock protein. Exp Parasitol 2003;104:96–103.
[56] Dalley BK, Golomb M. Gene expression in the Caenorhabditis elegans dauer larva: developmental regulation of Hsp90 and other genes. Dev Biol 1992;151:80–90.
[57] Rutherford SL, Lindquist S. Hsp90 as a capacitor for morphological evolution. Nature 1998;396:336–42.
[58] Thompson FJ, Cockroft AC, Wheatley I, Britton C, Devaney E. Heat shock and developmental expression of hsp83 in the filarial nematode Brugia pahangi. Eur J Biochem 2001;268:5808–25.
[59] Johnstone IL, Barry JD. Temporal reiteration of a precise gene expression pattern during nematode development. EMBO J 1996;15:3633–9.
[60] Tcherepanova I, Bhattacharyya L, Rubin CS, Freedman JH. Aspartic proteases from the nematode Caenorhabditis elegans. Structural organization and developmental and cell-specific expression of asp-1. J Biol Chem 2000;275:26359–69.
[61] Vercauteren I, Geldhof P, Peelaers I, Claerebout E, Berx G, Vercruysse J. Identification of excretory–secretory products of larval and adult Ostertagia ostertagi by immunoscreening of cDNA libraries. Mol Biochem Parasitol 2003;126:201–8.
[62] Basavaraju S, Zhan B, Kennedy MW, Liu Y, Hawdon J, Hotez PJ. Ac-FAR-1, a 20 kDa fatty acid- and retinol-binding protein secreted by adult Ancylostoma caninum hookworms: gene transcription pattern, ligand binding properties and structural characterisation. Mol Biochem Parasitol 2003;126:63–71.
[63] Garofalo A, Rowlinson MC, Amambua NA, et al. The FAR protein family of the nematode Caenorhabditis elegans. Differential lipid binding properties, structural characteristics, and developmental regulation. J Biol Chem 2003;278:8065–74.
[64] Draptchinskaia N, Gustavsson P, Andersson B, et al. The gene encoding ribosomal protein S19 is mutated in Diamond-Blackfan anaemia. Nat Genet 1999;21:169–75.
[65] Amsterdam A, Sadler KC, Lai K, et al. Many ribosomal protein genes are cancer genes in zebrafish. PLoS Biol 2004;5:139.
[66] Marygold SJ, Coelho CMA, Leevers SJ. Genetic analysis of RpL38 and RpL5, two minute genes located in the centric heterochromatin of chromosome 2 of Drosophila melanogaster. Genetics 2004;104:034124.
[67] Volarevic S, Thomas G. Role of S6 phosphorylation and S6 kinase in cell growth. Prog Nucl Acids Res Mol Biol 2001;65:101–27.
[68] Lohrum MA, Ludwig RL, Kubbutat MH, Hanlon M, Vousden KH. Regulation of HDM2 activity by the ribosomal protein L11. Cancer Cell 2003;6:577–87.
[69] Kim SK, Lund J, Kiraly M, et al. A gene expression map for Caenorhabditis elegans. Science 2001;293:2087–92.