Microarray analysis identifies genes preferentially expressed in the lung schistosomulum of Schistosoma mansoni


International Journal for Parasitology 36 (2006) 1–8 www.elsevier.com/locate/ijpara

Rapid Communication

Gary P. Dillon a, Theresa Feltwell b, Jason P. Skelton b, Peter D. Ashton a, Patricia S. Coulson a, Michael A. Quail b, Nefeli Nikolaidou-Katsaridou b, R. Alan Wilson a, Alasdair C. Ivens b,*

a Department of Biology, University of York, PO Box 373, York YO10 5YW, UK
b Pathogen Microarrays Group, Wellcome Trust Sanger Institute, Hinxton CB10 1SA, UK

Received 7 September 2005; received in revised form 10 October 2005; accepted 24 October 2005

Abstract

The lung schistosomulum of Schistosoma mansoni is a validated target of protective immunity elicited in vaccinated mice. To identify genes expressed at this stage we constructed a microarray, representing 3088 contigs and singlets, with cDNA derived from in vitro cultured larvae and used it to screen RNA from seven life-cycle stages. Clustering of genes by expression profile across the life cycle revealed a number of membrane, membrane-associated and secreted proteins up-regulated at the lung stage, that may represent potential immune targets. Two promising secreted molecules have homology to antigens with vaccine and/or immunomodulatory potential in other helminths.

© 2005 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

Keywords: Blood fluke; Life cycle; Transcriptome; Secretion; Membrane proteins; Schistosoma

* Corresponding author. Tel.: +44 1223 494851; fax: +44 1223 494919. E-mail address: [email protected] (A.C. Ivens).

Schistosomiasis is an important parasitic disease of humans in many parts of the tropics, second only to malaria as a cause of morbidity (King et al., 2005) and mortality. Nevertheless, the development of an effective vaccine as a tool for schistosomiasis control has proved an elusive goal. Although acquired resistance has been demonstrated in human populations, little is known about the mechanisms or mediating antigens on which a vaccine could be based. However, high levels of anti-larval immunity can be elicited by the exposure of rodents (Coulson, 1997) and primates to radiation-attenuated (RA) cercariae. Although not suitable on ethical or practical grounds for use in humans, the RA vaccine provides an excellent paradigm on which to base a recombinant human vaccine (Coulson, 1997). Lung schistosomula appear to be the principal target of immune attack in once-vaccinated C57BL/6 mice (Coulson, 1997), whilst the involvement of CD4+ T-cells in the pulmonary effector response means that the parasite antigens must either be secreted or surface-exposed (S/S) in order to be processed by accessory cells for presentation on major histocompatibility complex (MHC) II. Unfortunately, little progress has been made in defining the S/S antigens of the lung schistosomulum (Harrop et al., 1999), but it is a reasonable premise that they will be the products of genes highly expressed at that stage. Microarrays provide an ideal tool to compare the level of expression of many genes simultaneously using multiple RNA samples. We describe the construction of an array of ~6000 features, comprising expressed sequence tags (ESTs) exclusively from the lung schistosomula of Schistosoma mansoni, which we have used to compare the pattern of gene expression in lung schistosomula relative to six other life-cycle stages. In silico analyses were used to identify transcripts that are restricted to, or highly expressed at, the lung stage. Using software that identifies S/S proteins via encoded signal peptides, we have highlighted a small subset of genes whose products might serve as potential vaccine candidates.

A Puerto Rican isolate of S. mansoni was maintained by passage through NMRI strain mice and Biomphalaria glabrata snails. The seven life-cycle stages providing RNA were: (i) early liver worms and (ii) adult worms, obtained by portal perfusion of mice, 21 days and 8 weeks, respectively, after infection with 200 cercariae; (iii) eggs extracted from infected mouse liver by maceration and partial digestion with trypsin followed by washing and passage through a 180 µm sieve; (iv) germ balls from developing daughter sporocysts, acquired by

0020-7519/$30.00 © 2005 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.ijpara.2005.10.008


careful dissection of the hepatopancreas 21 days after exposing snails to 40 miracidia each; (v) cercariae obtained by exposing snails with a patent infection to a bright light; (vi) day 2 and (vii) day 7 schistosomula, produced by mechanical transformation of cercariae followed by in vitro culture. (After 7 days in culture, the larvae have the morphological features of lung worms and are capable of maturation when introduced into the portal vein of mice (Harrop and Wilson, 1993).) Total RNA was extracted from the parasite samples using TRIzol (Invitrogen) and its integrity checked with an RNA 6000 Nano LabChip (Agilent), both procedures according to the manufacturers' instructions.

Two lung stage libraries were constructed using different approaches. Library one, which was directionally cloned, was made from oligo(dT)-purified mRNA, using the Poly(A)Purist kit (Ambion) according to the manufacturer's instructions. A 5 µg sample of purified mRNA from day 7 schistosomula was reverse transcribed using the ZAP cDNA synthesis kit (Stratagene). The recommended protocol was modified such that, following Xho I digestion of the 3′ adapter sequence, the cDNA was purified by gel electrophoresis (three times) and fractions representing 0.5–1.5, 1.5–2.5 and 2.5–7 kb were recovered. Subsequently, separate libraries were constructed from each fraction following the ZAP cDNA synthesis kit protocol. Library two, the second EST source, was a randomly

primed 5′-biased library commissioned from Incyte, cloned into the plasmid vector pcDNA2.1 (Invitrogen); insert sizes ranged from 0.6 to 2.5 kb (mean 1.2 kb). Approximately 5000 clones from library one and 1000 from library two were sequenced using BigDye® Terminator chemistry (v3.0, v3.1) on an ABI 3700 automated DNA sequencer (Applied Biosystems). The ~6000 sequences (accession numbers AM042715–AM048613) were assembled into 3088 contigs and singlets using CAP3 (contigs can be accessed at http://www.schistodb.org; Supplementary Material Table 1). Putative function was assigned to the clusters by homology to proteins in the UniProt database using BLASTX. Gene Ontology (GO) terms were assigned by homology using UniProt; in both cases, the expect value threshold was set to 1e-16. Subcellular location was predicted using the Proteome Analyst Specialized Subcellular Localization Server (Lu et al., 2004).

PCR products were obtained from individual clones using universal primers, with C6-amino modification of the forward primer enabling their covalent attachment to the slide surface. PCR products were quality-assessed by agarose gel electrophoresis, and those containing two or more fragments were eliminated. Arrays were printed in duplicate on CodeLink™ Activated Slides (Amersham Biosciences) using a Biorobotics MicroGridII robot (Genomic Solutions), as

Table 1. Gene Ontology categories identified as being under regulatory control in the life cycle stages indicated, ascribed by GoMiner

Contig | Product | UniProt accession | Category | Members | Altered | P-value up/down | Stage(s)
Sm00415 | Succinate-CoA-ligase (EM) | Q6P7S4 | TCA Cycle | 6 | 3 | D <0.001 | 2+7
Sm03361 | Cytochrome c oxidase assembly factor (EM) | Q7PPN7 | Electron transport | 39 | 9 | D <0.001 | 2+7
Sm03958 | Ferredoxin (EM) | Q9YHT2 | Electron transporter activity | 18 | 3 | D 0.003 | 2+7
 |  |  | Mitochondrial electron transport chain | 5 | 2 | D 0.003 | 2+7
 |  |  | Electron transport | 39 | 9 | D <0.001 | 2+7
Sm07278 | Aconitate hydratase (EM) | Q7Q3F6 | TCA Cycle | 7 | 3 | D <0.001 | 2+7
 |  |  | Aconitate hydratase activity | 1 | 1 | D 0.02 | 2+7
 |  |  | TCA Cycle | 7 | 3 | D <0.001 | 2+7
Sm09091 | Cytochrome c1 (EM) | Q6VBC0 | Electron transporter activity | 18 | 3 | D 0.003 | 2+7
 |  |  | Mitochondrial electron transport chain | 5 | 2 | D 0.003 | 2+7
 |  |  | Electron transport | 39 | 9 | D <0.001 | 2+7
Sm12677 | WCRF180 (CR) | Q9NRL2 | Chromatin remodelling | 3 | 1 | D 0.05 | 2+7
 |  |  | Nuclear chromosome | 2 | 1 | D 0.04 | 2+7
Sm12462 | SGTP4 (EM) | Q26581 | Sugar porter | 6 | 1 | U 0.05 | 7
Sm03987 | Annexin (CS) | Q86E42 | Calcium dependent phospholipid binding | 3 | 1 | U 0.03 | 7
Sm12764 | Ag5 (PA) | Q86FG7 | Trypsin activity | 2 | 1 | U 0.02 | 7
Sm00765 | Dynein light chain (CS) | Q94758 | Microtubule associated complex | 23 | 3 | U <0.001 | 7+21
 |  |  | Microtubule motor activity | 13 | 2 | U 0.002 | 7+21
Sm02451 | Calpain (CS) | O45033 | Calpain activity | 4 | 2 | U 0.002 | 7+21
Sm13136 | DMN1 (a) (CS) | Q6P3T6/Q05193 | Receptor mediated endocytosis | 2 | 1 | U 0.02 | 7+21
 |  |  | Synaptic transmission | 7 | 1 | U 0.04 | 7+21
 |  |  | Motor activity | 24 | 3 | U <0.001 | 7+21
 |  |  | Coated pit | 1 | 1 | U 0.006 | 7+21

The P value is the probability that the individual category has one or more genes that are up (U) or down (D) regulated. The ‘Members’ column lists the total number of genes represented on the array present in a given category; the ‘Altered’ column lists the actual number showing altered expression. Abbreviations in the Product column are: EM, Energy Metabolism; CR, Chromosome remodelling; CS, Cytoskeletal; PA, Protease Activity.
(a) Dynamin: microtubule-associated force-producing protein involved in producing microtubule bundles and able to bind and hydrolyze GTP. Most probably involved in vesicular trafficking processes, in particular endocytosis.
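The relationship between the ‘Members’ and ‘Altered’ columns and a category P value can be illustrated with a minimal sketch. It assumes a one-sided hypergeometric (Fisher-type) test; the function name is hypothetical, and the pairing of array totals with one category below is an assumption made purely for illustration, so the result is not expected to reproduce the tabulated values.

```python
from math import comb

def enrichment_p_value(total_genes, total_altered, members, altered):
    """One-sided hypergeometric tail: probability of seeing at least
    `altered` changed genes in a category of size `members`."""
    max_hits = min(members, total_altered)
    denominator = comb(total_genes, total_altered)
    tail = sum(
        comb(members, k) * comb(total_genes - members, total_altered - k)
        for k in range(altered, max_hits + 1)
    )
    return tail / denominator

# Assumed inputs: 1665 annotated sequences, 42 of them flagged as changed,
# and a category with 6 members of which 3 are changed.
p = enrichment_p_value(total_genes=1665, total_altered=42, members=6, altered=3)
print(f"{p:.2e}")
```

A small P value here indicates that the category contains more changed genes than would be expected if the changed genes were drawn at random from the annotated set.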


described (http://www.sanger.ac.uk/Projects/Microarrays/arraylab/protocol4.pdf). No attempt was made to remove redundant transcripts prior to printing of the arrays.

Labelling of mRNA was performed without prior amplification, either by T7 polymerase or PCR, to avoid introducing bias due to transcript abundance, length and GC content (Wadenback et al., 2005). The cDNAs were generated by reverse transcription of total RNA using Superscript III enzyme (Invitrogen), primed with anchored oligo-dT23 (Sigma Aldrich) and labelled with deoxyribonucleotides (dNTPs) incorporating either a 2′-deoxycytidine 5′-triphosphate (dCTP) Cy3 or Cy5 dye (Perkin Elmer). Labelled products were purified using AutoSeq G50 columns (Amersham Biosciences), mixed with control samples, and hybridised to microarray slides in 48% formamide at 55 °C for 16–20 h, in a humid chamber. The slides were then washed at room temperature and scanned using a GenePix® 4000B laser scanner, and the array features, independent of redundancy, were quality checked and quantified using GenePix® Pro software (Axon Instruments Inc.). A control pool was created by combining equal amounts of total RNA from the seven life cycle stages. A 20 µg sample of this pool and a single stage sample, labelled as above with Cy3 or Cy5 dyes, were hybridised to a slide. Overall, analysis of the seven life cycle stages encompassed 28 slides, incorporating two biological replicates and dye swaps; these data were submitted to ArrayExpress (http://www.ebi.ac.uk/arrayexpress/; Accessions A-SGRP-2 and E-SGRP-2).

Array data were analysed using the R statistical language and environment (http://www.r-project.org), specifically with the microarray analysis tools available from the Bioconductor Project (http://www.bioconductor.org). Data were background-subtracted using a Bayesian model-based method (Kooperberg et al., 2002) and normalized using the LIMMA Bioconductor package (Smyth, 2004) with print-tip loess (see LIMMA documentation) to correct for spatial and other artefacts. Data obtained from biological replicates were averaged before linear models were applied and significance statistics generated, using empirical Bayesian methods in order to assess differential gene expression.

Six ESTs representing a spectrum of expression patterns were chosen for validation of array predictions by real time PCR analyses. These were: SmlC4a06, relatively constant; SmlC13d04, extreme variability; SmlC15f10, SmlA39h10, SmlC19c8 and SmlC4a6, variable expression across the life cycle. The Primer Express package (Applied Biosystems) was used to design primers and Taqman probes to the six ESTs and the 18S ribosomal RNA control (Supplementary Material Table 2). Aliquots of the same RNA samples used for the array experiments were assayed on an ABI 7700 PRISM instrument, according to the manufacturer's instructions. All data were normalized to the lowest level of expression determined by real time PCR. The level of expression was compared with that estimated from the array hybridisations (Supplementary Material Fig. 1). The scatter plot revealed that RT-PCR was 3.2 times more sensitive than microarrays for analysis of mRNA levels when three experimental values representing extreme levels of expression, which inflate the correlation


coefficient (R), were omitted from the analysis. The datasets exhibited high concordance, with R = 0.68; when only those samples flagged as statistically significant on the array are considered, a correlation coefficient of 0.88 is obtained.

The processed array data for the 1,665 contigs and singlets that could be annotated by BLASTX sequence similarity to one or more UniProt protein database entries were first analysed using GoMiner, in a subjective approach, to provide a broad overview of biological processes centred on the lung schistosomulum. For each life cycle stage, the ratios of signal intensity relative to the pool were determined, normalised against the day 7 value, and classified as up- or down-regulated on the basis of a significant (P < 0.001) two-fold change in signal. The complete list of ESTs for which a tentative BLASTX-derived function could be inferred was submitted to GoMiner (Gene Ontology version number go_200503), followed by a matrix of up/down expression scores. GoMiner returns probabilities that a given GO category is enriched in differentially up- or down-regulated genes in the life cycle stage of interest (Zeeberg et al., 2003). The screen for up- or down-regulated genes yielded 42 at days 2 and 7, 22 at day 7 and 10 at days 7 and 21. The most prominent aspects of parasite biology highlighted by GO analysis in lung schistosomula and adjacent life cycle stages were changes in the expression of genes associated with energy metabolism (EM; 6), cytoskeletal organisation (CS; 4), protease activity (PA; 1) and chromosome remodelling (CR; 1; Table 1).

Clustering of array data was performed using Bioconductor software packages as two separate components. The life stages were first separated into two distinct groups, intra-mammalian and extra-mammalian; data for all genes were averaged within each group and the difference in expression between the two groups was calculated.
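The group comparison just described can be sketched as follows. This is only an illustration: the assignment of stages to the two groups is an assumption (the text does not list it), the function and variable names are hypothetical, and the values are invented. Stage labels follow Fig. 1.

```python
from statistics import mean

# Assumed grouping, for illustration only.
INTRA_MAMMALIAN = ("D2", "D7", "D21", "A")
EXTRA_MAMMALIAN = ("E", "GB", "C")

def group_difference(log_ratios, intra=INTRA_MAMMALIAN, extra=EXTRA_MAMMALIAN):
    """Mean log ratio over the intra-mammalian stages minus the mean
    over the extra-mammalian stages, for a single array feature."""
    intra_mean = mean(log_ratios[stage] for stage in intra)
    extra_mean = mean(log_ratios[stage] for stage in extra)
    return intra_mean - extra_mean

# Hypothetical log2 (stage / pool) ratios for one feature.
example = {"E": -1.0, "GB": -0.5, "C": -1.5,
           "D2": 0.5, "D7": 1.5, "D21": 1.0, "A": 1.0}
print(group_difference(example))
```

A positive difference marks a feature that is, on average, more highly expressed in the stages placed in the first group.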
Expression was classed as significant if it exceeded a natural log-odds (lods) cutoff of 3 (the natural logarithm of the probability that a gene is differentially expressed divided by the probability that it is not; this equated to a corrected P of approximately 0.0001). After separate comparison of each life cycle stage with the control, classification resulted in a relatively small selection of differentially expressed genes for further analyses and graphical representation. A heat map of clusters was generated by calculating the Euclidean distance between ESTs, based on their expression levels across the experiment. The ESTs were then grouped using the McQuitty agglomeration method to produce a dendrogram that best represented the data (Fig. 1). The heat map of 563 ESTs, representing 281 sequence contigs, gives a graphical overview of the results; columns represent life stages and rows are grouped by hierarchical clustering. Genes up-regulated at days 2 and 7 (Fig. 1, regions A, B and E), and at day 7 alone (Fig. 1, regions C, D and F), are listed in Table 2, together with annotations of putative function (BLASTX-assigned, expect value ≤1 × 10⁻¹⁶). Thirty-six genes were identified, and their sub-cellular location predicted by Proteome Analyst. They all fell within the following few categories: membrane (6); membrane-associated (5); secreted (5); cytoskeleton (5); organelle (3); cytosolic (3) and those with no location assignable (9). As noted earlier, there are redundant features (i.e. multiple EST clones representing the same


Table 2. Subset of genes up-regulated at days 2 and 7, or day 7 alone, relative to the pooled control (Bioconductor project analyses, lods > 3)

Contig | Putative product | UniProt accession | Location | Stage | ESTs
Sm12542 | Tetraspanin 1 | O44420 | Membrane | 2+7 | 3/4
Sm04463 | Tetraspanin 2 | O44420 | Membrane | 7 | 1/1
Sm03463 | Hypothetical | N/A | Membrane | 2+7 | 3/4
Sm12462 | SGTP4 | Q26581 | Membrane | 7 | 5/5
Sm13225 | Sodium ion channel | Q6ZMN3 | Membrane | 7 | 1/1
Sm11655 | Cadherin | Q8VDA1 | Membrane | 7 | 1/1
Sm12654 | Scramblase | P58195 | Membrane associated | 2+7 | 1/1
Sm07783 | Annexin 1 | Q86DV3 | Membrane associated | 2+7 | 1/1
Sm03987 | Annexin 2 | Q86E42 | Membrane associated | 7 | 1/1
Sm12683 | Hypothetical containing PDZ | Q801P2 | Membrane associated | 7 | 2/2
Sm13052 | Tensin | Q8IZW7 | Membrane associated | 7 | 2/2
Sm13221 | Hypothetical | N/A | Secreted | 2+7 | 1/1
Sm12775 | Wasp venom allergen | Q86F86 | Secreted | 7 | 2/3
Sm12764 | Antigen 5 | Q86FG7 | Secreted | 7 | 5/5
Sm12352 | Hypothetical | N/A | Secreted | 7 | 2/2
Sm01621 | Hypothetical | N/A | Secreted | 7 | 1/1
Sm00556 | Arp-2 | Q7SXW6 | Cytoskeleton | 2+7 | 1/1
Sm12997 | Severin 1 | Q24800 | Cytoskeleton | 2+7 | 2/2
Sm12742 | Severin 2 | Q24800 | Cytoskeleton | 2+7 | 3/3
Sm13240 | Fimbrin | Q26574 | Cytoskeleton | 7 | 14/14
Sm12876 | SM22.6 | P14202 | Cytoskeleton | 7 | 9/11
Sm03779 | LAMP | Q86E98 | Organelle | 2+7 | 1/1
Sm12766 | Mitochondrial ribosomal protein | Q86EG7 | Organelle | 7 | 1/1
Sm12907 | Cathepsin B1 isotype 2 | Q8MNY1 | Organelle | 7 | 1/5
Sm00165/Sm00493/Sm11814 | Lactate dehydrogenase | Q7Z1I3 | Cytosolic | 2+7 | 8/9
Sm09078 | Ribosomal protein | Q86DZ5 | Cytosolic | 2+7 | 1/1
Sm13152 | Hypothetical | Q7PWE8 | Cytosolic | 7 | 1/1
Sm04825 | Hypothetical | Q8MPF1 | Unknown | 7 | 2/3
Sm11584 | Hypothetical | N/A | Unknown | 2+7 | 1/1
Sm13096 | Hypothetical | N/A | Unknown | 2+7 | 4/4
Sm05403 | Hypothetical | N/A | Unknown | 2+7 | 1/1
Sm11750 | Hypothetical | N/A | Unknown | 2+7 | 1/1
Sm11870 | Hypothetical | N/A | Unknown | 2+7 | 1/2
Sm12883 | Hypothetical | N/A | Unknown | 2+7 | 1/1
Sm12913 | Hypothetical | N/A | Unknown | 7 | 1/2
Sm29274 | Hypothetical | N/A | Unknown | 7 | 1/1

The expressed sequence tags (ESTs) column represents the fraction of replicates on the array that co-cluster on the heatmap.
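To make the lods cutoff used for Table 2 more concrete, the following minimal sketch converts a natural log-odds score into the implied probability of differential expression. It uses only the definition of log-odds given in the text; the function name is hypothetical, and the calculation is separate from the corrected P value quoted by the authors.

```python
import math

def lods_to_probability(lods):
    """Convert natural log-odds of differential expression
    into the corresponding probability."""
    odds = math.exp(lods)
    return odds / (1.0 + odds)

# A lods score of 3 corresponds to odds of about 20:1.
print(round(lods_to_probability(3.0), 3))
```

On this reading, a feature at the cutoff has roughly a 95% probability of being differentially expressed, and a lods score of 0 corresponds to even odds.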

transcript) on the array, which thus served as additional internal controls. Analysed independently of each other, each copy exhibited similar hybridisation characteristics, providing an additional level of confidence in the predictions.

Although it would be possible to dissect out other patterns of gene expression from the array data obtained, these would not be germane to the question we posed. We have highlighted just one further pattern, primarily because the trends observed can be clearly related to a documented aspect of parasite biology, namely gut development. Thus a group of 15 contigs, represented by 38 copies on the array (Fig. 1, regions 1 and 2), increased in expression from lung schistosomulum to 21 day liver worm to adult (Supplementary Material Table S3). The largest group identified were proteolytic enzymes, all involved in haemoglobin digestion; genes encoding a saposin B homologue, also known as LGG (accession numbers: Q26536, Q26587, Q26535), a putative saposin, and a pyrophosphatase may also be of some note.
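The gut-development pattern described above, with expression rising from lung schistosomulum to liver worm to adult, can be expressed as a simple filter. This is a minimal sketch rather than the clustering procedure actually used; the function name and the example values are hypothetical, and stage labels follow Fig. 1.

```python
def increases_lung_to_adult(profile, stages=("D7", "D21", "A")):
    """True if expression rises strictly at each step along `stages`."""
    values = [profile[stage] for stage in stages]
    return all(earlier < later for earlier, later in zip(values, values[1:]))

# Hypothetical log2 (stage / pool) ratios, not taken from the array data.
gut_like = {"D7": -1.2, "D21": 0.4, "A": 1.8}
lung_like = {"D7": 1.5, "D21": 0.2, "A": -0.6}
print(increases_lung_to_adult(gut_like), increases_lung_to_adult(lung_like))
```

A filter of this kind ignores statistical significance, so in practice it would be applied only to features that had already passed the lods cutoff.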

ArrayMiner was used as a third method of data analysis, whereby the log2 ratios of the entire pair-wise microarray dataset, unfiltered by lods score, were subjected to unit-normalised Gaussian clustering (Pearson distance coefficient; www.optimaldesign.com (Falkenauer, E. and Marchand, A., 2001. Using k-Means? Consider ArrayMiner. Proceedings of the 2001 International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences, Las Vegas, Nevada, USA, June 25–28, 2001)). A total of 128 clusters was generated, from which we selected four that best matched our criteria for high level of expression in the lung stage (Fig. 2, clusters 11 and 59) or increasing expression from lung to liver to adult worm (Fig. 2, clusters 46 and 60). In the latter category, cluster 60 contained the four previously detected proteases (Supplementary Material Table S3) plus two further transcripts with no known homologue (Supplementary Material Table S4). Cluster 46 contained three previously identified transcripts (LGG, Ectonucleotide pyrophosphatase,


Fig. 1. Heat map and dendrogram of 563 expressed sequence tags (ESTs) which differed significantly (lods > 3) in their expression in one or more life cycle stages. The dendrogram explicitly reveals the structure of the clusters. Regions A–F indicate clusters with ESTs most highly expressed at days 2 and 7, or day 7 alone. Regions 1 and 2 indicate ESTs with expression in descending order adult > 21 day liver worm > lung schistosomulum. The graduated colour scheme uses 600 shades on a scale from blue through white to red, representing relative expression levels from low to high (i.e. under- and over-representation relative to the control). The central black line in each column represents the deviation of expression up or down. Life cycle stages: E, Egg; GB, Germ Ball; C, Cercaria; D2, Day 2 schistosomulum; D7, Day 7 schistosomulum; D21, Day 21 liver worm; A, Adult.
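The clustering behind the dendrogram, Euclidean distances followed by McQuitty agglomeration, can be sketched in plain Python. This is a minimal illustration, not the Bioconductor code used for the figure: the function name and the three example profiles are hypothetical, and McQuitty's method is implemented here as the weighted average linkage (WPGMA) update, in which the distance from a merged cluster to any other cluster is the simple mean of the two former distances.

```python
from math import dist  # Python 3.8+

def mcquitty_clusters(profiles):
    """Agglomerate expression profiles; return the merge history as
    (members_a, members_b, height) tuples."""
    names = list(profiles)
    clusters = {i: (name,) for i, name in enumerate(names)}
    # Pairwise Euclidean distances, keyed by (smaller id, larger id).
    d = {
        (i, j): dist(profiles[names[i]], profiles[names[j]])
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    merges = []
    next_id = len(names)
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda item: item[1])
        merges.append((clusters[a], clusters[b], height))
        updated = {}
        for c in clusters:
            if c in (a, b):
                continue
            d_ac = d[(min(a, c), max(a, c))]
            d_bc = d[(min(b, c), max(b, c))]
            # McQuitty update: unweighted mean of the two old distances.
            updated[(c, next_id)] = (d_ac + d_bc) / 2.0
        d = {key: v for key, v in d.items() if a not in key and b not in key}
        d.update(updated)
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        next_id += 1
    return merges

# Hypothetical log ratios across three stages for three ESTs.
profiles = {
    "est1": [2.0, 2.1, 0.0],
    "est2": [2.0, 2.0, 0.1],
    "est3": [-1.0, 0.0, 2.0],
}
merges = mcquitty_clusters(profiles)
for members_a, members_b, height in merges:
    print(members_a, members_b, round(height, 3))
```

The merge heights correspond to the branch heights of a dendrogram; for real datasets a library implementation would be preferable to this quadratic-memory sketch.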

mitochondrial ribosomal protein) plus a further seven with attributed function and eight encoding hypothetical proteins. Clusters 11 and 59, whose profiles peaked at the lung stage, contained 14 transcripts detected by analysis with Bioconductor packages plus six additional transcripts with known function, the most pertinent being winged helix, tetraspanins 3 and 4 and calpain, plus 19 with no known function.

The >6000 features on the array compile into 3088 contigs and singlets that represent approximately 44% coverage of the lung stage transcriptome, based on the estimate of 7000 expressed transcripts per life cycle stage (Verjovski-Almeida et al., 2003). While levels of gene and protein expression do not necessarily coincide, it is likely that some of the abundant transcripts encode major differentially expressed constituents

of lung schistosomula and hence are potential antigens. (Protein abundance is an important consideration where priming of the system by antigen is concerned.) We opted to recover adequate amounts of RNA for hybridisations as we felt that the effort entailed was justified in order to avoid artefacts introduced by nucleic acid amplification (Wadenback et al., 2005); we analysed the data set produced in three ways to maximise the information extracted. The most rigorous classification of expression patterns was provided by analyses using Bioconductor software packages, selecting only those ESTs with a high level of statistically significant variation across the life cycle stages relative to the pool. The transcripts that did not vary significantly relative to the control may represent a core of genes encoding the proteins needed for


Fig. 2. Four clusters selected by visual inspection of the 128 generated from the total dataset by ArrayMiner. Clusters 11 and 59 depict those transcripts most highly expressed at the lung stage. Clusters 46 and 60 depict transcripts with expression in descending order adult > 21 day liver worm > lung schistosomulum. The blue line is the centroid representing the average expression profile across the seven life cycle stages as described in Fig. 1.
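The quantities referred to in this caption and in the ArrayMiner analysis, a Pearson-based distance between profiles and a cluster centroid, can be sketched as follows. ArrayMiner's own algorithm is not described in the text, so this is only an illustration under the assumption that unit normalisation means mean-centring each profile and scaling it to unit length; all function names and values are hypothetical.

```python
from math import sqrt
from statistics import mean

def unit_normalise(profile):
    """Mean-centre a profile and scale it to unit length.
    A completely flat profile would give a zero norm and is not handled."""
    centre = mean(profile)
    centred = [x - centre for x in profile]
    norm = sqrt(sum(x * x for x in centred))
    return [x / norm for x in centred]

def pearson_distance(p, q):
    """1 - Pearson correlation: 0 for identical shapes, 2 for mirror images."""
    u, v = unit_normalise(p), unit_normalise(q)
    return 1.0 - sum(a * b for a, b in zip(u, v))

def centroid(profiles):
    """Average expression profile of a cluster, stage by stage."""
    return [mean(column) for column in zip(*profiles)]

# Hypothetical log2 ratios across three stages.
rising = [1.0, 2.0, 3.0]
steeper = [2.0, 4.0, 6.0]
falling = [3.0, 2.0, 1.0]
print(round(pearson_distance(rising, steeper), 6))
print(round(pearson_distance(rising, falling), 6))
print(centroid([rising, steeper]))
```

Because the distance depends only on profile shape, transcripts with the same trend but different amplitudes fall close together, which is the behaviour needed to group transcripts that peak at the same life cycle stage.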

cellular processes common to all schistosome cells. A more detailed in silico and biochemical analysis of this subset is ongoing.

Within the 563 ESTs selected by Bioconductor-based analyses, we focused on a profile displaying increased gene expression from lung to adult worm for the purpose of validation. This cluster comprised genes encoding three cathepsins and asparaginyl endopeptidase (Sm32), which collectively are responsible for the digestion of haemoglobin. The development of the worm gut, and onset of blood feeding, provide the morphological (Basch, 1981) and functional correlates for this pattern of gene expression. Of the eight other transcripts identified in this group, LGG (a saposin) is of particular note. It is an activator of lysosomal lipid degrading enzymes and so probably originates in the lysosomes of the gut epithelium, as do the cathepsins. The ArrayMiner analysis highlighted two clusters of transcripts that similarly increased in expression level from the lung to adult worm. The majority of proteases mentioned above were present in one cluster, suggesting co-ordinated regulation of their expression. Conversely, LGG and a second saposin clustered separately, together with a possible sexual development-associated protein (gynecophoral canal protein). Given that our array is based on lung stage transcripts, the detection of a supposed developmental cue intimates that the processes of sexual maturation are poised to begin at this early migratory stage. That our array analysis selects expression profiles correlating with documented morphological and physiological changes in the parasite gut from lung stage to adult worm strongly suggests that the patterns it highlights in the lung schistosomulum are biologically meaningful.

The goal of our study was to analyse the gene expression pattern of in vitro cultured lung schistosomula, which equate to the intravascular migrating parasite. We also included larvae

cultured for 2 days after derivation from infective cercariae by a transformation process that involves loss of penetration glands and considerable body remodelling in preparation for intradermal migration and skin exit via the blood vessels. Migration of larvae through murine skin and pulmonary vascular beds is sufficiently protracted to permit interaction with immune effector responses, characterised by leukocytic infiltrates that block migration and lead to parasite elimination (Coulson, 1997).

A switch in larval metabolism from aerobic to anaerobic pathways is known to occur at transformation, only reversed after the onset of blood feeding (Lawson and Wilson, 1980). GoMiner identified a number of down-regulated genes, e.g. cytochrome c1, ferredoxin and aconitate hydratase, associated with aerobic respiration, together with up-regulation of the sugar transporter SGTP4. This gene and lactate dehydrogenase were also detected by cluster analysis, their up-regulation presumably reflecting an increased demand for glucose in the absence of aerobic metabolic activity. Although it is not possible to be categorical, given that the entire S. mansoni transcriptome is not available, the microarray-based data do appear to corroborate metabolic studies performed in ex vivo worms (Lawson and Wilson, 1980). Demonstrating the metabolic equivalence of in vitro and ex vivo schistosomula gives additional confidence that the cultured parasites follow a normal developmental profile. At days 2 and 7, parallel to the metabolic changes, chromatin is likely to be in an unpacked state as chromosome-condensing functions are depressed. In addition, at day 7 the up-regulation of winged helix transcription factor points to the stage-specific expression of a subset of genes. Changes in chromatin structure and increased transcription factor expression are consistent with morphological studies showing that cell movement and tissue remodelling occur in the developing larva without accompanying cell division (Clegg, 1965).


The body flexibility of the newly transformed schistosome larva is determined by a network of outer circular and inner longitudinal muscle fibres, held in place by adherens junctions, together with a sub-tegumental matrix of extracellular fibrils (Crabtree and Wilson, 1986). Loss of this matrix permits a four-fold extension in maximum body length by day 6–8 (Crabtree and Wilson, 1980), while the basal tegument becomes attached directly to the underlying muscles by adherens junctions (Crabtree and Wilson, 1986). Cadherin, a postulated constituent of such structures (Verjovski-Almeida et al., 2003), was strongly up-regulated in the day 7 larva. The changes in body flexibility are achieved without any alteration in the number of muscle fibres, which acquire the capability for much greater extension. This is reflected in the up-regulation of actin-related protein 2 (ARP-2) and severins 1 and 2 at days 2 and 7. The first of these initiates polymerisation of F-actin, while the severins counteract this process by capping the growing filaments. Fimbrin, which cross-links actin filaments into bundles, is also up-regulated at day 7. The alteration in body form during migration may well require changes in the focal adhesion interactions with the extracellular matrix and the formation of new adherens junctions between cells; the elevated expression of tensin, a phosphatase with Src homology 2 (SH2) domains that disrupts signalling at focal adhesions, can be viewed in this context.

A second event associated with intravascular migration is the modification of body spination (Crabtree and Wilson, 1980); mid-body spines are lost while anterior and posterior spines are retained, an arrangement that facilitates passage of the larva through the narrow bore capillaries (Crabtree and Wilson, 1980). Since the spines are composed of polymerised actin and fimbrin (R.S. Curwen, personal communication), the action of the severin gene products may be to promote spine loss. Alternatively, the up-regulation of ARP-2 and fimbrin in lung parasites may be a prerequisite for the rapid reappearance of spines after parasite arrival in the hepatic portal vein (Crabtree and Wilson, 1980).

The dramatic and rapid fluctuations in body length that the migrating schistosomula undergo have consequences for the configuration of the syncytial tegument that constitutes the parasite’s surface, which must be able to accommodate such changes. It has been estimated that the surface area increases by >50% between days 3 and 8 (Crabtree and Wilson, 1980), being thrown up into pitted ridges in contracted regions, whilst appearing almost completely smooth in the fully extended parasite (Crabtree and Wilson, 1986). The high-level expression at days 2 and 7 of scramblase, a phospholipid flippase, points to the occurrence of membrane biogenesis associated with the increasing surface area. The pitted architecture of the adult tegument surface is maintained by an associated cytoskeleton, believed to consist of actin, and a series of co-ordinately expressed calcium-binding proteins with varying sequence similarities to the dynein light chain (Braschi et al. (1981)). The up-regulation at day 7 of a single member of that group, Sm22.6, argues for its role in tegument plasticity rather than rigidity, while the high levels of calcium-dependent protease calpain transcripts (at 7 and 21 days) may facilitate the cytoskeletal rearrangement, possibly required for


squeezing along capillaries. The detection of dynein light chain transcripts at days 7 and 21 may indicate the parasite’s readiness to reinstate the more ridged tegumental architecture of the liver worm (Crabtree and Wilson (1980)). The extreme deformation of the tegument surface (and possibly other) membranes in the lung schistosomulum may explain why four distinct tetraspanin transcripts were upregulated. This group of proteins has diverse functions in membrane architecture, providing an intra-membranal framework with which other trans- or extra-membranal proteins can associate. It is notable that none of the four tetraspanins is identical to the previously described schistosome vaccine candidate Sm23. The high level expression of two annexins, a class of phospholipid binding proteins, may also be related to surface membrane function. This family of proteins has been implicated in processes such as membrane domain stabilisation, and the regulation of membrane-cytoskeleton dynamics. Lastly, a transcript encoding a sodium channel identified in the day 7 sample may have physiological implications. In vertebrates, this amiloride-sensitive class of channel is located at the luminal surface of transporting epithelia, and is responsible for the maintenance of body salt and water balance. If located at the tegument surface, it is tempting to conclude that it serves the same function in the migrating parasite. A principal motivation of this study was to highlight transcripts that encode S/S proteins. The rationale was that proteins which met these criteria might mediate the protective immunity elicited by the RA vaccine (Coulson, 1997). The membrane proteins discussed above fall into this category, as do five highly expressed transcripts that encode secreted proteins, on the basis of Proteome Analyst assignment. Of these five, three are hypothetical so we can make no further inferences about their putative function. 
The remaining two are of particular interest, one being a homologue of a very diverse protein family which includes venom allergens, sperm coating proteins and developmentally regulated proteins of primitive chordates. The precise function in any of these contexts remains elusive, but most pertinent to our investigation is the exploration of venom allergen homologues in the human hookworm as vaccine candidates (Asojo et al., 2005). The other putative secreted protein, Antigen 5 (Ag5), has a homologue within the Phylum Platyhelminthes which is secreted by the hydatid cyst stage of the tapeworm Echinococcus granulosus, where it is both immunogenic and potentially immunomodulatory (Lorenzo et al., 2003). Possession of the full-length sequences for other hypothetical proteins, e.g. those identified by ArrayMiner in clusters 11 and 59, should reveal whether they encode a signal sequence that would target them for secretion or membrane insertion.

Ultrastructural studies two decades ago (Crabtree and Wilson, 1986) revealed that the lung schistosomulum of S. mansoni possessed virtually no unique cellular features, apart from a single secretory inclusion, the homogeneous body present in the tegument of the intravascular larva. The only suggestion made for the potential function of this inclusion was to release proteins that might act as ‘lubricants’ to ease passage along capillaries (Crabtree and Wilson, 1986). It appears paradoxical that such an unprepossessing life cycle stage should serve as a target for protective immune responses. However, the paucity of distinguishing features has actually made the task of antigen identification more straightforward, as there are relatively few novel targets from which to select molecules for detailed investigation. Our microarray analysis has identified a manageable number of genes preferentially expressed in the lung schistosomulum, which are now the subjects of further characterisation.

Acknowledgements

Gary Dillon is in receipt of a BBSRC postgraduate studentship. Matloob Qureshi assisted with data submission to ArrayExpress. The project received an allocation of funding from the Pathogen Sequencing Advisory Group at the Wellcome Trust Sanger Institute.

Supplementary Data

Supplementary data associated with this article can be found, in the online version, at doi:10.1016/j.ijpara.2005.10.008.

References

Asojo, O.A., Goud, G., Dhar, K., Loukas, A., Zhan, B., Deumic, V., Liu, S., Borgstahl, G.E., Hotez, P.J., 2005. X-ray structure of Na-ASP-2, a pathogenesis-related-1 protein from the nematode parasite, Necator americanus, and a vaccine antigen for human hookworm infection. J. Mol. Biol. 346, 801–814.
Basch, P.F., 1981. Cultivation of Schistosoma mansoni in vitro. I. Establishment of cultures from cercariae and development until pairing. J. Parasitol. 67, 179–185.
Braschi, S., Curwen, R.S., Ashton, P.D., Verjovski-Almeida, S., Wilson, R.A., in press. The tegument surface membranes of the human blood parasite Schistosoma mansoni: a proteomic analysis after differential extraction. Proteomics.
Clegg, J.A., 1965. In vitro cultivation of Schistosoma mansoni. Exp. Parasitol. 16, 133–147.
Coulson, P.S., 1997. The radiation-attenuated vaccine against schistosomes in animal models: paradigm for a human vaccine? Adv. Parasitol. 39, 271–336.
Crabtree, J.E., Wilson, R.A., 1980. Schistosoma mansoni: a scanning electron microscope study of the developing schistosomulum. Parasitology 81, 553–564.
Crabtree, J.E., Wilson, R.A., 1986. Schistosoma mansoni: an ultrastructural examination of pulmonary migration. Parasitology 92, 343–354.
Harrop, R., Coulson, P.S., Wilson, R.A., 1999. Characterization, cloning and immunogenicity of antigens released by lung-stage larvae of Schistosoma mansoni. Parasitology 118, 583–594.
Harrop, R., Wilson, R.A., 1993. Protein synthesis and release by cultured schistosomula of Schistosoma mansoni. Parasitology 107, 265–274.
King, C.H., Dickman, K., Tisch, D.J., 2005. Reassessment of the cost of chronic helmintic infection: a meta-analysis of disability-related outcomes in endemic schistosomiasis. Lancet 365, 1561–1569.
Kooperberg, C., Fazzio, T.G., Delrow, J.J., Tsukiyama, T., 2002. Improved background correction for spotted DNA microarrays. J. Comput. Biol. 9, 55–66.
Lawson, J.R., Wilson, R.A., 1980. Metabolic changes associated with the migration of the schistosomulum of Schistosoma mansoni in the mammal host. Parasitology 81, 325–336.
Lorenzo, C., Salinas, G., Brugnini, A., Wernstedt, C., Hellman, U., Gonzalez-Sapienza, G., 2003. Echinococcus granulosus antigen 5 is closely related to proteases of the trypsin family. Biochem. J. 369, 191–198.
Lu, Z., Szafron, D., Greiner, R., Lu, P., Wishart, D.S., Poulin, B., Anvik, J., Macdonell, C., Eisner, R., 2004. Predicting subcellular localization of proteins using machine-learned classifiers. Bioinformatics 20, 547–556.
Smyth, G.K., 2004. Linear models and empirical Bayes methods for assessing differential expression in microarray experiments. Stat. Appl. Genet. Mol. Biol. 3, Article 3.
Verjovski-Almeida, S., DeMarco, R., Martins, E.A., Guimaraes, P.E., Ojopi, E.P., Paquola, A.C., Piazza, J.P., Nishiyama Jr., M.Y., Kitajima, J.P., Adamson, R.E., Ashton, P.D., Bonaldo, M.F., Coulson, P.S., Dillon, G.P., Farias, L.P., Gregorio, S.P., Ho, P.L., Leite, R.A., Malaquias, L.C., Marques, R.C., Miyasato, P.A., Nascimento, A.L., Ohlweiler, F.P., Reis, E.M., Ribeiro, M.A., Sa, R.G., Stukart, G.C., Soares, M.B., Gargioni, C., Kawano, T., Rodrigues, V., Madeira, A.M., Wilson, R.A., Menck, C.F., Setubal, J.C., Leite, L.C., Dias-Neto, E., 2003. Transcriptome analysis of the acoelomate human parasite Schistosoma mansoni. Nat. Genet. 35, 148–157.
Wadenback, J., Clapham, D.H., Craig, D., Sederoff, R., Peter, G.F., von Arnold, S., Egertsdotter, U., 2005. Comparison of standard exponential and linear techniques to amplify small cDNA samples for microarrays. BMC Genomics 6, 61–68.
Zeeberg, B.R., Feng, W., Wang, G., Wang, M.D., Fojo, A.T., Sunshine, M., Narasimhan, S., Kane, D.W., Reinhold, W.C., Lababidi, S., Bussey, K.J., Riss, J., Barrett, J.C., Weinstein, J.N., 2003. GoMiner: a resource for biological interpretation of genomic and proteomic data. Genome Biol. 4, R28.