cut-1-like genes of Ascaris lumbricoides



Gene 193 (1997) 81–87

cut-1-like genes of Ascaris lumbricoides
Mohammed Timinouni a, Paolo Bazzicalupo b,*
a Institut Pasteur du Maroc, Casablanca, Morocco
b Istituto Internazionale di Genetica e Biofisica, via G. Marconi 10, 80125 Napoli, Italy
Received 28 October 1996; received in revised form 23 December 1996; accepted 8 January 1997

Abstract
Three genomic fragments homologous to cut-1 of Caenorhabditis elegans (C. elegans) have been identified in the intestinal parasitic nematode Ascaris lumbricoides (A. lumbricoides). Two of these fragments identify one region of the A. lumbricoides genome; they are separated by 8–9 kb and have opposite orientations, with the directions of transcription converging toward the center of the region. The third gene, which has been studied more completely, lies in a different region of the genome, separated from the first one by at least 12–15 kb. The complete genomic sequence of this third gene has been determined. Overlapping cDNA clones were obtained from adult A. lumbricoides RNA via the rapid amplification of cDNA ends (RACE) procedure [Frohman et al., 1988. Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85, 8998–9002] and sequenced. The mature mRNA of this gene, which we have named ascut-1, is trans-spliced to the spliced leader sequence of nematodes (SL1) [Krause, M., Hirsh, D., 1987. A trans-spliced leader sequence on actin mRNA in C. elegans. Cell 49, 753–761]. The mRNA is 1684 nt long plus the poly(A) tail and contains four exons, with a 138-nt untranslated 5′ leader and a 388-nt untranslated 3′ tail. Conceptual translation of the coding sequence shows a protein of 385 aa with a signal peptide of 16 aa. The protein shows very high homology with CECUT-1, the product of the C. elegans gene cut-1, and with other cuticlin proteins of nematodes. A 262-aa region that is strongly conserved between these proteins seems to identify a group of proteins, so far restricted to nematodes, for which the name CUT-1-like is proposed.
© 1997 Elsevier Science B.V.
Keywords: Cuticle; Cuticlin; Parasite; Nematodes; CECUT-1; C. elegans

* Corresponding author. Tel.: +39 81 7257281; fax: +39 81 5936123; e-mail: [email protected]

Abbreviations: aa, amino acid(s); bp, base pair(s); kb, kilobase(s) or 1000 bp; nt, nucleotide(s); C. elegans, the free living nematode Caenorhabditis elegans; A. lumbricoides, the intestinal parasitic nematode Ascaris lumbricoides; M. artiellia, the plant parasitic nematode Meloidogyne artiellia; CECUT-1, ASCUT-1 and MACUT-1, cuticlin proteins: the products of the genes cut-1 of C. elegans, ascut-1 of A. lumbricoides and macut-1 of M. artiellia, respectively; CECUT-3, the name we have given to F22B5.3, which is one of the deduced proteins coded for by cosmid F22B5, sequenced by the C. elegans genome sequencing project; SL1, spliced leader sequence 1 of nematodes; ORF, open reading frame; SDS, sodium dodecyl sulfate.

0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0378-1119(97)00089-9

1. Introduction
The nematode cuticle is a multi-layered, elastic, extracellular structure which surrounds the animal and functions as an exoskeleton, determining the body shape of the worm. In addition to its mechanical functions, it also mediates the metabolic interaction of the animal with its environment and, in parasitic nematodes, the interaction with the host and its immune system. Its major components are collagen-like proteins which can be extracted by a combination of strong ionic detergents and disulfide-reducing agents (Cox et al., 1981; Politz and Philipp, 1992; Kramer, 1994). The insoluble residue left after extraction is resistant to collagenase and is named cuticlin. Cuticlin was first described as the insoluble residue of adult Ascaris cuticles; its amino acid composition and its structure by X-ray diffraction indicated that it is composed of a mixture of highly crosslinked proteins different from collagens (Fujimoto and Kanaya, 1973). We have identified and cloned two genes, cut-1 and cut-2, of the free living nematode C. elegans. Their products, CECUT-1 and CECUT-2, are both components of the cuticlin residue but have no sequence similarity except for a short stretch of CECUT-1 (~40 aa long) which is rich in alanines and prolines and resembles the motif repeated several times in CECUT-2


M. Timinouni, P. Bazzicalupo / Gene 193 (1997) 81–87

(Sebastiano et al., 1991; Lassandro et al., 1994). The cloned genes have provided the opportunity to obtain recombinant cuticlin proteins and to study, for the first time, some of the biochemical properties of cuticlin components, in particular the mechanism of their crosslinking (Parise and Bazzicalupo, 1997). In addition, specific antibodies have been raised against the recombinant proteins and have been used to localise the corresponding proteins within the cuticle layers (Ristoratore et al., 1994; Favre et al., 1995). While CECUT-2 is a component of the cuticle of all stages of C. elegans (Lassandro et al., 1994), CECUT-1 is present only in the cuticle of dauer larvae, where it forms two ribbons running the whole length of the worm, underneath the lateral alae (Sebastiano et al., 1991). A fragment of cut-1 was used to identify a homologous gene from the plant parasitic nematode Meloidogyne artiellia (M. artiellia) (De Giorgi et al., 1996), and experiments with specific antibodies have revealed that CECUT-1 crossreacting epitopes are present in the cuticle of at least one other nematode, Heterorhabditis sp. (Favre et al., 1995). It is therefore of interest to know whether cuticlin-like genes are present in all nematodes, and particularly in animal parasitic species. In this paper we report the identification of three cut-1-like genes from the intestinal nematode A. lumbricoides and the complete structural organisation of one of them.

2. Materials and methods

2.1. Library screening
An A. lumbricoides genomic library in lambda EMBL3A was a gift from Fritz Mueller, Fribourg, Switzerland. It was screened with a 1.8-kb EcoRI fragment from cut-1 of C. elegans. The probe was radiolabelled by random primed DNA synthesis; hybridisation was in 3×SSC at 65°C; final washes were in 0.1% SDS, 0.5×SSC at 65°C.

2.2. Primers
General primers for RACE, AP and dT-AP, were those of Frohman et al. (1988). The gene-specific primers were designed on the basis of the genomic sequence and are listed in Table 1. The nucleotides to which they are identical or complementary are those of the sequence of Fig. 3, and their orientations and approximate positions are indicated in the scheme of the gene in the same figure.

Table 1
Primer | Sequence | Description
AP | 5′-GACTCGAGTCGACATCG | adaptor, contains an XhoI and a SalI site
dT-AP | 5′-GACTCGAGTCGACATCGATTTTTTTTTTTTTTTTT | like AP plus a ClaI site and 17 dT
SL1p | 5′-GCCGGGATCCGGTTTAATTACCCAAGTTTGAG | a GCCG clump, a BamHI site, and 22 nt homologous to the Spliced Leader Sequence 1 of nematodes
a | 5′-GGACAGCCGATACAGTTCGCA | 21 nt identical to nt 3604–3624
b | 5′-GCCGGGATCCCACAAGTGGACTTGCGATTC | a GCCG clump, a BamHI site, and 20 nt identical to 3646–3665
c | 5′-GCCGGGATCCTCGAGGTTGTTCAGCAGGTAT | a GCCG clump, a BamHI site, and 21 nt complementary to 3791–3770
d | 5′-ATTTGTAGACGTGCGCTTCT | 20 nt complementary to 3838–3819
e | 5′-TAATCTGGCACTGATAGAA | 19 nt complementary to 3877–3859

2.3. cDNA clones
The rapid amplification of cDNA ends (RACE) procedure of Frohman et al. (1988) with some modifications

was used to determine the structure of the mRNA. The most important modification concerned the cloning of the 5′ end of the mRNA. Instead of the usual, unbiased procedure of adding a poly(A) tail to the 5′ end of the cDNA via terminal transferase and then amplifying the resulting molecules with a dT-AP primer, we amplified the cDNA directly with a primer, SL1p, homologous to the spliced leader sequence 1 of nematodes, SL1 (Krause and Hirsh, 1987), basing this choice on the fact that more than 50% of nematode mRNAs are trans-spliced to SL1. For cDNA synthesis, 2 µg aliquots of adult A. lumbricoides total RNA, which was a gift of Joyce Moore (Glasgow, UK), were used as template for AMV reverse transcriptase (Promega) following standard procedures. Primers used were random hexamers for the RACE cloning of the 5′ ends and dT-AP for the RACE cloning of the 3′ ends. After stopping the reaction, the mixtures were diluted to 100 µl in water and aliquots were used as template in the PCR amplification steps. The first amplification step for cloning the 5′ ends was performed with primers e and SL1p. The amplification products were analyzed by Southern blot hybridisation using primer a as probe. Fragments which were positive were excised from the gel and amplified again using SL1p and a second, nested, gene-specific primer, c. The products were analyzed by agarose gel electrophoresis and hybridisation to a. The positive bands were excised from the gel, digested with BamHI and ligated into pUC18. After transformation, the clones containing the appropriate inserts were identified by hybridisation to a and the inserts were sequenced. To clone the 3′ end, the dT-AP primed cDNA was used. The procedure was similar to the one used to clone the 5′ end except that the non-specific primer was AP instead of SL1p; the gene-specific primers were a in


the first amplification step and b in the second, and the hybridising probe was primer d.

2.4. Sequence determination
DNA sequences were determined by the dideoxynucleotide chain termination method using a Sequenase kit from USB and an ABI 373A sequencer. All sequences were determined on both strands after subcloning fragments of appropriate size. For the genomic sequence, specifically synthesised gene-specific primers were also used.

2.5. Computer analysis of the sequences
PSORT is a specialised expert system for the prediction of protein localisation sites in cells (Nakai and Kanehisa, 1992). A WWW version of this program was used to predict the signal peptides in the proteins and their potential cleavage sites. The alignment of the four protein sequences was initially done using version 1.6 of Clustal W (Thompson et al., 1994). Fig. 4 incorporates some manual adjustments to the Clustal W output that appear to improve the alignment.
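As a quick consistency check on Table 1, the tailed primers SL1p, b and c can be split into the 5′ clump, the BamHI site and the template-matching part, and the length of that last part compared with the table's description. The sketch below is illustrative only: the helper names (`split_tailed_primer`, `reverse_complement`) are hypothetical and not part of the original work, and the clump is taken to be GCCG because all three sequences as printed begin with those four bases.

```python
# Primer sequences copied from Table 1 (written 5'->3').
BAMHI = "GGATCC"  # BamHI recognition site
CLUMP = "GCCG"    # assumption: the 5' clump, read off the printed sequences

# name -> (full primer, described length of the template-matching part)
TAILED_PRIMERS = {
    "SL1p": ("GCCGGGATCCGGTTTAATTACCCAAGTTTGAG", 22),
    "b": ("GCCGGGATCCCACAAGTGGACTTGCGATTC", 20),
    "c": ("GCCGGGATCCTCGAGGTTGTTCAGCAGGTAT", 21),
}


def split_tailed_primer(seq):
    """Split a tailed primer into (clump, BamHI site, template-matching part)."""
    tail = CLUMP + BAMHI
    if not seq.startswith(tail):
        raise ValueError("primer does not start with the expected 5' tail")
    return CLUMP, BAMHI, seq[len(tail):]


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


for name, (seq, described_len) in TAILED_PRIMERS.items():
    _, _, specific = split_tailed_primer(seq)
    status = "matches" if len(specific) == described_len else "differs from"
    print(f"{name}: {specific} ({len(specific)} nt) {status} Table 1")

# Primer c is complementary to the genomic sequence, so its reverse
# complement reads in the sense-strand orientation.
sense_c = reverse_complement(split_tailed_primer(TAILED_PRIMERS["c"][0])[2])
```

The same split could be applied to any further tailed primer, provided it carries the same 10-nt tail.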

3. Results and discussion

3.1. cut-1 homologues in A. lumbricoides
2×10^5 plaques of the A. lumbricoides genomic library were screened with an EcoRI fragment of C. elegans cut-1 containing the second and third exons of the gene. After two rounds of purification, eight phages were positive and were further analyzed. The DNA from the phages was prepared and analyzed by restriction digestion and hybridisation to the same probe used for the screening (Fig. 1). The positive phages fall into two groups: three phages contain only one hybridising EcoRI band of about 4 kb; a second group of four phages contains instead two EcoRI fragments, of 1.8 and 3.3 kb, which hybridise to the probe. Phage B5, which in this experiment showed no clear positive hybridisation, was analyzed again and gave a hybridisation pattern similar to that of phages B1, B2 and B6 (not shown). Restriction maps of two representative phages are shown in Fig. 2 and indicate that the two groups of phages represent two different genomic regions, which we call region 1, represented by phage B1, and region 2, represented by phage B7. No phage linking these two regions has been isolated. Restriction analysis and hybridisation to the cut-1 probe of phage B7 indicate that two different cut-1 homologous sequences are present in region 2 and that they are separated by about 8 kb. Partial sequencing of

Fig. 1. Restriction digestion and Southern blot analysis of the DNAs of λ phages from the A. lumbricoides genomic library. B1 to B8 are the names given to the phages which were positive after two rounds of purification. Digestion was with EcoRI; the probe was a C. elegans cut-1 DNA fragment containing the second and third exons of the gene. Hybridisation conditions are described in Section 2.1.

hybridising subclones of this phage has provided the evidence for the converging orientation of these sequences, which is shown in Fig. 2. Both these cut-1-like sequences contain stretches of high homology to exon three of C. elegans cut-1 (not shown). The structure of the gene from region 1, phage B1, was studied more completely. The 4.1-kb EcoRI fragment hybridising to the probe was restriction mapped and completely sequenced. In addition, to cover the region containing the 5′ end of the message (see Section 3.2), the next two EcoRI genomic fragments were also sequenced. A total of 5792 bases of genomic sequence are shown in Fig. 3. The sequence has been submitted to GenBank with the name ascut-1 and has been given accession No. U73005.

3.2. Structure of the mRNA
RNA from adult A. lumbricoides was analyzed in Northern blots using as a probe an XhoI-SacI fragment of about 1000 bp subcloned from the 4.1-kb EcoRI fragment from phage B1. Even with a probe of very high specific activity and a load of 20 µg of total RNA, no specific signal could be detected with this approach. To increase sensitivity and to determine the structure of the specific mRNA, we successfully followed the procedure of Frohman et al. (1988) for the rapid amplification of cDNA ends (RACE) with the modifications described in Section 2.3.


Fig. 2. Restriction maps of phages B1 and B7. Black boxes indicate the smallest restriction fragments hybridising to the C. elegans cut-1 probe, the same as that in Fig. 1. Boxes filled with diagonal lines indicate regions that have been sequenced. The arrows indicate the orientation of the sense strand relative to the restriction map. R, EcoRI; H, HindIII; P, PstI; K, KpnI; Sa, SacI; Xb, XbaI; X, XhoI; S, SalI; EV, EcoRV.

The longest insert present in the 5′ end RACE clones was 810 bp. The sequence starts with SL1, which is spliced to nt 766 of Fig. 3. Two exons of, respectively, 189 and 389 nt follow SL1. The insert also contains the first 210 nt of the third exon and, as expected, terminates with the sequence corresponding to the primer used in the last amplification step. After the first amplification step, hybridisation with the specific probe detected, in addition to the main band, two fainter bands of smaller size. These bands were also re-amplified and cloned. Their sequences show that they correspond to mRNAs in which SL1 is spliced precisely at the beginning of the second or third exon, giving rise to inserts of 622 and 232 bp, respectively. RACE cloning of the 3′ end generated only clones with one type of insert, of approximately 1000 bp. When the inserts of three such clones were sequenced, comparison with the genomic sequence revealed the presence of one intron, from nt 4084 to nt 4498 of Fig. 3. The first 147 nt of these clones overlap with the last 147 nt of the 5′ end clones. The clones terminate with stretches of 18 to 23 As following nucleotide 5081 of the genomic sequence. Aligning the combined sequences of the cDNA clones to the genomic one provides the intron-exon structure

of the gene, which is represented in the diagram at the bottom of Fig. 3. The four exons present in the longest mRNA (1684 nt) are 211 (including 22 nt of SL1), 389, 501 and 583 nucleotides long. The first, second and third introns are, respectively, 1896, 343 and 415 nucleotides long. Two additional mature mRNAs have been detected in the adult A. lumbricoides RNA that we have examined. We do not know the biological significance of either of these shorter and less abundant mRNAs, which also begin with SL1 and are, respectively, 1495 and 1106 nt long. Because of trans-splicing, transcription initiation does not coincide with the 5′ end of the mRNAs, and we still do not know where transcription actually begins.

3.3. Features of the predicted protein ASCUT-1
Conceptual translation of the longest mRNA sequence shows an open reading frame (ORF) of 1155 nt coding for a predicted protein of 385 aa. The translation start codon of this protein, which we call ASCUT-1, is 138 nt from the first nucleotide of SL1, at nt 882 of Fig. 3. A 388-nt 3′ non-translated region follows the stop codon of the ORF (nt 4691–4693 of Fig. 3). Searches of various databases with the sequence of


Fig. 3. Genomic sequence of the gene ascut-1. Exons are in capital letters. The genomic sequence was determined on subclones derived from phage B1. The intron-exon junctions and the structure of the mRNA derive from the comparison of the genomic sequence with the sequence of the cDNA clones (Section 3.2). The start and stop codons of the putative protein are underlined. This sequence is deposited in GenBank and has been given accession No. U73005. In the scheme, boxes represent exons: the empty parts correspond to the untranslated regions, while the parts filled with diagonal lines represent protein coding regions. The arrows a, b, c, d and e indicate the direction and approximate positions of the gene-specific primers used for RACE.

ASCUT-1 have shown that very strong homologies exist with CECUT-1 and with some hypothetical proteins, which also share homology with CECUT-1 and which are deduced from the sequences generated by the C. elegans Genome Sequencing Project (Wilson et al., 1994). In addition there is very significant homology

with the deduced amino acid sequence of the cut-1 homologue of the plant parasitic nematode M. artiellia. In Fig. 4 the A. lumbricoides protein is aligned with the sequences of CECUT-1, MACUT-1 (De Giorgi et al., 1996) and CEF22B5.3 (accession No. Z50044) of C. elegans, which we have named CECUT-3. This latter is,


Fig. 4. Multiple alignment of CUT-1-like proteins of nematodes. ASCUT-1 is the putative protein coded for by the gene described in this paper. CECUT-1 is the first cuticlin protein studied and is described in Sebastiano et al. (1991). CECUT-3 is a cuticlin-like protein of C. elegans coded for by CEF22B5.3, accession No. Z50044, sequenced by the C. elegans Genome Sequencing Project. MACUT-1 is a cuticlin homologue of the nematode M. artiellia (De Giorgi et al., 1996). The alignment was obtained as described in Section 2.5. The downward-pointing arrows indicate the positions of the putative signal peptide cleavage sites. Black diamonds indicate the positions of the introns; black circles mark the positions of the conserved cysteines. The region rich in basic aa, residues 330 to 345, is underlined.

among the cuticlin-like proteins of C. elegans, the one most similar to CECUT-1 and ASCUT-1. Several features of the proteins depicted in Fig. 4 are of interest. All four proteins begin with a putative cleavable signal peptide, which ranges from 16 aa for ASCUT-1 to 22 aa for MACUT-1 (Nakai and Kanehisa, 1992). The signal peptide is followed by the region that is most conserved between these proteins; the region is 262 residues long and contains 12 cysteines whose positions are conserved in all four proteins. In this region ASCUT-1 is 90%, 87% and 84% identical to CECUT-1, CECUT-3 and MACUT-1, respectively; in addition, most of the substitutions are conservative. Past this strongly homologous region, MACUT-1 and CECUT-1 contain a region of 36 and 43 aa, respectively, which is rich in alanine and proline and which is reminiscent of the repetitive motif of the central region of CECUT-2 (Lassandro et al., 1994). No such region is present in either CECUT-3 or ASCUT-1; hence the gap in their aligned sequences. Past this gap, homology between all four proteins resumes with a region, residues 330 to 345, which is rich in arginine, lysine and glutamine and which may be a site for protease processing, not unlike the processing site present in cuticle collagens of nematodes (Kramer, 1994). The terminal 100 or so residues of the sequences also show homology, albeit at a much lower level. In Fig. 4, the positions of the introns in the different genes are also marked (black diamonds).

While intron positions are completely conserved between the two C. elegans genes, only the position of the last intron seems to be conserved in all four genes. macut-1 has only two introns while the other genes have three; finally the positions of the first intron of macut-1 and of the second intron of ascut-1 are the same. At the moment we do not know whether the two shorter and less abundant mRNAs detected by the RACE procedure are translated or not. The longest ORFs in these two mRNAs are in the same frame as the one of the complete messenger and the first methionines are in such positions that, if translated, these mRNAs would produce proteins of 264 and 154 aa, respectively.
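The length bookkeeping reported in Sections 3.2 and 3.3 can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the text; it rests on two assumptions that the text does not state explicitly: that nt 766 is the first genomic nucleotide of exon 1 (the position joined to SL1), and that the 1155-nt ORF excludes the stop codon (385 codons × 3). The function and variable names are illustrative.

```python
SL1_LEN = 22                   # nt of SL1 included in the first exon of the mRNA
SPLICE_ACCEPTOR = 766          # assumption: first genomic nt joined to SL1 (Fig. 3 numbering)
EXONS = [189, 389, 501, 583]   # genomic exon lengths; exon 1 given without SL1 (211 - 22)
INTRONS = [1896, 343, 415]     # intron lengths in order


def exon_coordinates(start, exons, introns):
    """Return 1-based inclusive (start, end) genomic coordinates of each exon."""
    coords = []
    pos = start
    for i, length in enumerate(exons):
        coords.append((pos, pos + length - 1))
        pos += length
        if i < len(introns):
            pos += introns[i]  # skip the intron that follows this exon
    return coords


def mrna_length(first_exon_index):
    """Length of an mRNA in which SL1 is spliced onto the given exon (0-based)."""
    return SL1_LEN + sum(EXONS[first_exon_index:])


coords = exon_coordinates(SPLICE_ACCEPTOR, EXONS, INTRONS)
lengths = [mrna_length(i) for i in range(3)]

# Partition of the longest mRNA: 5' leader, ORF, stop codon, 3' tail.
FIVE_UTR, ORF, STOP, THREE_UTR = 138, 1155, 3, 388
start_codon = SPLICE_ACCEPTOR + (FIVE_UTR - SL1_LEN)   # genomic position of the ATG
stop_codon_start = coords[3][1] - THREE_UTR - STOP + 1  # first nt of the stop codon

print("exon coordinates:", coords)
print("mRNA lengths:", lengths)
print("start codon at nt", start_codon, "; stop codon at nt", stop_codon_start)
```

Under these assumptions the computed values agree with the positions quoted in the text: the third intron spans nt 4084 to 4498, the last exon ends at nt 5081, the start codon falls at nt 882 and the stop codon at nt 4691–4693.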

4. Conclusions
(1) We identified three regions of the A. lumbricoides genome with homology to the cuticlin gene cut-1 of C. elegans. Two of these regions are close to each other in the genome and their sequences have opposite orientations. At present it is not known whether these are genes or pseudogenes, because their characterisation is still incomplete and it is not known whether they are transcribed. The third region is separated from these two by more than


12–15 kb and corresponds to a transcribed gene which we name ascut-1.
(2) The pattern of transcription of ascut-1 has not been studied in this work. We know that the amount of specific mRNA present in the RNA prepared from whole adult worms is extremely low, as it was undetectable by Northern blot analysis and two rounds of amplification were necessary to clone the cDNAs in the RACE procedure. However, via this procedure, it was possible to determine the structure of three specific mRNAs present in the adult A. lumbricoides. Like C. elegans cut-1, and like more than 50% of nematode messengers, ascut-1 mRNAs are trans-spliced to SL1. It should be kept in mind, however, that the procedure used (5′ RACE using SL1p) could only detect SL1 trans-spliced molecules; thus, the presence or absence of non-trans-spliced variants has not been addressed. It is possible that, like cut-1 in C. elegans, ascut-1 is not functionally expressed in adults and that the few mRNA molecules detected, including the shorter variants, are only biologically irrelevant remnants of an earlier expression. A detailed analysis of the RNA from different stages and tissues of the parasite will be necessary to establish the pattern of expression of this gene.
(3) The gene studied codes for a 385 aa protein, ASCUT-1, which is strongly conserved when compared to C. elegans CUT-1 and to other nematode cuticlins. Comparison of the sequences of four CUT-1 homologues from three widely different nematodes has allowed the identification of a region, immediately following the signal peptide, which is highly conserved, with 204 aa out of 262 (78%) identical in the four proteins considered. This region is absent in C. elegans CUT-2, a different component of C. elegans cuticlin, and seems to characterise a group of proteins which we propose to call CUT-1-like proteins, while cut-1-like would be the genes that code for them. There are, so far, no sequenced homologues of cut-1-like genes from organisms that are not nematodes.
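Identity figures such as the 204 out of 262 positions quoted in conclusion (3) are counts over the columns of a multiple alignment. The sketch below shows one way such counts could be computed; it is not the procedure used in this work, which is not described beyond the alignment itself. The toy sequences are invented for illustration and are not the real CUT-1-like proteins, and the choice to skip columns containing a gap in pairwise comparisons is an assumption.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Pairwise percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = [(x, y) for x, y in zip(seq_a, seq_b) if x != gap and y != gap]
    if not compared:
        return 0.0
    matches = sum(1 for x, y in compared if x == y)
    return 100.0 * matches / len(compared)


def fully_conserved_columns(aligned, gap="-"):
    """Count alignment columns in which every sequence carries the same residue."""
    return sum(
        1 for column in zip(*aligned)
        if column[0] != gap and len(set(column)) == 1
    )


# Hypothetical toy alignment of four sequences (NOT real data).
toy_alignment = [
    "MKCLAC-T",
    "MKCIACGT",
    "MRCLAC-T",
    "MKCLACAT",
]

conserved = fully_conserved_columns(toy_alignment)
pairwise = percent_identity(toy_alignment[0], toy_alignment[1])
print(f"{conserved} of {len(toy_alignment[0])} columns identical in all four sequences")
print(f"pairwise identity of the first two sequences: {pairwise:.1f}%")

# The percentage quoted in the text follows directly from the counts given there.
reported = 100.0 * 204 / 262
print(f"204 of 262 positions = {reported:.0f}%")
```

With real data, the four aligned sequences restricted to the 262-residue conserved region would take the place of `toy_alignment`.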

Acknowledgement
We thank Teresa Vespa and Rita Vito for technical assistance. This work was supported by a grant from the EEC programme STD-3, contract CT92-0096-I. Support was also received from Consorzio InterUniversitario per la Ricerca nei Paesi in via di Sviluppo (CIRPS), Università di Roma ‘La Sapienza’, and from COMIPA, Roma.

References
Cox, G.N., Kusch, M., Edgar, R.S., 1981. Cuticle of Caenorhabditis elegans: its isolation and partial characterization. J. Cell Biol. 90, 7–17.
De Giorgi, C., De Luca, F., Lamberti, F., 1996. A silent trans-splicing signal in the cuticlin-encoding gene of the plant parasitic nematode Meloidogyne artiellia. Gene 170, 261–265.
Favre, R., Hermann, R., Cermola, M., Hohenberg, H., Muller, M., Bazzicalupo, P., 1995. Immuno-gold-labelling of CUT-1, CUT-2 and cuticlin epitopes in Caenorhabditis elegans and Heterorhabditis sp. processed by high pressure freezing and freeze-substitution. J. Submicrosc. Cytol. Pathol. 27, 341–347.
Frohman, M.A., Dush, M.K., Martin, G.R., 1988. Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85, 8998–9002.
Fujimoto, D., Kanaya, S., 1973. Cuticlin: a noncollagen structural protein from Ascaris cuticle. Arch. Biochem. Biophys. 157, 1–6.
Kramer, J.M., 1994. Structures and function of collagens in Caenorhabditis elegans. FASEB J. 8, 329–336.
Krause, M., Hirsh, D., 1987. A trans-spliced leader sequence on actin mRNA in C. elegans. Cell 49, 753–761.
Lassandro, F., Sebastiano, M., Zei, F., Bazzicalupo, P., 1994. The role of dityrosine formation in the crosslinking of CUT-2, the product of a second cuticlin gene of Caenorhabditis elegans. Mol. Biochem. Parasitol. 65, 147–159.
Nakai, K., Kanehisa, M., 1992. A knowledge base for predicting protein localization sites in eukaryotic cells. Genomics 14, 897–911.
Parise, G., Bazzicalupo, P., 1997. Assembly of nematode cuticle: role of hydrophobic interactions in CUT-2 cross-linking. Biochim. Biophys. Acta 1337, 295–301.
Politz, S.M., Philipp, M., 1992. Caenorhabditis elegans as a model for parasitic nematodes: a focus on the cuticle. Parasitol. Today 8, 6–12.
Ristoratore, F., Cermola, M., Nola, M., Bazzicalupo, P., Favre, R., 1994. Ultrastructural immuno localization of CUT-1 and CUT-2 antigenic sites in the cuticles of the nematode Caenorhabditis elegans. J. Submicrosc. Cytol. Pathol. 26, 437–443.
Sebastiano, M., Lassandro, F., Bazzicalupo, P., 1991. cut-1, a Caenorhabditis elegans gene coding for a dauer-specific non collagenous component of the cuticle. Dev. Biol. 146, 519–530.
Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680.
Wilson, R., Ainscough, R., Anderson, K., Baynes, C., Berks, M. et al., 1994. The C. elegans genome sequencing project: contiguous nucleotide sequence of over two megabases from chromosome III. Nature 368, 32–38.