Molecular characterization of the C-3 DNA puff gene of Rhynchosciara americana


Gene 193 (1997) 163–172

Luiz O.F. Penalva a,1, Jonny Yokosawa a, Ann J. Stocker a, Maria Albertina M. Soares a, Monika Graessmann b, T. Cristina Orlando c, Carlos E. Winter c, Luisa M. Botella d, Adolf Graessmann e, Francisco J.S. Lara a,*

a Departamento de Biologia, Instituto de Biociências, Universidade de São Paulo, C.P. 11.461, CEP 05422-970, São Paulo, Brazil
b Fachbereich Medizinische Grundlagenfacher, Institute of Molecular Biology and Biochemistry, F.U. Berlin, Arnimallee 22, 14195, Berlin, Germany
c Departamento de Parasitologia, Instituto de Ciências Biomédicas, Universidade de São Paulo, C.P. 66208, CEP 05389-970, São Paulo, Brazil
d Centro de Investigaciones Biologicas, Velasquez 144, Madrid 28006, Spain
e Institut für Molekular Biologie und Biochimie der Freien Universität Berlin, Berlin, Germany

Received 18 October 1996; received in revised form 3 January 1997; accepted 8 January 1997; Received by U.K. Laemmli

Abstract

We have mapped a region of about 33 kb which includes the transcription unit (TU) of the C-3 DNA puff gene of Rhynchosciara americana. The C-3 TU and a region extending approximately 800 bp upstream of the C-3 promoter were characterized. The TU is composed of three exons and produces a 1.1-kb mRNA whose level in salivary glands increases with the expansion of the C-3 puff. The C-3 messenger appears to undergo rapid deadenylation, resulting in an RNA of about 0.95 kb which can still be observed in gland cells 15 h after the puff has regressed. The 1.1-kb mRNA codes for a 32.4-kDa, predominantly alpha-helical polypeptide with three conserved parallel coiled-coil stretches. The amino acid (aa) composition and structure of this polypeptide suggest that it is secreted and contributes to the formation of the cocoon in which the larvae pupate. The region upstream of the promoter contains several A-rich sequences with similarity to the ACS of yeast, which might have a role in the initiation of replication/amplification. © 1997 Elsevier Science B.V.

Keywords: Diptera; Sciaridae; Salivary gland; Gene amplification; Coiled-coil protein

* Corresponding author. Tel.: +55 11 8187572; Fax: +55 11 8187573; e-mail: [email protected]
1 Present address: Gene Expression Programme, European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany.

Abbreviations: ARS, autonomously replicating sequence; ACS, ARS consensus sequence; BhC4, BhB10, B. hygida DNA puff C4 and B10 genes; ORF, open reading frame; Ra, R. americana; RaC3, RaC8, R. americana DNA puff C-3 and C-8 genes; ScII/9-1 and 2, S. coprophila DNA puff II/9A genes; TpC4B, T. pubescens DNA puff C4B gene; TU, transcription unit.

0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0378-1119(97)00104-2

1. Introduction

The so-called DNA puffs which form at discrete regions of sciarid salivary gland chromosomes at the end of the larval stage are sites at which there is an increased synthesis of DNA (DNA sequence amplification) as compared with other chromosome regions. This site-specific DNA synthesis provides a system in which the mechanism of DNA replication and its relationship with transcription can be studied. Origins of replication have now been mapped for two DNA puffs (Liang et al., 1993; Yokosawa, 1995) and their control is currently being examined (reviewed in Gerbi et al., 1993). DNA puffs were initially observed in Rhynchosciara americana by increased Feulgen staining (Breuer and Pavan, 1955) and incorporation of 3H-thymidine (Ficq and Pavan, 1957) at the puff sites. They were correlated with the increased production of certain secretory proteins by Winter et al. (1977a,b) and subsequently shown to be sites of intensive messenger RNA synthesis (Bonaldo et al., 1979; reviewed in Lara et al., 1991). The C-3 DNA puff is one of the largest puffs observed


L.O.F. Penalva et al. / Gene 193 (1997) 163–172

in R. americana. It forms in all salivary gland cells, but appears earlier and is considerably larger in the medial and distal cells (Stocker et al., 1984). Its amplification and puffing seem to originate at a single, very thin band (Breuer and Pavan, 1955). According to Glover et al. (1982), sequences corresponding to its messenger amplify 16 times. In the present investigation, we have characterized the C-3 transcription unit as well as the promoter region and its upstream sequence, which should contain regions involved in amplification.

2. Materials and methods

2.1. cDNA library

A cDNA library was made by Stratagene (La Jolla, CA, USA) with λUniZAP™ as vehicle, using an RNA preparation extracted from salivary glands of 6th-period larvae, when the C-3 puff is expanded (Terra et al., 1973). Fifteen hundred thousand (1.5 × 10^6) recombinants from this library were screened with the insert from pRaY19A (Millar et al., 1985).

2.2. Genomic library

This library was prepared with DNA from salivary glands of 5th-period larvae (Terra et al., 1973) using Lambda Dash II™ as vector and the Lambda Dash II/BamHI cloning kit (Stratagene). Genomic clones were obtained by screening the library with the cDNA, pRaC3-22 (this paper). The clones used in this paper are shown in Fig. 1.

Fig. 1. Schematic representation of the mapped region around the TU of the R. americana C-3 gene. The genomic clones mentioned in this paper are aligned beneath the composite restriction map. Direction of transcription is indicated by an arrow.

2.3. The C-3 puff transcript analysis

Total salivary gland RNA from stages before, during and after the expansion of the C-3 DNA puff was electrophoresed according to the method of Kroczek and Siebert (1990). The presence of RNA transcripts from the puff region was examined by Northern blotting. The membranes were probed with the insert from the cDNA clone, pRaC3-22, and the genomic clones λRA8B and λRA16. The number and size of the exons, as well as the direction of transcription along the genomic clones, were determined by RNase protection (Ausubel et al., 1989). This technique also allowed the site where C-3 transcription is initiated to be estimated. The length of the poly(A) tail was determined by the method of Vournakis et al. (1975). Globin mRNA was used as a control for RNase H activity.

2.4. In situ hybridization

The location of the cDNA as well as the genomic clones selected in the screenings was checked by in situ hybridization to R. americana polytene chromosomes using the method described in Stocker et al. (1993).

2.5. Sequencing of the cDNA and genomic clones

cDNA: Restriction fragments from the cDNA clone were subcloned into the phage vectors M13mp18 and M13mp19 for the preparation of both strands. Sequencing was done using the Taq-21, M13 Dye primer Sequencing Kit and a Model 373 Sequencer (Applied Biosystems). Genomic clones were sequenced using a series of 20-base primers and the Sequenase USB kit (Amersham). Sequencing of subclone pRaHS1.3 (Fig. 1) was done using the automatic 377 DNA sequencer (Applied Biosystems).

2.6. Analysis of sequences

Nucleotide sequence alignment and analysis were initially done with the BLAST programs (Altschul et al., 1990) available through the NCBI servers and refined by hand using the Esee program, version 1.09e. Analysis of the promoter region for palindromic, repetitive and hairpin loop sequences was carried out with the aid of the DNAsis program (Hitachi America, Ltd, Brisbane, CA). The promoter region was analyzed for transcriptional elements using the Matrix Search 1.0 program (Chen et al., 1995). Polypeptide structural analysis was done using the Predict Protein program (Rost and Sander, 1993, 1994) through the server at EMBL. For aa sequence alignment and analysis, the BLOSUM similarity matrix (Henikoff and Henikoff, 1992) and the ClustalW program (Thompson et al., 1994) were used.


Prediction of coiled-coil regions in the aa sequence was made by COILS version 2.2 (Lupas et al., 1991; Lupas, 1996) based on the algorithms of Parry (1982). The predictions shown were made using a window of 14 aa. As suggested by Lupas et al. (1991), we ran the program with two different options, one giving the same weight to all positions in the heptad motif and the other giving different weights to the hydrophobic a and d aa compared with aa at the other positions. Similar results were obtained with both options.
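The windowed, heptad-based scoring described above can be illustrated with a deliberately simplified sketch. This is not the COILS algorithm, which scores residues against a matrix derived from known coiled-coil proteins. Here, as a stated assumption, a residue simply "fits" when a hydrophobic residue occupies an a or d position or a non-hydrophobic residue occupies any other position. The hydrophobic set, the function name `window_scores` and the `ad_weight` parameter are hypothetical choices for illustration; only the 14-residue window and the idea of equal versus heavier weighting of the a and d positions come from the text.

```python
# Toy heptad scoring, NOT the COILS algorithm: it only illustrates a sliding
# 14-residue window evaluated in all seven heptad frames, with an optional
# extra weight on the a and d positions.

HYDROPHOBIC = set("AILMFVWY")  # hypothetical choice of "hydrophobic" residues


def window_scores(seq, window=14, ad_weight=1.0):
    """Return, for each window start, the best weighted fit over 7 frames.

    A residue fits when it is hydrophobic at a heptad position a/d, or
    non-hydrophobic at positions b, c, e, f, g. ad_weight=1.0 treats all
    positions equally; a larger value emphasises the a and d positions.
    """
    seq = seq.upper()
    scores = []
    for start in range(len(seq) - window + 1):
        best = 0.0
        for frame in range(7):
            fitted = 0.0
            total_weight = 0.0
            for offset in range(window):
                heptad_pos = (frame + offset) % 7   # 0 = 'a', 3 = 'd'
                is_core = heptad_pos in (0, 3)
                weight = ad_weight if is_core else 1.0
                is_hydrophobic = seq[start + offset] in HYDROPHOBIC
                if is_hydrophobic == is_core:
                    fitted += weight
                total_weight += weight
            best = max(best, fitted / total_weight)
        scores.append(best)
    return scores


if __name__ == "__main__":
    ideal = "LKELEKK" * 4          # hypothetical idealised heptad repeat
    print(window_scores(ideal)[:3])
    print(window_scores("G" * 14), window_scores("G" * 14, ad_weight=2.5))
```

Running both weightings on the same sequence and comparing the profiles mirrors the check reported above, where the two options gave similar results.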

3. Results and discussion

3.1. Characterization of the transcription unit

Twenty-three positive clones were obtained by screening the cDNA library. Clone pRaC3-22 contained the longest insert and was selected for further analysis. In situ hybridization (Fig. 2a) showed that this clone was complementary only to the C-3 DNA puff region. The cDNA insert was sequenced and found to be 963 bp long. This sequence contains one ORF which gives rise to a 32.4-kDa polypeptide (Fig. 3).

Fig. 2. Hybridization in situ of (a) the cDNA pRaC3-22 to a chromosome in which amplification is just beginning, and (b) the genomic clone λRA16E to a chromosome in which the C-3 puff is fully expanded.

Restriction maps of the three genomic clones that were screened (λRA6E, λRA8B and λRA16E) revealed that they overlap and cover a region of about 33 kb (Fig. 1). All of these clones hybridized in situ in the C-3 puff (Fig. 2b) or its amplified band. Mapping of the cDNA within the genomic clones placed the transcription unit near the center of the mapped genomic region (Fig. 1).

RNase protection assays using the genomic clones λRA6E and λRA8B in individual experiments gave fragments of approximately 0.55, 0.30 and 0.08 kb when hybridized with total RNA at the stage when the C-3 puff was expanded (Fig. 4). When hybridizations were done using anti-sense mRNA transcribed from the cDNA clone, pRaC3-22, the smallest protected fragment was approximately 0.06 kb. Besides indicating that the gene had two introns, these results allowed us to conclude that there was only about a 20-nucleotide difference between the size of the cDNA and the size of the C-3 mRNA. The RNase protection experiments also allowed a determination of the direction of transcription of the gene along the genomic clones (Fig. 1) because of the known orientations of these clones with respect to the T7 and T3 promoters of the vector.

Appropriate regions of pRaY19A were sequenced to locate the two introns and the promoter region of the gene. A subclone of λRA16E, pRaHS1.3 (Fig. 1), was used to give the sequence upstream of the promoter. The complete sequence is presented in Fig. 3. The introns, shown in lower case in Fig. 3, are short and A/T-rich. Intron I has a similar structure and position for all DNA puff genes which have been sequenced (DiBartholomeis and Gerbi, 1989; Frydman et al., 1993; Monesi et al., 1995; Fontes, 1996; Dessen et al., unpublished data). However, intron II is unique to the C-3 DNA puff gene.

3.2. The C-3 promoter and upstream region

A typical TATA box sequence begins at position 786 of the genomic sequence, 68 bp upstream from the ATG start codon (Fig. 3). A sequence similar to the heptanucleotide consensus sequence which determines the initiation of transcription in insects (Hultmark et al., 1986) is observed 27 bp downstream of the first T in the TATA box. According to results of the RNase protection assay (loc. cit.) and considering that transcription of most non-heat shock genes begins at an A, 30±2 bp downstream from the first T in the TATA box (Hultmark et al., 1986), the probable initiation site of the C-3 mRNA has been indicated in boldface within this region (Fig. 3).
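The coordinates quoted above can be cross-checked with a few lines of arithmetic. The sketch below takes the first T of the TATA box (position 786) and the first base of the cDNA (position 837, as stated in the legend to Fig. 3), and asks how far the mRNA would extend beyond the 5′ end of the cDNA if transcription starts 30±2 bp downstream of the TATA box. It assumes that "30 bp downstream" means position 786 + 30; that reading, and the helper name `candidate_start_sites`, are illustrative assumptions rather than anything stated by the sequence data.

```python
# Coordinates quoted in the text (positions in the genomic sequence of Fig. 3).
TATA_FIRST_T = 786   # first T of the TATA box
CDNA_START = 837     # first base of the cDNA, per the Fig. 3 legend


def candidate_start_sites(tata_first_t, offset=30, tolerance=2):
    """Positions lying offset +/- tolerance bp downstream of the first T.

    Assumption: "30 bp downstream" is read as tata_first_t + 30.
    """
    centre = tata_first_t + offset
    return list(range(centre - tolerance, centre + tolerance + 1))


sites = candidate_start_sites(TATA_FIRST_T)
# Bases by which the mRNA would extend beyond the 5' end of the cDNA.
extensions = [CDNA_START - site for site in sites]

print(sites)        # [814, 815, 816, 817, 818]
print(extensions)   # [23, 22, 21, 20, 19]
```

Under this reading the predicted 5′ extension is 19 to 23 nt, which brackets the difference of about 20 nt between the cDNA and the mRNA inferred from the RNase protection fragments (0.08 kb versus 0.06 kb).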
Initiation of transcription for the other sequenced DNA puff genes also begins in or near the heptanucleotide consensus sequence (DiBartholomeis and Gerbi, 1989; Frydman et al., 1993; Monesi et al., 1995; Fontes, 1996; Dessen et al., unpublished data) (Table 1). The regions around the TATA box, the transcription start site and the translation start site contain groups of bases similar to those of other DNA puff genes (Table 1). The highest similarity for this region of the C-3 gene is with the Sciara coprophila II/9-1 and II/9-2 DNA puff genes.

An A-rich sequence is observed at position 752, 34 bp upstream of the first T in the TATA box of the C-3 gene. This sequence has been relatively conserved among all DNA puff genes analyzed (Table 2). The similarity of part of the A-rich sequence with the 11-bp core of the ACS of yeast (van Houten and Newlon, 1990) (Table 2) indicates that it may have a role in initiation of replication/amplification. Bidimensional gel electrophoresis analyses in Sciara coprophila (Liang et al., 1993) and in Rhynchosciara americana (Yokosawa, 1995) have demonstrated that, although there are multiple initiation sites, replication of the II/9-1 and C-3 DNA puff genes might initiate in this part of the promoter. For these two species, the A-rich sequence shows close homology with the ACS of yeast (Table 2). Four other A-rich sequences with structural similarity to the ACS sequence are located further upstream in the promoter of the C-3 gene (Fig. 3, Table 3). Four of the A-rich regions are within large hairpin loops.

Fig. 3. Nucleotide sequence of the C-3 TU and upstream region. The cDNA sequence begins at position 837 of the genomic clone. Intron sequences are indicated in lower case. Other sequences described in the text are boxed (GATA, TGA-containing, TATA box, transcription initiation region, start and stop codons, instability sequence), circled (poly(A) tail signal, A-rich sequences) or underlined (sfg-1 and -2/3/4).  =duplicated heptanucleotide.  ÷=palindromic region. A-rich sequences are labelled 1–5.

The region upstream of this A-rich sequence does not show obvious homology with the regions upstream from the promoters of other DNA puff genes. Even the duplicated Sciara coprophila II/9-1 and II/9-2 genes begin to diverge significantly here. We have analyzed the C-3 promoter with the aid of several computer programs for sequences which might have some control function. The region upstream of the TATA box contains a large number of palindromes, mostly imperfect, and repeats of short sequences of bases (Fig. 3). For example, there are five GATA sequences and near repeats of sequences that begin with the triplet, TGA. The heptanucleotide sequence, AACCAAT, occurs twice in the region between positions 282 and 311. Sequences containing GATA have recently been implicated in the regulation of several Drosophila genes (Abel et al., 1993; Ramain et al., 1993; Winick et al., 1993). Two of the GATA sequences in the C-3 promoter contain the entire consensus, 5′-WGATAR. All other DNA puff genes examined also contained GATA sequences, but usually without the full consensus. In S. coprophila, three sequences containing a consensus TGAMCW have been reported to bind EcR (DiBartholomeis and Gerbi, 1989, in Gerbi et al., 1993). The Matrix Search 1.0 program (Chen et al., 1995) indicated sequences in the C-3 promoter with high homology to a number of transcription factor consensus sequences. Two sequences which begin at 225 and 612 bp (Fig. 3) are homologous with consensus sequences for silk gland factors in the promoter of the Bombyx fibroin gene (Suzuki and Suzuki, 1988; Hui et al., 1990) and might prove interesting in future functional tests.

Fig. 4. RNase protection. Lanes: 1, RNA polymerase T7-transcript of λRA6E hybridized with total RNA isolated from Ra salivary glands during P6 when the C-3 puff is maximally expanded; 2, RNA polymerase T3-transcript of λRA6E hybridized with total P6 salivary gland RNA; 3, RNA polymerase T7-transcript of the genomic clone λRA6E hybridized with the T7-transcript of the cDNA clone, pRaC3-22 (antisense RNA); 4, RNA polymerase T3-transcript of λRA6E hybridized with pRaC3-22 anti-sense RNA. Bands indicated are protected fragments.

3.3. The C-3 mRNA

The C-3 messenger is first detected during late period 4 when its puff begins to expand (Fig. 5). It reaches maximum levels during period 6, when the C-3 puff is fully expanded. Some C-3 messenger can still be detected during period 7, approximately 15 h after its puff has regressed (Fig. 5). These results differ from those of Santelli et al. (1991), who detected C-3 messenger only when its puff was large. The position of the C-3 messenger band on Northerns changes during development. During period 4, the mRNA band extends from about 1.1 to 0.95 kb and a thin, darker band at the 1.1 kb position can sometimes be distinguished (Fig. 5, lane 2). During period 5, when the puff is still expanding, and period 6, when it is maximal, the signal becomes more intense (Fig. 5, lanes 3, 4). After the puff has regressed, only a thin band at about 0.95 kb is observed (Fig. 5, lane 5). When less RNA is loaded on the gel, the period 5 band (Fig. 5, lane 6) is clearly located above the 0.95 kb period 7 band (Fig. 5, lanes 5, 8). This apparent shift of a messenger to a smaller size during the expansion and regression of its puff seems similar to the behavior of the B10 DNA puff mRNA of B. hygida (Fontes, 1996). For the B10 mRNA, the decrease in size is due to the shortening of its poly(A) tail (Fontes, 1996). We therefore compared C-3 messenger size before and after poly(A) tail removal at several different stages of puff development (Fig. 6). At all stages, messengers from which poly(A) tails had been removed were found at a position extending between about 1.0 and 0.9 kb. For RNA samples that had not been subjected to the technique of poly(A) tail removal, stages when the C-3 puff was expanding (Fig. 6, 1a) or

Table 1 TATA, transcription and translation start regions of seven DNA puff genes

TATA box, start of transcription and start of translation are in boldface. R=G, A; Y=C, T; M=A, C; K=G, T; S=G, C; W=A, T; H=A, C, T; B=G, T, C; V=G, C, A; D=G, A, T; N=A, C, G, T. Nucleotide consensus symbols in lower case indicate the presence in low frequency of non-consensus bases.
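The consensus symbols listed in this legend translate directly into regular-expression character classes, which gives a simple way to scan a sequence for motifs such as the WGATAR and TGAMCW consensus sequences discussed in Section 3.2. The following is a minimal sketch under stated assumptions: the helper names `consensus_to_regex` and `find_sites` and the test sequence are hypothetical, and only the forward strand is searched.

```python
import re

# Nucleotide consensus symbols as given in the Table 1 legend.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[GA]", "Y": "[CT]", "M": "[AC]", "K": "[GT]",
    "S": "[GC]", "W": "[AT]", "H": "[ACT]", "B": "[GTC]",
    "V": "[GCA]", "D": "[GAT]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile a consensus such as 'WGATAR' into a regular expression."""
    return re.compile("".join(IUPAC[symbol] for symbol in consensus.upper()))


def find_sites(consensus, sequence):
    """Return (1-based start, matched bases) for every forward-strand hit.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = consensus_to_regex(consensus).pattern
    lookahead = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1))
            for m in lookahead.finditer(sequence.upper())]


if __name__ == "__main__":
    demo = "CCAGATAAGGTGATATCC"   # hypothetical sequence, not from Fig. 3
    print(find_sites("WGATAR", demo))   # [(3, 'AGATAA')]
```

In the demonstration sequence the second GATA (TGATAT) is not reported because its last base does not satisfy R, which is the distinction drawn in the text between GATA sequences with and without the full consensus. Searching the reverse complement as well would be needed for a strand-independent scan.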


Table 2 Adenine-rich sequence upstream of TATA box in seven DNA puff genes

Underlined bases show closest similarity to the ACS consensus. Meaning of nucleotide consensus symbols given in Table 1.

Table 3 Similarity of several A-rich sequences in the RaC3 promoter with the ACS element of yeast

Ra sequence most similar to consensus shown in boldface. (a) or (b) after sequence number indicates different regions of sequence. Meaning of nucleotide consensus symbols given in Table 1. –, best fit achieved by leaving space between bases.
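The kind of comparison summarised in Tables 2 and 3 can be approximated by sliding the ACS core along a sequence and counting the positions that agree with the consensus. The sketch below assumes the commonly cited 11-bp yeast ACS core, 5′-WTTTAYRTTTW-3′; the table bodies are not reproduced in this text, so the exact consensus string used there is an assumption, as are the function names and the test sequences. Unlike the best fits described in the Table 3 legend, this sketch does not allow spaces between bases.

```python
# Assumed 11-bp core of the yeast ARS consensus sequence (ACS).
ACS_CORE = "WTTTAYRTTTW"

# Bases allowed by each consensus symbol that occurs in ACS_CORE.
ALLOWED = {"A": "A", "T": "T", "W": "AT", "Y": "CT", "R": "AG"}


def count_matches(window, consensus=ACS_CORE):
    """Number of positions at which window agrees with the consensus."""
    return sum(base in ALLOWED[symbol]
               for base, symbol in zip(window.upper(), consensus))


def best_acs_match(sequence, consensus=ACS_CORE):
    """Return (matches, 1-based start, window) for the best ungapped fit.

    Ties are resolved in favour of the leftmost window; None is returned
    when the sequence is shorter than the consensus.
    """
    seq = sequence.upper()
    width = len(consensus)
    best = None
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = count_matches(window, consensus)
        if best is None or score > best[0]:
            best = (score, start + 1, window)
    return best


if __name__ == "__main__":
    demo = "GGCATTTATATTTAGG"   # hypothetical sequence, not from Fig. 3
    print(best_acs_match(demo))  # (11, 4, 'ATTTATATTTA')
```

A score of 11 is a perfect match to the core; A-rich sequences that match at, say, 9 or 10 of 11 positions would be reported with their coordinates in the same way.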

regressing (Fig. 6, 2a), showed bands which spread downward from about 1.2 kb. However, the untreated RNA sample taken after C-3 puff regression (Fig. 6, 3a) was at the position of the samples whose poly(A) tails had been removed. These results are in agreement with the newly formed C-3 mRNA having a poly(A) tail of 150–200 bases which is rapidly degraded. When the C-3 puff is just beginning to expand, messenger production seems to be slower than poly(A) tail degradation. When the puff is more expanded, a high percentage of the C-3 messenger population has long poly(A) tails. After the puff has regressed, all C-3 messengers have undergone deadenylation. However, some of these deadenylated mRNAs remain in the cells for a considerable length of time after puff regression.

The C-3 mRNA contains the 3′ sequence, AUUUA, which has been shown to cause mRNA instability when inserted into a rabbit β-globin gene (Shaw and Kamen, 1986). AUUUA or other AU-rich sequences are present in the 3′ non-translated region of mRNAs produced by all other DNA puff genes, including the B. hygida B10 messenger (Fontes, 1996). Such AU-rich elements can facilitate rapid deadenylation, an initial step in mRNA degradation (Chen and Shyu, 1995), and could have a role in the deadenylation of these messengers.

When genomic clones extending about 15 kb to either side of the transcription unit were used as probes in Northern blotting just before and during the expansion of the C-3 puff, a band at the position of the C-3 messenger was the only one observed in salivary glands (data not shown). This suggests that only one developmentally controlled transcription unit is active in the region during this time.

Fig. 5. Northern blot hybridization of total salivary gland RNA probed with 32P-labelled pRaC3-22 insert. Lanes 1–5 have 5 µg and lanes 6–8 have 1 µg total RNA. Samples shown are from the same group of larvae. Lanes: 1, P3, pre-puff; 2, late P4, beginning puff; 3 and 6, P5, increasing puff; 4 and 7, P6, fully expanded puff; 5 and 8, P7, about 15 h after puff regression.

Fig. 6. Size of C-3 messenger at several stages of development with (a) and without (b) the poly(A) tail. Lanes: 1, expanding puff; 2, regressing puff; 3, about 15 h after puff regression. Lane 2 contains one-fifth of the amount of total RNA loaded for the other developmental periods.

3.4. The C-3 polypeptide

The predicted polypeptide coded by the C-3 puff gene is 281 residues long, which corresponds to a protein of 32.4 kDa with a calculated pI of 7.41 (Fig. 7). It is particularly rich in glutamic acid (13.2%), lysine (13.0%) and leucine (10.7%). It has 14 cysteine residues, nine of which have been conserved between the C-3 polypeptide and three other homologous polypeptides coded by DNA puff genes (Fig. 7). The hydropathy graph shows that the first 18 aa residues at the amino-terminal end, including the entire first exon, are apolar, which suggests that they may constitute part of a signal for secretion (von Heijne, 1982). On the other hand, a comparison with the Prosite Database (Prosite File, 10.2/7/93) showed that glycine 16 is a probable myristylation site. This suggests that Gly-16 may be the N-terminal residue of the mature protein. The rest of the C-3 polypeptide consists mainly of polar aa residues and its predicted secondary structure is almost entirely alpha-helical (Rost and Sander, 1993, 1994). Two sites for glycosylation (N-X-S/T) are located at aa residues 86–89 and 243–246 (Prosite File, 10.2/7/93).

Since DNA puffs have been correlated with the production of salivary secretion, these data support the idea that the C-3 puff product is a secreted protein, possibly the 28.0-kDa P8 polypeptide, which increases in amount as the puff expands (Winter et al., 1977a,b). The C-3 puff and the P8 polypeptide which has been correlated with it (Winter et al., 1977a,b) appear when the larvae are constructing individual cells in their collective cocoon, at the stage when the cocoon is losing its elasticity. The polypeptide produced by the C-3 puff seems likely to be involved in the formation of these cells and the hardening of the cocoon. Glutamic acid and lysine are the two aa which occur in highest concentration in the C-3 polypeptide and have also been found in relatively high concentration in the insoluble fraction of the prepupal cocoon (Terra and deBianchi, 1973). The polypeptide produced by the C-8 DNA puff, which is active during the same period as C-3, contains lower amounts of glutamic acid (6%) and lysine (2%). However, both of these polypeptides contain high percentages of leucine, isoleucine and serine, also found in high concentrations in the insoluble fraction of the prepupal cocoon (Terra and deBianchi, 1973).
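The N-X-S/T glycosylation motif mentioned above can be located in any protein sequence with a short pattern search. The sketch below uses the four-residue form of the PROSITE N-glycosylation pattern, N-{P}-[ST]-{P} (asparagine, any residue except proline, serine or threonine, any residue except proline), which is consistent with the four-residue spans reported in the text (86–89 and 243–246). The function name `find_sequons` and the test peptide are hypothetical, and the C-3 sequence itself is not reproduced here.

```python
import re

# PROSITE-style N-glycosylation pattern N-{P}-[ST]-{P}, written as a
# lookahead so that overlapping candidate sites are all reported.
SEQUON = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sequons(protein):
    """Return (first residue, last residue, motif) using 1-based numbering."""
    return [(m.start() + 1, m.start() + 4, m.group(1))
            for m in SEQUON.finditer(protein.upper())]


if __name__ == "__main__":
    demo = "MKNASLLNPTQENLTK"   # hypothetical peptide, not the C-3 sequence
    print(find_sequons(demo))   # [(3, 6, 'NASL'), (13, 16, 'NLTK')]
```

In the demonstration peptide the asparagine at position 8 is skipped because it is followed by proline. A match only indicates a potential site; whether it is actually glycosylated cannot be decided from the sequence alone. Because the pattern requires a fourth residue, a motif ending exactly at the C-terminus would not be reported by this sketch.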
Comparison of the C-3 polypeptide with other proteins by the Blast program (Altschul et al., 1990) produced high-scoring segment pairs with a number of structural proteins, in particular with the myosin heavy chain from many different organisms, as well as with other DNA puff proteins. The similarity of the C-3 polypeptide with non-DNA puff proteins is probably due to its coiled-coil structure. When the seven DNA puff polypeptides were aligned for comparison (not shown), three families could be distinguished based on sequence similarities. Family I comprised the R. americana C-3, S. coprophila II/9-1 and 2, and T. pubescens C4B polypeptides (Fig. 7). The C-3 polypeptide was most closely related to the S. coprophila II/9-1 polypeptide. Family II comprised the R. americana C-8 (Frydman et al., 1993) and B. hygida C-4 polypeptides, while the B. hygida B-10 polypeptide is, thus far, the only member of Family III.

Analysis of the R. americana C-3 polypeptide aa sequence for parallel coiled-coil stretches (Lupas et al., 1991; Lupas, 1996) showed three regions that have a probability higher than 97% of adopting this conformation (Fig. 7). Previous studies made on the S. coprophila II/9-1 and 2 polypeptides showed that these polypeptides also have a high content of coiled-coil α-helices and show similarity to α-helical rod portions of myosin (DiBartholomeis and Gerbi, 1989). The three coiled-coil regions of the C-3 polypeptide are among the most conserved aa sequences of Family I polypeptides (Fig. 7) and are probably involved in the nucleation of protein fiber formation during utilization of the protein for cocoon spinning. The conserved cysteines are all located outside these three regions. They could be involved in the crosslinking of the fibers, although other covalent interactions among aa residues from different protein chains are also possible.

Fig. 7. Protein alignment of Family I DNA puff proteins. The alignment shows the R. americana C-3 polypeptide (C3.Ra), the S. coprophila II/9-1 (II/9-1.Sc) and II/9-2 (II/9-2.Sc) polypeptides and the T. pubescens C4B polypeptide (C4B.Tp). Residues that are conserved in at least three of the four sequences are shaded. All cysteines are boxed and those absolutely conserved are indicated by an asterisk below the alignment. The three lines (I, II and III) above the alignment show regions of the C-3 polypeptide which have a 97% or higher probability of forming coiled-coil structures (COILS2 program, Lupas, 1996, see Section 2: Materials and methods). The letters above the lines show the position of the C-3 residues within the predicted heptad repeats of the coiled-coil α-helix.

4. Conclusions

(1) A 33-kb region which includes the TU of the C-3 DNA puff gene of R. americana was mapped and the C-3 TU characterized. It differs from the TUs of other DNA puff genes sequenced so far by the presence of a second intron. The C-3 TU produces an mRNA which appears to undergo rapid deadenylation. However, deadenylated C-3 mRNA can still be observed in gland cells at least 15 h after the puff has regressed.

(2) The region around the TATA box of the C-3 gene is similar to the promoter regions of other DNA puff genes and contains a sequence homologous with the ACS of yeast. However, regions further upstream diverge greatly among the different DNA puff genes.

(3) As observed for polypeptides coded by other DNA puff genes, the 32.4-kDa polypeptide coded by the C-3 gene has characteristics of a secreted protein. It has an N-terminal hydrophobic segment followed by a long α-helical region. Three stretches within the α-helical region have a greater than 97% probability of forming coiled-coils. The C-3 polypeptide can be grouped with the polypeptides coded by the II/9-1 and 2 genes of S. coprophila and the C4B gene of T. pubescens.

Acknowledgement

We thank Dr. Silvia R.B. Uliana for help in sequencing and Prof. Carlos F. Menck for constructive comments on the manuscript. This work has been supported by grants from FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo) and CNPq (Conselho Nacional de Desenvolvimento Científico e Tecnológico). The nucleotide sequence described in this paper has been given the GenBank accession No. 67878-U69899.

References Abel, T., Mitchelson, A.M., Maniatis, T., 1993. A Drosophila GATA family member that binds to Adh regulatory sequences is expressed in the developing fat body. Development 119, 623–633. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410. Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.D. Smith, J.A., Struhl, K. (Eds.), 1989. Current Protocols in Molecular Biology. Massachusetts General Hospital, Harvard Medical School, Boston, MA. Bonaldo, M.F., Santelli, R.V., Lara, F.J.S., 1979. The transcript from a DNA puff of Rhynchosciara and its migration to the cytoplasm. Cell 17, 827–833. Breuer, M.E., Pavan, C., 1955. Behavior of polytene chromosomes of Rhynchosciara angelae at different stages of larval development. Chromosoma 7, 371–386. Chen, C.Y.A., Shyu, A.B., 1995. AU-rich elements: characterization and importance in mRNA degradation. Trends Biochem. Sci. 20, 465–470. Chen, Q.K., Hertz, J.Z., Stormo, G.D., 1995. MATRIX SEARCH 1.0: a computer program that scans DNA sequences for transcriptional elements using database of weight matrices. Comput. Appl. Biosci. 11, 563–566. DiBartholomeis, S.M., Gerbi, S., 1989. Molecular characterization of DNA puff II/9A genes in Sciara coprophila. J. Mol. Biol. 210, 531–540. Ficq, A., Pavan, C., 1957. Autoradiography of polytene chromosomes of Rhynchosciara angelae at different stages of larval development. Nature 180, 983–984. Fontes, A.M., 1996. Clonagem e caracterizaca˜o da estrutura e expressa˜o de um gene amplificado no pufe de DNA B-10 de Bradysia hygida (Diptera, Scaridae). Doctoral Thesis, Faculdade de Medicina/USP, Ribera˜o Preto. Frydman, H.M., Cadavid, E.O., Yokosawa, J., Silva, F.H., NavarroCattapan, L.D., Santelli, R.V., Jacobs-Lorena, M., Graessmann, M., Graessmann, A., Stocker, A.J., Lara, F.J.S., 1993. Molecular characterization of the DNA puff C-8 gene of Rhynchosciara americana. J. Mol. Biol. 233, 799–803. 
Gerbi, S.A., Liang, C., Wu, N., DiBartolomeis, S.M., Bienz-Tadmor, B., Smith, H.S., Urnov, F.D., 1993. DNA amplification in DNA puff II/9A of Sciara coprophila. Cold Spring Harbor Symp. Quant. Biol. 58, 487–494. Glover, D.M., Zaha, A., Stocker, A.J., Santelli, R.V., Pueyo, M.T., deToledo, S.M., Lara, F.J.S.: Gene amplification in Rhynchosciara

171

salivary gland chromosomes. Proc. Natl. Acad, Sci. USA 79, 2947–2951. Henikoff, S., Henikoff, J.G., 1979. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10915–10919. Hui, C., Matsuno, K., Suzuki, Y., 1990. Fibroin gene promoter contains a cluster of homeodomain binding sites that interact with three silk gland factors. J. Mol. Biol. 213, 651–670. Hultmark, D., Klemenz, R., Gehring, W.J., 1986. Translational and transcriptional control elements in the untranslated leader of the heat shock gene hsp22. Cell 44, 429–438. Kroczek, R.A., Siebert, E., 1990. Optimization of Northern analysis by vacuum blotting, RNA-transfer visualization and ultraviolet fixation. Anal. Biochem. 184, 90–95. Lara, F.J.S., Stocker, A.J., Amabis, J.M., 1991. DNA sequence amplification in Sciarid flies: results and perspectives. Braz. J. Med. Biol. Res. 24, 233–248. Liang, C., Spitzer, J.D., Smith, H.S., Gerbi, S.A., 1993. Replication initiates at a confined region during DNA amplification in Sciara DNA puff II/9A. Genes Dev. 7, 1072–1084. Lupas, A., Van Dyke, M., Stock, J., 1991. Predicting coiled-coils from protein sequences. Science 252, 1162–1164. Lupas, A., 1996. Prediction and analysis of coiled-coil structures. Methods Enzymol. 266, 513–525. Millar, S., Hayward, D.C., Read, C.A., Browne, M.J., Santelli, R.V., Vallejo, P.G., Pueyo, M.T., Zaha, A., Glover, D.M., Lara, F.J.S., 1985. Segments of chromosomal DNA from Rhynchosciara americana that undergo additional rounds of DNA replication in the salivary gland DNA puffs have only weak ARS activity in yeast. Gene 34, 81–86. Monesi, N., Fernandez, M.A., Fontes, A.M., Basso, L.R., Nakanishi, Y., Baron, B., Buttin, G., Pac¸o´-Larson, M.L., 1995. Molecular characterization of an 18 kb segment of DNA puff C4 of Bradysia hygida (Diptera, Sciaridae). Chromosoma 103, 715–724. Parry, D.A.D., 1982. 
Coiled-coils in alpha-helix containing proteins: analysis of the residue types within the heptad repeat and the use of these data in the prediction of coiled-coils in other proteins. Biosci. Rep. 2, 1017–1024. Ramain, P., Heitzler, P., Haenlin, M., Simpson, P., 1993. pannier, a negative regulator of achaete and scute in Drosophila, encodes a zinc finger protein with homology to the vertebrate transcription factor GATA-1. Development 119, 1277–1291. Rost, B., Sander, C., 1993. Prediction of protein secondary structure at better than 70% accuracy. J. Mol. Biol. 232, 584–599. Rost, B., Sander, C., 1994. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 19, 55–72. Santelli, R.V., Machado-Santelli, G.M., Pueyo, M.T., Navarro-Cattapan, L.D., Lara, F.J.S., 1991. Replication and transcription in the course of DNA amplification of the C3 and C8 DNA puffs of Rhynchosciara americana. Mech. Dev. 36, 59–66. Shaw, G., Kamen, R., 1986. A conserved AU sequence from the 3′ untranslated region of GM-CSF mRNA mediates selective mRNA degradation. Cell 46, 659–667. Stocker, A.J., Troyano-Pueyo, M., Pereira, S.D., Lara, F.J.S., 1984. Ecdysteroid titers and changes in chromosomal activity in the salivary glands of Rhynchosciara americana. Chromosoma 90, 26–38. Stocker, A.J., Gorab, E., Amabis, J.M., Lara, F.J.S., 1993. A molecular cytogenetic comparison between Rhynchosciara americana and Rhynchosciara hollaenderi (Diptera: Sciaridae). Genome 36, 831–843. Suzuki, T., Suzuki, Y., 1988. Interaction of composite protein complex with the fibroin enhancer sequence. J. Biol. Chem. 263, 5979–5986. Terra, W.R., deBianchi, A.G., Gambarini, A.G., Lara, F.J.S., 1973a. Haemolymph amino acids and related compounds during cocoon production by larvae of the fly, Rhynchosciara americana. J. Insect Physiol. 19, 2097–2106. Terra, W.R., deBianchi, A.G., 1973b. Chemical composition of the
cocoon of the fly Rhynchosciara americana. Insect Biochem. 4, 173–183. Thompson, J.D., Higgins, D.G., Gibson, T.J., 1994. Clustal W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res. 22, 4673–4680. van Houten, J.V., Newlon, C.S., 1990. Mutational analysis of the consensus sequence of a replication origin from yeast chromosome III. Mol. Cell. Biol. 10, 3917–3925. von Heijne, G., 1986. A new method for predicting signal sequence cleavage sites. Nucleic Acids Res. 14, 4683–4690. Vournakis, J.N., Efstratiadis, A., Kafatos, F.C., 1975. Electrophoretic patterns of deadenylated chorion and globin mRNAs. Proc. Natl. Acad. Sci. USA 72, 2959–2963. Winick, J., Abel, T., Leonard, M.W., Michelson, A.M., Chardon-Loriaux, I., Holmgren, R.A., Maniatis, T., Engel, J.D., 1993. A GATA family transcription factor is expressed along the embryonic dorsoventral axis in Drosophila melanogaster. Development 119, 1055–1065. Winter, C.E., deBianchi, A.G., Terra, W.R., Lara, F.J.S., 1977a. Relationships between newly synthesized proteins and DNA puff patterns in salivary glands of Rhynchosciara americana. Chromosoma 61, 193–206. Winter, C.E., deBianchi, A.G., Terra, W.R., Lara, F.J.S., 1977b. The giant DNA puffs of Rhynchosciara americana code for polypeptides of the salivary gland secretion. J. Insect Physiol. 23, 1455–1459. Yokosawa, J., 1995. Mapeamento de origens de replicação do pufe de DNA C3 de Rhynchosciara americana. Doctoral Thesis, Instituto de Química/USP, São Paulo.