Control of gene expression in the P2-related temperate coliphage 186


J. Mol. Biol. (1989) 206, 251-255

Control of Gene Expression in the P2-related Temperate Coliphage 186 VI†. Sequence Analysis of the Early Lytic Region

We have completed the sequence of the 186 early lytic region and established that this region encodes the four genes CP75, CP76, CP77 and CP78, with CP79 the first gene of the next region. Functions have been assigned to the four early genes.

Coliphage 186 is a temperate phage of the P2-related family. The nucleotide sequence of the major control region of 186 is completely unrelated to that of coliphage λ, and provides an interesting alternative to the lambdoid phages for the study of the lysis-lysogeny genetic switch (Kalionis et al., 1986a). Similarly, the late control strategy promises to differ from that in the lambdoid phages (Kalionis et al., 1986b). We wish to report the nucleotide sequence and analysis of the early lytic region as a basis for future studies of the functions encoded, which we expect to concern both the lysis-lysogeny switch and the control of the next phase of transcription (Finnegan & Egan, 1981).

We define the early lytic region as the region encoding the first lytic transcript after 186 infection, which we equate with the pR transcript (band 2) seen in the in vitro transcription pattern (Pritchard & Egan, 1985). This transcript commences at 74.5% on the 186 genome and from its size is predicted to terminate close to, and to the left of, the unique BglII site at 79.6% (Fig. 1(a)). The DNA sequence is known to the left of the PstI site at 77.4% (Kalionis et al., 1986a) and to the right of the BglII site at 79.6% (Sivaprasad, 1984; Sivaprasad et al., unpublished results) and we therefore sequenced the intervening DNA.

The DNA sequence of both strands was determined by the dideoxy chain termination method of Sanger et al. (1977) using the strategy outlined in Figure 1(b). HpaII clones spanning the PstI and BglII sites provided the necessary overlap to give continuity with the neighbouring regions. The sequence obtained is recorded in Figure 2(a), together with its analysis by computer to indicate potential genes and potential control signals. Also included are the neighbouring sequences that encode the 5' end of CP76 and the 3' end of CP79. The results of the analysis of the 186 DNA sequence published to date are summarized in Figure 2(b).
Potential transcriptional terminators were sought using dot matrix analysis (Maizel & Lenk, 1981) and the computer program COMSTR (written by Sivaprasad, this laboratory). COMSTR searches for stable stem-loop structures and calculates the free energy of these structures using the rules of Tinoco et al. (1973) as modified by Steger et al. (1984). Six stable stem-loop structures were predicted to be encoded in the PstI-BglII region. The positions of these six potential terminators on the DNA sequence are shown in Figure 2(a). Structures 1 (460-489), 2 (685-725), 4 (971-996), 5 (1027-1049) and 6 (1082-1102) are not immediately followed by a T-rich region and are therefore unlikely to be Rho-independent terminators. However, these structures may be Rho-dependent terminators. Structure 3 (950-969) is consistent with a Rho-independent terminator, with a G+C-rich stem (ΔG = -10.2 kcal/mol; 1 cal = 4.184 J) immediately followed by a T-rich (6/9) region. It is the only candidate for the in vitro termination signal, as no Rho factor was added to the assay. The associated transcript as estimated from the sequence is approximately 1360 bases in length, which is consistent with the 1450 base estimate of the pR in vitro transcript reported by Pritchard & Egan (1985). It is one of two potential terminators that are situated outside an open reading frame. Structure 3 was therefore considered to be the signal that terminates the early lytic transcript and was named tR1. Recent transcription studies have shown that tR1 functions as a terminator in vitro at an efficiency of 70% (Richardson et al., unpublished results).

Possible promoters in the PstI-BglII sequence were detected with the computer program SCAN, using a weight matrix composed of the number of occurrences of each base at each position for the 112 promoters compiled in 1983 by Hawley & McClure (Kalionis et al., 1986a). Two weak rightward candidates were detected at positions 753 and 791, and two reasonably strong leftward candidates were detected at positions 701 and 710 (the position referring to the first base of the -10 region (Rosenberg & Court, 1979)).
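A count-based weight matrix scan of the kind described for SCAN can be sketched as follows. This is only an illustration under stated assumptions: the six training hexamers are hypothetical stand-ins (the counts for the 112 Hawley & McClure promoters are not given here), the function names are invented for the example, and the actual SCAN program may score and threshold differently.

```python
from collections import Counter

# Hypothetical aligned -10 hexamers, used only to build a toy count matrix.
# They are NOT the 112 promoters of the Hawley & McClure compilation.
TOY_SITES = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATACT", "TAAAAT"]


def build_count_matrix(sites):
    """Return one Counter per position: matrix[i][base] = occurrences."""
    width = len(sites[0])
    if any(len(s) != width for s in sites):
        raise ValueError("all sites must have the same length")
    return [Counter(s[i] for s in sites) for i in range(width)]


def score_window(matrix, window):
    """Sum, over positions, the count of the base observed in the window."""
    return sum(col[base] for col, base in zip(matrix, window))


def scan(matrix, seq, min_score):
    """Yield (1-based position of the first base, score) for each window
    whose score is at least min_score."""
    width = len(matrix)
    for i in range(len(seq) - width + 1):
        s = score_window(matrix, seq[i:i + width])
        if s >= min_score:
            yield i + 1, s


def reverse_complement(seq):
    """Reverse complement, so that leftward candidates can be scanned too."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


matrix = build_count_matrix(TOY_SITES)
hits = list(scan(matrix, "CCGTATAATGC", min_score=25))
```

Leftward candidates would be found by scanning `reverse_complement(seq)` and converting the reported positions back to the original strand. The threshold of 25 is arbitrary and chosen only for the toy matrix.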
Whether these promoters are functional will be the subject of a future investigation. Leftward promoters are of interest for their potential involvement in the establishment phase of cI transcription in the lysis-lysogeny decision, and rightward promoters are of

† Paper V in this series is Lamont et al. (1988).

0022-2836/89/050251-05 $03.00/0 © 1989 Academic Press Limited

[Figure 1 (graphic not reproduced). (a) Physical and genetic map of 186 on a 0 to 100% scale. (b) Expansion of the 74.5% to 80.4% region showing pR (74.5%), the predicted genes CP75, CP76, CP79 and CP80, and the restriction sites HpaII 389 (77.2%), PstI 436 (77.4%), HpaII 567 (77.8%), HpaII 965 (79.1%), BglII 1124 (79.6%) and HpaII 1364 (80.4%); scale bar, 200 bp.]

Figure 1. Physical and genetic map of phage 186 and the sequencing strategy of the PstI-BglII (77.4% to 79.6%) fragment. (a) The physical and genetic map of 186. The functions of the genes (Hocking & Egan, 1982) and the physical mapping (Finnegan & Egan, 1979) have been described. The arrows underneath the map represent the regions of the 186 genome that have been previously sequenced: PstI (65.5% to 77.4%) (Kalionis et al., 1986a) and BglII-BamHI (79.6% to 96.0%) (Sivaprasad, 1984). (b) The 74.5% to 80.4% region is expanded to show the sequencing strategy of the PstI-BglII (77.4% to 79.6%) region. The predicted genes in the adjacent sequenced regions (Kalionis et al., 1986a; Sivaprasad, 1984) are shown. The PstI-BglII and HpaII fragments were cloned into M13mp8 and mp9 (Messing & Vieira, 1982; Messing, 1983) and the sequence was determined using the dideoxy nucleotide chain termination technique (Sanger et al., 1977, 1980; Schreier & Cortese, 1979). The arrows above and below the map represent gel readings used to generate the DNA sequence. Rightward arrows represent gel readings used to generate the l-strand sequence, whereas leftward arrows represent gel readings used to generate the r-strand sequence. bp, base-pairs.
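The relation between base numbers and percentage map co-ordinates used in Figure 1(b) can be illustrated with a simple linear interpolation. This is a sketch, not part of the original analysis: it assumes that the PstI site (base 436, 77.4%) and the BglII site (base 1124, 79.6%) can serve as reference points and that the map is linear between and just beyond them. The function names are invented for the example.

```python
# Reference points read from Figure 1(b): (base number, map position in %).
PSTI = (436, 77.4)
BGLII = (1124, 79.6)


def base_to_percent(base, ref_a=PSTI, ref_b=BGLII):
    """Linearly interpolate a map position (%) from a base number."""
    (b0, p0), (b1, p1) = ref_a, ref_b
    return p0 + (base - b0) * (p1 - p0) / (b1 - b0)


def bases_per_percent(ref_a=PSTI, ref_b=BGLII):
    """Number of bases corresponding to 1% of the genome on this scale."""
    (b0, p0), (b1, p1) = ref_a, ref_b
    return (b1 - b0) / (p1 - p0)


# HpaII sites shown in Figure 1(b), by base number.
for base in (389, 567, 965, 1364):
    print(base, round(base_to_percent(base), 1))
```

With these two reference points the four HpaII sites come out at 77.2%, 77.8%, 79.1% and 80.4%, which agrees with the positions marked in the Figure, and 1% of the genome corresponds to roughly 313 bases.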

interest for their potential involvement in middle gene transcription.

From the possible open reading frames (ORFs) detected in the PstI-BglII region (bases 436-1124), four rightward ORFs were indicated as potential genes by the computer program GENE (Kalionis et al., 1986a) using the codon usage frequencies of Escherichia coli (Chen et al., 1982). These ORFs involve bases 436 to 510, 518 to 745, 732 to 932 and 999 to 1124 (Fig. 2(a)). No significant leftward ORFs were detected. When the sequence was analysed for ribosome-binding sites using the rules of Stormo et al. (1982) and the computer program SCAN (Kalionis et al., 1986a), three sites were detected. The site at base 518 satisfied rule 6, whereas the sites at bases 732 and 999 satisfied rule 7. These three ribosome-binding sites were associated with three of the ORFs described above. The three ORFs with ribosome-binding sites we have called CP77, CP78 and CP79 (CP standing for computer protein, followed by the chromosomal co-ordinate approximating the initiation codon of the gene). CP77 and CP78 overlap each other by 14 bases and are predicted to encode proteins of 75 and 66 amino acid residues, respectively. CP79 overlaps the BglII site at 79.6% and is predicted to encode a protein of 77 amino acid residues (Sivaprasad et al., unpublished results). The fourth ORF begins to the left of the PstI site at 77.4% (Kalionis et al., 1986a) and represents the carboxy-terminal portion of CP76, which is predicted to encode a protein of 169 amino acid residues. Table 1 shows various properties of the encoded proteins that have been predicted from the DNA sequence. Preliminary evidence of proteins of

Letters to the Editor

[Figure 2 (graphic not reproduced). (a) Annotated l-strand DNA sequence with the predicted amino acid sequences of CP76, CP77, CP78 and CP79, the restriction sites and map co-ordinates (77.2% to 80.4%), and the potential terminators. Stabilities marked on the Figure (kcal/mol): structure 1, -10.5; structure 2, -10.8; structure 3 (tR1), -10.2; structure 4, -8.4. (b) Map of the predicted coding regions, including Int, CI and CP75 to CP79, with pL, pR and tR1 marked on a base-pair scale; scale bar, 200 bp.]

Figure 2. (a) DNA sequence of the l-strand of the PstI-BglII (77.4% to 79.6%) fragment and adjacent regions from 186 cItsp. The DNA sequence to the left of the PstI (77.4%) site was determined by Kalionis et al. (1986a) and that to the right of the BglII (79.6%) site was determined by Sivaprasad (1984). Transcription and translation are from left to right. Potential genes are indicated on the left of the Figure. Relevant restriction sites are marked beneath the DNA sequence. Ribosome-binding sites (RBS) are underlined. Potential transcription terminators are indicated by the convergent arrows and are numbered 1 to 6. The stabilities of these structures, determined using the rules of Tinoco et al. (1973) as modified by Steger et al. (1984), are listed in kcal/mol in parentheses. (b) A representation of the predicted coding regions from the sequence determined presently (shaded), added to that of the sequence known to date (Kalionis et al., 1986a; Sivaprasad, 1984). The coding regions are represented by the boxed regions. The 1450 base in vitro transcript defining the early lytic region is shown. This transcript is predicted to terminate at the Rho-independent tR1 terminator (structure 3). bp, base-pairs.
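The ORF co-ordinates quoted in the text for CP77 (bases 518 to 745) and CP78 (bases 732 to 932) can be checked against the stated protein lengths and overlap with a little arithmetic. The sketch below assumes that the co-ordinates are 1-based and inclusive and that each span ends with its stop codon. These assumptions are not stated explicitly in the text, but they reproduce its numbers. The helper names are invented for the example.

```python
# ORF co-ordinates quoted in the text (assumed 1-based, inclusive,
# with the stop codon included in the span).
ORFS = {"CP77": (518, 745), "CP78": (732, 932)}


def orf_length(start, end):
    """Number of bases in an inclusive span."""
    return end - start + 1


def residues(start, end):
    """Amino acid residues encoded, assuming the span ends with a stop codon."""
    n = orf_length(start, end)
    if n % 3:
        raise ValueError("span is not a whole number of codons")
    return n // 3 - 1


def overlap(a, b):
    """Number of bases shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def frame(start):
    """Reading frame of an ORF, as the start co-ordinate modulo 3."""
    return start % 3


cp77, cp78 = ORFS["CP77"], ORFS["CP78"]
print(residues(*cp77), residues(*cp78), overlap(cp77, cp78))
```

Under these assumptions the spans give 75 and 66 residues with a 14 base overlap, as reported, and the two genes lie in different reading frames, as they must for an overlap that is not a multiple of three.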

molecular weights commensurate with predictions from the translation of the sequence was provided by maxicell analysis (Sancar et al., 1979). The results concerning CP77 and CP78 are presented in Figure 3. For the CP77 clone, pEC404 (Richardson & Egan, 1989), a wide band of well-expressed protein appeared at 42°C with a mobility approximately the size (8.4 × 10³ Mr) predicted. For the comparable clone of the amber mutant CP77am (pEC422; Richardson & Egan, 1989) this band disappeared. However, there was no evidence of the 4.5 × 10³ Mr candidate protein for the CP78 fusion protein predicted for the pEC404 clone nor of the amber fragment (5.9 × 10³ Mr). In the case of the CP78 clone, pEC421 (Richardson & Egan, 1989), a protein approximately the size predicted (7.5 × 10³ Mr) appeared upon induction at 42°C, and this band was absent for the clone of the amber mutant (Richardson & Egan, 1989). There was no evidence of the CP79 fusion (37.3 × 10³ Mr) or of the amber fragment (1.1 × 10³ Mr) predicted for the clones. Although further confirmation will be sought, these preliminary data support the elevation of CP77 and CP78 to gene status.

Table 1
Properties of the proteins predicted from the DNA sequence

                        CP76†        CP77         CP78         CP79†
Total aa                169          75           66           77
Basic aa (%)‡           15 (8.9)     14 (18.7)    11 (16.6)    13 (16.8)
Acidic aa (%)‡          18 (10.6)    3 (4.0)      12 (18.2)    15 (19.5)
Hydrophobic aa (%)‡     85 (50.3)    38 (50.6)    34 (51.0)    30 (39.0)
Polar aa (%)§           79 (46.7)    35 (46.7)    27 (40.9)    46 (59.7)
Molecular weight        18,671       8399         7522         8817

† The present sequence data have been co-ordinated with the sequence data reported by Kalionis et al. (1986a) to enable the predictions for CP76 and of Sivaprasad et al. (unpublished results) for CP79.
‡ The sum of basic (K,R) or acidic (E,D) or hydrophobic (A,V,L,I,F,W) residues.
§ Polar amino acids are D,N,E,Q,K,S,R,T,H. Proteins containing a percentage of polar amino acids less than 40% have a low polarity and are likely to interact with the cell membrane (Capaldi & Vanderkooi, 1972).

[Figure 3 (graphic not reproduced). Fluorograph of maxicell proteins, tracks 1 to 9: clones of CP77+, CP77am, CP78+ and CP78am, each labelled at 28°C and at 42°C; the CP77 (8.4) and CP78 (7.5) bands are marked.]

Figure 3. The protein products of CP77 and CP78. pEC404 is the PstI-HincII (77.4% to 78.7%) fragment from 186 cloned behind the λ pL promoter in the expression vector pPLc236 (Remaut et al., 1981). The fragment encodes CP77 (8.4 × 10³ Mr) and the cloning created a CP78 fusion protein of 4.5 × 10³ Mr. pEC422 is the analogous clone of CP77am, predicted to give a prematurely terminated protein product in an Su- cell of 5.9 × 10³ Mr. pEC421 is the SauIIIA-BglII (77.9% to 79.6%) fragment encoding CP78 (7.5 × 10³ Mr) and the CP79 fusion (37.3 × 10³ Mr), and pEC420 is the similarly cloned fragment encoding the CP78am mutant (1.1 × 10³ Mr). Expression from pL was controlled by the λ cI857 gene expressed from the compatible plasmid pcI857 or from a ΔH1 lysogen (Remaut et al., 1981). Maxicells were prepared from E4168 (159 Su- uvrA recA56) carrying a plasmid clone and pcI857. Proteins were labelled with [35S]methionine at 28°C or 42°C. Samples (approx. 60,000 cts/min) were fractionated on a 15% polyacrylamide/SDS/6 M-urea gel at 90 V overnight (Swank & Munkres, 1971; Ley, 1984), fixed in 10% acetic acid, 50% methanol overnight then for 2 h in 7% acetic acid, 5% methanol, and fluorographed as described by Reeve & Shaw (1979). Tracks 1 to 4 were fluorographed for 6 h at -80°C, whereas 5 to 9 were fluorographed for 24 h at -80°C.

We finally concerned ourselves with the functions of the early genes. In the accompanying paper (Richardson & Egan, 1989) we report that CP78 encodes the function depressing host replication (dhr gene) and CP77 a function which, when cloned, caused filamentation (fil gene). The remaining two genes of the early region, CP75 and CP76, probably encode DNA-binding proteins, as each contains an amino acid sequence scoring significantly (1709 and 1673, respectively) on the helix-turn-helix weight matrix of Dodd & Egan (1987). CP75 encodes a function concerned with antagonizing transcription from pL during the lytic response and has been termed the apl gene (Dodd & Egan, unpublished results). CP76 encodes a function involved in the establishment of lysogeny, which we term the cII gene (Lamont et al., unpublished results). With all genes assigned a function, the enigma remains of identifying the postulated gene X involved in the next phase of transcription (Finnegan & Egan, 1981), which was expected to map in the early region.

To summarize, we have completed the DNA sequencing of the early lytic region of 186, presented evidence to suggest that it encodes four genes, and provided the basis for our transcription studies (Richardson et al., unpublished results) of the DNA sequence beyond the early region.

This work was supported by a Program grant from the Australian Research Grants Scheme to J.B.E. H.R. held a University Research Grant postgraduate scholarship. The authors thank Ioanis Anargyros for technical assistance and photographic work.

Helena Richardson
Scripps Clinic & Medical Foundation
La Jolla, CA 92037, U.S.A.

Arnis Puspurs
Adelaide Children's Hospital
North Adelaide, South Australia

J. Barry Egan
Department of Biochemistry
University of Adelaide
Adelaide, South Australia 5001, Australia

Received 30 March 1988, and in revised form 21 October 1988

References

Capaldi, R. A. & Vanderkooi, G. (1972). Proc. Nat. Acad. Sci., U.S.A. 69, 930-932.
Chen, H. R., Dayhoff, M. O., Barker, W. C., Hunt, L. T., Yeh, L.-S., George, D. G. & Orcutt, B. C. (1982). DNA, 1, 365-374.
Dodd, I. B. & Egan, J. B. (1987). J. Mol. Biol. 194, 557-564.
Finnegan, J. & Egan, J. B. (1979). Mol. Gen. Genet. 172, 287-293.
Finnegan, J. & Egan, J. B. (1981). J. Virol. 38, 987-995.
Hocking, S. M. & Egan, J. B. (1982). J. Virol. 44, 1056-1067.
Kalionis, B., Dodd, I. B. & Egan, J. B. (1986a). J. Mol. Biol. 191, 199-209.
Kalionis, B., Pritchard, M. & Egan, J. B. (1986b). J. Mol. Biol. 191, 211-220.
Lamont, I., Kalionis, B. & Egan, J. B. (1988). J. Mol. Biol. 199, 379-382.
Ley, H. (1984). Focus, 6(3), 5.
Maizel, J. V. & Lenk, R. P. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 7665-7669.
Messing, J. (1983). Methods Enzymol. 101, 20-78.
Messing, J. & Vieira, J. (1982). Gene, 19, 269-276.
Pritchard, M. & Egan, J. B. (1985). EMBO J. 4, 3599-3604.
Reeve, J. N. & Shaw, J. E. (1979). Mol. Gen. Genet. 172, 271-297.
Remaut, E., Stanssens, P. & Fiers, W. (1981). Gene, 15, 81-93.
Richardson, H. & Egan, J. B. (1989). J. Mol. Biol. 206, 59-68.
Rosenberg, M. & Court, D. (1979). Annu. Rev. Genet. 13, 319-353.
Sancar, A., Hack, A. M. & Rupp, W. D. (1979). J. Bacteriol. 137, 692-693.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H. & Roe, B. A. (1980). J. Mol. Biol. 143, 161-178.
Schreier, P. H. & Cortese, R. (1979). J. Mol. Biol. 129, 169-172.
Sivaprasad, A. V. (1984). Ph.D. thesis, University of Adelaide.
Steger, G., Gross, H., Sanger, H. C., Reisner, D., Hofman, H. & Randles, J. W. (1984). J. Biomol. Struct. Dynam. 2, 543-571.
Stormo, G. D., Shimatake, T. D. & Gold, L. M. (1982). Nucl. Acids Res. 10, 2971-2995.
Swank, R. T. & Munkres, K. D. (1971). Anal. Biochem. 39, 462-477.
Tinoco, I., Borer, P. N., Dengler, B., Levine, M. D., Uhlenbeck, O. C., Crothers, D. M. & Gralla, J. (1973). Nature (London), 246, 40-41.

Edited by N. Sternberg