DNA sequence and coding properties of mutD(dnaQ) a dominant Escherichia coli mutator gene




J. Mol. Biol. (1986) 190, 113-117

The mutD(dnaQ) gene in Escherichia coli codes for the ε subunit of the DNA polymerase III (pol III) holoenzyme. Previous work has shown that this gene lies adjacent to the gene coding for RNase H (rnh). The two products are translated from diverging promoters. Here we report on the 1.6 kb (1 kb = 10³ bases or base-pairs) sequence of the region coding for both genes, and the transcripts encoded by them. mutD codes for two transcripts, one of whose origins lies within the rnh structural gene. Both transcripts overlap and are complementary to a region of the rnh transcript. Thus, they can potentially form double-stranded helices with rnh. Of the two possible double-stranded structures, the shorter does not interfere with a likely rnh ribosome binding site, while the longer one does. We suggest that this unique organization may regulate rnh translation rates.

The mutD(dnaQ)-rnh region of the chromosome codes for two proteins (Horiuchi et al., 1981; Cox & Horner, 1983), the ε subunit of the pol III holoenzyme (Scheuermann et al., 1983) and ribonuclease H (Kanaya & Crouch, 1983). Purified pol III holoenzyme isolated from mutD5 cells is deficient in the 3′→5′ proofreading exonuclease activity associated with wild-type preparations (Echols et al., 1983; DiFrancesco et al., 1984), and it is now clear that the ε subunit is a 3′→5′ proofreading exonuclease (Scheuermann & Echols, 1984). Here we report on the DNA sequence of the mutD and rnh genes and the transcripts encoded by them. Two mutD transcripts can be detected, one of which has its initiation site within the rnh polypeptide coding region. We find that the rnh message and one of the mutD messenger RNAs are complementary for over 100 base-pairs, an unusual structure that may help to regulate translation of the rnh transcript.

The mutD+ and rnh+ sequence is shown in Figure 1, with the proposed reading frames and the deduced amino acid sequence. Our sequence agrees with that of Kanaya & Crouch (1983) for the region they sequenced (nucleotides 1 to 754, Fig. 1). There are substantial differences between our sequence and that reported by Maki et al. (1983) for the mutD coding sequence. Five of these discrepancies caused differences in the proposed amino acid sequence for mutD. These discrepancies could be strain-specific, a result of mutations fixed in diverging pedigrees (mutators in general stand a much smaller chance of segregating by recombination from mutations they inflict upon themselves (Leigh, 1970)). However, on this hypothesis we would expect third position changes to occur with higher frequency than first and second, and some of the differences might reasonably be expected to be functionally related. None of these expectations is met. For example, of the five amino acid differences noted in Figure 1, three are changes from hydrophilic to hydrophobic residues, and one (position 935) is a proline according to our sequence, an alanine according to Maki et al. (1983). Thus, it seems to us that the differences between our two sequences, while possibly attributable to mutation, are more likely due to sequencing errors.

There are several centers of symmetry in the sequence shown in Figure 1. The sequence between 1551 and 1573 corresponds to a canonical transcription termination site (Rosenberg & Court, 1979; Farnham & Platt, 1980). The calculated free energy of folding for this sequence is -12.6 kcal/mol (1 kcal = 4.184 kJ) (Zuker & Stiegler, 1981). Positions 2 to 32 can also form a relatively strong stem and loop structure (-20.6 kcal/mol). This too could act as a termination site, although additional 3′ sequence data will be needed before this region can be compared with other examples. The inverted repeat in the region of mutD-rnh overlap (nucleotides 514 to 528 and 559 to 573) is possibly involved in transcriptional or translational regulation.

S1 protection experiments (Berk & Sharp, 1977) were used to locate the initiation sites for the mutD and rnh transcripts (Fig. 2). For the sake of experimental convenience, these experiments were carried out with merodiploids (see the legend to Fig. 2). Results with haploids are qualitatively similar with regard to the numbers of transcripts, their 5′ origins, and relative amounts (results not shown). We can identify protected fragments from three regions. Two fragments originate in the sequence between the methionines corresponding to the N termini of the rnh and mutD genes and are of opposite polarity. In addition, we have identified a second mutD transcript originating at position 447. The organization of these transcripts, and possible promoter sequences and ribosome attachment sites, are summarized in Figure 3. Figure 2 also shows two mutD₂ and two rnh

© 1986 Academic Press Inc. (London) Ltd.

[Figure 1 sequence listing: nucleotide sequence of the mutD-rnh region, numbered from position 1 to approximately 1590, with the deduced amino acid sequences aligned to it.]

Figure 1. The mutD-rnh nucleotide and proposed amino acid sequences. Deletion sets containing extensive overlaps (Cox & Horner, 1983) were sequenced by the dideoxy method of Sanger et al. (1977). The first 540 bases read from 3′ to 5′, the remainder from 5′ to 3′. Those sequences differing from Maki et al. (1983) are marked with a small closed circle. The Fnu4HI and PvuI cut-sites used to prepare the probes described for Fig. 2 are marked with bold arrows at positions 364/365 and 745/746, respectively. Inverted and direct repeats are underlined. Potential RNA stem and loop structures can form at positions 2 to 32, 1399 to 1424, and 1550 to 1576, with free energies of formation -20.6, -12.0 and -12.6 kcal/mol, respectively (Zuker & Stiegler, 1981). Sequence hyphens have been omitted for clarity.
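The inverted repeats referred to in this legend, such as the one at nucleotides 514 to 528 and 559 to 573 (two 15-nucleotide arms separated by 30 nucleotides), are segments whose downstream arm is the reverse complement of the upstream arm. A minimal Python sketch of how such repeats can be located is given here. The function names and the toy sequence are hypothetical and are not taken from Figure 1. The search finds perfect repeats only and makes no attempt to reproduce the free energy calculation of Zuker & Stiegler (1981).

```python
# Illustrative sketch only: helper names and the toy sequence are hypothetical
# and are not taken from the Figure 1 sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq: str, arm: int, min_loop: int = 3, max_loop: int = 40):
    """List (left_start, right_start, arm) for perfect inverted repeats.

    Positions are 1-based. Two arms of length `arm` count as an inverted
    repeat when the right arm equals the reverse complement of the left arm
    and the spacer between them is min_loop..max_loop nucleotides long.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * arm - min_loop + 1):
        target = reverse_complement(seq[i:i + arm])
        for loop in range(min_loop, max_loop + 1):
            j = i + arm + loop
            if j + arm > len(seq):
                break  # right arm would run off the end of the sequence
            if seq[j:j + arm] == target:
                hits.append((i + 1, j + 1, arm))
    return hits


# Toy sequence: an 8-nucleotide arm, a 4-nucleotide loop, and the matching arm.
toy = "CCGCGGTCAGAAAACTGACCGCCC"
hits = find_inverted_repeats(toy, arm=8)
print(hits)
```

For a repeat like the one in the mutD-rnh overlap, the call would use arm=15 and a max_loop of at least 30. Real stems usually tolerate mismatches and G-U pairs, so an exact-match search of this kind is only a first pass.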

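[Figure 2 gel image: lanes a to e for the Fnu4HI probe, lanes a to e for the PvuI probe, and sequencing standard lanes A, T, C and G; band labels mark the mutD and rnh transcripts.]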

Figure 2. Transcripts from the mutD and rnh regions. The 488 nucleotide BamHI fragment (positions 271 to 758) was labeled on the 5′ ends with ³²P, cut separately with Fnu4HI and PvuI, and used to probe for RNA initiation sites (Berk & Sharp, 1977). S1-digested samples were electrophoresed on 6% (w/v) acrylamide/6 M-urea gels. The strains used for preparing RNA were plasmid-carrying derivatives of W3110 described previously, KH1265 and KH1266 (Cox & Horner, 1983). KH1265 is mutD5 on the chromosome and carries the 1592 base-pair EcoRI mut+ fragment on pBR322. KH1266 is coisogenic, carrying mut+ on the chromosome and mutD5 on the plasmid. RNA was prepared from cultures grown in minimal medium (Vogel & Bonner, 1956). The Fnu4HI lanes contain: a, probe mixed and annealed with KH1265 mut+ RNA but not digested with S1; b, KH1265 mut+ RNA annealed with the Fnu4HI probe and digested with 1667 units of S1/ml; c, KH1265 mut+ RNA annealed with the Fnu4HI probe and digested with 3334 units of S1/ml; d, KH1266 mutD5 RNA annealed with the Fnu4HI probe and digested with 1667 units of S1/ml; e, KH1266 mutD5 RNA annealed with the Fnu4HI probe and digested with 3334 units of S1/ml. The PvuI lanes contain corresponding RNA samples and S1 reaction conditions using the PvuI probe. The 484 nucleotide BamHI fragment (position 270 to 759, Fig. 1) was sequenced by the dideoxy method and is included as a standard.
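The way a protected fragment length maps back to a transcript start can be illustrated with a short calculation. The following Python sketch assumes that the Fnu4HI-cut probe covers positions 365 to 758 and carries its 5′ label at position 758 on the strand complementary to the mutD messages; that strand assignment is an assumption made for the example and is not stated in the legend. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    """End-labeled DNA probe, in Figure 1 coordinates (hypothetical model)."""
    start: int      # lowest position covered by the probe
    end: int        # highest position covered by the probe
    label_pos: int  # position of the 5' labeled nucleotide


def protected_length(probe: Probe, transcript_start: int) -> Optional[int]:
    """Length of labeled DNA protected from S1 by an RNA with this 5' end.

    Assumes the RNA extends from its start through the labeled end of the
    probe. Returns None when the start lies outside the probe, in which
    case no informative labeled fragment is expected.
    """
    if not probe.start <= transcript_start <= probe.end:
        return None
    return abs(probe.label_pos - transcript_start) + 1


# Assumed geometry: Fnu4HI cut at 364/365, BamHI end labeled at 758.
fnu4hi_probe = Probe(start=365, end=758, label_pos=758)

# The mutD transcript reported to originate at position 447.
print(protected_length(fnu4hi_probe, 447))
```

Under these assumptions the transcript that starts at position 447 would protect 758 - 447 + 1 = 312 nucleotides of the labeled strand. Read in the other direction, a band of known length sized against the sequencing standard gives the transcript start.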

transcripts, differing in size by three and two nucleotides, respectively. These are typical results.

It is possible that the shorter one of each pair is a degradation product, or that RNA polymerase initiates at two closely spaced sites. The PvuI lanes in Figure 2 appear to contain traces of the mutD₁ message. This is because the PvuI restriction site is nine base-pairs from the labeled BamHI site, and thus during preparation the PvuI-cut probe is not always cleanly separated from the parental fragment.

Although it is difficult to assess the relative concentrations of mutD and rnh transcripts from the data of Figure 2, the amounts of the two mutD messages approximately equal those of rnh. If anything, the rnh transcripts appear to be more abundant. Figure 2 shows that mutD5 and mut+ cells contain roughly equal amounts of both mutD and rnh transcripts (the Fnu4HI d and e lanes in Fig. 2 suggest otherwise, but these differences are not reproducible). Thus, it does not seem likely that the mutD⁻ phenotype can be attributed to altered mutD or rnh transcript levels. This is consistent with the location of the mutD alleles isolated in this laboratory (Cox & Horner, 1982), all of which map in the mutD protein-coding region (Horner & Cox, unpublished results), and with the observations of Echols et al. (1983) that it is the ε subunit itself that appears to be defective in proofreading. In this context it is worth recalling that the mutation rates characteristic of many mutD alleles depend on intracellular thymidylate pools (Erlich & Cox, 1980), and that the ε subunit binds TTP (Biswas & Kornberg, 1984). These observations, and those described here, make it likely that the TTP dependence of the mutD phenotype is explained by a failure of mutant ε to bind TTP correctly.

Erlich & Cox (1980) showed that yeast extract contains a mutational effector in addition to thymidine. We have systematically looked for two kinds of consensus sequences that might help us to understand the role played by the second effector(s). On the assumption that the SOS system (Witkin, 1976) might be involved in this response, we looked without success for both the lex promoter consensus sequence shared by many genes involved in the SOS response (Brandsma et al., 1983) and the CAP-binding consensus sequence (Ebright et al., 1984).

The proposed promoter sequences in Figure 3 preserve few features of the canonical structure suggested by Siebenlist et al. (1980). The rnh -10 and -35 sequences fit best, although even here the -10 site lies only seven nucleotides from the initiation site of one of the transcripts. The same is true for mutD transcripts 1 and 2. The possible -35 sites for these two messages are even more tentative, and perhaps the most that can be said about them is that they are approximately the right distance from the -10 sites and are very A + T-rich.

The existence of two mutD transcripts raises the

[Figure 3 diagram: both strands of the mutD-rnh overlap region, showing the start points of the mutD₁, mutD₂ and rnh transcripts, the proposed -10 and -35 sites, and the Met initiation codons.]

Figure 3. Overlapping and diverging regulatory sequences for the mutD and rnh transcripts. Possible ribosome binding sites are underlined (Shine & Dalgarno, 1974; Stormo et al., 1982). Possible RNA polymerase binding sites and methionine initiation codons are boxed. Transcript startpoints are indicated by triangles. Sequence hyphens have been omitted for clarity.
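The length of the potential RNA duplex between two divergent, overlapping transcripts follows directly from their start points. The Python sketch below works through that arithmetic for the mutD transcript that starts at position 447 and the longer duplex of 103 to 106 base-pairs discussed in the text. It assumes that the duplex runs from the mutD start to the 5′ end of the rnh transcript and that each transcript extends past the start of the other. The rnh start positions it prints are inferred from those numbers for illustration and are not values reported in the text. The function names are hypothetical.

```python
def duplex_length(rightward_start: int, leftward_start: int) -> int:
    """Base-pairs of complementarity between two divergent transcripts.

    One transcript starts at `rightward_start` and runs toward higher
    positions; the other starts at `leftward_start` and runs toward lower
    positions. They can pair only over the interval between the two starts.
    """
    return max(0, leftward_start - rightward_start + 1)


def implied_leftward_start(rightward_start: int, length: int) -> int:
    """Leftward transcript start that would give a duplex of `length`."""
    return rightward_start + length - 1


MUTD_START = 447  # start of the mutD transcript that lies within rnh

# Inferred rnh 5' ends for the reported duplex lengths (an assumption).
for length in (103, 106):
    print(length, implied_leftward_start(MUTD_START, length))
```

On these assumptions a duplex of 103 to 106 base-pairs places the rnh 5′ ends at about positions 549 to 552. The same function returns zero for a pair of transcripts whose starts do not face each other.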

possibility that one may be transcribed independently of the other. It is also clear that both mutD transcripts can form double-stranded complementary structures with the rnh transcript. The shorter potential double helix, a minimum of ten base-pairs long, does not block the proposed ribosome binding site on the rnh transcript, while the longer of the two, 103 to 106 base-pairs long, clearly does. This second structure would lower rnh translation rates by blocking ribosome binding to the rnh message in a way akin to ompF regulation (Coleman et al., 1984) and IS10 transposition (Simons & Kleckner, 1983).

It is worth noting in this regard that Maki et al. (1983) have measured β-galactosidase levels in strains in which both the mutD and rnh promoter regions are fused to β-galactosidase. Their measurements led them to suggest that the mutD promoter is approximately five times stronger than the rnh promoter, a result that does not seem to be reflected in the transcript levels measured here. This could be due to instability of one of their fusion products, or to the intrinsic problems of estimating message abundance from S1 protection experiments. However, it is also possible that, as in ompF regulation, the β-galactosidase ratios measured by Maki et al. (1983) reflect the possibility that mutD₁ transcription prevents translation of rnh message by forming a double-stranded RNA whose rnh ribosome binding site is blocked. Whether or not the proposed double-stranded intermediate exists in the cell, and if so, under what conditions, will require further study.

This research was supported by grant GM28923 awarded by the National Institutes of Health.

After the manuscript was submitted for publication, very similar S1 protection results were reported by Nomura et al. (1985).

Edward C. Cox
Deborah L. Horner

Department of Biology
Princeton University
Princeton, N.J. 08544, U.S.A.

Received 26 July 1985, and in revised form 10 January 1986

References

Berk, A. J. & Sharp, P. A. (1977). Cell, 12, 721-732.
Biswas, S. B. & Kornberg, A. (1984). J. Biol. Chem. 259, 7990-7993.
Brandsma, J. A., Bosch, D., Backendorf, C. & van de Putte, P. (1983). Nature (London), 305, 243-245.
Coleman, J., Green, P. J. & Inouye, M. (1984). Cell, 37, 429-436.
Cox, E. C. & Horner, D. L. (1982). Genetics, 100, 7-18.
Cox, E. C. & Horner, D. L. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 2295-2299.
DiFrancesco, R., Bhatnagar, S. K., Brown, A. & Bessman, M. J. (1984). J. Biol. Chem. 259, 5567-5573.
Ebright, R. H., Cossart, P., Gicquel-Sanzey, B. & Beckwith, J. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 7274-7278.
Echols, H., Lu, C. & Burgers, P. M. J. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 2189-2192.
Erlich, H. A. & Cox, E. C. (1980). Mol. Gen. Genet. 178, 703-708.
Farnham, P. J. & Platt, T. (1980). Cell, 20, 739-748.
Horiuchi, T., Maki, H., Maruyama, M. & Sekiguchi, M. (1981). Proc. Nat. Acad. Sci., U.S.A. 78, 3770-3774.
Kanaya, S. & Crouch, R. J. (1983). J. Biol. Chem. 258, 1276-1281.
Leigh, E. G. Jr (1970). Amer. Nat. 104, 301-305.
Maki, H., Horiuchi, T. & Sekiguchi, M. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 7137-7141.
Nomura, T., Aiba, H. & Ishihama, A. (1985). J. Biol. Chem. 260, 7122-7125.
Rosenberg, M. & Court, D. (1979). Annu. Rev. Genet. 13, 319-353.
Sanger, F., Nicklen, S. & Coulson, A. R. (1977). Proc. Nat. Acad. Sci., U.S.A. 74, 5463-5467.
Scheuermann, R. H. & Echols, H. (1984). Proc. Nat. Acad. Sci., U.S.A. 81, 7747-7751.
Scheuermann, R., Tam, S., Burgers, P. M. J., Lu, C. & Echols, H. (1983). Proc. Nat. Acad. Sci., U.S.A. 80, 7085-7089.
Shine, J. & Dalgarno, L. (1974). Proc. Nat. Acad. Sci., U.S.A. 71, 1342-1346.
Siebenlist, U., Simpson, R. B. & Gilbert, W. (1980). Cell, 20, 269-281.
Simons, R. W. & Kleckner, N. (1983). Cell, 34, 683-691.
Stormo, G. D., Schneider, T. D. & Gold, L. M. (1982). Nucl. Acids Res. 10, 2971-2996.
Vogel, H. J. & Bonner, D. M. (1956). J. Biol. Chem. 218, 97-106.
Witkin, E. M. (1976). Bacteriol. Rev. 40, 869-907.
Zuker, M. & Stiegler, P. (1981). Nucl. Acids Res. 9, 133-148.

Edited by M. Gottesman