Sequence of hrdB, an essential gene encoding sigma-like transcription factor of Streptomyces coelicolor A3(2): homology to principal sigma factors

Gene, 107 (1991) 145-148
© 1991 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/91/$03.50

GENE 06107

(RNA polymerase; rpoD box; Gram-positive bacteria; sequence similarity; recombinant DNA)

Tetsuo Shiina, Kan Tanaka and Hideo Takahashi

Institute of Applied Microbiology, University of Tokyo, Bunkyo-ku, Tokyo 113 (Japan)

Received by A. Nakazawa: 15 March 1991
Revised/Accepted: 21 July/22 July 1991
Received at publishers: 20 August 1991

SUMMARY

The complete nucleotide sequence of the hrdB gene, an essential gene of Streptomyces coelicolor A3(2), indicates the presence of an open reading frame encoding a putative polypeptide of 442 amino acid (aa) residues with an Mr of 48412. The principal σ-like transcriptional factor of S. coelicolor (HrdB) showed extensive aa sequence homology with the known principal σ factors of Escherichia coli, Bacillus subtilis, Pseudomonas aeruginosa and Myxococcus xanthus. The degree of sequence similarity between HrdB and the known principal σ factors was distinct from that observed between the principal σ factors and the alternative (minor) σ factors. Essentially all of the functional domains proposed for the principal σ factor of E. coli were conserved in HrdB. The putative σ factor, HrdB, like that of B. subtilis, had a short internal nonconserved region, which might be characteristic of Gram+ species.

INTRODUCTION

Using an oligo probe (rpoD probe), designed from the sequence of an aa stretch that is completely conserved between the principal σ factors of E. coli and B. subtilis, we have previously identified and cloned four regions from S. coelicolor A3(2) (Tanaka et al., 1988; 1991). By sequencing, we found coding regions that were highly similar to the principal σ factors of E. coli and B. subtilis. In particular, a sequence of 13 aa residues in the putative coding region that was identical to those in the principal σ factors was designated the 'rpoD box'. These genes were named hrdA, hrdB, hrdC, and hrdD (they appear to be homologs of the rpoD gene).

Distribution of hrd sequences among different Streptomyces strains, analyzed by genomic Southern hybridization, indicated that only the hrdB homologs were present in all of the Streptomyces species examined (Takahashi et al., 1988). It is likely that the hrdB homologs are essential for the normal growth of mycelial Streptomyces strains and that the hrdB gene product is the functional homolog of the principal σ factors involved in the transcription of housekeeping genes. The lethal effect of hrdB gene disruption supports this notion (Buttner et al., 1990; K.T. and H.T., unpublished observation). The aim of this study was to determine the complete nt sequence of the hrdB gene and to compare the putative HrdB protein with the known principal σ factors.

Correspondence to: Dr. H. Takahashi, Institute of Applied Microbiology, University of Tokyo, Bunkyo-ku, Tokyo 113 (Japan) Tel. (81-3)3812-2111, Ext. 7825; Fax (81-3)3813-0539.

Abbreviations: aa, amino acid(s); bp, base pair(s); hrdB, gene encoding HrdB; HrdB, principal σ-like transcriptional factor of Streptomyces coelicolor; kb, kilobase or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; rpoD, gene encoding principal σ factor in E. coli.

EXPERIMENTAL AND DISCUSSION

(a) Nucleotide sequence of hrdB region

A 4%kb SalI fragment derived from the DNA of S. coelicolor A3(2) containing the hrdB gene has been cloned in the SalI site of pTZ19R, resulting in pCSB1 (Tanaka et al., 1988). Fig. 1 shows the nt sequence of the 1986-bp NotI-SphI fragment. The cloned DNA in the sequenced region was surveyed by GC plot (FRAME analysis) (Nagaso et al., 1988). There are five possible start codons between a stop codon (TAA) at nt 123 and nt 630. An ATG codon at nt 479 was considered to be the start codon of the gene, taking into account the presence of a possible ribosome-binding sequence, 5'-GAGGAAGAGGG, at nt 558-568 preceding the ATG codon. The ORF terminates at a TAG codon at nt 1815. Thus the hrdB ORF, spanning 1326 bp, encodes a putative polypeptide of 442 aa with an Mr of 48412. The deduced aa sequence of HrdB is shown below the nt sequence in Fig. 1. A possible stem-loop structure was found at nt 1873-1897, shown by a pair of convergent dashed arrows. This sequence may function as a Rho-independent transcriptional stop signal. If this sequence is transcribed, the RNA can form a stable stem-loop structure having ΔG = −26.8 kcal/mol.

Fig. 1. Nucleotide sequence of hrdB and the deduced aa sequence. The entire regions were sequenced by the dideoxy chain-termination method (Sanger et al., 1977) using Sequenase™ (US Biochemical Co., Cleveland, OH). The aa are aligned with the first nt of each codon. The underlined sequence upstream from the translational start codon indicates a possible ribosome-binding sequence. Dashed convergent arrows downstream from the stop codon mark an inverted repeat. The sequence was deposited in DDBJ, GenBank and EMBL databases under accession No. X52983.

Fig. 2. Alignment of the aa sequence of HrdB protein (ScB) with known principal σ factors. Identical and conserved aa among the five proteins are marked with asterisks and plus symbols, respectively. Sequence data of rpoD homologs were taken from Burton et al. (1981) for E. coli (Ec), Gitt et al. (1985) for B. subtilis (Bs), Inouye (1990) for M. xanthus (Mx), and Tanaka and Takahashi (1991) for P. aeruginosa (PaA). Conserved substitutions were defined as pairs of residues belonging to one of the following groups: (A,G), (D,E,N,Q), (H,K,R), (M,V,L,I), (F,Y,W), (S,T), C, P. Region V1, spanning from the N terminus to region C1, is not shown in the figure. Only the numbers of aa residues in the V2 regions are shown. The functional domains (1B, 2.1, 2.2, 2.3, 2.4, 3, 4.1, and 4.2) proposed by Gribskov and Burgess (1986) are also presented.
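The GC plot mentioned in section (a) can be illustrated with a minimal sketch. It assumes that the survey was a positional G+C calculation of the kind used in FRAME-style analyses, which exploit the tendency of genes from high-G+C organisms such as Streptomyces to carry G or C at the third codon position. The paper gives no parameters, so the function names, the absence of a sliding window, and the toy sequence are all illustrative assumptions, not the authors' procedure.

```python
def gc_by_codon_position(seq, frame=0):
    """Return the G+C fraction at codon positions 1, 2 and 3 of one reading frame."""
    counts = [0, 0, 0]   # G or C seen at each codon position
    totals = [0, 0, 0]   # all bases seen at each codon position
    for i, base in enumerate(seq.upper()[frame:]):
        pos = i % 3
        totals[pos] += 1
        if base in "GC":
            counts[pos] += 1
    return [c / t if t else 0.0 for c, t in zip(counts, totals)]


def likely_coding_frame(seq):
    """Pick the frame (0, 1 or 2) with the highest third-position G+C fraction."""
    return max(range(3), key=lambda f: gc_by_codon_position(seq, f)[2])


# Hypothetical toy sequence, not taken from hrdB: every codon ends in C.
toy = "GACGACGACGACATCATCATCATC"
print(gc_by_codon_position(toy))   # [0.5, 0.0, 1.0]
print(likely_coding_frame(toy))    # 0
```

On a real sequence one would normally compute these fractions in a sliding window along the DNA, so that a coding region appears as a stretch in which one frame shows a consistently high third-position G+C content.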

(b) The HrdB protein shares a common structure with the known principal σ factors

Although aa sequences of principal σ factors or principal σ-like proteins from several eubacteria have been reported, the principal σ factor (σ43) of B. subtilis was the only one reported from a Gram+ species. HrdB is the second example from Gram+ bacteria. Accordingly, it is important to compare the HrdB sequence with other principal σ factors to establish the basic structure of the principal σ factors among divergent eubacterial species.

Sequence alignment of HrdB with the principal σ factors of E. coli, P. aeruginosa and M. xanthus (Gram− species) and B. subtilis (a Gram+ species) allows us to divide these proteins into four regions (C1, C2, V1 and V2) (Fig. 2). The N-terminal regions, being heterogeneous in length and sequence, were designated nonconserved region 1 (V1 region; not shown in the alignment of Fig. 2), and the following 31 aa residues, which show appreciable sequence similarity among the proteins, were called the C1 region (conserved region 1). The internal regions heterogeneous in length and sequence, which include the large gap found between the principal σ factors of E. coli and B. subtilis, were designated nonconserved region 2 (V2). The V2 regions of the B. subtilis protein and HrdB consist of only 4 and 31 aa residues, respectively. In contrast, the principal σ factors of the Gram− bacteria have very large V2 regions consisting of 243 to 251 aa residues. Short and long V2 regions thus appear to be characteristic of the principal σ factors of Gram+ and Gram− species, respectively. The segments of 237 aa extending from the end of the V2 regions to the C termini were highly similar among the five principal σ or σ-like proteins and were named the C2 region (conserved region 2). One of the most remarkable features of this alignment is that there is neither an insertion nor a deletion in the conserved 31 aa residues of the C1 region or the 237 aa residues of the C2 region. The sequence similarities in these two regions among the principal σ factors from divergent eubacterial species were 51.9% at the identical aa level and 78.1% at the conserved aa level. This degree of conservation in C1 and C2 among the principal σ factors and HrdB is distinct from that observed between the principal and the alternative σ factors (Stragier et al., 1985; Helmann and Chamberlin, 1988). These observations suggest that HrdB shares a basic structure and function with the principal σ-like transcriptional factors of eubacterial strains.

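The identity and conservation percentages quoted above can be sketched as a column-by-column count over an ungapped alignment, using the substitution groups given in the Fig. 2 legend. This is only an illustration: the function names and the toy alignment are hypothetical, and counting identical columns as part of the "conserved" total is an assumption, since the paper does not state how the 78.1% figure was tallied.

```python
# Conservative-substitution groups as listed in the Fig. 2 legend.
GROUPS = ["AG", "DENQ", "HKR", "MVLI", "FYW", "ST", "C", "P"]
GROUP_OF = {aa: idx for idx, group in enumerate(GROUPS) for aa in group}


def classify_column(column):
    """Label one alignment column as 'identical', 'conserved' or 'variable'."""
    residues = set(column)
    if len(residues) == 1:
        return "identical"
    groups = {GROUP_OF.get(aa) for aa in residues}
    if None not in groups and len(groups) == 1:
        return "conserved"
    return "variable"


def similarity(seqs):
    """Return (% identical, % identical-or-conserved) for an ungapped alignment."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("all sequences must have the same length")
    labels = [classify_column(col) for col in zip(*seqs)]
    identical = labels.count("identical")
    conserved = identical + labels.count("conserved")  # assumption: includes identities
    return 100.0 * identical / length, 100.0 * conserved / length


# Hypothetical toy alignment, not the sequences of Fig. 2.
toy = ["DPVKD", "DPVRM", "DPIRL"]
print(similarity(toy))   # (40.0, 80.0)
```

Because the C1 and C2 regions contain no insertions or deletions, an ungapped comparison of this kind is sufficient for them; the V1 and V2 regions would need a gap-aware treatment.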
Essentially all of the important functional domains proposed by Gribskov and Burgess (1986) were included in the C1 and C2 regions. We have previously reported an identical 13-aa stretch at the junction of regions 2.3 and 2.4 in the C2 region and named it the 'rpoD box' (Tanaka et al., 1988). In fact, HrdB shares a stretch of 18 aa residues with the other four principal σ factors in the region including the rpoD box. In the possible helix-turn-helix motif in region 4.2 (underlined in Fig. 2), only four of the 20 aa residues differ between the principal σ factor of B. subtilis and HrdB. Moreover, all of the nonidentical aa are conservative substitutions. Since the motif in region 4.2 is responsible for recognition of the −35 sequence of promoters, it is likely that the principal σ factor of B. subtilis and HrdB recognize promoters having very similar or identical sequences.
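A pairwise comparison of the kind described for the region 4.2 motif can be sketched as follows, applying the substitution groups from the Fig. 2 legend to two aligned motifs. The function name and the two 10-residue strings are hypothetical placeholders, not the actual B. subtilis and HrdB motifs.

```python
# Conservative-substitution groups as listed in the Fig. 2 legend.
GROUPS = ["AG", "DENQ", "HKR", "MVLI", "FYW", "ST", "C", "P"]
GROUP_OF = {aa: idx for idx, group in enumerate(GROUPS) for aa in group}


def substitutions(a, b):
    """List (position, residue_a, residue_b, is_conservative) for each mismatch."""
    if len(a) != len(b):
        raise ValueError("motifs must have the same length")
    result = []
    for pos, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            # Conservative only if both residues fall in the same listed group.
            conservative = x in GROUP_OF and GROUP_OF.get(x) == GROUP_OF.get(y)
            result.append((pos, x, y, conservative))
    return result


# Hypothetical toy motifs, not the sequences of Fig. 2.
motif_a = "GVTRERIRQI"
motif_b = "GVSRERVRQL"
diffs = substitutions(motif_a, motif_b)
print(len(diffs), all(d[3] for d in diffs))   # 3 True
```

Applied to the two real 20-aa motifs, the same function would be expected to report four mismatches, all flagged as conservative, if the alignment matches the description in the text.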

ACKNOWLEDGEMENTS

This work was supported in part by grants for scientific research from the Ministry of Education, Culture, and Science of Japan and from the Institute of Physical and Chemical Research (RIKEN).

REFERENCES

Burton, Z.F., Burgess, R.R., Lin, J., Moore, D., Holder, S. and Gross, C.A.: The nucleotide sequence of the cloned rpoD gene for the RNA polymerase sigma subunit from Escherichia coli. Nucleic Acids Res. 9 (1981) 2889-2903.

Buttner, M.J., Chater, K.F. and Bibb, M.J.: Cloning, disruption, and transcriptional analysis of three RNA polymerase sigma factor genes of Streptomyces coelicolor A3(2). J. Bacteriol. 172 (1990) 3367-3378.

Gitt, M.A., Wang, L.-F. and Doi, R.H.: A strong sequence homology exists between the major RNA polymerase sigma factors of Bacillus subtilis and Escherichia coli. J. Biol. Chem. 260 (1985) 7178-7185.

Gribskov, M. and Burgess, R.R.: Sigma factors from Escherichia coli, Bacillus subtilis, phage SP01, and phage T4 are homologous proteins. Nucleic Acids Res. 14 (1986) 6745-6763.

Helmann, J.D. and Chamberlin, M.J.: Structure and function of bacterial sigma factors. Annu. Rev. Biochem. 57 (1988) 839-872.

Inouye, S.: Cloning and DNA sequence of the gene coding for the major sigma factor from Myxococcus xanthus. J. Bacteriol. 172 (1990) 80-85.

Nagaso, H., Saito, S., Saito, H. and Takahashi, H.: Nucleotide sequence and expression of a Streptomyces griseosporeus proteinaceous alpha-amylase inhibitor (HaimII) gene. J. Bacteriol. 170 (1988) 4451-4457.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Stragier, P., Parsot, C. and Bouvier, J.: Two functional domains conserved in major and alternate bacterial sigma factors. FEBS Lett. 187 (1985) 11-15.

Takahashi, H., Tanaka, K. and Shiina, T.: Genetic constituent of rpoD gene homologues in Streptomyces strains. In: Okami, Y., Beppu, T. and Ogawara, H. (Eds.), Biology of Actinomycetes '88. Japan Scientific Societies Press, Tokyo, 1988, pp. 58-63.

Tanaka, K. and Takahashi, H.: Cloning and analysis of the gene (rpoDA) for the principal sigma factor of Pseudomonas aeruginosa. Biochim. Biophys. Acta 1089 (1991) 113-119.

Tanaka, K., Shiina, T. and Takahashi, H.: Multiple principal sigma factor homologs in eubacteria: identification of the 'rpoD box'. Science 242 (1988) 1040-1042.

Tanaka, K., Shiina, T. and Takahashi, H.: Nucleotide sequence of genes hrdA, hrdC, and hrdD from Streptomyces coelicolor A3(2) having similarity to rpoD gene. Mol. Gen. Genet. (in press).