Gene, 107 (1991) 145-148
© 1991 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/91/$03.50

GENE 06107
Sequence of hrdB, an essential gene encoding sigma-like transcription factor of Streptomyces coelicolor A3(2): homology to principal sigma factors

(RNA polymerase; rpoD box; Gram-positive bacteria; sequence similarity; recombinant DNA)
Tetsuo Shiina, Kan Tanaka and Hideo Takahashi
Institute of Applied Microbiology, University of Tokyo, Bunkyo-ku, Tokyo 113 (Japan)
Received by A. Nakazawa: 15 March 1991
Revised/Accepted: 21 July/22 July 1991
Received at publishers: 20 August 1991
SUMMARY
The complete nucleotide sequence of the hrdB gene, an essential gene of Streptomyces coelicolor A3(2), indicates the presence of an open reading frame encoding a putative polypeptide of 442 amino acid (aa) residues with an Mr of 48412. The principal σ-like transcriptional factor of S. coelicolor (HrdB protein) showed extensive aa sequence homology with the known principal σ factors of Escherichia coli, Bacillus subtilis, Pseudomonas aeruginosa and Myxococcus xanthus. The degree of sequence similarity between HrdB protein and the known principal σ factors was distinct from that observed between the principal σ factors and the alternative (minor) σ factors. Essentially all of the functional domains proposed for the principal σ factor of E. coli were conserved in HrdB protein. The putative σ factor, HrdB, like that of B. subtilis, had a short internal nonconserved region, which might be characteristic of Gram-positive species.
INTRODUCTION
Using an oligo probe (rpoD probe), designed from the sequence of an aa stretch that is completely conserved between the principal σ factors of E. coli and B. subtilis, we have previously identified and cloned four regions from S. coelicolor A3(2) (Tanaka et al., 1988; 1991). By sequencing, we found coding regions which were highly similar to the principal σ factors of E. coli and B. subtilis. In particular, a sequence of 13 aa residues in the putative coding region that was identical to those in the principal σ factors was designated 'rpoD box'. These genes were named hrdA, hrdB, hrdC, and hrdD (they appear to be homologs of the rpoD gene).

Distribution of hrd sequences among different Streptomyces strains analyzed by genomic Southern hybridization indicated that only the hrdB homologs were present in all of the Streptomyces species examined (Takahashi et al., 1988). It is likely that the hrdB homologs are essential for the normal growth of mycelial Streptomyces strains and that the hrdB gene product is the functional homolog of the principal σ factors involved in the transcription of housekeeping genes. The lethal effect of the hrdB gene disruption supports this notion (Buttner et al., 1990; K.T. and H.T., unpublished observation). The aim of this study was to determine the complete nt sequence of the hrdB gene and to compare the putative HrdB protein with the known principal σ factors.

Correspondence to: Dr. H. Takahashi, Institute of Applied Microbiology, University of Tokyo, Bunkyo-ku, Tokyo 113 (Japan) Tel. (81-3)3812-2111, Ext. 7825; Fax (81-3)3813-0539.

Abbreviations: aa, amino acid(s); bp, base pair(s); HrdB, principal σ-like transcriptional factor of Streptomyces coelicolor; hrdB, gene encoding HrdB; kb, kilobase or 1000 bp; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; rpoD, gene encoding principal σ factor in E. coli.

[Fig. 1: nucleotide sequence of the 1986-nt sequenced region, with the deduced 442-aa sequence of HrdB printed beneath the corresponding codons.]
Fig. 1. Nucleotide sequence of hrdB and the deduced aa sequence. The entire regions were sequenced by the dideoxy chain-termination method (Sanger et al., 1977) using Sequenase™ (US Biochemical Co., Cleveland, OH). The aa are aligned with the first nt of each codon. The underlined sequence upstream from the translational start codon indicates a possible ribosome-binding sequence. Dashed convergent arrows downstream from the stop codon mark an inverted repeat. The sequence was deposited in DDBJ, GenBank and EMBL databases under accession No. X52983.

EXPERIMENTAL AND DISCUSSION

(a) Nucleotide sequence of hrdB region
A 4%kb SalI fragment derived from the DNA of S. coelicolor A3(2) containing the hrdB gene has been cloned in the SalI site of pTZ19R resulting in pCSB1 (Tanaka
et al., 1988). Fig. 1 shows the nt sequence of the 1986-bp NotI-SphI fragment. The cloned DNA in the sequenced region was surveyed by GC plot (FRAME analysis) (Nagaso et al., 1988). There are five possible start codons between a stop codon (TAA) at nt 123 and nt 630. An ATG codon at nt 479 was considered to be the start codon of the gene, taking into account the presence of a possible ribosome-binding sequence, 5'-GAGGAAGAGGG, at nt 558-568 preceding the ATG codon. The ORF terminates at a TAG codon at nt 1815. Thus the hrdB ORF, spanning 1326 bp, encodes a putative polypeptide of 442 aa with an Mr of 48412. The deduced aa sequence of HrdB is shown below the nt sequence in Fig. 1. A possible stem-loop structure was found at nt 1873-1897, shown by a pair of convergent dashed arrows. This sequence may function as a Rho-independent transcriptional stop signal. If this sequence is transcribed, the RNA can form a stable stem-loop structure having ΔG = -26.8 kcal/mol.

[Fig. 2: alignment of the C1 and C2 regions of the five proteins (Bs, Ec, PaA, Mx, ScB), with functional domains 1B, 2.1, 2.2, 2.3, 2.4, 3, 4.1 and 4.2 marked above the sequences. V2 region lengths given in the figure: Bs, 4 aa; Ec, 249 aa; PaA, 251 aa; Mx, 243 aa; ScB, 31 aa.]
Fig. 2. Alignment of the aa sequence of HrdB protein (ScB) with known homologs of principal σ factors. Sequence data of rpoD homologs were taken from Burton et al. (1981) for E. coli (Ec), Gitt et al. (1985) for B. subtilis (Bs), Inouye (1990) for M. xanthus (Mx), and Tanaka and Takahashi (1991) for P. aeruginosa (PaA). Identical and conserved aa among the five proteins are marked with asterisks and plus symbols, respectively. Conserved substitutions were defined as pairs of residues belonging to one of the following groups: (A,G), (D,E,N,Q), (H,K,R), (M,V,L,I), (F,Y,W), (S,T), C, P. Region V1, spanning between the N terminus and region C1, is not shown in the figure. Only the numbers of aa residues in the V2 regions are shown. The functional domains (1B, 2.1, 2.2, 2.3, 2.4, 3, 4.1, and 4.2) proposed by Gribskov and Burgess (1986) are also presented.
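The ORF bookkeeping in section (a), a 442-codon reading frame spanning 1326 bp and ending at the first in-frame stop codon, can be illustrated with a minimal Python sketch. The function names and the 16-nt toy sequence are assumptions introduced here for illustration; they are not part of the hrdB data, and the sketch does not reproduce the FRAME analysis used in the study.

```python
# Sketch only: helper names and the toy sequence are illustrative assumptions.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def orf_length_bp(n_residues: int) -> int:
    """Length in bp of the coding part of an ORF (stop codon excluded)."""
    return 3 * n_residues


def scan_orf(seq: str, start: int) -> tuple[int, int]:
    """Walk codon by codon from a 1-based start position.

    Returns (number of sense codons, 1-based position of the first nt of
    the first in-frame stop codon).
    """
    seq = seq.upper()
    i = start - 1
    if seq[i:i + 3] != "ATG":
        raise ValueError("no ATG at the given start position")
    n_codons = 0
    while i + 3 <= len(seq):
        if seq[i:i + 3] in STOP_CODONS:
            return n_codons, i + 1
        n_codons += 1
        i += 3
    raise ValueError("no in-frame stop codon found")


if __name__ == "__main__":
    print(orf_length_bp(442))      # 442 codons x 3 nt = 1326
    toy = "CCATGGCTAAGTAGGG"       # hypothetical sequence with ATG at nt 3
    print(scan_orf(toy, 3))        # (3, 12)
```

For the toy sequence the scan reads ATG, GCT and AAG before reaching TAG, so it reports three sense codons and a stop codon beginning at nt 12.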
(b) The HrdB protein shares a common structure with the known principal σ factors
Although aa sequences of principal σ factors or principal σ-like proteins from several eubacteria have been reported, the principal σ factor (σ43) of B. subtilis was the only sequence reported from a Gram-positive species. HrdB is the second example from Gram-positive bacteria. Accordingly, it is important to compare the HrdB sequence with other principal σ factors to determine the basic structure of the principal σ factors among divergent eubacterial species.
Sequence alignment of HrdB with the principal σ factors of E. coli, P. aeruginosa and M. xanthus, which belong to Gram-negative species, and of B. subtilis, a Gram-positive species, allows us to divide these proteins into four regions (C1, C2, V1 and V2) (Fig. 2). The N-terminal regions, being heterogeneous in length and sequence, were designated as nonconserved region 1 (V1 region; not shown in the alignment of Fig. 2), and the following 31-aa residues, having an appreciable sequence similarity among them, were called C1 region (conserved region 1). Then the internal regions heterogeneous in length and sequence, which include the large gap found between the principal σ factors of E. coli and B. subtilis, were designated as nonconserved region 2 (V2). The V2 regions of B. subtilis protein and HrdB consist of only 4- and 31-aa residues, respectively. In contrast, the principal σ factors of the Gram-negative bacteria have very large V2 regions consisting of 243- to 251-aa residues. The short vs. long V2 regions appear to be characteristic of the principal σ factors of Gram-positive vs. Gram-negative species, respectively. Segments of 237 aa ranging from the end of the V2 regions to the C termini were highly similar among the five principal σ or σ-like proteins and named C2 region (conserved region 2). One of the most remarkable features of this alignment is that there is neither an insertion nor a deletion in the conserved 31-aa residues in the C1 region and the 237-aa residues in the C2 region. The sequence similarities in these two regions among the principal σ factors from divergent eubacterial species were 51.9% at the identical aa and 78.1% at the conserved aa levels. The degree of conservation of the C1 and C2 sequences among the principal σ factors and HrdB is distinct from that observed between the principal and alternative σ factors (Stragier et al., 1985; Helmann and Chamberlin, 1988). These observations suggest that HrdB shares a basic structure and function with the principal σ-like transcriptional factors of eubacterial strains.
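The identity and conservation percentages quoted above are column counts over the gap-free C1 and C2 regions (31 + 237 = 268 aligned positions). A minimal Python sketch of that kind of count follows. It uses the substitution groups listed in the legend of Fig. 2, and it rests on two assumptions that the text does not state explicitly: a column is scored as conserved only when all of its residues fall within a single group, and the conserved percentage includes the identical columns. The three 5-residue sequences are invented toy data, not part of the alignment.

```python
# Sketch only: scoring rules beyond the group list are assumptions (see lead-in).
GROUPS = [set("AG"), set("DENQ"), set("HKR"), set("MVLI"),
          set("FYW"), set("ST"), set("C"), set("P")]


def classify_column(column: tuple[str, ...]) -> str:
    """Label one alignment column as gap, identical, conserved or variable."""
    residues = set(column)
    if "-" in residues:
        return "gap"
    if len(residues) == 1:
        return "identical"
    if any(residues <= group for group in GROUPS):
        return "conserved"
    return "variable"


def alignment_stats(seqs: list[str]) -> tuple[float, float]:
    """Return (% identical columns, % identical-or-conserved columns)."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("aligned sequences must have equal length")
    labels = [classify_column(col) for col in zip(*seqs)]
    n = len(labels)
    identical = labels.count("identical")
    conserved = identical + labels.count("conserved")
    return 100 * identical / n, 100 * conserved / n


if __name__ == "__main__":
    toy = ["QARTV", "QSRTI", "QARSL"]   # hypothetical aligned fragments
    print(alignment_stats(toy))          # (40.0, 80.0)
```

In the toy alignment, columns 1 and 3 are identical, column 2 (A/S) is variable, and columns 4 (T/S) and 5 (V/I/L) are conserved, which gives 40% identical and 80% conserved positions.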
Essentially all of the important functional domains proposed by Gribskov and Burgess (1986) were included in the C1 and C2 regions. We have previously reported the finding of an identical 13-aa stretch in the junction portion of regions 2.3 and 2.4 in the C2 region and named it 'rpoD box' (Tanaka et al., 1988). In fact, HrdB shares a stretch of 18-aa residues with the other four principal σ factors in the region including the rpoD box stretch. Only four aa residues out of the 20 aa comprising the possible helix-turn-helix motifs (underlined in Fig. 2) in region 4.2 differ between the principal σ factor of B. subtilis and HrdB protein. Moreover, all of the nonidentical aa were conservative substitutions. Since the motif in region 4.2 is responsible for the recognition of the -35 sequence of promoters, it is likely that the principal σ factor of B. subtilis and HrdB recognize promoters having a very similar or identical sequence.
ACKNOWLEDGEMENTS
This work was supported in part by grants for scientific research from the Ministry of Education, Culture, and Science of Japan and from the Institute of Physical and Chemical Research (RIKEN).
REFERENCES

Burton, Z.F., Burgess, R.R., Lin, J., Moore, D., Holder, S. and Gross, C.A.: The nucleotide sequence of the cloned rpoD gene for the RNA polymerase sigma subunit from Escherichia coli. Nucleic Acids Res. 9 (1981) 2889-2903.

Buttner, M.J., Chater, K.F. and Bibb, M.J.: Cloning, disruption, and transcriptional analysis of three RNA polymerase sigma factor genes of Streptomyces coelicolor A3(2). J. Bacteriol. 172 (1990) 3367-3378.

Gitt, M.A., Wang, L.-F. and Doi, R.H.: A strong sequence homology exists between the major RNA polymerase sigma factors of Bacillus subtilis and Escherichia coli. J. Biol. Chem. 260 (1985) 7178-7185.

Gribskov, M. and Burgess, R.R.: Sigma factors from Escherichia coli, Bacillus subtilis, phage SPO1, and phage T4 are homologous proteins. Nucleic Acids Res. 14 (1986) 6745-6763.

Helmann, J.D. and Chamberlin, M.J.: Structure and function of bacterial sigma factors. Annu. Rev. Biochem. 57 (1988) 839-872.

Inouye, S.: Cloning and DNA sequence of the gene coding for the major sigma factor from Myxococcus xanthus. J. Bacteriol. 172 (1990) 80-85.

Nagaso, H., Saito, S., Saito, H. and Takahashi, H.: Nucleotide sequence and expression of a Streptomyces griseosporeus proteinaceous alpha-amylase inhibitor (HaimII) gene. J. Bacteriol. 170 (1988) 4451-4457.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Stragier, P., Parsot, C. and Bouvier, J.: Two functional domains conserved in major and alternate bacterial sigma factors. FEBS Lett. 187 (1985) 11-15.

Takahashi, H., Tanaka, K. and Shiina, T.: Genetic constituent of rpoD gene homologues in Streptomyces strains. In: Okami, Y., Beppu, T. and Ogawara, H. (Eds.), Biology of Actinomycetes '88. Japan Scientific Societies Press, Tokyo, 1988, pp. 58-63.

Tanaka, K. and Takahashi, H.: Cloning and analysis of the gene (rpoDA) for the principal sigma factor of Pseudomonas aeruginosa. Biochim. Biophys. Acta 1089 (1991) 113-119.

Tanaka, K., Shiina, T. and Takahashi, H.: Multiple principal sigma factor homologs in eubacteria: identification of the 'rpoD box'. Science 242 (1988) 1040-1042.

Tanaka, K., Shiina, T. and Takahashi, H.: Nucleotide sequence of genes hrdA, hrdC, and hrdD from Streptomyces coelicolor A3(2) having similarity to rpoD gene. Mol. Gen. Genet. (in press).