Nucleotide sequence of part of the rpoC gene encoding the β′ subunit of DNA-dependent RNA polymerase from some gram-positive bacteria and comparative amino acid sequence analysis


System. Appl. Microbiol. 19, 150-157 (1996) © Gustav Fischer Verlag · Stuttgart · Jena · New York

Nucleotide Sequence of Part of the rpoC Gene Encoding the β′ Subunit of DNA-Dependent RNA Polymerase from some Gram-Positive Bacteria and Comparative Amino Acid Sequence Analysis

ROBERT MORSE, MATTHEW D. COLLINS, JOHN T. BALSDON, SALLY WALLBANKS, and PETER T. RICHARDSON

Department of Microbial Physiology, BBSRC Institute of Food Research, Earley Gate, Whiteknights Road, Reading RG6 6BZ, U.K.

Received November 20, 1995

Summary

The polymerase chain reaction (PCR) was used to generate a fragment of the rpoC gene encoding the β′ subunit of DNA-dependent RNA polymerase from the gram-positive species Listeria innocua, Listeria murrayi, Brochothrix thermosphacta, Bacillus anthracis, Staphylococcus aureus and Pediococcus acidilactici. The portion of the PCR fragments sequenced corresponded to positions 1 to 3717 of the homologous rpoC gene from Escherichia coli. Comparative analysis of the deduced amino acid sequences identified structural motifs similar to those found in the β′ subunit of E. coli that have been shown to be involved in chelating zinc atoms and in DNA binding.

Key words: DNA-dependent RNA polymerase - rpoC gene

Introduction

DNA-dependent RNA polymerase (RNAP) is a multisubunit enzyme consisting of two α subunits (coded by the rpoA gene), one β subunit (rpoB) and one β′ subunit (rpoC). This organisation is found in most Bacteria (Glass and Hayward, 1993), although a different organisation is present in the Eucarya (Archambault and Friesen, 1993) and the Archaea (Klenk and Zillig, 1994). The Eucarya have three RNAPs which, although possessing subunits homologous to β and β′, have numerous smaller subunits with no counterparts in bacterial RNAPs. In the Archaea the equivalent β and sometimes β′ subunits are split (Glass and Hayward, 1993). The key role of RNAP in transcription of DNA indicates that it is of ancient origin in evolutionary terms and, together with the large size of its subunits (α 36.5 kD, β 151 kD and β′ 156 kD in Escherichia coli), indicates that it could be useful as a molecular chronometer in phylogenetic analyses (Palenik, 1992). The rpoB and rpoC genes form a transcriptional unit (Steward and Linn, 1992) that appears to be maintained throughout most bacteria (Tittawella, 1984; Aboshkiwa et al., 1992 and Honore et al., 1993). DNA and amino acid sequences have only been published for a few bacterial species and centre on the rpoB gene (Ovchinnikov et al., 1981; Clark et al., 1992; Aboshkiwa et al., 1992 and Honore et al., 1993), whilst more extensive sequence data are reported for the equivalent genes in archaeal species (Klenk and Zillig, 1994).

The majority of structure-function analyses have been carried out in E. coli, defining regions of the α subunit involved in transcription activation (Kimura et al., 1994) and RNAP assembly (Sharif et al., 1994) and regions of the β subunit involved in DNA binding (Lisitsyn et al., 1988), rifampicin (Rif) resistance (Jin and Gross, 1988), polymerase specificity (Vinella and D'Ari, 1994), the orientation of the Rif binding site to the template DNA (Mustaev et al., 1994) and binding sites for regulatory factors (Severinov et al., 1994). The limited sequence data for the entire rpoC gene include sequences from two gram-negative species (E. coli, Ovchinnikov et al., 1982 and Pseudomonas putida, EMBL Data Library PSERPOCG) and one gram-positive species with a high %G+C content (Mycobacterium leprae, Honore et al., 1993). In this study we have sequenced part of the rpoC genes from a selection of gram-positive bacteria with a low

rpoC Gene Sequences from Grampositive Bacteria

%G+C content in an effort to increase the available sequence database and identify regions of conservation in these and other organisms. Such data will aid future structure-function analyses of the β′ subunit. Comparison of the amino acid sequence of the β′ subunit from E. coli with the equivalent subunits from Saccharomyces cerevisiae has identified eight regions of conservation (Allison et al., 1985). Two structural motifs have been identified in E. coli: a zinc binding region (Wu et al., 1992) and a DNA binding region (Fukuda and Ishihama, 1974).

In addition, there is now a growing interest in the sequence determination of highly conserved proteins (or genes), to complement or check the validity of phylogenetic inferences made from small and large subunit rRNA sequence analyses. Recent studies (Palenik and Haselkorn, 1992; Klenk and Zillig, 1994 and Nolte, 1995) indicate that DNA-dependent RNA polymerase may have considerable potential as a phylogenetic indicator, and the data presented here greatly expand the available sequence data for the Bacteria, which are currently exceedingly limited.

Materials and Methods

Bacterial strains and DNA preparation. The bacterial strains used in this study are listed in Table 1. Bacteria were cultured in appropriate media and at temperatures recommended in the relevant culture collection catalogue. Chromosomal DNA was extracted as described previously (Harland et al., 1993). The authenticity of each strain examined was confirmed by 16S rRNA gene sequence analysis (data not shown).

Polymerase chain reaction (PCR) amplification. To obtain partial copies of the rpoC gene, two overlapping PCR products were generated (Fig. 1) using oligonucleotide primers (Table 2) designed to DNA sequences coding for conserved amino acid sequences in the upstream rpoB genes of Listeria species (this laboratory, unpublished results) and in the rpoC genes of E. coli and Staphylococcus aureus (Aboshkiwa et al., 1992). PCR reactions were performed in 1× PCR buffer (Perkin-Elmer) containing 10 mM deoxynucleotide triphosphates and 4 ng/µl of each oligonucleotide primer in a final volume of 50 µl under a layer of PCR-grade mineral oil (Sigma) on a Biometra PCR thermal cycler. After an initial denaturation step of 94 °C for 5 minutes, AmpliTaq™ polymerase (Perkin-Elmer) was added to a final concentration of 0.02 U/µl and 25 cycles of 92 °C for 1 minute, 48 °C for 1 minute and 65 °C for 3 minutes were performed, ending with a final extension step of 65 °C for 10 minutes.
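The primer strategy described above uses degenerate positions (written N) and primers directed against either strand; Table 2 lists reverse primers with descending position numbers. The following is a minimal Python sketch, assuming only the degenerate code N and two primer sequences copied from Table 2, of how a primer's reverse complement, its number of sequence variants and its footprint on the E. coli numbering could be checked. The helper names are illustrative and are not part of the original work.

```python
# Illustrative helpers; the names are hypothetical and not taken from the paper.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "N": "N"}
DEGENERACY = {"A": 1, "C": 1, "G": 1, "T": 1, "N": 4}  # N stands for any base


def reverse_complement(primer: str) -> str:
    """Return the complementary strand, read 5'-3'."""
    return "".join(COMPLEMENT[base] for base in reversed(primer))


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer:
        count *= DEGENERACY[base]
    return count


def footprint(start: int, end: int) -> int:
    """Number of positions covered, given inclusive start and end positions."""
    return abs(end - start) + 1


# Sequences and positions as listed in Table 2.
primer_584 = "CATAACTTCTACATGTTTATC"  # rpoC, positions 3768-3748 (reverse)
primer_1527 = "GTTACNCAACAACCGCT"     # rpoB, positions 3760-3776 (forward)

print(reverse_complement(primer_584))
print(degeneracy(primer_1527))
print(footprint(3768, 3748), len(primer_584))
```

Comparing the footprint with the primer length is a quick consistency check on position ranges transcribed from a table.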

Table 1. List of bacterial strains used in this study

Bacterial species            Strain number¹
Pediococcus acidilactici     NCFB 2767
Staphylococcus aureus        NCDO 949ᵀ
Bacillus anthracis           NCTC 10340ᵀ
Listeria innocua             NCTC 11288ᵀ
Listeria murrayi             NCTC 10812ᵀ
Brochothrix thermosphacta    NCDO 1676ᵀ

¹ NCDO, National Collection of Dairy Organisms, Reading; NCTC, National Collection of Type Cultures, Colindale, London; NCFB, National Collection of Food Bacteria, Reading
ᵀ Type strain

[Fig. 1 graphic: map of the rpoB and rpoC genes (scale bar 1 kb) showing the primer positions and the PCR fragments generated for L.i., L.m., B.t., B.a., S.a. and P.a.; the layout of primer pairs is not recoverable from the extracted text.]

Fig. 1. Positions of oligonucleotide primers used in PCR to generate a partial fragment of the rpoC gene for DNA sequencing. Solid lines indicate the PCR fragments generated for Listeria innocua (L.i.), Listeria murrayi (L.m.), Brochothrix thermosphacta (B.t.), Bacillus anthracis (B.a.), Staphylococcus aureus (S.a.) and Pediococcus acidilactici (P.a.). The region upstream of primer 583 in L. murrayi was present on a λEMBL301 clone previously isolated (unpublished results, this laboratory).

Cloning, transformation and sequencing. PCR products were purified from primers, nucleotides and enzyme using a QIAquick PCR purification kit (Qiagen), ligated into pCR™II vector (TA cloning kit, Invitrogen) and transformed into E. coli INVαF′ cells according to the manufacturer's instructions. DNA sequencing was performed by the dideoxynucleotide chain termination method (Sanger et al., 1977) on both positive and negative strands of each cloned PCR product. Two independent PCR products were sequenced for each rpoC gene to check AmpliTaq™ polymerase fidelity, with a third PCR product being generated to resolve any nucleotide base discrepancies and identify the correct sequence. Sequences were analysed using the Wisconsin Molecular Biology software package (Devereux et al., 1984).

Nucleotide sequence accession numbers. The EMBL accession numbers for the nucleotide sequences reported here are X89228, Listeria murrayi; X89229, Listeria innocua; X89230, Bacillus anthracis; X89231, Brochothrix thermosphacta; X89232, Pediococcus acidilactici and X89233, Staphylococcus aureus.
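The fidelity check described above (two independent PCR products, with a third sequenced to resolve any discrepancy) amounts to a per-position majority vote. The following is a minimal sketch under stated assumptions: the reads are already aligned and of equal length, and the short sequences are invented for illustration and are not data from this study.

```python
from collections import Counter


def find_discrepancies(read_a: str, read_b: str) -> list[int]:
    """Return 0-based positions at which two aligned reads disagree."""
    if len(read_a) != len(read_b):
        raise ValueError("reads must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(read_a, read_b)) if a != b]


def majority_consensus(reads: list[str]) -> str:
    """Call each position by majority vote; 'N' marks an unresolved tie."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("reads must be aligned to the same length")
    consensus = []
    for column in zip(*reads):
        (top, top_count), *rest = Counter(column).most_common()
        tied = bool(rest) and rest[0][1] == top_count
        consensus.append("N" if tied else top)
    return "".join(consensus)


# Hypothetical aligned reads from three independent PCR products.
product_1 = "TTGAAAGATCTT"
product_2 = "TTGAAGGATCTT"  # disagrees with product_1 at one position
product_3 = "TTGAAAGATCTT"  # third product acts as the tie-breaker

mismatches = find_discrepancies(product_1, product_2)
resolved = majority_consensus([product_1, product_2, product_3])
print(mismatches, resolved)
```

With only two reads a disagreement cannot be resolved, which is why the sketch returns N at such positions until a third read is supplied.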

Results

The PCR amplification strategy used to generate the two fragments of the rpoC gene for sequence analysis is shown in Figure 1. The 5' end of the gene was obtained by using primers complementary to conserved regions in the upstream rpoB gene (Table 2). Figure 2 shows a multiple alignment of the amino acid sequences deduced from the partial rpoC gene sequences of six species of gram-positive bacteria with a low %G+C content together with the sequence for the equivalent E. coli subunit (Ovchinnikov et al., 1982). Partial rpoC sequences were generated, beginning at the translational start codon and ending at a position relative to amino acid 1249 in the E. coli subunit. The generated PCR fragments represent approximately 90% of the complete β′ subunit from E. coli (assuming there is no substantial insertion at the C-terminal end of the rpoC gene). Due to low amino acid sequence homology it was not possible to design primers to facilitate the sequence

Table 2. List of oligonucleotides used to generate PCR products. All sequences and positions are given 5'-3' and position numbers refer to the E. coli rpoB and rpoC genes

Primer  Sequence                   Alternative bases*  Gene  Position
583     TGCTCATGCGGAAAATACAA       T T T T G T         rpoC  208-227
584     CATAACTTCTACATGTTTATC                          rpoC  3768-3748
718     GTGAACGTATGGGNCATATCGA     C GA A C T          rpoC  296-317
719     TTAACNCCTTGCATACGGTATAC    CCT C G A           rpoC  3740-3718
815     GTTCATCTTTAATGATGGAC                           rpoC  1091-1073
996     TTGAAGTTTCAGTAATGTGG                           rpoC  1855-1836
1501    TATCAAGTGGTGTATCTC                             rpoC  466-449
1527    GTTACNCAACAACCGCT                              rpoB  3760-3776
1529    AGCATACGGCGCNGCNTA         T T                 rpoB  3837-3854
1540    AAGTCTGATGACGTGGTTG        C T T T             rpoB  3880-3898
1553    CNACATANGATGCAAAAT         G AT G G            rpoC  426-419
1554    TNAAAAACCAAATGTGNG         GGT G A             rpoC  350-335
10392   GCAGCTCCTGTTTCTCACATCTGGT  T C A G C G G G T   rpoC  322-346

* Alternative bases were printed beneath the degenerate positions of the primer; their column alignment is not recoverable from the extracted text.

determination of the C-terminal end of the β′ subunit. Attempts to generate a PCR product for this region by either inverse PCR (Ochman et al., 1990) or by using primers in the downstream gene for ribosomal protein S12 (rpsL, Kimura and Kimura, 1987) also failed.

In common with M. leprae (Honore et al., 1993) and Bacillus subtilis (EMBL Data Library BSRPLL), each of the rpoC genes of the six gram-positive species examined initiates with a TTG codon. In relation to the E. coli β′ subunit amino acid positions, all the gram-positive β′ subunits have an insertion after amino acid 589 (31 amino acids, 35 amino acids in S. aureus) and deletions between amino acids 11 and 20 inclusive (10 amino acids), 703 and 717 inclusive (15 amino acids) and amino acids 942 and 1129 inclusive (188 amino acids).

It has been shown that the β′ subunit of E. coli binds a single atom of zinc (Wu et al., 1992). The conserved sequence C-X11-C-X1-C-X12-C-X2-C found between amino acids 58 and 88 (Fig. 3) represents a 'zinc-finger' motif (Latchman, 1995), which is the likely site of binding for the zinc atom. The highly conserved region between positions 358 and 379 defines a helix-turn-helix motif (Fig. 4) homologous to that found in DNA polymerase I of E. coli (Ollis et al., 1985) and the largest subunits (RPO21 and RPO31) of yeast RNA polymerases II and III (Allison et al., 1985). This region in DNA polymerase I is thought to bind to the major groove of duplex DNA, and the presence of its homologue in the β′ subunit would be consistent with the DNA-binding activity of this subunit (Fukuda and Ishihama, 1974). The

corresponding region of the RNA polymerase II β′ subunit from Drosophila melanogaster has been shown to bind plasmid DNA nonspecifically in Southwestern assays (Kontermann et al., 1993). Also present are two identical regions in the bacterial β′ subunits, the yeast RPO21 and RPO31 subunits (Allison et al., 1985), and the A′ and A″ subunits of the archaeal species Halobacterium halobium (Zillig et al., 1989), between amino acids 457 and 464 (YNADFDGD) and amino acids 920 and 926 (AQSIGEP). These sequences have not yet been associated with any protein structural motifs.

Discussion

Investigation of the structure-function relationships of the β′ subunit of E. coli has used both a genetic approach (isolation and characterization of mutations that disrupt transcription) and chemical cross-linking techniques (using photo-cross-linkable nucleotide analogues to determine contact points for DNA/RNA). One such study, which isolated mutations in the β′ subunit of E. coli that affect transcript elongation and termination, identified 66 amino acid residues at which substitutions occurred, 44 of which occur before amino acid 1249, a region of the β′ subunit of E. coli homologous to the regions sequenced from gram-positive bacteria (Weilbaecher et al., 1994). Of these 44 amino acid residues at which mutations occur in E. coli, 28 are conserved and 7 are different in the β′ subunits

[Fig. 2 graphic: multiple alignment of the deduced β′ subunit sequences of E.c., B.a., B.t., L.i., L.m., S.a. and P.a., numbered according to the E. coli sequence. The residue-level alignment is not recoverable from the extracted text, apart from three E. coli rows in the region that has no counterpart in the gram-positive sequences:
985   SRAAAESSIQVKNKGSIKLSNVKSVVNSSGKLVITSRNTELKLI
1045  DEFGRTKESYKVPYGAVLAKGDGEQVAGGETVANWDPHTMPVITEVSGFVRFTDMIDGQT
1105  ITRQTDELTGLSSLVVLDSAERTAGGKDLRPALKIVDAQGNDVLIPGTDMPAQYFLPGKA]

Fig. 2. Amino acid sequences encoded by the partial rpoC gene sequences of Bacillus anthracis (B.a.), Brochothrix thermosphacta (B.t.), Listeria innocua (L.i.), Listeria murrayi (L.m.), Staphylococcus aureus (S.a.) and Pediococcus acidilactici (P.a.). The sequences are aligned with the equivalent region from Escherichia coli (E.c.) to which the numbering system relates. Four or more identical residues are shown in black boxes and conservative changes in shaded boxes.

of the gram-positive bacteria determined in this study. Of the 7 amino acid residues that are not conserved, 4 cluster between amino acid residues 612 and 666 (referred to as cluster 3a by Weilbaecher et al., 1994). The remaining 9 amino acid residues at which substitutions occurred in the β′ subunit of E. coli are present in a region (residues 942 to 1129) that has no equivalent in the β′ subunits of the gram-positive bacteria. The 3' end of the nascent RNA


      58                           88
E.c.  CARIFGPVKDYECLCGKYKRLKHRGVICEKC
L.i.  CERIFGPMKDWECSCGKYKRVRYKGVVCDRC
L.m.  CERIFGPMKDWECSCGKYKRVRYKGVVCDRC
B.t.  CERIFGPTKDWECSCGKYKRVRYKGVVCDRC
B.a.  CERIFGPQKDWECHCGKYKRVRYKGVVCDRC
S.a.  CERIFGPTKDWECSCGKYKRVRYKGMVCDRC
P.a.  CERIFGPTKDYECACGKYKRIRYKGIVCDRC
M.l.  CEKIFGPTRDWECYCGKYKRVRFKGIICERC

Fig. 3. Amino acid sequences of the putative zinc-finger domain of β′ subunits. The cysteine residues that constitute the motif are highlighted. The species identifications are as for Fig. 2, with the addition of the gram-positive species Mycobacterium leprae (M.l.), which has a high %G+C content.
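The spacing of the cysteine residues in these sequences can be checked mechanically. The sketch below, assuming the spacing C-X11-C-X1-C-X12-C-X2-C as read off the aligned sequences of Fig. 3, encodes the motif as a regular expression and reports the cysteine positions in E. coli numbering. The dictionary and function names are illustrative.

```python
import re

# Cysteine spacing C-X11-C-X1-C-X12-C-X2-C written as a regular expression.
ZINC_FINGER = re.compile(r"C.{11}C.C.{12}C.{2}C")

# Residues 58-88 (E. coli numbering) as listed in Fig. 3.
FIG3_SEQUENCES = {
    "E.c.": "CARIFGPVKDYECLCGKYKRLKHRGVICEKC",
    "L.i.": "CERIFGPMKDWECSCGKYKRVRYKGVVCDRC",
    "L.m.": "CERIFGPMKDWECSCGKYKRVRYKGVVCDRC",
    "B.t.": "CERIFGPTKDWECSCGKYKRVRYKGVVCDRC",
    "B.a.": "CERIFGPQKDWECHCGKYKRVRYKGVVCDRC",
    "S.a.": "CERIFGPTKDWECSCGKYKRVRYKGMVCDRC",
    "P.a.": "CERIFGPTKDYECACGKYKRIRYKGIVCDRC",
    "M.l.": "CEKIFGPTRDWECYCGKYKRVRFKGIICERC",
}


def has_motif(sequence: str) -> bool:
    """True when the whole segment matches the cysteine spacing."""
    return ZINC_FINGER.fullmatch(sequence) is not None


def cysteine_positions(sequence: str, first_residue: int = 58) -> list[int]:
    """Positions of cysteines, numbered from the first residue of the segment."""
    return [first_residue + i for i, aa in enumerate(sequence) if aa == "C"]


for species, sequence in FIG3_SEQUENCES.items():
    print(species, has_motif(sequence), cysteine_positions(sequence))
```

Using `fullmatch` rather than `search` ensures that the segment is exactly the 31 residues spanned by positions 58 to 88.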

chain has been shown to contact the β′ subunit of E. coli between amino acids 932 and 1020 (Borukhov et al., 1991). Most of this sequence is not present in the β′ subunits of the gram-positive bacteria sequenced in this study, apart from amino acids 932 to 943, which show a conserved sequence MRTFHXGG, indicating that this region may be the contact point for the nascent RNA chain. It is pertinent to note that four point mutations affecting transcription termination in E. coli have been mapped to this conserved sequence (Weilbaecher et al., 1994). A point mutation at amino acid 1033 in E. coli, leading to an increase in the production of the transcription regulator DnaA and hence chromosome copy number, occurs in the sequence EXXGXVXF (amino acids 1030 to 1037 in E. coli), which is conserved in gram-negative bacteria but not in the subunits sequenced in this study from gram-positive bacteria (Petersen and Hansen, 1991). Interestingly, a similar sequence, EXXGXVII, is present downstream at amino acid residues 1158-1165 in the β′ subunits from both the gram-negative and gram-positive bacteria. Site-directed mutagenesis to isolate mutations with a similar phenotype would be needed to establish whether this region is of functional equivalence.

Comparative 16S rRNA sequence analysis over the past decade has revolutionised the field of bacterial systematics (Wheelis et al., 1992; Woese, 1987 and 1994). To date, however, information on the evolutionary interrelationships of bacteria is based almost entirely on this single chronometer. Consequently, there is now a growing interest in utilising other highly conserved molecules or genes to assess congruence between inferred phylogenies. The β′ subunit of RNA polymerase has many of the qualities essential for a good chronometer, such as its universality, high information content and relatively high sequence conservation. Although the relatively small number of genes sequenced to date precludes a meaningful comparison of the β′ subunit and 16S rRNA as molecular chronometers, the available data indicate that the patterns of relationships inferred from both molecules are in good accord. For example, L. innocua and L. murrayi, which are genealogically close (exhibiting 96.4% 16S rRNA sequence similarity; Collins et al., 1991), were also found to possess highly related β′ subunit sequences (96.1% similarity, 92.7% identity). Of the strains examined here, according to 16S rRNA sequence data (Collins et al., 1991) the closest relative of Listeria is Brochothrix thermosphacta, which is confirmed by the β′ subunit sequences. It is our next objective to extend considerably the number of available bacterial β′ subunit sequences, so as to make an in-depth phylogenetic comparison with 16S rRNA.

Acknowledgements. This work was supported by a grant from the European Community (contract: BIO2-CT94-3098).

[Fig. 4 graphic: alignment of the helix-turn-helix region for E.c., L.i., L.m., B.t., B.a., S.a., P.a., M.l., D.m., S.c.21, S.c.31 and E.c.I, with the helix positions marked; the residue-level alignment is not reliably recoverable from the extracted text.]

Fig. 4. Amino acid alignments of regions of the β′ subunits of bacterial species (abbreviations as previous) against the nucleic acid-binding region of the largest subunit of Drosophila melanogaster RNA polymerase II (D.m.) and the helix-turn-helix regions of the largest subunits of Saccharomyces cerevisiae RNA polymerase II (S.c.21) and RNA polymerase III (S.c.31), and Escherichia coli DNA polymerase I (E.c.I).
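Pairwise comparisons over an alignment, such as the similarity and identity percentages quoted in the Discussion, reduce to counting matching columns. The following is a minimal sketch, assuming a gap-free alignment and using the E. coli and L. innocua zinc-finger segments from Fig. 3 as input. The groups of conservative substitutions are an illustrative assumption and are not the scoring scheme of the Wisconsin package used in the paper, so the values it gives are not comparable with the published figures.

```python
# Illustrative groups of chemically similar residues; an assumption for this
# sketch, not the scoring scheme used in the paper.
SIMILAR_GROUPS = ("ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG")


def is_conservative(a: str, b: str) -> bool:
    """True when two residues fall in the same similarity group."""
    return any(a in group and b in group for group in SIMILAR_GROUPS)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned columns holding identical residues."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def percent_similarity(seq_a: str, seq_b: str) -> float:
    """Percentage of columns that are identical or conservative changes."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    similar = sum(a == b or is_conservative(a, b) for a, b in zip(seq_a, seq_b))
    return 100.0 * similar / len(seq_a)


# Residues 58-88 (E. coli numbering) as listed in Fig. 3.
E_COLI = "CARIFGPVKDYECLCGKYKRLKHRGVICEKC"
L_INNOCUA = "CERIFGPMKDWECSCGKYKRVRYKGVVCDRC"

print(round(percent_identity(E_COLI, L_INNOCUA), 1))
print(round(percent_similarity(E_COLI, L_INNOCUA), 1))
```

Similarity is always at least as high as identity, because every identical column also counts as similar.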


References

Aboshkiwa, M., Al-Ani, B., Coleman, G., Rowland, G.: Cloning and physical mapping of the Staphylococcus aureus rplL, rpoB and rpoC genes, encoding ribosomal protein L7/L12 and RNA polymerase subunits β and β′. J. Gen. Microbiol. 138, 1875-1880 (1992)

Allison, L. A., Moyle, M., Shales, M., Ingles, C. J.: Extensive homology among the largest subunits of eukaryotic and prokaryotic RNA polymerases. Cell 42, 599-610 (1985)

Archambault, J., Friesen, J. D.: Genetics of eukaryotic RNA polymerases I, II, and III. Microbiol. Rev. 57, 703-724 (1993)

Borukhov, S., Lee, J., Goldfarb, A.: Mapping of a contact for the RNA 3' terminus in the largest subunit of RNA polymerase. J. Biol. Chem. 266, 23932-23935 (1991)

Clark, M. A., Baumann, L., Baumann, P.: Sequence analysis of an aphid endosymbiont DNA fragment containing rpoB (β-subunit of RNA polymerase) and portions of rplL and rpoC. Curr. Microbiol. 25, 283-290 (1992)

Collins, M. D., Wallbanks, S., Lane, D. J., Shah, J., Nietupski, R., Smida, J., Dorsch, M., Stackebrandt, E.: Phylogenetic analysis of the genus Listeria based on reverse-transcriptase sequencing of 16S rRNA. Int. J. Syst. Bacteriol. 41, 240-246 (1991)

Devereux, J., Haeberli, P., Smithies, O.: A comprehensive set of sequence analysis programs for the Vax. Nucl. Acids. Res. 12, 387-395 (1984)

Fukuda, R., Ishihama, A.: Subunits of RNA polymerase in function and structure. J. Mol. Biol. 87, 523-540 (1974)

Glass, R. E., Hayward, R. S.: Bacterial RNA polymerases: structural and functional relationships. World J. Microbiol. Biotechnol. 9, 403-413 (1993)

Harland, N. M., Leigh, J. A., Collins, M. D.: Development of gene probes for the specific identification of Streptococcus uberis and Streptococcus parauberis based upon large subunit rRNA gene sequences. J. Appl. Bacteriol. 74, 526-531 (1993)

Honore, N., Bergh, S., Chanteau, S., Doucet-Populaire, F., Eiglmeier, K., Garnier, T., Georges, C., Launois, P., Limpaiboon, T., Newton, S., Niang, K., del Portillo, P., Ramesh, G. R., Reddi, P., Ridel, P. R., Sittisombut, N., Wu-Hunter, S., Cole, S. T.: Nucleotide sequence of the first cosmid from the Mycobacterium leprae genome project: structure and function of the Rif-Str regions. Mol. Microbiol. 7, 207-214 (1993)

Jin, D. J., Gross, C. A.: Mapping and sequencing of mutations in the Escherichia coli rpoB gene that lead to rifampicin resistance. J. Mol. Biol. 202, 45-58 (1988)

Kimura, M., Fujita, N., Ishihama, A.: Functional map of the alpha subunit of Escherichia coli RNA polymerase - deletion analysis of the amino-terminal assembly domain. J. Mol. Biol. 242, 107-115 (1994)

Kimura, M., Kimura, J.: The complete amino acid sequence of ribosomal protein S12 from Bacillus stearothermophilus. FEBS. Letts. 210, 91-96 (1987)

Klenk, H.-P., Zillig, W.: DNA-dependent RNA polymerase subunit β as a tool for phylogenetic reconstructions: branching topology of the archaeal domain. J. Mol. Evol. 38, 420-432 (1994)

Kontermann, R. E., Kobor, M., Bautz, E. K. F.: Identification of a nucleic acid-binding region within the largest subunit of Drosophila melanogaster RNA polymerase II. Prot. Sci. 2, 223-230 (1993)

Latchman, D. S.: Gene regulation: a eukaryotic perspective, pp. 190-197. 2nd ed., London: Chapman Hall (1995)

Lisitsyn, N. A., Monastyrskaya, G. S., Sverdlov, E. D.: Genes coding for RNA polymerase β subunit in bacteria. Structure/function analysis. Eur. J. Biochem. 177, 363-369 (1988)

Mustaev, A., Zaychikov, E., Severinov, K., Kashlev, M., Polyakov, A., Nikiforov, V., Goldfarb, A.: Topology of the RNA polymerase active center probed by chimeric rifampicin-nucleotide compounds. Proc. Natl. Acad. Sci. USA 91, 12036-12040 (1994)

Nolte, O.: Nucleotide sequence and genetic variability of a part of the rpoB gene encoding the second largest subunit of DNA-directed RNA polymerase of Neisseria meningitidis. Med. Microbiol. Lett. 4, 59-67 (1995)

Ochman, H., Medhora, M. N., Garza, D., Hartl, D. L.: Amplification of flanking sequences by inverse PCR. In PCR protocols; a guide to methods and applications, pp. 219-227. Edited by M. A. Innis, D. A. Gelfand, J. J. Sninsky, T. J. White. San Diego: Academic Press (1990)

Ollis, D. L., Brick, P., Hamlin, R., Xuong, N. G., Steitz, T. A.: Structure of the large fragment of Escherichia coli DNA polymerase I complexed with dTMP. Nature 313, 762-766 (1985)

Ovchinnikov, Y. A., Monastyrskaya, G. S., Gubanov, V. V., Guryev, S. O., Chertov, O. Y., Modyanov, N. N., Grinkevich, V. A., Makarova, I. A., Marchenko, T. V., Polovnikova, I. N., Lipkin, V. M., Sverdlov, E. D.: The primary structure of Escherichia coli RNA polymerase: nucleotide sequence of the rpoB gene and amino acid sequence of the β-subunit. Eur. J. Biochem. 116, 621-629 (1981)

Ovchinnikov, Y. A., Monastyrskaya, G. S., Gubanov, V. V., Guryev, S. O., Salomatina, I. S., Shuvaeva, T. M., Lipkin, V. M., Sverdlov, E. D.: The primary structure of E. coli RNA polymerase. Nucleotide sequence of the rpoC gene and amino acid sequence of the β′-subunit. Nucl. Acids. Res. 10, 4035-4044 (1982)

Palenik, B.: Polymerase evolution and organism evolution. Curr. Opin. Genet. Dev. 2, 931-936 (1992)

Palenik, B., Haselkorn, R.: Multiple evolutionary origins of prochlorophytes, the chlorophyll b-containing prokaryotes. Nature 355, 265-267 (1992)

Petersen, S. K., Hansen, F. G.: A missense mutation in the rpoC gene affects chromosomal replication control in Escherichia coli. J. Bacteriol. 173, 5200-5206 (1991)

Sanger, F., Nicklen, S., Coulson, A. R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74, 5463-5467 (1977)

Severinov, K., Kashlev, M., Severinova, E., Bass, I., McWilliams, K., Kutter, E., Nikiforov, V., Snyder, L., Goldfarb, A.: A nonessential domain of Escherichia coli RNA polymerase required for the action of the termination factor Alc. J. Biol. Chem. 269, 14254-14259 (1994)

Sharif, K. A., Fujita, N., Jin, R., Igarashi, K., Ishihama, A., Krakow, J. S.: Epitope mapping and functional characterization of monoclonal antibodies specific for the α subunit of Escherichia coli RNA polymerase. J. Biol. Chem. 269, 23655-23660 (1994)

Steward, K. L., Linn, T.: Transcription frequency modulates the efficiency of an attenuator preceding the rpoBC RNA polymerase genes of Escherichia coli: possible autogenous control. Nucl. Acids. Res. 20, 4773-4779 (1992)

Tittawella, I. P. B.: Evidence for clustering of RNA polymerase and ribosomal protein genes in six species of Enterobacteria. Mol. Gen. Genet. 195, 215-218 (1984)

Vinella, D., D'Ari, R.: Thermoinducible filamentation in Escherichia coli due to an altered RNA polymerase β subunit is suppressed by high levels of ppGpp. J. Bacteriol. 176, 966-972 (1994)

Weilbaecher, R., Hebron, C., Feng, G., Landick, R.: Termination-altering amino acid substitutions in the β′ subunit of Escherichia coli RNA polymerase identify regions involved in RNA chain elongation. Genes and Development 8, 2913-2927 (1994)

Wheelis, M. L., Kandler, O., Woese, C. R.: On the nature of global classification. Proc. Natl. Acad. Sci. USA 89, 2930-2934 (1992)

Woese, C. R.: Bacterial evolution. Microbiol. Rev. 51, 221-271 (1987)

Woese, C. R.: There must be a prokaryote somewhere: microbiology's search for itself. Microbiol. Rev. 58, 1-9 (1994)

Wu, F. Y.-H., Huang, W. J., Sinclair, R. B., Powers, L.: The structure of the zinc sites of Escherichia coli DNA-dependent RNA polymerase. J. Biol. Chem. 267, 25560-25567 (1992)

Zillig, W., Klenk, H.-P., Palm, P., Puhler, G., Gropp, F., Garrett, R. A., Leffers, H.: The phylogenetic relations of DNA-dependent RNA polymerases of archaebacteria, eukaryotes, and eubacteria. Can. J. Microbiol. 35, 73-80 (1989)

Peter T. Richardson, Department of Microbial Physiology, BBSRC Institute of Food Research, Earley Gate, Whiteknights Road, Reading RG6 6BZ, U.K.