Gene. 153 (1995) 17-23 © 1995 Elsevier Science B.V. All rights reserved. 0378-1119/95/$09.50
GENE 08450
Sequence, expression and transcriptional analysis of the coronafacate ligase-encoding gene required for coronatine biosynthesis by Pseudomonas syringae (Phytotoxin; amide linkage; intermolecular ligation; polyketide; cfl; ethylcyclopropyl amino acid; IS50)
H. Liyanage a, C. Penfold b, J. Turner b and C.L. Bender a
a Department of Plant Pathology, Oklahoma State University, Stillwater, OK 74078-9947, USA; and b Molecular Biology and Microbiology Sector, School of Biological Sciences, University of East Anglia, Norwich NR4 7TJ, UK. Tel. (44-603) 592-192; Fax (44-603) 259-492
Received by A.M. Chakrabarty: 14 July 1994; Revised/Accepted: 19 August/26 August 1994; Received at publishers: 19 September 1994
SUMMARY
Pseudomonas syringae pv. glycinea PG4180 produces the chlorosis-inducing phytotoxin coronatine (COR), which consists of a polyketide component, coronafacic acid (CFA), ligated by an amide bond to coronamic acid (CMA), an ethylcyclopropyl amino acid derived from isoleucine. We report the nucleotide sequence of a 2.37-kb region containing the coronafacate ligase-encoding gene (cfl) which is required for the amide linkage of CFA and CMA. The transcription start point for cfl was identified, and the Cfl protein was overproduced from the T7lac promoter in Escherichia coli. The deduced amino-acid sequence of Cfl showed homology to a variety of adenylate-forming enzymes which bind and hydrolyze ATP in order to activate their substrates for further ligation.
INTRODUCTION
Coronatine (COR) is a novel phytotoxin produced by several pathovars (pvs.) of Pseudomonas syringae and is a pathogenicity or virulence factor in plant diseases caused by these organisms. The two major components of COR, coronafacic acid (CFA) and coronamic acid (CMA), have distinctly different biosynthetic origins. CFA has a bicyclic structure and is synthesized as a branched polyketide through an undetermined sequence of events (Parry et al., 1994). CMA, an ethylcyclopropyl amino acid, is derived from the isoleucine biosynthetic pathway and cyclized via an unknown mechanism (Mitchell, 1985; Parry et al., 1994). Both CFA and CMA function as defined intermediates in the COR pathway and can be used in substrate feeding studies to relieve the biosynthetic block in COR- mutants (Bender et al., 1993). Tn5 mutagenesis, substrate feeding studies and complementation analyses were previously used to localize the COR biosynthetic cluster within a 30-kb region of a 90-kb plasmid designated p4180A (Bender et al., 1993).

CFA and CMA are ligated or 'coupled' via an amide linkage to form the end-product, COR. Tn5 insertions within a 1-kb region of p4180A eliminated the ligation of CFA and CMA, and complementation analyses indicated that a 2.37-kb region was required to restore this activity (Bender et al., 1993). The 2.37-kb region also conferred ligation activity to P. syringae strains which lack the COR biosynthetic cluster; in these experiments, P. syringae transconjugants containing the 2.37-kb region produced COR when supplied with both CFA and CMA (Bender et al., 1993). The aims of the present study were to sequence this 2.37-kb region and locate cfl and its transcription start point (tsp).

Correspondence to: Dr. C.L. Bender, 110 Noble Research Center, Oklahoma State University, Stillwater, OK 74078-9947, USA. Tel. (1-405) 744-9945; Fax (1-405) 744-7373; e-mail: [email protected] te.edu

Abbreviations: aa, amino acid(s); Ap, ampicillin; bp, base pair(s); CFA, coronafacic acid; Cfl, coronafacate ligase; cfl, gene encoding Cfl; CMA, coronamic acid; COR, coronatine; DTT, dithiothreitol; GCG, Genetics Computer Group (Madison, WI, USA); GUS, β-glucuronidase activity; IPTG, isopropyl-β-D-thiogalactopyranoside; IS, insertion sequence (Fiandt et al., 1972); kb, kilobase(s) or 1000 bp; Km, kanamycin; KMB, King's medium B; LB, Luria-Bertani (medium); MG, mannitol-glutamate (medium); MGY, MG containing 0.25 g/l yeast extract; nt, nucleotide(s); oligo, oligodeoxyribonucleotide; ORF, open reading frame; P., Pseudomonas; PAGE, polyacrylamide-gel electrophoresis; pv., pathovar; R, resistance/resistant; SD, Shine-Dalgarno (sequence); SDS, sodium dodecyl sulfate; Sm, streptomycin; Sp, spectinomycin; Tc, tetracycline; TE, 10 mM Tris·HCl/1 mM EDTA pH 8.0; Tn, transposon; tsp, transcription start point(s); u, unit(s); UAS, upstream activation sequence(s); uidA, gene encoding GUS; wt, wild type; Xaa, any aa; XGluc, 5-bromo-4-chloro-3-indolyl glucuronide; +, compound or activity is at wt level; +/-, compound or activity is at levels significantly lower than observed in wt; -, nonproducer of designated compound/activity; [], denotes plasmid-carrier state; ::, novel junction (fusion or insertion).

SSDI 0378-1119(94)00661-X
RESULTS AND DISCUSSION
(a) Nucleotide sequence of the 2.37-kb region
Successful cloning of the DNA responsible for ligating CFA and CMA was reported previously (Bender et al., 1993). The nt sequence of this region contained two ORFs: the 336-nt ORF1 extended from nt 304 to 639 (Fig. 1) and the 1464-nt ORF2 encompassed nt 772 to 2235 (Fig. 1). The G+C content at codon positions 1, 2 and 3 was 60, 51 and 54% for ORF1, and 66, 48 and 74% for ORF2. When the nt sequences of the two ORFs were analyzed with a Pseudomonas codon usage table (J.M. Cherry, personal communication), ORF1 used a high number of rare codons and deviated significantly from characterized Pseudomonas genes. The codons utilized in ORF2, however, were typical of other sequenced Pseudomonas genes. Putative SD sequences (Fig. 1) were located 16 bp upstream from the ORF1 translational start site (nt 288-291) and 12 bp upstream from the ORF2 translational start site (nt 760-762).

Four Tn5 insertions designated G70, G73, G109 and G139 were previously shown to abolish ligation activity and were located within a 1-kb region of the COR biosynthetic gene cluster (Bender et al., 1993). In the present study, these four mutations were precisely mapped to a 900-bp region within ORF2 (Fig. 1). The nearest Tn5 insertion upstream from ORF2, G82, did not abolish ligation activity and was located within ORF1. The nearest Tn5 insertion downstream from ORF2 was located 1 kb distal to G73 and also had no effect on ligation activity (Bender et al., 1993).

Our results indicate that ORF2 is cfl; all mutants defective in the ligation of CFA and CMA map to ORF2, and all mutants mapping outside of ORF2 exhibited some level of ligation when supplied with the necessary precursor(s) (Bender et al., 1993). Unlike ORF1, ORF2 has a codon usage pattern typical of Pseudomonas genes, and its putative SD sequence is located within an acceptable distance from the translational start site. Furthermore, no significant similarity was found between the deduced aa sequence of ORF1 and entries in the Swiss-PROT, PIR and GenPept databases. However, the deduced aa sequence of ORF2 showed significant homology to a number of different enzymes which catalyze ligation reactions between molecules (see section d). These similarities are notable because they are the kind of homologies we would predict for the enzyme which ligates CFA and CMA. Therefore, based on the mutational studies and functional homologies derived from the predicted aa sequence, we will refer to ORF2 as cfl.
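The G+C content at codon positions 1, 2 and 3 quoted in section (a) is a simple positional count over an in-frame ORF. The following Python sketch shows one way such values could be computed; it is an illustration under stated assumptions, not the procedure used in this study (which relied on a Pseudomonas codon usage table). The function name and the toy sequence are hypothetical.

```python
def gc_by_codon_position(orf: str) -> tuple:
    """Return percent G+C at codon positions 1, 2 and 3 of an in-frame ORF."""
    seq = orf.upper()
    if len(seq) == 0 or len(seq) % 3 != 0:
        raise ValueError("ORF length must be a non-zero multiple of 3")
    percents = []
    for pos in range(3):
        # Every third base starting at offset pos belongs to codon position pos + 1.
        bases = seq[pos::3]
        gc = sum(base in "GC" for base in bases)
        percents.append(100.0 * gc / len(bases))
    return tuple(percents)


# Hypothetical 4-codon toy ORF (ATG GCC GGC TGA); not taken from cfl.
toy_orf = "ATGGCCGGCTGA"
print(gc_by_codon_position(toy_orf))
```

Applied to the deposited sequence (GenBank/EMBL accession No. U09027), the same function would be called on nt 304-639 for ORF1 and nt 772-2235 for ORF2.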
(b) Mapping the cfl tsp
Our data suggested that ORF1 might not be translated, since the derived aa sequence indicated the usage of many rare codons and the SD was located too far from the translational start site. Interestingly, nt sequence analysis of ORF1 revealed the presence of an upstream activation sequence (UAS), TGTN10ACA, which is required for nifA-mediated transcription of selected promoters in Klebsiella pneumoniae (Buck et al., 1986). This finding suggested that ORF1 might contain sequence information necessary for the transcriptional activation of cfl. To further investigate this hypothesis, we determined whether constructs containing the 5' end of cfl and upstream DNA could activate transcription of the promoterless glucuronidase-encoding gene (uidA) in pRG960sd. Two constructs were investigated for their ability to promote transcription of uidA; pHLP1 and pHLP2 contained the 0.46-kb PstI-XbaI fragment in opposing orientations relative to uidA (Table I). The PstI site was located 138 nt downstream from the cfl tsp and the XbaI site was located 311 bp upstream from the cfl tsp and in the middle of ORF1 (Fig. 1). When P. syringae pv. glycinea PG4180 transconjugants containing pHLP1, pHLP2 and pRG960sd were grown on MG agar containing 20 µg XGluc/ml, a blue color was produced only by transconjugants containing pHLP1. When GUS activity was determined fluorimetrically, PG4180[pHLP1] cells produced 5.8 u GUS/mg of protein, a level significantly higher than PG4180[pHLP2] and PG4180[pRG960sd], which produced 0.3 and 0.4 u, respectively (1 u GUS = 1 µmol methylumbelliferone formed/min). These results confirmed the direction of transcription for cfl and suggested that DNA upstream from the cfl tsp was involved in promoting transcription. On the basis of the direction of transcription and the nt sequence, a 20-bp oligo primer was synthesized to
[Fig. 1, sequence panel: nt 1-2372 of the 2.37-kb region with the deduced Cfl aa sequence shown above ORF2; marked features include the Tn5 insertions G82, G109, G139, G70 and G73 and the XbaI, PstI and SstI restriction sites.]
Fig. 1. The nt sequence of the 2.37-kb region containing cfl. The reported nt sequence was deposited with GenBank/EMBL, accession No. U09027. The ORF1 nt sequence is underlined. The deduced aa sequence of cfl (ORF2) is shown in single-letter code above the nt sequence. Exact sites of restriction enzymes are indicated below the nt sequence. Putative SD sequences are double-underlined. The exact location of five Tn5 insertions in the 2.37-kb region was determined by cloning a portion of IS50 and flanking Pseudomonas DNA as a HindIII fragment in pRK415. An oligo primer (5'-GGTTCCGTTCAGGACGCTAC) complementary to a 20-bp region near the IS50 ends of Tn5 was synthesized and used to derive the sequence of the region flanking the Tn5 insertion. An inverted triangle indicates the tsp as determined by primer extension. The nt showing homology to the nifH promoter (CTGCA) and the UAS which functions as a nifA-binding site (TGTN10ACA) are shaded. The aa containing the putative P-loop (SGTTGITKG) and ATPase (TGD) motifs are shaded. Methods: Sequencing was performed by the dideoxy chain-termination method (Ausubel et al., 1987) using Sequenase 2.0 (US Biochemical, Cleveland, OH, USA). When necessary, GC compressions were resolved with terminal deoxynucleotidyl transferase (Gibco BRL, Bethesda, MD, USA) after completion of the sequencing reaction. The nt sequence was obtained for both strands using pBluescript SK+ derivatives which were constructed using PstI, XbaI and SstI. The T7 and T3 primers were used to sequence inserts cloned in pBluescript SK+, and internal primers (18 nt in length; synthesized by the Oklahoma State University Recombinant DNA/Protein Resource Facility) were used to fill gaps when necessary.
locate the tsp for cfl. A primer complementary to the RNA located between nt 884 and 903 revealed that the cfl transcript begins with the G located at nt 752 (Figs. 1 and 2). The region preceding the tsp showed no homology to σ54 or σ70 recognition sequences; however, a search for transcription factors revealed a sequence preceding the cfl tsp, CTGCA (nt 740-744, Fig. 1), which is identical to the -10 to -15 sequence of the K. pneumoniae nifH promoter (Brown and Ausubel, 1984). The potential contributions of the nifH-like promoter sequence and the nifA UAS to the transcriptional activation of cfl are currently under investigation.
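The two regulatory elements discussed in this section, the nifA-type UAS (TGTN10ACA, where N10 is any ten nt) and the nifH-like CTGCA element, are short patterns that can be located with a plain text scan. The Python sketch below illustrates the idea; the function names and the toy sequence are hypothetical, and the search tool actually used in this study is not specified in the text.

```python
import re

# TGTN10ACA: TGT, any 10 nt, then ACA. The lookahead lets overlapping
# candidates be reported as well.
UAS_PATTERN = re.compile(r"(?=(TGT[ACGT]{10}ACA))")


def find_uas(seq):
    """Return 1-based start positions of every TGTN10ACA match."""
    return [m.start() + 1 for m in UAS_PATTERN.finditer(seq.upper())]


def find_all(seq, motif):
    """Return 1-based start positions of every exact occurrence of motif."""
    seq, motif = seq.upper(), motif.upper()
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i + 1)
        i = seq.find(motif, i + 1)
    return hits


# Hypothetical toy sequence: a UAS-like element followed by CTGCA.
toy = "GG" + "TGT" + "ACGTACGTAC" + "ACA" + "TT" + "CTGCA" + "G"
print(find_uas(toy), find_all(toy, "CTGCA"))
```

Positions are reported 1-based so that they can be compared directly with the nt coordinates used in Fig. 1 (for example, CTGCA at nt 740-744).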
TABLE I
Bacterial strains and plasmids

| Strains, plasmids | Relevant properties | Reference |
|---|---|---|
| Bacterial strains a | | |
| Escherichia coli | | |
| BL21 | F- ompT rB- mB- | Novagen |
| MV1190 | Δ(lac-proAB), [F': lacIq ΔM15] | Bio-Rad |
| CJ236 | dut-1, ung-1, contains F' plasmid pCJ105 (CmR) | Bio-Rad |
| DH5α | Δ(lacZYA-argF)U169 | Ausubel et al. (1987) |
| Pseudomonas syringae pv. glycinea | | |
| PG4180 | COR+; contains p4180A | Bender et al. (1993) |
| PG4180.G73 | KmR CFA+/- CMA+ Cfl- | Bender et al. (1993) |
| PG4180.G70 | KmR CFA+/- CMA+ Cfl- | Bender et al. (1993) |
| PG4180.G82 | KmR CFA- CMA+ Cfl+ | Bender et al. (1993) |
| PG4180.G109 | KmR CFA- CMA+ Cfl- | Bender et al. (1993) |
| PG4180.G139 | KmR CFA- CMA+ Cfl- | Bender et al. (1993) |
| Plasmids | | |
| pRK415 | TcR; RK2-derived cloning vector | Keen et al. (1988) |
| pBluescript SK+ | ApR; ColE1 origin | Stratagene |
| pET22b+ | ApR; contains T7lac promoter | Novagen |
| M13mp19 | lacZ' | Ausubel et al. (1987) |
| p4180A | 90-kb plasmid in P. syringae pv. glycinea PG4180; contains COR biosynthetic gene cluster | Bender et al. (1993) |
| pCLB14 | TcR; 2.37 kb of p4180A in pRK415; contains cfl | Bender et al. (1993) |
| pHL8 | TcR; 2.9-kb SstI-SalI in pRK415 | Bender et al. (1993) |
| pHL2 | TcR; 1.9-kb SstI-XbaI in pRK415 | Bender et al. (1993) |
| pRG960sd | SmR SpR; 17.0 kb; promoterless uidA; contains SD and start codon | Van den Eede et al. (1992) |
| pHLP1 | SmR SpR; 0.46-kb XbaI-PstI of pHL2 in pRG960sd (XbaI-PstI-uidA) | This study |
| pHLP2 | SmR SpR; 0.46-kb XbaI-PstI of pHL2 in pRG960sd (PstI-XbaI-uidA) | This study |
| pHLET.1 | ApR; 1.57-kb NdeI-SstI fragment containing cfl in pET22b+ | This study |

a LB medium (Ausubel et al., 1987) was used for cultivation of E. coli DH5α. Strains of P. syringae were maintained on MG medium (Keane et al., 1970) supplemented with 0.25 g yeast extract/l (MGY). P. syringae broth cultures were grown in MGY on a rotary shaker (250 rpm) at 20-24°C. Antibiotics were added to media in the following concentrations (µg/ml): Ap, 40; Km, 25 (E. coli) or 10 (P. syringae); Sm, 25; Sp, 25; Tc, 12.5.

(c) Overproduction of Cfl
Production of Cfl in E. coli was accomplished by cloning cfl into pET22b+. The plasmid constructed for overexpression of cfl, pHLET.1, was derived using the strategy outlined in Fig. 3. Plasmid pHLET.1 was transformed into cells of E. coli BL21 which were lysogenic for phage DE3, induced with IPTG and analyzed for expressed proteins by SDS-PAGE. Strain BL21[pHLET.1] overproduced a 53-kDa polypeptide upon IPTG-induced expression of cfl from the T7lac promoter (Fig. 3, lane 1). Further experiments (data not shown) indicated that the overproduced Cfl was present almost exclusively in the insoluble fraction of the total cellular proteins in IPTG-induced BL21[pHLET.1].
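The 53-kDa band is consistent with the size expected from the ORF2 coordinates given in section (a). The Python sketch below works through that arithmetic. Two points are assumptions rather than statements from the text: that the 1464-nt ORF includes its stop codon, and that an average residue mass of about 110 Da is an adequate rule of thumb. The function names are hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average aa residue mass


def orf_length_nt(first_nt, last_nt):
    """Length of an ORF given inclusive 1-based nt coordinates."""
    return last_nt - first_nt + 1


def residues_encoded(length_nt, includes_stop=True):
    """Number of aa encoded by an in-frame ORF of the given length."""
    if length_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = length_nt // 3
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues):
    """Rough polypeptide mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


cfl_nt = orf_length_nt(772, 2235)   # ORF2 coordinates from section (a)
cfl_aa = residues_encoded(cfl_nt)   # assumes the stop codon is counted in the ORF
print(cfl_nt, cfl_aa, round(approx_mass_kda(cfl_aa), 1))
```

Under these assumptions the estimate comes out near 53.6 kDa, close to the apparent size on SDS-PAGE; an exact value would require summing the masses of the individual residues.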
(d) Alignment of the deduced Cfl aa sequence with similar proteins
The deduced aa sequence of Cfl was used to conduct a BLASTX search (Altschul et al., 1990) of the Swiss-PROT, PIR and GenPept databases. Cfl showed significant homology to several CoA ligases, luciferase and various peptide and ester synthetases (Table II). All enzymes listed in Table II are known to activate the carboxylic acid of the corresponding substrate through adenylation. Turgay et al. (1992) previously divided the superfamily of adenylate-forming enzymes into two groups. Enzymes in group I (Table II) activate their aa substrates by adenylation and bind them as thioesters. This group includes the peptide synthetases for tyrocidine, gramicidin, L-(α-aminoadipyl)-L-cysteinyl-D-valine synthetase and enterobactin synthetase component F. Group II (Table II) includes various adenylate-forming enzymes where thioester formation does not occur or has not been demonstrated. Many of the enzymes in group II are known to activate carboxylic acids; for example, 4-coumarate-CoA ligase, 4-chlorobenzoate-CoA ligase, enterobactin synthetase component E and bile acid-CoA ligase are known to activate 4-coumarate, 4-chlorobenzoate, 2,3-dihydroxybenzoic acid and C-24 bile acid, respectively (Lozoya et al., 1988; Mallonee et al., 1992; Schmitz et al., 1992; Staab et al., 1989; Turgay et al., 1992). Two motifs, SGTTGXXKG and TGD (Fig. 1, aa 167-175 and 379-381), are conserved in Cfl and all
enzymes listed in Table II. Hori et al. (1991) speculated that the sequence SGTTGXPKG may represent a new class of phosphate-binding loop (P-loop). Many ATP- and GTP-binding proteins have a P-loop, and the primary structure consists of a Gly-rich sequence followed by a conserved Lys. The conserved Lys residue is thought to be very important to the conformation of the canonical P-loop and may interact directly with the β and γ phosphates of the bound NTP (Saraste et al., 1990).

Fig. 2. The tsp for cfl. The 5' end of the cfl mRNA was determined by primer extension analysis with avian myeloblastosis virus (AMV) reverse transcriptase. The primer extension DNA product and the predicted mRNA sequence are indicated. Lanes G, A, T, C represent the sequence ladder generated by T7 DNA polymerase (Sequenase 2.0). Total RNA was isolated from P. syringae pv. glycinea PG4180 as described by Salmeron and Staskawicz (1993). Primer extension was conducted by combining 50 µg of RNA with 20 ng of a 20-nt primer (5'-CAACAGCCCGGCGGAGACTT), boiling 1 min and adding 10 µl reaction buffer (100 mM Tris-HCl pH 8.3/140 mM KCl/50 mM MgCl2/10 mM DTT/50 µM dATP, dGTP and dTTP/50 µCi [α-32P]dCTP) and 8 u AMV reverse transcriptase. After 30 min at 37°C, 3 µl of chase solution was added (25 mM of each dNTP), and the incubation continued at 30°C for 30 min. The reaction was then terminated by extracting once with phenol/chloroform (1:1, v/v), and nucleic acids were recovered by ethanol precipitation. The labelled pellet was then redissolved in a 20 µl volume containing equal volumes of TE and stop buffer (95% formamide/20 mM EDTA/0.05% bromophenol blue). The primer extension product was boiled 2 min prior to electrophoretic separation on an 8% polyacrylamide gel. The primer which was used to determine the tsp was also used to prime a nt sequencing reaction using pHL2 as template.
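The primer-extension geometry described for Fig. 2 can be checked with a few lines of Python. The sketch below reverse-complements the primer, confirms that it pairs with the sense strand at nt 884-903, and computes the length of the extension product expected for a tsp at nt 752. The primer is taken here as the 20-nt oligo 5'-CAACAGCCCGGCGGAGACTT; reading its first base as C is an assumption based on the stated 20-nt length and on the sense-strand excerpt, which is copied from the legible part of Fig. 1. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def extension_product_length(tsp_nt, primer_5prime_nt):
    """Length (nt) of the cDNA running from the primer 5' end back to the tsp.

    Both arguments are 1-based coordinates on the sense strand.
    """
    return primer_5prime_nt - tsp_nt + 1


PRIMER = "CAACAGCCCGGCGGAGACTT"                  # assumed 20-nt reading of the Fig. 2 primer
SENSE_EXCERPT = "CAAAGTCTCCGCCGGGCTGTTGCAAGG"    # sense strand around nt 884-903 (Fig. 1)

target = reverse_complement(PRIMER)
print(target, target in SENSE_EXCERPT)
print(extension_product_length(752, 903))
```

With the primer spanning nt 884-903 and the transcript starting at nt 752, the expected product is 152 nt long.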
Fig. 3. SDS-PAGE analysis of E. coli BL21 hyperproducing Cfl. Lanes 1-6 contain total cellular proteins from the following strains: 1, BL21[pHLET.1], 2 h induction with IPTG; 2, uninduced BL21[pHLET.1]; 3, BL21[pET22b+], 2 h induction with IPTG; 4, uninduced BL21[pET22b+]; 5, BL21, 2 h induction with IPTG; 6, uninduced BL21. Lane M contains the following molecular mass markers: glutamate dehydrogenase (55 kDa) and carbonic anhydrase (29 kDa). Methods: Site-directed mutagenesis: pHLET.1 was constructed by cloning cfl downstream from the T7lac promoter in pET22b+. To facilitate this, an NdeI site was introduced at the translation start site for cfl. Conversion of the ATG into an NdeI site was accomplished by site-directed mutagenesis and the following oligo primer: 5'-AGAAATCAGACTCATATGAGATAACCTTTTTTT. Mutagenesis was accomplished using the Muta-Gene M13 In Vitro Mutagenesis Kit (Version 2, Bio-Rad, Hercules, CA, USA). cfl was initially subcloned as an XbaI-SstI fragment into M13mp19 and transfected into E. coli CJ236. Uracil-containing ssDNA was isolated and used as a template for mutagenesis (Ausubel et al., 1987). Mutagenized DNA was then transformed into E. coli MV1190, and clones were screened for the NdeI site. Overexpression of cfl: BL21 cells containing pET22b+ or pHLET.1 were cultured in 5 ml LB broth for 3 h at 37°C, 280 rpm, and then supplemented with 1 mM IPTG. Cells were pelleted, washed once in TE buffer and then resuspended in 250 µl 10 mM Tris·HCl pH 8.0. 25 µl of resuspended cells were mixed with an equal volume of 2× SDS-PAGE sample buffer (Bio-Rad) and boiled 5 min. Samples were loaded onto a 0.1% SDS-4-20% PAGE gel (Mini-Protean II Ready Gel, Bio-Rad), and electrophoresis was for 5 h at 50 V.

Gocht and Marahiel (1994) recently used site-directed mutagenesis to demonstrate the importance of the Lys residue in the putative ATP-binding site in tyrocidine synthetase I. The TGD motif which is present in Cfl and known adenylate-forming enzymes (Turgay et al., 1992) was previously shown to be conserved in many cation-ATPases, and mutagenesis was recently used to verify the significance of the Asp residue (Gocht and Marahiel, 1994). The strong conservation of the P-loop and ATPase motifs in Cfl suggests that these are critical components of the adenylation reaction catalyzed by Cfl. Four additional core sequences (Gocht and Marahiel, 1994) have been identified in enzymes which activate aa substrates as acyladenylates and bind them as thioesters (group I enzymes, Table II). These additional motifs are either poorly conserved or absent in enzymes listed in group II (Table II), which is consistent with the inability of these enzymes to form thioesters (Turgay et al., 1992). The four motifs associated with the aa-activating enzymes
TABLE II
Proteins with similarity to Cfl

| Enzyme a | Origin | Identity b (%) | Similarity b (%) |
|---|---|---|---|
| Group I | | | |
| Enterobactin synthetase component F (ENTF) | Escherichia coli | 26.5 | 48.4 |
| Tyrocidine synthetase I (TYCA) | Bacillus brevis | 24.1 | 46.1 |
| Gramicidin S synthetase II (GRS2) | Bacillus brevis | 25.2 | 47.3 |
| L-(α-aminoadipyl)-L-cysteinyl-D-valine synthetase (ACVS) | Cephalosporium acremonium | 24.5 | 45.6 |
| Group II | | | |
| 4-Coumarate:CoA ligase (4-CL) | Petroselinum crispum | 23.8 | 46.4 |
| Luciferase (LUCI) | Luciola mingrelica | 23.3 | 45.8 |
| 4-Chlorobenzoate CoA ligase (4-CBA) | Arthrobacter sp. | 25.9 | 48.5 |
| Enterobactin synthetase component E (ENTE) | Escherichia coli | 25.0 | 45.0 |
| Bile acid-CoA ligase (BAIB) | Eubacterium sp. | 24.5 | 50.0 |

a References for the cited enzymes are as follows: ENTF (Rusnak et al., 1991); TYCA (Weckermann et al., 1988); GRS2 (Turgay et al., 1992); ACVS (Gutiérrez et al., 1991); 4-CL (Lozoya et al., 1988); LUCI (Devine et al., 1993); 4-CBA (Schmitz et al., 1992); ENTE (Staab et al., 1989); BAIB (Mallonee et al., 1992).
b Percent similarity and identity of the cited enzymes to the deduced aa product of cfl were determined by BESTFIT analysis (GCG).
are also either absent or poorly conserved in Cfl, suggesting that Cfl is more closely related to the enzymes in group II.
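The two conserved motifs discussed in this section, the putative P-loop (SGTTGXXKG, where X is any aa) and the ATPase motif (TGD), can be located in a protein sequence with simple pattern matching. The Python sketch below illustrates this; the function name and the toy protein are hypothetical, and only the P-loop instance SGTTGITKG is taken from the Fig. 1 legend.

```python
import re

P_LOOP = re.compile(r"SGTTG..KG")  # SGTTGXXKG, X = any aa
ATPASE = re.compile(r"TGD")


def find_motif(protein, pattern):
    """Return (1-based start, matched text) for each non-overlapping match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(protein.upper())]


# Hypothetical toy protein carrying the Cfl P-loop from Fig. 1 and a TGD motif.
toy_protein = "MA" + "SGTTGITKG" + "LL" + "TGD" + "K"
print(find_motif(toy_protein, P_LOOP))
print(find_motif(toy_protein, ATPASE))
```

Run on the full deduced Cfl sequence, the same scan would be expected to report the P-loop at aa 167-175 and TGD at aa 379-381, the positions given in the text.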
(e) Conclusions
(1) Nucleotide sequence analysis indicated that cfl is encoded by a 1.46-kb ORF. The cfl tsp was localized in the present study, and analysis of upstream DNA suggested relatedness to nif regulatory sequences.
(2) Analysis of the deduced aa product of cfl suggested relatedness to a group of adenylate-forming enzymes which do not bind their substrates as thioesters and do not require 4'-phosphopantetheine as a cofactor. Enzymes in this category include 4-coumarate-CoA ligase, bile acid-CoA ligase and enterobactin synthetase component E; these polypeptides share the activity of ATP-dependent linking of cyclic carboxylic acids to AMP or CoA. By analogy, we hypothesize that Cfl adenylates CFA, a cyclic carboxylic acid, and that the activated form of CFA is subsequently ligated to CMA. Current efforts are focussed on purification of Cfl and development of a cell-free assay to study its activity in vitro.
ACKNOWLEDGEMENTS
The authors would like to thank M. Ullrich, D. Gross, R. Parry, R. Mitchell and P. Reynolds for stimulating discussions relevant to this study. This research was supported by Oklahoma Agricultural Experiment Station project 2009 and by NSF grants INT-9220628 and MCB-9316488.
REFERENCES
Altschul, S.F., Gish, W., Miller, W., Myers, E.W. and Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215 (1990) 403-410.
Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K.: Current Protocols in Molecular Biology, Greene Publishing and Wiley, New York, NY, 1987.
Bender, C.L., Liyanage, H., Palmer, D.A., Ullrich, M. and Mitchell, R.: Characterization of the genes controlling the biosynthesis of the polyketide phytotoxin coronatine including conjugation between coronafacic and coronamic acid. Gene 133 (1993) 31-38.
Brown, S.E. and Ausubel, F.M.: Mutations affecting regulation of the Klebsiella pneumoniae nifH (nitrogenase reductase) promoter. J. Bacteriol. 157 (1984) 143-147.
Buck, M., Miller, S., Drummond, M. and Dixon, R.: Upstream activator sequences are present in the promoter of nitrogen fixation genes. Nature 320 (1986) 374-378.
Devine, J.H., Kutuzova, G.D., Green, V.A., Ugarova, N.N. and Baldwin, T.O.: Luciferase from the East European firefly Luciola mingrelica: cloning and nucleotide sequence of the cDNA, overexpression in Escherichia coli and purification of the enzyme. Biochim. Biophys. Acta 1173 (1993) 121-132.
Fiandt, M., Szybalski, W. and Malamy, M.H.: Polar mutations in lac, gal, and phage λ consist of a few IS-DNA sequences inserted with either orientation. Mol. Gen. Genet. 119 (1972) 223-231.
Gocht, M. and Marahiel, M.A.: Analysis of core sequences in the D-Phe activating domain of the multifunctional peptide synthetase TycA by site-directed mutagenesis. J. Bacteriol. 176 (1994) 2654-2662.
Gutiérrez, S., Diez, B., Montenegro, E. and Martin, J.F.: Characterization of the Cephalosporium acremonium pcbAB gene encoding α-aminoadipyl-cysteinyl-valine synthetase, a large multidomain peptide synthetase: linkage to the pcbC gene as a cluster of early cephalosporin biosynthetic genes and evidence of multiple functional domains. J. Bacteriol. 173 (1991) 2354-2365.
Hori, K., Yamamoto, Y., Tokita, K., Saito, F., Kurotsu, T., Kanda, M., Okamura, K., Furuyama, J. and Saito, Y.: The nucleotide sequence for a proline-activating domain of gramicidin S synthetase 2 gene from Bacillus brevis. J. Biochem. 110 (1991) 111-119.
Keane, P.J., Kerr, A. and New, P.B.: Crown gall of stone fruit, II. Identification and nomenclature of Agrobacterium isolates. Aust. J. Biol. Sci. 23 (1970) 585-595.
Keen, N.T., Tamaki, S., Kobayashi, D. and Trollinger, D.: Improved broad-host-range plasmids for DNA cloning in Gram-negative bacteria. Gene 70 (1988) 191-197.
Lozoya, J., Hoffman, H., Douglas, C., Schulz, W., Scheel, D. and Hahlbrock, K.: Primary structure and catalytic properties of isoenzymes encoded by the two 4-coumarate:CoA ligase genes in parsley. Eur. J. Biochem. 176 (1988) 661-667.
Mallonee, D.H., Adams, J.L. and Hylemon, P.B.: The bile acid-inducible baiB gene from Eubacterium sp. strain VPI 12708 encodes a bile acid-coenzyme A ligase. J. Bacteriol. 174 (1992) 2065-2071.
Mitchell, R.E.: Coronatine biosynthesis: incorporation of L-[U-14C]isoleucine and L-[U-14C]threonine into the 1-amido-1-carboxy-2-ethylcyclopropyl moiety. Phytochemistry 24 (1985) 247-249.
Parry, R.J., Mhaskar, S.V., Lin, M.-T., Walker, A.E. and Mafoti, R.: Investigations of the biosynthesis of the phytotoxin coronatine. Can. J. Chem. 72 (1994) 86-99.
Rusnak, F., Sakaitani, M., Drueckhammer, D., Reichert, J. and Walsh, C.T.: Biosynthesis of the Escherichia coli siderophore enterobactin: sequence of the entF gene, expression and purification of EntF, and analysis of covalent phosphopantetheine. Biochemistry 30 (1991) 2916-2927.
Salmeron, J.M. and Staskawicz, B.J.: Molecular characterization and hrp dependence of the avirulence gene avrPto from Pseudomonas syringae pv. tomato. Mol. Gen. Genet. 239 (1993) 6-16.
Saraste, M., Sibbald, P.R. and Wittinghofer, A.: The P-loop - a common motif in ATP- and GTP-binding proteins. Trends Biochem. Sci. 15 (1990) 430-434.
Schmitz, A., Gartemann, K.H., Fiedler, J., Grund, E. and Eichenlaub, R.: Cloning and sequence analysis of genes for dehalogenation of 4-chlorobenzoate from Arthrobacter sp. strain SU. Appl. Environ. Microbiol. 58 (1992) 4068-4071.
Staab, J.F., Elkins, M.F. and Earhart, C.F.: Nucleotide sequence of the Escherichia coli entE gene. FEMS Microbiol. Lett. 59 (1989) 15-20.
Turgay, K., Krause, M. and Marahiel, M.A.: Four homologous domains in the primary structure of GrsB are related to domains in a superfamily of adenylate-forming enzymes. Mol. Microbiol. 6 (1992) 529-546.
Van den Eede, G., Deblaere, R., Goethals, K., Van Montagu, M. and Holsters, M.: Broad host range and promoter selection vectors for bacteria that interact with plants. Mol. Plant-Microb. Interact. 5 (1992) 228-234.
Weckermann, R., Fürbass, R. and Marahiel, M.A.: Complete nucleotide sequence of the tycA gene coding the tyrocidine synthetase 1 from Bacillus brevis. Nucleic Acids Res. 16 (1988) 11841.