Gene 215 (1998) 213–222
Identification of mycoplasmal promoters in Escherichia coli using a promoter probe vector with Green Fluorescent Protein as reporter system
S. Dhandayuthapani, W.G. Rasmussen, J.B. Baseman *
Department of Microbiology, The University of Texas Health Science Center at San Antonio, 7703 Floyd Curl Dr., San Antonio, TX 78284, USA
Received 26 February 1998; accepted 17 April 1998; Received by D. Schlessinger
Abstract
A promoter probe vector, pGFPUV2, with Green Fluorescent Protein (GFP) as the reporter system was constructed to identify potential mycoplasmal promoter-containing regulatory sequences in E. coli. Libraries of M. pneumoniae and M. genitalium DNA constructed in pGFPUV2 and transformed into E. coli resulted in GFP-expressing clones. Primer extension analysis with E. coli RNA from five M. pneumoniae clones and two M. genitalium clones indicated that transcription originated from the insert DNA fragments of these promoter constructs. Primers based on the insert DNA sequences were used in primer extension reactions with total RNA isolated from M. pneumoniae or M. genitalium. Of the seven primers used, three generated products by primer extension with mycoplasmal RNA. However, only one of the DNAs had a 5′ end similar to that obtained in a comparable reaction with E. coli RNA, and the start site of this transcript appeared to originate one base prior to the predicted open reading frame. These results indicate that E. coli can identify mycoplasmal promoters that have transcriptional elements resembling E. coli promoters. © 1998 Elsevier Science B.V. All rights reserved.
Keywords: Recombinant DNA; Expression of gfp; Primer extension; Promoter sequences
1. Introduction
Mycoplasmas, the cell wall-free eubacteria of the class Mollicutes, parasitize a wide variety of hosts including plants, insects, animals and humans. Pneumonia, arthritis and urethritis are common human diseases in which mycoplasmas play major roles (Krause and Taylor-Robinson, 1992; Horner et al., 1993). Recently, several Mycoplasma species have also been implicated in the progression of AIDS, malignant transformation, chronic fatigue syndrome, acute respiratory distress syndrome and other maladies (Lo, 1992; Baseman and Tully, 1997). The pathogenic scheme of mycoplasmas primarily involves their adherence to eukaryotic target cells and subsequent colonization of tissues (Tryon and Baseman, 1992; Baseman, 1993; Baseman et al., 1995, 1996; Baseman and Tully, 1997). To gain better understanding of the mycoplasmal genes involved in virulence and
* Corresponding author. Tel.: (210) 567 3939; Fax: (210) 567 6612. Abbreviations: DEPC, diethyl pyrocarbonate; DTT, dithiothreitol; GFP, Green Fluorescent Protein; PBS, phosphate buffered saline. 0378-1119/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0378-1119(98)00260-1
pathogenesis, the genomes of Mycoplasma pneumoniae (816 kb) and Mycoplasma genitalium (580 kb) were sequenced (Fraser et al., 1996; Himmelreich et al., 1996), which revealed important insights concerning their genetics and physiology. For example, in contrast to the existence of at least six sigma factors in E. coli, M. genitalium and M. pneumoniae appear to have a single sigma factor that is equivalent to the principal sigma factors σ70 and σA of Escherichia coli and Bacillus subtilis, respectively. The alternative sigma factors, which recognize different −35 and −10 sequences and regulate the expression of various genes in response to a spectrum of physiological and environmental signals (Haldenwang, 1995), are totally absent in these mycoplasmas. This is significant since these alternative sigma factors play key roles in regulating virulence factors in bacterial pathogens (Guiney et al., 1995). In this context, how M. pneumoniae and M. genitalium regulate their gene expression as they invade diverse human environmental habitats remains a fundamental unanswered question. Thus far, little progress has been made in identifying the regulatory sequences of M. pneumoniae and M.
S. Dhandayuthapani et al. / Gene 215 (1998) 213–222
genitalium. This may be due partly to the lack of plasmids with suitable reporter systems. Although β-galactosidase has been used for promoter studies in Mycoplasma gallisepticum and Acholeplasma sp. (Knudtson and Minion, 1993), our unpublished observations indicate that this Tn4001-based expression reporter system is ineffective in M. pneumoniae and M. genitalium. Another study (Knudtson and Minion, 1994) indicated that approx. 10% of the Acholeplasma oculi promoters identified in E. coli demonstrated promoter activity when transformed into A. oculi. However, it is unclear whether E. coli and mycoplasmal RNA polymerases recognize the same promoter positions in the DNA fragments. Therefore, in this study we investigated whether E. coli could serve as a host for the expression signals of M. pneumoniae and M. genitalium. In addition to identifying putative mycoplasmal promoter-containing regulatory sequences in E. coli, we compared the 5′ ends of mRNA transcripts of the E. coli-recognized promoters to RNA obtained from M. pneumoniae and M. genitalium.
We used a promoter probe vector with the gene encoding Green Fluorescent Protein (GFP) of the jellyfish Aequorea victoria as the reporter system to isolate putative mycoplasmal promoters in E. coli. GFP is a recently developed reporter system (Chalfie et al., 1994) which has several advantages over other conventional reporter systems, such as β-galactosidase, chloramphenicol acetyl transferase and firefly luciferase. In particular, GFP does not require any substrate or cofactors to visualize fluorescence, and simple irradiation of GFP with appropriate light (395 nm) is sufficient to observe green fluorescence (508 nm). The green fluorescence of GFP is due to the formation of a chromophore by cyclization and auto-oxidation of the tripeptide Ser–Tyr–Gly (Chalfie et al., 1994), and this chromophore formation has been demonstrated to occur in a wide variety of species.
Therefore, GFP has become an attractive reporter system for various studies in bacteria, including host–pathogen interactions and in vivo analysis of gene expression (Dhandayuthapani et al., 1995; Valdivia and Falkow, 1996).
2. Materials and methods
2.1. Bacterial strains, plasmids and culture conditions
M. pneumoniae strain B16 and M. genitalium strain G37 were grown in 32-oz (ca. 950 ml) glass prescription bottles in 100 ml of SP-4 medium at 37°C for 72 h. Glass-adherent mycoplasmas were washed four times with phosphate buffered saline (pH 7.2), scraped into PBS and collected by centrifugation (9500 × g for 20 min). E. coli strain DH5α was grown at 37°C in LB broth. Plasmid pISM2061 was a kind gift from Dr. C. Minion,
Iowa State University, and plasmid pGFPUV was purchased from Clontech (Palo Alto, CA, USA).
2.2. DNA manipulations
Standard recombinant DNA techniques were followed (Ausubel et al., 1989). Plasmid pGFPUV1 was created by deletion of the Plac promoter region from the plasmid pGFPUV (Clontech, Genbank Accession No. U62636) using the restriction enzymes SpaI and HindIII and religation after treatment with Klenow to generate blunt ends. Plasmid pGFPUV2 (Fig. 1) was generated by cloning a 2.5-kb DNA fragment (HindIII fragment blunt-ended with Klenow) containing the gentamycin-resistance gene into the StuI site of pGFPUV1. The gentamycin-resistance gene is a part of the transposon Tn4001 (Genbank Accession Nos M18086 and M29261) (Rouch et al., 1987) and was obtained from the Tn4001-containing plasmid pISM2061 (Knudtson and Minion, 1993). Chromosomal DNA from M. pneumoniae and M. genitalium was isolated as reported earlier (Dallo et al., 1989). Promoter libraries of M. pneumoniae and M. genitalium were generated by cloning AluI-digested mycoplasmal DNA fragments (0.2–2 kb) into the SmaI site of the plasmid pGFPUV2. Two PCR fragments, PRO19 (290 bp) and PRO24 (330 bp), were generated using a standard protocol with M. genitalium genomic DNA as template and with primers 19FD (5′-GAAAGATAAAAAGAGGGAAT-3′) and 19RT (5′-ACAACAGGATCATAGATCTC-3′) for PRO19, and 24FD (5′-TTTTGTTATTAACTATCCAA-3′) and 24RT (5′-ATGGTGCCATTACACGGGG-3′) for PRO24, respectively. Sequencing of DNA fragments was performed using an automated cycle sequencing system (Applied Biosystem model 373, Centre DNA Technology Core Facility, UTHSCSA) with fluorescent terminators and by Amplicycle sequencing kit (Perkin Elmer, NJ, USA) using PCR. The genomic locations of the sequences were identified using NCBI databases.
2.3. Isolation of RNA
RNA from E. coli expressing GFP was isolated by the CsCl technique (Schurr et al., 1995). LB broth (100 ml) inoculated with 1 ml of an overnight culture of E. 
coli was grown with vigorous shaking. The log phase culture was rapidly cooled on dry ice and the bacteria collected by centrifugation. The pellet was resuspended in 5 ml of ice-cold lysis buffer (50 mM Tris–HCl (pH 7.5)) and transferred to 13 ml Sarstedt centrifuge tubes. The cells were resuspended thoroughly, and 1 ml of 20% SDS was added. This lysate was incubated at 65°C for 2 min, and 4 g CsCl was added. After the addition of 5 ml of lysis buffer with mixing, the lysate was centrifuged at 11 000 rpm for 10 min at room temperature. The supernatant obtained was layered over a 2 ml 5.7 M CsCl cushion and centrifuged using a SW50.1 rotor for 15 h at 35 000 rpm and 15°C. The translucent RNA pellet was dissolved in DEPC-treated H2O, extracted with chloroform and precipitated with ethanol. The RNA was again dissolved in H2O, and the concentration was determined spectrophotometrically. Isolation of RNA from M. genitalium and M. pneumoniae was performed as follows. Four millilitres of extraction buffer (4 M guanidinium thiocyanate, 24 mM sodium citrate (pH 4.0), 0.5% sarcosyl, 100 mM mercaptoethanol) were added to 48 h cultures of glass-adherent mycoplasmas in Roux bottles. Cells were scraped and lysed by forcing the preparation through a 23-gauge syringe. The lysate was overlaid on a CsCl cushion and centrifuged as for E. coli RNA. The purified RNA was dissolved in DEPC-treated water.
Fig. 1. (A) Transcriptional fusion vector pGFPUV2 based on gfp, the gene encoding codon-optimized GFP for visualization with UV light. Genr, gentamycin-resistance gene; Ampr, ampicillin-resistance gene; ori, plasmid origin of replication; gfp, the gene coding for codon-optimized GFP for visualization with UV light; EcoRI and SpeI, restriction sites in the vector; MCS, multiple restriction sites for cloning DNA fragments in front of gfp. (B) Description of DNA sequence on the 5′ and 3′ ends of gfp. Single asterisk represents the start of gfp; double asterisks represent the stop of gfp; triple asterisks represent the start of the 2.5 kb gentamycin-resistance gene-containing fragment; SphI, PstI, XbaI, SmaI, KpnI and EcoRI are restriction sites.
2.4. Primer extension analysis
To determine the 5′ end of mRNA transcripts arising from the putative promoters, primer extension was performed. Briefly, 50 µg of total RNA and [γ-32P]-labeled primer (20 000 cpm) in 10 µl were heated at 65°C for 5 min and cooled slowly. After annealing of the primer, 4 µl of 5× first strand buffer (0.25 M Tris–HCl (pH 8.3), 0.375 M KCl, 15 mM MgCl2), 2 µl of 0.1 M DTT, 2 µl of 10 mM dNTPs and 1.5 µl of H2O were added and warmed at 37°C for 5 min. Reverse transcriptase (0.5 µl, Superscript, BRL) was then added, and the sample was incubated at 37°C for 30 min. The reaction mixture was extracted with phenol and precipitated with ethanol. The dried precipitate was dissolved
in 10 µl of formamide-loading buffer, and 3 µl was loaded on an 8% sequencing gel. The sequence of the promoter-containing insert DNA fragment, generated with the same primer, was included for comparative purposes. The primers used were: GFP1 (5′-AAGAATTGGGACAACTCC-3′), 1RT2 (5′-CTTTTAACAACGCTAGTGAATGCT-3′), 3RT (5′-ATGATCCAGCGCTTTAAGCAAAAG-3′), 4RT (5′-TAAAGCCCATGTTAGCCATCACTTGAG-3′), 7RT (5′-TAAAGGTACGAATAACAAAGCTTC-3′), 8RT (5′-CCAGCACCGTTACAACCAAAAGCA-3′), 19RT (5′-ACAACAGGATCATAGATCTC-3′) and 24RT (5′-ATGGTGCCATGTACACGGGG-3′). Of these primers, GFP1 was designed based on the sequence near the start codon of the gfp gene in the vector pGFPUV2. The other primers were designed either based on the insert DNA sequences in the promoter constructs (primers 3RT, 4RT, 7RT and 8RT) or sequences from the genome adjacent to the 3′ end of insert DNAs (primers 19RT and 24RT). The positions of the primers other than GFP1 are depicted in Fig. 3.
2.5. Fluorescent assays
The intensity of green fluorescence by E. coli harboring GFP-expressing constructs was quantified in an SLM Aminco SPF-500C spectrofluorometer with excitation and emission maxima of 385 and 508 nm, respectively. E. coli cells were suspended in PBS before quantification by spectrofluorometry. Serial dilutions of the bacterial suspensions in PBS gave proportional fluorescence, and the intensity was expressed as relative fluorescence at 508 nm.
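The text does not spell out how relative fluorescence was computed. Table 1 lists the promoterless vector as 0 and the brightest clone, pMP3GFP, as 1.000, which would be consistent with background subtraction followed by scaling to the brightest construct. The Python sketch below illustrates that assumed procedure only; the raw readings and the function name relative_fluorescence are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev

# Hypothetical raw emission readings at 508 nm (arbitrary units),
# three replicates per construct. Invented for illustration only.
raw = {
    "pGFPUV2": [12.0, 11.0, 13.0],     # promoterless vector (background)
    "pMP3GFP": [812.0, 790.0, 834.0],  # brightest clone, used as reference
    "pMP7GFP": [160.0, 155.0, 168.0],
}

def relative_fluorescence(readings, background="pGFPUV2", reference="pMP3GFP"):
    """Subtract mean background, then scale so the reference mean equals 1."""
    bg = mean(readings[background])
    ref = mean(readings[reference]) - bg
    result = {}
    for name, values in readings.items():
        scaled = [(v - bg) / ref for v in values]
        result[name] = (mean(scaled), stdev(scaled))
    return result

table = relative_fluorescence(raw)
for name, (m, sd) in table.items():
    print(f"{name}: {m:.3f} ± {sd:.3f}")
```

Scaling to one reference construct makes values comparable across measurement sessions, but it also means the reference's own variability is folded into every other value.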
3. Results
3.1. Construction of a promoter probe vector
Although it would be ideal to analyze mycoplasmal promoter activity in the microorganism itself, the use of plasmids and recombinant manipulations in mycoplasmas is not yet feasible. Thus E. coli was employed in this study because of its genetic utility. As a vector we chose pGFPUV, in which the gene encoding GFP is codon optimized for prokaryotic gene expression and also modified for screening green fluorescence using ultraviolet light. Furthermore, the intensity of green fluorescence of this mutated GFP is 16 times higher than that of wild-type GFP (GFP application notes PT2040-1, Clontech). This high fluorescence of the GFP is likely to facilitate the identification of even weak promoters. A promoterless gfp-containing vector was created by the deletion of the Plac promoter located in the region upstream of gfp in the plasmid pGFPUV, as described in the Materials and Methods section, resulting in the vector pGFPUV1. The gentamycin-resistance gene of transposon Tn4001 from the plasmid pISM2061 was cloned behind gfp in pGFPUV1 to generate pGFPUV2 (Fig. 1). The plasmid pGFPUV2 with a gentamycin-resistance gene was designed to be used both as a plasmid vector in E. coli and as an integrating vector in mycoplasma, as reported earlier with a lacZ reporter system (Knudtson and Minion, 1994). E. coli strain DH5α transformed with this promoterless vector pGFPUV2 did not exhibit detectable green fluorescence, suggesting that there is no read-through expression due to other promoters in the plasmid. This modified vector still retains several unique restriction sites in the multiple cloning site region, which can be used for cloning promoter-containing DNA fragments to create transcriptional fusions with gfp.
3.2. Isolation of M. pneumoniae and M. genitalium putative promoter clones
In order to isolate M. pneumoniae and M. genitalium DNA fragments displaying promoter activity in E. 
coli, DNA libraries of these Mycoplasma species were constructed in the vector pGFPUV2. We used AluI-digested DNA fragments of these Mycoplasma species since Sau3AI, which is often used to create such libraries, is an infrequent cutter of M. pneumoniae and M. genitalium DNA. Upon transformation into E. coli, these libraries yielded approximately 3000 transformants each for M. genitalium and M. pneumoniae. Random analysis of the transformants by miniprep revealed that 75% of the colonies were recombinants with different insert sizes. However, among the transformants screened for green fluorescence using a transilluminator as the UV source, only eight positive colonies from M. pneumoniae and six
positive colonies from M. genitalium were identified. Furthermore, analysis of the transformants for green fluorescence by epifluorescence microscopy did not increase the number of promoter clones from M. pneumoniae and M. genitalium DNA libraries. After determining the sequences of the inserts in these promoter clones to avoid duplicate clones, we selected five promoter clones from M. pneumoniae and two promoter clones from M. genitalium for further analysis (Table 1). Spectrofluorometric examination of E. coli cells containing these putative promoter clones revealed differences in intensity of green fluorescence among those clones (Table 1), possibly suggesting differences in promoter strength. Extracts of cells harboring promoter constructs analyzed by Western blot with anti-GFP antibodies revealed bands similar to the size of wild-type GFP (data not shown). This observation is consistent with transcriptional rather than translational fusions of the cloned sequences.

Table 1
Fluorescence levels of E. coli containing promoter constructs

Construct (a)    Relative fluorescence (b), mean ± SD
pGFPUV2          0
pMP1GFP          0.246 ± 0.026
pMP3GFP          1.000 ± 0.026
pMP4GFP          0.283 ± 0.011
pMP7GFP          0.187 ± 0.007
pMP8GFP          0.428 ± 0.012
pMG19GFP         0.583 ± 0.034
pMG24GFP         0.180 ± 0.018

(a) Constructs pMP1GFP, pMP3GFP, pMP4GFP, pMP7GFP, pMP8GFP and constructs pMG19GFP and pMG24GFP are from M. pneumoniae and M. genitalium DNA libraries, respectively.
(b) The fluorescence levels were determined as described in Materials and Methods with excitation and emission wavelengths 395 and 508 nm, respectively.

3.3. Primer extension analysis of the putative promoters
A major objective of this study was to determine whether mRNA start sites originating from mycoplasmal DNA in E. coli correspond to the transcription initiation sites used in M. pneumoniae or M. genitalium. Therefore, the transcriptional start points (5′ ends) of these potential mycoplasmal promoters were determined by primer extension analysis. Total RNA from E. coli harbouring the promoter constructs was isolated and used as templates for the primer extension reactions. In initial studies, RNA from each E. coli promoter clone was hybridized with [γ-32P]-labeled primer GFP1 and extended with reverse transcriptase. The GFP1 primer, which originated from the 5′ region of gfp, allowed us to identify the transcriptional start points for promoter constructs pMP3GFP (mp3), pMP4GFP (mp4), pMP7GFP (mp7), pMG19GFP (mg19) and pMG24GFP (mg24) (Table 2a). However, precise mapping of
the start points for constructs pMP1GFP (mp1) and pMP8GFP (mp8) was not possible using this GFP1 primer since the start points were located noticeably further upstream. Mapping of the transcriptional starts in these constructs was accomplished by using two additional primers, 1RT2 and 8RT, based on the insert sequences for pMP1GFP and pMP8GFP, respectively (see Materials and Methods). Primer extension products for constructs pMP1GFP and pMG24GFP showed two strong bands for each (Fig. 2) with size differences of a few bases. In both cases, the top bands were considered as the transcriptional starts since the appearance of additional bands in close vicinity may simply be due to RNA degradation.

Table 2
Putative −35 and −10 promoter regions of M. pneumoniae and M. genitalium

Promoter            Sequence
(a) Transcriptional starts of promoters determined using E. coli RNA
mp1                 AGAAGGATTGTCAAAAATATTTTTAAAGTGCTAGAATAAAGC
mp3                 CAGGTCTTGCGCTTTAAGGGTTTATAGTGTTTAATCTTGTCA
mp4                 TTTGAAATTGACATCCGTTGCTGGTTGTGTCAGAATGGTTGT
mp7                 GTTTAATGATGAGAGTGGAAGTCTTTTCGGTTATTATCCTTGT
mp8                 GACTCGTTGACTAATGTATGGGTGAGTGGTACTGTTACTAATT
mg19                TAATGGTTGCAATGATATTAAGATTGGTGATATCATTGTTGCT
mg24                TTAAACATTGACACAAAAAGTTTGAACTGATATCTTGATATGA
Consensus           TTGACA (16–19 bp) TATTAT (5–8 bp)
E. coli consensus   TTGACA (16–18 bp) TATAAT (6–8 bp)
(b) Transcriptional starts of promoters determined using mycoplasmal RNA
mp1                 AGAAGGATTGTCAAAAATATTTTTAAAGTGCTAGAATAAAGC
mp7                 CAACAAGGTTTTTAGAATTGGTGATAGGTTAACCGCGAAACTG
mg24                TTTTACATGA0AAACAAATACTGACAATAAAACTGTTGCTGCAG
p1                  CTACGACAACAACAGTTGCTGTTTAGATTCTTTAAACTTAAACAG
crl                 CATTTATTCATTTGCATTTTTTTAGATAAAATTAAAATTAATGGTA
rrnA                ATTCTTTAAACATAAATAAAAAGTTTTTCTGTATAATCTTCAGG
Consensus           (15–18 bp) TAAAAT (6–10 bp)

Bold letters in the consensus category indicate that 70% of the promoter sequences contained that specific base in that position. Non-bold letters in the consensus category indicate that at least 40% of the promoter sequences contained that specific base in that position. Bold underlined letters indicate transcriptional starts. mp, M. pneumoniae; mg, M. genitalium; p1, major 170 kDa adhesin gene of M. pneumoniae; crl, cytadherence-related locus of M. pneumoniae; rrnA, ribosomal RNA operon of M. pneumoniae.

After the start sites were determined with E. coli-based RNA as template, the procedure was repeated with M. pneumoniae or M. genitalium RNA (depending upon the clone origin) to detect the corresponding mRNA transcripts. In addition to primers 1RT2 and 8RT, primers 3RT, 4RT and 7RT, which are positioned 3′ to the E. coli-identified transcriptional starts in the insert DNAs of the constructs pMP3GFP, pMP4GFP and pMP7GFP, respectively (Fig. 3), were used for primer extension reactions with M. pneumoniae RNA. Since the E. coli RNA-based transcriptional starts for the constructs pMG19GFP and pMG24GFP were too close to the 3′ end of the insert DNAs of these constructs, primers based on the regions could not resolve the primer extension products. Therefore, primers 19RT and 24RT were synthesized based on the genome sequences of M. genitalium adjacent to the 3′ end of the
insert DNAs of the constructs pMG19GFP and pMG24GFP (Fig. 3). Consequently, two PCR fragments, PRO19 and PRO24, encompassing part of the insert DNAs in pMG19GFP and pMG24GFP, respectively, were also made (see Materials and Methods) to generate sequence ladders with the primers 19RT and 24RT, respectively. Of the seven primers used, only primers 1RT2 and 7RT, which represented the constructs pMP1GFP and pMP7GFP of M. pneumoniae, respectively, and 24RT, which represented the construct pMG24GFP of M. genitalium, gave positive signals in primer extension. Increased concentrations of M. pneumoniae and M. genitalium RNA (100 µg per reaction) did not yield positive signals for the other constructs. The primer extension products obtained with E. coli and mycoplasmal RNAs for constructs pMP1GFP, pMP7GFP and pMG24GFP are presented in Fig. 2. The 5′ end of the RNA expressed from promoter construct pMP1GFP alone mapped at the same position, regardless of the source of RNA. This suggests that this DNA fragment carries an M. pneumoniae promoter. We aligned the DNA sequences upstream of the transcriptional start sites to attempt the identification of conserved sequence elements (Table 2a). Hexameric sequences representing −35 and −10 regions were readily discernible in all potential promoters. The sequences TTGACA and TATTAT were found at −35 and −10 regions, respectively. These consensus sequences are similar to E. coli σ70 promoters for which the consensus sequences are TTGACA and TATAAT for the −35 and −10 regions, respectively.
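To make the idea of locating −35 and −10 hexamers concrete, the following Python sketch scans a sequence for the pair of hexamers that best matches the E. coli consensus (TTGACA and TATAAT) with a 16–18 bp spacer, and applies it to the mp4 sequence from Table 2a. This is a naive match-counting illustration, not the alignment procedure used in the study; the function names matches and best_promoter are hypothetical.

```python
MINUS35 = "TTGACA"  # E. coli sigma-70 consensus, -35 hexamer
MINUS10 = "TATAAT"  # E. coli sigma-70 consensus, -10 hexamer

def matches(hexamer, consensus):
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))

def best_promoter(seq, spacers=range(16, 19)):
    """Return the highest-scoring (-35, -10) hexamer pair, or None."""
    best = None
    for i in range(len(seq) - 5):
        box35 = seq[i:i + 6]
        for spacer in spacers:
            j = i + 6 + spacer          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                continue
            box10 = seq[j:j + 6]
            score = matches(box35, MINUS35) + matches(box10, MINUS10)
            if best is None or score > best["score"]:
                best = {"score": score, "box35": box35, "pos35": i,
                        "box10": box10, "pos10": j, "spacer": spacer}
    return best

# mp4 sequence as listed in Table 2a
mp4 = "TTTGAAATTGACATCCGTTGCTGGTTGTGTCAGAATGGTTGT"
hit = best_promoter(mp4)
print(hit)
```

For mp4 the scan picks a perfect TTGACA followed, 17 bp later, by CAGAAT (4 of 6 matches to TATAAT). A simple match count treats all positions as equally important; a position weight matrix would be the usual refinement.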
Fig. 2. Determination of transcriptional start sites of promoter constructs. (A), (B) and (C) are primer extension reactions with total RNA from E. coli containing constructs pMP1GFP, pMP7GFP and pMG24GFP as templates, respectively; (D) and (E) are primer extension analyses with total RNA from M. pneumoniae as template; and (F) is primer extension analysis with total RNA from M. genitalium as template. Primer extension reactions were performed with end-labeled primers 1RT2 and 7RT for promoter constructs pMP1GFP ((A) and (D)) and pMP7GFP ((B) and (E)), respectively. Primer extension analysis for construct pMG24GFP was performed with end-labeled primers GFP1 (with E. coli RNA as template, (C)) and 24RT (with M. genitalium RNA as template, (F)). DNA sequencing was performed with pMP1GFP ((A) and (D)), pMP7GFP ((B) and (E)), pMG24GFP (C) and PRO24 (F) as templates. The reaction products were analyzed on 8% urea gels. Sequencing reactions performed with the same primer were run alongside the primer extension reactions. G, A, T and C represent the four nucleotides, and P marks the primer extension reactions. Double arrows indicate corresponding primer extension reactions with E. coli and mycoplasmal RNAs.
The sequences upstream of the start sites found in mycoplasmas were also compared to determine whether a consensus sequence existed. Since only three of our promoter clones exhibited transcriptional starts with mycoplasmal RNA, we included transcriptional starts for the genes p1 (Inamine et al., 1988), rrnA (Hyman et al., 1988) and crl of M. pneumoniae (Krause et al., 1997) for comparative purposes. Alignment of the sequences revealed no apparent consensus for the −35 region, although a weak consensus was seen for the −10 region (Table 2b). However, promoter mp1, identified in this study, and promoter crl (Krause et al., 1997) possess sequences similar to the −35 region.
3.4. Location of the insert DNA sequences of the E. coli-identified promoter clones in the genomes of M. pneumoniae and M. genitalium
To determine whether the DNA inserts from the putative mycoplasmal promoters were located upstream
of potential mycoplasmal ORFs, the sequences of the DNA inserts of all promoter clones were compared with NCBI databases. Blast searches of the sequences indicated that the DNA inserts were derived from different regions of the M. pneumoniae and M. genitalium chromosomes, as shown in Table 3. The insert DNA fragments of promoter clones pMP1GFP and pMP8GFP were composed of two unrelated M. pneumoniae DNA fragments. A detailed description of the insert DNA fragments with appropriate coding regions, along with the position of the transcriptional start sites in the inserts, is provided in Fig. 3.
Fig. 3. Description of ORFs surrounding the insert DNA fragments of the promoter clones. Open boxes represent regions of the insert in the promoter clones. Stippled boxes represent DNA not in the inserts but adjacent to the insert DNA in the genome. Hatched boxes represent the position of the different primers. Arrows inside the boxes indicate the direction of transcription of ORFs. Open arrows above the boxes indicate mycoplasmal RNA-based transcriptional starts. Closed arrows indicate the E. coli RNA-based transcriptional starts. Partially open and partially closed arrow indicates that both E. coli as well as mycoplasma RNA-based transcriptional starts are at the same position. Open balloons indicate the predicted translational starts of the genes, namely: (1) ORF1140 of M. pneumoniae; (2) ORF341 encoding phenylalanyl-tRNA synthetase alpha beta chain (pheT); (3) ORF showing homology with MG105 of M. pneumoniae (MG105H); (4) ORF MG143 of M. genitalium; and (5) ORF MG114 encoding putative phosphatidyl glycerophosphate synthase (pgsA). Double balloons in pMP3GFP indicate the translation start of M. pneumoniae repeat region REPMP2/3. Open boxes indicate translational stops of: (a) ORF 175 of M. pneumoniae; (b) ORF805 encoding phenylalanyl-tRNA synthetase beta chain of M. pneumoniae; (c) virulence-associated protein homologue (vacB) of M. pneumoniae; (d) ORF MG142 of M. genitalium encoding phosphatidyl glycerophosphate synthase. gtaB, M. pneumoniae gene encoding UDP-glucose pyrophosphorylase; asnS, MG113 encoding asparaginyl pyrophosphorylase tRNA synthetase. 1RT2, 3RT, 4RT, 7RT, 8RT, 19RT, 24RT, 19FD and 24FD indicate the positions of different primers used in this study.
Interestingly, the transcriptional start of the promoter construct pMP1GFP is identical in both E. coli and M. pneumoniae and mapped one base before the predicted translational start of the M. pneumoniae ORF1140. ORF1140 encodes a hypothetical 120-kDa protein. This is an unexpected observation but not without precedent since some prokaryotes, including Streptomycetes sp. and Mycobacterium, have transcriptional and translational starts close to each other (Strohl, 1992; Dhandayuthapani et al., 1997). E. coli and M. pneumoniae RNA-based transcriptional starts for the promoter construct pMP7GFP mapped 100 bp downstream and 45 bp upstream, respectively, of the start of potential ORF202. The E. coli RNA-based transcriptional signal of the promoter construct pMP4GFP was upstream of the ORF enclosing pheS. However, the transcriptional start sites of constructs pMP3GFP and pMP8GFP failed to map in front of any ORFs. Instead, these start sites
exhibited a direction of transcription opposite to the proposed direction of translation of the respective ORFs 175 and 197 of M. pneumoniae. The transcriptional starts of M. genitalium promoter constructs pMG19GFP and pMG24GFP, using E. coli RNA as template, were located in the immediate upstream region of potential ORFs, MG143 and MG114, respectively. The M. genitalium RNA-based transcriptional start of construct pMG24GFP was located in the upstream region of ORF MG114 but at a distance of 131 bp upstream of the E. coli RNA-based transcription site.
4. Discussion
E. coli has been the host of choice for the cloning and expression of genes from many species of prokaryotes and eukaryotes. E. coli is also promiscuous in its ability to recognize transcription and translation signals from a wide variety of microorganisms (Moran et al., 1982). However, in this study only a limited number of GFP-expressing putative promoter clones were isolated from DNA libraries of M. pneumoniae and M. genitalium constructed in the promoter probe vector pGFPUV2 and transformed into E. coli. The identification of a relatively small number of mycoplasmal promoter clones was somewhat surprising since microorganisms, such as Acholeplasma oculi (Knudtson and Minion, 1994) and Streptococcus pneumoniae (Chen and Morrison, 1987) displayed strong promoter activity in E. coli, which was attributed to their A+T-rich DNA. Mycoplasma pneumoniae and M. genitalium have comparable A+T-rich DNA. One possible explanation for this disparity could be our use of the AluI restriction enzyme, as opposed to Sau3AI used in the other studies, to construct genomic DNA libraries from the Mycoplasma species. AluI might have destroyed the continuity of the mycoplasmal DNA fragments which E. coli often recognizes as promoters. Further, the possibility exists that the use of different reporter systems and selection markers might lead to the identification of distinct sets of promoters (Bannantine et al., 1997).

Table 3
Location of DNA inserts from putative promoter clones in M. pneumoniae and M. genitalium genomes

Clone      Insert size (bp)   Genome location                                                    Genbank/EMBL accession
pMP1GFP    960 (409+551)      Bases 569647-570056 of M. pneumoniae genome (section 45 of 63)     AE000045/U00089
                              Bases 12596-125407 of M. pneumoniae genome (section 11 of 63)      AE000011/U00089
pMP3GFP    592                Bases 15932-16523 of M. pneumoniae genome (section 2 of 63)        AE000002/U00089
pMP4GFP    696                Bases 69813-70506 of M. pneumoniae genome (section 6 of 63)        AE000006/U00089
pMP7GFP    454                Bases 726198-726651 of M. pneumoniae genome (section 57 of 63)     AE000057/U00089
pMP8GFP    525 (208+317)      Bases 232472-232265 of M. pneumoniae genome (section 18 of 63)     AE000018/U00089
                              Bases 444591-444275 of M. pneumoniae genome (section 35 of 63)     AE000035/U00089
pMG19GFP   560                Bases 182302-182961 of M. genitalium genome (section 16 of 56)     L43967/U39694
pMG24GFP   488                Bases 140849-141336 of M. genitalium genome (section 12 of 56)     L43967/U39690
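The insert sizes in Table 3 follow from the genome coordinates as inclusive spans, with two-segment inserts summed. The Python sketch below recomputes sizes for a few rows exactly as they appear in this text; the helper names span_length and check are hypothetical. Where a computed size differs from the reported one (as for pMP4GFP here), the sketch only flags the difference and makes no claim about which number is correct.

```python
def span_length(start, end):
    """Inclusive length of a coordinate range, whichever end is listed first."""
    return abs(end - start) + 1

# (start, end) pairs and reported insert sizes as they appear in Table 3.
table3 = {
    "pMP3GFP":  ([(15932, 16523)], 592),
    "pMP4GFP":  ([(69813, 70506)], 696),
    "pMP7GFP":  ([(726198, 726651)], 454),
    "pMP8GFP":  ([(232472, 232265), (444591, 444275)], 525),
    "pMG24GFP": ([(140849, 141336)], 488),
}

def check(table):
    """Return {clone: computed size minus reported size} in bp."""
    return {
        clone: sum(span_length(s, e) for s, e in segments) - reported
        for clone, (segments, reported) in table.items()
    }

differences = check(table3)
for clone, diff in differences.items():
    print(f"{clone}: computed minus reported = {diff:+d} bp")
```

Taking the absolute difference lets the same function handle ranges listed high-to-low, as for pMP8GFP, which presumably reflects the orientation of those fragments in the genome.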
In this study we detected three categories of mycoplasmal DNA elements with promoter-like activity. The first category is represented by DNA fragments encoding RNA with a similar 5′ end in both E. coli and mycoplasmas (i.e. mp1). The existence of this promoter extends the notion that E. coli RNA polymerase can recognize transcriptional signals from other bacterial species (Moran et al., 1982). A second category includes promoters mp7 and mg24 encoding RNAs with different 5′ ends depending on the source bacterium. In the case of promoter mp7, it appears that the M. pneumoniae RNA-based transcriptional start site lies upstream of the predicted translational start of ORF202 of M. pneumoniae, while the E. coli RNA-based transcriptional start site lies downstream of ORF202 (Fig. 3). The situation with promoter mg24 differs from mp7 in that both E. coli and M. genitalium RNA-based transcriptional starts exist in the upstream region but at different locations of the ORF encoding the M. genitalium pgsA gene. Whether the E. coli RNA-based transcriptional start for this promoter represents an additional promoter for the pgsA gene not expressed under specific growth conditions, or represents a fortuitous sequence recognized by E. coli RNA polymerase, is unresolved. Similarly, the complete absence of mycoplasmal RNA transcripts corresponding to the E. coli RNA from putative promoters mp3, mp4, mp8 and mg19 suggests a third category of promoters. In this case, either the mycoplasmal RNA transcripts are not expressed in mycoplasmas under the culture conditions used in this study or these putative signals represent fortuitous sequences which elicit artifactual promoter-like activity in E. coli. We favor the former possibility because the E. coli RNA-based transcriptional starts of the promoters mp4 and mg19 lie in front of the potential ORFs in M. pneumoniae and M. genitalium, respectively. 
The polarity of transcription from mp3 and mp8 is opposite to the direction of translation of the surrounding ORFs. It is possible that the E. coli RNA-based transcriptional starts for these promoters represent antisense promoters. In Salmonella typhimurium, one of the promoter clones identified using GFP as the reporter system was an antisense promoter downregulating the expression of the yisH gene (Valdivia and Falkow, 1997).
Although −35 and −10 consensus sequence regions could be determined for specific promoters based on the transcriptional starts with E. coli RNA, we were unable to identify a consensus −35 region for promoters based on mycoplasmal RNA. Similarly, promoters of the M. hyopneumoniae rrnA operon (Taschke et al., 1987), the M. capricolum rrnA operon (Taschke and Herrmann, 1988), the M. gallisepticum pMGA gene (Markham et al., 1994) and several promoters of unidentified genes in A. laidlawii (Jarhede et al., 1995) have also been reported to show a divergent −35 region. The divergent −35 region in mycoplasmal RNA-based promoters resembles that of promoters from mycobacteria (Bashyam et al., 1996) and streptomycetes (Strohl, 1992), in which the alignment of several promoters revealed no specific consensus at the −35 region. When streptomycete promoters were tested in E. coli, only 20% exhibited promoter activity. The importance of −35 regions for the expression of promoters in E. coli may also explain the low frequency of promoter clones detected in the M. pneumoniae and M. genitalium DNA libraries. Whether the divergent nature of the −35 regions in mycoplasmal promoters indicates that the sigma factor of mycoplasmas has more tolerance for −35 regions deserves further investigation.
An additional interesting observation is that the transcriptional start of M. pneumoniae ORF1140 lies one nucleotide upstream of its translational start. This close proximity of transcriptional and translational start sites has not been previously reported in mycoplasmas. However, several streptomycete RNAs (Strohl, 1992) have been found to have transcriptional and translational starts at the same position.
In addition, the polA promoter of Streptococcus pneumoniae (Lopez et al., 1989), the phage lambda cI gene (Ptashne et al., 1993), and the oxyR gene of Mycobacterium leprae (Dhandayuthapani et al., 1997) were also reported to have similar arrangements. An explanation offered for this type of transcriptional/translational overlap is that the ribosomes in these microorganisms may not require extended complementarity between the mRNA template and the 3′ end of 16S rRNA for translation (Strohl, 1992).
In conclusion, this study has demonstrated that DNA fragments of Mycoplasma spp. can drive gfp expression in E. coli. The finding that one of the mycoplasmal DNA elements with promoter-like activity in E. coli directs the synthesis of an mRNA transcript with a similar 5′ end in mycoplasmas indicates that E. coli can identify promoter regions of mycoplasmas, albeit with low efficiency. The divergent nature of the −35 region of mycoplasmal promoters may indicate that the mycoplasmal sigma factor exhibits functional specificities different from those of the E. coli sigma factor, and that this regulatory difference may influence the overall responsiveness of mycoplasmas to a variety of environmental stresses, including colonization and persistence. Currently, we are using vector pGFPUV2 to study the response of mycoplasmal promoters to environmental cues in E. coli, which is expected to provide evidence concerning the function of specific mycoplasmal genes in vivo.
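The comparison of −35 and −10 regions discussed above can be illustrated with a minimal sketch. It is not the procedure used in this study; it simply assumes the canonical E. coli σ70 consensus hexamers (TTGACA and TATAAT), a spacer of 16 to 18 bp between them, and a gap of 5 to 9 bp between the −10 hexamer and the mapped transcriptional start. The function names, the window sizes and the example sequence are all hypothetical.

```python
# Illustrative sketch only. The consensus hexamers are the canonical
# E. coli sigma-70 values; window sizes and the test sequence are assumptions.

MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def matches(hexamer, consensus):
    """Count positions at which a hexamer agrees with a consensus hexamer."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def best_promoter(upstream, gap_10=(5, 9), spacers=(16, 17, 18)):
    """Find the best-scoring -35/-10 hexamer pair upstream of a start site.

    `upstream` is the non-template strand, 5'->3', ending at position -1
    (the base immediately before the mapped transcriptional start).
    Returns (score_35, score_10, hex_35, hex_10, spacer), or None if the
    sequence is too short for any candidate.
    """
    seq = upstream.upper()
    n = len(seq)
    best = None
    for gap in range(gap_10[0], gap_10[1] + 1):
        end_10 = n - gap           # index one past the -10 hexamer
        start_10 = end_10 - 6
        if start_10 < 0:
            continue
        hex_10 = seq[start_10:end_10]
        for spacer in spacers:
            start_35 = start_10 - spacer - 6
            if start_35 < 0:
                continue
            hex_35 = seq[start_35:start_35 + 6]
            cand = (matches(hex_35, MINUS_35), matches(hex_10, MINUS_10),
                    hex_35, hex_10, spacer)
            # Keep the first candidate with the highest combined score.
            if best is None or cand[0] + cand[1] > best[0] + best[1]:
                best = cand
    return best


if __name__ == "__main__":
    # Synthetic sequence with perfect hexamers, a 17 bp spacer and a 7 bp gap.
    example = "GG" + "TTGACA" + "GC" * 8 + "G" + "TATAAT" + "GC" * 3 + "G"
    print(best_promoter(example))
```

A low −35 score next to a high −10 score for a start site mapped with mycoplasmal RNA would be in line with the divergent −35 regions described above, although a match count of this kind says nothing about actual promoter strength.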
Acknowledgement
We are grateful to Dr William Haldenwang and Dr David Kolodrubetz for their critical reading of the manuscript. We also thank Dr R. Herrmann for identifying the location of the putative promoter sequences in the M. pneumoniae genome prior to the publication of the M. pneumoniae genome sequences. This work was supported in part by US Public Health Service grant AI 41010 from the National Institute of Allergy and Infectious Diseases, National Institutes of Health.
References
Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A., Struhl, K., 1989. Current Protocols in Molecular Biology. John Wiley, New York.
Bannantine, J.P., Barletta, R.G., Theon, C.A., Andrew Jr., R.E., 1997. Identification of Mycobacterium paratuberculosis gene expression signals. Microbiology 143, 921–928.
Baseman, J.B., 1993. The cytadhesins of Mycoplasma pneumoniae and Mycoplasma genitalium. In: Rottem, S., Kahane, I. (Eds.), Subcellular Biochemistry: Mycoplasma Cell Membranes. Plenum Press, New York, pp. 243–259.
Baseman, J.B., Tully, J.G., 1997. Mycoplasmas: sophisticated, reemerging, and burdened by their notoriety. Emerging Inf. Dis. 3, 21–32.
Baseman, J.B., Lange, M., Criscimagna, N.L., Giron, J.A., Thomas, C.A., 1995. Interplay between mycoplasmas and host target cells. Microb. Pathog. 19, 105–116.
Baseman, J.B., Reddy, S.P., Dallo, S.F., 1996. Interplay between mycoplasma surface proteins, airway cells and the protean manifestations of mycoplasma-mediated human infections. Am. J. Respir. Crit. Care Med. 154, S137–S144.
Bashyam, M.D., Kaushal, D., Dasgupta, S.K., Tyagi, A.K., 1996. A study of the mycobacterial transcriptional apparatus: identification of novel features in promoter elements. J. Bacteriol. 178, 4847–4853.
Chalfie, M., Tu, Y., Euskirchen, G., Ward, W.W., Prasher, D.C., 1994. Green fluorescent protein as a marker for gene expression. Science 263, 802–805.
Chen, J., Morrison, D., 1987. Cloning Streptococcus pneumoniae DNA fragment requires vectors protected by strong transcriptional terminators. Gene 55, 179–187.
Dallo, S.F., Chavoya, A., Su, C.-J., Baseman, J.B., 1989. DNA and protein homologies between the adhesins of Mycoplasma pneumoniae and Mycoplasma genitalium. Infect. Immun. 57, 1059–1065.
Dhandayuthapani, S., Via, L., Thomas, C.A., Horowitz, P., Deretic, D., Deretic, V., 1995. Green fluorescent protein as marker for gene expression and cell biology of mycobacterial interactions with macrophages. Mol. Microbiol. 17, 901–912.
Dhandayuthapani, S., Mudd, M., Deretic, V., 1997. Interactions of OxyR with the promoter region of the oxyR and ahpC genes from Mycobacterium leprae and Mycobacterium tuberculosis. J. Bacteriol. 179, 2401–2409.
Fraser, C.M., Gocayne, J.D., White, O., Adams, M.D., Clayton, R.A., Fleischmann, R.D., Bult, C.J., Kerlavage, A.R., Sutton, G., Kelley, J.M., Fritchman, J.L., Weidman, J.F., Small, K.V., Sandusky, M., Fuhrmann, J., Nguyen, D., Utterback, T.R., Saudek, D.M., Philips, C.A., Merrick, J.M., Tomb, J.-F., Dougherty, B.A., Bott, K.F., Hu, P.-C., Lucier, T.S., Peterson, S.N., Smith, H.O., Hutchison III, C.A., Venter, J.C., 1996. The minimal gene complement of Mycoplasma genitalium. Science 270, 397–403.
Guiney, D., Fang, F., Krause, M., Libby, S., Buchmeier, N., Fierer, J., 1995. Biology and clinical significance of virulence plasmids in Salmonella serovars. Clin. Infect. Dis. 21, S146–S151.
Haldenwang, W.G., 1995. The sigma factors of Bacillus subtilis. Microbiol. Rev. 59, 1–30.
Himmelreich, R., Hilbert, H., Plagens, H., Pirkl, E., Li, B., Herrmann, R., 1996. Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucl. Acids Res. 24, 4420–4449.
Horner, P.J., Gilroy, C.B., Thomas, B.J., Olof, R., Naidoo, M., Taylor-Robinson, D., 1993. Association of Mycoplasma genitalium with acute non-gonococcal urethritis. Lancet 342, 582–585.
Hyman, C.H., Gafny, R., Glaser, G., Razin, S., 1988. Promoter of the Mycoplasma pneumoniae rRNA operon. J. Bacteriol. 170, 3262–3268.
Inamine, J.M., Loechel, S., Hu, P.-C., 1988. Analysis of the nucleotide sequence of the P1 operon of Mycoplasma pneumoniae. Gene 73, 175–183.
Jarhede, K.T., Le Hanaff, M., Wieslander, A., 1995. Expression of foreign genes and selection of promoter sequences in Acholeplasma laidlawii. Microbiology 141, 2071–2079.
Knudtson, K., Minion, F.S., 1993. Construction of Tn4001 lac derivatives to be used as promoter probe vectors in mycoplasmas. Gene 137, 217–222.
Knudtson, K., Minion, F.S., 1994. Use of lac gene fusions in the analysis of Acholeplasma upstream gene regulatory sequences. J. Bacteriol. 176, 2763–2766.
Krause, D.C., Taylor-Robinson, D., 1992. Mycoplasmas which infect humans. In: Maniloff, J., McElhaney, R.N., Finch, L.R., Baseman, J.B. (Eds.), Mycoplasmas: Molecular Biology and Pathogenesis. American Society for Microbiology, Washington, D.C., pp. 417–444.
Krause, D.C., Proft, T., Hedreyda, C.T., Hilbert, H., Plagens, H., Herrmann, R., 1997. Transposon mutagenesis reinforces the correlation between Mycoplasma pneumoniae cytoskeletal protein HMW2 and cytadherence. J. Bacteriol. 179, 2668–2677.
Lo, S.-C., 1992. Mycoplasmas and AIDS. In: Maniloff, J., McElhaney, R.N., Finch, L.R., Baseman, J.B. (Eds.), Mycoplasmas: Molecular Biology and Pathogenesis. American Society for Microbiology, Washington, D.C., pp. 525–545.
Lopez, P., Martinez, S., Diaz, A., Espinoza, M., Lacks, S.A., 1989. Characterization of the polA gene of Streptococcus pneumoniae and comparison of the DNA polymerase I it encodes to homologous enzymes from Escherichia coli and phage T7. J. Biol. Chem. 264, 4255–4263.
Markham, P.F., Glew, M.D., Sykes, J.E., Bowden, T.R., Pollocks, T.D., Browning, G.F., Whithear, K.G., Walker, I.D., 1994. The organisation of the multigene family which encodes the major cell surface protein, pMGA, of Mycoplasma gallisepticum. FEBS Lett. 352, 347–352.
Moran Jr., C.P., Lang, N., LeGrice, S.F.J., Lee, G., Stephens, M., Sonenshein, A.L., Pero, J., Losick, R., 1982. Nucleotide sequences that signal initiation of transcription and translation in Bacillus subtilis. Mol. Gen. Genet. 186, 339–346.
Ptashne, M., Backman, K., Humayun, M.Z., Jeffrey, A., Maurer, R., Meyer, B., Sauer, R.T., 1993. Autoregulation and function of a repressor in bacteriophage lambda. Science 194, 156–161.
Rouch, D.A., Byrne, M.E., Kong, Y.C., Skurray, R.A., 1987. The aacA-aphD gentamicin and kanamycin resistance determinant of Tn4001 from Staphylococcus aureus: expression and nucleotide sequence analysis. J. Gen. Microbiol. 133, 3039–3052.
Schurr, M.J., Yu, H., Boucher, J.C., Hibler, N.S., Deretic, V., 1995. Multiple promoters and induction by heat shock of the gene encoding the alternative sigma factor AlgU (sigE) which controls mucoidy in cystic fibrosis isolates of Pseudomonas aeruginosa. J. Bacteriol. 177, 5670–5679.
Strohl, W.R., 1992. Compilation and analysis of DNA sequences associated with apparent streptomycete promoters. Nucl. Acids Res. 20, 961–974.
Taschke, C., Herrmann, R., 1988. Analysis of transcription and processing signals in the 5′ regions of the two Mycoplasma capricolum rRNA operons. Mol. Gen. Genet. 212, 522–530.
Taschke, C., Klinkert, M., Pirkl, E., Herrmann, R., 1987. Gene expression signals in Mycoplasma hyopneumoniae and Mycoplasma capricolum. Isr. J. Med. Sci. 23, 347–351.
Tryon, V.V., Baseman, J.B., 1992. Pathogenic determinants and mechanisms. In: Maniloff, J., McElhaney, R.N., Finch, L.R., Baseman, J.B. (Eds.), Mycoplasmas: Molecular Biology and Pathogenesis. American Society for Microbiology, Washington, D.C., pp. 457–471.
Valdivia, R.H., Falkow, S., 1996. Bacterial genetics by flow cytometry: rapid isolation of Salmonella typhimurium acid-inducible promoters by differential fluorescence induction. Mol. Microbiol. 22, 367–378.
Valdivia, R.H., Falkow, S., 1997. Fluorescence-based isolation of bacterial genes expressed within host cells. Science 277, 2007–2011.