Gene fusion techniques cloning vectors for manipulating lacZ gene fusions


1984, Gene Anal Techn 1:43-51

Gene Fusion Techniques

Cloning Vectors for Manipulating lacZ Gene Fusions

MICHAEL L. BERMAN, DOLORES E. JACKSON, AUDREE FOWLER, IRVING ZABIN, LASSE CHRISTENSEN, NIELS P. FIIL, and MICHAEL N. HALL

A simple vector system for cloning gene fusions of lacZ is described. We apply one of these new vectors to the cloning and transcriptional analysis of the promoter region of the ompF gene of Escherichia coli.

Gene fusions have been widely used to study gene function in prokaryotes. Rapid in vivo techniques have been developed for the construction of gene fusions between any target gene and the structural genes of the lac operon in Escherichia coli [1-3]. Gene fusions with lacZ can be of two types: 1) operon fusions, in which lacZ with its accompanying ribosome-binding site and AUG initiation codon is placed under the transcriptional regulation of a target gene, and 2) protein fusions, in which the amino terminus of a hybrid lacZ monomer is encoded by the target gene. In protein fusions, the ribosome-binding site and the AUG codon are provided by the target gene; as a result, the production of β-galactosidase is subject to the translational as well as the transcriptional regulation of the target gene. The construction of both operon and protein fusions to lacZ in E. coli has been used to study various biological problems [1, 4].

One potential limitation of existing in vivo gene fusion techniques is the difficulty of analyzing the DNA sequence of the fusions. This problem has been approached by isolating transducing phages carrying alleles of interest and then analyzing these phages [5]. We now report a simple, rapid technique for cloning lacZ gene fusions on a small, high-copy-number plasmid that permits detailed biochemical analysis, including DNA sequence analysis. Our approach includes a technique for locating defined DNA fragments that carry the promoter-regulatory region of the target gene. We tested our approach using the ompF gene of E. coli.

The ompF gene is the structural gene for a 36,500-Mr outer membrane protein. This protein, as well as other outer membrane proteins in E. coli, was studied previously using the techniques of gene fusion [6]. Hall and Silhavy [7] constructed gene fusions in vivo between lacZ and the loci involved in the production of the two major outer membrane porin proteins, ompF and ompC. The expression of both ompF and ompC is regulated through a complex of genes at the ompB locus [7]. To better understand the molecular details of porin regulation, we devised a plasmid system to clone portions of these genes as fusions to lacZ. In this report we apply this vector to the cloning and transcriptional analysis of the promoter region of ompF.

From the Laboratory of Genetics and Recombinant DNA, LBI-Basic Research Program, NCI-Frederick Cancer Research Facility, Frederick, Maryland (M.L.B.; D.E.J.); the Department of Biological Chemistry, UCLA School of Medicine and Molecular Biology Institute, University of California, Los Angeles, California (A.F.; I.Z.); the Novo Institute, Novo Alle, DK-2880 Bagsvaerd, Denmark (L.C.; N.P.F.); and the Department of Biochemistry and Biophysics, University of California, San Francisco, California (M.N.H.). Address reprint requests to: Michael L. Berman, Laboratory of Genetics and Recombinant DNA, LBI-Basic Research Program, NCI-Frederick Cancer Research Facility, Frederick, MD 21701. Received December 12, 1983.

Materials and Methods

Plasmid Isolation and DNA Sequence Analysis

Three cloning vectors were constructed starting from the plasmid pBR322 [8, 9]. The experimental design for using these vectors to clone lac fusions is described in the Results section. The source of lac operon DNA is the plasmid pMC871 [10]. The lac operon fragments incorporated in these vectors begin at unique sites within lacZ and terminate at the AvaI site at codon 70 of lacY [11]. The AvaI site in lacY is joined to the AvaI site at position 1425 of pBR322 such that the lacZ sequences replace the tet gene sequences and are oriented clockwise in the standard pBR322 map. The structure of these vectors is depicted in Figure 1.

Plasmid pMLB524 is a pBR322 derivative in which the EcoRI-AvaI (1427 base pairs [bp]) fragment is replaced with a 315-bp EcoRI-AvaI fragment from the lac operon. The plasmid will accept EcoRI fragments from various lacZ gene fusions to reform an intact structural gene specifying a LacZ+ phenotype.

Plasmid pMLB1060 is a subclone of the 1384-bp SstI-AvaI fragment from lac cloned into the pBR322 derivative pPCVI; pPCVI carries a synthetic SstI oligonucleotide linker at the ClaI site (position 23) of pBR322 (H. de Boer, unpublished). This 1383-bp lac fragment replaces the 1398-bp SstI-AvaI fragment from the tet gene. Plasmid pMLB1060 is lacZ- and will receive SstI fragments from other cloned lacZ genes to reform a lacZ+ plasmid.

In the third construction, the 2496-bp ClaI-AvaI fragment from lac was cloned into the ClaI-AvaI sites of pBR322. In this vector, the ClaI-AvaI fragment from lac replaces a 1402-bp fragment from the tet gene. The resulting plasmid, pMLB1097, is missing sequences up to the ClaI site at codon 279 of lacZ. This plasmid will receive ClaI fragments from cloned lacZ genes to reform a lacZ+ plasmid.

[Figure 1: linear maps of pMLB524 (EcoRI at 1 and 3251/1), pMLB1060 (EcoRI at 1 and 4348/1), and pMLB1097 (EcoRI at 1 and 5457/1), showing ori, bla, 'lacZ, and lacY'. Junctions in lacZ: pMLB524, EcoRI, GAA TTC (Glu 1006, Phe 1007); pMLB1060, SstI, GAG CTC (Glu 650, Leu 651); pMLB1097, ClaI, ATC GAT (Ile 279, Asp 280). Junction in lacY for all three vectors: AvaI, AAA CTC GGG (Lys 69, Leu 70, Gly 71). Sizes marked on the lac segments: 309 bp, 1377 bp, and 2490 bp, respectively.]

Figure 1. Fusion cloning vectors pMLB524, pMLB1060, and pMLB1097. These vectors are pBR322 derivatives [8]. The tet determinant of pBR322 has been replaced with different fragments from the lac operon (heavy line) (see Materials and Methods). The numbers refer to the amino acid residue in β-galactosidase (lacZ) [20] and lactose permease (lacY) [11]. The primes indicate that genetic material from the gene is deleted or replaced in the direction of the prime; ori is the pBR322 origin, and the extent of the β-lactamase gene (bla) is indicated.

The methods used for DNA isolation, transformations, and DNA sequence analysis according to Maxam and Gilbert [12] have been described previously [5]. Enzymes were obtained from commercial sources and used according to manufacturers' recommendations. The properties of the various ompF fusions have been described previously [7]. The bacterial strain used for plasmid transformations is MC1000 [13], into which the ompB101 allele was introduced [7].

Prior experience has shown that protein fusions between lacZ and genes whose products are normally localized to the cellular membranes (e.g., lamB or ompF) cannot be produced in large amounts without affecting cell viability [14]. Since the ompB locus codes for a positive activator of the ompF gene, all cloning experiments were conducted in a strain carrying an ompB mutation. This mutation depresses the expression of ompF, thus avoiding any potential harmful effects caused by overproduction of the cloned ompF-lacZ hybrid protein.
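As a quick consistency check on the three vectors, the sketch below counts how many lacZ codons lie distal to each junction codon. It is only an illustration and not part of the original methods: the junction codons are those shown in Figure 1, whereas the value 1023 for the last lacZ codon is an assumption based on the numbering of the published sequence [20] and is not stated in this section. The names `JUNCTION_CODON`, `LAST_LACZ_CODON`, and `codons_distal_to_site` are hypothetical.

```python
# Junction codons in lacZ for each vector, as shown in Figure 1.
JUNCTION_CODON = {
    "pMLB524": ("EcoRI", 1006),
    "pMLB1060": ("SstI", 650),
    "pMLB1097": ("ClaI", 279),
}

# ASSUMPTION: last sense codon of lacZ in the numbering of reference [20];
# this value does not appear in the Materials and Methods text.
LAST_LACZ_CODON = 1023


def codons_distal_to_site(vector: str) -> int:
    """Return the number of complete lacZ codons that follow the junction codon."""
    _enzyme, codon = JUNCTION_CODON[vector]
    return LAST_LACZ_CODON - codon


for name, (enzyme, codon) in JUNCTION_CODON.items():
    print(f"{name}: {enzyme} site in codon {codon}, "
          f"{codons_distal_to_site(name)} lacZ codons follow")
```

Under that assumption, pMLB524 retains 17 codons after the EcoRI junction, which agrees with the 17 carboxy-terminal amino acids attributed to this vector in the Results.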

Figure 2. The line represents a region of DNA including the structural gene X (heavy line). Arrows (sites 1, 2, and 3) divide the region into two DNA fragments, A and B.

S1 Mapping

The procedure used for the S1 nuclease mapping was a modification of the method of Berk and Sharp [15], as described by Gryczan et al. [16]. The DNA probe was obtained from the plasmid pLaC6-4, which is a subclone of the promoter region of pMH621. In pLaC6-4, the EcoRI site of pBR322 has been moved by Bal31 nuclease treatment to a position 390 bp upstream of the BglII site in ompF. This plasmid was labeled at the BglII site using polynucleotide kinase and [γ-32P]ATP. The plasmid was then cut with EcoRI, and the 390-bp fragment, now labeled at the 5' end of the DNA strand complementary to ompF mRNA, was isolated. RNA was purified from a strain carrying the plasmid pLaC6, which is a subclone of the entire EcoRI-BglII fragment (1100 bp) from pMH621, inserted (using a synthetic HindIII linker) between the EcoRI and HindIII sites of pBR322. The products of the S1 nuclease digestion were displayed on a 12% urea-polyacrylamide gel next to the 390-bp probe treated with DNA sequence reagents.

Results

Experimental Design

The ease of DNA sequence analysis, biochemical studies, and recombinant DNA manipulations is increased by cloning genes on small high-copy-number plasmid vectors. A typical approach to cloning any target DNA sequence is to choose restriction enzymes that cleave in the regions adjacent to, but not within, the gene (Figure 2, sites 1 and 3). However, in some cases analysis may be confined to a particular portion of a gene or operon. For instance, the analysis of regulatory mutants is confined to the beginning (5' end) of the target gene, and in this case careful construction of a recipient cloning vector can expedite subcloning sections of any gene. In the example presented in Figure 2, the cloning of the DNA segments A and B can be facilitated by constructing a recipient vector carrying either of these gene segments. Thus, clones of fragments A or B alone will be phenotypically X- since one segment of gene X is missing, but a vector carrying either fragment will express gene X activity when the second fragment is inserted in the proper orientation.

The analysis and cloning of lacZ gene fusions presents a perfect test of this approach. In the case of operon fusions, the entire lacZ structural gene is intact, whereas LacZ+ protein fusions (hybrid lacZ genes that specify active hybrid proteins) must carry the structural gene sequences of lacZ from at least codon 25 through the end of the gene [17, 18]. If we wish to confine our analysis to the initial portion of a gene fusion, we can choose any unique site within the lacZ structural gene and clone the region from this site through the end of lacZ into a recipient vector. Since this portion of lacZ itself encodes only a segment of the structural gene, this recipient vector would require the insertion of the beginning of a functional lacZ gene to specify β-galactosidase activity. Computer analysis [19] of the DNA sequence of lacZ [20] shows numerous unique restriction sites within the structural gene (Table 1). We subcloned portions of lacZ distal to three of these sites into pBR322 (Figure 1, and Materials and Methods). All of these vectors carry the fragments from the end of lacZ up to an AvaI site within the lacY structural gene (lacY is the gene immediately following lacZ, and its DNA sequence is known [11]). Since all three sites occur naturally in lacZ, any gene fusion can serve as a donor of a DNA fragment that will reform a lacZ+ clone. We can conveniently use the indicator dye 5-bromo-4-chloro-3-indolyl-β-D-galactoside (XG) to detect lacZ expression, since this substrate does not require lactose permease (a product of lacY) to enter the cell.

[Figure 3: three schematic maps, (a) to (c), showing the ompF promoter (P), ompF', 'lacZ, lacY, fusion joints A and B, the λ genes J, A, R, O, cI, and exo, numbered EcoRI sites 1 to 6, and the pMLB524 subclone.]

Figure 3. Cloning of ompF-lacZ fusion from strain MH621. (a) Diagram of the chromosomal arrangement in strain MH621. The boxed lines represent bacterial DNA. P is the promoter for ompF. B is the fusion joint between ompF and lacZ. Lac operon DNA is shown as a crosshatched box. The letters J, A, R, O, cI, and exo refer to λ phage genes. A is a possible site for prophage excision, as indicated by the long arrow. The location of this site is not fixed and will vary for any particular transducing phage. The numbered arrows below the map are EcoRI sites (see text). Maps are not drawn to scale. (b) Diagram of ompF-lacZ transducing phage. Symbols are the same as in (a). The sizes of the EcoRI fragments from left to right are: 21.3 kbp; 4.4 kbp; 0.86 kbp; 1.1 kbp; 7.5 kbp; 5.9 kbp; and 3.6 kbp. EcoRI sites 6, 5, and 4 correspond to the three sites in wild-type λ in exo (65.6%), in gene O (81.0%), and adjacent to gene S (93.1%) [21]. The 21.3-kbp EcoRI fragment includes the left end and the structural genes of λ; it terminates at the EcoRI site in lacZ. EcoRI sites 2 and 1 are within bacterial DNA adjacent to ompF. (c) Diagram of subclone of EcoRI fragment from the λ phage into pMLB524. Note that this plasmid is LacY-, whereas the phage is LacY+.

Table 1. Unique Restriction Enzyme Sites in lacZ

                                Location in lacZ
Enzyme      Site              Codon number(a)   Nucleotide(b)
AatII       GACGTC                  211              631
AccI        GT(A/C)(G/T)AC          925             2775
AvaII       GG(A/T)CC               518             1553
BclI        TGATCA                  453             1358
BssHII      GCGCGC                  504             1510
ClaI(c)     ATCGAT                  279              835
EcoRI(c)    GAATTC                 1006             3016
EcoRV       GATATC                  375             1123
HgiEII      ACC(N)6GGT               65              195
MstI        TGCGCA                   51              152
MstII       CCTNAGG                  79              235
NdeI        CATATG                  990             2968
NspCI       PuCATGPy                924             2771
SnaI        GTATAC                  925             2775
SstI(c)     GAGCTC                  650             1948
XhoII       PuGATCPy                918             2754

(a) The first nucleotide of the indicated recognition site occurs within this codon of the structural gene [20].
(b) The position of the first nucleotide of the indicated recognition site in the structural gene [20].
(c) Sites used in the construction of the vectors shown in Figure 1.
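The two location columns of Table 1 are linked by simple reading-frame arithmetic: nucleotide n of the structural gene lies in codon (n - 1) div 3 + 1. The short sketch below, offered only as an illustrative cross-check, applies that relation to every row of the table. The enzyme names are written in their standard spellings, and the function names `codon_containing` and `position_in_codon` are hypothetical helpers, not anything used by the authors.

```python
# (enzyme, codon number, nucleotide position) for each row of Table 1.
TABLE_1 = [
    ("AatII", 211, 631), ("AccI", 925, 2775), ("AvaII", 518, 1553),
    ("BclI", 453, 1358), ("BssHII", 504, 1510), ("ClaI", 279, 835),
    ("EcoRI", 1006, 3016), ("EcoRV", 375, 1123), ("HgiEII", 65, 195),
    ("MstI", 51, 152), ("MstII", 79, 235), ("NdeI", 990, 2968),
    ("NspCI", 924, 2771), ("SnaI", 925, 2775), ("SstI", 650, 1948),
    ("XhoII", 918, 2754),
]


def codon_containing(nucleotide: int) -> int:
    """Return the 1-based codon number that contains a 1-based nucleotide."""
    return (nucleotide - 1) // 3 + 1


def position_in_codon(nucleotide: int) -> int:
    """Return 1, 2, or 3: the position of the nucleotide within its codon."""
    return (nucleotide - 1) % 3 + 1


# Collect any rows where the two columns of Table 1 disagree.
mismatches = [
    (enzyme, codon, nt)
    for enzyme, codon, nt in TABLE_1
    if codon_containing(nt) != codon
]

for enzyme, codon, nt in TABLE_1:
    print(f"{enzyme:7s} nt {nt:4d} -> codon {codon_containing(nt):4d}, "
          f"base {position_in_codon(nt)} of the codon")
print("rows that disagree:", mismatches)
```

For the three sites used to build the vectors (ClaI, EcoRI, and SstI), the first nucleotide of the site is also the first base of its codon, which fits the in-frame junctions drawn in Figure 1.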

Cloning of ompF-lacZ Gene Fusions

To test the utility of the proposed cloning methodology, we attempted to subclone the initial segment of previously isolated ompF-lacZ fusions into pMLB524. First, we isolated a specialized transducing phage carrying gene fusions between ompF and lacZ. The in vivo construction of the ompF-lacZ gene fusions results in a resident λ prophage adjacent to the fusion [7] (Figure 3, line a). The excision event leading to formation of a specialized transducing phage from a strain carrying an ompF-lacZ fusion is shown in Figure 3, line a. We isolated plaque-forming phages from several strains, including two with ompF-lac operon fusions (MH513, MH514) and three with protein fusions (MH610, MH621, and MH622) [7].

In general, the in vivo gene fusion technique of Casadaban [1, 3] and the events of prophage induction result in the formation of two unique DNA-DNA fusion joints, designated A and B in Figure 3. The position of joint B varies from one gene fusion isolate to another and marks the junction between lacZ and, in our case, the target gene, ompF. Fusion joint A is the result of prophage excision and therefore will determine the extent of bacterial DNA adjacent to ompF and the extent of λ DNA from the red region carried by the specialized transducing phage. Joint A can be a junction between λ DNA and bacterial DNA or between bacterial DNA and bacterial DNA.

If we consider the location of EcoRI sites in λ DNA, we can predict that our transducing phages will be of three types. In the first type, no additional EcoRI sites would be introduced by bacterial DNA substitution. Such phages would yield a unique fragment from the naturally occurring EcoRI site in lacZ extending through the fusion joint, through the chromosomal sequences, and to the first EcoRI site in λ phage DNA. The size of this fragment in these phages would vary according to the location of fusion joints A and B (Figure 3). In the second and third types of transducing phages, additional EcoRI sites are located within the bacterial DNA sequences incorporated into the phage. Type 2 phages would carry the EcoRI sites adjacent to, but not within, the target gene (ompF in our case). In these phages, a unique EcoRI fragment would extend from the site within lacZ through the fusion joint and end in bacterial DNA. The size of this fragment from any type 2 transducing phage would depend only on the location of fusion joint B (Figure 3), i.e., on the amount of lacZ DNA and target gene DNA remaining after the fusion event. Type 3 phages would be similar to type 2, except they would carry an EcoRI site located within the target gene. In these phages, a unique EcoRI fragment would carry lacZ and the fusion joint B, but not the beginning or the promoter region of the target gene.

Restriction endonuclease digestions were used to establish a map of the several λ fusion transducing phages isolated from the fusion strains. An example of the digestion with EcoRI of the phage carrying the protein fusion from MH621 is shown in Figure 4 (left panel, lane A) and is represented in Figure 3, line b. From the reported DNA sequence of the end of lacZ [20], as well as the map of λplac5 [21], we expected a large constant-size EcoRI fragment from the left arm of our transducing phages. A 21.3-kbp EcoRI fragment, presumably including the entire left arm of λ, was seen in all our ompF-lacZ transducing phages. Three additional EcoRI fragments (7.4 kbp, 5.9 kbp, and 3.6 kbp) are conserved in all the ompF-lacZ transducing phages; they originate from the right arm of λ and are identical to fragments from wild-type λ DNA digested with EcoRI (results not shown). We expected any additional fragments to originate from sequences between the EcoRI site in lacZ and the EcoRI site in λ exo. One EcoRI fragment (850 bp) not present in wild-type λ is common to all the ompF-lacZ transducing phages and therefore must originate within the bacterial sequences adjacent to ompF. This leaves in our digest two additional non-λ-phage EcoRI fragments unique to the fusion transducing phages (Figure 4, left panel, lane A): one must span the bacterial DNA-λ DNA junction and the other must carry the ompF-lacZ fusion joint. The identity of the fragments could be established by a cloning experiment.

[Figure 4: two gel photographs, each with lanes A, B, and C. Fragment sizes marked: 4300 and 3251 bp (left panel); 1125 and 1060 bp (right panel).]

Figure 4. Left panel, agarose gel electrophoresis of EcoRI-digested DNA. This is a 0.8% agarose gel. Lane A is λpΦ621; lane B is pMH621; and lane C is pMLB524. The fragments marked with the symbol (*) are the unique (non-λ) bands from the ompF-lacZ fusion phage λpΦ621. The sizes of fragments indicated are in base pairs. Right panel, polyacrylamide gel electrophoresis of two ompF-lacZ clones. This is a 6% acrylamide gel. All DNAs are digested with the enzyme HpaII. Lane A is a Lac+ clone of the EcoRI fragment from λpΦ621 in pMLB524; lane B is a Lac+ clone of the EcoRI fragment from λpΦ622 in pMLB524; and lane C is a pBR322 control. The sizes of the fragments indicated are in base pairs.

We cloned the EcoRI-generated fragments from the ompF-lacZ transducing phages on the high-copy-number plasmid pMLB524. This plasmid takes advantage of the naturally occurring EcoRI site at amino acid position 1006 of lacZ (see Materials and Methods and Figure 1). Digestion of any of the three types of transducing phages with EcoRI produces unique fragments that start at the EcoRI site in lacZ and include the fusion joint and DNA sequences from at least a portion of the target gene. Such fragments from phage types 1 and 2 (no EcoRI site in the target gene) will carry the beginning of the target gene and the adjacent regulatory signals. All fragments alone encode only a truncated β-galactosidase monomer and therefore, if cloned, would not make the cell LacZ+. However, insertion of such fragments into the EcoRI site of pMLB524, which carries the DNA coding for the 17 carboxy-terminal amino acids of lacZ, reconstructs an entire functional lacZ gene; the resulting plasmid is therefore phenotypically LacZ+. In order to use pMLB524 to clone the intact gene fusion from a type 3 phage, a partial EcoRI digestion must be employed.

We cloned the ompF-lacZ gene fusions from the transducing phages onto pMLB524. These plasmids were selected as ampicillin-resistant, LacZ+ transformants using the indicator dye XG. In these experiments, equimolar amounts of EcoRI-digested phage DNA and plasmid DNA yielded about 50% Lac+ plasmids upon transformation of the Δ(lac) recipient strain. These plasmids were analyzed to identify the corresponding EcoRI fragment from the transducing phages. The results for λpΦ621 are shown in Figure 4, left panel. The EcoRI fragment from this phage carrying lacZ is 4.3 kbp in length and corresponds to an EcoRI fragment unique to the transducing phage. (The remaining, previously unassigned, unique fragment from λpΦ621 is 1.1 kbp in length and must therefore span the fusion joint between bacterial DNA and λ DNA.) Therefore, the ompF-lacZ phages are type 2. Judging from this and previous results, we expected approximately 3.0 kbp of this fragment to carry lacZ sequences. The remaining 1.3 kbp must originate from the ompF region of the chromosome in MH621.
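The three phage types described above amount to a small decision rule on where EcoRI sites fall in the bacterial DNA carried by the phage. The following sketch expresses that rule in code as an aid to reading; it is not a procedure used by the authors. The class `TransducingPhage`, its field names, and all coordinates in the example are hypothetical and chosen only to resemble the MH621 case (about 3.0 kbp of lacZ in a 4.3-kbp fragment).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class TransducingPhage:
    """Map positions in bp, measured from the EcoRI site in lacZ toward joint A."""
    joint_b: int                 # lacZ / target-gene fusion joint
    regulatory_region_end: int   # far edge of the target gene's promoter region
    joint_a: int                 # end of the bacterial DNA carried by the phage
    ecori_sites: Sequence[int]   # EcoRI sites lying in the bacterial DNA


def phage_type(phage: TransducingPhage) -> int:
    """Classify a phage as type 1, 2, or 3 following the rule given in the text."""
    bacterial_sites = [
        s for s in phage.ecori_sites if phage.joint_b < s < phage.joint_a
    ]
    if not bacterial_sites:
        return 1  # no EcoRI site introduced by the bacterial DNA
    if any(s < phage.regulatory_region_end for s in bacterial_sites):
        return 3  # a site within the target gene or its promoter region
    return 2      # sites adjacent to, but not within, the target gene


# Hypothetical coordinates, loosely modelled on the MH621 phage.
example = TransducingPhage(
    joint_b=3000, regulatory_region_end=3500, joint_a=5600,
    ecori_sites=[4300, 5150],
)
print(phage_type(example))
```

With these illustrative numbers the first EcoRI site lies beyond the promoter region, so the phage is classified as type 2, the category assigned to the ompF-lacZ phages in the text.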

DNA and Protein Sequence Analysis of Two ompF-lacZ Hybrid Protein Gene Fusions

Our DNA sequence analysis concentrated on the DNA encoding the amino-terminal part of OmpF and the presumptive promoter region. We chose to examine the protein fusions in detail because the distance from lacZ to the promoter of ompF can be estimated from the size of the hybrid proteins observed on sodium dodecyl sulfate-polyacrylamide gels. Strains MH621 and MH622 have been shown to produce hybrid proteins of about 119,000 Mr and 116,000 Mr, respectively [7]. The molecular weight of the β-galactosidase monomer is 116,000 Mr [20]. Since the genetic techniques used in the construction of the protein fusions result in the replacement of at least the first 17 amino acids of β-galactosidase, we expected both of the ompF-lacZ hybrids to be joined near the amino terminus of OmpF. We expected the small difference in the sizes of the two hybrid proteins (3,000 Mr ≈ 28 amino acids) to be reflected at the DNA level, in the size of any DNA fragment originating from the region of the fusion joint.

We examined the restriction enzyme digests of the two clones of these fusions, pMH621 and pMH622, for any unique fragments using several different restriction enzymes. Such DNA fragments should arise from the region of the fusion joint, since all other fragments are identical. With the various restriction enzymes tested, the unique fragment from pMH621 was found to be slightly larger than the corresponding fragment from pMH622. An example of this analysis using the restriction enzyme HpaII is shown in Figure 4, right panel. This result confirms that the ompF-lacZ hybrid protein from MH621 is slightly larger than the hybrid protein from MH622. We can locate the first HpaII site at codon 56 in lacZ by examining the published DNA sequence of lacZ [20]. Based upon the size of the unique HpaII fragments, we would expect a difference of about 30 amino acids between the two hybrid proteins.

The hybrid protein from strain MH622 was isolated and subjected to an amino acid sequence analysis as described previously [22]. The sequence of the 24 amino-terminal amino acids from this ompF-lacZ hybrid protein is as follows:

NH2-Met-Met-lys-X-X-ile-leu-ala-val-ile-val-pro-ala-leu-leu-val-ala-gly-thr-ala-asn-pro-gly-val.

Amino acids 22-24 most likely correspond to amino acids 19-21 of β-galactosidase [20]. After locating the DNA fragments from the beginning of the ompF gene (Figure 4), it was simple to obtain DNA sequences from this region. The sequence obtained from pMH621 is presented in Figure 5, aligned with the amino-terminal protein sequence deduced from the above analysis. This DNA sequence has also been published by Mutoh et al. [23].

[Figure 5 sequence, 5' to 3'; the number at the right is the position of the last nucleotide on the line. N marks one base that is not legible.]

ATCGATAAAGTTTCCATCAGAAACAAAATTTCCGTTTAGTTAATTTAAATATAAGGAAATCATATAAATAGATTAAAATTGCTGTAAATATCATCACGTC  100
TCTATGGAAATATGACGGTGTTCACAAAGTTCCTTAAATTTTACTTTTGGTTACATATTTTTTCTTTTTGAAACCAAATCTTNATCTTTGTAGCACTTTC  200
ACGGTAGCGAAACGTTAGTTTGAATGGAAAGATGCCTGCAGACACATAAAGACACCAAACTCTCATCAATAGTTCCGTAAATTTTTATTGACAGAACTTA  300
TTGACGGCAGTGGCAGGTGTCATAAAAAAAACCATGAGGGTAATAAATAATGATGAAGCGCAATATTCTGGCAGTGATCGTCCCTGCTCTGTTAGTAGCA  400
GGTACTGCAAACGCTGCAGAAATCTATAACAAAGATGGCAACAAAGTAGATCTG

Restriction sites labeled on the figure: TaqI (at the start of the sequence), PstI (CTGCAG; in the third and fifth lines), and BglII (AGATCT; near the 3' end).
Signal sequence (translation beginning at the ATG at position 350): Met-Met-Lys-Arg-Asn-Ile-Leu-Ala-Val-Ile-Val-Pro-Ala-Leu-Leu-Val-Ala-Gly-Thr-Ala-Asn-Ala
Mature OmpF, residues 1-13: Ala-Glu-Ile-Tyr-Asn-Lys-Asp-Gly-Asn-Lys-Val-Asp-Leu

Figure 5. DNA sequence from the promoter region of the ompF gene. The underlined amino acids are the signal sequence of OmpF. The numbered amino acids are from mature OmpF protein. The underlined bases (positions 238-254) are potential mRNA start sites as determined by the S1 mapping experiment (see Figure 6). The start point corresponding to the major S1-generated fragment is underlined twice. The arrows indicate a pronounced region of dyad symmetry within the transcribed sequences.
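The coordinates in Figure 5 can be checked with a few lines of code. The sketch below is illustrative only: it takes positions 301-400 of the Figure 5 sequence, translates the reading frame that opens at position 350 (the initiation codon identified in the Discussion), and computes the length of the untranslated leader implied by the S1-mapped start points at positions 238-254. The standard genetic code table is built inline; the names `SEQ_301_400`, `translate`, and `leader_lengths` are hypothetical.

```python
from itertools import product

# Positions 301-400 of the Figure 5 sequence, written in blocks of ten.
SEQ_301_400 = "".join([
    "TTGACGGCAG", "TGGCAGGTGT", "CATAAAAAAA", "ACCATGAGGG", "TAATAAATAA",
    "TGATGAAGCG", "CAATATTCTG", "GCAGTGATCG", "TCCCTGCTCT", "GTTAGTAGCA",
])
FIRST_POSITION = 301
ATG_POSITION = 350          # initiation codon, as stated in the Discussion
START_SITES = (238, 254)    # range of S1-mapped start points (Figure 5 caption)

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate complete codons of a DNA string into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


orf = SEQ_301_400[ATG_POSITION - FIRST_POSITION:]
peptide = translate(orf)

# Nucleotides between each candidate start point and the initiation codon.
leader_lengths = tuple(ATG_POSITION - start for start in START_SITES)

print(peptide)          # first 17 residues of the OmpF signal sequence
print(leader_lengths)   # leader length for the earliest and latest start site
```

The translated residues match the first 17 amino acids printed beneath the sequence in Figure 5, and the leader works out to between 96 and 112 nucleotides depending on which start point is used.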

Characterization of the ompF mRNA Transcript

To locate the start point of transcription of the ompF gene, we performed a modified S1 nuclease experiment using RNA isolated in vivo. The details of this experiment are presented in Materials and Methods. The results are shown in Figure 6. The DNA fragments generated by S1 nuclease terminate in a 3'[OH] group, whereas the DNA fragments corresponding to a particular position generated by the chemistry used for DNA sequence analysis are actually missing that base and terminate in a 3'[PO4] group at the preceding base [12]. In both products, the location of the 5' end label is identical. The apparent position of the S1-generated band was therefore adjusted by 1.5 nucleotides (1 nucleotide for the chemically removed base and 0.5 nucleotide for the effect of the terminal phosphate on mobility) to achieve correspondence with the base-specific fragments generated by the DNA sequence reactions. The DNA probe was a 390-bp fragment labeled at the 5' end of the strand complementary to the sequence shown in Figure 5. The size of this band after hybridization and S1 treatment is 194-210 bp. The positions of the ompF mRNA potential start sites are indicated in Figure 5.

Discussion

The vector system we have described for cloning gene fusions to lacZ relies on naturally occurring unique sites in lacZ. Essentially, the lacZ sequence serves as a genetic tag to aid the physical isolation and analysis of adjacent DNA sequences. We have successfully used this technique to clone both operon and protein ompF-lacZ gene fusions and to clone lamB-lacZ, tsx-lacZ, ompA-lacZ, and tyrT-lacZ fusions (M. Berman, unpublished data). Specialized transducing phages were used in these examples, although preliminary results indicate that digesting chromosomal DNA from a strain carrying a lacZ fusion will also generate LacZ+ clones using pMLB524.

[Figure 6: autoradiograph with S1 lanes marked 10× and 1× flanking four sequence lanes marked G, G+A, C+T, and C. Gel positions 10 to 100 are labeled with the corresponding sequence positions 317 to 227.]

Figure 6. Nuclease S1 mapping experiment. Lanes marked 1× and 10× (relative amount of sample loaded on gel) are the results of in vivo mRNA hybridized to a 390-bp probe from the ompF promoter and treated with nuclease S1 (see Materials and Methods). The four center lanes are the end-labeled DNA probe subjected to DNA sequence analysis [12]. A, G, C, and T refer to base-specific cleavage reactions. Base positions 1-100 from the bottom of the gel are shown. The numbers in parentheses (317-227) refer to position in the DNA sequence of Figure 5. The dots above bases refer to nearest positions of S1-generated bands. The arrowhead is at the base corresponding to the major S1 product. The position of the S1-generated fragments is adjusted as described in the text before inclusion in Figure 5. Note that the sequence as read is complementary to that in Figure 5.

The only potential limitation of this vector system is the requirement for a particular enzyme digestion. If this site occurs within the target gene, then a partial digest is required to obtain clones of the entire gene fusion. However, it is statistically unlikely that all three enzymes (EcoRI, SstI, and ClaI) will cleave within the target gene sequences. Mapping of transducing phages or the chromosome will allow the choice of the appropriate cloning vector.
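The choice of vector described here reduces to a lookup: each vector is tied to one enzyme, and a vector is convenient whenever its enzyme does not cut within the target gene. The sketch below illustrates that reasoning only; the enzyme-to-vector pairing comes from the text, while the function names `usable_vectors` and `cloning_plan` and the example inputs are hypothetical.

```python
# Enzyme used with each recipient vector, as described in the text.
VECTOR_FOR_ENZYME = {
    "EcoRI": "pMLB524",
    "SstI": "pMLB1060",
    "ClaI": "pMLB1097",
}


def usable_vectors(enzymes_cutting_target):
    """Return vectors whose cloning enzyme does not cut within the target gene."""
    cutting = set(enzymes_cutting_target)
    return [
        vector for enzyme, vector in VECTOR_FOR_ENZYME.items()
        if enzyme not in cutting
    ]


def cloning_plan(enzymes_cutting_target):
    """Summarize whether a complete or a partial digest is needed."""
    vectors = usable_vectors(enzymes_cutting_target)
    if vectors:
        return "complete digest; usable vectors: " + ", ".join(vectors)
    return "partial digest required with any of the three vectors"


# Hypothetical mapping results for three different target genes.
print(cloning_plan([]))
print(cloning_plan(["EcoRI"]))
print(cloning_plan(["EcoRI", "SstI", "ClaI"]))
```

Only in the last case, where all three enzymes cut within the target gene, does the sketch fall back to a partial digest, which mirrors the limitation noted in the text.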

Clearly, the availability of a λ transducing phage carrying the gene fusions facilitates the cloning protocol. Recently, a simple method for obtaining gene fusions in vivo has been reported [3]. This one-step fusion method of Casadaban employs a defective Mu phage, Mud(103). A technique has been described by Komeda and Iino [24] to convert a Mud(103) fusion prophage to a λ prophage in any fusion strain. This procedure utilizes λp1(209). The resulting λ prophage carries a small amount of Mu phage DNA at the fusion joint (at site B in Figure 3). We have successfully used such converted Mud lysogens with our cloning system.

The sequence of the hybrid protein from strain MH622 establishes the start point of translation of the ompF gene. The amino-terminal residue corresponds to the methionine codon at position 350 in Figure 5. Previous results have shown that the signal sequence of a localized protein when fused to β-galactosidase is not necessarily cleaved following synthesis [22]. By comparing the N-terminus in our sequence with the reported N-terminus of mature OmpF [25], we can place the start point of the mature porin at position 417 in our sequence. These results confirm that a 22-amino acid signal sequence is present in the wild-type ompF gene. The presence of this signal sequence was deduced from the previously published DNA sequence [23].

The method used to locate the fusion joint fragment from our clones can be applied to any gene fusion system. The 65-bp difference in size between the unique bands in lanes A and B of Figure 4 indicates a 22-amino acid difference between the two hybrid proteins. The hybrid protein encoded by MH622 contains 20 amino acids from OmpF and is fused directly to LacZ at amino acid 18 (see Results). The hybrid protein from MH621 is fused to LacZ at amino acid 20 and contains 35 amino acids from OmpF. Between ompF and lacZ in the MH621 fusion is DNA from the end of bacteriophage Mu that encodes 11 additional amino acids in the same reading frame (data not shown). Therefore, there are 24 more amino acids in the MH621 fusion than in MH622. This corresponds to an actual 72-bp difference in the size of the DNA fragments spanning the fusion joints.

The availability of clones on small, high-copy-number plasmids facilitates analysis of regulatory regions. For instance, the techniques of DNA footprint and transcriptional analysis are much easier to perform using such clones. We demonstrated this by analyzing the in vivo transcripts of ompF. The S1 experiment described in Figure 6 allows the placement of probable start points of transcription at nucleotide positions 238-254 (Figure 5).
This leaves a notable transcribed, but untranslated, leader on the mRNA. Experiments are underway to determine the sequence of the 5' end of the in vivo mRNA from ompF and analyze the sequence of cis-acting ompF regulatory mutations.

This work was sponsored by the National Cancer Institute, DHHS, under Contract No. N01-CO-75380 with Litton Bionetics, Inc. and by NSF Grant No. PCM-7819974 and PHS Grant No. AI-04-181. Thanks to Dr. M. Casadaban for supplying plasmid pMC871 and to Dr. H. de Boer for supplying plasmid pCVI. Special thanks to L. Jenkins for careful preparation of this manuscript.

References

1. Bassford, P., Beckwith, J., Berman, M., Brickman, E., Casadaban, M., Guarente, L., Saint-Girons, I., Sarthy, A., Schwartz, M., Shuman, H., and Silhavy, T. (1978) In Molecular Aspects of Operon Control (Miller, J. H. and Reznikoff, W. S. eds.), pp. 245-261, Cold Spring Harbor Laboratory, New York.
2. Casadaban, M. J. (1976) J. Mol. Biol. 104, 541-555.
3. Casadaban, M. J., and Cohen, S. N. (1979) Proc. Natl. Acad. Sci. USA 76, 4530-4533.
4. Weinstock, G. M., Berman, M. L., and Silhavy, T. J. (1983) In Gene Amplification and Analysis (Papas, T. S., Rosenberg, M., and Chirikjian, J. G. eds.), pp. 27-64, Elsevier Science Publishing Co., Inc., New York.
5. Berman, M., and Landy, A. (1979) Proc. Natl. Acad. Sci. USA 76, 4303-4307.
6. Hall, M., and Silhavy, T. J. (1981) Ann. Rev. Genet. 15, 91-142.
7. Hall, M., and Silhavy, T. J. (1981) J. Mol. Biol. 146, 23-43.
8. Bolivar, F., Rodriguez, R., Greene, P. J., Betlach, M., Heyneker, H. L., Boyer, H. W., Crosa, J., and Falkow, S. (1977) Gene 2, 95-113.
9. Peden, K. W. C. (1983) Gene 22, 277-280.
10. Casadaban, M., Chou, J., and Cohen, S. N. (1980) J. Bacteriol. 143, 971-980.
11. Büchel, D. E., Gronenborn, B., and Müller-Hill, B. (1980) Nature 283, 541-545.
12. Maxam, A. M., and Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
13. Casadaban, M., and Cohen, S. N. (1980) J. Mol. Biol. 138, 179-207.
14. Emr, S. D., Hall, M. N., and Silhavy, T. J. (1980) J. Cell Biol. 86, 701-711.
15. Berk, A. J., and Sharp, P. A. (1977) Cell 12, 721-732.
16. Gryczan, T. J., Grandi, G., Hahn, J., Grandi, R., and Dubnau, D. (1980) Nucleic Acids Res. 8, 6081-6097.
17. Brickman, E., Silhavy, T. J., Bassford, P. J. Jr., Shuman, H. A., and Beckwith, J. R. (1979) J. Bacteriol. 139, 13-18.
18. Zabin, I. (1982) Mol. and Cell. Biochem. 49, 87-96.
19. Queen, C. L., and Korn, L. J. (1980) In Methods in Enzymology (L. Grossman and K. Moldave eds.), pp. 595-609, Academic Press, New York.
20. Kalnins, A., Otto, K., Rüther, U., and Müller-Hill, B. (1983) EMBO J. 2, 593-597.
21. Helling, R. B., Goodman, H. M., and Boyer, H. W. (1974) J. Virol. 14, 1235-1244.
22. Sarthy, A., Fowler, A., Zabin, I., and Beckwith, J. (1979) J. Bacteriol. 139, 932-939.
23. Mutoh, N., Inokuchi, K., and Mizushima, S. (1982) FEBS Letters 137, 171-174.
24. Komeda, Y., and Iino, T. (1979) J. Bacteriol. 139, 721-729.
25. Chen, R., Kramer, C., Schmidmayr, W., and Henning, U. (1979) Proc. Natl. Acad. Sci. USA 76, 5014-5017.

© 1984 Elsevier Science Publishing Co., Inc., 52 Vanderbilt Ave., New York, NY 10017. 0735-0561/84/$03.00