Purification of plasmid-expressed proteins which lack functional assay systems


ANALYTICAL BIOCHEMISTRY 181, 336-340 (1989)



Christopher C. Marvel*,¹ and Harold O. Kammen†

*University of Southern California School of Medicine, Albert Soiland Cancer Research Laboratory, 1414 S. Hope Street, Los Angeles, California 90015, and †School of Public Health, University of California-Berkeley, Berkeley, California 94720

Received December 14, 1988

A general method for the purification of proteins whose genes are cloned into plasmid vectors, but whose biochemical and functional characteristics are unknown, is described. A cell-free transcription-translation system from Escherichia coli K-12 is used to synthesize in vitro radiolabeled protein expressed from recombinant plasmid vectors. The radiolabeled proteins are then fractionated and used as markers for the purification of nonradiolabeled proteins without recourse to functional assays. Biochemical analysis of the purified proteins can reveal information about their cellular localization, binding parameters, and physical, enzymatic, or regulatory properties. This information complements in vivo genetic analysis with the goal of identifying the gene and the function of its protein product. An example using this technique, in which the product of the usg gene in the hisT operon of E. coli has been purified and biochemically characterized, is described. © 1989 Academic Press, Inc.

Molecular biology methodologies permit the rapid cloning, sequencing, and characterization of genomic DNA segments which code for specific protein products. Genes of known function may be identified by hybridization, mutant complementation, and catalytic or immunological characteristics of their protein products. The analysis of cloned DNA often reveals additional genes near the one of interest. These genes may be identified by sequence analysis (open reading frames) or by expression of proteins in maxicell, minicell, or in vitro transcription-translation systems (1-3). If the nearby genes are localized within the same operon as the identified gene, this raises the question of the reason for their concerted regulation and the possible interrelation of their functional properties.

¹ To whom correspondence should be addressed.

To answer these questions, deletions can be introduced into the nearby genes in vitro and null mutant strains constructed in vivo by homologous recombination of these constructs into host strains (4,5). From these strains it may be determined if auxotrophy, lethality, or phenotypic changes are produced by the disruption of these genes. However, these genetic approaches may not suffice to clarify the function of the genes or their products. In these cases, the isolation and analysis of the protein products could provide valuable insights into their function, effects on operon regulation, binding to other protein products of the operon, or modulation of the catalytic activities of other operon components. The lack of information about a protein's function would generally preclude the development of an activity assay for its purification. However, if a radiolabeled protein can be expressed, identified, and isolated, this protein can be used as a tracer to monitor the purification of sufficient unlabeled protein for analysis of its biochemical properties.

PRINCIPLES AND DISCUSSION OF THE METHOD

1. The preparation of radiolabeled protein. The first step in the procedure is the identification and preparation of radiochemically labeled proteins expressed from the recombinant plasmids. The maxicell (1) and minicell (2) systems express proteins from plasmid DNA in vivo but have the disadvantage of requiring extensive manipulation of plasmid-transformed cells. In contrast, the cell-free in vitro expression system described by Zubay (3) requires only simple benchtop manipulations. It has the added advantage of producing higher incorporation of radiolabeled amino acids into plasmid-directed proteins than either of the in vivo systems. An example of the protein labeling patterns seen with maxicell and in vitro transcription-translation systems for cloned genes contained on the hisT operon is shown in Fig. 1. In our hands, Escherichia coli in vitro transcription-translation kits purchased from Amersham Corp. (Arlington Heights, IL) have proven to be convenient and reproducible for preparing radiolabeled protein.

[Figure 1 lanes: A-E, maxicells; F-K, in vitro system]

FIG. 1. Maxicell and in vitro cell-free expression of plasmid proteins. This figure illustrates the radiolabeled protein products synthesized from plasmid constructs, using the in vivo maxicell system (1) and the cell-free in vitro system of Zubay (3). The maxicell data (left) show the expression pattern of several plasmids used to characterize the usg protein coding region. The in vitro data (right) show the expression patterns from the identical plasmids. Lanes A and F are controls without plasmids; lanes B and G are pBR322; lanes C and H are pUC9 derivatives which contain coding segments of the usg gene; lanes D and J contain plasmid pNU49, which expresses both hisT and usg; lanes E and K contain plasmid ψ219; the plasmids used in lanes H and I are pUC9 derivatives which contain an identical insert (the hisT and usg region) cloned in opposite orientations and which illustrate artifactual protein expression from vector promoters. These plasmid constructs have been described (9,10), as has the correlation of radiolabeled protein products to coding regions of plasmid ψ210 (10,11).

The plasmid constructs used for protein expression should contain appropriate signals for in vitro expression (a Pribnow box for transcriptional initiation and a Shine-Dalgarno sequence for translational initiation). The expression of individual genes may be complicated if the genes are translationally linked, or if complex regulation of their expression occurs. Consequently, the correspondence between the gene and the protein product must be rigorously established for each plasmid construct. This can be done by correlating expressed protein size with the changes predicted for subclones containing deletions, inverted regions, or other manipulations in the predicted coding regions.

The commercial in vitro system used in this work was prepared from E. coli K-12 and is optimal for expression of E. coli DNA sequences. However, it has been reported that E. coli extracts can accurately express protein products from DNA of other bacterial sources, such as Staphylococcus aureus and Bacillus subtilis (6). Cloned DNA from other organisms containing open reading frames could also be expressed if the appropriate E. coli recognition signals are contained in the plasmid vector. Although it has been reported that in vitro and in vivo systems process proteins identically, it is possible that certain proteins might undergo different processing in vitro than that occurring in the in vivo systems. For this reason, it is advisable to verify that the protein product obtained by expression in the in vitro system is identical in size to that found in vivo (minicells or maxicells). Open reading frames for eukaryotic proteins can be expressed if engineered to contain suitable recognition signals; however, it should be noted that correct post-translational modifications (glycosylation, phosphorylation, etc.) would not occur in the bacterial system.

It should also be noted that anomalous migration of proteins can occur in SDS²-polyacrylamide gels (7). When this occurs, size discrepancies of protein products could lead to misinterpretations about the correspondence of open reading frames to protein products. Suitable analysis should also verify that the cloned gene is not ligated into the vector such that artifactual run-on or run-off translation occurs from the vector sequence (see lanes H and I, Fig. 1). In some cases, the expressed protein of interest may be masked by proteins derived from vector sequences. For example, many plasmid vectors contain the resistance marker for ampicillin. In such vectors any expressed proteins migrating near the β-lactamase gene product (31,000) will be difficult to identify. In such cases it is useful to clone into other vector systems which contain different resistance markers (such as chloramphenicol acetyltransferase, 25,000).

2. Purification of the radiolabeled protein. After identification of the correct radiolabeled protein, the next objective is to obtain radiochemically pure protein products by procedures that preserve native structure, as with conventional protein purification. Conventional enrichment procedures (salt fractionation, gel filtration, ion-exchange, and affinity methods) are applicable for this task.
It is advantageous to use small plasmid vectors to harbor the gene inserts; these will express fewer extraneous vector proteins than larger complex plasmids. Similarly, it is also advantageous to subclone relatively small inserts whenever possible, so that only a limited number of insert-coded products will be radiolabeled. Because the number of radiolabeled products is small, it is often possible to obtain a significantly enriched fraction of a radiolabeled protein in a single fractionation step. Since the labeling systems contain relatively crude cell preparations, a radiochemically pure material will very likely be biochemically nonhomogeneous. However, the quantity of this heterogeneous material is minimal compared to that of the material being fractionated in subsequent steps.

² Abbreviation used: SDS, sodium dodecyl sulfate.

3. The use of radiolabeled protein to purify nonradioactive protein. One experimental approach is to mix a small quantity of the purified radiolabeled protein with the crude material to be fractionated. The course of the fractionation is then followed by assaying the recovery of the desired protein (from the recovery of radioisotope) and specific activity (from the radioactivity per unit quantity of bulk protein). Another approach would be to add the radiolabeled protein at the start of each purification step, assessing the purification from the recovery of radioactive material, in conjunction with electrophoretic analysis of the fractions. If enrichment steps are carried out without the prior addition of the radioactive protein, it is advisable to include both the radioactive and the nonradioactive standards in electrophoretic gels, so that the mobility of labeled plasmid-expressed proteins and bulk nonradioactive proteins can be directly compared.

As the purification proceeds, the identity of the desired protein product may become self-evident from comigration of radiolabeled and stained protein bands in electrophoretic analysis. With single subunit proteins, this poses no difficulty. For multisubunit proteins, the identification may require the characterization on nondenaturing gels of radiolabeled protein with heterogeneous unlabeled subunits. Once the electrophoretic position of the enriched protein is established, it may be possible to monitor and complete the purification solely from assays based on the stained gel electrophoresis patterns. Homogeneity of the product will be obvious from electrophoresis criteria and by congruence of radiolabeled and nonlabeled protein bands in a variety of analytical steps.

4. Confirmation of the identity of isolated proteins. When the presumed protein product has been sufficiently fractionated to produce a separable distinct band or group of subunit bands on SDS-polyacrylamide gels, verification of its identity is feasible. This verification can be obtained even when the fractionated material is quite heterogeneous.
The subunit bands are extracted from the gels and subjected to N-terminal analysis with a gas phase protein sequenator (8). Because of the sensitivity of this system, 10-20 µg of a 40,000 protein is sufficient for this analysis. This sequence information allows comparison of the N-terminal sequence to that predicted from the DNA sequence. If multiple amino acids are detected with each cycle, the distinct gel band is probably heterogeneous and further fractionation is required. This gel-purified protein can also be used to obtain an amino acid composition to verify the composition predicted from the open reading frame. If ambiguities exist about the presumptive protein's coding region, more detailed chemical characterization can be conducted, e.g., peptide cleavage and sequence analysis of the fragments.

5. Applications of the isolated proteins. When a successful purification is achieved for the gene product of an unknown function, in addition to a structural characterization, it becomes feasible to analyze its possible biochemical properties and relationships. These include (i) the ability of the protein to bind to macromolecular or other protein ligands, which could be tested by using radiolabeled ligands to which the protein may bind or, conversely, by using the radiolabeled protein with nonradioactive ligands; (ii) possible functional relationships, which can be tested by analyzing the effects of the isolated protein on catalytic activities or substrate recognition by the products of adjacent genes in the operon; (iii) general properties, such as the ability to bind to specific or general affinity supports, immunological behavior, or possession of nuclease, ATPase, or other activities by the isolated protein; and (iv) the preparation of specific antibodies and immunochemical reagents against the purified protein.

6. An example in the use of this methodology. The isolation of the usg protein. During the cloning of the E. coli hisT gene its presence was identified, in plasmid constructs, by elevated levels of its enzymatic activity [tRNA pseudouridine synthase I or PSU I]. Subcloning and analysis demonstrated that the hisT gene was located internally within an operon containing at least four additional genes (9-11). An adjacent gene, designated usg, was located upstream and was translationally linked to the hisT gene (10). One operon gene (pdxB) was subsequently identified (12); however, the identity and functions of the remaining operon genes are still obscure. Because of the translational coupling between the usg and the hisT genes, we were motivated to purify the usg protein to determine whether it interacted with PSU I or its tRNA substrates. Earlier genetic analysis had indicated that the classic hisT phenotype was independent of the usg gene (10), but it was possible that interactions between usg protein and PSU I might exist that would affect the substrate recognition or catalytic activities of PSU I.
Although the synthesis of usg protein was amplified by plasmids containing the usg gene, no unique Coomassie blue-stained band could be identified corresponding to the usg protein in crude cell extracts by polyacrylamide gel electrophoresis. This made it impossible to follow the purification of usg protein from the stained gel patterns. The only property that appeared to be exploitable for its purification was its expression from plasmids containing the usg gene. From work with minicell, maxicell, and cell-free expression systems, it was known that the usg protein subunit migrated as an apparent 43,000-45,000 protein in SDS-polyacrylamide gels (9,10). The molecular mass determined from the gene sequence corresponds to 37,300. The aberrant mobility of the usg protein in SDS gels is believed to arise from its high content of aspartic acid (10).

For preparation of radiolabeled usg protein we used plasmid ψ210 (9), which expresses high levels of the usg protein in maxicells and in vitro cell-free expression systems (shown in Fig. 1). Since the in vitro system was more reproducible and yielded greater incorporation of radiolabeled amino acid ([³H]leucine), the in vitro incubation mixtures were used to prepare labeled usg protein for larger scale fractionation of usg protein. Our standard reaction system contained 5 µg of plasmid DNA and 60 µCi of L-[4,5-³H]leucine. "E. coli S-30 extract," "supplement solution," and "amino acids minus leucine" were added to the reaction system as recommended by Amersham. Total reaction volume was 35 µl and, after incubation for 45 min at 37°C, insoluble material was pelleted at 4°C by centrifugation in a microfuge. The supernatant material was fractionated immediately or frozen at -80°C. Prolonged incubation or increasing the concentration of plasmid DNA did not result in increased polypeptide synthesis. These reaction conditions for maximum expression of usg-containing plasmids are consistent with studies of in vitro expression reported by Beres et al. (13). The reaction volume was occasionally scaled up fourfold without changes in incorporation parameters. Maximum expression occurred when the S-30 extract was used without refreezing.

The labeled usg protein was enriched to about 80% radiochemical purity by applying a single HPLC gel filtration step to the crude in vitro labeled extract (Fig. 2). In this separation, the peak at fractions 30 and 31 contained the labeled usg protein. This was confirmed by comparing the mobility of the labeled material in these HPLC fractions with the mobilities of the products of ψ210 expression on electrophoresis in SDS-polyacrylamide gels. The transcription-translation incubation mixtures (containing 20 µg of plasmid ψ210 and 240 µCi of L-[4,5-³H]leucine) typically led to the recovery of 125,000-200,000 cpm of ³H-labeled protein in the pooled gel filtration usg fractions. This material was then used as a tracer protein for subsequent fractionation of the unlabeled usg protein from bulk extracts.

FIG. 2. Fractionation of unlabeled protein and radiolabeled tracer proteins by HPLC. This figure shows the radioactive profile when a 5-µl aliquot of the transcription-translation products of ψ210 is fractionated on an Altex Spherogel TSK-250 gel filtration column. The peak eluting at position 30 has been demonstrated to contain the usg protein by the comigration of purified radiolabeled usg protein at this position. The rise in radioactivity at fraction 38 is due to unincorporated label and aminoacylated tRNAs. The A₂₈₀ profile is from the coresolution of 5 mg of pooled protein from a DEAE-cellulose chromatography fraction of usg-enriched material. Autoradiographs of the products of the cell-free expression of ψ210 are seen in lane A, and the fractionated radiolabeled usg protein (fraction 30) is seen in lane B. Coomassie blue staining patterns of the DEAE-eluant fraction applied to the gel filtration column are shown in lane C, and that of column fraction 30 is seen in lane D.

Labeled usg protein (50,000-100,000 cpm) was added to sonic extracts of E. coli cells transformed with plasmid ψ210, and purification was followed by recovery of the label after fractionation with ammonium sulfate and DEAE-cellulose, gel filtration on a TSK-250 column, and affinity chromatography on Affi-Gel Blue. An example of the use of the radiolabeled material in a single fractionation step is shown in Fig. 2. The starting material was a pooled DEAE-cellulose fraction enriched in usg protein. The DEAE pool was mixed with radioactive usg protein and resolved on a TSK-250 gel filtration column. The protein distribution in each fraction was assayed by polyacrylamide gel electrophoresis, using both protein staining and radioautography. A significant enrichment in the radioactive label and a Coomassie blue-stained band at 43,000 were concurrently seen at fractions 29-31. Fractions 30 and 31 were pooled and subjected to preparative electrophoresis in denaturing polyacrylamide gels. From this, sufficient homogeneous usg monomer was isolated for N-terminal sequence analysis. The N-terminal analysis indicated that the product contained a blocked N-terminus.
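The tracer bookkeeping described in section 3 (recovery from the radioisotope, specific activity from the radioactivity per unit of bulk protein) can be sketched in a few lines of Python. This is only a minimal illustration, assuming the tracer is added once to the crude extract; the `Step` class, the `purification_table` function, and every number below are hypothetical and are not data from this work.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    tracer_cpm: float   # radioactivity recovered in the pooled fraction
    protein_mg: float   # bulk (mostly unlabeled) protein in the pooled fraction


def purification_table(steps):
    """Per-step recovery (%), specific radioactivity (cpm/mg), and fold enrichment.

    All values are expressed relative to the first step, which is taken to be
    the crude extract after the tracer has been mixed in.
    """
    first = steps[0]
    base_specific = first.tracer_cpm / first.protein_mg
    rows = []
    for s in steps:
        specific = s.tracer_cpm / s.protein_mg
        rows.append({
            "step": s.name,
            "recovery_pct": 100.0 * s.tracer_cpm / first.tracer_cpm,
            "cpm_per_mg": specific,
            "fold": specific / base_specific,
        })
    return rows


# Hypothetical numbers for illustration only; not measurements from the paper.
steps = [
    Step("crude sonic extract + tracer", 80_000, 400.0),
    Step("ammonium sulfate", 64_000, 160.0),
    Step("DEAE-cellulose", 48_000, 30.0),
    Step("TSK-250 gel filtration", 36_000, 4.5),
]

for row in purification_table(steps):
    print(f"{row['step']:<30} {row['recovery_pct']:6.1f}% "
          f"{row['cpm_per_mg']:10.0f} cpm/mg {row['fold']:6.1f}x")
```

With these illustrative inputs the last step shows 45% recovery of label and a 40-fold rise in cpm per mg relative to the crude extract. If the tracer is instead added fresh at the start of each step (the second approach in section 3), recovery would have to be computed per step rather than cumulatively.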


Functional tests were conducted with highly enriched usg protein fractions which showed that the usg protein did not bind to tRNA or PSU I, nor did it otherwise affect the rate or extent of PSU I enzyme activity. Details of the purification and characterization of the usg protein will be presented elsewhere.
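The N-terminal check in section 4 amounts to comparing the residues read from the sequenator with the start of the predicted open reading frame. The Python sketch below assumes the standard genetic code, treats ATG, GTG, or TTG at the first codon as initiator methionine, and allows for removal of that methionine; the function names and the example sequence are hypothetical. A small helper also converts a protein load to picomoles, since 10 µg of a 40,000 protein corresponds to 250 pmol.

```python
# Standard genetic code in the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed bacterial initiation codons


def translate(orf_dna):
    """Translate an open reading frame from its first codon to the first stop."""
    orf = orf_dna.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        codon = orf[i:i + 3]
        aa = CODON_TABLE[codon]
        if i == 0 and codon in START_CODONS:
            aa = "M"  # the initiation codon is read as methionine
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def n_terminus_matches(orf_dna, observed, allow_met_removal=True):
    """True if the observed N-terminal residues agree with the predicted ORF."""
    predicted = translate(orf_dna)
    candidates = [predicted]
    if allow_met_removal and predicted.startswith("M"):
        candidates.append(predicted[1:])
    return any(c.startswith(observed) for c in candidates)


def micrograms_to_pmol(micrograms, molecular_mass):
    """Convert a protein load in micrograms to picomoles."""
    return micrograms * 1e6 / molecular_mass


# Hypothetical ORF for illustration; it is not the usg sequence.
ORF = "ATGTCTAAAGGTGAAGAACTGTAA"
print(translate(ORF))
print(n_terminus_matches(ORF, "SKGEE"))
print(micrograms_to_pmol(10, 40_000))
```

A comparison of this kind cannot be made for a product with a blocked N-terminus, as was found for the usg protein; in that situation the amino acid composition or internal peptide sequences mentioned in section 4 would be the inputs instead.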

ACKNOWLEDGMENTS

This work received support from the Southern California Cancer Center and Cancer Support Grant P30 CA14089 to the University of Southern California Comprehensive Cancer Center (C.C.M.) and the National Science Foundation Grant DMB 85-10491 (H.O.K.).

REFERENCES

1. Sancar, A., Hack, A. M., and Rupp, W. D. (1979) J. Bacteriol. 137, 692-693.
2. Roozen, K. J., Fenwick, R. G., and Curtiss, R. (1971) J. Bacteriol. 107, 21-33.
3. Zubay, G. (1973) Annu. Rev. Genet. 7, 267-287.
4. Winans, S. C., Elledge, S. J., Krueger, J. H., and Walker, G. C. (1985) J. Bacteriol. 161, 1219-1221.
5. Arps, P. J., and Winkler, M. E. (1987) J. Bacteriol. 169, 1061-1070.
6. Pratt, J. M., Boulnois, G. J., Darby, V., Orr, E., Wahle, E., and Holland, I. B. (1981) Nucleic Acids Res. 9, 4459-4474.
7. Burton, Z., Burgess, R. R., Lin, J., Moore, D., Holder, S., and Gross, C. A. (1981) Nucleic Acids Res. 9, 2889-2903.
8. Hunkapiller, M. W., Hewick, R. M., Dreyer, W. J., and Hood, L. E. (1983) in Methods in Enzymology (Hirs, C. H. W., and Timasheff, S. N., Eds.), Vol. 91, pp. 399-413, Academic Press, New York.
9. Marvel, C. C., Arps, P. J., Rubin, B. C., Kammen, H. O., Penhoet, E. E., and Winkler, M. E. (1985) J. Bacteriol. 161, 60-71.
10. Arps, P. J., Marvel, C. C., Rubin, B. C., Tolan, D. A., Penhoet, E. E., and Winkler, M. E. (1985) Nucleic Acids Res. 13, 5297-5315.
11. Nonet, M. L., Marvel, C. C., and Tolan, D. R. (1987) J. Biol. Chem. 262, 12,209-12,217.
12. Arps, P. J., and Winkler, M. E. (1987) J. Bacteriol. 169, 1071-1079.
13. Beres, L., Smith, B., Cannon, F., and Cannon, M. (1986) BioTechniques 4, 464-468.