A systematic approach to increase the efficiency of membrane protein production in cell-free expression systems



Protein Expression and Purification 82 (2012) 308–316

Contents lists available at SciVerse ScienceDirect

Protein Expression and Purification journal homepage: www.elsevier.com/locate/yprep

Stefan Haberstock a, Christian Roos a, Yvette Hoevels b, Volker Dötsch a, Gisela Schnapp b, Alexander Pautsch b, Frank Bernhard a,⇑

a Institute of Biophysical Chemistry, Centre for Biomolecular Magnetic Resonance, J.W. Goethe-University, Frankfurt-am-Main, Germany
b Department of Lead Identification and Optimization Support, Boehringer Ingelheim Pharma GmbH & Co. KG, D-88397 Biberach-an-der-Riss, Germany

Article info

Article history: Received 11 November 2011 and in revised form 25 January 2012. Available online 8 February 2012.

Keywords: Cell-free expression; Tag variation; Initiation of translation; GPCRs; P-CF; Thermostability

Abstract

High amounts of membrane protein samples are needed for structural or functional analysis, and a first bottleneck is often to obtain sufficient production efficiencies. The reduced complexity of protein production in cell-free expression systems results in a frequent correlation of efficiency problems with the essential transcription/translation process. We present a systematic tag variation strategy for the rapid improvement of cell-free expression efficiencies of membrane proteins based on the optimization of translation initiation. A small number of rationally designed short expression tags is attached via overlap PCR to the 5-prime end of the target protein coding sequence. The generated pool of DNA templates is analyzed in a cell-free expression screen and the most efficient template is selected for further preparative scale protein production. The expression tags can be minimized to only a few codons and no further modification of the coding sequence is required. The complete process takes only a few days and the synthesized PCR fragments can be used directly as templates for preparative scale cell-free reactions. The strategy is exemplified with the production of a set of G-protein coupled receptors, and yield improvements of up to 32-fold were obtained. All proteins were finally synthesized in amounts sufficient for further quality optimization and initial crystallization screens. © 2012 Elsevier Inc. All rights reserved.

Introduction

In cell-free (CF)¹ expression systems, the complexity of recombinant protein production is largely reduced and efficiencies remain mainly controlled by the central translation process. Yield optimization of difficult targets therefore appears to be most straightforward if critical and inefficient steps in the translation process can be optimized. Frequent rare codons as well as unfavorable initiation of translation have been identified as major limiting steps in protein biosynthesis [1,2]. Formation of secondary structures involving critical areas at the 5-prime end of the mRNA, such as the ribosomal binding site, could slow down or even prevent the initiation process [3–5].

⇑ Corresponding author. Fax: +49 69 798 29632. E-mail address: [email protected] (F. Bernhard).
¹ Abbreviations used: CD, circular dichroism; CECF, continuous exchange cell-free; CF, cell-free; CMC, critical micellar concentration; CPM, 7-diethylamino-3-(4'-maleimidylphenyl)-4-methylcoumarin; D-CF, detergent mode of CF expression; DTT, dithiothreitol; FM, feeding mixture; Fos16, n-hexadecylphosphocholine; GFP, green fluorescent protein; GPCR, G-protein coupled receptor; HA, hemagglutinin; HRP, horseradish peroxidase; IMAC, immobilized metal affinity chromatography; NMR, nuclear magnetic resonance; P-CF, precipitate forming CF expression mode; PCR, polymerase chain reaction; RM, reaction mixture; RT, room temperature; SDS, sodium dodecyl sulfate.

1046-5928/$ - see front matter © 2012 Elsevier Inc. All rights reserved. doi:10.1016/j.pep.2012.01.018

Although algorithms have been established that could facilitate the detection of such secondary structures, the high complexity of mRNAs still makes reliable predictions very difficult. A high frequency of rare codons could cause translational pausing, misincorporation of amino acids or even premature termination events, but such problems could be addressed by the expression of optimized synthetic genes [6]. By focusing on the prevalent problems with the initiation of translation [7,8], we have established a fast tag variation strategy in particular adapted to the general production of membrane proteins. A set of non-functional tags of minimal size is fused to the 5-prime ends of low expressing target coding regions. The often inefficient posttranslational cleavage of membrane proteins in detergent micelles can thus be avoided. The sequential steps of the tag variation protocol are: (I) Preparing DNA templates by overlap PCR for each tag – target combination, (II) DNA template screening in analytical scale continuous exchange cell-free (CECF) reactions, (III) the potential further optimization of the most efficient DNA template and (IV) final CECF protocol optimization with the optimized DNA template for preparative scale production. Characteristics of the strategy are high success in yield optimization with random targets, low workload and low impact on the protein structure.

S. Haberstock et al. / Protein Expression and Purification 82 (2012) 308–316

The tag variation strategy is exemplified with a number of notoriously difficult to express human G-protein coupled receptors (GPCRs). GPCRs represent a majority of the current drug targets and they are involved in a variety of human diseases such as asthma [9], diabetes [10] or even cancer [11]. They are usually very difficult to synthesize in conventional cellular expression systems and a number of reports on their CF production document a high potential of this approach [12–18]. The chemoattractant receptor-homologous molecule expressed on Th2 cells (CRTH2), the chemokine receptors type 2 (CCR2) and type 3 (CCR3), the C-X-C chemokine receptor type 2 (CXCR2) as well as the muscarinic acetylcholine receptor M3 (CHRM3) belong to class A, whereas the glucagon-like peptide 1 receptor (GLP1R) belongs to class B.

Materials and methods

DNA template preparation

Plasmid templates with T7-tag-target constructs were generated by standard restriction cloning procedures for insertion of amplified target coding sequences into the pET21a(+) vector (Merck, Darmstadt, Germany). Plasmid DNA templates were purified using standard kits (NucleoBond®, Macherey-Nagel, Düren, Germany). Linear DNA templates were generated by PCR using Vent DNA polymerase (New England Biolabs, Frankfurt, Germany). For the tag variation screen, first a pool of tag fragments was generated by amplifying the T7 promoter region and the ribosome binding site of an expression vector with primer P1 (Table 1), annealing upstream of the T7 promoter, and a series of primers P2, extending the tag fragment at the 3-prime end with the tag sequence and the PreScission protease cleavage site sequence. In addition, a pool of target fragments was generated by amplifying genes of interest with corresponding primers P3, annealing to the 5-prime end of the coding sequence and extending the target fragments at the 5-prime end with the PreScission protease cleavage site sequence, and primer P4, annealing downstream of the T7 terminator (Fig. 1B). PCR products of the two pools were purified with the QIAquick PCR purification kit (Qiagen, Hilden, Germany) and quantified with a NanoDrop (Peqlab, Erlangen, Germany). Tag and target fragments were combined to linear CF expression templates in a 1:1 molar stoichiometry with 200 ng target fragment in 50 µl overlap PCR reactions [19–21].

Cell-free expression

Compounds for CF expression were prepared as described previously [22]. Analytical scale CECF reactions for screening and for the optimization of reaction conditions were performed in 24 well microplates and Mini-CECF reactors [23]. The RM volume was 50 µl with a 1:16 ratio to the FM volume. DNA template was used at 20 ng/µl if not otherwise stated, and optimal concentrations for Mg2+ ions were screened for each target between 13 and 19 mM in analytical scale P-CF reactions. For quantification, CF reactions were performed in triplicate in presence of 50 nM 35S-Met (1000 Ci/mmol, Hartmann Analytic GmbH, Braunschweig, Germany). The RM was supplemented with an additional 17 µl Escherichia coli S30 extract and 33 µl FM to a total volume of 100 µl and used in a 1:7.5 ratio to the FM. The DNA template concentration was 10 ng/µl RM if not stated otherwise. Disposable dialysis containers (D-Tube™ Dialyzer Mini, MWCO 12–14 kDa, Merck, Darmstadt, Germany) were used for the RM, while the FM was filled into cryogenic vials (Fisher Scientific, Schwerte, Germany). The D-Tube containers were placed into the cryogenic vials and the reaction was incubated at 30 °C for 16 h with gentle agitation. The RM was then diluted by addition of 900 µl H2O and centrifuged at 18,000g for 10 min at RT. 900 µl of supernatant was removed and the protein pellets were resuspended by addition of 900 µl H2O. This procedure was repeated three times to remove unincorporated 35S-Met. The washed protein pellets were then resuspended in the remaining 100 µl supernatant and thoroughly mixed with 5 ml scintillation solution (Rotiszint® eco plus, Roth, Karlsruhe, Germany). Scintillation counting was performed with an LS 6500 multi-purpose scintillation counter (Beckman, Krefeld, Germany).

Protein analysis and purification

SDS–PAGE and immunoblotting were performed as described [24].
Proteins expressed in analytical scale P-CF (precipitate forming) reactions were harvested by centrifugation at 18,000g for 10 min at 4 °C. The pellets were suspended in 20 mM HEPES (pH 7.0),

Table 1. Oligonucleotide primers used in this study.¹

Primer | Orientation | Sequence
P1 | Forward | GAT CGA GAT CTC GAT CCC GCG
P4 | Reverse | GGA TAT AGT TCC TCC TTT CAG C
P2 - PS | Reverse | CGG GCC CTG AAA CAG CAC TTC CAG CAT ATG TAT ATC TCC TTC
P2 - AT | Reverse | CGG GCC CTG AAA CAG CAC TTC CAG ATA ATA TTT ATA ATA TTT CAT ATG TAT ATC TCC TTC
P2 - H | Reverse | CGG GCC CTG AAA CAG CAC TTC CAG TGG ACC ATC GTA TGG TTT CAT ATG TAT ATC TCC TTC
P2 - SER | Reverse | CGG GCC CTG AAA CAG CAC TTC CAG TGA TGA TGA TGA TGA TTT CAT ATG TAT ATC TCC TTC
P2 - G | Reverse | CGG GCC CTG AAA CAG CAC TTC CAG TTC TTC TCC TTT ACT TTT CAT ATG TAT ATC TCC TTC
P2 - R | Reverse | CGG GCC CTG AAA CAG CAC TTC CAG GAT TAC AAG GAT GAC TTT CAT ATG TAT ATC TCC TTC
P2 - GLP1R - AT1 | Reverse | GAC CCG GCG CAC CCG CCA TTT TCA TAT GTA TAT CTC CTT CTT A
P2 - GLP1R - AT2 | Reverse | GAC CCG GCG CAC CCG CCA TAT ATT TCA TAT GTA TAT CTC CTT CTT A
P2 - GLP1R - AT3 | Reverse | GAC CCG GCG CAC CCG CCA TAT AAT ATT TCA TAT GTA TAT CTC CTT CTT A
P2 - GLP1R - AT4 | Reverse | GAC CCG GCG CAC CCG CCA TTT TAT AAT ATT TCA TAT GTA TAT CTC CTT CTT A
P2 - GLP1R - AT5 | Reverse | GAC CCG GCG CAC CCG CCA TAT ATT TAT AAT ATT TCA TAT GTA TAT CTC CTT CTT A
P2 - GLP1R - AT6 | Reverse | GAC CCG GCG CAC CCG CCA TAT AAT ATT TAT AAT ATT TCA TAT GTA TAT CTC CTT CTT A
P3 - CRTH2 | Forward | CTG GAA GTG CTG TTT CAG GGC CCG ATG AGC GCG AAC GCG ACC CTG
P3 - GLP1R | Forward | CTG GAA GTG CTG TTT CAG GGC CCG ATG GCG GGT GCG CCG GGT C
P3 - GLP1R (short) | Forward | ATG GCG GGT GCG CCG GGT C
P3 - CCR2 | Forward | CTG GAA GTG CTG TTT CAG GGC CCG ATG CTG TCC ACA TCT CGT TCT C
P3 - CCR3 | Forward | CTG GAA GTG CTG TTT CAG GGC CCG ATG ACC ACC AGT CTG GAT ACC
P3 - CHRM3 | Forward | CTG GAA GTG CTG TTT CAG GGC CCG ATG ACC CTG CAT AAC AAC AGC A

¹ Tag sequences are underlined.


Fig. 1. Template generation for the tag variation screen. (A) 1st step PCR generating linear tag and target fragments. For the tag fragment, an expression vector with T7 promoter region is used as template, with the P1 primer annealing upstream of the T7 promoter and a tag-specific P2 primer annealing at the ribosome binding site and extending the fragment at the 3-prime end with the expression tag sequence and the PreScission site. The target coding sequence is copied from a suitable vector with a P4 primer annealing downstream of the T7 terminator region and a P3 primer annealing at the start of the coding sequence and extending the fragment at the 5-prime end with the PreScission site. (B) 2nd step combinatorial PCR. The tag and target fragments are combined in equal molar ratio in one PCR reaction. The fragments anneal via the PreScission site and the full length construct is amplified with primers P1 and P4.
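The primer layout described for the first PCR step can be sketched in a few lines of Python. This is only an illustration, not the authors' design tool: the tag and PreScission sequences are taken from Table 2, while the constant `RBS_3PRIME` is an assumption inferred from the common 3-prime end shared by all P2 primers in Table 1, and the function names are hypothetical.

```python
# Sketch: derive the reverse P2 primer and the forward P3 primer for the overlap PCR.
# Assumption: RBS_3PRIME is the sense-strand region directly upstream of the ATG,
# inferred from the shared 3' end of the P2 primers in Table 1.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

RBS_3PRIME = "GAAGGAGATATACAT"
START_CODON = "ATG"
PRESCISSION = "CTGGAAGTGCTGTTTCAGGGCCCG"  # PS-tag, Table 2

TAGS = {
    "PS": "",  # control: start codon directly followed by the PreScission site
    "AT": "AAATATTATAAATATTAT",
    "H": "AAACCATACGATGGTCCA",
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def p2_primer(tag: str) -> str:
    """Reverse primer: RBS end + ATG + tag + PreScission site, reverse-complemented."""
    return reverse_complement(RBS_3PRIME + START_CODON + tag + PRESCISSION)


def p3_primer(target_5prime: str) -> str:
    """Forward primer: PreScission site followed by the start of the target gene."""
    return PRESCISSION + target_5prime


if __name__ == "__main__":
    for name, tag in TAGS.items():
        print(f"P2 - {name}: {p2_primer(tag)}")
    print("P3 - CCR3:", p3_primer("ATGACCACCAGTCTGGATACC"))
```

Because every P2 primer ends (on the sense strand) with the PreScission site and every P3 primer starts with it, any tag fragment can anneal to any target fragment in the second PCR step, which is what makes the screen combinatorial.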

500 mM NaCl, 20 mM DTT, 5% glycerol, 2% Fos16 by pipetting the solution up and down until it became clear. Subsequently, the volume was increased 10-fold with 20 mM HEPES (pH 7.0), 500 mM NaCl, 5% glycerol, 0.0053% Fos16 and the sample was incubated for efficient resolubilization for 1 h at 22 °C with constant agitation. The resolubilised protein was mixed with 150 µl Ni–NTA agarose beads (Qiagen, Hilden, Germany) and incubated overnight at 4 °C with gentle agitation. The protein loaded Ni–NTA agarose beads were then washed using poly prep chromatography columns (Bio-Rad, München, Germany) with 15 column volumes of 20 mM HEPES (pH 7.0), 500 mM NaCl, 75 mM imidazole, 5% glycerol, 0.0053% Fos16 and eluted with the same buffer containing 350 mM imidazole. In case of CXCR2, the Fos16 was replaced with the indicated detergents by addition of the second detergent during CXCR2 binding to Ni–NTA and an additional washing step with the second detergent prior to imidazole washing. For size exclusion chromatography (SEC), Ni–NTA agarose purified proteins were centrifuged at 18,000g for 10 min at 4 °C prior to analysis on a Superdex 200 3.2/30 column connected to an Äkta Purifier (GE Healthcare, München, Germany) with a running buffer of 20 mM HEPES (pH 7.0), 500 mM NaCl, 5% glycerol, 0.0053% Fos16.

Circular dichroism (CD) spectroscopy

200 µl of 2–10 µM purified protein was transferred into a MINI Slide-A-Lyzer (Thermo Scientific, Langenselbold, Germany) and dialyzed against 10 mM HEPES (pH 7.0), 5 mM NaCl, 0.0053% Fos16 at 4 °C overnight. CD spectra were recorded with a 1 mm quartz cuvette and a Jasco J-180 spectropolarimeter (Jasco, Groß-Umstadt, Germany). Measurements were carried out at standard sensitivity, 3 nm band width, response of 1 s and a scanning speed of 100 nm/min. The presented data show a baseline corrected average of three measurements scanning the wavelength from 200 to 260 nm at 20 °C.

Thermofluor assay

Forty microliters of 20 µM purified protein was mixed with 1 µl 10 mM 7-diethylamino-3-(4'-maleimidylphenyl)-4-methylcoumarin (CPM). The samples were then transferred to a MX3005 qPCR machine (Agilent, Waldbronn, Germany) and a temperature gradient was applied from 25 to 92 °C with a ramp rate of 1 °C/cycle. The increase in fluorescence signal of the CPM dye upon protein unfolding and subsequent binding to free cysteines was measured with an excitation wavelength of 350 nm and an emission wavelength of 492 nm.

Results

Design of expression tags and generation of CF expression templates

A set of small N-terminal expression tags containing a core of six codons was designed according to several putatively critical characteristics (Table 2). In particular, AT richness downstream of the translational start codon [7] and the potential advantage of an AAA sequence at the second codon position were considered [8]. Furthermore, the T7 tag, which was previously identified as beneficial for the CF expression of several GPCRs, was included [22]. The PreScission protease recognition site was included as hybridization site between expression tag and target coding sequence in the PCR overlap strategy, and it further left the option for posttranslational tag removal (Fig. 1). All tags started with AAA followed by five codons of varying composition (Table 2). (I) The AT-tag is based on AT richness [7] and has an AT content of 100%. (II) The H-tag consists of the first codons of the hemagglutinin-tag frequently used for protein

Table 2. Expression tags.

Name | Nucleotide sequence | Amino acid sequence | AT (%)
PS | CTGGAAGTGCTGTTTCAGGGCCCG | LEVLFQGP | 37
T7 | ACCCATTTGCTGTCCACCCGTCATGCTAGCCAT | THLLSTRHASH | 45
AT | AAATATTATAAATATTAT | KYYKYY | 100
SER | AAATCATCATCATCATCA | KSSSSS | 72
H | AAACCATACGATGGTCCA | KPYDGP | 55
G | AAAAGTAAAGGAGAAGAA | KSKGEE | 72
R | AAAGTCATCCTTGTAATC | KVILVI | 66

expression [25,26]. (III) The G-tag is based on the first codons of green fluorescent protein, which shows exceptionally high expression yields in the CF system. (IV) The SER-tag is a modified former (His)5 tag containing a frame shift caused by a deletion of the first nucleotide. (V) The R-tag is a random combination of nucleotides that avoids rare codons. In addition, the T7-tag was included because of previous positive experiences with CF expression of some GPCRs [13]. Finally, the PS-tag serves as a control for the effect of the PreScission site used as hybridization site. All target sequences were furthermore modified with a C-terminal poly(His)10 purification tag. The GPCR coding regions were fused with the seven expression tags by a previously published overlap PCR strategy [19], generating linear DNA templates containing the T7 promoter, RBS, expression tag, target sequence and T7 terminator (Fig. 1). As the PreScission site serves as common hybridization site, only one specific oligonucleotide primer is necessary for each selected target (P3 in Fig. 1). The primers P1 and P4 can be used universally, while for each selected expression tag one P2 primer is necessary. The PCR step using P1 and the individual P2 primers generates a pool of tag fragments covering the T7 promoter and the individual expression tags. As template for this PCR, almost any plasmid containing a T7 promoter, such as pET21a(+), can be used. The PCR with P4 and the individual P3 primers generates a pool of target fragments (Fig. 1A). Templates are plasmids containing the target coding sequences under control of T7 regulatory elements including the T7 terminator. In a second step, the two PCR fragment pools are combined and amplified using primers P1 and P4. Each target fragment is fused with a complete set of tags in a combinatorial overlap PCR, thus generating linear templates suitable for CF expression screening (Fig. 1B).
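The two design properties listed for the tags in Table 2, the AT content and the encoded amino acid sequence, can be recomputed directly from the nucleotide sequences. The following is a minimal sketch using the standard genetic code; the helper names are hypothetical, and rounding the AT content down to a whole percent is an assumption chosen because it reproduces the tabulated values.

```python
# Sketch: recompute the "AT (%)" and "Amino acid sequence" columns of Table 2.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

TAGS = {
    "PS": "CTGGAAGTGCTGTTTCAGGGCCCG",
    "T7": "ACCCATTTGCTGTCCACCCGTCATGCTAGCCAT",
    "AT": "AAATATTATAAATATTAT",
    "SER": "AAATCATCATCATCATCA",
    "H": "AAACCATACGATGGTCCA",
    "G": "AAAAGTAAAGGAGAAGAA",
    "R": "AAAGTCATCCTTGTAATC",
}


def at_percent(seq: str) -> int:
    """AT content in percent, rounded down (assumption matching Table 2)."""
    return 100 * sum(seq.count(base) for base in "AT") // len(seq)


def translate(seq: str) -> str:
    """Translate an in-frame DNA sequence with the standard genetic code."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


if __name__ == "__main__":
    for name, seq in TAGS.items():
        print(f"{name:4s} {translate(seq):12s} {at_percent(seq):3d}% AT")
```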

Tag variation template screening

The expression efficiencies of plasmid templates containing the non-tagged targets were first quantified by performing analytical scale CECF reactions in the P-CF mode in presence of 35S-Met. With CHRM3, CCR2, CRTH2 and CXCR2, expression levels in the range of 1 mg/ml could already be obtained, while almost no expression was detectable with CCR3 and GLP1R. The tag variation was subsequently performed with the targets CRTH2, GLP1R, CCR2, CCR3 and CHRM3 and the expression with the purified DNA templates was analyzed in analytical scale CECF reactions in the P-CF mode (Fig. 2). The tags giving apparently best expression yields for each target were identified as CRTH2 + H-tag, GLP1R + AT-tag, CCR2 + H-tag, CCR3 + AT-tag and CHRM3 + G-tag. The selected fragments were inserted into the pET21a(+) vector and the resulting constructs, in addition to plasmids containing the targets with the T7-tag, were used as CF expression templates (Fig. 3). The expression yields of CCR3 and GLP1R could be increased from 0.08 to 0.86 mg/ml and from 0.08 to 2.33 mg/ml, respectively. The non-modified constructs of CRTH2, CCR2 and CHRM3 were already expressed in relatively high efficiencies and thus the improvement was, as expected, only moderate, from 1.11 to 1.38 mg/ml for CRTH2 and from 0.93 to 1.73 mg/ml for CCR2. For CHRM3 with an initial yield of 1.4 mg/ml, no further significant improvement could be obtained (Fig. 3). The expression levels of all targets were finally in preparative ranges from 0.86 to 2.33 mg/ml RM. It should be noted that all targets showed significant variations in their expression yields during the screen, underlining the tremendous impact of the short 5-prime tags on the production efficiency. Furthermore, all tags showed high expression levels with at least one of the targets (Fig. 2). Clearly worst was the R-tag, which was only efficient with CHRM3. The GPCRs CRTH2, GLP1R and CCR2 showed high expression yields with almost all other expression tags. Most specific was the expression of CCR3, which was only efficient with the AT-tag. The T7-tag showed almost no expression at all with CCR3 and CHRM3.

Individual expression tag optimization

Proteolytic removal of the expression tags is usually inefficient with detergent solubilized membrane proteins, and leaving them attached to the target proteins will therefore be the preferred option. However, effects on the function or stability of the protein as well as problems upon structural analysis by crystallization or NMR spectroscopy could result. We therefore exemplified the further optimization of an identified optimal tag by the sequential size reduction of the AT-tag in the AT-GLP1R construct (Fig. 4). First, the PreScission site was removed and then the residual sequence of the AT-tag was shortened successively down to one codon. Expression templates were generated by PCR and analyzed in analytical scale CECF reactions by using the P-CF mode and quantification by 35S-Met incorporation. The deletion of the PreScission site showed no significant change of the expression yield (Fig. 4). Even the addition of a single codon (AAA) resulted in more than 10-fold increased expression yields in comparison with the non-tagged native sequence. The successive extension of the AT-tag up to the six codon full-length size mostly resulted in only small increases in the expression efficiencies. A remarkable exception was the four codon (AAA-TAT-TAT-AAA) AT-tag, resulting in a further strong boost of the expression efficiency of up to 30-fold, almost twice as high as with the full-length AT-tag (Fig. 4). This improvement of the AT4-tag compared with the AT6-tag could be observed with linear as well as with plasmid DNA templates.

CF expression efficiencies obtained with plasmid template versus linear templates

Linear DNA fragments generated by PCR can be used as templates for CF expression reactions, but a higher instability due to nuclease attack is often reported. We analyzed the expression efficiencies of five GPCRs in combination with their identified optimal N-terminal tags by using either linear DNA fragments or corresponding plasmid DNA based on the pET21a(+) vector as templates. The expression yields at final concentrations of 10 ng/µl RM were nearly identical (Fig. 5A). However, considering the stoichiometry, an approximately 3–4.5-fold higher molarity of linear templates was provided. The dependence of expression efficiency on the DNA template concentration was analyzed with increasing amounts of linear and plasmid template of the best expressing target AT-GLP1R and of AT-CCR3 as the target with the lowest yield. In both cases, the protein production could be increased with template concentrations until approximately 0.625 ng/µl RM for linear and 1.25 ng/µl RM for plasmid DNA and reached a relatively stable plateau above these concentrations (Fig. 5B and C).
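The stoichiometric argument follows from the fact that, at equal mass concentration, the molar concentration of a double-stranded DNA template is inversely proportional to its length. A minimal sketch of this conversion is given below. The average mass of 650 g/mol per base pair is a common approximation, and the two template lengths are purely hypothetical placeholders, since the exact construct sizes are not stated in the text.

```python
# Sketch: convert a DNA mass concentration into a molar concentration.
AVG_BP_MASS = 650.0  # g/mol per base pair (approximation, assumption)


def dna_nanomolar(ng_per_ul: float, length_bp: int) -> float:
    """Molar concentration (nM) of double-stranded DNA of the given length."""
    grams_per_litre = ng_per_ul * 1e-3  # 1 ng/µl equals 1e-3 g/l
    return grams_per_litre / (length_bp * AVG_BP_MASS) * 1e9


# Hypothetical lengths, for illustration only.
PLASMID_BP = 6500  # expression vector plus insert (assumed)
LINEAR_BP = 1700   # PCR fragment from P1 to P4 (assumed)

plasmid_nm = dna_nanomolar(10.0, PLASMID_BP)
linear_nm = dna_nanomolar(10.0, LINEAR_BP)
molar_ratio = linear_nm / plasmid_nm  # equals PLASMID_BP / LINEAR_BP

if __name__ == "__main__":
    print(f"plasmid: {plasmid_nm:.2f} nM, linear: {linear_nm:.2f} nM")
    print(f"linear template is {molar_ratio:.1f}-fold more concentrated in molar terms")
```

With these assumed lengths the ratio is about 3.8, which falls inside the 3–4.5-fold range given in the text; the ratio depends only on the two lengths, not on the chosen mass per base pair.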


Fig. 2. Tag variation template screening. Representative production of five GPCRs combined with seven N-terminal expression tags each and expressed in analytical scale CECF reactions in the P-CF mode. The corresponding precipitates were treated equally by solubilization in 10 µl SDS loading buffer, separation by 12% SDS–PAGE and immunoblotting with antibodies directed against the C-terminal poly(His)10 tags. The combinations with the best expression yield are: CRTH2 + H-tag, GLP1R + AT-tag, CCR2 + H-tag, CCR3 + AT-tag, CHRM3 + G-tag.

Fig. 3. Expression yields of optimized constructs. Constructs were expressed in analytical scale CECF reactions in the P-CF mode and quantified by 35S-Met incorporation. Values are averages of at least triplicate determinations. Yield improvements from templates with no tag to optimal tag are given in percent.
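The yield changes quoted in the text for the non-tagged and the best tagged templates can be converted into fold improvements with a short calculation. The sketch below uses only the mg/ml values reported in the text; the dictionary layout and function name are illustrative choices, and the percent scale (non-tagged template set to 100%) is an assumption about how the figure is normalized.

```python
# Sketch: fold improvement of the optimized templates over the non-tagged ones.
# Values in mg/ml RM as quoted in the text: (non-tagged, optimal tag).
YIELDS = {
    "CCR3": (0.08, 0.86),
    "GLP1R": (0.08, 2.33),
    "CRTH2": (1.11, 1.38),
    "CCR2": (0.93, 1.73),
}


def fold_improvement(untagged: float, tagged: float) -> float:
    """Ratio of the optimized yield to the non-tagged yield."""
    return tagged / untagged


FOLD = {name: fold_improvement(*pair) for name, pair in YIELDS.items()}

if __name__ == "__main__":
    for name, fold in FOLD.items():
        # Assumed normalization: non-tagged template = 100%.
        print(f"{name:6s} {fold:5.1f}-fold ({100 * fold:.0f}% of non-tagged)")
```

The two targets with almost no initial expression (CCR3 and GLP1R) gain roughly 11-fold and 29-fold, whereas the targets that already expressed well improve by less than 2-fold.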

Quality evaluation of the CF produced GPCRs

The quality of CF produced GPCRs depends on a multitude of parameters such as expression mode, reaction conditions and type and concentration of hydrophobic environments. P-CF synthesized CXCR2 was initially solubilized in 1% Fos16 and the first detergent was subsequently exchanged into a number of second detergents upon IMAC purification. The eluted fractions were then analyzed for homogeneity by SEC and for stability with the thermofluor assay by monitoring the fluorescence of the thiol-reactive dye 7-diethylamino-3-(4'-maleimidylphenyl)-4-methylcoumarin (CPM) during temperature induced denaturation (Fig. 6). The thermofluor profiles are shown as first derivative, which simplifies the Tm determination indicated by the profile minima. CXCR2 showed best homogeneity with a Tm of 47 °C if it remained in Fos16. Exchange into the detergents Brij35, Fos12 or DDM resulted in partial aggregation and reduced stability. The other five P-CF produced GPCRs were therefore solubilized in Fos16 and purified by IMAC. Approximately 50% of the synthesized protein according to the 35S-Met incorporation could be recovered in the elution fractions. Three general parameters of quality were evaluated: secondary structure formation, thermostability and sample homogeneity. Secondary structures were analyzed by CD spectroscopy after dialyzing samples against 10 mM HEPES (pH 7.0), 5 mM NaCl, 0.0053% Fos16 (10× CMC). GPCRs are considered to be mostly α-helical proteins and all five CD spectra show a dominant α-helical fold with the typical minima at 222 and 208 nm wavelength (Fig. 7A). In the thermofluor assay, all five GPCRs showed relatively similar melting temperatures in Fos16, between 47 and 55 °C (Fig. 7B). Upon SEC elution profiling, mixtures of heterogeneous GPCR populations were detected (Fig. 7C). Generally, a larger fraction ranging from 250 to 320 kDa was present, while a minor part corresponding to putative oligomers eluted at higher molecular masses corresponding to 420–500 kDa. Best peak profiles containing mostly the lower molecular mass fraction were obtained with CXCR2 and CCR3, while a mixture of two populations is more pronounced with CCR2, GLP1R and CRTH2. The size ratio between the two populations is roughly 1:1.75 and the equilibrium could be shifted towards the lower molecular weight fraction by the addition of 20 mM DTT (data not shown). Small peaks in the void volume visible for CCR2, CHRM3 and CRTH2 could be identified as nucleic acid contaminations (data not shown) and they were eliminated by the addition of RNase A in the samples of CCR3 and GLP1R (Fig. 7C).

Fig. 4. Individual expression tag optimization. The N-terminal AT-tag of the optimized AT-GLP1R construct was successively shortened in order to define the best compromise of protein modification and expression optimization. All constructs were P-CF expressed in triplicate in presence of 35S-Met (50 nM) and quantified by scintillation counting. The expression yield of the non-modified constructs was set as 100%. Plasmid or linear PCR fragments were used as documented. The remaining number of codons of the AT-tag derivatives is indicated. PS: tag includes PreScission protease recognition site.

Fig. 5. Optimization of DNA template concentration. (A) Expression yield with linear and circular DNA templates. Tag optimized constructs were expressed in analytical scale P-CF reactions in presence of 35S-Met from linear PCR and circular plasmid DNA templates and quantified by scintillation counting. The expression yield is shown in percent with the circular plasmid template set as 100%. Both templates were used in final concentrations of 10 ng/µl RM. (B) Gradient with linear DNA templates; (C) Gradient with plasmid DNA templates. The two targets with the highest (AT4-GLP1R) and the lowest (AT-CCR3) expression yield were expressed in analytical scale P-CF reactions with increasing amounts of DNA template. The expression yield was quantified by scintillation counting of incorporated 35S-Met and is shown in percent with the highest value set as 100%. Plateau phase expression started for linear template at 0.625 ng/µl and plasmid template at 1.25 ng/µl.

Discussion

We demonstrate a fast screening approach for the improvement of CF production efficiencies of membrane proteins based on improving the well recognized critical step of translation initiation [27–29]. The first codons downstream of the translational start site play a crucial role in modulating the efficiency of the translation process and this area is sometimes even referred to as the downstream box [19,30]. The important characteristics of high AT content [7] and presence of a triple A sequence as second codon [8] were considered in our tag design. Bioinformatic programs are available that help to suppress potentially inhibitory secondary structure formations of mRNAs by predicting sequence optimizations [31,32]. However, those predictions are target dependent and corresponding studies attempted to find folding patterns or energy models explaining empirically determined expression yields [2,3,33]. A recent study identified a critical area of 42 nucleotides from −4 to +37 with regard to the start codon, explaining most of the observed expression yield variations in a GFP mutation study [2]. While our expression tags cover exactly this region, the individual tags had quite variable effects on the expression levels of different targets. Critical mRNA regions affecting the translation efficiency must therefore extend beyond the proposed 42 nucleotide region, thus making reliable predictions even more difficult. Considering that additional parameters such as codon usage may play a role as well [1], the design of universally applicable 5-prime expression enhancing sequences may remain elusive. High AT content of tags or AAA as second codon are furthermore not exclusive parameters for mediating efficient translation. The H-tag had in general very beneficial effects although having, with 55%, the lowest AT content of the five rationally designed tags, whereas the AT-tag showed lower expression yields in cases of CCR2 and CHRM3 despite the 100% AT content.
Accordingly, the AAA codon alone also did not result in high expression yields, as indicated by the generally low efficiency of the R-tag or by the failure of most tags to enhance the expression of CCR3. Both observations support the postulation that mRNA folding at the translation initiation region is controlled neither by its intrinsic sequence composition alone nor by the presence of distinct sequence motifs. The central problem of inefficient initiation of translation in CF expression systems has been addressed before in several approaches. The 5-prime sequence of the coding region of chloramphenicol acetyltransferase was effective in enhancing translation of several CF expressed soluble proteins. Furthermore, a set of naturally occurring signal sequences was screened for their efficiencies to boost the expression of soluble human erythropoietin [8] and the 21 amino acid signal sequence of OmpA resulted in a 10-fold increase of the expression efficiency up to almost 0.6 mg/ml RM. However, this tag was found to be completely inefficient in another screen with a small soluble domain of adiponectin as target [20]. Here, rather larger proteins as N-terminal fusion partners, such as peptidyl-prolyl cis–trans isomerase, were found to be effective in enhancing translation. The overall consensus is that the efficiency of individual expression tags obviously depends on the attached target coding sequence. Our approach therefore differs from previous reports as we have designed a tag variation strategy rather


Fig. 6. Quality variation of CXCR2. CXCR2 was expressed in the P-CF mode and in the CECF configuration, resolubilised with 1% Fos16 in 20 mM HEPES, pH 7.0, 200 mM NaCl, 20 mM DTT, 5% glycerol and purified by IMAC with simultaneous exchange into different detergents. (A) Elution profiles after separation on a Superdex 200 3.2/30 gel filtration column. (B) Purified protein was mixed with 250 µM CPM and a temperature gradient was applied from 25 to 92 °C using a MX3005 qPCR machine. The change in CPM fluorescence signal was monitored at 492 nm upon excitation at 350 nm wavelength. The first derivative of the raw data is presented, with the minima indicating the Tm. Brij35: <25 °C, Fos12: <25 °C, Fos16: 47 °C, DDM: 37 °C.

Fig. 7. Evaluation of sample quality. Optimized GPCR templates were expressed in the P-CF mode and in the CECF configuration, resolubilised with 1% Fos16 in 20 mM HEPES, pH 7.0, 200 mM NaCl, 20 mM DTT, 5% glycerol and purified by IMAC. (A) Purified proteins were dialyzed overnight at 4 °C against 400 times sample volume of 10 mM HEPES, pH 7.0, 5 mM NaCl, 1% glycerol, 10 CMC Fos16. CD spectra of the protein samples were measured from 260 to 200 nm wavelength using a Jasco CD spectrometer and 1 mm quartz cuvettes. (B) Purified proteins were mixed with 250 µM CPM and a temperature gradient was applied from 25 to 92 °C using a MX3005 qPCR machine. The change in CPM fluorescence signal was monitored at 492 nm upon excitation at 350 nm wavelength. The first derivative of the raw data is presented, with the minima indicating the Tm. CCR2: 55 °C, CCR3: 50 °C, CHRM3: 52 °C, GLP1R: 47 °C, CRTH2: 47 °C, CXCR2: 47 °C. (C) Elution profiles after separation on a Superdex 200 3.2/30 gel filtration column.
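The Tm readout described in the captions of Figs. 6 and 7, taking the minimum of the first derivative of the fluorescence trace over temperature, can be sketched as follows. This is an illustrative example only and not the analysis software used in this study. The function names are hypothetical, and the trace is synthetic: a two-state curve with an assumed midpoint of 47 °C, drawn so that the transition appears as a derivative minimum as in the captions. Depending on how a real trace is plotted, the sign of the signal may have to be inverted first.

```python
import math


def first_derivative(temps, signal):
    """Central-difference estimate of d(signal)/dT at the interior points."""
    return [
        (signal[i + 1] - signal[i - 1]) / (temps[i + 1] - temps[i - 1])
        for i in range(1, len(temps) - 1)
    ]


def melting_temperature(temps, signal):
    """Return the temperature at which the first derivative is minimal."""
    deriv = first_derivative(temps, signal)
    i_min = min(range(len(deriv)), key=deriv.__getitem__)
    return temps[i_min + 1]  # +1 because the derivative skips the first point


# Synthetic trace (assumption): 25 to 92 °C in 1 °C steps, midpoint at 47 °C.
temps = list(range(25, 93))
signal = [1.0 / (1.0 + math.exp((t - 47) / 2.0)) for t in temps]

print(f"Estimated Tm: {melting_temperature(temps, signal)} °C")
```

Real traces are noisier than this synthetic curve, so some smoothing before differentiation would probably be needed in practice.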

than propagating distinct tags as being universal. We have combined previous observations and adjusted our protocol to the production of membrane proteins, as their posttranslational proteolytic processing is usually very inefficient and associated with unacceptable sample losses. The tags are therefore small and designed as essentially permanent attachments, and we demonstrate that

individual tag size screening can even result in further considerable yield improvement. Using linear DNA fragments as expression templates is one important benefit of CF expression systems [8,19–21]. In contrast to previous reports recommending high concentrations or specially treated extracts [8,20], we could further show that even low concentrations of only approximately 1 ng/µl RM


of linear DNA templates are already sufficient to produce preparative scale amounts of GPCRs, resulting in a considerable reduction of costs and workload. Only a few GPCRs have so far been produced at preparative scale in E. coli cells [34–36], while yields are usually far below mg amounts per liter of cell culture [37,38]. In addition, GPCRs are often expressed as large fusion proteins, or they are synthesized as inclusion bodies which may be difficult to refold. After tag variation, all analyzed GPCRs could finally be CF synthesized at preparative scale. This is the first report of preparative scale CF expression for all six GPCRs, and CCR3 and GLP1R were only synthesized after tag variation. CF expression is much faster than conventional in vivo expression and can be completed within 24 h [37,39–42]. The preliminary quality analysis of the P-CF produced GPCRs gave evidence of folded proteins after their resolubilization in detergent, and CD spectroscopy revealed predominantly alpha-helical folds, as expected for GPCRs. SEC profiling indicated mostly putative monomeric samples for CXCR2 and CCR3 in Fos16, while larger amounts of putative oligomeric states were present for CCR2, CHRM3, GLP1R and CRTH2. In addition, the determined thermal stabilities between 47 and 55 °C match well with the reported melting points of other GPCRs [43]. The production of GPCRs in CF systems based on E. coli extracts is attracting increasing interest. Homogeneous samples of the porcine and human vasopressin type 2 receptors, the human endothelin B receptor and the human corticotropin releasing factor receptor, in mostly dimeric forms, could be obtained after P-CF expression as analyzed by electron microscopy [44]. Ligand binding activity of several CF expressed GPCRs has been reported [12,14,15,17,44,45].
However, it should be noted that a considerable number of CF expression conditions have to be analyzed in order to determine the ideal reaction conditions, with respect to expression mode or type of supplied detergent, for the production of high quality samples suitable for ligand binding experiments [12]. The presented tag variation protocol provides an efficient and reliable strategy to rapidly gain access to sufficient amounts of membrane protein samples, but it probably has to be followed by quality optimization approaches implementing expression condition screens.

Conclusion

We have extended previous strategies by (I) demonstrating that the tag variation approach is applicable to the CF production of large eukaryotic membrane proteins, (II) showing that small tags comprising not more than six codons are sufficient as translation enhancers, (III) demonstrating that, after identification, target specific tags can be further optimized by serial truncations and (IV) showing that linear PCR products constructed for the tag variation screening can already result in preparative scale CF expression of GPCRs without further processing. The whole process can be organized into a three-step protocol: first screening different expression tags, then individually optimizing tag sizes in order to avoid posttranslational processing steps, and finally evaluating the optimal template concentration. We further present the sequences of four efficient expression tags, the AT-tag, the H-tag, the G-tag and the SER-tag, which can be recommended as a comprehensive pool for tag variation screening. Preliminary analysis indicated significant quality variation of the GPCRs depending on the selected expression conditions, while solubilization in Fos16 resulted predominantly in mixtures of putative monomers and oligomers.

Acknowledgments

This work was supported by the Collaborative Research Center (SFB) 807 of the German Research Foundation (DFG). We further


thank the European Drug Initiative on Channels and Transporters (EDICT), contract number HEALTH-F4-2007-201924, the European initiative on Structural Biology of Membrane Proteins (SBMP), contract number PITN-GA-2008-211800, and the NIH (grant number U54 GM094608) for funding.

References

[1] K.E. Griswold, N.A. Mahmood, B.L. Iverson, G. Georgiou, Effects of codon usage versus putative 5′-mRNA structure on the expression of Fusarium solani cutinase in the Escherichia coli cytoplasm, Protein Expr. Purif. 27 (2003) 134–142.
[2] G. Kudla, A.W. Murray, D. Tollervey, J.B. Plotkin, Coding-sequence determinants of gene expression in Escherichia coli, Science 324 (2009) 255–258.
[3] M.N. Hall, J. Gabay, M. Debarbouille, M. Schwartz, A role for mRNA secondary structure in the control of translation initiation, Nature 295 (1982) 616–618.
[4] B.S. Laursen, H.P. Sorensen, K.K. Mortensen, H.U. Sperling-Petersen, Initiation of protein synthesis in bacteria, Microbiol. Mol. Biol. Rev. 69 (2005) 101–123.
[5] M.H. de Smit, J. van Duin, Secondary structure of the ribosome binding site determines translational efficiency: a quantitative analysis, Proc. Natl. Acad. Sci. USA 87 (1990) 7668–7672.
[6] S.C. Makrides, Strategies for achieving high-level expression of genes in Escherichia coli, Microbiol. Rev. 60 (1996) 512–538.
[7] G. Qing, B. Xia, M. Inouye, Enhancement of translation initiation by A/T-rich sequences downstream of the initiation codon in Escherichia coli, J. Mol. Microbiol. Biotechnol. 6 (2003) 133–144.
[8] J.H. Ahn, M.Y. Hwang, K.H. Lee, C.Y. Choi, D.M. Kim, Use of signal sequences as an in situ removable sequence element to stimulate protein synthesis in cell-free extracts, Nucleic Acids Res. 35 (2007) e21.
[9] M. Arima, T. Fukuda, Prostaglandin D2 receptors DP and CRTH2 in the pathogenesis of asthma, Curr. Mol. Med. 8 (2008) 365–375.
[10] S. Ali, B.J. Lamont, M.J. Charron, D.J. Drucker, Dual elimination of the glucagon and GLP-1 receptors in mice reveals plasticity in the incretin axis, J. Clin. Invest. 121 (2011) 1917–1929.
[11] F. Balkwill, Cancer and the chemokine network, Nat. Rev. Cancer 4 (2004) 540–550.
[12] F. Junge, L.M. Luh, D. Proverbio, B. Schafer, R. Abele, M. Beyermann, V. Dotsch, F. Bernhard, Modulation of G-protein coupled receptor sample quality by modified cell-free expression protocols: a case study of the human endothelin A receptor, J. Struct. Biol. 172 (2010) 94–106.
[13] C. Klammt, D. Schwarz, N. Eifler, A. Engel, J. Piehler, W. Haase, S. Hahn, V. Dotsch, F. Bernhard, Reprint of “Cell-free production of G protein-coupled receptors for functional and structural studies” [J. Struct. Biol. 158, 482–493], J. Struct. Biol. 159 (2007) 194–205.
[14] C. Klammt, M.H. Perrin, I. Maslennikov, L. Renault, M. Krupa, W. Kwiatkowski, H. Stahlberg, W. Vale, S. Choe, Polymer-based cell-free expression of ligand-binding family B G-protein coupled receptors without detergents, Protein Sci. 20 (2011) 1030–1041.
[15] G. Ishihara, M. Goto, M. Saeki, K. Ito, T. Hori, T. Kigawa, M. Shirouzu, S. Yokoyama, Expression of G protein coupled receptors in a cell-free translational system using detergents and thioredoxin-fusion vectors, Protein Expr. Purif. 41 (2005) 27–37.
[16] X. Wang, K. Corin, P. Baaske, C.J. Wienken, M. Jerabek-Willemsen, S. Duhr, D. Braun, S. Zhang, Peptide surfactants for cell-free production of functional G protein-coupled receptors, Proc. Natl. Acad. Sci. USA 108 (2011) 9049–9054.
[17] L. Kaiser, J. Graveland-Bikker, D. Steuerwald, M. Vanberghem, K. Herlihy, S. Zhang, Efficient cell-free production of olfactory receptors: detergent optimization, structure, and ligand binding analyses, Proc. Natl. Acad. Sci. USA 105 (2008) 15726–15731.
[18] K. Sansuk, C.I. Balog, A.M. van der Does, R. Booth, W.J. de Grip, A.M. Deelder, R.A. Bakker, R. Leurs, P.J. Hensbergen, GPCR proteomics: mass spectrometric and functional analysis of histamine H1 receptor after baculovirus-driven and in vitro cell free expression, J. Proteome Res. 7 (2008) 621–629.
[19] J.M. Son, J.H. Ahn, M.Y. Hwang, C.G. Park, C.Y. Choi, D.M. Kim, Enhancing the efficiency of cell-free protein synthesis through the polymerase-chain-reaction-based addition of a translation enhancer sequence and the in situ removal of the extra amino acid residues, Anal. Biochem. 351 (2006) 187–192.
[20] A.V. Kralicek, M. Radjainia, N.A. Mohamad Ali, C. Carraher, R.D. Newcomb, A.K. Mitra, A PCR-directed cell-free approach to optimize protein expression using diverse fusion tags, Protein Expr. Purif. 80 (2011) 117–124.
[21] T. Yabuki, Y. Motoda, K. Hanada, E. Nunokawa, M. Saito, E. Seki, M. Inoue, T. Kigawa, S. Yokoyama, A robust two-step PCR method of template DNA production for high-throughput cell-free protein synthesis, J. Struct. Funct. Genomics 8 (2007) 173–191.
[22] D. Schwarz, F. Junge, F. Durst, N. Frolich, B. Schneider, S. Reckel, S. Sobhanifar, V. Dotsch, F. Bernhard, Preparative scale expression of membrane proteins in Escherichia coli-based continuous exchange cell-free systems, Nat. Protoc. 2 (2007) 2945–2957.
[23] B. Schneider, F. Junge, V.A. Shirokov, F. Durst, D. Schwarz, V. Dotsch, F. Bernhard, Membrane protein expression in cell-free systems, Methods Mol. Biol. 601 (2010) 165–186.
[24] Y. Ma, D. Munch, T. Schneider, H.G. Sahl, A. Bouhss, U. Ghoshdastider, J. Wang, V. Dotsch, X. Wang, F. Bernhard, Preparative scale cell-free production and quality optimization of MraY homologues in different expression modes, J. Biol. Chem. 286 (2011) 38844–38853.


[25] E.W. Wilker, R.A. Grant, S.C. Artim, M.B. Yaffe, A structural basis for 14-3-3sigma functional specificity, J. Biol. Chem. 280 (2005) 18891–18898.
[26] K.M. Kim, R.R. Gainetdinov, S.A. Laporte, M.G. Caron, L.S. Barak, G protein-coupled receptor kinase regulates dopamine D3 receptor signaling by modulating the stability of a receptor-filamin-beta-arrestin complex. A case of autoreceptor regulation, J. Biol. Chem. 280 (2005) 12774–12780.
[27] M.H. de Smit, J. van Duin, Control of translation by mRNA secondary structure in Escherichia coli. A quantitative analysis of literature data, J. Mol. Biol. 244 (1994) 144–150.
[28] M. Kozak, Regulation of translation via mRNA structure in prokaryotes and eukaryotes, Gene 361 (2005) 13–37.
[29] M. Kozak, Initiation of translation in prokaryotes and eukaryotes, Gene 234 (1999) 187–208.
[30] M.L. Sprengart, E. Fuchs, A.G. Porter, The downstream box: an efficient and independent translation initiation signal in Escherichia coli, EMBO J. 15 (1996) 665–674.
[31] G.W. Hatfield, D.A. Roth, Optimizing scaleup yield for protein production: Computationally Optimized DNA Assembly (CODA) and Translation Engineering, Biotechnol. Ann. Rev. 13 (2007) 27–42.
[32] D.H. Mathews, D.H. Turner, M. Zuker, RNA secondary structure prediction, Curr. Protoc. Nucleic Acid Chem. Chapter 11 (2007) Unit 11.12.
[33] J. Duan, M.S. Wainwright, J.M. Comeron, N. Saitou, A.R. Sanders, J. Gelernter, P.V. Gejman, Synonymous mutations in the human dopamine receptor D2 (DRD2) affect mRNA stability and synthesis of the receptor, Hum. Mol. Genet. 12 (2003) 205–216.
[34] H. Ren, D. Yu, B. Ge, B. Cook, Z. Xu, S. Zhang, High-level production, solubilization and purification of synthetic human GPCR chemokine receptors CCR5, CCR3, CXCR4 and CX3CR1, PLoS One 4 (2009) e4509.
[35] K. Schroder-Tittmann, E. Bosse-Doenecke, S. Reedtz-Runge, C. Ihling, A. Sinz, K. Tittmann, R. Rudolph, Recombinant expression, in vitro refolding, and biophysical characterization of the human glucagon-like peptide-1 receptor, Biochemistry 49 (2010) 7956–7965.
[36] J.L. Baneres, J.L. Popot, B. Mouillac, New advances in production and functional folding of G-protein-coupled receptors, Trends Biotechnol. 29 (2011) 314–322.
[37] H.M. Weiss, R. Grisshammer, Purification and characterization of the human adenosine A(2a) receptor functionally expressed in Escherichia coli, Eur. J. Biochem. 269 (2002) 82–92.
[38] J.F. White, L.B. Trinh, J. Shiloach, R. Grisshammer, Automated large-scale purification of a G protein-coupled receptor for neurotensin, FEBS Lett. 564 (2004) 289–293.
[39] B. Wu, E.Y. Chien, C.D. Mol, G. Fenalti, W. Liu, V. Katritch, R. Abagyan, A. Brooun, P. Wells, F.C. Bi, D.J. Hamel, P. Kuhn, T.M. Handel, V. Cherezov, R.C. Stevens, Structures of the CXCR4 chemokine GPCR with small-molecule and cyclic peptide antagonists, Science 330 (2010) 1066–1071.
[40] F. Xu, H. Wu, V. Katritch, G.W. Han, K.A. Jacobson, Z.G. Gao, V. Cherezov, R.C. Stevens, Structure of an agonist-bound human A2A adenosine receptor, Science 332 (2011) 322–327.
[41] A. Rinken, K. Kameyama, T. Haga, L. Engstrom, Solubilization of muscarinic receptor subtypes from baculovirus infected Sf9 insect cells, Biochem. Pharmacol. 48 (1994) 1245–1251.
[42] S.J. Allen, S. Ribeiro, R. Horuk, T.M. Handel, Expression, purification and in vitro functional reconstitution of the chemokine receptor CCR1, Protein Expr. Purif. 66 (2009) 73–81.
[43] A.I. Alexandrov, M. Mileni, E.Y. Chien, M.A. Hanson, R.C. Stevens, Microscale fluorescent thermal stability assay for membrane proteins, Structure 16 (2008) 351–359.
[44] C. Klammt, A. Srivastava, N. Eifler, F. Junge, M. Beyermann, D. Schwarz, H. Michel, V. Doetsch, F. Bernhard, Functional analysis of cell-free-produced human endothelin B receptor reveals transmembrane segment 1 as an essential area for ET-1 binding and homodimer formation, FEBS J. 274 (2007) 3257–3269.
[45] J.P. Yang, T. Cirico, F. Katzen, T.C. Peterson, W. Kudlicki, Cell-free synthesis of a functional G protein-coupled receptor complexed with nanometer scale bilayer discs, BMC Biotechnol. 11 (2011) 57.