E unum pluribus: multiple proteins from a self-processing polyprotein


Review

TRENDS in Biotechnology

Vol.24 No.2 February 2006

Pablo de Felipe1, Garry A. Luke1, Lorraine E. Hughes1, David Gani2, Claire Halpin3 and Martin D. Ryan1

1 Centre for Biomolecular Sciences, School of Biology, Biomolecular Sciences Building, University of St Andrews, North Haugh, St Andrews, Scotland, UK, KY16 9ST
2 University of Birmingham, School of Chemical Sciences, Edgbaston, Birmingham, UK, B15 2TT
3 Plant Research Unit, School of Life Sciences, University of Dundee at SCRI, Invergowrie, Dundee, Scotland, UK, DD2 5DA

Many applications of genetic engineering require transformation with multiple (trans)genes, although achieving this using conventional techniques can be challenging. The 2A oligopeptide is emerging as a highly effective new tool for the facile co-expression of multiple proteins in a single transformation step, whereby a gene encoding multiple proteins, linked by 2A sequences, is transcribed from a single promoter. The polyprotein self-processes co-translationally such that each constituent protein is generated as a discrete translation product. 2A functions in all the eukaryotic systems tested to date and has already been applied, with great success, to a broad range of biotechnological applications: from plant metabolome engineering to the expression of T-cell receptor complexes, monoclonal antibodies or heterodimeric cytokines in animals.

Corresponding author: Ryan, M.D. ([email protected]). Available online 27 December 2005

Introduction

The introduction of novel traits or the repair of genetic lesions often requires the stable co-expression of multiple proteins within the same cell. Conventionally, multiple transgenes are inserted into the genome, with each driven by its own promoter. Because gene targeting is usually not possible, these transgenes are inserted randomly into the genome. The subsequent variation between transgenes in the site of insertion and in the number of copies inserted leads to two major problems: (i) potential transgene segregation in subsequent generations, and (ii) lack of coordination of the transcriptional activity of the disparate transgenes. An alternative approach is suggested by the replication mechanism of positive-stranded RNA viruses. Picornaviruses, such as poliovirus and foot-and-mouth disease virus (FMDV), encode all of their proteins within a single open reading frame (ORF) (Figure 1a). Individual virus proteins are derived from the polyprotein precursor by (auto)proteolytic processing. In the case of FMDV, the 18-amino-acid-long 2A region has a major role in polyprotein processing. This region is not a proteinase; rather, it is thought to mediate a novel type of co-translational, intraribosomal cleavage (Figure 1a). This discovery has led to a breakthrough in protein co-expression technology. Using this short sequence, multiple proteins can be co-expressed from a single mRNA by fusing multiple ORFs with intervening 2A sequences into a single, long ORF; furthermore, the resulting single polyprotein self-processes into multiple products, hence the phrase e unum pluribus, a commonly used, although grammatically incorrect, simple transposition of the US motto, taken to mean ‘out of one, many’.

Translational strategies of co-expression

Traditionally, proteins have been co-expressed by the creation of simple fusions or by fusing proteins using proteinase cleavage linker sequences [1]. The limitations of these strategies are: (i) fusing multiple proteins might compromise their functions; (ii) the fusion protein can only be targeted to a single subcellular site; (iii) when proteinase cleavage sites are incorporated, the polyprotein substrate and processing enzyme must be co-expressed in the same subcellular compartment. The first breakthrough in co-expression came from the discovery of internal ribosome entry sequences (IRESes). These sequences are able to direct ribosomes to initiate translation at internal sites within the mRNA. IRESes are, however, typically large (~500 nucleotides) and, importantly, translation of the second ORF is much lower (~10%) than that of the first ORF. Furthermore, problems can arise from competition between different IRESes [2]. By comparison, 2A is much shorter and yields essentially equimolar products.

2A-mediated cleavage

Initial studies on the processing of the FMDV polyprotein revealed that 2A, comprising the 18-amino-acid-long 2A region together with the N-terminal proline of protein 2B (immediately downstream of 2A), was able to mediate cleavage at its own C-terminus (Figure 1a) [3]. When this oligopeptide sequence was inserted between reporter proteins to create an artificial polyprotein, the constituent reporter proteins were cleaved apart at the C-terminus of 2A highly efficiently (Figure 1b) [4]. Early analyses of this cleavage reaction using cell-free translation systems (rabbit reticulocyte lysates, wheat germ extracts) yielded vital clues to the underlying

www.sciencedirect.com 0167-7799/$ - see front matter © 2005 Elsevier Ltd. All rights reserved. doi:10.1016/j.tibtech.2005.12.006


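[Figure 1 image not reproduced. Panel labels: (a) single, long open reading frame (ORF) comprising L, P1, P2 and P3, with 2A marked; translation and polyprotein processing; 2A sequence and flanking residues: -Q LLNFDLLKLAGDVESNPG P-. (b) CAT–2A–GUS polyprotein; translation and 2A-mediated cleavage give CAT–2A + GUS; translation profiles for pCATGUS, pCAT2AGUS, pCAT and pGUS, with bands labelled CAT–2A–GUS, CAT–GUS, GUS, CAT–2A and CAT.]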

Figure 1. 2A: from viral polyprotein to vector systems. (a) The FMDV polyprotein (shaded boxed areas) showing the position of 2A. Curved arrows above the boxed area indicate co-translational cleavages. The solid curved arrow indicates the 2A-mediated cleavage at its own C-terminus. The dashed arrows indicate proteolytic cleavages mediated by the two proteinases encoded elsewhere in the FMDV polyprotein: the L proteinase cleaving at its own C-terminus and the 3C proteinase (located within the P3 region) cleaving between the P2 and P3 polyprotein domains. The sequence of 2A is shown below, together with the cleavage site between proteins 2A and 2B. (b) Schematic representation of the first biotechnological use of 2A in the construction of a polyprotein comprising chloramphenicol acetyl-transferase (CAT) and β-glucuronidase (GUS) in a CAT–2A–GUS polyprotein. 2A mediates co-translational cleavage into CAT–2A and GUS proteins [3]. The panel inset shows translation profiles obtained using a cell-free rabbit reticulocyte-lysate translation system programmed with the control plasmids pCAT–GUS, pCAT and pGUS, together with the self-processing polyprotein encoded by pCAT–2A–GUS.

mechanism. Cleavage only occurred co-translationally, and the uncleaved products of translation (~10% of the total) could not produce the cleavage products post-translationally. Furthermore, proteins within the ORF that are encoded upstream of 2A accumulated to higher levels than those encoded downstream of this region. This imbalance was not considered to be the result of trivial effects, such as degradation of the transcript RNA or protein, but appeared to arise from differential levels of synthesis, whereby a proportion of ribosomes terminated at the cleavage site, leading to the observed imbalance when using these cell-free systems. From this, we concluded that the mechanism of 2A-mediated cleavage was not proteolytic [4–6]. A model was proposed in which 2A-mediated cleavage occurs through a manipulation of the translational apparatus (Box 1) [5–7], whereby promotion of the hydrolysis of the ester linkage between the nascent

Box 1. Translational model of 2A-mediated cleavage: ribosome skipping

The nascent 2A peptide is thought to interact with the exit tunnel of the ribosome, inducing a pause in translation. Those residues of 2A at the base of the exit tunnel (DxExNPG) are proposed to form a turn that shifts the ester linkage between the peptide and tRNA(Gly) away from the prolyl–tRNA (a sterically hindered nucleophile). The system becomes stalled, and prolyl–tRNA is unable to form a peptide bond. We propose that hydrolysis of the ester bond between the nascent peptide and tRNA(Gly) occurs within the ribosome, thereby releasing the translation product. Nascent proteins are never detected when still covalently bound to tRNA(Gly). We coined the term ‘CHYSEL’ (cis-acting hydrolase element) for 2A and 2A-like sequences, reflecting their ability to promote the hydrolysis of the ester linkage at this specific site by whatever mechanism. In this manner, 2A and the polypeptide upstream form a discrete translation product (Figure 1). Translation of the remainder of the ORF can then occur, in essence re-initiating from the prolyl–tRNA. 2A therefore appears to cause the synthesis of a specific glycyl–prolyl peptide bond to be skipped [5–7]. Expression of a 2A-containing artificial polyprotein in a range of yeast mutant cells confirms that certain translation factors affect the outcome of 2A-mediated cleavage, supporting the translational model for 2A-mediated cleavage (V. Doronina and J. Brown, personal communication).
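The outcome of the skipping model in Box 1 can be sketched in a few lines of Python. This is an illustrative toy, not a simulation of the ribosome: the function name `skip_products` and the short flanking sequences are assumptions made for the example, and only the FMDV 2A sequence and the DxExNPG/P motif are taken from the text.

```python
import re
from typing import List

# FMDV 2A as shown in Figure 1a; the proline that follows belongs to 2B.
FMDV_2A = "LLNFDLLKLAGDVESNPG"

# Conserved motif DxExNPG followed by P. The lookahead keeps the proline
# out of the match, so it stays with the downstream product.
SKIP_SITE = re.compile(r"D.E.NPG(?=P)")


def skip_products(polyprotein: str) -> List[str]:
    """Split a polyprotein after every ...NPG that is followed by P."""
    products = []
    start = 0
    for match in SKIP_SITE.finditer(polyprotein):
        products.append(polyprotein[start:match.end()])
        start = match.end()
    products.append(polyprotein[start:])
    return products


# Hypothetical placeholder sequences standing in for two real proteins.
upstream = "MAAAK"
downstream = "PMTTTK"  # begins with the proline at the skip site
products = skip_products(upstream + FMDV_2A + downstream)
print(products)
```

The first product carries 2A as a C-terminal extension and the second begins with proline, which mirrors the two consequences of 2A-mediated cleavage that a construct designer has to plan for.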


peptide and tRNA results in the release of the peptide: the term CHYSEL (cis-acting hydrolase element) has been coined to reflect this property of 2A (and similar sequences). Although this model was, in part, developed to account for the imbalance in products observed with cell-free translation systems, many cellular expression studies suggest that the same imbalance does not occur in cultured cells, plants or animals. We interpret these data

[Figure 2 image not reproduced. Panel labels: (a) EYFP, 2A, MT, ECFP, PAC; (b) EYFP, 2A, CR, ECFP, PAC; (c) GalT, EYFP, 2A, ECFP; GalT–EYFP–2A detected with anti-2A antibodies.]

Figure 2. Subcellular targeting of processing products from 2A self-processing polyproteins. Images were taken 36–48 hours post-transfection of human HeLa cells. Bar represents 10 μm. (a) A post-translational mitochondrial targeting signal (MT) directs the enhanced cyan fluorescent protein (ECFP) to the mitochondria, whereas the EYFP–2A remains in the cytosol (and diffuses into the nucleus). (b) The EYFP–2A (no signal sequence) is located in the cytosol, whereas the co-translational signal sequence (CR) directs the ECFP to the endoplasmic reticulum (ER). (c) The GalT co-translational signal sequence directs GalT–EYFP–2A to the Golgi apparatus, whereas the ECFP (no signal) is slipstream-translocated into the ER. Detection of the GalT–EYFP–2A protein by anti-2A rabbit antibodies (using a Cy5 secondary antibody from Jackson Immuno Research, Inc.; www.jacksonimmuno.com) is also shown. 2A: FMDV 2A oligopeptide; GalT: Golgi signal-anchor from the β-1,4 galactosyltransferase; CR: ER signal peptide of calreticulin; MT: mitochondrial signal of the subunit VIII of cytochrome c oxidase; PAC: puromycin N-acetyl transferase.


Box 2. Efficiency of cleavage is influenced by the length of 2A

Studies using cell-free translation systems with constructs containing 2A showed cleavage activities of ~90%. This was improved still further by using longer versions of 2A containing a few extra amino acids from the FMDV protein 1D, immediately upstream of 2A [13–15]. Most groups using FMDV 2A (usually the 20-amino-acid version) reported increased cleavage activity in tissue-cultured cells, and several were not able to detect uncleaved protein, although in one

as reflecting the limitations of cell-free translation systems compared with expression in cells. The 2A region remains as a C-terminal extension of the upstream protein, and this must be taken into account if the authentic C-terminus is required for activity or subcellular targeting: such proteins should be encoded at the C-terminus of the entire polyprotein. To date, there are only two reports where the presence of a 2A C-terminal extension has resulted in a slight reduction in enzymatic activity [8,9]. Conversely, the 2A extension can be used to detect expression [4,10,11] or localization using anti-2A antibodies (Figure 2c). All cleavage products downstream of 2A contain an N-terminal proline: these proteins are, however, metabolically stable [12]. 2A-mediated cleavage occurs in all the eukaryotic cells tested to date (mammalian, insect, yeast, fungal and plant), as shown in Tables S1 and S2 (Supplementary material), but not in bacteria [13]. Using cell-free translation systems, the length of the 2A sequence was shown to be important for cleavage, and a similar effect has been observed in cultured cells (Box 2) [8,13–22]. 2A-like sequences have been identified in other virus (poly)protein systems (Box 3) [7,13,14,19,23–25]. Different lengths of 2A and four different 2A (or 2A-like) sequences have been used for co-expression in vivo, and highly efficient cleavage was reported in all cases (Table 1). The key point for biotechnologists is that cellular expression results in multiple, discrete proteins (in essentially equimolar quantities), derived from a single ORF.

Co- and post-translational subcellular targeting of cleavage products in yeast, plant and mammalian cells

Signal sequences for post-translational targeting can also be included within 2A polyproteins, either up- or downstream of 2A (Figure 2a). Correct post-translational targeting to the nucleus [26,27], chloroplast [28,29], mitochondria [30], membranes [18] or cytosolic tubules,

report 17- and 24-amino-acid versions of FMDV 2A produced only 70% and 85% cleavage, respectively [16]. Cleavage activity has been increased by the introduction of a flexible linker at the N-terminus [17–19], although this effect might simply be the result of the insertion making the inhibitory residues more distal. It is clear that the proximal upstream context of the shorter versions of 2A can have an effect upon cleavage activity [8,20–22].
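The percentages quoted in Box 2 are ratios of cleaved to total translation product. A minimal sketch of that arithmetic follows, assuming band intensities from a labelled translation reaction that are normalised by the number of labelled residues in each product. The function name, the normalisation scheme and the numbers are illustrative assumptions, not data or methods taken from the studies cited.

```python
def cleavage_efficiency(uncleaved, upstream_product):
    """Return the percentage of polyprotein that was cleaved.

    Each argument is a (band_intensity, labelled_residue_count) pair.
    Dividing intensity by the number of labelled residues converts a
    signal into a relative molar amount. The upstream product is used
    here as the measure of cleaved material (an assumption).
    """
    uncleaved_mol = uncleaved[0] / uncleaved[1]
    cleaved_mol = upstream_product[0] / upstream_product[1]
    return 100.0 * cleaved_mol / (cleaved_mol + uncleaved_mol)


# Hypothetical readings: (intensity, labelled residues per molecule)
efficiency = cleavage_efficiency(
    uncleaved=(200.0, 20),
    upstream_product=(900.0, 10),
)
print(f"{efficiency:.1f}% cleaved")
```

Normalising before taking the ratio matters because the uncleaved polyprotein contains more labelled residues per molecule than either product, so raw intensities would overstate the uncleaved fraction.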

which are formed by the movement proteins of certain plant viruses [31], have all been reported. Because 2A works co-translationally, one aspect that increases the biotechnological value of this mechanism is the effect observed when co-translational signal sequences are included at different sites within the self-processing polyprotein. The co-translational nature of the cleavage means that polyproteins can be designed such that some proteins can be co-translationally targeted to the exocytic pathway, whereas others are post-translationally targeted to different cellular compartments. This provides a substantial enhancement compared with other polyprotein-based systems, which require post-translational (proteolytic) processing. Our analyses of polyproteins comprising a cytosolic protein followed by a protein that is co-translationally targeted to the secretory pathway showed that although the first protein was, indeed, located in the cytoplasm, the signal sequence of the second protein (encoded downstream of 2A) was recognized by the signal recognition particle (SRP), and the protein entered the exocytic pathway in both plant and mammalian cells (Figure 2b) [28,30]. To determine the effect of the presence of a co-translational signal sequence at the N-terminus of the polyprotein, a polyprotein comprising a protein co-translationally targeted to the secretory pathway followed by 2A and then GFP was expressed in yeast cells. The N-terminal protein containing the signal sequence entered the exocytic pathway, whereas the protein downstream of 2A (GFP with no signal sequence) was localized in the cytoplasm [10]. A similar type of construct comprising an ER-targeted version of GFP (erGFP), 2A and a gene conferring phleomycin resistance (ble) was expressed in tobacco plants. Here, too, the N-terminal protein (erGFP–2A) localized to the ER, whereas the C-terminal protein localized to the cytoplasm [28]. However, when the proteins flanking 2A both contained co-translational signal sequences, they both entered the ER [32,33]. It appears

Box 3. 2A-like sequences

Searching databases for the presence of the DxExNPGP motif (conserved among all picornavirus 2A sequences of this type) revealed the presence of 2A-like sequences in a range of mammalian and insect virus groups [7,13,14,23,24]. Analyses using cell-free translation systems showed that all of these 2A-like sequences were active, some being more efficient than the 20-amino-acid version of FMDV 2A. Analyses of the FMDV, Thosea asigna virus (TaV) and porcine teschovirus-1 (PTV-1) 2A sequences showed complete cleavage at these sites, whereas the equine rhinitis A virus (ERAV) 2A yielded barely detectable amounts of uncleaved product [14]. Detailed studies with PTV-1 2A indicated an almost equimolar ratio of products [19]. Another study, with a tricistronic construct using TaV and PTV-1 2As, showed complete cleavage at the TaV 2A site and increased cleavage at the PTV-1 2A site [19]. The availability of a range of 2A-like sequences can be useful when the presence of direct repeats within a construct might cause problems. Antibodies raised against 2A can also be used to detect the presence of a range of different 2A-like C-terminal extensions (de Felipe, P. and Ryan, M.D., unpublished results; J. Holst, A.L. Szymczak and D.A.A. Vignali, personal communication). The list of 2A-like sequences will expand as more virus genomes are sequenced, particularly those of insect viruses, although most remain untested. To date, the TaV 2A sequence is the shortest that gives complete cleavage in the specific polyproteins tested.
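A database search of the kind described in Box 3 amounts to scanning protein sequences for the conserved motif. The sketch below applies a regular expression for DxExNPGP to the four sequences listed in Table 1 (each with its following proline appended) and to one made-up negative control. The dictionary layout, the control sequence and the function name `find_2a_like` are assumptions for illustration only.

```python
import re

# DxExNPGP, where x is any residue.
MOTIF = re.compile(r"D.E.NPGP")

# Sequences from Table 1 with the downstream proline appended.
CANDIDATES = {
    "FMDV": "PVKQLLNFDLLKLAGDVESNPGP",
    "ERAV": "QCTNYALLKLAGDVESNPGP",
    "PTV-1": "ATNFSLLKQAGDVEENPGP",
    "TaV": "EGRGSLLTCGDVEENPGP",
    "control": "MKLAGDVESAPGP",  # hypothetical sequence lacking the motif
}


def find_2a_like(sequences):
    """Map each name to the 0-based start of the motif, or None."""
    hits = {}
    for name, seq in sequences.items():
        match = MOTIF.search(seq)
        hits[name] = match.start() if match else None
    return hits


hits = find_2a_like(CANDIDATES)
for name, position in hits.items():
    print(name, position)
```

A motif hit only flags a candidate. As Box 3 notes, activity still has to be confirmed experimentally, and most database hits remain untested.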


Table 1. 2A and 2A-like sequences used in biotechnology to date

Virus                                 Virus family    Host     2A or 2A-like sequence      Refs
Foot-and-mouth disease virus (FMDV)   Picornaviridae  Mammals  -PVKQLLNFDLLKLAGDVESNPG P-  (see legend)
Equine rhinitis A virus (ERAV)        Picornaviridae  Mammals  ----QCTNYALLKLAGDVESNPG P-  [19]
Porcine teschovirus-1 (PTV-1)         Picornaviridae  Mammals  -----ATNFSLLKQAGDVEENPG P-  [19,25]
Thosea asigna virus (TaV)             Tetraviridae    Insects  ------EGRGSLLTCGDVEENPG P-  [19,25]

A more comprehensive list of 2A-like sequences is maintained at http://www.st-andrews.ac.uk/ryanlab/Index.htm. The length of the FMDV 2A is that used in [19]. Other reports use different lengths of FMDV 2A (Supplementary material; Tables S1 and S2). The conserved –DxExNPG P– motif is shown above in bold.

that, in the case of plant and yeast cells, signal sequences are essential for each protein to enter the ER. In mammalian cells, in contrast to yeast and plant cells, both the N-terminal protein (with a signal sequence) and the protein downstream of 2A (with no signal sequence) were translocated into the ER [30]. This was shown for polyproteins containing an N-terminal cleaved signal sequence from calreticulin (CR) and a signal anchor from the type-II transmembrane protein β-1,4 galactosyltransferase (GalT) (Figure 2c). In mammalian cells, it was concluded that once a signal sequence is encountered and a translocon pore established, the ribosome remains attached to the translocon throughout the translation of the remainder of the ORF. Proteins encoded downstream of a protein containing a signal sequence are, at the least, susceptible to being transported through the established translocon complex: a phenomenon known as slipstream translocation [30]. The purely qualitative analyses we have performed with a limited range of constructs indicate that the presence of a second cleaved signal sequence, although not deleterious to the mechanism of protein targeting for the constructs we analyzed, was superfluous. However, efficient translocation of proteins into the ER, coupled with the generation of an authentic N-terminus through removal of the signal peptide by signalase, are both arguments for retaining the signal sequence of proteins located downstream of 2A. When a polyprotein comprising an N-terminal type-I transmembrane protein (Lyt2), 2A, and p21 (a protein containing a nuclear localization signal) was expressed, slipstream translocation of the p21 protein into the ER appeared not to have occurred, and analysis of transfected cells showed that p21 was active in the nucleus [18]. Truncated low-affinity nerve growth factor receptor (ΔLNGFR), another type-I transmembrane protein, was used to create a ΔLNGFR–2A–EGFP polyprotein [34].
In this case, the authors reported translocation of EGFP into the exocytic pathway. This protein appeared, however, to co-localize with the transmembrane protein encoded upstream of 2A on the plasma membrane of the cell, raising the possibility that the fluorescence arose from the uncleaved product (~20% for this construct). In contrast to the situation with secreted or type-II transmembrane proteins, when a ribosome translates the C-terminal cytosolic tail of a type-I transmembrane protein, the seal formed between the ribosome and the translocon pore in the ER membrane must be opened to enable the cytosolic domain to be extruded into the cytoplasm [35,36]. If 2A-mediated cleavage occurs during extrusion of this domain, one can envisage the protein downstream of 2A also exiting the complex through this gap in the seal. The presence of a post-translational signal could then target such a protein to another compartment, as in the case of p21 locating to the nucleus [18].

Co-translational subcellular targeting of cleavage products in mammalian cells: multiple signal sequences

Genes encoding the p40 and p35 subunits of IL-12 were assembled as a p40–2A–p35 polyprotein, with each constituent retaining its signal sequence. Following expression of this construct, increased levels of active IL-12 were present in the media [37], meaning that both subunits were transported correctly. The correct subcellular localization of proteins was also observed from polyproteins comprising tandem type-I transmembrane proteins [19,38]. Polyproteins with an N-terminal CR or GalT signal were modified by inserting a second type-II signal-anchor sequence downstream of 2A. In the presence of the GalT signal-anchor sequence, the protein downstream of 2A unexpectedly localized to the surface of the mitochondria instead of to its intended location, the Golgi. However, when a different type-II signal-anchor protein, simian virus 5 hemagglutinin-neuraminidase (SV5 HN), was inserted downstream of 2A, the first protein was correctly targeted to the exocytic pathway and the SV5 HN was localized, correctly, on the cell surface [30]. It appears, therefore, that certain type-II signal-anchor sequences located downstream of 2A do not function correctly. When multiple co-translational signal sequences are present, the first signal folds in its normal environment and can interact with SRP; however, subsequent, internal signal sequences fold in an aberrant environment within the extended tunnel of the ribosome–translocon complex. These studies have been performed with a limited number of signal sequences. Clearly, the ability of internal signal sequences in the polyprotein to direct correct subcellular localization needs further characterization, and this should be considered when designing polyproteins.
The biotechnological applications of 2A

Coordinated co-expression of multiple proteins

The broad range of biotechnological uses to which the 2A system has been applied is shown in Tables S1 and S2 (Supplementary material). 2A is particularly useful where stoichiometric co-expression is crucial. For example, IL-12 is a heterodimer (p35–p40), but p40 homodimers are antagonistic; therefore, 2A has been used to express active IL-12 without generating p40 homodimers [37]. Increased stoichiometric co-expression of the heavy and light chains of an antibody was accomplished using a 24-amino-acid version of 2A [39]. The authors reported correct dimerization of the two chains with no detectable excess of either chain in the supernatant of transfected cells. When


compared with a construct in which the two chains were co-expressed using an IRES, the 2A-based system produced a 16-fold higher yield of antibody. 2A facilitates the stringent co-expression of selectable and/or marker proteins with the protein(s) of interest [18,26,40]: long-term, stable expression has been observed in both plants [8,28,29,32,33,41–43] and animals [19–21,25,27,39,44–46].

Use of 2A in plant biotechnology

The introduction of a desirable trait into an organism often requires the co-expression of multiple genes, and the particular problems associated with co-expressing multiple genes introduced into plants have been discussed recently [47–50]. Transgenes can be stacked by crossing singly transformed plants or by a process of reiterative transformation plus selection. However, different transgenes integrated into different loci are subject to segregation in subsequent generations, resulting in the loss of the desired trait. Linking the genes into a single transcription unit that encodes a self-processing polyprotein can overcome the problem of a genetically unstable product. 2A has also been used in several examples of metabolome engineering and for the introduction of novel product traits. For example, higher plants synthesize carotenoids but cannot form ketocarotenoids. To confer this trait, the catalytic components of the astaxanthin pathway were introduced into tobacco and tomato plants by cloning the 4,4′-β-oxygenase (crtW) and 3,3′-β-hydroxylase (crtZ) genes derived from marine bacteria. In addition, an N-terminal chloroplast-targeting sequence (ssu) was fused to each gene, and they were assembled into a ssucrtW–2A–ssucrtZ polyprotein. Examination of these plants revealed that each enzyme had localized to chloroplasts, and metabolite profiling of the leaves confirmed the formation of ketolated carotenoids from β-carotene in addition to hydroxylated intermediates [29].
To give further examples: drought-resistant potatoes were produced by the co-expression of Zygosaccharomyces rouxii trehalose-6-phosphate synthase (ZrTPS1) and trehalose-6-phosphate phosphatase (ZrTPS2) proteins [43]; the heterologous expression of antimicrobial proteins (AMPs), or defensins, from Dahlia merckii seeds (DmAMP1) and Raphanus sativus seeds (RsAFP2) led to improved disease resistance in Arabidopsis thaliana [32]; and using 2A to co-express β and δ zeins (important sources of dietary sulfur) led to an improvement in the nutritional value of the host plants [33].

Recombinant viruses

The space available for the insertion of foreign sequences into certain virus vector systems is a limiting factor, and here the much shorter 2A sequence compares favorably with heterologous promoters or IRESes. 2A has been used to link proteins as N- or C-terminal extensions of virus polyproteins. Furthermore, a cassette system, comprising 2A–gene–2A, has also been used to introduce proteins at internal sites where the heterologous protein is excised from the flanking sequences (Supplementary material; Table S1).


The generation of an uncleaved, full-length translation product can also be advantageous. Santa Cruz and coworkers constructed a polyprotein comprising GFP and the capsid protein (CP) of potato virus X (PVX), linked by a short (17-amino-acid) version of 2A, which produced a mixture of cleaved and uncleaved products [51]. Although expression of a simple GFP–CP fusion protein did not produce virus particles, expression of the GFP–2A–CP polyprotein resulted in the formation of virus particles with GFP-decorated rods. It is thought that the CP cleavage product initiated rod formation, with the uncleaved fusion protein then being incorporated into developing virus particles [52–54]. In other plant virus systems, however, the presence of the uncleaved GFP–2A–CP fusion protein inhibited the efficient formation of virions [55]. 2A has been used to create recombinant viruses expressing reporter proteins or large amounts of protein of biotechnological significance. Epitopes have been inserted into the genomes of both negative-stranded (influenza virus) and positive-stranded (polio, Mengo, Kunjin, Dengue and Sindbis viruses) RNA viruses for vaccination purposes (Supplementary material; Tables S1 and S2).

Gene therapy

Gene therapy vectors incorporating 2A have been designed to deliver suicide genes [40,56], or the cell-cycle regulator p21, to destroy cancerous cells [18]. Vectors based upon adeno-associated virus (AAV) hold great promise but can encode no more than ~5 kb. The incorporation of 2A saves vital coding capacity and has been used, successfully, for the in vivo production of the heavy and light chains of monoclonal antibodies with antiangiogenic activity [39]. AAVs have also been designed using 2A to co-express the marker protein enhanced green fluorescent protein (EGFP) with potentially therapeutic proteins, such as human α-synuclein or Cu,Zn-superoxide dismutase [21]. Where protein expression levels are crucial, stringent marker co-expression can be of great assistance.
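The coding-capacity argument for AAV vectors can be made concrete with simple arithmetic. The sketch below compares the linker cost of an IRES (about 500 nucleotides, the typical size given earlier in this review) with that of a 24-amino-acid 2A (72 nucleotides) against a packaging limit of about 5 kb. The single-linker layout and the function name `remaining_capacity` are assumptions chosen for illustration.

```python
AAV_CAPACITY_NT = 5000   # approximate limit quoted in the text (~5 kb)
IRES_NT = 500            # typical IRES length quoted in the text
CODON_NT = 3


def remaining_capacity(linker_nt, n_linkers, capacity_nt=AAV_CAPACITY_NT):
    """Nucleotides left for everything else once linkers are counted."""
    return capacity_nt - linker_nt * n_linkers


two_a_nt = 24 * CODON_NT  # 24-amino-acid version of 2A
n_linkers = 1             # e.g. heavy chain, linker, light chain

with_ires = remaining_capacity(IRES_NT, n_linkers)
with_2a = remaining_capacity(two_a_nt, n_linkers)
print(f"IRES: {with_ires} nt left; 2A: {with_2a} nt left; "
      f"saving: {with_2a - with_ires} nt")
```

The saving scales with the number of linkers, so it grows for tricistronic or larger constructs. The remaining capacity still has to accommodate the promoter and other vector elements as well as the coding sequences.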
The human homeobox protein HOXB4 confers an advantage to ex vivo transduced hematopoietic stem cells during the process of cell repopulation. HOXB4 was co-expressed with EGFP in the form of a HOXB4–2A–EGFP polyprotein such that HOXB4 expression could be directly monitored by EGFP fluorescence [27]. By this means, the role of HOXB4 in hematopoiesis was clarified and a precise therapeutic range of HOXB4 expression levels was defined [44,46]. 2A has been used to co-express the HOXB4 and O6-methylguanine-DNA-methyltransferase (MGMT, P140K mutant) proteins to further enhance the in vivo selection of transfected bone marrow cells [20]. Two 2A-like sequences (TaV and PTV-1) have been combined to design a tricistronic vector that can co-express the α-L-iduronidase (IDUA) gene with two different reporter genes (luciferase and DsRed2) [25]. The DsRed2 marker enabled the detection of expression at the cellular level, whereas the luciferase marker enabled expression to be monitored at the tissue level using real-time, in vivo whole-body imaging. This construct directed increased expression of IDUA, a lysosomal enzyme


involved in glycosaminoglycan (GAG) degradation, which is deficient in mucopolysaccharidosis type I (MPS I), a condition currently treated by bone marrow transplantation. To date, the most complex 2A-based construct is the assembly of the four transmembrane proteins of the CD3 complex into a CD3δ–2A–CD3γ–2A–CD3ε–2A–CD3ζ polyprotein of ~700 amino acids [19]. In this study, 2A sequences from FMDV, TaV, and ERAV were used, each cleaving highly efficiently. This polyprotein was co-expressed with another construct encoding the α- and β-T-cell-receptor (TCR) chains, themselves co-expressed from a TCRα–2A–TCRβ polyprotein. The presence of the 2A C-terminal extensions did not affect the ability of the CD3δ, γ, ε or the TCRα subunits to assemble and produce a functional TCR–CD3 complex in mice. Although constraints imposed by the retrovirus packaging system would have precluded the assembly of all six components into one polyprotein, concerns regarding the ability of the cell to translate these longer recombinant ORFs should be weighed against the length of the FMDV polyprotein itself (~2300 amino acids) and many, much larger, naturally occurring eukaryotic proteins.

Biomedical applications: immune responses against 2A

Generally, peptides are not good immunogens in their own right, but their immunogenicity can be enhanced by chemically cross-linking them to larger carriers. For biomedical applications using 2A, one concern is that proteins expressed with such C-terminal extensions would, themselves, act as a carrier to stimulate an anti-2A immune response; this remains an open but important question. Any potential carrier effects could be abrogated by removal of 2A. This has been successfully accomplished by the inclusion of a furin proteinase cleavage site (RAKR) immediately upstream of 2A [39].
Gel and mass spectrographic analyses of an antibody expressed from a heavy chain–furin cleavage-site–2A–light chain construct showed that furin completely removed the 2A C-terminal extension from the heavy chain. Similarly, potentially therapeutic proteins or vaccines expressed in plants could have their 2A extensions removed by endogenous proteinases [32].

A bright, new CHYSEL in the toolbox

The breadth of the applications discussed above reflects the versatility of 2A in biomedicine and biotechnology. One of the major problems confronting biotechnologists, how to generate multiple proteins from a single RNA transcript, has perhaps already been solved by viruses, which have been evolving rapidly for millennia. Foot-and-mouth disease virus is one of the most infectious mammalian viruses, and the disease recently caused huge economic losses in the UK; perhaps it is providential that this mechanism, first identified by basic research into FMDV, can be adapted for positive purposes.

Acknowledgements

Antibodies against 2A were generously provided by D.A.A. Vignali (St. Jude Children’s Research Hospital, Memphis, USA; www.stjude.org). The authors wish to acknowledge the long-term support by the BBSRC (www.bbsrc.ac.uk) and the Wellcome Trust (www.welcome.ac.uk).

Supplementary data

Supplementary data associated with this article can be found at doi:10.1016/j.tibtech.2005.12.006

References

1 de Felipe, P. (2002) Polycistronic viral vectors. Curr. Gene Ther. 2, 355–378
2 Douin, V. et al. (2004) Use and comparison of different internal ribosomal entry sites (IRES) in tricistronic retroviral vectors. BMC Biotechnol. 4, 16
3 Ryan, M.D. et al. (1991) Cleavage of foot-and-mouth disease virus polyprotein is mediated by residues located within a 19-amino-acid sequence. J. Gen. Virol. 72, 2727–2732
4 Ryan, M.D. and Drew, J. (1994) Foot-and-mouth disease virus 2A oligopeptide mediated cleavage of an artificial polyprotein. EMBO J. 13, 928–933
5 Ryan, M.D. et al. (1999) A model for non-stoichiometric, cotranslational protein scission in eukaryotic ribosomes. Bioorg. Chem. 27, 55–79
6 Donnelly, M.L.L. et al. (2001) Analysis of the aphthovirus 2A/2B polyprotein ‘cleavage’ mechanism indicates not a proteolytic reaction, but a novel translational effect: a putative ribosomal ‘skip’. J. Gen. Virol. 82, 1013–1025
7 Ryan, M.D. et al. (2002) The aphtho- and cardiovirus ‘primary’ 2A/2B polyprotein ‘cleavage’. In The Picornaviruses (Semler, B. and Wimmer, E., eds), pp. 213–223, ASM Press
8 Ma, C. and Mitra, A. (2002) Expressing multiple genes in a single open reading frame with the 2A region of foot-and-mouth disease virus as a linker. Mol. Breed. 9, 191–199
9 Ansari, I.H. et al. (2004) Involvement of a bovine viral diarrhea virus NS5B locus in virion assembly. J. Virol. 78, 9612–9623
10 de Felipe, P. et al. (2003) Co-translational, intraribosomal cleavage of polypeptides by the foot-and-mouth disease virus 2A peptide. J. Biol. Chem. 278, 11441–11448
11 Mattion, N.M. et al. (1996) Foot-and-mouth disease virus 2A protease mediates cleavage in attenuated Sabin 3 poliovirus vectors engineered for delivery of foreign antigens. J. Virol. 70, 8124–8127
12 Varshavsky, A. (1992) The N-end rule. Cell 69, 725–735
13 Donnelly, M.L.L. et al. (1997) The cleavage activities of aphthovirus and cardiovirus 2A proteins. J. Gen. Virol. 78, 13–21
14 Donnelly, M.L.L. et al. (2001) The ‘cleavage’ activities of foot-and-mouth disease virus 2A site-directed mutants and naturally occurring ‘2A-like’ sequences. J. Gen. Virol. 82, 1027–1041
15 Hahn, H. and Palmenberg, A.C. (2001) Deletion mapping of the encephalomyocarditis virus primary cleavage site. J. Virol. 75, 7215–7218
16 Groot Bramel-Verheije, M.H. et al. (2000) Expression of a foreign epitope by porcine reproductive and respiratory syndrome virus. Virology 278, 380–389
17 Kinsella, T.M. et al. (2002) Retrovirally delivered random cyclic peptide libraries yield inhibitors of interleukin-4 signaling in human B cells. J. Biol. Chem. 277, 37512–37518
18 Lorens, J.B. et al. (2004) Stable, stoichiometric delivery of diverse protein functions. J. Biochem. Biophys. Methods 58, 101–110
19 Szymczak, A.L. et al. (2004) Correction of multi-gene deficiency in vivo using a single ‘self-cleaving’ 2A peptide-based retroviral vector. Nat. Biotechnol. 22, 589–594
20 Milsom, M.D. et al. (2004) Enhanced in vivo selection of bone marrow cells by retroviral-mediated coexpression of mutant O6-methylguanine-DNA-methyltransferase and HOXB4. Mol. Ther. 10, 862–873
21 Furler, S. et al. (2001) Recombinant AAV vectors containing the foot-and-mouth disease virus 2A sequence confer efficient bicistronic gene expression in cultured cells and rat substantia nigra neurons. Gene Ther. 8, 864–873
22 Lengler, J. et al. (2005) FMDV-2A sequence and protein arrangement contribute to functionality of CYP2B1–reporter fusion protein. Anal. Biochem. 343, 116–124

23 Ryan, M.D. et al. (2004) Foot-and-Mouth Disease Virus Proteinases. In Foot-And-Mouth Disease (Domingo, E. and Sobrino, F., eds), pp. 53–76, Horizon Bioscience
24 Szymczak, A.L. and Vignali, D.A.A. (2005) Development of 2A peptide-based strategies in the design of multicistronic vectors. Expert Opin. Biol. Ther. 5, 627–638
25 Osborn, M.J. et al. (2005) A picornaviral ‘2A-like’ sequence based tricistronic vector allowing for high level therapeutic gene expression coupled to a dual reporter system. Mol. Ther. 12, 569–574
26 Precious, B. et al. (1995) Inducible expression of the P, V, and NP genes of the paramyxovirus simian virus 5 in cell lines and an examination of the NP–P and NP–V interactions. J. Virol. 69, 8001–8010
27 Klump, H. et al. (2001) Retroviral vector-mediated expression of HOXB4 in hematopoietic cells using a novel co-expression strategy. Gene Ther. 8, 811–817
28 El Amrani, A. et al. (2004) Coordinate expression and independent subcellular targeting of multiple proteins from a single transgene. Plant Physiol. 135, 16–24
29 Ralley, L. et al. (2004) Metabolic engineering of ketocarotenoid formation in higher plants. Plant J. 39, 477–486
30 de Felipe, P. and Ryan, M. (2004) Targeting of proteins derived from self-processing polyproteins containing multiple signal sequences. Traffic 5, 616–626
31 Thomas, C.L. and Maule, A.J. (2000) Limitations on the use of fused green fluorescent protein to investigate structure–function relationships for the cauliflower mosaic virus movement protein. J. Gen. Virol. 81, 1851–1855
32 François, I.E.J.A. et al. (2004) Processing in Arabidopsis thaliana of a heterologous polyprotein resulting in differential targeting of the individual plant defensins. Plant Sci. 166, 113–121
33 Randall, J. et al. (2004) Co-ordinate expression of β- and δ-zeins in transgenic tobacco. Plant Sci. 167, 367–372
34 Amendola, M. et al. (2005) Coordinate dual-gene transgenesis by lentiviral vectors carrying synthetic bidirectional promoters. Nat. Biotechnol. 23, 108–116
35 Liao, S. et al. (1997) Both lumenal and cytosolic gating of the aqueous ER translocon pore are regulated from inside the ribosome during membrane protein integration. Cell 90, 31–41
36 Woolhead, C. et al. (2004) Nascent membrane and secretory proteins differ in FRET-detected folding far inside the ribosome and in their exposure to ribosomal proteins. Cell 116, 725–736
37 Chaplin, P.J. et al. (1999) Production of interleukin-12 as a self-processing polypeptide. J. Interferon Cytokine Res. 19, 235–241
38 Arnold, P.Y. et al. (2004) Diabetes incidence is unaltered in glutamate decarboxylase 65-specific TCR retrogenic nonobese diabetic mice: generation by retroviral-mediated stem cell gene transfer. J. Immunol. 173, 3103–3111
39 Fang, J. et al. (2005) Stable antibody expression at therapeutic levels using the 2A peptide. Nat. Biotechnol. 23, 584–590

40 de Felipe, P. et al. (1999) Use of the 2A sequence from foot-and-mouth disease virus in the generation of retroviral vectors for gene therapy. Gene Ther. 6, 198–208
41 Halpin, C. et al. (1999) Self-processing polyproteins – a system for coordinate expression of multiple proteins in transgenic plants. Plant J. 17, 453–459
42 Ma, C. and Mitra, A. (2002) Intrinsic direct repeats generate consistent post-transcriptional gene silencing in tobacco. Plant J. 31, 37–49
43 Kwon, S.J. et al. (2004) Genetic engineering of drought resistant potato plants by co-introduction of genes encoding trehalose-6-phosphate synthase and trehalose-6-phosphate phosphatase of Zygosaccharomyces rouxii. Korean J. Genet. 26, 199–206
44 Schiedlmeier, B. et al. (2003) High-level ectopic HOXB4 expression confers a profound in vivo competitive growth advantage on human cord blood CD34+ cells, but impairs lymphomyeloid differentiation. Blood 101, 1759–1768
45 Cao, Y.A. et al. (2005) Molecular imaging using labeled donor tissues reveals patterns of engraftment, rejection, and survival in transplantation. Transplantation 80, 134–139
46 Pilat, S. et al. (2005) HOXB4 enforces equivalent fates of ES-cell-derived and adult hematopoietic cells. Proc. Natl. Acad. Sci. U. S. A. 102, 12101–12106
47 Halpin, C. et al. (2001) Enabling technologies for manipulating multiple genes on complex pathways. Plant Mol. Biol. 47, 295–310
48 Halpin, C. and Boerjan, W. (2003) Stacking transgenes in forest trees. Trends Plant Sci. 8, 363–365
49 Halpin, C. and Ryan, M.D. (2004) Redirecting metabolism by coordinate manipulation of multiple genes. In Metabolic Engineering in the Post Genomic Era (Kholodenko, B. and Westerhoff, H., eds), pp. 377–408, Horizon Scientific Press
50 Halpin, C. (2005) Gene stacking in transgenic plants – the challenge for 21st century plant biotechnology. Plant Biotechnol. J. 3, 141–155
51 Cruz, S.S. et al. (1996) Assembly and movement of a plant virus carrying a green fluorescent protein overcoat. Proc. Natl. Acad. Sci. U. S. A. 93, 6286–6290
52 Roberts, A.G. et al. (1997) Phloem unloading in sink leaves of Nicotiana benthamiana: comparison of a fluorescent solute with a fluorescent virus. Plant Cell 9, 1381–1396
53 O’Brien, G.J. et al. (2000) Rotavirus VP6 expressed by PVX vectors in Nicotiana benthamiana coats PVX rods and also assembles into virus like particles. Virology 270, 444–453
54 Smolenska, L. et al. (1998) Production of a functional single chain antibody attached to the surface of a plant virus. FEBS Lett. 441, 379–382
55 Takeda, A. et al. (2004) The C terminus of the movement protein of Brome mosaic virus controls the requirement for coat protein in cell-to-cell movement and plays a role in long-distance movement. J. Gen. Virol. 85, 1751–1761
56 Nestler, U. et al. (1997) Foamy virus vectors for suicide gene therapy. Gene Ther. 4, 1270–1277
