Bioorganic & Medicinal Chemistry Letters xxx (xxxx) xxxx
Contents lists available at ScienceDirect
Bioorganic & Medicinal Chemistry Letters journal homepage: www.elsevier.com/locate/bmcl
Mutation of the start codon to enhance Cripavirus internal ribosome entry site-mediated translation in a wheat germ extract
Atsushi Ogawa⁎, Masashi Takamatsu
Proteo-Science Center, Ehime University, 3 Bunkyo-cho, Matsuyama, Ehime 790-8577, Japan
ARTICLE INFO
Keywords: Cell-free translation; Wheat germ extract; Internal ribosome entry site; Translational enhancer; Start codon
ABSTRACT
Wheat germ extract (WGE) is one of the most widely used eukaryotic cell-free translation systems for easy synthesis of a broad range of proteins merely by adding template mRNAs. Its productivity has thus far been improved by removing translational inhibitors from the extract and stabilizing the template with terminal protectors. Nonetheless, there remains room for increasing the yield by designing a terminally protected template with higher susceptibility to translation. Given the fact that a 5′ terminal protector is a strong inhibitor of the canonical translation, we herein focused on Cripavirus internal ribosome entry sites (IRESes), which allow for a unique translation initiation from a non-AUG start codon without the help of any initiation factors. We mutated their start codons to enhance the IRES-mediated translation efficiency in WGE. One of the mutants showed considerably higher efficiency, 3–4-fold higher than that of its wild type, and also 3–4-fold higher than the canonical translation efficiency by an IRES-free mRNA having one of the most effective canonical-translation enhancers. Because this mutated IRES is compatible with different types of genes and terminal protectors, we expect it will be widely used to synthesize proteins in WGE.
Cell-free translation systems are powerful tools for biochemically synthesizing a selected protein in vitro.1 They have an advantage over living cells in that they require no transformation of cells, thereby circumventing some tasks, as well as reducing the risk of biohazards and bioethical concerns. They also allow for synthesis of proteins that adversely affect cell growth. In particular, a wheat germ extract (WGE)-based system enables expression of a broad range of proteins in soluble forms. This is in contrast to the widely used E. coli-based systems, which are often less useful in translating eukaryotic proteins.2 In addition, the WGE activity for protein synthesis has been improved by extensively washing the wheat embryos when preparing the extract to remove translational inhibitors.3 Nonetheless, there remains room for increasing the productivity in WGE by designing a protein template (i.e., mRNA) with the following two characteristics: higher resistance to unremovable endogenous nucleases, and higher susceptibility to translation. In terms of the former, we have recently identified 5′ and 3′ terminal protector sequences whose rigid structures prevent the exonucleases from degrading in vitro-transcribed RNAs in WGE.4 As for the latter, another group previously selected a translational enhancer, called E01, which functions in the 5′ untranslated region (UTR) to enhance the canonical eukaryotic translation in WGE.5 However, E01 is useless in terminally protected mRNAs, because a 5′ terminal rigid protector inhibits eukaryotic
ribosome loading in canonical translation.6 This means an efficient internal ribosome entry site (IRES) is necessary instead of E01 to effectively utilize the terminally protected mRNAs. We herein report an IRES with a mutated start codon that exhibits considerably higher IRES-mediated translation efficiency in WGE, 3–4-fold higher than the canonical translation efficiency by IRES-free mRNA with E01 in the 5′ UTR.

mRNA translation generally starts on the AUG codon. This is because there is only one type of initiator tRNA (tRNAiMet), whose anticodon, CAU, is complementary to the canonical start codon, in contrast to various types of elongator tRNAs. At the beginning of translation, the initiator with the first amino acid (aa1: Met) occupies the P site of the ribosome on the AUG codon with the help of several initiation factors. An elongator then brings aa2 corresponding to the second codon into the A site to take aa1 from the initiator in the P site. The resulting peptidyl-tRNA (aa1-aa2-tRNA) translocates to the P site to make the A site vacant for the next elongator. This mechanism also applies to most IRES-mediated non-canonical translations.7 However, the intergenic region IRESes (IGR-IRESes) of Cripaviruses (and of Aparaviruses) are an exception, because they allow for translation from non-AUG codons without the need for any initiation factors.7 These atypical IRESes are composed of three domains (Fig. 1A), the upstream two of which directly recruit the ribosome, and the last of
⁎ Corresponding author. E-mail address: [email protected] (A. Ogawa).
https://doi.org/10.1016/j.bmcl.2019.126729
Received 19 August 2019; Received in revised form 25 September 2019; Accepted 1 October 2019
0960-894X/© 2019 Elsevier Ltd. All rights reserved.
Please cite this article as: Atsushi Ogawa and Masashi Takamatsu, Bioorganic & Medicinal Chemistry Letters, https://doi.org/10.1016/j.bmcl.2019.126729
A. Ogawa and M. Takamatsu
Fig. 1. The general structure of Cripavirus IGR-IRESes and partial sequences of their representatives, the PSIV IRES and CrPV IRES. (A) Schematic illustration of Cripavirus IGR-IRESes intervening between two ORFs. Green and blue letters indicate important structures and interactions for efficient IRES-mediated translation (SL: stem-loop; PK: pseudoknot), respectively. The domain 3 is composed of the SL-VI and the PK-I. (B and C) The sequences of the domain 3 and following 12 nucleotides (the start codon and wt4-12) in the PSIV IRES (B) and the CrPV IRES (C). Mutated start codons and their encoding amino acids are indicated by arrows. The 4-digit numbers represent nucleotide positions in each viral genome. Asterisks and large dots symbolize base-pair interactions for pseudoknots and helical duplexes, respectively.
which (domain 3) mimics a tRNA-mRNA complex to occupy the P site.8 This occupation of the P site enables an elongator to enter the A site and then be translocated in a unique way.9 In this manner, the first elongator functions as an initiator, meaning that any codon except for nonsense codons is, in principle, available as the start codon,10 although the start codon (and first amino acid) is moderately conserved among known species: CAA for Gln, GCU for Ala, and GCA for Ala.11

We herein focus on these Cripavirus IGR-IRESes for three reasons: (1) translation initiation, which is considered the rate-limiting step of translation, is expected to be fast in these IRESes because no initiation factor is required12 (in fact, they are known to function efficiently in various types of eukaryotic translation systems including WGE)10,12–15; (2) the start codon is not completely conserved among species, so that it might be possible to obtain IRESes more efficient than the wild type by exchanging these start codons, as reported with Aparavirus IGR-IRESes in yeast and HeLa cells16; and (3) mutating the highly sophisticated core structures of other IRESes (including the IGR-IRESes) is less likely to enhance the translation efficiency.

We chose two representative Cripavirus IGR-IRESes as the base IRESes for efficient IRES-mediated translation in WGE: the Plautia stali intestine virus (PSIV) IGR-IRES, and the cricket paralysis virus (CrPV) IGR-IRES (Fig. 1B and C, respectively). These IRESes were selected because they have been well studied,8–17 their structures have been resolved,8,9,18 and they function well in WGE.4,10,15,17 Their wild-type start codons are different and encode different amino acids: CAA for Gln in the PSIV IRES, and GCU for Ala in the CrPV IRES.
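The codon assignments mentioned above (CAA for Gln, GCU and GCA for Ala, and the exclusion of nonsense codons as start codons) can be checked against the standard genetic code. The following is a minimal sketch, not part of the original study; the helper names `codons_for` and `is_possible_start` are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code: bases ordered U, C, A, G at each codon position,
# with the third position varying fastest. "*" marks a nonsense (stop) codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_for(amino_acid):
    """Return all codons encoding the given one-letter amino acid (hypothetical helper)."""
    return [codon for codon, aa in CODON_TABLE.items() if aa == amino_acid]


def is_possible_start(codon):
    """In principle, any sense codon can serve as an IGR-IRES start codon (hypothetical helper)."""
    return CODON_TABLE[codon] != "*"


gln_codons = codons_for("Q")   # synonymous codons for Gln
ala_codons = codons_for("A")   # synonymous codons for Ala
sense_codons = [c for c in CODON_TABLE if is_possible_start(c)]

print("Gln codons:", gln_codons)
print("Ala codons:", ala_codons)
print("Sense codons available in principle:", len(sense_codons))
```

Enumerating synonymous codons in this way shows how small the Gln and Ala codon families are compared with the full set of sense codons that the IGR-IRES mechanism could, in principle, accept.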
Because the first amino acids in known Cripavirus IGR-IRESes are limited to Gln and Ala,11 as described above, we selected six codons encoding these two amino acids (CAA and CAG for Gln; GCU, GCA, GCG, and GCC for Ala) as start codon candidates to identify an efficient start codon in the chosen IGR-IRESes (Fig. 1B and C). We previously identified a region of the PSIV IRES that is required to maximize its translation efficiency in WGE: a 6004–6204 segment covering the start codon (CAA) and the following 9 nucleotides (wt4-12; Fig. 1B) in the second open reading frame (ORF2).17 Here, we ligated this segment or a slight variant of it with a different start codon (CAG, GCU, GCA, GCG, or GCC) into the YPet (yellow fluorescent protein) gene19 devoid of the first AUG codon through the SpeI site. In addition, a stem-loop structure beginning with a G triplet (5SL, a 5′ terminal
protector), which effectively inhibits mRNA degradation by endogenous 5′ exonucleases4,20 and the ribosome loading in the canonical translation6 in WGE, was fused to the 5′ terminus to construct six mRNA variants named PSIV(N1N2N3), where N1N2N3 represents the start codon (Fig. 2A, top). Incidentally, these mRNAs have a long (970 nt) 3′ UTR for protection from 3′ exonucleases.21 We also prepared a control mRNA (ctrl) for the canonical translation, which was composed of E01 in the 5′ UTR, the YPet gene with the first AUG codon, and the same 3′ UTR as PSIV(N1N2N3) (Fig. 2A, bottom). We then translated these seven mRNAs for 1 h in WGE and measured the fluorescence intensity of the expressed YPet as an index of the translation efficiency (Fig. 2B). The wild-type IRES mRNA, PSIV(CAA), exhibited moderate translational activity, which was 48% of that of the ctrl. This ratio is in good accord with a previous result using the firefly luciferase gene as the reporter gene.17 As for mRNAs with a mutated start codon, although PSIV(CAG), encoding the same first amino acid (Gln) as the wild type, was much less translated for some reason,22 the other four mRNAs (in which the start codon encoded Ala) were translated well, indicating that the PSIV IRES is highly tolerant of Ala-encoding start codons. In particular, three of these mRNAs, PSIV(GCA), PSIV(GCG), and PSIV(GCC), showed slightly (1.2–1.3-fold) higher efficiency than the wild type.

In terms of the CrPV IRES, we anticipated that a 6027–6228 segment corresponding to the 6004–6204 segment of the PSIV IRES would be optimal for the IRES-mediated translation, in light of the highly conserved core structure between these two IRESes. We thus used this segment or a slightly changed one with a different start codon to construct six mRNAs encoding YPet, CrPV(N1N2N3) (Fig. 2A, top), as in the preparation of PSIV(N1N2N3). Fig. 2C shows the results of 1-h translation of these mRNAs (including ctrl) in WGE.
The wild-type IRES mRNA, CrPV(GCU), was as efficiently translated as the ctrl, which means that CrPV(GCU) is about 2-fold more susceptible to translation than PSIV(CAA). This high efficiency, comparable to that of ctrl, was surprising, because ctrl has one of the most effective translational enhancers, E01, for highly efficient canonical translation in WGE. More surprisingly, all the mRNAs with a mutated start codon, whether it encoded Ala or Gln, were more efficiently translated than the wild type and ctrl. It should be noted that the translation efficiencies of CrPV(GCA), CrPV(GCG), and CrPV(GCC) were considerably (about 3-fold)
Fig. 2. IRES-mediated translation of terminally protected mRNAs that have a Cripavirus IGR-IRES (the PSIV IRES or CrPV IRES) with a wild-type or mutated start codon. (A) Schematic illustration of PSIV(N1N2N3) or CrPV(N1N2N3) (top), where N1N2N3 is the start codon, and ctrl for the canonical translation (bottom). 40S represents the small subunit of a eukaryotic ribosome. (B and C) The relative fluorescence intensities (i.e., relative translation efficiencies) of YPet translated from PSIV(N1N2N3) (B) or CrPV(N1N2N3) (C) in WGE. The far right bar represents the canonical translation efficiency of ctrl, which was used in both experiments. “WT” above the far left bar indicates the wild-type start codon.
higher.23 The set of start codons in these three mRNAs was identical to that of the three PSIV(N1N2N3) mRNAs that outperformed their wild type, PSIV(CAA), suggesting that Ala-encoding start codons other than GCU are preferable for allowing Cripavirus IGR-IRESes to efficiently initiate the IRES-mediated translation in WGE.

To investigate whether the start-codon preference of the CrPV IRES is also observed for mRNAs with a short 3′ terminal protector, we next
altered the long 970-nt 3′ UTR of the CrPV(N1N2N3) mRNAs that encode Ala as the first amino acid into a 36-nt protector named 4p5-10 (CrPV(N1N2N3)-3p; Fig. 3A). This short (and thus manageable) protector was recently in vitro-selected from a random pool to stabilize transcripts instead of a long (and thus cumbersome) 3′ UTR in WGE.4 In that report, mRNA that was protected with 4p5-10 showed a 5–6-fold higher translation efficiency than 3′ terminus-unprotected control mRNA with
Fig. 3. Compatibility of effective CrPV IRESes with a short (36-nt), manageable 3′ terminal protector, 4p5-10. (A) Schematic illustration of CrPV(N1N2N3)-3p (and, in parentheses, CrPV(GCU), a control mRNA in this experiment). (B) The relative fluorescence intensities (i.e., relative translation efficiencies) of YPet translated from CrPV(GCU) and CrPV(N1N2N3)-3p in WGE. Fluorescence images of translated YPet are shown above the graph.
a 3′ UTR of the same length. In the current study, we already had CrPV(GCU), with a 27-fold longer (970-nt) 3′ UTR, as a 3′ terminus-protected control mRNA (Fig. 3A, in parentheses). We thus evaluated how much these five mRNAs (four CrPV(N1N2N3)-3p and CrPV(GCU)) were translated in WGE (Fig. 3B). The 1-h translation efficiency of CrPV(GCU)-3p with 4p5-10 was almost identical to that of CrPV(GCU) with the 27-fold longer 3′ UTR, clearly showing the high protection ability of 4p5-10 despite its shortness. In addition, the other three mRNAs with 4p5-10 were 3–4-fold more efficiently translated than CrPV(GCU)-3p, with a start-codon preference of GCA > GCC > GCG ≫ GCU (wild type). These results agree with those of CrPV(N1N2N3), indicating that CrPV IRESes with a start codon encoding Ala are compatible with the short 3′ terminal protector, 4p5-10.

Finally, we exploited a different downstream gene to examine the versatility of efficient CrPV IRESes with a mutated start codon encoding Ala. Specifically, we altered the YPet gene in four mRNAs with 4p5-10 (CrPV(N1N2N3)-3p) into the Nanoluciferase (nLuc) gene (CrPV(N1N2N3)-nLuc-3p; Supplementary Fig. S1A). We incubated these gene-altered mRNAs for 1 h in WGE and estimated the translation efficiency with the chemiluminescence intensity of expressed nLuc in the presence of its substrate, furimazine (Supplementary Fig. S1B).24 As expected, all three mRNAs with a mutated start codon were more efficiently translated than the wild type, CrPV(GCU)-nLuc-3p, although the enhancement ratios were slightly lower than those in YPet-encoding mRNAs. In addition, the start-codon preference was largely conserved: GCA was the most effective one. These results suggest that the CrPV IRES with the GCA start codon is promising as a versatile IRES for achieving highly efficient translation in WGE.
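The fold values quoted in the text can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (970 nt, 36 nt, 48%, and 1.2–1.3-fold), plus one labeled assumption: that CrPV(GCU) being "as efficiently translated as the ctrl" corresponds to a ratio of exactly 1.0. The function name `scale_range` is hypothetical, and no raw fluorescence data are reproduced here.

```python
# Numbers quoted in the text.
LONG_UTR_NT = 970          # long 3' UTR length
SHORT_PROTECTOR_NT = 36    # 4p5-10 protector length
PSIV_WT_VS_CTRL = 0.48     # PSIV(CAA) efficiency relative to ctrl

# Assumption: "as efficiently translated as the ctrl" is taken as exactly 1.0.
CRPV_WT_VS_CTRL = 1.0


def scale_range(base, fold_low, fold_high):
    """Convert a fold range relative to `base` into a range relative to ctrl."""
    return (base * fold_low, base * fold_high)


# How much longer the 970-nt 3' UTR is than the 36-nt protector.
length_ratio = LONG_UTR_NT / SHORT_PROTECTOR_NT

# Wild-type CrPV IRES relative to wild-type PSIV IRES.
crpv_over_psiv = CRPV_WT_VS_CTRL / PSIV_WT_VS_CTRL

# PSIV Ala mutants (1.2-1.3-fold over PSIV wild type), expressed relative to ctrl.
psiv_mutants_vs_ctrl = scale_range(PSIV_WT_VS_CTRL, 1.2, 1.3)

print(f"3' UTR length ratio: {length_ratio:.1f}")
print(f"CrPV(GCU) / PSIV(CAA): {crpv_over_psiv:.2f}")
print("PSIV Ala mutants vs ctrl: "
      f"{psiv_mutants_vs_ctrl[0]:.2f} to {psiv_mutants_vs_ctrl[1]:.2f}")
```

Under these assumptions, the length ratio rounds to 27 and the CrPV(GCU)/PSIV(CAA) ratio is roughly 2, consistent with the "27-fold longer" and "2-fold" statements, while the best PSIV mutants remain below the ctrl level.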
In summary, we mutated the start codon of two representative Cripavirus IGR-IRESes (the PSIV IRES and the CrPV IRES) to enhance IRES-mediated translation of terminally protected mRNAs in WGE.
Surprisingly, almost all of the selectively constructed mutants were more efficiently translated than the wild type. In particular, the most efficient mutant, the CrPV IRES with a GCA start codon, showed considerably high IRES-mediated translation efficiency that was 3–4-fold higher than the canonical translation efficiency by IRES-free mRNA with E01, one of the most effective canonical-translation enhancers, in the 5′ UTR.25 Its versatility was also confirmed: the high efficiency was retained with two completely different 3′ protectors (3′ UTRs) and ORFs. mRNAs protected with 5SL beginning with a G triplet have another advantage over 5′-E01 mRNAs, which have a 5′ GAA at the terminus, in being easier to prepare in large amounts by in vitro transcription with T7 RNA polymerase. Therefore, the highly efficient IRES identified here is expected to be widely used in cooperation with terminal protectors to synthesize proteins in WGE.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment
This work was supported by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Numbers 16K05846 and 19K05697.

Appendix A. Supplementary data
Supplementary data to this article can be found online at https://doi.org/10.1016/j.bmcl.2019.126729.
References
1. Carlson ED, Gan R, Hodgman CE, Jewett MC. Biotechnol Adv. 2012;30:1185.
2. Madono M, Sawasaki T, Morishita R, Endo Y. New Biotechnol. 2011;28:211.
3. Madin K, Sawasaki T, Ogasawara T, Endo Y. Proc Natl Acad Sci USA. 2000;97:559.
4. Ogawa A, Kutsuna A, Takamatsu M, Okuzono T. Bioorg Med Chem Lett. 2019;29:2141.
5. Kamura N, Sawasaki T, Kasahara Y, Takai K, Endo Y. Bioorg Med Chem Lett. 2005;15:5402.
6. Ogawa A. ChemBioChem. 2009;10:2465.
7. Kieft JS. Trends Biochem Sci. 2008;33:274.
8. Costantino DA, Pfingsten JS, Rambo RP, Kieft JS. Nat Struct Mol Biol. 2008;15:57.
9. Pisareva VP, Pisarev AV, Fernández IS. eLife. 2018;7:e34062.
10. Shibuya N, Nishiyama T, Kanamori Y, Saito H, Nakashima N. J Virol. 2003;77:12002.
11. Au HHT, Jan E. PLoS One. 2012;7:e51477.
12. Hodgman CE, Jewett MC. New Biotechnol. 2014;31:499.
13. Sasaki J, Nakashima N. J Virol. 1999;73:1219.
14. Kamoshita N, Nomoto A, RajBhandary UL. Mol Cell. 2009;35:181.
15. Ogawa A, Masuoka H, Ota T. ACS Synth Biol. 2017;6:1656.
16. Hertz MI, Thompson SR. Nucleic Acids Res. 2011;39:7276.
17. Ogawa A. RNA. 2011;17:478.
18. Pfingsten JS, Costantino DA, Kieft JS. Science. 2006;314:1450.
19. Nguyen AW, Daugherty PS. Nat Biotechnol. 2005;23:355.
20. Ogawa A, Doi Y. Org Biomol Chem. 2015;13:1008.
21. Ogawa A, Tabuchi J, Doi Y. Bioorg Med Chem Lett. 2014;24:3724.
22. This is probably because the third nucleotide (G), perhaps together with surrounding bases, adversely affects the PSIV IRES core structure by interacting with its specific sequence, given the fact that CrPV(CAG) was well translated.
23. Although we used RNA-folding software to predict secondary structures of (cropped) CrPV(N1N2N3), there was no remarkable difference among them. Therefore, we presume that an elongator tRNA for GCU is scarcer in WGE and thus CrPV(GCU) showed the lower translation efficiency.
24. Hall MP, et al. ACS Chem Biol. 2012;7:1848.
25. Although we tried only Gln- and Ala-encoding codons to identify an efficient start codon in this study, there is a possibility that other start codons might show a higher translation efficiency.