TIBS 1515 No. of Pages 4
Forum
Designing an Elusive C•G→G•C CRISPR Base Editor
Kiran S. Gajula 1,*,@
Protein engineering advances, including DNA repair manipulation of CRISPR (clustered regularly interspaced short palindromic repeat) machinery, have paved the way for the first set of precision DNA base editors (C•G→T•A and A•T→G•C), with wide-ranging implications for treating many human genetic diseases. By utilizing the latest protein evolution advances, a hypothetical model for the first transversion (C•G→G•C) base editor can now be proposed.
The original utility of the CRISPR-Cas9 system, creating double-stranded DNA (dsDNA) breaks for desired insertions or deletions at DNA loci, was hampered by non-homologous end joining, which competes preferentially with homology-directed repair (HDR) and creates undesirable indels, although the system was most effective for applications such as gene knockouts.
CRISPR Base Editors
To circumvent the above-mentioned problem, CRISPR base editing was invented to correct point mutations at a desired locus, independent of dsDNA breaks or an HDR donor template. The first base editor (BE), designed by Komor et al. [1], involved fusing rat APOBEC1, a cytidine (C) deaminase, to a catalytically dead Cas9 (dCas9) and was targeted to a C by using a single guide RNA (sgRNA) in the strand opposite the protospacer. Hydrolytic deamination of C resulted in uridine (U), and by using the Cas9 D10A nickase (nCas9), which nicks the strand opposite the target C, they sidetracked the DNA repair machinery to repair the other strand rather than the U-containing strand. By also incorporating uracil-DNA glycosylase inhibitor protein into the fusion to restrict the DNA repair, and with further optimizations, they were able to achieve up to 75% C→T (thymidine) base-editing efficiency. This C→T mutation is the most frequently observed spontaneous mutation in mammalian cells and, despite DNA repair, it often persists and is the cause of nearly 50% of pathogenic human SNPs [2]. To correct this mutation, a T→C [or A→G (adenosine→guanosine)] BE was needed, and because there was no known DNA deaminase that could deaminate A, Gaudelli et al. [2] turned to TadA, an adenosine deaminase that deaminates A to inosine (I) (read as G during replication). However, because of its innate preference to act on A in RNA rather than DNA, the group extensively evolved TadA to deaminate A in DNA and optimized it to provide 68% editing efficiency. Another class of diversity-generating BEs, TAM and CRISPR-X, used activation-induced cytidine deaminase (AID), which can convert C into the other three bases in a 4- or 100-bp catalytic window, respectively.
DNA-Modifying Enzymes in the Design of Base Editors
Among the four DNA nucleosides, C, A, and G contain exocyclic amines that are prone to deamination, whereas thymidine lacks this group. Deamination of C or A reverses the polarity of the base's hydrogen-bonding surface, so that U pairs with A and I pairs with C. Xanthosine (deaminated G), however, still prefers to base pair with C. As such, deaminases provide a direct way to change C or A, but not G. The two base-editing efforts described above are each remarkable in their design and have set the field of precision base editing in motion. Both approaches used deaminases, which are the only enzymes among the natural DNA-modifying enzyme classes [3] that directly change the DNA base at the canonical level. Yet, is there a way to think outside the box and come up with new strategies for other BEs?
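The pairing argument above can be followed as a small lookup exercise. The Python sketch below is only a toy illustration of the rules stated in the text (U pairs with A, I pairs with C, xanthosine pairs with C); the dictionary and function names are assumptions made for this example, X is used here as shorthand for xanthosine, and nothing in it models enzyme kinetics or repair.

```python
# Toy lookup of the pairing rules described in the text; not a biochemical model.
# All names below are hypothetical and chosen only for this illustration.
DEAMINATION_PRODUCT = {"C": "U", "A": "I", "G": "X"}  # X = xanthosine; T has no exocyclic amine
PAIRS_WITH = {"U": "A", "I": "C", "X": "C"}           # partner preferred by each deaminated base
WATSON_CRICK = {"A": "T", "T": "A", "C": "G", "G": "C"}


def deamination_outcome(base):
    """Return the base that ends up replacing `base` after deamination
    and one round of copying, or None if `base` cannot be deaminated."""
    product = DEAMINATION_PRODUCT.get(base)
    if product is None:
        return None
    partner = PAIRS_WITH[product]   # base placed opposite the deaminated base
    return WATSON_CRICK[partner]    # canonical base that later pairs with that partner


for base in "CAGT":
    print(base, "->", deamination_outcome(base))
```

Under these rules C is read out as T and A as G, whereas deaminated G still leads back to G, which is why a deaminase alone does not offer a direct route to editing G.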
Designing a C•G→G•C Base Editor
AID plays a prominent role in somatic hypermutation (SHM) and class-switch recombination to produce high-affinity immunoglobulins (Igs) in activated B cells. AID deaminates C to U in Ig genes; these uracils are excised by uracil-DNA glycosylase (UDG) to produce apurinic/apyrimidinic (AP) sites that, upon further processing by various translesion synthesis (TLS) polymerases including Rev1, drive SHM. Rev1 is a Y-family TLS polymerase, unique in its ability to incorporate C opposite U and AP sites. Normal replicative DNA polymerases always insert A opposite AP lesions, whereas Rev1 inserts C in preference to A, G, or T [4]. In a study interrogating the role of Rev1 in SHM, the mutated Ig genes from Rev1-deficient mice contained barely any C→G mutations but contained increased levels of A→T, C→A, and T→C mutations, indicating that Rev1 inserts Cs opposite AP sites during SHM [5]. However, Rev1 is recruited only to abasic sites generated by UDG, which excises uracils generated by AID in the S phase, but not the G1 phase, of the cell cycle. At stalled replication forks, Rev1 bypasses AP sites by incorporating Cs [6] (Figure 1A). Based on this study, it is enticing to come up with a hypothetical model for a C•G→G•C BE in which AID-UDG is fused to the N terminus of a dCas9 that also carries Rev1 at its C terminus, with the fusion directed to a locus containing a target C. However, a major obstacle to this idea is the lack of access for Rev1 to clamp onto the AP site-containing strand and use it as a template to polymerize and fill in the C opposite the AP site. This is because, upon dCas9 release, the single-stranded bubble is closed, and there is no access for Rev1 to latch onto the template strand, even if there were a nick in the opposite strand made by nCas9 instead.
Trends in Biochemical Sciences, Month Year, Vol. xx, No. yy
[Figure 1 image. Labels recovered from the image: G1 phase; S phase; MSH2-MSH6; Exo1; Polη; UDG; REV1; error-prone synthesis of top strand; error-prone synthesis of bottom strand; extension; nick translation; enCas9 nicks the strand opposite the target C while AID and UDG sequentially act on C to create an AP site; during enCas9 release, Rev1* opens the nick; Rev1* inserts C opposite the AP site and leaves a ligatable nick; nick sealed by DNA ligase and DNA repair restores G opposite C.]
Figure 1. Roles of AID and Rev1 in Somatic Hypermutation and Their Potential Application in the Design of a C•G→G•C CRISPR Base Editor. (A) Competitive pathways for hypermutation. In the G1 phase of the cell cycle, uracils produced by activation-induced cytidine deaminase (AID) are processed by the mismatch-repair complex MSH2-MSH6, which recruits exonuclease 1 (Exo1) to excise a short patch of DNA containing U-G mismatches. Nucleotides are then incorporated in an error-prone manner by DNA polymerase η (Polη). In S phase, the uracils that are transferred or generated in single-stranded DNA are excised by uracil-DNA glycosylase (UDG) to produce abasic sites, which are bypassed by various translesion polymerases, including Rev1, during replication. A, adenosine; C, cytidine; G, guanosine; T, thymidine; U, uracil. Red dots indicate AP sites. Adapted from [6]. (B) Hypothetical model for the C•G-to-G•C CRISPR (clustered regularly interspaced short palindromic repeat) base editor. Upon single guide RNA (sgRNA)-directed targeting of the AID-UDG-enCas9-Rev1* fusion to the site containing the target C, AID and UDG sequentially act on C to produce an abasic site, while enCas9 nicks the opposite strand approximately 13–17 nucleotides downstream near the protospacer adjacent motif (PAM) site. During enCas9 release from the target site, when the two strands are about to hybridize again, Rev1* begins to open the nick and clamps onto the apurinic/apyrimidinic (AP) site-containing strand. Rev1* begins degrading the DNA from the nick while also polymerizing the same strand from the other end. Upon incorporating a C opposite the AP site, Rev1* leaves a ligatable nick, which is sealed by DNA ligase. DNA repair then restores a G opposite the C at the target site.
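The base-level bookkeeping of the hypothetical model (Figure 1B) can be traced on a toy duplex. The Python sketch below is only an illustration under strong simplifying assumptions: the function names, the `_` placeholder for an AP site, and the example sequence are all hypothetical, both strands are written left to right in register (strand polarity is ignored), and the strand-access problem, nicking, nick translation, and editing efficiency are not modeled at all.

```python
# Toy trace of the proposed C•G -> G•C steps; every name here is hypothetical.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement_strand(strand):
    """Base-by-base complement, written in the same left-to-right order as `strand`."""
    return "".join(COMPLEMENT[b] for b in strand)


def hypothetical_cg_to_gc_edit(target_strand, pos):
    """Apply the proposed steps at index `pos` and return (target, opposite) strands."""
    if target_strand[pos] != "C":
        raise ValueError("the proposed editor acts on a target C")
    target = list(target_strand)
    opposite = list(complement_strand(target_strand))

    target[pos] = "U"    # AID deaminates the target C to U
    target[pos] = "_"    # UDG excises U, leaving an AP (abasic) site
    opposite[pos] = "C"  # Rev1* inserts C opposite the AP site, replacing the original G
    target[pos] = COMPLEMENT[opposite[pos]]  # DNA repair restores a G opposite the new C

    return "".join(target), "".join(opposite)


print(hypothetical_cg_to_gc_edit("AGCTTCGA", 2))
```

For the example call, the C at index 2 of the target strand becomes G and its partner becomes C, so the C•G pair at that position ends as G•C while every other position is left unchanged.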
In a recent prominent work by Halperin et al. [7], an enhanced error-prone variant of nick-translating Escherichia coli DNA polymerase I (PolI5M) was fused to the C terminus of nCas9 and was custom targeted to several genomic loci with relevant sgRNAs. In their method, termed EvolvR, nCas9 binds to the target locus and creates a nick near the protospacer; while it is being released, PolI5M opens up the nick and synthesizes the strand across in an error-prone manner, introducing sequence diversity within a window of 17 nucleotides from the nick site. To increase the mutation rate by approximately ninefold, the authors introduced three mutations (K848A, K1003A, and R1060A) into nCas9 (enCas9) that are known to reduce the affinity of Cas9 for nonspecific DNA, boosting its dissociation after nicking the DNA [8]. This work now clearly opens the door to addressing the snag described earlier, in that Rev1, which does not possess nick-translating capability, could now be rationally engineered to contain the N-terminal 5′-to-3′ exonuclease domain of PolI, taking care not to perturb the N-digit of Rev1 that interacts with an incoming dCTP during polymerization [4]. Since base editing is efficient at 15 ± 2 nucleotides from the protospacer adjacent motif (PAM), it is also important to set the processivity of Rev1 to achieve the desired mutation. As such, a thioredoxin binding domain could be inserted into the thumb domain of Rev1 [6]. As illustrated in Figure 1B, the combined action of AID-UDG results in an AP site, while enCas9 nicks near the PAM site. During enCas9 dissociation, Rev1* (evolved Rev1 with PolI-like 5′-to-3′ exonuclease activity and the desired processivity) carries out nick translation and incorporates C opposite the AP site to leave a ligatable nick [9].
Despite this rational design, the desired target mutation may not be achieved. As such, the nucleotides encoding these new domains could be diversified with various sgRNAs by using the EvolvR system and could be subjected to the versatile protein evolution method, phage-assisted continuous evolution (PACE), successfully used by Hu et al. to expand the PAM recognition sites of Streptococcus pyogenes Cas9 from NGG to NG, GAA, and GAT [10]. In their strategy, M13 bacteriophages in which an evolving gene encoding dCas9 and the ω subunit of bacterial RNA polymerase replaced phage gene III, a gene that is critical for the production of infectious progeny phage, were used to infect E. coli. The phage population was continuously diluted faster than cell division but slower than phage replication, causing mutations to accumulate in the selection phage. The E. coli also carried an accessory plasmid encoding phage gene III and diverse sgRNAs with NNN PAMs, along with a mutagenesis plasmid, MP6, that diversified the evolving selection phage gene. Recognition of the compatible PAM sequences and target sites by the phage-encoded dCas9 variants drove gene III expression and phage propagation. The main advantage of PACE was its capacity to rapidly perform many rounds of evolution in a day, a process that would otherwise take weeks. In the proposed model, instead of using MP6, an EvolvR system could be encoded to diversify the target regions of the evolving Rev1, which could then be targeted to correct a G→C mutation in gene III; without this correction, protein III would not bind to the F-pilus and phage infection would not occur (Figure 2). Over multiple rounds, this strategy has the potential to yield the correct Rev1 variant with a fine balance between C incorporation opposite an AP site and enhanced fidelity toward any particular target locus for C→G editing. Important refinements to this method include using mutated versions of C deaminases that act on a narrow window of one to two nucleotides [11] or that preferentially deaminate a particular C when many Cs are present in the editing window, significantly avoiding bystander base editing [12].
[Figure 2 image. Labels recovered from the image: constant inflow; constant outflow; host cells; PACE; MP; AP; SP; SP infection and continuous mutagenesis; Rev1* corrects G-to-C mutation in gene III (pIII production); Rev1* unable to correct G-to-C mutation in gene III (no pIII production); mutated gene III; multiplexing sgRNAs; enCas9; PolI3M; TBD; (EvolvR); AID; UDG; REV1*; phage genes.]
Figure 2. Phage-Assisted Continuous Evolution for the Development of a C•G-to-G•C Base Editor. M13 bacteriophages carrying an evolving gene encoding activation-induced cytidine deaminase-uracil-DNA glycosylase-enCas9-Rev1 in place of phage gene III can be used to infect Escherichia coli. The phage population is continuously diluted faster than bacterial cell division but slower than phage replication, causing mutations to accumulate in the selection phage (SP). The E. coli also carry an accessory plasmid (AP) encoding gene III with a guanosine-to-cytidine (G-to-C) mutation and multiplexed single guide RNAs (sgRNAs), both for rational diversification of Rev1-containing polymerase I (PolI) domains using the EvolvR system and for the correction of the mutated gene III by evolved Rev1 for phage propagation. Adapted from [9]. Abbreviations: PACE, phage-assisted continuous evolution; TBD, thioredoxin binding domain.
Concluding Remarks
Recent advances in CRISPR-Cas technology, together with creative rapid protein evolution methods, provide a platform for an elusive C•G→G•C transversion BE to be innovatively modeled for precision base editing, with the potential to correct many pathogenic human SNPs, including base editing for delayed HIV-1 disease progression and reduced susceptibility to hepatitis C, Kaposi's sarcoma, and diabetes mellitus, among several others [13]. Also, using this C•G→G•C editor followed by the C•G→T•A editor [1] results in another transversion editor, G•C→T•A, with the potential for additional disease coverage. It is only a matter of time before other BEs are designed based on rapidly evolving science.
Acknowledgments
K.S.G. apologizes for not citing all relevant articles due to space constraints and is thankful to Dr Rahul Kohli for inspiration and setting high research standards. K.S.G. is also grateful to Kanvasri Jonnalagadda for editing the manuscript and being a motivation behind the work.
1 Division of Infectious Diseases, Department of Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA 19104, USA
@ Twitter: @ksgajula
*Correspondence: .
https://doi.org/10.1016/j.tibs.2018.10.004
References
1. Komor, A.C. et al. (2016) Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature 533, 420–424
2. Gaudelli, N.M. et al. (2017) Programmable base editing of AT to GC in genomic DNA without DNA cleavage. Nature 551, 464–471
3. DeNizio, J.E. et al. (2018) Harnessing natural DNA modifying activities for editing of the genome and epigenome. Curr. Opin. Chem. Biol. 45, 10–17
4. Nair, D.T. et al. (2011) DNA synthesis across an abasic lesion by yeast REV1 DNA polymerase. J. Mol. Biol. 406, 18–28
5. Jansen, J.G. et al. (2006) Strand-biased defect in C/G transversions in hypermutating immunoglobulin genes in Rev1-deficient mice. J. Exp. Med. 203, 319–323
6. Weill, J. and Reynaud, C. (2008) DNA polymerases in adaptive immunity. Nat. Rev. Immunol. 8, 302–312
7. Halperin, S.O. et al. (2018) CRISPR-guided DNA polymerases enable diversification of all nucleotides in a tunable window. Nature 560, 248–252
8. Slaymaker, I.M. et al. (2016) Rationally engineered Cas9 nucleases with improved specificity. Science 351, 84–88
9. Haracska, L. et al. (2002) Yeast Rev1 protein is a G template-specific DNA polymerase. J. Biol. Chem. 277, 15546–15551
10. Hu, J.H. et al. (2018) Evolved Cas9 variants with broad PAM compatibility and high DNA specificity. Nature 556, 57–63
11. Kim, Y.B. et al. (2017) Increasing the genome-targeting scope and precision of base editing with engineered Cas9-cytidine deaminase fusions. Nat. Biotechnol. 35, 371–376
12. Gehrke, J.M. et al. (2018) An APOBEC3A-Cas9 base editor with minimized bystander and off-target activities. Nat. Biotechnol. 36, 977–982
13. Amberger, J.S. et al. (2015) OMIM.org: Online Mendelian Inheritance in Man (OMIM®), an online catalog of human genes and genetic disorders. Nucleic Acids Res. 43 (Database issue), D789–D798