Trends in Cell Biology
Review
Endogenous Fluorescence Tagging by CRISPR
Hassan Bukhari1,2 and Thorsten Müller2,3,*
Fluorescent proteins have revolutionized biomedical research because they are easy to use for protein tagging, require neither fixation nor permeabilization, and thus enable live cell imaging in various models. Current methods allow easy and quick integration of fluorescent markers into endogenous genes of interest. In this review, we introduce the three central methods, zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR, that have been widely used to manipulate cells or organisms. Focusing on CRISPR technology, we give an overview of homology-directed repair (HDR)-, microhomology-mediated end joining (MMEJ)-, and nonhomologous end joining (NHEJ)-based strategies for the knock-in of markers, outline recent developments of the technique for highly efficient knock-in, and discuss pros and cons. We highlight the unique aspects of fluorescent protein knock-ins and pinpoint specific improvements and perspectives, such as the combination of editing with stem cell-derived organoid development.
A Powerful Tool to Understand Real Protein Function
GFP has been extensively used for studying gene expression, characterizing protein localization and colocalization, and unravelling cellular signaling pathways [1,2]. The multitude of GFP-based studies that characterized protein localization and behavior were often performed using transient overexpression systems. This dependency on overexpression mainly reflects how easy such experiments are (i.e., generation of plasmid-based fluorescently tagged proteins and subsequent characterization of protein function with analytical and quantitative methods) and how difficult it has been to generate endogenous gene tags. Although overexpression of proteins in cell lines is an important way to decipher protein structure [3], map protein interactions [4], or produce therapeutic proteins, it has already been shown that these systems might not recapitulate the endogenous expression of proteins, which can be remarkably low in a cell [5]. Similarly, increased expression of a protein might lead to problems such as protein misfolding, mislocalization, and nonspecific protein–protein interactions [6]. To overcome these pitfalls, the usual remedy is to silence the endogenous gene and then express a gene-GFP fusion; for example, snoMEN (snoRNA modulator of gene expressioN) vectors allow convenient functional replacement of endogenous cell proteins with tagged and/or mutated recombinant proteins in cultured cells [7]. This endogenous gene replacement system might be somewhat better than simple overexpression because (at least) the endogenous gene is silenced, so the endogenous and the overexpressed gene products do not mix. However, these systems might still not recapitulate native protein behavior, and there is an increasing need to study proteins in their native environment.
At the turn of this decade, many techniques were developed to manipulate endogenous genes at the cellular level by using ZFNs, TALENs, or CRISPR together with the CRISPR-associated (Cas) system (CRISPR/Cas) [8–10]. In brief, all gene-editing tools cause double- or single-strand breaks at specific sites within the genome. Upon double-strand breaks (DSBs), the cellular repair machinery either introduces base insertions or deletions, which can also be exploited to include desired sequences, or repairs the DNA by homologous recombination (HR), a pathway long known to be largely error free [11]. By contrast, single-strand breaks induced by nucleases on both strands in close proximity lead to cleaved DNA with single-stranded tails (overhangs) that can be repaired by any of the DNA repair pathways. An example is single-strand annealing, which can repair both single- and double-strand breaks [12]; it usually occurs in higher eukaryotes and proceeds only if two matching sequences are part of the single strands and anneal with each other, which shortens the sequence [13]. In this review, we highlight the most relevant DNA cutters, ZFNs, TALENs, and CRISPR/Cas9, with respect to the use of editing techniques to knock in fluorescent reporter genes for live cell or 3D tissue imaging of endogenous proteins.
Trends in Cell Biology, November 2019, Vol. 29, No. 11 ª 2019 Elsevier Ltd. All rights reserved.
https://doi.org/10.1016/j.tcb.2019.08.004
Highlights
Endogenous protein tagging allows analysis of gene expression, protein function, protein–protein interaction, cleavage, and degradation with few artifacts.
Gene editing is used to knock in fluorescent proteins for live imaging.
GFP knock-in has unique features compared with generic editing.
Organoids from edited iPS cells enable spatiotemporal monitoring of specific proteins in human brain-like tissue.
1Department of Pathology, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA, USA
2Department of Molecular Biochemistry, Cell Signalling, Ruhr-University Bochum, Bochum, Germany
3Institute of Psychiatric Phenomics and Genomics (IPPG), University Hospital, LMU Munich, Munich 80336, Germany
*Correspondence:
[email protected]
Figure 1. ZFNs, TALENs, and CRISPR/Cas9 Approaches for Specific DNA DSB Induction. There are three strategies to cut DNA for subsequent gene editing: ZFNs, TALENs, and CRISPR. ZFNs consist of multiple DNA-recognition domains fused to the nuclease domain of FokI, which cleaves DNA upon dimerization of two monomers. TALEs originate from plant-pathogenic bacteria of the genus Xanthomonas, where they serve as transcription factors. For gene editing, they can be used (as for ZFNs) upon fusion to the FokI nuclease. High specificity is obtained by the combination of several TALE repeats. Currently, CRISPR is the method of choice for gene editing due to its simplicity, fast practicability, robustness, and application diversity. The gRNA (also termed single-guide RNA, sgRNA) consists of the naturally occurring crRNA (bases 1–32) and tracrRNA (bases 37–100). Fusion of both (bases 1–100) generates a gRNA, which binds to a selectable sequence in the genome according to the spacer sequence (bases 1–20). The only prerequisite is the presence of a PAM site (NGG) 3′ to the recognition site. Upon gRNA interaction with DNA, Cas9 is recruited, causing a DSB 3 bases upstream of the PAM site. Abbreviations: DSB, double-strand break; gRNA, guide RNA; PAM, protospacer-adjacent motif; sgRNA, single-guide RNA; TALEN, transcription activator-like effector nuclease; tracrRNA, trans-activating RNA; ZFN, zinc finger nuclease.
DNA Cutters: ZFNs
There are three central tools to cut DNA at specific sites, which were developed at the turn of this decade. The fusion of the nuclease domain of the FokI enzyme to an array of three (or more) zinc fingers (ZFs) was an innovative approach to combine a DNA cutter with a specific DNA recognition module [14]. A later design from the same group, which relies on the dimerization of two FokI monomers to cleave DNA, was another important discovery allowing a specific genome cut [15]. The principle of the method is shown in Figure 1. The editing approach is simple: two ZFNs, each targeting a different DNA sequence in a head-to-head orientation, are spaced so that their FokI nuclease domains dimerize properly, which introduces a DSB in the sequence between the two ZF-binding sites. This trait has elevated the specificity of ZFNs, because dimerization of the endonuclease increases the length of the recognition site, making this gene-editing technique a specific and flexible approach. ZFs bind to a large range of sequences and the nuclease domain of FokI does not require any specific sequence; thus, ZFNs can target a wide range of sequences [16]. However, there are also some pitfalls that prevent this technique from reaching its full potential. For example, the feature that ZFNs or wild-type FokI need to dimerize to perform cleavage can be an advantage, that is, flexibility and specificity, but can also be a
disadvantage: symmetry at the FokI dimerization interface can lead to homodimer formation [17]. In principle, FokI is a monomer, binds DNA as a monomer [18], has separate domains for DNA recognition and cleavage, and has only one catalytic site, thus allowing only single-strand DNA cutting [19]. It is worth mentioning that FokI cuts a cognate DNA site by forming a dimer at that site, and that the primary monomer bound to the DNA has nicking activity without its association to the secondary monomer [20]. Therefore, the symmetry of the FokI dimerization interface plays a crucial role in its cleavage efficiency. Typically, the FokI domains are positioned near each other on the same DNA helix, reflecting the need for dimerization of the cleavage domain, and the ZF-binding sites are separated by a 6-bp spacer sequence, which has been shown to result in efficient cleavage [21]. A FokI domain can be coupled to an array of ZFs and, as one finger binds to 3 bp, a pair of arrays with three or four fingers each (one on the sense and one on the antisense strand) will cover an 18- or 24-bp recognition site in the genome (Figure 1). A recognition sequence of 18 bp grants ZFNs specific targeting within 68 billion bp of DNA [22], which is a good number in terms of specificity for the human genome. Several attempts have been made to artificially modify the FokI dimer interface, leading to the generation of heterodimeric forms [23,24]. Furthermore, a more efficient cleavage domain termed Sharkey, obtained by directed evolution [25], ZFNs with superior cleavage and suppressed homodimerization [26], and three- and four-finger ZF protein fusions to a modified FokI nuclease domain [27] have reduced the toxicity of designer ZFNs. Regardless of these developments, which were made to improve flexibility and specificity, ZFNs are limited by poor targeting density.
This is because each zinc finger recognizes a 3-bp sequence, yet no open-source collection of 64 zinc fingers covering all possible triplet sites exists; other programmable nucleases have overcome this limitation [28].
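The arithmetic behind the 18-bp and 68-billion figures can be checked with a short sketch. The function names below are illustrative and not taken from any published tool; the only assumptions are the values given in the text (3 bp per finger, two monomers per site, four possible bases per position).

```python
def zfn_pair_site_length(fingers_per_monomer: int, bp_per_finger: int = 3) -> int:
    """Combined recognition length of a ZFN pair (two monomers, one per strand)."""
    return 2 * fingers_per_monomer * bp_per_finger


def distinct_sites(recognition_bp: int) -> int:
    """Number of distinct DNA sequences of the given length (4 bases per position)."""
    return 4 ** recognition_bp


if __name__ == "__main__":
    for fingers in (3, 4):
        length = zfn_pair_site_length(fingers)
        # For three fingers per monomer: 18 bp and 4**18 = 68,719,476,736 sequences
        print(f"{fingers} fingers per monomer: {length} bp, "
              f"{distinct_sites(length):,} possible sequences")
```

With three fingers per monomer this gives 4^18, roughly 68.7 billion possible 18-bp sequences, which exceeds the roughly 3 billion bp of the haploid human genome.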
TALENs
Another recent development in gene-editing technology is the creation of TALENs, which are also illustrated in Figure 1. TALEs are proteins from plant-pathogenic bacteria of the genus Xanthomonas that basically serve as transcription factors: they are injected into plant cells by the bacterial type III secretion system and, after nuclear import, are targeted to specific gene promoters [29,30]. TALEs recognize DNA in a modular fashion (Figure 1); that is, tandem polymorphic amino acid repeats bind to specific single contiguous nucleotides in the target DNA [31,32]. Each TALE repeat contains 33–35 amino acids and recognizes a single nucleotide. The last repeat is truncated at 20 amino acids and is called a half repeat. DNA recognition specificity is contained within amino acids 12 and 13; this pair of residues displays high polymorphism and is thus called a repeat-variable diresidue (RVD) [33]. There are four common RVDs, which display preferential association with different nucleotides: HD (histidine, aspartic acid) recognizes cytosine, NG (asparagine, glycine) recognizes thymine, NI (asparagine, isoleucine) recognizes adenine, and NN (asparagine, asparagine) recognizes guanine [31,32]. TALEs bind to target DNA as a right-handed superhelix, with each repeat forming a left-handed, two-helix bundle that presents an RVD-containing loop to the DNA major groove [34]. The one-to-one correspondence of specific RVDs to nucleotides has enabled researchers to generate gene-targeting proteins using custom arrays of TALE repeats [31,35]. The first TALENs were fusions of TALEs to the catalytic domain of the FokI endonuclease, generating chimeric nucleases that introduce DNA DSBs at specific target sites [36]. Soon, synthetic TALE proteins were created and used to activate targeted gene expression in plant and human cells [37].
Similarly, 17 TALEs were synthesized, which were customized to recognize specific DNA-binding sites and modulate transcription of endogenous genes, that is, SOX1 and KLF4, in human cells [38]. Through their simplicity in design, TALENs surpassed ZFNs as a gene-editing tool, but RNA-guided endonucleases (RGENs) have a crucial advantage over both ZFNs and TALENs because of their simple design and preparation.
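The one-to-one RVD code described above lends itself to a small lookup sketch. This is a simplified illustration only: it uses the four common RVDs named in the text, ignores further design constraints of real TALE arrays (for example, the truncated last repeat), and the function name is hypothetical.

```python
# The four common RVDs and their preferred nucleotides, as listed in the text.
RVD_FOR_BASE = {"C": "HD", "T": "NG", "A": "NI", "G": "NN"}


def tale_rvds(target: str) -> list[str]:
    """Return one RVD per nucleotide of the target sequence (5' to 3')."""
    rvds = []
    for base in target.upper():
        if base not in RVD_FOR_BASE:
            raise ValueError(f"unsupported base: {base!r}")
        rvds.append(RVD_FOR_BASE[base])
    return rvds


if __name__ == "__main__":
    # Hypothetical 8-nt target, chosen only to show the mapping.
    print("-".join(tale_rvds("TCAGGATC")))
```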
CRISPR/Cas
The CRISPR/Cas system has its origin in bacteria and archaea, where it evolved as an adaptive immunity against invading viral or plasmid DNA [39]. A repeat of 29 nucleotides downstream of the IAP gene in the bacterial genome was first described in 1987 [40]. These repeat sequences differed from other repeat sequences, for example TALE repeats, as they were interspaced by five intervening 32-nucleotide nonrepeated sequences. To reflect the uniqueness of these repeat sequences, they were named clustered regularly interspaced short palindromic repeats, or CRISPR [41]. The typical CRISPR locus consists of CRISPR-associated (Cas) genes and a series of noncoding repetitive elements, which are interspaced by short variable sequences, also known as spacers. There are multiple CRISPR/Cas families; however, Cas9 is used most often because it was initially chosen for gene editing as the system that requires the fewest components for its action. The short spacers, typically 30 bp, are derived from foreign genetic elements and are the basis of adaptive immunity against elements such as phages [39,42]. On the phage genome, the sequence corresponding to a spacer is called the protospacer; it is flanked by a short protospacer-adjacent motif (PAM), which plays a crucial role in target recognition by the Cas nuclease (Figure 1). The CRISPR array is transcribed into short RNA molecules known as CRISPR RNAs (crRNAs). Together with a second short trans-activating RNA molecule (tracrRNA), they guide Cas to its target; crRNA and tracrRNA have been fused into a single-guide RNA (sgRNA) to facilitate Cas9 targeting [43]. After gRNA-mediated binding of Cas to the PAM site in the DNA, a DSB is introduced [44].
The most widely used Cas nuclease is from Streptococcus pyogenes (SpCas9) and requires a 5′-NGG PAM motif; any genomic region of interest that contains a PAM motif can be targeted by merely altering the 20-bp guide sequence of the sgRNA, which is adjacent to the PAM site [45]. It is worth mentioning that different SpCas9 variants have been generated, which not only recognize canonical NGG PAM sequences but can also target alternative PAM sequences (NAG and NGA PAMs), thus extending the range of targetable endogenous gene sites [46,47]. The wild-type Cas9 has two conserved nuclease domains, HNH and RuvC: the former cleaves the DNA strand complementary to the 20-nucleotide sequence of the crRNA, and the latter cleaves the opposite strand [43,48]. A mutation in the RuvC domain, that is, D10A, leads to its inactivation and results in a Cas9 nickase (Cas9n) [43]. While N863A and H840A mutations have been shown to inactivate HNH [49], mutating both domains, D31A (RuvC) and N891A (HNH), results in an RNA-guided DNA-binding protein [48]. These catalytically inactive versions of Cas9, along with activator or repressor domains of other enzymes, have been used as transcriptional modulators [50–54], and these dCas9-based inhibition and activation systems are commonly referred to as CRISPRi and CRISPRa [55]. A spectacular application of gene-editing tools for basic research is the integration of the fluorescent protein coding sequences discussed initially into endogenous genes of interest. The DNA cutters, ZFNs, TALENs, and CRISPR/Cas, have their advantages and disadvantages, which are summarized in Table 1. The combination of these techniques allows, for the first time, live cell imaging of endogenous proteins in a robust and reliable fashion.
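The targeting rule for SpCas9 (a 20-nt protospacer followed by an NGG PAM, with the DSB 3 bases upstream of the PAM) can be illustrated with a minimal scan. This sketch is an assumption-laden simplification: it searches the forward strand only, does no off-target or efficiency scoring, and the function name and returned fields are hypothetical.

```python
import re


def find_spcas9_sites(seq: str, guide_len: int = 20) -> list[dict]:
    """List forward-strand protospacers that are followed by an NGG PAM."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    sites = []
    for match in pattern.finditer(seq):
        pam_start = match.start() + guide_len
        sites.append({
            "protospacer": match.group(1),
            "pam": match.group(2),
            "pam_start": pam_start,
            # The break falls 3 bases upstream of the PAM: seq[:cut] | seq[cut:]
            "cut_index": pam_start - 3,
        })
    return sites


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the function.
    example = "GATTACAGATTACAGATTACATGGCCC"
    for site in find_spcas9_sites(example):
        print(site)
```

A complete search would also scan the reverse complement, since Cas9 can target either strand.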
Different Strategies for CRISPR/Cas Fluorescent Knock-Ins
In 2013, mice carrying a fluorescent reporter tag in the OCT4, NANOG, and SOX2 genes were made by CRISPR/Cas9 and the procedure was termed a one-step procedure, as zygotes were simultaneously co-injected with Cas9 mRNA and different guide RNAs (sgRNAs) as well as donor DNA vectors [56]. In the same year, a CRISPR/Cas9-mediated GFP knock-in was done in Caenorhabditis elegans, where multiple mutant cell lines were obtained in a cost-effective manner [57]. This targeting of the C. elegans genome by CRISPR/Cas9 gene editing was found to be
Table 1. Pros and Cons of the Commonly Used Nucleases, ZFNs, TALENs, and CRISPR/Cas

Nuclease: ZFNs
Pros: Highly specific due to the length of the recognition site recognized by the ZFN dimer [8].
Cons: ZFNs not available for every triplet [28]. Labor-intensive cloning [99]. Potential toxicity [100,101]. Reduced specificity due to homodimer formation [102].

Nuclease: TALENs
Pros: High specificity [32]. Also usable for gene expression/repression [103]. Robust activity [104]. Low toxicity [105]. Easy to engineer [36].
Cons: Labor-intensive cloning [99].

Nuclease: CRISPR/Cas
Pros: High specificity [48]. Also usable for gene expression/repression [50–54]. Robust activity [43]. Low toxicity [106]. Easy to engineer [107].
Cons: PAM site necessary [108]. Off-target effects [109].
advantageous over previously used Mos1-triggered (inducible transposon) chromosomal breaks [58]; that is, 1 million potential targets were found for Cas9, while only 14 000 were available for Mos1. Since that time, different CRISPR-related knock-in techniques have been developed, which basically correspond to three different strategies: homologous recombination (HR), NHEJ, and MMEJ (Figure 2). An overview of the key (i.e., innovative and unique) CRISPR/Cas9-mediated fluorescent protein knock-ins within the past 5 years is given in Table 2, including reference links.
HR-Driven CRISPR Knock-ins
Following induction of a DSB at a specific site in the genome, a donor fragment is provided that contains the sequence to be knocked in (e.g., GFP) flanked by homology arms (of different lengths) corresponding to the sequences upstream and downstream of the genome cleavage site.
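The donor layout described above (left homology arm, insert, right homology arm) can be sketched as simple string assembly. This is a minimal illustration under stated assumptions: the insertion point coincides with the cut site, the arms are copied unchanged from the reference sequence, and practical considerations such as reading frame, linkers, or preventing re-cleavage of the donor are ignored. All names are hypothetical.

```python
def build_hdr_donor(genome: str, cut_index: int, insert: str, arm_len: int) -> str:
    """Assemble left arm + insert + right arm around the cut position."""
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genome):
        raise ValueError("homology arms extend beyond the provided sequence")
    left_arm = genome[cut_index - arm_len:cut_index]    # upstream of the cut
    right_arm = genome[cut_index:cut_index + arm_len]   # downstream of the cut
    return left_arm + insert + right_arm


def knocked_in_locus(genome: str, cut_index: int, insert: str) -> str:
    """Sequence of the locus after a precise, scarless integration."""
    return genome[:cut_index] + insert + genome[cut_index:]


if __name__ == "__main__":
    # Toy sequences; a real insert would be a full reporter coding sequence.
    locus = "AAAACCCCGGGGTTTT"
    donor = build_hdr_donor(locus, cut_index=8, insert="ATG", arm_len=4)
    print(donor)
    print(knocked_in_locus(locus, 8, "ATG"))
```

The arm length is the parameter that varies most across the studies discussed in this section, from a few tens of bases to several kilobases.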
The Beginning: Caenorhabditis and Leishmania
In the nematode C. elegans, after demonstration of customized and efficient genome editing [59], a co-CRISPR strategy (selection of successful edits based on the successful knock-in of GFP) was introduced to improve the detection of HR events, which made it possible to rapidly edit the nematode genome [60]. Soon after, HR (a common subtype of HDR) was reported to be a robust method for gene editing in C. elegans: by using short linear homology repair templates (30–60 bases), 23 unique edits at 11 genes were introduced [61]. Another comparatively simple model system, Leishmania donovani [62], was used in another study, in which the LdMT gene was tagged with GFP to visualize its low expression. Strikingly, only 25 nucleotides were used for the HDR. It is worth mentioning that prior attempts using 50-bp homology arms were found to be less efficient in murine embryonic stem cells at that time [63]. Unlike mammalian cells, where DSBs are mainly repaired by NHEJ [64], DNA repair in L. donovani relies on HDR and minimally on MMEJ [65]. The reliance on HDR and the absence of NHEJ in L. donovani suggest that short templates are sufficient for faithful DNA repair [66]. It turns out that the model system strictly determines the method of choice for efficient gene editing, and that nematodes can be easily edited using HDR, even with small homology arms.
Figure 2. Repair Pathways upon DSBs. Following DNA DSB induction, different repair mechanisms are available, which are used to knock in reporter genes. HDR is used to knock in inserts like GFP. In this example, the fluorescent protein GFP is inserted 5′ to the stop codon of a gene of interest by including a gRNA-mediated cut at a suitable PAM site (close to the stop codon). 5′ and 3′ homology arms of different lengths up to 1500 bp have been used for HDR. For high efficiency, the donor fragment is flanked by additional CRISPR cleavage sites to provide a linear fragment for HDR. MMEJ is used with high efficiency to gene edit target cells. Short (micro) homology domains are used to knock in GFP. The mechanistic differences between MMEJ and NHEJ are illustrated. Both pathways benefit from the nuclease resection activity generating 3′ ssDNA overhangs. In HDR, ssDNA strand invasion into the donor fragment occurs. In MMEJ, nuclease resection exposes single strands over the whole microhomology domain, which produces ‘sticky ends’ due to the presence of the same microhomology domains in the genome. More recently, NHEJ has been used for gene editing. Upon gRNA-mediated cleavage, a linear fragment is provided in the nucleus (e.g., by a donor vector that includes the GFP flanked by CRISPR cleavage sites), which is then inserted in some cells by NHEJ repair. Abbreviations: DSB, double-strand break; dsDNA, double-stranded DNA; gRNA, guide RNA; HDR, homology-directed repair; MMEJ, microhomology-mediated end joining; NHEJ, nonhomologous end joining; PAM, protospacer-adjacent motif; ssDNA, single-stranded DNA.
The Future: Stem Cells
In a human embryonic stem cell line, CRISPR/Cas9-mediated GFP knock-in was reported in 2015 [67], where a self-cloning CRISPR/Cas9 strategy was devised: a self-cleaving palindromic sgRNA plasmid and a short double-stranded DNA (encoding the desired locus-specific sgRNA sequence flanked by sequence homologous to the self-cleaving plasmid) were introduced into the target cells. After self-cleavage of the palindromic plasmid, the locus-specific sgRNA (within the short double-stranded DNA containing homology arms to the plasmid) was integrated into the plasmid through HDR. This method allowed robust introduction of sgRNA into the cells and successfully demonstrated GFP knock-in into 12 genes by using 80-nucleotide homology arms (in both human and mouse cell lines), albeit with low efficiency (2–4%). In addition, others also reported successful knock-in into human iPSCs. The authors used a 1.5-kb template flanked by a 1.3-kb 5′ and a 6-kb 3′ homology arm to tag the MYF5 gene (to allow characterization of MYF5-expressing myogenic progenitors derived from human iPSCs) [68]. Taken together, the utilization of gene-editing technologies has significantly elevated HDR-based editing efficiency [69]. However, most authors describe targeting of stem cells with HR repair
Table 2. Overview of the Key CRISPR-Cas9-Mediated Fluorescent Protein Knock-ins, in Simple to Complex Model Systems, over the Past 5 Years

Organisms/cell types | Integration methods | Homology arms (HA) length | Nucleases | Reporter | Target genes | Purpose
Plasmodium falciparum | HDR | The 3.334-kb insertion cassette contained upstream and downstream homology arms to the Pf47 locus. | Cas9 | GFP | PF47 | Improvements in basic Cas9 gene editing, i.e. large integration by devising a suicide-rescue system [110].
Yarrowia lipolytica | HDR | HA = 1 kb | Cas9 | GFP | AXP / XPR2 / A08 / D17 / MFE1 | Development of a CRISPR-Cas9-based tool for targeted, markerless gene integration into the Yarrowia lipolytica genome [111].
Schizosaccharomyces pombe | HDR | HA = 25 b | Cas9 | GFP | REB1+ | To expedite genome editing in Schizosaccharomyces pombe [112].
Aspergillus fumigatus | MMEJ | HA = 39 b | Cas9 | GFP | CNAA | Utilization of short homology arms (microhomology-mediated end joining method) for the insertion of mutations and tags [113].
Caenorhabditis elegans | HDR | HA = 1.5 kb | Cas9 | GFP | NMY-2 / HIS-72 | To demonstrate that CRISPR/Cas9-mediated gene editing is efficient in introducing knock-outs and knock-ins in the C. elegans genome [57].
Caenorhabditis elegans | HDR | HA = 30 b | Cas9 | GFP | K08F4.2 / FBF-2 / MES-2 / LIN-15b / DEPS-1 / MEX-6 / GLH-1 / HTP-3 | To demonstrate that short homology arms (30–60 bases) could also yield efficient results in C. elegans compared to long homology arms [61].
Drosophila melanogaster | HDR | HA = 3 kb for OCRL; HA = 3.5 kb for VPS35 | Cas9 | GFP | OCRL / VPS35 | To demonstrate the efficiency of tissue-specific tagging of endogenous proteins [114].
Table 2. Continued

Organisms/cell types | Integration methods | Homology arms (HA) length | Nucleases | Reporter | Target genes | Purpose
Drosophila melanogaster | HDR | HA = 60 b | Cas9 | GFP | MESR4 | To demonstrate the functionality of the minimal in vivo GFP interference (miGFPi) strategy, which aims to characterize gene function and conduct loss-of-function experiments in D. melanogaster [115].
Danio rerio | MMEJ | HA = 10–40 b | Cas9 | GFP, mCherry | Tyrosinase / KRRTT1C19E | Utilization of the PITCh (Precise Integration into Target Chromosome) targeting method in Danio rerio (zebrafish) [85].
Mus musculus | HDR | HA = 60 b for SOX2 (tagged with V5); HA = 2–3 kb for NANOG (tagged with mCherry); HA = 2–4.5 kb for OCT4 (tagged with GFP) | Cas9 | V5 tag, GFP, mCherry | NANOG / SOX2 / OCT4 | Demonstration of the one-step generation method for the generation of Mus musculus (mouse) models with reporters in endogenous genes [56].
Mus musculus/ESCs | HDR | HA = 70–80 b | Cas9 | GFP | ZFP42 / TDGF1 / NANOG / RPP2525 | To demonstrate the functionality of the multiplexed editing regulatory assay (MERA). The assay basically utilized CRISPR-Cas9 gene editing to analyze the functional impact of the regulatory genome [116].
Mus musculus/MIN6 and murine ESCs | HDR | HA = 318 b | Cas9, Cas9 nickase | GFP | PDX1 | To demonstrate the functionality of a single homology arm as a functional approach for homology-directed repair [117].
Table 2. Continued

Organisms/cell types | Integration methods | Homology arms (HA) length | Nucleases | Reporter | Target genes | Purpose
Rattus norvegicus/zygote | HDR | HA = 80 b–1 kb | Cas9 | GFP | Rat THY1 / ROSA26 | To demonstrate that CRISPR/Cas9 and ssODNs are an efficient integration system for knock-ins [71].
Homo sapiens/HEK293T, HeLa; Bombyx mori/larvae; Xenopus laevis/oocytes | HDR, MMEJ | HA = 1 kb; HA = 8 b | Cas9, TALENs | GFP | Human FBL / B. mori BLOS2 / X. laevis NO29 / FGK | Description of the PITCh (Precise Integration into Target Chromosome) system [84].
Homo sapiens/ESCs, HEK293T; Mus musculus/ESCs | HDR | HA = 80 b | Cas9 | GFP | HIST1H3A / GATA6 / NANOG / FAM25C / HIST1H2B / KLF4 / ESRRB / NFYA / RPP25 / SOX2 / TDGF1 / ZFP42 | To demonstrate the efficiency and robustness of the newly designed self-cloning CRISPR/Cas9 technology [67].
Homo sapiens/ESCs | NHEJ, HDR | HA = 900 b–1.3 kb | Cas9 | GFP | GAPDH / ACTB / SOX17 / T gene / NANOG / PAX6 / OCT4 / TRA-1–60 | To demonstrate that the NHEJ-based knock-in method is more efficient than the HDR method of integration, i.e. NHEJ yielded integration of a 4.6-kb promoterless ires-eGFP fragment into the GAPDH locus with 1.7% efficiency [74].
Homo sapiens/HEK293 GFP1-10 cell line | HDR | HA = 70 b | Cas9 | Knock-in of the 11th beta strand of GFP, termed GFP11, into an already GFP 1–10 stable HEK293 cell line. | ARL6IP1 / ATP1A1 / ATP2A2 / CANX / CBX1 / CKAP4 / CLTA / CNN2 / CTCF / CYB5B / FBL / HIST2H2BE / LAMP1 / LMNA / MAP4 / MAPRE1 / MCM7 / MSN / OST4 / PRKACA / RAB11 / REEP4 / REEP5 / SEC61B / SMC1 / SPTLC1 / TOMM70A / VIM / VAPB / VDAC3 | Development of a rapid and readily applicable method for tagging endogenous human proteins with GFP at a genome-wide scale [118].
Table 2. Continued

Organisms/cell types | Integration methods | Homology arms (HA) length | Nucleases | Reporter | Target genes | Purpose
Homo sapiens/K562-50 cells, HEK293 cells | HDR | HA = 133 b | Cas9 | Conversion of already knocked-in GFP to BFP | Human β-globin | To develop the traffic light reporter system, i.e. conversion of already knocked-in GFP to BFP, for quantifying the efficiency of HDR and NHEJ repair mechanisms [72].
Homo sapiens/HeLa, HEK293T, DLD1, SW480, HEPG2, NIH3T3, and IEC6 cells | HDR | The donor vector contained sequence homologous to the upstream puromycin resistance cassette and the downstream hygromycin reporter gene (H2B-GFP) in the Piggybac target vector. | Cas9, Cas9 nickase | GFP | Piggybac target vector, which contained the target site flanked by the upstream puromycin and downstream hygromycin reporter gene. | Development of the SRIRACCHA method (i.e. a stable, but reversible, integrated reporter for assaying CRISPR/Cas-stimulated HDR activity) for the quantification of Cas9/guide RNA activity [119].
Homo sapiens/iPSCs | HDR | HA = 1 kb | Cas9 | GFP | Alpha tubulin / Beta actin / Desmoplakin / Fibrillarin / LMNB1 / non-muscle myosin heavy chain IIB / Paxillin / SEC61B / ZO1 / TOM20 | Presentation of a CRISPR/Cas9 genome-editing strategy to systematically tag endogenous proteins with fluorescent tags in iPSCs [120].
Homo sapiens/HEK293, HeLa cells; Mus musculus/N2a, NIH/3T3 | NHEJ, HDR | HA = 800 b (HDR for ROSA26 gene) | Cas9 | GFP; for NHEJ, a self-cleaving GFP plasmid was used. | PTEN / PML / TP53 / PRNP / SPRN / PIWI4 / PIWI2 / ROSA26 | To test efficiencies of the gRNA and to show that NHEJ can be used efficiently to integrate a 10-kb donor plasmid with up to 30% efficiency [76].
mechanism to be challenging due to low efficiency [70], and other methods now seem to be more promising.
Single-Stranded Oligodeoxynucleotide (ssODN) Bridging to Advance HDR
An alternative homology-related approach was developed in 2016 [71]. The authors demonstrated the functionality of an HDR assay assisted by ssODNs. Using this method, which they call ‘two-hit by gRNA and two oligos with a targeting plasmid’ (2H2OP), they were able to knock in GFP at different loci. First, by using one long 1-kb ssODN (i.e., two 60–300-bp homology arms flanking the GFP cassette), they were able to knock in GFP at the rat THY1 locus. In another strategy, they used two short ssODNs (i.e., two of 80 bp), which contained homology to the insertion cassette (i.e., 5.5-kb CAG–GFP) and the ROSA26 locus. Similarly, ssODNs of 133 bp were used to convert the already
knocked-in GFP to BFP [72], and others demonstrated that PCR-based ssODNs also serve as efficient templates for GFP knock-in in C. elegans [73].
NHEJ-Driven CRISPR Knock-ins
Until 2016, all GFP knock-ins were done using HDR, yet it was demonstrated that NHEJ might be the method of choice for highly efficient knock-ins, even in stem cells [74]. In that study, the use of ‘no homology arms’ was compared to 1-kb homology arms, and the NHEJ-mediated method yielded a more efficient GFP knock-in than HDR in various genes in human embryonic stem cells (ESCs). This opened an avenue for CRISPR-Cas9-mediated knock-ins to investigate and utilize the NHEJ mechanism, which is explained in the following section. In contrast to HDR-driven editing, NHEJ normally introduces short base pair insertions or deletions (Figure 2). NHEJ is highly efficient and more frequent than HDR, especially in eukaryotic cells; however, specificity of integration is hampered by INDELs (insertions and deletions), which might cause frameshifts and are particularly problematic for knock-ins of fluorescent proteins (fusion genes). Furthermore, NHEJ-mediated knock-ins may result in more random integration compared to HDR because of the free-ended donor plasmids. On the other hand, the positive fluorescence signal can easily be used to identify correct in-frame knock-ins, and in combination with fluorescence-activated cell sorting (FACS) instruments, those cells can be easily separated to establish clonal lines. A mechanistic comparison of HDR and NHEJ is shown in Figure 2. The NHEJ repair system has been used to integrate desired sequences into the duck enteritis virus genome, where GFP was inserted between the UL26 and UL27 loci [75]. NHEJ was also used to integrate large 10-kb donor plasmids with up to 30% efficiency in various human and murine cell lines [76]. Finally, NHEJ-driven knock-in of a promoterless internal ribosome entry site (ires)-EGFP fragment yielded GFP-positive ESCs [74].
Robust knock-in of large cassettes in nondividing cells was shown to be possible, also in vivo, by a sophisticated NHEJ-related approach, the so-called homology-independent targeted integration (HITI) strategy. A linear donor fragment is provided to Cas9-cleaved cells and is integrated by NHEJ. High efficiency was obtained by flanking the donor fragment with Cas9 cleavage sites and cloning this cassette into a vector backbone. With this approach, regular, highly efficient transfection can be used, and the linear dsDNA fragment is generated in the nucleus, close to the locus of editing [77]. These data clearly indicate that NHEJ provides a valuable tool for highly efficient gene editing, especially for specific cells like human or mouse stem cells.
MMEJ-Driven CRISPR Knock-ins
The main DNA repair pathways were considered to be HDR and NHEJ, yet almost a decade ago, a third efficient but error-prone repair pathway was described [78]. Initially called the alternative end-joining mechanism and observed during murine B lymphocyte development, the pathway was later termed MMEJ [79]. The salient feature that distinguishes MMEJ (Figure 2) from HDR and NHEJ is the use of short, 5–25-bp microhomology sequences during repair of the broken DNA ends. When such microhomology domains are encoded within the genome, repair results in deletion of the region between them. Thus, MMEJ is often associated with chromosomal abnormalities; for example, inversions, deletions, translocations, and rearrangements [79]. However, for gene editing, the MMEJ mechanism is of central interest if the intended knock-in donor is flanked by such microhomology domains. Indeed, soon after its discovery, MMEJ was used in several gene-editing approaches; for example, in combination with ZFNs [80,81], or with both ZFNs and TALENs for the insertion of a 15-kb transgene in human cell lines [82]. In combination with CRISPR/Cas9, MMEJ-driven repair was used for knock-in into the zebrafish genome [83], or within a method termed PITCh (precise integration into target chromosome) for the integration of DNA in human cells and animals, including silkworms, zebrafish, and frogs [84–86]. In brief, the PITCh system uses short (5–25 bp) microhomologous sequences as homology arms, and DSBs are generated in the donor vector (termed the PITCh vector) as well as in the genomic DNA using programmable nucleases (TALENs, ZFNs, and CRISPR have all been used with the PITCh system). After cleavage, the donor sequence (e.g., a specific gene cassette) is inserted into the genome by MMEJ. Furthermore, exonuclease 1 was identified as an enhancer of PITCh in human cells, and by combining exonuclease 1 with PITCh-directed donor vectors, one-step knock-in of gene cassettes was achieved in human cells and mouse zygotes [87]. Although the use of MMEJ in gene editing seems to be limited, potentially because of its error-prone nature, the method has potential, as it reduces the labor-intensive cloning of homology arms and has been shown to be effective in several model systems.
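As a minimal sketch of the donor design principle behind microhomology-based knock-ins, the following Python snippet extracts the short sequences directly flanking a planned cut position and places them around a cassette. It is a simplified illustration under stated assumptions: the sequences are made up, the function names are hypothetical, and the nuclease target sites that release the cassette from the PITCh vector are not modeled.

```python
def microhomology_arms(genome: str, cut_index: int, arm_len: int = 20):
    """Return the (left, right) sequences flanking cut_index, each arm_len bp long."""
    if not 5 <= arm_len <= 25:
        raise ValueError("arm_len should be in the 5-25 bp microhomology range")
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genome):
        raise ValueError("cut site is too close to the end of the sequence")
    left = genome[cut_index - arm_len:cut_index]
    right = genome[cut_index:cut_index + arm_len]
    return left, right


def build_donor(genome: str, cut_index: int, cassette: str, arm_len: int = 20) -> str:
    """Flank the cassette with the microhomology arms found at the cut site."""
    left, right = microhomology_arms(genome, cut_index, arm_len)
    return left + cassette + right


# Made-up 40-bp target region with the planned cut between positions 19 and 20.
toy_genome = "ACGTACGTTAGCCATGGCAT" + "TTGACCGGAATCGATCGGTA"
# Placeholder cassette; a real tag would be a full fluorescent protein ORF.
toy_cassette = "ATGGTGAGCAAG"

left_arm, right_arm = microhomology_arms(toy_genome, cut_index=20, arm_len=8)
donor = build_donor(toy_genome, cut_index=20, cassette=toy_cassette, arm_len=8)
print(left_arm, right_arm)
print(donor)
```

Restricting the arm length to 5–25 bp mirrors the microhomology range given in the text; arms of several hundred base pairs would instead correspond to a conventional HDR donor.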
Uniqueness of Fluorescent Protein (e.g., GFP) Knock-ins
Endogenous gene tagging with fluorescent proteins is a valuable tool to understand gene and protein function in cells or more complex model systems; for example, a labeled protein is detectable in live cells over long time periods, thus enabling a thorough study of its subcellular localization [88]. Nevertheless, some effort is needed to establish these model systems, for several reasons. Fluorescent proteins like GFP have a size of 27 kDa, and thus the corresponding donor needs to be flanked by large homology arms when HDR is used. By contrast, NHEJ and MMEJ have been successfully used to achieve the same goal with less effort, although both DNA repair pathways are error prone. When it comes to the method of choice for fluorescent protein knock-ins, there are no major constraints with respect to efficiency and rapid design, as compared with introducing point mutations into a gene of interest. However, as HDR is an error-free method, several efforts have been made to improve its efficiency: the use of longer homology arms (500 bp or more) and suppression of the NHEJ key molecules KU70, KU80, or DNA ligase IV by gene silencing have led to efficient knock-ins [89]. Furthermore, selection of a PAM site close to the knock-in site (i.e., within 30 bp) and mutation of the PAM site within the donor plasmid have also increased HDR efficiency [90]. Further improvements included synchronizing the expression of Cas9 with cell-cycle progression [91], transfection of a double-cut HDR donor vector instead of a circular plasmid vector [92], and covalent tethering of an ssODN via an HUH-Cas9 fusion [93]. Besides helping to address important biological questions, fluorescent protein tagging also enables fast generation of clonal cell lines, because these cells are easy to monitor by fluorescence microscopy.
Methods like FACS are used to rapidly isolate single fluorescence-positive cells into individual wells of multiwell plates (e.g., 96- or 384-well format), which best guarantees culturing of cells with the same genotype. Even laboratories without such equipment are able to generate genetically identical clones, simply by seeding cells at low density and picking grown fluorescent colonies. This major ‘what you see is what you get’ (WYSIWYG) advantage also facilitates the cloning efforts, which were initially reported to be laborious. Since the researcher has full (optical) control, there is no longer any need for separate selection markers, which might have to be included in the donor in point mutation experiments. Likewise, target cells do not have to be pretested for resistance to the selection drug and for cell death induction. The method is, however, hampered by some pitfalls known for GFP from overexpression experiments. For example, tagging might have a relevant impact on protein folding, stability, or cleavage due to the size of GFP, or GFP might be cleaved off the protein. Although addition of suitable linkers (e.g., GS linkers) can help to ensure proper folding and stability of the protein [94], a large GFP tag on a smaller protein might still influence its localization, interactions, stability, and dynamics. Of course, these constraints have to be taken into account, but validation of GFP results by immunolabeling methods can be done easily, and such controls are part of any well-designed experiment. WYSIWYG is the central requirement for live cell imaging experiments. Other live staining methods are often hampered by poor dye penetration and limited target specificity; disadvantages that are not relevant for endogenous fluorescent protein knock-ins. Thus, the technique is of central relevance for future applications, especially in the field of stem cell biology.
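The PAM-proximity guideline mentioned in this section (a PAM site within 30 bp of the knock-in site [90]) lends itself to a simple search. The following Python sketch is an illustration only: it assumes the 5'-NGG-3' PAM of Streptococcus pyogenes Cas9, measures the distance from the first base of each motif to the insertion index (a simplification that ignores the exact cut position), and uses a made-up sequence. The function name and parameters are hypothetical.

```python
import re


def find_pams_near_site(seq: str, site: int, max_dist: int = 30):
    """List NGG PAMs (both strands) whose first base lies within max_dist of site.

    Returns (position, strand, distance) tuples sorted by distance, where
    position is the forward-strand index of the first base of the motif.
    """
    seq = seq.upper()
    hits = []
    # Forward-strand PAM: N-G-G. The lookahead lets overlapping motifs match.
    for match in re.finditer(r"(?=[ACGT]GG)", seq):
        dist = abs(match.start() - site)
        if dist <= max_dist:
            hits.append((match.start(), "+", dist))
    # A reverse-strand NGG reads as C-C-N on the forward strand.
    for match in re.finditer(r"(?=CC[ACGT])", seq):
        dist = abs(match.start() - site)
        if dist <= max_dist:
            hits.append((match.start(), "-", dist))
    return sorted(hits, key=lambda hit: hit[2])


# Made-up sequence: one forward PAM (TGG at index 40) and
# one reverse-strand PAM (CCT at index 83).
toy_seq = "A" * 40 + "TGG" + "A" * 40 + "CCT" + "A" * 40

print(find_pams_near_site(toy_seq, site=50))
print(find_pams_near_site(toy_seq, site=70))
```

In practice, candidate guides found this way would still need to be checked for off-target sites, and the PAM (or protospacer) in the donor would be mutated silently so that the integrated cassette is not cut again.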
Concluding Remarks
Fluorescent reporter knock-ins provide exciting opportunities for a wide range of applications, from basic research to biomedical applications. In particular, editing human stem cells and using them for the differentiation of lineage-specific cell types or even organoids holds a lot of promise for understanding gene and protein function in healthy humans as well as dysfunction in complex diseases (Figure 3). Stem cell knock-ins are still challenging, potentially due to some unique properties of the DNA repair machinery of those cells; for example, it was shown that the knock-in efficiency in human ESCs is lower than that observed in somatic cells [74]. Although DNA repair proteins are highly expressed in human iPSCs [95,96], the efficiency of HDR-mediated knock-in from exogenous templates is low. Notably, it was demonstrated in human preimplantation embryos that a gene with an induced DSB was predominantly repaired using the homologous sequence of the sister allele as template rather than synthetic DNA [97]. Further investigation is needed to achieve highly efficient stem cell editing, and the reader is referred to a recent review for further insights on this specific topic [98].
Figure 3. Future Applications of Reporter Knock-ins. Gene editing is applied to iPSCs in order to label different proteins of interest [e.g., amyloid precursor protein (APP; GFP, green), the APP-binding protein FE65 (RFP, red), and the microtubule-associated protein tau (BFP, blue)]. Subsequently, embryoid bodies and organoids are generated to study protein function in detail by live imaging. Expression of the tagged proteins (e.g., beginning in the embryoid body) allows characterization of each protein’s subcellular localization, cleavage, and degradation. Upon further differentiation, cell-type-specific localization is detectable. Biomedical research and other studies can be performed, for example, to study effects of drug treatment on tagged proteins. Abbreviations: APP, amyloid precursor protein; iPSC, induced pluripotent stem cell.
So far, knock-in stem cell approaches have mainly been used to facilitate the optimization of differentiation protocols by confirming specific protein expression. However, the endogenous tagging of stem cells is a method of high potential, and a wide range of applications can be envisioned. (i) In combination with state-of-the-art imaging techniques [i.e., photoactivated localization microscopy (PALM), reversible saturable optical fluorescence transitions (RESOLFT), structured illumination microscopy (SIM), stimulated emission depletion (STED) microscopy, direct stochastic optical reconstruction microscopy (dSTORM), and fluorescence-lifetime imaging microscopy (FLIM)], CRISPR-mediated fluorescence tagging can be used to study in detail protein colocalization with cellular organelles and processes (e.g., endocytosis, axonal cargo transport, and cytoskeletal dynamics), to reveal highly resolved details of protein–protein interactions within cellular complexes (e.g., promyelocytic leukemia nuclear bodies), and to determine the lifetime of a tagged protein within the cell. These applications can significantly enhance our understanding of cellular signaling and cell biology. (ii) After stem-cell differentiation, the endogenously tagged transcription factors can be monitored via live cell imaging and their shuttling to the nucleus can shed
light on the activated gene cascades and potentially be helpful for studying embryogenesis and accompanying stages. (iii) Several biochemical methods (immunoprecipitation, chromatin immunoprecipitation, and affinity-based mass spectrometry) can be applied to fluorescently tagged endogenous proteins in differentiated iPSCs to identify novel interaction partners in physiological environments. (iv) A disease-associated fluorescently tagged iPSC line (e.g., amyloid precursor protein, tau, FUS, α-synuclein) can be used as a base cell line to further study the impact of disease-associated mutations on the disease-associated pathways; that is, protein aggregation, mislocalization, perturbations in cellular signaling, and anomalies in protein dynamics and interactions. (v) We expect CRISPR technology to be increasingly used in iPSC-derived organoids: protein function (subcellular localization, cell-type-specific expression, cleavage, and degradation) can be studied in developing as well as adult to aged organoids (or organoid slice cultures; Figure 3) under their native conditions. Multiplexing, that is, the simultaneous fluorescence knock-in of different targets (in the sense of a ‘brick’ system), for example, the labeling of different disease-relevant genes, is possible and will provide insights into protein–protein interactions in a cell-type- and time-dependent fashion in 3D tissue. The integration of disease-specific gene mutations is an additional piece in the ‘box of bricks’ in this context. In conclusion, the long-established method of reporter cassettes is thought to be experiencing a revival in the fast-evolving field of 3D organoid technology, uncovering mechanisms of human development, physiology, and pathophysiology (see Outstanding Questions).
Acknowledgments
This work was supported by funding from the ‘Deutsche Forschungsgemeinschaft’ (DFG, Germany, MU3525/3-2), Mercator Research Center Ruhr (Mercur, Germany, Pr-2016-0010), and ‘Forschungsförderung der Medizinischen Fakultät’ (FoRUM, Germany, F870R-2016).
Disclaimer Statement
The authors declare that they have no conflict of interest.
References
1. Ettinger, A. and Wittmann, T. (2014) Fluorescence live cell imaging. Methods Cell Biol. 123, 77–94
2. Magliery, T.J. et al. (2005) Detecting protein-protein interactions with a green fluorescent protein fragment reassembly trap: scope and mechanism. J. Am. Chem. Soc. 127, 146–157
3. Büssow, K. (2015) Stable mammalian producer cell lines for structural biology. Curr. Opin. Struct. Biol. 32, 81–90
4. Hein, M.Y. et al. (2015) A human interactome in three quantitative dimensions organized by stoichiometries and abundances. Cell 163, 712–723
5. Beck, M. et al. (2011) The quantitative proteome of a human cell line. Mol. Syst. Biol. 7, 549
6. Gibson, T.J. et al. (2013) The transience of transient overexpression. Nat. Methods 10, 715–721
7. Ono, M. et al. (2010) Analysis of human small nucleolar RNAs (snoRNA) and the development of snoRNA modulator of gene expression vectors. Mol. Biol. Cell 21, 1569–1584
8. Bibikova, M. et al. (2003) Enhancing gene targeting with designed zinc finger nucleases. Science 300, 764
9. Doudna, J.A. and Charpentier, E. (2014) Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science 346, 1258096
10. Wright, D.A. et al. (2014) TALEN-mediated genome editing: prospects and perspectives. Biochem. J. 462, 15–24
11. Johnson, R.D. and Jasin, M. (2001) Double-strand-break-induced homologous recombination in mammalian cells. Biochem. Soc. Trans. 29, 196–201
12. Ceccaldi, R. et al. (2016) Repair pathway choices and consequences at the double-strand break. Trends Cell Biol. 26, 52–64
13. Carroll, D. (2004) Using nucleases to stimulate homologous recombination. Methods Mol. Biol. 262, 195–207
14. Kim, Y.G. et al. (1996) Hybrid restriction enzymes: zinc finger fusions to Fok I cleavage domain. Proc. Natl. Acad. Sci. U. S. A. 93, 1156–1160
15. Smith, J. et al. (2000) Requirements for double-strand cleavage by chimeric restriction enzymes with zinc finger DNA-recognition domains. Nucleic Acids Res. 28, 3361–3369
16. Urnov, F.D. et al. (2010) Genome editing with engineered zinc finger nucleases. Nat. Rev. Genet. 11, 636–646
17. Beumer, K. et al. (2006) Efficient gene targeting in Drosophila with zinc-finger nucleases. Genetics 172, 2391–2403
18. Skowron, P. et al. (1993) Atypical DNA-binding properties of class-IIS restriction endonucleases: evidence for recognition of the cognate sequence by a FokI monomer. Gene 125, 1–10
19. Wah, D.A. et al. (1997) Structure of the multimodular endonuclease FokI bound to DNA. Nature 388, 97–100
20. Sanders, K.L. et al. (2009) Targeting individual subunits of the FokI restriction endonuclease to specific DNA strands. Nucleic Acids Res. 37, 2105–2115
21. Bibikova, M. et al. (2001) Stimulation of homologous recombination through targeted cleavage by chimeric nucleases. Mol. Cell. Biol. 21, 289–297
Outstanding Questions
Are edited organoids the next-generation tool to study protein function and disease? For example, the subcellular localization of specific proteins as a function of various parameters (time, cell type) can be studied. In this respect, organoids from patients can be compared with those from controls.
Is it the best approach to unravel temporal gene expression between developing and aged tissue? A marker fused to the corresponding gene allows monitoring of protein levels in a cell-type-specific fashion.
Can multi-edited organoids be used to study protein–protein interactions under minimally artificial conditions in a 3D model? For example, the interaction of neuronal proteins with glial proteins can be analyzed in a close-to-human model system.
What are the best labels to reduce potential artefacts, including protein stabilization? Do large protein labels like GFP affect endogenous protein function in 3D tissue?
Is this method better suited to testing drug treatments than animal models, and will it thus foster animal protection?
Is the method of gene-edited organoids suitable for personalized medicine? For example, could patient-specific cerebral organoids be generated to test the treatment response in neurodegenerative disorders or major depression, instead of using a trial-and-error approach?
22. Cathomen, T. and Joung, J.K. (2008) Zinc-finger nucleases: the next generation emerges. Mol. Ther. 16, 1200–1207
23. Miller, J.C. et al. (2007) An improved zinc-finger nuclease architecture for highly specific genome editing. Nat. Biotechnol. 25, 778–785
24. Szczepek, M. et al. (2007) Structure-based redesign of the dimerization interface reduces the toxicity of zinc-finger nucleases. Nat. Biotechnol. 25, 786–793
25. Guo, J. et al. (2010) Directed evolution of an enhanced and highly efficient FokI cleavage domain for zinc finger nucleases. J. Mol. Biol. 400, 96–107
26. Doyon, Y. et al. (2011) Enhancing zinc-finger-nuclease activity with improved obligate heterodimeric architectures. Nat. Methods 8, 74–79
27. Ramalingam, S. et al. (2011) Creating designed zinc-finger nucleases with minimal cytotoxicity. J. Mol. Biol. 405, 630–641
28. Bae, K.H. et al. (2003) Human zinc fingers as building blocks in the construction of artificial transcription factors. Nat. Biotechnol. 21, 275–280
29. Kay, S. et al. (2007) A bacterial effector acts as a plant transcription factor and induces a cell size regulator. Science 318, 648–651
30. Römer, P. et al. (2007) Plant pathogen recognition mediated by promoter activation of the pepper Bs3 resistance gene. Science 318, 645–648
31. Boch, J. et al. (2009) Breaking the code of DNA binding specificity of TAL-type III effectors. Science 326, 1509–1512
32. Moscou, M.J. and Bogdanove, A.J. (2009) A simple cipher governs DNA recognition by TAL effectors. Science 326, 1501
33. Bogdanove, A.J. and Voytas, D.F. (2011) TAL effectors: customizable proteins for DNA targeting. Science 333, 1843–1846
34. Deng, D. et al. (2012) Structural basis for sequence-specific recognition of DNA by TAL effectors. Science 335, 720–723
35. Mak, A.N. et al. (2013) TAL effectors: function, structure, engineering and applications. Curr. Opin. Struct. Biol. 23, 93–99
36. Christian, M. et al. (2010) Targeting DNA double-strand breaks with TAL effector nucleases. Genetics 186, 757–761
37. Geissler, R. et al. (2011) Transcriptional activators of human genes with programmable DNA-specificity. PLoS One 6, e19509
38. Zhang, F. et al. (2011) Efficient construction of sequence-specific TAL effectors for modulating mammalian transcription. Nat. Biotechnol. 29, 149–153
39. Barrangou, R. et al. (2007) CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709–1712
40. Ishino, Y. et al. (1987) Nucleotide sequence of the iap gene, responsible for alkaline phosphatase isozyme conversion in Escherichia coli, and identification of the gene product. J. Bacteriol. 169, 5429–5433
41. Jansen, R. et al. (2002) Identification of genes that are associated with DNA repeats in prokaryotes. Mol. Microbiol. 43, 1565–1575
42. Sun, C.L. et al. (2013) Phage mutations in response to CRISPR diversification in a bacterial population. Environ. Microbiol. 15, 463–470
43. Jinek, M. et al. (2012) A programmable dual-RNA-guided DNA endonuclease in adaptive bacterial immunity. Science 337, 816–821
44. Sander, J.D. and Joung, J.K. (2014) CRISPR-Cas systems for editing, regulating and targeting genomes. Nat. Biotechnol. 32, 347–355
45. Mojica, F.J. et al. (2009) Short motif sequences determine the targets of the prokaryotic CRISPR defence system. Microbiology 155, 733–740
46. Kleinstiver, B.P. et al. (2015) Engineered CRISPR-Cas9 nucleases with altered PAM specificities. Nature 523, 481–485
47. Kleinstiver, B.P. et al. (2015) Broadening the targeting range of Staphylococcus aureus CRISPR-Cas9 by modifying PAM recognition. Nat. Biotechnol. 33, 1293–1298
48. Gasiunas, G. et al. (2012) Cas9-crRNA ribonucleoprotein complex mediates specific DNA cleavage for adaptive immunity in bacteria. Proc. Natl. Acad. Sci. U. S. A. 109, E2579–E2586
49. Nishimasu, H. et al. (2014) Crystal structure of Cas9 in complex with guide RNA and target DNA. Cell 156, 935–949
50. Qi, L.S. et al. (2013) Repurposing CRISPR as an RNA-guided platform for sequence-specific control of gene expression. Cell 152, 1173–1183
51. Larson, M.H. et al. (2013) CRISPR interference (CRISPRi) for sequence-specific control of gene expression. Nat. Protoc. 8, 2180–2196
52. Gilbert, L.A. et al. (2013) CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell 154, 442–451
53. Bikard, D. et al. (2013) Programmable repression and activation of bacterial gene expression using an engineered CRISPR-Cas system. Nucleic Acids Res. 41, 7429–7437
54. Konermann, S. et al. (2013) Optical control of mammalian endogenous transcription and epigenetic states. Nature 500, 472–476
55. Shalem, O. et al. (2015) High-throughput functional genomics using CRISPR-Cas9. Nat. Rev. Genet. 16, 299–311
56. Yang, H. et al. (2013) One-step generation of mice carrying reporter and conditional alleles by CRISPR/Cas-mediated genome engineering. Cell 154, 1370–1379
57. Dickinson, D.J. et al. (2013) Engineering the Caenorhabditis elegans genome using Cas9-triggered homologous recombination. Nat. Methods 10, 1028–1034
58. Frøkjaer-Jensen, C. et al. (2010) Targeted gene deletions in C. elegans using transposon excision. Nat. Methods 7, 451–453
59. Tzur, Y.B. et al. (2013) Heritable custom genomic modifications in Caenorhabditis elegans via a CRISPR-Cas9 system. Genetics 195, 1181–1185
60. Kim, H. et al. (2014) A co-CRISPR strategy for efficient genome editing in Caenorhabditis elegans. Genetics 197, 1069–1080
61. Paix, A. et al. (2014) Scalable and versatile genome editing using linear DNAs with microhomology to Cas9 Sites in Caenorhabditis elegans. Genetics 198, 1347–1356
62. Zhang, W.W. and Matlashewski, G. (2015) CRISPR-Cas9-mediated genome editing in Leishmania donovani. MBio 6, e00861
63. Li, K. et al. (2014) Optimization of genome engineering approaches with the CRISPR/Cas9 system. PLoS One 9, e105779
64. Rothkamm, K. et al. (2003) Pathways of DNA double-strand break repair during the mammalian cell cycle. Mol. Cell. Biol. 23, 5706–5715
65. Passos-Silva, D.G. et al. (2010) Overview of DNA repair in Trypanosoma cruzi, Trypanosoma brucei, and Leishmania major. J. Nucleic Acids 2010, 840768
66. Glover, L. et al. (2008) Sequence homology and microhomology dominate chromosomal double-strand break repair in African trypanosomes. Nucleic Acids Res. 36, 2608–2618
67. Arbab, M. et al. (2015) Cloning-free CRISPR. Stem Cell Reports 5, 908–917
68. Wu, J. et al. (2016) Generation and characterization of a MYF5 reporter human iPS cell line using CRISPR/Cas9 mediated homologous recombination. Sci. Rep. 6, 18759
69. Rong, Z. et al. (2014) Homologous recombination in human embryonic stem cells using CRISPR/Cas9 nickase and a long DNA donor template. Protein Cell 5, 258–260
70. Howden, S.E. and Thomson, J.A. (2014) Gene targeting of human pluripotent stem cells by homologous recombination. Methods Mol. Biol. 1114, 37–55
71. Yoshimi, K. et al. (2016) ssODN-mediated knock-in with CRISPR-Cas for large genomic regions in zygotes. Nat. Commun. 7, 10431
72. Glaser, A. et al. (2016) GFP to BFP Conversion: a versatile assay for the quantification of CRISPR/Cas9-mediated genome editing. Mol. Ther. Nucleic Acids 5, e334
73. Paix, A. et al. (2016) Cas9-assisted recombineering in C. elegans: genome editing using in vivo assembly of linear DNAs. Nucleic Acids Res. 44, e128
74. He, X. et al. (2016) Knock-in of large reporter genes in human cells via CRISPR/Cas9-induced homology-dependent and independent DNA repair. Nucleic Acids Res. 44, e85
75. Chang, P. et al. (2018) The application of NHEJ-CRISPR/Cas9 and Cre-Lox system in the generation of bivalent duck enteritis virus vaccine against avian influenza virus. Viruses 10, 81. Published online February 13, 2018. https://doi.org/10.3390/v10020081
76. Tálas, A. et al. (2017) A convenient method to prescreen candidate guide RNAs for CRISPR/Cas9 gene editing by NHEJ-mediated integration of a ‘self-cleaving’ GFP-expression plasmid. DNA Res. 24, 609–621
77. Suzuki, K. et al. (2016) In vivo genome editing via CRISPR/Cas9 mediated homology-independent targeted integration. Nature 540, 144–149
78. Nussenzweig, A. and Nussenzweig, M.C. (2007) A backup DNA repair pathway moves to the forefront. Cell 131, 223–225
79. McVey, M. and Lee, S.E. (2008) MMEJ repair of double-strand breaks (director’s cut): deleted sequences and alternative endings. Trends Genet. 24, 529–538
80. Orlando, S.J. et al. (2010) Zinc-finger nuclease-driven targeted integration into mammalian genomes using donors with limited chromosomal homology. Nucleic Acids Res. 38, e152
81. Cristea, S. et al. (2013) In vivo cleavage of transgene donors promotes nuclease-mediated targeted integration. Biotechnol. Bioeng. 110, 871–880
82. Maresca, M. et al. (2013) Obligate ligation-gated recombination (ObLiGaRe): custom-designed nuclease-mediated targeted integration through nonhomologous end joining. Genome Res. 23, 539–546
83. Auer, T.O. et al. (2014) Highly efficient CRISPR/Cas9-mediated knock-in in zebrafish by homology-independent DNA repair. Genome Res. 24, 142–153
84. Nakade, S. et al. (2014) Microhomology-mediated end-joining-dependent integration of donor DNA in cells and animals using TALENs and CRISPR/Cas9. Nat. Commun. 5, 5560
85. Hisano, Y. et al. (2015) Precise in-frame integration of exogenous DNA mediated by CRISPR/Cas9 system in zebrafish. Sci. Rep. 5, 8841
86. Sakuma, T. et al. (2015) Homologous recombination-independent large gene cassette knock-in in CHO cells using TALEN and MMEJ-directed donor plasmids. Int. J. Mol. Sci. 16, 23849–23866
87. Aida, T. et al. (2016) Gene cassette knock-in in mammalian cells and zygotes by enhanced MMEJ. BMC Genomics 17, 979
88. Li, X. et al. (1998) Generation of destabilized green fluorescent protein as a transcription reporter. J. Biol. Chem. 273, 34970–34975
89. Chu, V.T. et al. (2015) Increasing the efficiency of homology-directed repair for CRISPR-Cas9-induced precise gene editing in mammalian cells. Nat. Biotechnol. 33, 543–548
90. Paquet, D. et al. (2016) Efficient introduction of specific homozygous and heterozygous mutations using CRISPR/Cas9. Nature 533, 125–129
91. Gutschner, T. et al. (2016) Post-translational regulation of Cas9 during G1 enhances homology-directed repair. Cell Rep. 14, 1555–1566
92. Zhang, J.P. et al. (2017) Efficient precise knockin with a double cut HDR donor after CRISPR/Cas9-mediated double-stranded DNA cleavage. Genome Biol. 18, 35
93. Aird, E.J. et al. (2018) Increasing Cas9-mediated homology-directed repair efficiency through covalent tethering of DNA repair template. Commun. Biol. 1, 54
94. Chen, X. et al. (2013) Fusion protein linkers: property, design and functionality. Adv. Drug Deliv. Rev. 65, 1357–1369
95. Rocha, C.R. et al. (2013) The role of DNA repair in the pluripotency and differentiation of human stem cells. Mutat. Res. 752, 25–35
96. Weissbein, U. et al. (2014) Quality control: genome maintenance in pluripotent stem cells. J. Cell Biol. 204, 153–163
97. Ma, H. et al. (2017) Correction of a pathogenic gene mutation in human embryos. Nature 548, 413–419
98. He, X. et al. (2018) New turns for high efficiency knock-in of large DNA in human pluripotent stem cells. Stem Cells Int. 2018
99. Kim, H. and Kim, J.S. (2014) A guide to genome engineering with programmable nucleases. Nat. Rev. Genet. 15, 321–334
100. Pruett-Miller, S.M. et al. (2008) Comparison of zinc finger nucleases for use in gene targeting in mammalian cells. Mol. Ther. 16, 707–717
101. Porteus, M.H. (2006) Mammalian gene targeting with designed zinc finger nucleases. Mol. Ther. 13, 438–446
102. Alwin, S. et al. (2005) Custom zinc-finger nucleases for use in human cells. Mol. Ther. 12, 610–617
103. Li, T. et al. (2011) Modularly assembled designer TAL effector nucleases for targeted gene knockout and gene replacement in eukaryotes. Nucleic Acids Res. 39, 6315–6325
104. Taheri-Ghahfarokhi, A. et al. (2015) Genome modification of pluripotent cells by using transcription activator-like effector nucleases (TALENs). Methods Mol. Biol. 1330, 253–267
105. Sommer, D. et al. (2014) Efficient genome engineering by targeted homologous recombination in mouse embryos using transcription activator-like effector nucleases. Nat. Commun. 5, 3045
106. Barrangou, R. and Marraffini, L.A. (2014) CRISPR-Cas systems: prokaryotes upgrade to adaptive immunity. Mol. Cell 54, 234–244
107. Cong, L. et al. (2013) Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823
108. Shah, S.A. et al. (2013) Protospacer recognition motifs: mixed identities and functional diversity. RNA Biol. 10, 891–899
109. Gabriel, R. et al. (2015) Mapping the precision of genome editing. Nat. Biotechnol. 33, 150–152
110. Lu, J. et al. (2016) A redesigned CRISPR/Cas9 system for marker-free genome editing in Plasmodium falciparum. Parasit. Vectors 9, 198
111. Schwartz, C. et al. (2017) Standardized markerless gene integration for pathway engineering in Yarrowia lipolytica. ACS Synth. Biol. 6, 402–409
112. Hayashi, A. and Tanaka, K. (2019) Short-homology-mediated CRISPR/Cas9-based method for genome editing in fission yeast. G3 (Bethesda) 9, 1153–1163
113. Zhang, C. et al. (2016) Highly efficient CRISPR mutagenesis by microhomology-mediated end joining in Aspergillus fumigatus. Fungal Genet. Biol. 86, 47–57
114. Koles, K. et al. (2015) Tissue-specific tagging of endogenous loci in Drosophila melanogaster. Biol. Open 5, 83–89
115. Wissel, S. et al. (2016) A combination of CRISPR/Cas9 and standardized RNAi as a versatile platform for the characterization of gene function. G3 (Bethesda) 6, 2467–2478
116. Rajagopal, N. et al. (2016) High-throughput mapping of regulatory DNA. Nat. Biotechnol. 34, 167–174
117. Basiri, M. et al. (2017) The convenience of single homology arm donor DNA and CRISPR/Cas9-nickase for targeted insertion of long DNA fragment. Cell J. 18, 532–539
118. Leonetti, M.D. et al. (2016) A scalable strategy for high-throughput GFP tagging of endogenous human proteins. Proc. Natl. Acad. Sci. U. S. A. 113, E3501–E3508
119. Wen, Y. et al. (2017) A stable but reversible integrated surrogate reporter for assaying CRISPR/Cas9-stimulated homology-directed repair. J. Biol. Chem. 292, 6148–6162
120. Roberts, B. et al. (2017) Systematic gene tagging using CRISPR/Cas9 in human stem cells to illuminate cell organization. Mol. Biol. Cell 28, 2854–2874
Trends in Cell Biology, November 2019, Vol. 29, No. 11