Gene Insertion Into Genomic Safe Harbors for Human Gene Therapy

ACCEPTED ARTICLE PREVIEW

Accepted Article Preview: Published ahead of advance online publication Gene insertion into genomic safe harbors for human gene therapy

Eirini P Papapetrou, and Axel Schambach

Cite this article as: Eirini P Papapetrou, and Axel Schambach, Gene insertion into genomic safe harbors for human gene therapy, Molecular Therapy accepted article preview online 12 February 2016; doi:10.1038/mt.2016.38

This is a PDF file of an unedited peer-reviewed manuscript that has been accepted for publication. NPG is providing this early version of the manuscript as a service to our customers. The manuscript will undergo copyediting, typesetting and a proof review before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers apply.

Received 26 August 2015; accepted 05 February 2016; Accepted article preview online 12 February 2016

© 2016 The American Society of Gene & Cell Therapy. All rights reserved

Eirini P Papapetrou1,2,3* and Axel Schambach4,5

1Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA; 2The Tisch Cancer Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; 3The Black Family Stem Cell Institute, Icahn School of Medicine at Mount Sinai, New York, NY, USA; 4Institute of Experimental Hematology, Hannover Medical School, Hannover, Germany; 5Division of Hematology/Oncology, Boston Children’s Hospital, Harvard Medical School, Boston, MA, USA.

*Correspondence: Eirini P Papapetrou, MD, PhD, Department of Oncological Sciences, Icahn School of Medicine at Mount Sinai, One Gustave L. Levy Place, Box 1044A, New York, NY 10029

E-mail: [email protected]

Phone: 212-824-9337 Fax: 646-537-9576

Short title: Genomic safe harbors for gene therapy

Key words: safe harbor, integration sites, gene addition, iPSCs, retroviral/lentiviral vectors

Abstract

Genomic safe harbors (GSHs) are sites in the genome able to accommodate the integration of new genetic material in a manner that ensures that the newly inserted genetic elements: (a) function predictably and (b) do not cause alterations of the host genome posing a risk to the host cell or organism. GSHs are thus ideal sites for transgene insertion whose use can empower functional genetics studies in basic research and therapeutic applications in human gene therapy. Currently, no fully validated GSHs exist in the human genome. Here, we review our formerly proposed GSH criteria and discuss additional considerations on extending these criteria, on strategies for the identification and validation of GSHs, as well as future prospects on GSH targeting for therapeutic applications. In view of recent advances in genome biology, gene targeting technologies and regenerative medicine, gene insertion into GSHs can potentially catalyze nearly all applications in human gene therapy.

The concept of GSH and previously proposed criteria

The majority of gene therapy applications that are currently in the clinic or in advanced stages of preclinical development rely on gene addition. Efficient transgene insertion and expression is predominantly achieved through gamma-retroviral and lentiviral vectors1-5. However, transgenes newly inserted into random genomic positions interact with the host genome in unpredictable ways. Reciprocal interactions between a transgene and the cell’s genomic context can affect the expression of the transgene itself, leading to attenuation or complete silencing6-10, and, more critically, the expression of endogenous genes located in the immediate neighborhood of the insertion site or at a distance through longer-range interactions. While the former effects diminish or abrogate the therapeutic effect, the latter can have catastrophic consequences for the host organism. Dysregulated expression of key genes may alter cellular behavior in dramatic ways, promoting clonal expansion or malignant transformation of the host cell, as demonstrated in clinical trials of retroviral gene transfer into hematopoietic stem cells (HSCs), resulting in “preleukemia” or myelodysplastic syndrome (MDS) in some cases and in overt acute leukemia in others11-15.

We previously introduced a conceptual framework for genomic safe harbors (GSHs): genomic sites that can support predictable transgene expression while minimizing the risk of unwanted interactions with the host genome16,17. In order to avoid insertions that can lead to dysregulation of the endogenous gene expression program of the host cell, we proposed five GSH criteria16,17. These were intended to avoid the two types of insertional events that predominantly result in gene dysregulation: activation of adjacent genes and gene disruption. The most common mechanism of insertional oncogenesis is activation of oncogene expression through an increase in the rate of transcription by the vector’s promoter/enhancer load. In these cases, vector integration sites are typically located just upstream of the dysregulated gene, most commonly within 50 kb of the transcriptional start site. The second most common mechanism of dysregulation of endogenous genes results from insertions within transcription units, mediated by formation of new gene products through fusion of viral and cellular sequences as a result of aberrant splicing or readthrough, or by truncation of endogenous transcription units removing negative regulatory sequences15.

Our five criteria thus aimed to exclude regions of the genome in close proximity (within 50 kb) to coding and non-coding genes – requiring extra distance (at least 300 kb) from genes known to play a role in cancer in humans or model organisms and microRNA genes18-20. They also aimed at avoiding disruption of endogenous coding genes and conserved genetic elements by exclusion of sites located inside transcription units and ultra-conserved genomic regions. We provided proof of principle in a disease model of beta-thalassemia showing that a therapeutic beta-globin transgene inserted in sites meeting the five criteria – i.e. extragenic sites distant from endogenous genes - can afford therapeutic levels of expression without dysregulation of expression of endogenous genes16.
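The five criteria above amount to a positional filter, which can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the interval format, the function names and the choice to measure distance to the nearest edge of an annotated gene (rather than specifically to its 5' end) are assumptions made here for simplicity, and real annotation sets would have to be supplied by the user.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation format: (chromosome, start, end) in base pairs.
Interval = Tuple[str, int, int]

NEAR_GENE_BP = 50_000        # minimum distance from any coding or non-coding gene
NEAR_RISK_GENE_BP = 300_000  # minimum distance from cancer-related and microRNA genes


def distance_to(chrom: str, pos: int, interval: Interval) -> float:
    """Distance from a position to an interval: 0 if inside, inf if on another chromosome."""
    i_chrom, start, end = interval
    if i_chrom != chrom:
        return float("inf")
    if start <= pos <= end:
        return 0
    return min(abs(pos - start), abs(pos - end))


def nearest(chrom: str, pos: int, intervals: List[Interval]) -> float:
    """Distance to the closest interval in a list (inf if the list is empty)."""
    return min((distance_to(chrom, pos, iv) for iv in intervals), default=float("inf"))


def check_core_criteria(
    chrom: str,
    pos: int,
    genes: List[Interval],
    cancer_genes: List[Interval],
    mirnas: List[Interval],
    ultraconserved: List[Interval],
) -> Dict[str, bool]:
    """Evaluate each of the five positional criteria for one candidate site."""
    checks = {
        "far_from_genes": nearest(chrom, pos, genes) >= NEAR_GENE_BP,
        "far_from_cancer_genes": nearest(chrom, pos, cancer_genes) >= NEAR_RISK_GENE_BP,
        "far_from_mirnas": nearest(chrom, pos, mirnas) >= NEAR_RISK_GENE_BP,
        "outside_transcription_units": nearest(chrom, pos, genes) > 0,
        "outside_ultraconserved": nearest(chrom, pos, ultraconserved) > 0,
    }
    checks["meets_all"] = all(checks.values())
    return checks
```

Returning one flag per criterion, rather than a single yes/no answer, makes it easy to see which criterion a candidate site fails. As stressed throughout this review, passing such a filter is only a starting point and does not establish safety.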

No chromosomal location in the human genome has yet been demonstrated to qualify as a bona fide GSH, i.e. to be adequate for reliable and safe therapeutic transgene addition, with any level of confidence. Three sites have to date mostly been targeted for transgene addition: (1) the adeno-associated virus site 1 (AAVS1), a naturally occurring site of integration of AAV on chromosome 19; (2) the chemokine (C-C motif) receptor 5 (CCR5) gene, a chemokine receptor gene known as an HIV-1 coreceptor; and (3) the human ortholog of the mouse Rosa26 locus, a locus extensively validated in the murine setting for the insertion of ubiquitously expressed transgenes21-24. While none of these sites perfectly matches the criteria above, some functional data are available for some of them on the robustness of expression of integrated transgenes and their effects on the neighboring genome25. The AAVS1 site on chromosome 19, in particular, has gained popularity because of the commercial availability of effective tools for its targeting and its ability to support transgene expression in multiple cell types26-29. However, it cannot support faithful transgene expression in at least some cell lineages, and transgenes inserted at the AAVS1 locus can be silenced through mechanisms that include DNA methylation30. Furthermore, integration in the AAVS1 locus disrupts the protein phosphatase 1 regulatory subunit 12C (PPP1R12C) gene, and the consequences of its haploinsufficiency or complete inactivation in various cell types have not been investigated in depth. The CCR5 gene locus was identified as a putative safe harbor after the discovery that people with a naturally occurring homozygous deletion leading to complete CCR5 gene disruption were resistant to HIV-1 infection and had no signs of overt pathology23, although subsequent studies associated CCR5 gene knockout with increased susceptibility to disease caused by infection with the West Nile Virus and the related Japanese Encephalitis Virus31-33. Therefore, the safety of all of the above sites is at present unknown at best and, taking into account their localization inside coding genes and in gene-dense areas and their proximity to genes implicated in cancer, very uncertain17. Although sites like AAVS1, CCR5 and other intragenic regions may be acceptable for research applications, a much higher burden of proof of safety is needed for clinical applications.

Given this uncertainty and the growing body of information on functional features of the human genome, we next discuss additional considerations that could be applied to the core GSH criteria, as well as approaches towards the identification and validation of GSHs (Figure 1).

Considerations on extended GSH criteria

The GSH definition presented above is a reflection of lessons learned from side effects related to insertional mutagenesis by LTR-driven retroviruses. The criteria were mostly empirically based on known genome annotations, primarily coding and non-coding genes. With regard to putative regulatory genomic sequences, a conservative approach of excluding only regions exhibiting very high conservation was adopted. Going forward, GSH criteria will need to be amended to take into account emerging information on genome organization and function, including three-dimensional nuclear organization and the rapid discovery of new genomic functional elements. The distances specified in the original criteria may also be revised as additional information emerges on how the regional, domain-level and large-scale organization of the genome influences the function of genetic sequences. For example, longer-range interactions exceeding the 300 kb limit that we originally set seem increasingly possible. Data supporting the possibility of oncogene activation by a viral genome integrated ~500 kb upstream have more recently been reported, albeit in HeLa cells, which are highly aneuploid34. Below, we discuss some putative additions to the original criteria. It should be noted that our GSH definition does not account for any toxicity that may result from expression (at physiologic or supra-physiologic levels) of the transgene itself.

Essential genes. Recent studies taking advantage of CRISPR and other high-throughput methods to knock out genes have addressed gene essentiality in human cells35, 36. These genome-scale systematic approaches identified approximately 2,000 genes (approximately 10% of the ~20,000 protein-coding genes in the human genome) that are indispensable for cell viability. Although essentiality of a specific gene may be heavily context-dependent and disruption of only one allele may be tolerated, inactivation of this set of genes should probably be avoided. Similarly, essential and non-essential genes can be teased apart at the organismal level from Genome-Wide Association Studies (GWAS), as well as from rare human knockouts (like the CCR5 case discussed above)37, 38. However, all of the above (gene essentiality screens and human genetics) only inform on which genomic regions should not be disrupted and not on potential problems that may arise from aberrant transcriptional upregulation.

New regulatory transcripts. These include regulatory non-coding RNAs (ncRNAs) other than miRNAs (which are taken into consideration in the original criteria), mainly regulatory small non-coding RNAs (sncRNAs) and long non-coding RNAs (lncRNAs)39. Some lncRNAs have been shown to play important roles in various cellular and physiologic processes, including gene expression and gene regulation, chromatin dynamics, differentiation and development40. The disruption or dysregulation of ncRNAs can lead to diseases, e.g. cancer and immunological disorders. For example, HOTAIR (HOX antisense intergenic RNA) encodes a lncRNA that regulates key epigenetic regulators and silencing, and its dysregulation could lead to cancer formation and other diseases41. Even though the role of most lncRNAs remains at present uncertain, it is probably prudent that a GSH neither lie in close proximity to nor disrupt a ncRNA, especially if the latter is well characterized.

Three-dimensional nuclear organization. Our original criteria take into consideration exclusively kilobase-scale genomic interactions, involving primarily enhancer-promoter interactions. Indeed, all insertional oncogenesis events known to date have occurred within this scale. As our perception of our genome’s dynamic folding and packaging inside the cell nucleus evolves, we may become increasingly aware of the influence of additional levels of genome organization and higher order chromatin structure on transgene expression42. Genome-wide interaction studies using chromosome conformation capture-based approaches (such as 3C, 5C and Hi-C) revealed that the genome is partitioned into megabase-scale topologically associated domains (TADs)43, 44. Genetic elements frequently interact with each other within a domain, but engage very rarely in inter-domain interactions. TAD boundary regions (the generally “non-looped” stretch of DNA between two TADs) often contain CTCF binding sites that block interactions across adjacent TADs. Thus, TADs are believed to represent regulatory genomic units within which enhancers and promoters can interact and which are insulated from each other.

There are still more questions than answers on how chromatin topology relates to genome function. Genome folding is not as rigid and stable as, for example, protein structure, and whether it is a cause or consequence of genome activity is still unclear45. However, there clearly is a correlation between chromatin topology and gene activity. Furthermore, TADs and their boundaries are largely conserved across different cell types. Thus, the relative location of a genomic site with respect to TADs could inform the choice of GSHs. For example, it might be useful to avoid sites within TADs enriched in cancer genes or to favor a GSH located in a TAD border.
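The idea of relating a candidate site to TADs can be illustrated with a short Python sketch. It is a hypothetical example under stated assumptions: TADs are given as simple (chromosome, start, end) intervals, cancer genes as single positions, and any position not covered by a TAD is treated as a boundary or unassigned region. None of these conventions come from the studies cited here.

```python
from typing import Dict, List, Optional, Tuple, Union

Tad = Tuple[str, int, int]       # hypothetical format: (chromosome, start, end), end exclusive
GenePosition = Tuple[str, int]   # hypothetical format: (chromosome, position)


def locate_in_tads(chrom: str, pos: int, tads: List[Tad]) -> Optional[Tad]:
    """Return the TAD containing the position, or None if it falls between TADs."""
    for tad in tads:
        t_chrom, start, end = tad
        if t_chrom == chrom and start <= pos < end:
            return tad
    return None


def cancer_genes_in_tad(tad: Tad, cancer_genes: List[GenePosition]) -> int:
    """Count the cancer-related genes whose position lies inside the given TAD."""
    t_chrom, start, end = tad
    return sum(1 for g_chrom, g_pos in cancer_genes if g_chrom == t_chrom and start <= g_pos < end)


def describe_site(
    chrom: str, pos: int, tads: List[Tad], cancer_genes: List[GenePosition]
) -> Dict[str, Union[str, int]]:
    """Summarize where a candidate site sits relative to TADs and cancer genes."""
    tad = locate_in_tads(chrom, pos, tads)
    if tad is None:
        return {"location": "tad_boundary_or_unassigned", "cancer_genes_in_tad": 0}
    return {"location": "within_tad", "cancer_genes_in_tad": cancer_genes_in_tad(tad, cancer_genes)}
```

A summary of this kind could be used to rank candidate sites, for instance by deprioritizing those that share a TAD with several cancer genes. How many such genes make a TAD "enriched" is left open here, as it is in the text.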

“Epigenetic” marks and regulatory DNA. An additional layer of complexity is added by epigenetic modifications which register, signal or perpetuate altered genomic activity states46. These include DNA modifications (e.g. 5-methylcytosine, 5-hydroxymethylcytosine), post-translational histone tail modifications and nucleosomal remodeling47. Several features can help distinguish accessible and transcriptionally active from repressive chromatin and these may help predict the capacity of a given genomic site to support adequate transgene expression. Such predictions can be made in a cell type-specific manner using data obtained by various methods and through large-scale initiatives (ENCODE, Roadmap), including ChIP-Seq data on transcription factor binding, genome-wide DNA methylation, promoter/enhancer signatures inferred by histone marks and chromatin accessibility. Nucleosomes in the vicinity of active enhancers typically contain histones with characteristic post-translational modifications, such as histone H3 lysine 4 monomethylation (H3K4me1) and H3K27 acetylation (H3K27ac) at their amino termini, while inactive enhancers show enrichment of the Polycomb protein-associated repressive H3K27me3 mark48. Moreover, chromatin accessibility as assessed by DNase I hypersensitivity or, more recently, ATAC-seq identifies DNA sequences that are not bound by nucleosomes and are therefore accessible to the transcriptional machinery.
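The mark combinations described above can be written as a simple rule-based labeling. The sketch below is a deliberate simplification: it assumes each mark has already been called as present or absent at a site in a given cell type (real ChIP-Seq signals are quantitative), and the function name and output labels are illustrative choices made here.

```python
from typing import Iterable

ACTIVE_ENHANCER_MARKS = {"H3K4me1", "H3K27ac"}  # marks named in the text for active enhancers
REPRESSIVE_MARK = "H3K27me3"                    # Polycomb-associated repressive mark


def classify_enhancer_state(marks: Iterable[str]) -> str:
    """Label a site from the set of histone marks called as present at it."""
    present = set(marks)
    active = ACTIVE_ENHANCER_MARKS <= present   # both active marks are present
    repressed = REPRESSIVE_MARK in present
    if active and repressed:
        return "ambiguous"       # conflicting calls, e.g. from a mixed cell population
    if active:
        return "active"
    if repressed:
        return "inactive"
    return "undetermined"        # not enough information from these marks alone
```

For example, `classify_enhancer_state({"H3K4me1", "H3K27ac"})` returns `"active"`, while a site carrying only H3K27me3 is labeled `"inactive"`. Because such labels are specific to a cell type and a point in time, they can inform predictions about transgene expression but cannot settle them.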

Despite the currently rather loose definition of regulatory DNA and much debate surrounding the portion of the genome that is functional, there is likely more to the human genome that should not be interfered with than is covered by the core criteria. In contrast to the linear organization of the genome, which can be easily analyzed by DNA sequencing in cases of insertional mutagenesis in preclinical models and the clinic, the contribution of epigenetic features, if any, to insertional dysregulation of genes is very hard to study, as epigenetic modifications are dynamic, unstable and cell-context dependent. In addition, vector insertion likely reshapes the surrounding chromatin in ways that we still do not adequately understand25. It is generally assumed that markers of “active” chromatin would be desirable in a GSH as they are predictive of high transcriptional activity and thus likelihood of transgene expression, while, on the contrary, integration into an area of “repressive” chromatin will likely result in silencing. Furthermore, as discussed below, “closed” chromatin may also hamper gene targeting. In view of the current state of knowledge, no formal guidelines based on epigenetic information can be added to the DNA sequence-based criteria. However, information on chromatin status may facilitate predictions regarding the likelihood of expression from any given genomic site in any given cell type. For example, a GSH may be a locus with active chromatin in the desired cell type(s), but not in others in which aberrant expression may be detrimental, such as tissue stem cells.

Prospects for improved identification and functional characterization of GSHs

Subtypes of GSHs. Different types of GSHs can be envisioned depending on the specific application, taking into account parameters such as the transgene (levels of expression required), the target cell type (risk of transformation and potential for ex vivo manipulation) and the disease (absolute number and proportion of genetically modified cells required for therapeutic effect). What is an acceptable GSH may vary widely. For example, applications that require transgene over-expression for therapeutic benefit4 will need to utilize sites that can afford these levels in a given tissue. “Conditional” or application-specific GSHs can be envisioned in some cases. For example, a site which disrupts a highly expressed gene and “hijacks” the transcriptional output corresponding to one of its two (or more in some cases) endogenous copies may be a viable option, if the function of the endogenous gene is not dosage-dependent and/or if it is expressed at such high levels that its haploinsufficiency phenotype is minimal or inconsequential for the target cell. For example, in a recent study the liver-specific albumin gene locus was used to insert a human coagulation factor IX gene via a recombinant adeno-associated virus (rAAV) vector49. An argument could thus be made that genomic loci that have been carefully studied might constitute better GSHs than sites meeting the criteria discussed before, which mostly select extragenic regions likely to be less studied. In another study, a human factor IX gene was inserted into ribosomal DNA (rDNA) sequences, which exist in ~400 copies in the human genome50. Alternatively, one or a few “universal” GSHs, sites with robust expression and no side effects across all cell types and all stages of differentiation, if identified and validated, could accommodate most or all applications. In the latter case, the requirements for validation and ascertainment of safety would be much higher, but the pay-off would also be proportionately high. If adopted by a large number of investigators, concerted efforts towards validation of universal GSHs in diverse systems could quickly build a record of efficacy and safety.

Discovery of GSHs. Once we have determined the desirable attributes of the sites we aim for, how can we go about identifying them? In silico approaches can be a useful first step, but predictions afforded by bioinformatics tools can only go so far. Retroviral and, in particular, lentiviral vectors can be especially helpful as tools to search for promising GSHs. The mouse Rosa26 locus, a locus widely used in mouse transgenics, was discovered through a retroviral gene trap screen51. Because lentiviral vectors preferentially integrate in transcriptionally active genomic sites52, candidate sites “captured” through lentiviral integration will be enriched in sites that can support transgene expression. At the same time, these sites will also be more likely to be in the vicinity of endogenous genes and perturb their expression16. (Fewer than 20% of lentiviral integrations genome-wide meet the core GSH criteria.) Retroviral or lentiviral screens can be coupled to recombinase-mediated cassette exchange (RMCE) strategies to allow researchers to “re-use” individual integration sites in order to assess different expression cassettes harboring a variety of cis-regulatory elements, both constitutive and tissue-specific, with relative ease53, 54. Human pluripotent stem cells (hPSCs) offer an ideal platform for such screens and can streamline the discovery and the validation phases, as discussed further below.

Functional validation. Candidate sites that seem to have desirable safety and expression features will need to be further validated in relevant settings. Since expression from a specific genomic site can be cell type- and differentiation stage-dependent, validation may involve in vitro differentiation systems based on hPSCs (embryonic stem cells, ESCs, and induced pluripotent stem cells, iPSCs) and/or mouse models engineered through various approaches to harbor transgenes integrated in the candidate GSH or its homologous murine locus. These systems allow testing the stability of transgene expression and its effects on the neighboring genome and epigenome in diverse cell types and developmental stages. hPSC and mouse models will likely provide complementary information, as the former will allow assessment of the effects of integration in the cognate human genomic and cellular context without confounding species-specific or synteny-related factors. On the other hand, mouse models will allow assessment of potential genotoxic effects at the level of the organism to exclude detrimental side effects, like carcinogenesis or serious organ pathology. Furthermore, because of the longer experience and familiarity of the research community with mouse models, they will help build confidence in GSH use within the scientific and regulatory environment.

One can thus envision a biphasic validation workflow, consisting of an initial screen for molecular and cellular toxicities using hPSCs in vitro and a subsequent, more targeted, in vivo test for toxicities at the organismal level using mouse, large animal or non-human primate models. The former would entail targeting a reporter transgene, driven by one or more strong constitutive promoters (in the case of validating putative universal GSH sites) or an application-specific promoter, in hPSC lines and testing undifferentiated as well as differentiated progeny belonging to a panel of cell types representative of human tissues (encompassing for example blood, heart, muscle, liver, neuronal and other tissues). Apart from transgene expression, toxicity testing would utilize established phenotypic and functional assays, as well as gene expression and epigenome assays, which could be primarily focused on the genomic region immediately adjacent to the site and, secondarily, also test in a genome-wide fashion for potential distal perturbations employing next-generation sequencing-based genomics assays, Hi-C and other assays paralleling future technical developments. Ideally, a few reference hPSC lines that have undergone extensive quality control will be available for these validation assays. Sites that are cleared in the initial validation phase could be tested further in animal models. The mouse appears the most appropriate model at present, although with the emergence of CRISPR, transgenesis of larger animal models may become more accessible. In this validation phase, mice with transgenes integrated in the homologous loci will be generated and transgene expression in multiple tissues will be assessed. In parallel, the animals would be observed long-term for morbidities and tumors. Testing in animal models affords significantly lower throughput, requires longer experimental periods and may be of limited value for candidate GSHs located in regions with poorly conserved synteny, but will likely be essential to exclude toxicities that would not be evident in in vitro systems. Finally, it should be noted that, irrespective of what physical location criteria a candidate site meets, a verdict of safety ultimately relies on functional assays. The stringency of the latter will be essential for claiming a GSH to be safe with any degree of confidence.

Towards applications in human gene therapy

Diseases and transgenes. All diseases that can be treated by gene addition could provide candidates for gene therapy applications utilizing one or more established GSHs. In the case of monogenic diseases caused by specific gene mutations, while in situ editing of the disease-associated locus by homologous recombination may permit gene correction, it will likely not be practical for most diseases, and addition of a wild-type copy of the mutated gene will still be the method of choice55, 56. This will almost certainly be the case for genetic disorders caused by loss-of-function mutations scattered throughout the entire length of large genes (which would alternatively require the development of a large array of customized targeting vectors), as well as for disorders resulting from large genomic changes, such as deletion of an entire gene, like alpha-thalassemia. In other cases, addition of a gain-of-function mutant, rather than a wild-type copy of a gene, may have superior therapeutic potential, as it may have protective effects. Globin genes with anti-sickling properties present such an example57.

Apart from integration of therapeutic transgenes, GSHs could support addition of exogenous genes that confer new properties to cells. These include drug resistance genes to select gene-modified cells58 and tumor antigen receptor genes, such as chimeric antigen receptor (CAR) genes, to mediate anti-tumor effects59. Another use would be to introduce reporter genes for lineage tracing or functional genetics studies in basic research. The latter would also be of use in cell-based therapies, particularly stem cell-based therapies, to facilitate tracking and imaging of cells, especially in an autologous setting where the cells of the graft are genetically indistinguishable from the host. These can be combined with suicide genes to safeguard against cells that have undergone de-differentiation and transformation, against aberrantly proliferating partially differentiated progenitor cells, as well as against the less likely occurrence of teratoma formation by residual pluripotent cells contaminating the cell graft60-64.

Pre-determination or prospective selection of GSHs. Any given therapeutic application may utilize targeting of pre-determined GSHs or prospectively selected ones (Figure 2). The first approach relies on a few or even only one “universal” GSH and requires efficient site-specific gene targeting tools to be in place. Gene transfer can theoretically be performed either in vitro or in vivo. With the recent emergence of pioneering technologies allowing targeted gene delivery through homologous recombination with higher efficiency than could be attained before, the development of universal approaches for the genetic modification of human cells in GSHs is becoming a very attractive prospect65. AAV-mediated gene targeting, as well as homologous recombination enhanced by the introduction of DNA double strand breaks using site-specific endonucleases66 (zinc-finger nucleases67, meganucleases68, transcription activator-like effector (TALE) nucleases69 and more recently the CRISPR-Cas9 system70,71) are all tools that can mediate targeted insertion of foreign DNA at predetermined genomic sites with efficiency that can be therapeutically meaningful. However, as discussed above, one or a few GSHs that can support most or all gene transfer applications will first need to be identified and carefully validated. This approach requires a very high level of confidence in the safety of a locus, as nearly 100% of the targeted cells will contain a transgene insertion in this locus. Finally, since targeting technologies, despite the great advances in recent years, can still afford efficiencies in the order of a few percent of cells at best, these approaches may be more appropriate, at least in the beginning, for diseases in which therapeutic transgene expression provides a selection advantage to the corrected cells.

In the second approach, the prospective selection of GSHs, any integrating gene delivery method, e.g. a retroviral vector, can be used. The limitation is that it can only be applied to target cells that have considerable self-renewal and proliferation potential and that are amenable to ex vivo genetic modification. These properties are required to enable subcloning, selection of clones with the desired modification and their subsequent expansion. Perhaps the only cells that offer this possibility are pluripotent stem cells, although applications in some tissue-specific stem cells could be envisioned if ex vivo culture conditions can support clonal growth and extensive expansion. The steps of an approach for prospective GSH selection using patient-derived autologous iPSCs are described in Papapetrou et al16. In this study, iPSC lines derived from patients with beta-thalassemia were transduced at low multiplicity of infection (MOI) with a lentiviral vector expressing a therapeutic beta-globin transgene together with a “floxed” PGK-Neo selection cassette, so that clones with single vector integrations could be isolated after single-cell subcloning and G418 selection. Clones that harbored a single vector copy and could be rigorously confirmed to be clonal were selected, the vector integration site was mapped and integration sites that met the five core criteria were chosen. Differentiation along the erythroid lineage showed that β-globin expression from GSHs can reach therapeutically relevant levels without perturbation of endogenous genes in the neighborhood of the integration site or globally. In theory, this is a clinically translatable approach to autologous cell and gene therapy using gene addition into GSHs, although many important roadblocks related to the use of pluripotent stem cells for cell therapies need to be overcome before this becomes a realistic prospect.

A consideration pertaining to both approaches mentioned above is that, whenever the cell type in which the gene modification is performed differs from the cell type in which the transgene needs to be expressed (which is always the case in stem cell-based approaches), the GSH may be transcriptionally inactive and/or inaccessible in the stem cell state, and this may have a negative impact on targeting efficiency and on clone screening and selection.
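The rationale for transducing at low MOI in the prospective selection approach can be illustrated with a simple calculation. The sketch below is not taken from the study; it assumes that the number of integrated vector copies per cell follows a Poisson distribution whose mean is the effective number of integrations per cell (an idealization, since real transductions can deviate from it). The function names `copy_number_pmf` and `single_copy_fraction` are hypothetical.

```python
import math


def copy_number_pmf(mean_copies: float, k: int) -> float:
    """Poisson probability that a cell carries exactly k integrated vector copies.

    Assumption: integrations are independent events with mean `mean_copies` per cell.
    """
    if mean_copies <= 0:
        raise ValueError("mean_copies must be positive")
    return math.exp(-mean_copies) * mean_copies**k / math.factorial(k)


def single_copy_fraction(mean_copies: float) -> float:
    """Among transduced cells (at least one copy), the fraction with exactly one copy."""
    p_zero = copy_number_pmf(mean_copies, 0)
    p_one = copy_number_pmf(mean_copies, 1)
    return p_one / (1.0 - p_zero)


if __name__ == "__main__":
    for mean_copies in (0.1, 0.5, 1.0, 3.0):
        transduced = 1.0 - copy_number_pmf(mean_copies, 0)
        print(
            f"mean copies/cell = {mean_copies}: "
            f"transduced = {transduced:.1%}, "
            f"single-copy among transduced = {single_copy_fraction(mean_copies):.1%}"
        )
```

Under this simplified model, a mean of 0.1 copies per cell leaves roughly 95% of transduced cells with a single copy while transducing fewer than 10% of cells. This trade-off is consistent with pairing low-MOI transduction with drug selection and with confirming vector copy number in each clone.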

Concluding remarks

Gene addition will always be a method of choice for a number of gene therapy applications, and these will greatly benefit from the identification and validation of just a handful of appropriate genomic sites for gene insertion. Such GSH sites should be able to “host” transgenes, ensuring their predictable expression, while maintaining “neutrality” in the context of both the linear genome and the three-dimensional nuclear environment. GSHs can be defined broadly or in an application-specific manner. Our previous criteria provide general recommendations that can be amended to incorporate new information on the function of the human genome and regulatory DNA as it becomes available. In silico approaches and retroviral screens can aid the discovery of GSHs. Functional validation will be necessary and can involve in vitro and in vivo hPSC and transgenic mouse models to detect adverse events at the multi-tissue and whole-animal level. hPSCs, which offer the possibility to model development and differentiation “in-a-dish”, can help characterize the expression afforded by specific candidate GSHs in different lineages, cell types and stages of differentiation.
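As an illustration of how an in silico screen might apply positional criteria to mapped integration sites, consider the minimal sketch below. It is an assumption-laden example rather than the authors' pipeline. The thresholds are the five criteria as recalled from refs 16 and 17 (at least 50 kb from the 5' end of any gene, at least 300 kb from any cancer-related gene, at least 300 kb from any microRNA, outside a transcription unit, outside ultraconserved regions) and should be checked against the criteria stated in the article. The `SiteAnnotation` record, its field names and the example clones are hypothetical, and the distances are presumed to have been computed beforehand from genome annotations.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumed thresholds (recalled from refs 16 and 17); verify before any real use.
MIN_DIST_GENE_5PRIME_BP = 50_000
MIN_DIST_CANCER_GENE_BP = 300_000
MIN_DIST_MIRNA_BP = 300_000


@dataclass(frozen=True)
class SiteAnnotation:
    """Hypothetical pre-computed annotation of one mapped integration site."""
    site_id: str
    dist_gene_5prime_bp: int      # distance to the nearest gene 5' end
    dist_cancer_gene_bp: int      # distance to the nearest cancer-related gene
    dist_mirna_bp: int            # distance to the nearest microRNA
    in_transcription_unit: bool
    in_ultraconserved_region: bool


def failed_criteria(site: SiteAnnotation) -> list[str]:
    """Return a label for each criterion that the site does not meet."""
    checks = [
        ("too close to a gene 5' end", site.dist_gene_5prime_bp < MIN_DIST_GENE_5PRIME_BP),
        ("too close to a cancer-related gene", site.dist_cancer_gene_bp < MIN_DIST_CANCER_GENE_BP),
        ("too close to a microRNA", site.dist_mirna_bp < MIN_DIST_MIRNA_BP),
        ("inside a transcription unit", site.in_transcription_unit),
        ("inside an ultraconserved region", site.in_ultraconserved_region),
    ]
    return [label for label, failed in checks if failed]


def is_candidate_gsh(site: SiteAnnotation) -> bool:
    """A site is retained only if it meets all five criteria."""
    return not failed_criteria(site)


if __name__ == "__main__":
    # Made-up example sites, for illustration only.
    sites = [
        SiteAnnotation("clone_A", 120_000, 450_000, 800_000, False, False),
        SiteAnnotation("clone_B", 20_000, 900_000, 600_000, True, False),
    ]
    for site in sites:
        failures = failed_criteria(site)
        print(site.site_id, "candidate GSH" if not failures else f"rejected: {failures}")
```

Returning the list of failed criteria, rather than a bare yes/no, keeps the filter easy to amend as the criteria are revised. A site passing such a filter would remain only a candidate pending functional validation.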

Acknowledgments

EPP is supported by NIH grants R00 DK087923 and R01 HL121570, by the Lawrence Ellison Foundation and by the Damon Runyon Cancer Research Foundation. AS is supported by the Federal Ministry of Education and Research (BMBF, network projects ReGene, PidNet and IFB-Tx), the Deutsche Forschungsgemeinschaft (SFB738 and Cluster of Excellence REBIRTH), and the European Union (FP7 integrated projects CELL-PID and PERSIST).

References

1. Cartier, N. et al. Hematopoietic stem cell gene therapy with a lentiviral vector in X-linked adrenoleukodystrophy. Science 326, 818-823 (2009).
2. Gaspar, H.B. et al. Hematopoietic stem cell gene therapy for adenosine deaminase-deficient severe combined immunodeficiency leads to long-term immunological recovery and metabolic correction. Sci Transl Med 3, 97ra80 (2011).
3. Hacein-Bey-Abina, S. et al. A modified gamma-retrovirus vector for X-linked severe combined immunodeficiency. N Engl J Med 371, 1407-1417 (2014).
4. Biffi, A. et al. Lentiviral hematopoietic stem cell gene therapy benefits metachromatic leukodystrophy. Science 341, 1233158 (2013).
5. Aiuti, A. et al. Lentiviral hematopoietic stem cell gene therapy in patients with Wiskott-Aldrich syndrome. Science 341, 1233151 (2013).
6. Martin, D.I. & Whitelaw, E. The vagaries of variegating transgenes. Bioessays 18, 919-923 (1996).
7. Kioussis, D. & Festenstein, R. Locus control regions: overcoming heterochromatin-induced gene inactivation in mammals. Curr Opin Genet Dev 7, 614-619 (1997).
8. Rivella, S. & Sadelain, M. Genetic treatment of severe hemoglobinopathies: the combat against transgene variegation and transgene silencing. Semin Hematol 35, 112-125 (1998).
9. Bestor, T.H. Gene silencing as a threat to the success of gene therapy. J Clin Invest 105, 409-411 (2000).
10. Ellis, J. Silencing and variegation of gammaretrovirus and lentivirus vectors. Hum Gene Ther 16, 1241-1246 (2005).
11. Hacein-Bey-Abina, S. et al. LMO2-associated clonal T cell proliferation in two patients after gene therapy for SCID-X1. Science 302, 415-419 (2003).
12. Ott, M.G. et al. Correction of X-linked chronic granulomatous disease by gene therapy, augmented by insertional activation of MDS1-EVI1, PRDM16 or SETBP1. Nat Med 12, 401-409 (2006).
13. Stein, S. et al. Genomic instability and myelodysplasia with monosomy 7 consequent to EVI1 activation after gene therapy for chronic granulomatous disease. Nat Med 16, 198-204 (2010).
14. Howe, S.J. et al. Insertional mutagenesis combined with acquired somatic mutations causes leukemogenesis following gene therapy of SCID-X1 patients. J Clin Invest 118, 3143-3150 (2008).
15. Cavazzana-Calvo, M. et al. Transfusion independence and HMGA2 activation after gene therapy of human beta-thalassaemia. Nature 467, 318-322 (2010).
16. Papapetrou, E.P. et al. Genomic safe harbors permit high beta-globin transgene expression in thalassemia induced pluripotent stem cells. Nat Biotechnol 29, 73-78 (2011).
17. Sadelain, M., Papapetrou, E.P. & Bushman, F.D. Safe harbours for the integration of new DNA in the human genome. Nat Rev Cancer 12, 51-58 (2012).
18. Li, Q., Peterson, K.R., Fang, X. & Stamatoyannopoulos, G. Locus control regions. Blood 100, 3077-3086 (2002).
19. Bejerano, G. et al. Ultraconserved elements in the human genome. Science 304, 1321-1325 (2004).
20. Fraser, P. Transcriptional control thrown for a loop. Curr Opin Genet Dev 16, 490-495 (2006).
21. Irion, S. et al. Identification and targeting of the ROSA26 locus in human embryonic stem cells. Nat Biotechnol 25, 1477-1482 (2007).
22. Kotin, R.M., Linden, R.M. & Berns, K.I. Characterization of a preferred site on human chromosome 19q for integration of adeno-associated virus DNA by non-homologous recombination. EMBO J 11, 5071-5078 (1992).
23. Liu, R. et al. Homozygous defect in HIV-1 coreceptor accounts for resistance of some multiply-exposed individuals to HIV-1 infection. Cell 86, 367-377 (1996).
24. Perez, E.E. et al. Establishment of HIV-1 resistance in CD4+ T cells by genome editing using zinc-finger nucleases. Nat Biotechnol 26, 808-816 (2008).
25. Lombardo, A. et al. Site-specific integration and tailoring of cassette design for sustainable gene transfer. Nat Methods 8, 861-869 (2011).
26. Smith, J.R. et al. Robust, persistent transgene expression in human embryonic stem cells is achieved with AAVS1-targeted integration. Stem Cells 26, 496-504 (2008).
27. Yang, L. et al. Human cardiovascular progenitor cells develop from a KDR+ embryonic-stem-cell-derived population. Nature 453, 524-528 (2008).
28. Zou, J. et al. Oxidase-deficient neutrophils from X-linked chronic granulomatous disease iPS cells: functional correction by zinc finger nuclease-mediated safe harbor targeting. Blood 117, 5561-5572 (2011).
29. Ramachandra, C.J. et al. Efficient recombinase-mediated cassette exchange at the AAVS1 locus in human embryonic stem cells using baculoviral vectors. Nucleic Acids Res (2011).
30. Ordovas, L. et al. Efficient recombinase-mediated cassette exchange in hPSCs to study the hepatocyte lineage reveals AAVS1 locus-mediated transgene inhibition. Stem Cell Reports 5, 918-931 (2015).
31. Glass, W.G. et al. CCR5 deficiency increases risk of symptomatic West Nile virus infection. J Exp Med 203, 35-40 (2006).
32. Lim, J.K. & Murphy, P.M. Chemokine control of West Nile virus infection. Exp Cell Res 317, 569-574 (2011).
33. Larena, M., Regner, M. & Lobigs, M. The chemokine receptor CCR5, a therapeutic target for HIV/AIDS antagonists, is critical for recovery in a mouse model of Japanese encephalitis. PLoS One 7, e44834 (2012).
34. Adey, A. et al. The haplotype-resolved genome and epigenome of the aneuploid HeLa cancer cell line. Nature 500, 207-211 (2013).
35. Blomen, V.A. et al. Gene essentiality and synthetic lethality in haploid human cells. Science 350, 1092-1096 (2015).
36. Wang, T. et al. Identification and characterization of essential genes in the human genome. Science 350, 1096-1101 (2015).
37. MacArthur, D.G. et al. A systematic survey of loss-of-function variants in human protein-coding genes. Science 335, 823-828 (2012).
38. Sulem, P. et al. Identification of a large set of rare complete human knockouts. Nat Genet 47, 448-452 (2015).
39. Huang, B. & Zhang, R. Regulatory non-coding RNAs: revolutionizing the RNA world. Mol Biol Rep 41, 3915-3923 (2014).
40. Fatica, A. & Bozzoni, I. Long non-coding RNAs: new players in cell differentiation and development. Nat Rev Genet 15, 7-21 (2014).
41. Bhan, A. & Mandal, S.S. LncRNA HOTAIR: A master regulator of chromatin dynamics and cancer. Biochim Biophys Acta 1856, 151-164 (2015).
42. Sexton, T. & Cavalli, G. The role of chromosome domains in shaping the functional genome. Cell 160, 1049-1059 (2015).
43. Dixon, J.R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376-380 (2012).
44. Nora, E.P. et al. Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485, 381-385 (2012).
45. Cavalli, G. & Misteli, T. Functional implications of genome topology. Nat Struct Mol Biol 20, 290-299 (2013).
46. Bird, A. Perceptions of epigenetics. Nature 447, 396-398 (2007).
47. Boland, M.J., Nazor, K.L. & Loring, J.F. Epigenetic regulation of pluripotency and differentiation. Circ Res 115, 311-324 (2014).
48. Shlyueva, D., Stampfel, G. & Stark, A. Transcriptional enhancers: from properties to genome-wide predictions. Nat Rev Genet 15, 272-286 (2014).
49. Barzel, A. et al. Promoterless gene targeting without nucleases ameliorates haemophilia B in mice. Nature 517, 360-364 (2015).
50. Lisowski, L. et al. Ribosomal DNA integrating rAAV-rDNA vectors allow for stable transgene expression. Mol Ther 20, 1912-1923 (2012).
51. Zambrowicz, B.P. et al. Disruption of overlapping transcripts in the ROSA beta geo 26 gene trap strain leads to widespread expression of beta-galactosidase in mouse embryos and hematopoietic cells. Proc Natl Acad Sci U S A 94, 3789-3794 (1997).
52. Schroder, A.R. et al. HIV-1 integration in the human genome favors active genes and local hotspots. Cell 110, 521-529 (2002).
53. Kuehle, J. et al. Modified lentiviral LTRs allow Flp recombinase-mediated cassette exchange and in vivo tracing of "factor-free" induced pluripotent stem cells. Mol Ther 22, 919-928 (2014).
54. Turan, S. et al. Expanding Flp-RMCE options: the potential of Recombinase Mediated Twin-Site Targeting (RMTT). Gene 546, 135-144 (2014).
55. Ellis, J. et al. Benefits of utilizing gene-modified iPSCs for clinical applications. Cell Stem Cell 7, 429-430 (2010).
56. Notarangelo, L.D. Correcting CGD safely, iPSo facto. Blood 117, 5554-5556 (2011).
57. Urbinati, F. et al. Potentially therapeutic levels of anti-sickling globin gene expression following lentivirus-mediated gene transfer in sickle cell disease bone marrow CD34+ cells. Exp Hematol 43, 346-351 (2015).
58. Beard, B.C. et al. Efficient and stable MGMT-mediated selection of long-term repopulating stem cells in nonhuman primates. J Clin Invest 120, 2345-2354 (2010).
59. Rosenberg, S.A. & Restifo, N.P. Adoptive cell transfer as personalized immunotherapy for human cancer. Science 348, 62-68 (2015).
60. Wakitani, S. et al. Embryonic stem cells injected into the mouse knee joint form teratomas and subsequently destroy the joint. Rheumatology (Oxford) 42, 162-165 (2003).
61. Nussbaum, J. et al. Transplantation of undifferentiated murine embryonic stem cells in the heart: teratoma formation and immune response. FASEB J 21, 1345-1357 (2007).
62. Bjorklund, L.M. et al. Embryonic stem cells develop into functional dopaminergic neurons after transplantation in a Parkinson rat model. Proc Natl Acad Sci U S A 99, 2344-2349 (2002).
63. Fu, W. et al. Residual undifferentiated cells during differentiation of induced pluripotent stem cells in vitro and in vivo. Stem Cells Dev 21, 521-529 (2012).
64. Zhong, B. et al. Safeguarding nonhuman primate iPS cells with suicide genes. Mol Ther 19, 1667-1675 (2011).
65. Cox, D.B., Platt, R.J. & Zhang, F. Therapeutic genome editing: prospects and challenges. Nat Med 21, 121-131 (2015).
66. Jasin, M. Genetic manipulation of genomes with rare-cutting endonucleases. Trends Genet 12, 224-228 (1996).
67. Porteus, M.H. & Carroll, D. Gene targeting using zinc finger nucleases. Nat Biotechnol 23, 967-973 (2005).
68. Paques, F. & Duchateau, P. Meganucleases and DNA double-strand break-induced recombination: perspectives for gene therapy. Curr Gene Ther 7, 49-66 (2007).
69. Boch, J. TALEs of genome targeting. Nat Biotechnol 29, 135-136 (2011).
70. Mali, P. et al. RNA-guided human genome engineering via Cas9. Science 339, 823-826 (2013).
71. Doudna, J.A. & Charpentier, E. Genome editing. The new frontier of genome engineering with CRISPR-Cas9. Science 346, 1258096 (2014).

Figure legends

Figure 1. The making of a GSH. Establishing one or a few GSHs will require moving from a wish list of desirable attributes to well-defined criteria, followed by an in silico or wet lab-based discovery phase and extensive validation in multiple cell types and stages of development, ideally including tests both in a human context and at the whole-organism level.

Figure 2. A priori or prospective GSH determination. In the therapeutic example on the left, GSHs are prospectively selected after screening a number of clones with unique integration sites generated by random insertion. This approach is only feasible because hPSCs are endowed with extensive potential for in vitro self-renewal, subcloning and expansion. Alternatively (right), a pre-determined GSH can be targeted using a number of gene targeting tools (CRISPR/Cas9, TALENs, ZFNs, AAV and others). If the cell type allows single-cell subcloning, screening for verification of correctly targeted clones can be included as an optional step. hPSCs: human pluripotent stem cells; CRISPR: Clustered Regularly Interspaced Short Palindromic Repeats; Cas9: CRISPR-associated nuclease 9; TALENs: Transcription Activator-Like Effector Nucleases; ZFNs: Zinc Finger Nucleases; AAV: Adeno-Associated Virus.

Figure 1

Figure 2
