Biochimica et Biophysica Acta, 1032 (1990) 1-17 Elsevier BBACAN 87218
The structure of mutation in mammalian cells
Mark Meuth
Imperial Cancer Research Fund, Clare Hall Laboratories, South Mimms (U.K.)
(Received 29 August 1989)
Contents
I. Introduction
II. Target loci
   A. Selectable loci
   B. Shuttle vectors
III. Simple base substitutions
   A. Spontaneous mutations
   B. Mutations in genetically unstable strains
   C. Effects of DNA-damaging agents
IV. Deletions
   A. Germ line deletions
   B. Spontaneous deletions at selectable loci
   C. Induction of deletions in cultured cells
V. Insertions
VI. Translocations/inversions
VII. Gene amplification
   A. Novel joints
   B. Structure of the arrays
   C. Some proposals for mechanisms
VIII. Future directions
References
I. Introduction
The importance of mutation in the process of carcinogenesis is becoming increasingly clear; however, our understanding of the factors and influences governing the formation of these changes in gene structure is considerably less advanced. Recent developments in the
Abbreviations: LDL, low-density lipoprotein; HLA, human histocompatibility complex; BPDE, benzo[a]pyrene dihydrodiolepoxide.
Correspondence: M. Meuth, Imperial Cancer Research Fund, Clare Hall Laboratories, South Mimms, Hertfordshire EN6 3LD, U.K.
analysis of gene mutation at 'selectable loci' in cultured mammalian cells (in particular the polymerase chain reaction [107]) are beginning to give some insight into the mechanisms responsible for these mutations. The purpose of this review will be to summarise the approaches being taken to analyse the structure of mutation in model cell culture systems, to present some of the results of these analyses, and to discuss some of the ideas advanced concerning the factors responsible for gene structure alterations. Furthermore, the mutations arising in model cell culture systems will be compared throughout with those occurring in human germ lines (responsible for various inherited disorders) and those responsible for oncogene formation.
0304-419X/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)
II. Target loci
Progress in the analysis of mutations stems from recent advances in cloning the targets of mutational alterations and in the ability to analyse mutant genes at the molecular level. This has been done by analysing mutations occurring at chromosomal gene targets and at transiently introduced targets (shuttle vectors). Since the nature of mutation is, to some extent, a function of the target, I will first review the different loci which have been used.

A great variety of mutations occurring in the germ lines or somatic cells of organisms have been described. Because of the obvious importance of inherited disorders to human populations, cloned probes for loci such as α- and β-globin [21], the low-density lipoprotein (LDL) receptor [68], factor VIII [37], and, more recently, Duchenne muscular dystrophy [61] have been isolated and used to study mutations giving rise to the disorders. In somatic cells, dominant oncogenes are generated by point mutations of normal cellular homologues (e.g., the ras oncogene) or gene rearrangements (e.g., amplification of the c-myc gene; for reviews, see Refs. 8 and 132). Chromosomal rearrangements which are specific and diagnostic of the tumour in which they occur [140] also generate oncogenes. Two well characterized examples of such tumour-specific rearrangements are the Philadelphia chromosome, found in chronic myelogenous leukemia [104], and the translocation between chromosomal bands 11q13 and 14q32 in some chronic lymphocytic leukemias to produce the bcl-1 oncogene [22]. Deletion of a defined band of chromosome 13 predisposes individuals to retinoblastoma [117]. Chromosome walking techniques have now been used to clone the human gene (Rb-1) which, when deleted or altered, leads to retinoblastoma and osteosarcoma [29,35]. The nature of many of the mutations at these loci has been dealt with in other reviews (e.g., Ref. 44) and will not be discussed here other than to compare them with mutations occurring in cultured somatic cells.
II-A. Selectable loci
'Selectable' genetic markers (such as drug resistance [113]) have been widely used to analyse mutations in cultured mammalian cells. The overriding advantage of these systems is that mutations are selected in cells growing in defined conditions, making it possible to examine the stresses (genetic, chemical or physical) which promote rearrangements; the causes of germ line mutations, in contrast, are much more difficult to characterise. Most analyses of such targets have centred around four loci (summarized in Table I). These genes code for non-essential purine or pyrimidine salvage enzymes, allowing simple single-step drug selections for deficient mutants or back selections for revertants. Despite the similarity of these enzymes, the sizes of the structural genes encoding them differ significantly. Aprt is the smallest, with a 2.5 kb DNA fragment carrying the entire structural gene [77], while hprt is the largest, with the structural gene spread over some 44 kb of DNA in human cells [94]. Clearly, a small target facilitates the detection, mapping and analysis of mutations at the molecular level and, as a result, the greatest body of mutation data comes from analyses of the aprt locus. On the other hand, there are limitations to working with this locus: the small size of the gene and the hemizygous cell strains used could affect the types and distribution of mutations recovered (as discussed further below). Furthermore, mutations in any sized gene can be analysed through the chemical amplification of reverse copies of mutant mRNAs [135], although not all mutant genes are expressed and splice junction mutations can also cause problems (a skipped exon indicates the site of a mutation, but not its sequence). One important advantage of working with hprt is its location on the X chromosome, meaning that the locus
TABLE I
Selectable loci

Locus  Enzyme encoded                                    Map site a            Message (kb)  Exons  Genomic region (kb)  Refs
aprt   adenine phosphoribosyl transferase                16q24; Z4, Z7 (CHO)                 5      2.5                  77
dhfr   dihydrofolate reductase                           Z2 (CHO)              1 15          6      25                   18
hprt   hypoxanthine-guanine phosphoribosyl transferase   Xq26                  1.6           9      44                   94, 123
tk     thymidine kinase                                  17p21                 1.2-1.5       7      11-14                11, 71, 139

a Human map site unless otherwise indicated.
is hemizygous in cells from males and functionally hemizygous in cells from females. Thus the selection of hprt-deficient mutants is feasible from virtually any cell line without gross chromosomal abnormalities. Additionally, a pool of germ line-induced hprt⁻ mutations is available in the form of individuals suffering from Lesch-Nyhan syndrome [123]. The difficulty of the autosomal location of the other loci has been overcome by isolating hemizygous [10,87,129] or heterozygous strains [137], but only a limited number of such strains are available.

Another target which has great potential for the analysis of mutations in humans as well as in cultured cells is the human major histocompatibility complex (HLA). The HLA complex is autosomal and it is polymorphic, so that most individuals are heterozygous. An effective selection system has been developed for the detection of mutations at one allele, in which cells expressing an antigen target are selectively killed using specific antibodies together with complement, leaving only rare 'variants' or mutants [59,93,98]. An extensive number of molecular probes are available for the study of target and flanking sequences in mutant strains.

II-B. Shuttle vectors
Shuttle vectors were developed to circumvent the laborious task of examining mutations induced in chromosomal genes. The design was to incorporate a mutational target which can be rapidly analysed at the sequence level in bacteria into a vector which would be mutated during replication in cultured mammalian somatic cells. Basically, three types of shuttle vector have been developed. One type is based on SV40 and replicates autonomously in the nucleus of transfected cells [65]. The second utilises the Epstein-Barr virus as vector, which replicates as a free plasmid in mammalian nuclei but under cellular control [26]. The third type of vector is integrated into the cellular genome and replicates with chromosomal genes, but can be specifically rescued from surrounding cellular sequences [4].

With the development of polymerase chain reaction techniques, shuttle vectors no longer have such a great speed advantage, but they have numerous other useful features. Freely replicating vectors can be used in a broad range of cell lines (mostly human, though), in particular in lines which grow or plate poorly (the recipient cell line needs only to replicate the vector). These vectors also facilitate studies aimed at correlating the regions prone to DNA damage with those producing mutations, and are useful for addressing the problem of inducible repair pathways, as the effect of pretreating recipient cells with a mutagen can be examined. Vectors which integrate throughout the genome can be used to study position effects.

Unfortunately, these systems also have several drawbacks. Firstly, SV40-based vectors can have unexpectedly high spontaneous mutation frequencies, as high as 10⁻² (thought to be a result of damage incurred during transfection [16,101]). This can be partly overcome by using a particular human cell line as recipient [65], but this reduces the advantage of using shuttle vectors. More recently, the use of vectors based on the Epstein-Barr virus has provided much more satisfactory backgrounds [26], and integrated vectors also have significantly lower background frequencies, although still not to the low level of true chromosomal genes [4,126]. These vectors are also unsuitable for the study of genome rearrangements involving DNA fragments much greater than 1 kb or for relating the influence of higher order chromosome structure on mutation.
III. Simple base substitutions
III-A. Spontaneous mutations
Mutations occurring during normal cellular growth in culture have been examined using several of the model systems described above. Not surprisingly, virtually every type of base substitution has been observed. Large collections of aprt-deficient mutants obtained from hemizygous CHO strains have been reported by three laboratories [1,47,87], two of the groups having characterized significant numbers of these mutations at the sequence level. Southern blot analysis has shown that the great majority (> 80%) of mutations of the aprt locus are 'small' lesions (i.e., single base pair substitutions or deletions/duplications of less than 40 bp) which escape the resolution of the technique. Sequence analysis of the mutant genes has confirmed this, although the distribution and types of mutations in two independent collections differ [23,91,95].

A collection of approx. 90 mutant genes was analysed in this laboratory either by cloning mapped mutant genes (i.e., those affecting restriction endonuclease sites) in λ vectors [88,91] or by amplifying the exons of mutant genes by the polymerase chain reaction and sequencing the double-stranded products [95]. These studies revealed a variety of spontaneous mutational events with all types of transitions and transversions represented (see Table II). G·C → A·T transitions were the largest class of mutations but by no means predominant. On the other hand, over 75% of the simple base substitutions (46% of the total collection) were at G·C base pairs. Base pair deletions/insertions (including frameshifts) accounted for approx. 34% of mutants. Mutations were evenly distributed over the aprt structural gene with few recurring sites. One region of the structural gene was notable in that it was the site of several independent deletion mutations, including a cluster of four identical three base-pair deletions (discussed further below). The high frequency of events at G·C base pairs would implicate modifications of these bases
TABLE II
Some mutational spectra for the aprt locus

                                        Spontaneous    Thy mutator  UV-light  Benzo[a]pyrene  γ-Radiation
                                                       induced      induced   induced         induced
Refs                                    [87-89,91,95]  [96,97]      [27]      [79]            [46,82]
Mutant strains in collection            120            45           34        59              85
Mutants with 'visible' rearrangements a 8              0            0         0               20
Mutant genes sequenced b                78 (100)       40 (100)     34 (100)  21 (100)        43 (100)
Transitions
  G·C → A·T                             22 (28)        1 (2)        16 (47)   1 (5)           5 (12)
  A·T → G·C                             9 (13)         16 (40)      1 (3)     0               3 (7)
Transversions
  G·C → T·A                             11 (15)        0            0         13 (62)         6 (14)
  G·C → C·G                             9 (12)         1 (2)        4 (13)    3 (14)          5 (12)
  A·T → C·G                             2 (3)          20 (50)      1 (3)     0               4 (9)
  A·T → T·A                             2 (3)          0            2 (6)     2 (10)          3 (7)
Multiple substitutions                  1 (1)          0            7 (21)    0               3 (7)
Frameshifts                             6 (8)          0            1 (3)     2 (10)          6 (14)
Duplications c                          5 (6)          0            0         0               0
Deletions (< 40 bp)                     11 (15)        2 (5)        0         0               8 (19)

a Mutations detectable by Southern blotting (i.e., deletions and insertions > 40 bp or inversions).
b Figures do not include mutations with visible alterations (large deletions, insertions). Figures in parentheses represent the percentage of the sequenced genes represented by each class of mutation.
c Duplication of an adjacent sequence.
as a major source of spontaneous errors. Such modifications could be the result of deamination [73], alkylation caused by intracellular metabolites [105], or oxidation by endogenously generated active oxygen species. It seems unlikely that 5-methylcytosine is involved in this process at aprt, as most of these mutations occur outside the CpG dinucleotide methylation sites, which account for most of this methylation (for review, see Ref. 100). A host of enzymes recognise and excise such modified bases [72,75]; the apurinic/apyrimidinic sites produced by excision could be the sites of misincorporation of nucleotides (predominantly of dATP [106], but also, to a lesser extent, the other dNTPs [63,124]). These mutations could also be the rare replicational errors which escape proofreading exonucleases and mismatch repair mechanisms.

De Jong et al. [23] found a predominance of G·C → A·T transitions in 30 mutant aprt genes isolated by a rapid cloning technique utilizing λ vectors. No A·T → G·C transitions were observed, although A·T base pairs were the site of two of the five transversions. The distribution of the mutants was also striking in that 23% of the mutations (all G·C → A·T transitions) were found at one site. There was no obvious explanation for this 'hotspot'; it is not an apparent site for cytosine methylation or secondary structure. This 'hotspot' has
not been observed in two other large mutant collections; in fact, among the 90 mutant genes we sequenced, no substitutions at all were found at the site [95]. This collection was also unusual in the absence of large deletion mutations, which constitute about 10% of mutants in two other collections. It is possible that subtle differences in cell growth conditions may influence the spontaneous mutations collected, although it is crucial that strict protocols be followed for mutant isolation and cloning to guarantee the independence of sequenced mutant genes.

Most of the sequenced mutations (regardless of the collection) result in drastic amino acid substitutions. Nonsense mutations and altered splice sites (at canonical donor/acceptor signals in the stringent selections employed here) each represented about 10% of mutations, while translational start and stop signals were altered in approx. 5% of the mutants. Upstream promoter, downstream noncoding, or intron sequences are not primary targets in any of the mutant strains.

The analysis of spontaneous mutation at other chromosomal loci is not nearly as advanced. Southern blot analysis reveals that deletions are much more common (particularly at autosomal loci (e.g., Ref. 137), discussed in detail below), while sequence determinations indicate that the patterns of single base substitutions may also
be different from that found at aprt. Thus the spectrum of spontaneous mutation may be, to some degree, a function of the target locus. At the CHO hprt locus, base substitutions account for about half of the mutations which can be analysed at the nucleotide level (from a collection of 25 spontaneous mutants; Caligo and Meuth, unpublished data). Unlike the aprt⁻ mutants isolated from the same strain, virtually all these mutations are transversions at G·C base pairs. Transversion mutations also occur frequently among germ line mutations at the human hprt locus of Lesch-Nyhan patients, although there is no significant bias towards G·C base pairs [41]. Three spontaneous dhfr-deficient mutants of CHO cells have also been characterized at the sequence level [84]. All three alter consensus splicing sequences, leading to loss of exon 5 of dhfr, and involve G·C base pairs.

On the other hand, mutations of chromosomally integrated shuttle vectors have been analysed in considerable detail by several groups. Spontaneous mutations of the gpt gene introduced into an hprt-deficient mouse strain by a retroviral shuttle vector were dominated by a recurring three base-pair deletion (occurring in 16 of 43 sequenced mutants) at a site where the trinucleotide is repeated [4]. The rest of the mutations were deletions ranging from 1 to 1250 base pairs or single base-pair substitutions of virtually every type. Another group, using the same gpt target but constructed in another vector and introduced into CHO cells, found the identical mutation in 19/62 mutants sequenced, with deletions of varying sizes accounting for almost 80% of mutations [127]. In contrast, a third group studying spontaneous mutations in the gpt gene in CHO did not find any recurring mutations at the site [102].
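The proportion of spontaneous aprt substitutions falling at G·C base pairs can be rechecked from the counts listed in the 'Spontaneous' column of Table II. The short Python sketch below does only this arithmetic; the dictionary layout and the function name `gc_fraction` are illustrative choices and not part of the original analysis.

```python
# Simple base substitutions among the 78 sequenced spontaneous aprt
# mutants, as listed in the 'Spontaneous' column of Table II.
SPONTANEOUS = {
    "GC>AT": 22,  # transition
    "AT>GC": 9,   # transition
    "GC>TA": 11,  # transversion
    "GC>CG": 9,   # transversion
    "AT>CG": 2,   # transversion
    "AT>TA": 2,   # transversion
}
SEQUENCED = 78  # mutant genes sequenced (Table II)


def gc_fraction(counts):
    """Fraction of substitutions whose original base pair was G.C."""
    total = sum(counts.values())
    at_gc = sum(n for change, n in counts.items() if change.startswith("GC"))
    return at_gc / total


if __name__ == "__main__":
    total = sum(SPONTANEOUS.values())
    print(f"simple substitutions: {total} of {SEQUENCED} sequenced genes")
    print(f"substitutions at G.C base pairs: {gc_fraction(SPONTANEOUS):.0%}")
```

With these counts, 42 of the 55 simple substitutions fall at G·C base pairs, in line with the figure of over 75% quoted in the text.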
III-B. Mutations in genetically unstable strains
Mutational 'spectra' can be profoundly altered by cellular mutations affecting the accuracy of DNA replication or by treatment of cells with DNA-damaging agents. For example, increasing the intracellular content of the precursors of DNA synthesis, the deoxyribonucleoside triphosphates (particularly dCTP and dTTP), in CHO cells increases the rate of mutation at several genetic loci [80]. In one case, a CHO strain with a 10-fold increase in the pool of dCTP (caused by the altered regulation of one of the enzymes involved in the synthesis of this precursor) has a corresponding increase in the rate of mutation at aprt. Not surprisingly, such mutations are the result of the misincorporation of the nucleotide in excess, giving, in the case of excess dCTP, T → C transitions and, more unexpectedly, A → C transversions in virtually equal proportions (Refs. 96 and 97, and see Table II). The latter mutations are virtually absent among spontaneous collections and appear to be the result of an energetically unfavourable
C·T mispair. The data suggest that the substantial increase (~200-500×) in the frequency of this transversion is due to inefficient proofreading of the mispair during replication. It is also possible that any mismatch correction mechanism responsible for the control of these errors does not function efficiently in the presence of precursor imbalances. Such analyses demonstrate the usefulness of this approach to investigate the mechanisms maintaining the accuracy of DNA replication in mammalian cells. The limitation is that very few other mammalian mutator strains have yet been identified.
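The link between an excess nucleotide and the base change it produces can be made explicit. The Python sketch below assumes the simplest model: the nucleotide in excess is inserted opposite a template base during replication and escapes correction, so that the newly synthesised strand carries the excess base in place of the correct complement. The function names are hypothetical, and the model ignores sequence context and repair.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}


def misincorporation(template_base, inserted_base):
    """Return (expected, mutant) bases on the newly synthesised strand."""
    expected = COMPLEMENT[template_base]
    if inserted_base == expected:
        raise ValueError("inserted base pairs correctly; no mutation")
    return expected, inserted_base


def classify(original, mutant):
    """Transition: purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    same_class = (original in PURINES) == (mutant in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    # Excess dCTP: C can be misinserted opposite template A, T or C.
    for template in "ATC":
        original, mutant = misincorporation(template, "C")
        print(f"C opposite template {template}: "
              f"{original} -> {mutant} ({classify(original, mutant)})")
```

Under this model, C inserted opposite template A gives the T → C transition, while C inserted opposite template T (the C·T mispair) gives the A → C transversion discussed above.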
III-C. Effects of DNA-damaging agents
A number of groups have examined the effect of a variety of DNA-damaging agents on the mutations occurring at target loci. As this review is not meant to concentrate on mutagenesis per se, I will only provide an overview of these data. I have also presented a summary of 'mutational spectra' induced by several of these agents at one target (aprt, Table II) to illustrate the distinctive patterns which emerge. Basically, these data indicate that mutagenesis in mammalian cells is largely targeted, i.e., the bases modified at certain sites (particularly those involved in base pairing) by damaging agents are the ones causing mutation. As yet there is little convincing evidence for an error-prone inducible repair pathway.

For example, DNA-alkylating agents such as EMS induce simple G·C → A·T transitions in shuttle vectors (freely replicating [66] or integrated [3]), corresponding to the level of O⁶-alkylguanine residues in DNA produced by this agent. Ethyl nitrosourea, on the other hand, also induces unusual A·T → T·A and A·T → C·G transversions in 30% of the mutations, which corresponds with a low but significant level of O²-ethylthymine residues in the DNA of treated cells [30].

The most widely studied mutagen is ultraviolet light. In most of the model systems, the largest class of mutations induced by this agent is simple G·C → A·T transitions, mostly at dipyrimidine sites, indicating that lesions at such sites are premutagenic (Refs. 9, 27, 51, 99, 102 and 135, and see Table II for data regarding aprt). In contrast, the mutations induced at the hamster hprt locus are a mixture of transitions and transversions at dipyrimidine sites involving either G·C or A·T base pairs [135]. Of the two potential lesions produced at such sites, experiments with shuttle vectors indicate that cyclobutane dimers may be the predominant premutagenic lesion, as the mutation frequency can be substantially reduced by photoreactivation of the damaged
DNA [99]. Differences between mutations at chromosomal targets and those found in freely replicating shuttle vectors are also evident. The major difference is the significant number of multiple mutations found in shuttle vector targets, these
mutations being largely absent among chromosomal mutations [27,102], perhaps because of the lower doses of UV used to generate these alterations. Although the sites at which UV-induced mutations occur are similar, there are differences in the relative proportions of mutations. This could reflect differences in the cell lines, i.e., the nucleotide misinserted opposite a premutagenic lesion might vary from cell line to cell line (perhaps even depending upon such factors as dNTP pool content), although they are largely explicable by misinsertion of A opposite the photoproducts [51]. Hot spots for UV-induced mutations were reported for many of the systems, but it was not apparent why these clusters of mutations occur. In one study using an SV40-based shuttle vector system, the hot spots did not necessarily correspond to the sites of excessive damage [25].

The frequency and pattern of mutations induced on a shuttle vector were altered when the preirradiated vector was introduced into UV-sensitive, excision repair-deficient human cells (of Xeroderma pigmentosum complementation groups A [12] or D [112]). The frequency of mutations was increased up to 6-fold in the XP strains and the mutations became predominantly G·C → A·T transitions. Furthermore, multiple mutations, which are prevalent among UV-induced mutations in these vectors replicated in wild-type cells, virtually disappear in the sensitive strains. Since the Xeroderma strains are deficient in excision repair, it was suggested that the multiple mutations might be linked to a normally functioning excision repair system in human cell lines. Similarly, the pattern of mutations at the hprt locus was dramatically altered in an excision repair-defective hamster strain to become predominantly G·C → A·T transitions [135]. Since dimers were poorly repaired in this strain, it was suggested that these mutations were the result of A misinsertion opposite the dimers. Additionally, there was a striking asymmetry in the distribution of mutations on the two DNA strands. Most of the
mutations in the repair-deficient strain fell on the transcribed strand, as opposed to the slight excess of mutations on the nontranscribed strand in the wild-type strains. Thus other factors (such as the differential ability of lagging and leading strand DNA synthesis to misincorporate A opposite the dimers) may dictate the pattern of mutations in the absence of repair.

The effect of an agent which adds bulky adducts to the DNA of cells has been examined using the aprt locus of CHO cells (Ref. 79 and Table II) and the supF locus of an SV40-based vector [25]. G·C base pairs were the predominant target of mutations induced by the potent mutagen benzo[a]pyrene dihydrodiolepoxide (BPDE) at both targets, consistent with the projected premutagenic role of the N²-substituted guanine adduct. On the other hand, the types of mutations produced were noticeably different, being predominantly G·C
→ T·A transversions for aprt and a mix of single base-pair deletions, transversions, or transitions at G·C base pairs in the shuttle vector. Mutations were spread throughout aprt, predominantly (81%) in runs of two or more Gs. Furthermore, it was noted that half of these had flanking A residues, much like the primary target for BPDE-induced mutations at the c-Ha-ras1 protooncogene [79]. The differences in the two spectra could well reflect the differences in the mutagenesis protocol, where the shuttle vector is mutagenised as naked DNA before transfection into the recipient cell strain.

Another agent which induces single base pair substitutions at aprt is γ-radiation. Southern blot analysis of DNA obtained from radiation-induced aprt⁻ mutants showed that the majority had no detectable alterations of aprt restriction fragments [13,47]. Subsequent sequence analysis revealed that these were predominantly single base substitutions or small deletions (< 40 bp [46,82]). Of these mutations, transversions were the majority, with small deletions and transitions also observed. Ionising radiation is known to cause base damage as well as single- and double-strand breaks, so it seems likely that these errors are the result of mispairs with modified bases or of replication past noninformational abasic sites formed by spontaneous or enzyme-induced base eliminations, although it is very difficult to determine the specific lesion(s) responsible. Radiation-induced mutations at other selectable loci are predominantly deletions, but several radiation-induced mutations at the hprt locus of hamster [13] or human [115] cells also have no detectable changes in gene structure. Whether similar base substitutions occur at this, or other, loci can only be determined by sequence analysis of the mutant genes.

IV. Deletions
IV-A. Germ line deletions
Comprehensive studies of germ line deletions resulting in human inherited disorders have been made by numerous laboratories and many recurring features have been reported.

(1) Deletions are generally spread throughout the target genes and eliminate varying amounts of DNA. Those in the β-globin locus can be as small as 600 base pairs or as large as 100 kb (for review see Ref. 21); in the factor VIII gene they range from 2.5 to 80 kb [37]. A few 'breakpoint clusters' have been found. Eight deletions of the α-globin locus have breakpoints within a 6 to 8 kb region [92], and several similarly sized deletions of β-globin have 5' ends which fall relatively close [131]. It has been proposed that such clustering may result from the juxtaposition of distant sequences by their anchorage to the nuclear scaffolding, and that the
staggered positions of the 5' or 3' endpoints may indicate a 'sliding frame' as sequences move through the anchorage points [131]. Unfortunately, these endpoints do not correspond to the fixed attachment sites found in the β-globin locus and thus may represent transient associations between the chromosome and matrix at the point of replication [55].

(2) Deletions appear to be the result of 'illegitimate' or nonhomologous recombinations between sequences having only limited similarity. The similarity may extend over 200 to 300 base pairs in deletions involving alu elements at each breakpoint, as with those occurring in the locus coding for the low-density lipoprotein (LDL) receptor [67]. Alu repeats are not involved in all germ line mutations. In many cases only one breakpoint falls within an alu repeat [92], and in others there are only a few or no similar nucleotides. Other structural features have also been noted, e.g., short direct and inverted repeats [50].
(3) The structure of deletion junctions is also variable. Many are formed between short identical sequences at each breakpoint, leaving one copy at the junction [50,131]. Others are clean breaks [131], while several contain novel nucleotides or filler DNA [78]. Duplications of sequences have also been noted [120].

IV-B. Spontaneous deletions at selectable loci
Deletions eliminating part or all of the structural genes encoding various salvage enzymes have also been reported. Such mutations occur spontaneously or they can be induced by a few DNA-damaging agents. Deletions at aprt constitute about 20% of spontaneous mutations [1,87,95] and they range in size from three base pairs to 170 kilobase pairs. However, the distinctive feature of deletions at this target is their directionality (Fig. 1). Many eliminate sequences only within the structural gene, while others have one breakpoint within
Fig. 1. Deletions and insertions at the hamster aprt locus. Top line shows distance coordinates (in kilobase pairs) with the BamHI site just 5' of the structural gene (exons as filled boxes) chosen as zero. Alu-equivalent repeats are presented as arrows; CpG ('HTF')-rich islands, as wavy lines. Insertions are indicated above; deletions are the lower solid lines (one block per deletion). S denotes spontaneous and XA γ-radiation-induced mutants. Deletions with breakpoints far upstream of aprt have an arrow; small deletions (< 40 bp) are not presented.
aprt and the second up to 170 kb upstream of aprt ([81]; Sargent, Phear and Meuth, unpublished data). However, no spontaneous mutants have yet been reported (from three independent laboratories) with a breakpoint downstream of aprt [1,47,87]. This pattern suggests that there is an essential gene or structure in this region whose deletion would be lethal in the hemizygous strains used in all the studies. Closer analysis reveals the presence of a DNA region rich in CpG residues (indicative of the presence of an expressed gene) approx. 4 kb downstream from the most 3' deletion breakpoint, distinct from the CpG cluster in the 5' end of aprt (Fig. 1). It has not yet been determined whether this presumptive gene is expressed, let alone essential. Certainly other constraints on the pattern of deletions could also be considered. (i) Alterations of nuclear scaffold attachment sites might prove fatal if the resulting changes of packaging impaired the expression or organization of essential genes. (ii) Regions associated with the nuclear scaffold might simply be preferentially retained following double-strand breaks in surrounding sequences. (iii) Deletion of an origin of DNA replication could cause cell death during S phase if specific termination signals left large stretches of DNA unreplicated. Consistent with this interpretation is the recent report indicating an origin just 3' of the aprt structural gene [48], although further critical tests of such functional sites are required.

Nevertheless, the directionality of deletions at aprt has facilitated the study of these mutations at the molecular level by providing a tag for the cloning and sequence analysis of the novel junction fragments formed by these rearrangements. These fragments in turn allowed the identification and the characterization of the breakpoints which generated the deletions. At the nucleotide level the mutations share a number of properties, despite the great size variation. (i) Most occur between short direct repeat
sequences of 2 to 7 bp, leaving one copy of the sequence in the mutant gene (Fig. 2 [88]). Even deletions as large as 170 kb have a dinucleotide overlap at the breakpoints, although others (e.g., S67) may have no such repeats (Sargent, Phear, and Meuth, unpublished data). (ii) Certain tri- and tetranucleotides recur at some of these short direct repeats in the smaller deletions, but generally the breakpoints have no common features and no significant matches to entries in the data bases. Virtually all the upstream breakpoints fall in unique sequence DNA; only one in the entire collection falls in a hamster alu-equivalent element. Some reasonable matches to the consensus cleavage site for DNA topoisomerase II [118] were found at three breakpoints, with the projected cleavage site falling at the deletion breakpoint in one case. But given the degeneracy of this consensus sequence (one such site occurs about every hundred base pairs in aprt) and the absence of rules

[Figure 2: wild-type and mutant junction sequences for strains S12, S16, S22, S93, S41, XA23 and S209]

Fig. 2. DNA sequences of breakpoints and junctions formed by deletion mutations at the aprt locus. Short direct repeats are boxed. Nucleotide positions of breakpoints in aprt are indicated, as are the mutant strains (S, spontaneous mutant; XA, γ-radiation-induced). The sizes of the deletions are also presented.

which accurately predict active topoisomerase cleavage sites, more direct evidence is necessary to establish a clear relationship of either topoisomerase I or II with deletion breakpoints. (iii) The deletions are truly the result of nonhomologous (or 'illegitimate') recombination, as there is little sequence similarity between the nucleotides at the breakpoints and no further alterations were detected near the deletion junctions. (iv) Deletion breakpoints are not randomly distributed over the aprt locus; in fact, many cluster within a 100 bp region of exon 5 (Fig. 3A). This region of aprt is notable in that it is rich in short direct and inverted repeats (having the potential to form stable stem-loop structures). Such structures can also be found at several of the breakpoints not associated with the cluster. The deletions do not appear simply to eliminate the inverted repeats, although an exception to this is a small deletion in the human aprt locus that eliminates a perfect 11 bp inverted repeat (Fig. 3B; Harwood and Meuth, unpublished data). (v) Two deletions are more complex. S10 has a small insert of novel DNA (not closely linked to aprt) at the deletion junction, while S70 involves a substantial deletion of upstream sequences combined with the formation of an inverted repeat of sequences downstream from the breakpoint. This complex junction was also amplified four times [89].

[Figure 3: predicted stem-loop structures and breakpoint positions; panel A shows ΔG values of -20.6 and -18.0 kcal, panel B (exon 3) a ΔG of -8.8 kcal]

Fig. 3. Deletion mutations at aprt. (A) A cluster of deletions in the aprt locus of CHO cells. Short direct repeats involved in the deletion formation are boxed and mutant strains are indicated. Inverted repeats are in the form of their predicted stem-loop structures (together with predicted ΔG values). Direct repeats are underlined. An insertion in this region (S51) is also indicated. (B) A small deletion in the human aprt which precisely eliminates an inverted repeat (Harwood and Meuth, unpublished data). The stem-loop structure indicated is not particularly stable and would seem unlikely to form.

Given the heterogeneity of these mutations, the perplexing problem is the mechanism bringing two distant
DNA fragments into proximity to allow formation of the deletion junction. The nearly identical size of the two largest deletions adds strength to the argument that higher order chromatin structure is involved (Fig. 3), as proposed for the β-globin mutants [131]. This mechanism could also account for the directionality of the

[Figure 4: schematic of two pathways; panel labels: double strand break; single strand generation; c-strand synthesis; unwinding, alignment of short direct repeats; "template switch"; resumed synthesis; trimming ends, gap filling, ligation; further replication or loop cleavage and repair]

Fig. 4. Projected pathways of deletion formation. The left half interprets deletions as a response to double-strand breaks while the right presents the copy choice or strand slippage model. Short direct repeats are indicated by the filled blocks and arrows; newly synthesised DNA is represented by the dashed line. Models adapted from Refs. 15 and 103.

aprt deletions: the retention of the downstream region in deletions may be assured by its attachment to the nuclear matrix. Such structures have not yet been mapped in aprt, but such experiments are crucial in view of the lack of correlation between these sites and deletion breakpoints at α- and β-globin loci [55]. Equally obscure are the events which initiate deletion formation (Fig. 4), the most simple being the generation of double-strand breaks [103] and more complicated models (considering the size of some of the deletions) envisaging template strand slippage with the replication complex copying past extruded nucleotides (copy choice [15]).

Deletions are a significant cause of mutation at many other selectable loci, particularly at autosomal genes where the constrictions on deletion formation at hemizygous loci may not apply [74]. Due to the large size of many such deletions (extending as much as several megabases at heterozygous loci [24]), the molecular events leading to the generation of these mutations have not been determined in such detail. Deletions were first found among spontaneous and UV-induced hprt mutants of CHO cells [38]. More recently they have been demonstrated in peripheral T cells isolated from people and then stimulated to grow in culture [2], at frequencies ranging from 15% of mutations in cells obtained from adults to 85% in cells from newborns [2a]. Breakpoints have been carefully mapped in hprt mutants originating from human T lymphocytes [2a], a human lymphoblastoid strain [40], and a human tumour cell line [86] and appear to be distributed throughout the locus. In contrast, the hprt deletions in the T lymphocytes from newborns are clustered to exons 2 and 3 [2a]. In a human lymphoblastoid strain heterozygous for the tk structural gene, the majority of spontaneous mutations are due to massive deletions much greater than the size of the structural gene (14 kb) and in half the cases eliminating a closely linked marker (i.e., within the same band on chromosome 17 [74,137]). Smaller rearrangements (those leaving at least part of the structural gene) accounted for only 10% of the mutations.

Additionally, certain cell lines appear to be more efficient in deletion formation. Recently, a higher frequency of hprt deletions was found in a mutator human cell strain derived from patients with Werner syndrome (a rare autosomal disorder characterized by premature aging [36]). Furthermore, the deletions occurring in the Werner's cells were on the whole much larger (with several eliminating the entire structural gene, > 50 kb) than those in control strains. Tumour-derived strains of a hamster cell line (CHEF) have generally higher frequencies of deletion formation than the parental strain, although overall mutation rates are not significantly altered [56]. Hopefully, such cell strains will be useful in the identification of cellular functions involved in the generation of deletions.
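
The junction structure reported for the aprt deletions in this section (loss of the DNA between two short direct repeats, with one copy of the repeat retained) can be illustrated with a minimal Python sketch. The sequences and the function names `delete_between_direct_repeats` and `junction_microhomology` are invented for illustration; they are not aprt data and not a method from the cited studies.

```python
def delete_between_direct_repeats(seq: str, repeat: str) -> str:
    """Delete the DNA between the first two copies of `repeat`, keeping one copy."""
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("need two non-overlapping copies of the repeat")
    # Everything from the first copy up to (not including) the second is lost.
    return seq[:first] + seq[second:]


def junction_microhomology(wild_type: str, start: int, end: int) -> str:
    """Return the short direct repeat shared by the two breakpoints.

    The deletion is assumed to remove wild_type[start:end] (0-based,
    end-exclusive). An empty string means a junction with no repeat.
    """
    span = end - start
    fwd = 0  # matching bases to the right of both breakpoints
    while (fwd < span and end + fwd < len(wild_type)
           and wild_type[start + fwd] == wild_type[end + fwd]):
        fwd += 1
    back = 0  # matching bases to the left of both breakpoints
    while (back < span and start - back - 1 >= 0
           and wild_type[start - back - 1] == wild_type[end - back - 1]):
        back += 1
    return wild_type[start - back:start + fwd]


# Hypothetical wild-type sequence with a 4 bp direct repeat (CCTG).
wild_type = "ACGA" + "CCTG" + "AAAAATTT" + "CCTG" + "GGCA"
mutant = delete_between_direct_repeats(wild_type, "CCTG")
print(mutant)                                    # one CCTG copy remains
print(junction_microhomology(wild_type, 4, 16))  # repeat at the junction
```

A junction such as the one described for S67, with no repeat at the breakpoints, would return an empty string from `junction_microhomology`.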

IV-C Induction of deletions in cultured cells

Very few DNA damaging agents induce deletions. γ-Irradiation is the most effective, inducing deletions at virtually every locus where such mutations are recoverable (although at very different frequencies [13,47,125,130,134]). Presumably these mutations are the result of the double-stranded DNA breaks produced by the agent. Deletions have also been reported in collections of mutations induced by chemical agents. A surprising observation is that analogues of the DNA precursor deoxyadenosine induce deletions at the hprt locus of CHO cells [52]. The results are striking (all the mutations induced by the agents were deletions), but are difficult to explain. The carcinogenic aromatic amine N-acetoxy-2-acetylaminofluorene induces deletions in about 28% of mutants at the hamster dhfr locus [19]. In one case (i.e., the tk locus in a human lymphoblastoid line heterozygous for the gene) even the 'simple' alkylating agent ethylmethanesulfonate has been shown to induce a significant proportion of deletions and other rearrangements; however, such rearrangements are not induced at aprt [87] or an integrated shuttle vector bearing the gpt gene [3]. Only a few radiation-induced deletions of the aprt locus have been examined at the molecular level (Miles and Meuth, unpublished data). These are simple deletions formed, in some cases, between short direct repeats at the two breakpoints, much as the spontaneous deletions, but they are not particularly large (~3 kb at most). There are no recurring sequences at the breakpoints; however, there is some clustering of breakpoints [82]. γ-Radiation also produces complex rearrangements which are not found among spontaneous or any other collections of induced mutants [13,47,130]. These rearrangements could be massive insertions, inversions, or translocations based on the patterns of mutant gene products on Southern blots. None of these complex rearrangements has been completely resolved as yet, as the work involved is formidable.

V. Insertions

Mutations involving the introduction of a well defined genetic element into a target site are a major source of genetic flux in many organisms [5], but occur infrequently among mammalian germ line or somatic cell mutations. Nevertheless several cases of such insertions have been described in detail. The cellular oncogene c-mos was activated in a mouse plasmacytoma by the insertion of a 4.7 kb element within the coding region [17]. This element contained 335 bp terminal direct repeats and was homologous to the endogenous intracisternal A particle (IAP) element. The IAPs are a group of proretrovirus-like elements present in about 1000 copies per mouse haploid genome [62]. They have a gross structural organization resembling retroviruses but no known extracellular phase. This insertion was accompanied by a six base-pair target sequence duplication. Two similar cases of insertions of IAP sequences into κ light chain target genes have also been reported in mouse hybridoma cell lines [49].
Elevated levels of c-myc expression in a dog venereal tumour were apparently due to the insertion of a 1.8 kb fragment highly homologous with the primate Kpn I (LINE) family [58]. The insert was flanked by 10 bp direct repeats of the cellular target site and contained a dA tail, suggesting an mRNA origin. Similar inserts have been found in inactive pseudogenes [69]. More recently, insertions of L1 sequences have been detected in exon 14 of the factor VIII gene in two of 240 patients suffering from haemophilia A [60]. The 3' portions of the L1 sequence introduced include the poly(A) tract and are flanked by target site duplications. It was proposed that these rearrangements were mediated by an RNA intermediate and preferentially involved exon 14 because the A-rich regions of the exon could base pair with the poly(T) tail of the L1 cDNA. Inserted sequences in germ line mutations are not exclusively fragments of L1 repetitive elements. A novel 2 kb DNA fragment inserted into the human lipoprotein lipase gene in several patients with a deficiency of the enzyme did not resemble any known human LINE family [64]. The sequence of the novel fragment must be determined to further define these mutations. Insertions may also be associated with large deletions, as in an α-thalassemia where a 123 bp fragment is introduced between the breakpoints of a large deletion [92].

Insertions are also rare at the hamster aprt locus. Six such mutations have been reported, three spontaneous [1,87] and three radiation-induced (Ref. 13, and Miles and Meuth, unpublished data), constituting 2 to 4% of mutations. The five insertions characterized at the DNA sequence level (two spontaneous and three γ-radiation-induced, Table III) bear no resemblance to those mediated by transposable elements in bacteria or lower eukaryotes (Ref. 90, and Miles and Meuth, unpublished data). The inserts are small (only approx. 50 to 500 bp); they are accompanied by a deletion of 8-13 bp at the target sites, rather than duplication of the surrounding sequences; and they do not contain terminal repeats. Target sites are within a one kb region of aprt but there are no consistent sequence similarities. There is no obvious homology between the target and inserted sequence in the spontaneous mutant, while the inserted fragment in the γ-radiation-induced mutant shows some complementarity with sequences surrounding the target and could form a stable stem-loop structure [90]. The sequences inserted have no similarity (Table III). In γ-radiation-induced mutants the inserts are members of different repetitive DNA families; those in the spontaneous mutants are unique (Ref. 90, and C. Miles and M. Meuth, unpublished data). One of the repeat sequences is not related to any of the known dispersed repeats or to families having structures resembling transposable elements (being part of a novel hamster SINE family [83]), the second appears related to the hamster B2 (alu equivalent) repeat family, and the third

TABLE III
Properties of insertion mutations at the aprt locus (a)

Mutant   Target (b)             Insert (c)                           Features
S10      downstream, ~50 kbp    398 bp, unique                       -
S88      exon 2, 12 bp          285 bp, unique                       insert is duplicate of donor sequence
XA5      exon 3, 13 bp          58 bp, highly dispersed repetitive   forms stem loop with flanking aprt; novel SINE family, ~50000 copies found in spc DNA
XA76     exon 1, 10 bp          102 bp, middle repetitive            distant relative of hamster B2 type alu equivalent (?)
XA99     exon 4, 8 bp           457 bp, middle repetitive            related to hamster LINE family

(a) Data are from Sargent, Phear and Meuth, submitted; Miles and Meuth, unpublished data; and Refs. 83 and 90.
(b) Target of insertion in aprt and the size of the deletion at the target.
(c) Size and copy number of inserted fragment.

may be related to a hamster LINE family. The unique nature of the fragments inserted into the aprt gene of the spontaneous mutants allowed the cloning of the donor regions and the further analysis of these rearrangements. The donor regions are largely unique, although in one case the mobilised fragment is flanked on one side by a long alternating purine-pyrimidine stretch dominated by a 46 bp dG-dT run [90]. The donor region in this mutant strain is not altered with respect to sequence or copy number. This suggests that the fragment was duplicated before or during insertion into aprt, perhaps as a result of a process initiated by the simple sequence dG-dT repeat. We imagine that the dinucleotide repeat might have produced some distortion at the junction with the mobilised sequence (dG-dT runs are prone to form Z-DNA in vitro [114]), and provided a focus for a nonreciprocal exchange with the target site.
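
The contrast drawn in this section between transposon-like insertions (which duplicate a few base pairs of the target, as with the six base-pair duplication at c-mos) and the aprt insertions (which delete 8-13 bp of the target instead) can be sketched as follows. This is only an illustration under assumed toy sequences; the function names are hypothetical and no real target or insert sequence is reproduced.

```python
def insert_with_target_duplication(target: str, pos: int, element: str, dup_len: int) -> str:
    """Transposon-like outcome: the dup_len bases at the site flank the insert on both sides."""
    site = target[pos:pos + dup_len]
    return target[:pos + dup_len] + element + site + target[pos + dup_len:]


def insert_with_target_deletion(target: str, pos: int, element: str, del_len: int) -> str:
    """aprt-like outcome: del_len bases of the target are lost where the insert lands."""
    return target[:pos] + element + target[pos + del_len:]


# Invented sequences: a 6 bp target site (CTGACT) and a 4 bp "element".
target = "AAAA" + "CTGACT" + "GGGG"
element = "TTTT"

duplicated = insert_with_target_duplication(target, 4, element, 6)
deleted = insert_with_target_deletion(target, 4, element, 6)

# Net length change: insert + duplication versus insert - deletion.
print(len(duplicated) - len(target), len(deleted) - len(target))
```

In the first outcome the same six bases appear directly on both sides of the insert; in the second the insert simply replaces the bases that were at the target.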

VI. Translocations / inversions

Complex DNA rearrangements activate oncogenes in B- and T-cell leukemias and lymphomas and in the process produce distinctive and diagnostic marker chromosomes [22]. Several translocations appear to be the result of aberrant immunoglobulin splicing in which the recombinase apparently recognises a degenerate signal sequence to produce a novel gene product altering cellular growth properties. For example, a translocation between the human immunoglobulin heavy chain gene on chromosome 14 band q32 and 11q13 forms the bcl-1 oncogene in some chronic lymphocytic leukemias [128]. In chronic myelogenous leukemia, the c-abl gene is translocated from 9q34 to 22q11 (into a 5.8 kb region called the breakpoint cluster region, bcr) to produce the Philadelphia chromosome [45], although the mediating mechanism in this case is not known. Mutations mediated by aberrant immunoglobulin rearrangements do not exclusively involve bcl-1 or other oncogene sequences. A drug resistant mouse myeloma strain was recently reported [57] in which the functional ornithine decarboxylase gene was joined to the switch region of the γ1 immunoglobulin heavy-chain gene via an intrachromosomal rearrangement (both genes have been mapped to chromosome 12). Thus immunoglobulin gene clusters may be more prone to random rearrangements which are only recovered under stringent selection conditions.

In view of the specificity of the rearrangements and the strong selective pressures for the given rearrangement (in the form of a growth advantage), it is not surprising that cultured somatic cell models for these mutations have not been developed. In fact, rearrangements such as translocations and inversions, which require multiple breakages and religations, are not frequent among spontaneous mutants (frequency < 1%). In contrast, radiation-induced mutations (at a significant frequency, ~5%) give altered gene patterns on Southern blots which could be interpreted as translocations or inversions (see subsection IV-C). In the case of two radiation-induced mutations at the dhfr locus, cytogenetic analysis confirmed this interpretation [130]. The nature of the rearrangements at the base sequence level has not been determined.

VII. Gene amplification

Probably the most remarkable mutant gene structures are amplified gene arrays. Single-step amplifications may involve units as large as 10000 kb of DNA [43], with subsequent rearrangements during progressive selections producing as many as 3000 copies of the target gene (although of a substantially smaller unit size [138]). These arrays have been largely studied in serially selected drug resistant strains (from cultures grown over numerous generations in increasing concentrations of the selective agent); however, they are spontaneously occurring structural alterations and arise in the absence of selective pressures (even among collections of aprt deficient mutants [89]). The literature describing gene amplification is extensive and has been recently reviewed in detail [110,111,121,122]. I will restrict the discussion here to several recent analyses of novel joints formed by such rearrangements, as the structure of these joints was somewhat unanticipated and has led to a reappraisal of the process of gene amplification.

VII-A Novel joints

The first novel joint was found in repeated units of the polyoma T antigen joined to a weak enhancer [33]. This construct was transfected into a cell strain (Rat-1) normally exhibiting strong growth control and then selected for the loss of growth control. A resulting strain had a 20-40-fold amplification of this construct, but a more interesting feature of the amplified units was a very large inverted duplication. As this unusual structure could have been a product of transfection or integration into chromosomal DNA, similar structures were sought in well characterized chromosomal gene arrays obtained by more conventional protocols. Using a technique to specifically recover 'snap back structures' (i.e., large inverted repeats) from genomic DNA, inverted duplications were found in cell lines (at the CAD locus) or tumours (at c-myc) bearing amplified arrays [34]. Further examples of inverted repeats were subsequently reported for the aprt [89], AMPD [53], CAD [109] and DHFR [76] loci, indicating that these unusual structures were widespread in amplified gene arrays. Simple duplications (i.e., direct repeats) were also recovered, indicating that more than one mechanism may be involved in the multistep production of multiple gene copies [54,76].

The fine structure of the inverted repeats isolated from the various amplified arrays is similar as well. The repeats are not simple, perfect palindromes; rather, there is a small (600 to 900 bp) unduplicated region between the two arms at each of the joints (see Fig. 5A for a representative structure). Two of these novel joints have been characterized at the sequence level. At the aprt locus a large deletion of sequences upstream from the novel joint (resulting in the loss of the aprt structural gene) is accompanied by a substantial duplication (> 10 kb) beginning 672 bp downstream from the joint [89]. The duplicated fragment is inverted and rejoined at the deletion breakpoint. The fidelity of the rearranged sequences is preserved, as no other alterations were detected. The novel joint from the AMPD locus also has an unduplicated intervening region (of 861 bp) separating inverted repeats of > 120 kb [53]. The region containing the 'breakpoints' for the inverted duplication of AMPD has some other striking properties [54]. (i) The sequence is rich in dyad symmetries capable of forming significant stem-loop structures. Both breakpoints of the inverted duplication mapped within such structures. (ii) It contains a 'mosaic organization' of alu equivalent elements (one of which includes a breakpoint). (iii) Numerous other novel joints fall in this region during independent amplification events. These include direct repeats ('head-to-tail' joints) as well as inverted repeats ('head-to-head' or 'tail-to-tail' joints). In contrast, the 'breakpoints' for the inverted duplication at aprt did

[Figure 5: schematic in three panels (A, B, C) showing segments a, b, c, d and their inverted copies a', b', c', d', with the head-to-head joint and the tail-to-tail joint marked]

Fig. 5. Some proposals for the generation of amplified gene arrays. (A) Generation of inverted repeats via a replicational strand switch. As indicated here, the unduplicated regions characteristic of the inverted duplications found in amplified gene arrays may be the result of a loop out of the template in advance of the replication fork [89]. (B) Such structures may lead to amplification when a second strand switch occurs in a fork moving in the opposite direction (to form the tail-to-tail joint). The two forks need not be in the same replicon and may even be megabases apart. (C) Excision of the loop may produce a structure that mimics a bona fide replication fork and can prime repair replication. The outcome of this sequence of events at both head-to-head and tail-to-tail joints would be that two replication forks follow each other, yielding a multimeric array of inverted repeats [52].

not have such structural properties, nor did they lie near the previously described cluster of deletion termini.
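
The distinction made in this subsection between a perfect palindrome and an inverted repeat whose arms are separated by an unduplicated region can be made concrete with a small sketch. The sequences are toy examples (the real spacers are 600 to 900 bp and the arms many kilobases), and the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def head_to_head_joint(arm: str, spacer: str) -> str:
    """One strand of an inverted duplication: arm, unduplicated spacer, inverted arm."""
    return arm + spacer + reverse_complement(arm)


def is_perfect_palindrome(seq: str) -> bool:
    """True when the sequence equals its own reverse complement."""
    return seq == reverse_complement(seq)


def snap_back_stem_length(joint: str, spacer_start: int, spacer_len: int) -> int:
    """Count the bases flanking the spacer that can pair when one strand folds back."""
    left = joint[:spacer_start]
    right = joint[spacer_start + spacer_len:]
    n = 0
    while n < min(len(left), len(right)) and left[-1 - n] == reverse_complement(right[n]):
        n += 1
    return n


arm, spacer = "ATGGCA", "TTTT"  # invented sequences
joint = head_to_head_joint(arm, spacer)
print(joint)
print(is_perfect_palindrome(joint))                          # spacer breaks the symmetry
print(is_perfect_palindrome(head_to_head_joint(arm, "")))    # no spacer: palindrome
print(snap_back_stem_length(joint, len(arm), len(spacer)))   # both arms pair in full
```

The folded single strand, with paired arms and an unpaired central loop, corresponds to the 'snap back structures' recovered from genomic DNA.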

VII-B Structure of the arrays

The presence of an amplified inverted repeat structure in an unselected single-step mutant at aprt [89] and at 85 to 90% of the novel joints in a highly amplified array derived from a multistep selection for dhfr [76] suggests that such structures may be formed during initial steps of the process, while other types of joints are produced in subsequent rearrangements. Initial events involve long tracts of DNA, estimated to be as large as 10000 kb for a single step CAD amplification. Needless to say there would be considerable selection pressure to reduce this unit size in subsequent selections to produce large copy numbers. This may be accomplished through the deletion of some amplified sequences combined with further amplification of the target and the original novel joint. For example, in a recent analysis of a multistep amplification at the CAD locus, the target gene (C), a sequence relatively near this target (N) and a far sequence (F) were followed. The initial complex was -CNFCNFCNFCNF-. Subsequent steps yielded an (FC-)9 array (deleting the N sequence), while further steps gave even greater heterogeneity [108]. Such complex rearrangements could be promoted by the formation of a relatively small extrachromosomal element from the initial array which could be amplified further as an episome and then reinserted into the original chromosome site or a new one [20]. Thus chromosomal localizations of these arrays are crucial.
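
A rough calculation shows why the unit size has to shrink as copy number rises. The unit size (10000 kb) and copy number (3000) are the figures quoted in the text; the haploid genome size of about 3 x 10^6 kb is an outside assumption used only for scale, and the 10% budget in the second example is purely hypothetical.

```python
HAPLOID_GENOME_KB = 3_000_000  # assumption: ~3e9 bp for a mammalian haploid genome


def array_size_kb(unit_kb: float, copies: int) -> float:
    """Total DNA in an array of `copies` units of `unit_kb` each."""
    return unit_kb * copies


def largest_unit_kb(copies: int, array_budget_kb: float) -> float:
    """Largest unit size that keeps `copies` units within a given amount of DNA."""
    return array_budget_kb / copies


# 3000 copies of an unreduced 10000 kb unit, expressed in genome equivalents.
unreduced = array_size_kb(10_000, 3000)
print(unreduced / HAPLOID_GENOME_KB)

# Hypothetical budget: the whole array limited to 10% of one haploid genome.
print(largest_unit_kb(3000, 0.1 * HAPLOID_GENOME_KB))
```

Under these assumptions an array of 3000 unreduced units would exceed the genome itself several times over, which is consistent with the observation that highly amplified arrays carry much smaller units.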

VII-C Some proposals for mechanisms

Given our improved understanding of the structure of amplified gene arrays, a number of models have been proposed to explain their formation. One of the first mechanisms involved multiple misfirings at an origin of DNA replication to create an 'onion skin' structure in the target DNA region (see Ref. 122 for review). This structure is then resolved by recombination to create a linear array of duplicated genes or, in other cases, extrachromosomal copies. While there is evidence supporting such a mechanism in the developmentally controlled amplification of chorion genes in Drosophila follicular cells [119], it cannot easily account for the structure of the amplified arrays in mammalian cells, in particular the abundance of the inverted repeat novel joints and the immense size of the units in single step mutants (far greater than estimates for the size of single replicons). Furthermore, numerous double-strand breaks and rejoining would be necessary to resolve the structures.

An alternative mechanism for the generation of inverted repeats envisions a replicational error in which the DNA replication complex replicates around the fork rather than progressing in the normal manner (Fig. 5A [89]). The deletion/inverted duplication (as observed for the aprt locus) could be formed as a result of the resolution of this structure by a double-strand break at the parental DNA molecule. The unduplicated region could be formed by a loop out of sequences in advance of the strand switch, leaving one portion of the fork unreplicated. In the event at the AMPD locus, this loop out might be stabilized by the complementarity of the sequences extruded [53], though similar complementarity of the nucleotides 'extruded' at aprt could only be found some 100 bp from the base of the projected loop.

While this model could explain the formation of the inverted repeat ('head to head' joint), it does not explain the amplification of the structure. To account for this, Hyrien et al. [53] proposed that the amplification could be produced by a second similar event at a fork moving in the opposite direction to produce a tail to tail joint (Fig. 5B). If a new round of DNA synthesis were initiated at the looped out structure (perhaps through repair replication) and the forks progressed along the previously replicated molecules (Fig. 5C), then two replication forks would pursue each other around this bubble structure, producing replicas of the inverted repeat joint as proposed by Futcher [39]. Evidence has been obtained for such a mechanism generating multiple copies of the 2 μm circle in yeast [133], but there is little evidence for anything except the original inverted duplication in mammalian cells. Ideally, both head-to-head and tail-to-tail structures should be analyzed in an array unadulterated by subsequent rearrangements. Other structures (e.g., double minutes) could be produced in subsequent steps, perhaps by deletions of these complex structures. This 'chromosome spiral' model, though at present lacking experimental support, is intriguing and could explain some of the features of amplified arrays.

The problem still to be resolved is the cellular products which produce amplified arrays. In either of the above models the replicational apparatus must participate, although the initiating events for either mechanism need to be explained. In particular, the strand switch model may be complicated by current evidence that the eukaryotic genome is replicated by two polymerases, one (α) specific for the lagging strand and the other (δ) for the leading strand [32,116]. On the other hand, any factor which inhibits one of the polymerases, but not the other, may promote a strand switch. Recently, cell lines have been identified which should provide some insight into these problems. Giulotto et al. [42] found that cell strains selected for two single step amplifications have a significantly higher frequency of subsequent amplifications. These alterations appear dominant in somatic cell hybrids, suggesting that new functions are gained or previously existing ones are present at a higher level (Stark, G., personal communication). Strains isolated by serial selection protocols appear to be unstable in other ways, since mutational rates at independent loci are also increased [28]. This increase in the rate of mutation is largely the result of single base substitutions at a target gene, indicating an alteration of replicational fidelity (Caligo and Meuth, unpublished data). Investigations are at an early stage, but such 'amplificator' strains offer a promising approach to the analysis of functions participating in gene amplification.

VIII. Future directions

It should be evident that mutations in cultured mammalian cells have a variety of structures, ranging from simple sequence substitutions to complex amplified gene arrays. These structures are, to some extent, a function of the cell line and the locus in which they occur and are profoundly influenced by numerous chemical and physical agents. Yet, despite the variety of forms, mutations are very rare events. Thus it seems likely that future work will shift from the pure description of mutations to the definition of the cellular functions that maintain the integrity of the mammalian genome. Very little is known of the functions which govern the accuracy of DNA replication in mammalian cells. The nature of replication complexes is only now being determined and it will be another quantum leap before mismatch correction mechanisms are identified. As an initial step, assays for correction of some forms of mismatches have been developed [14] and mutator strains of cultured mammalian cells are being identified [96]. Another aspect being actively investigated is the proteins recognising DNA secondary structures which may be involved in generating sequence rearrangements. Such proteins appear to be present in mammalian cells [6,31], as they are in bacteria [70,85] and yeast [136]. More intriguingly, these proteins may even be an integral part of chromosome structure [7]. Thus further definition of chromosome order and structure is certainly essential before we can understand more complex gene rearrangements.

References

1 Adair, G.M. (1987) in Banbury Report 28: Mammalian Mutagenesis (Moore, M.M., DeMarini, D.M., De Serres, F.J. and Tindall, K.R., eds.), pp. 3-13, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
2 Albertini, R.J., O'Neill, J.P., Nicklas, J.A., Heintz, N.H. and Kelleher, P.C. (1985) Nature 316, 369-371
2a Albertini, R.J., O'Neill, J.P., Nicklas, J.A., McGinniss, M.J., Recio, L. and Skopek, T.R. (1989) Environ. Mol. Mutagen. 14 (Suppl. 16), 7
3 Ashman, C.R. and Davidson, R.L. (1987) Somat. Cell Mol. Genet. 13, 563-568
4 Ashman, C.R. and Davidson, R.L. (1987) Proc. Natl. Acad. Sci. USA 84, 3354-3358
5 Berg, D.E. and Howe, M.M. (1989) Mobile DNA, American Society for Microbiology, Washington, DC
6 Bianchi, M.E. (1988) EMBO J. 7, 843-850
7 Bianchi, M.E., Beltrame, M. and Paonessa, G. (1989) Science 243, 1056-1059
8 Bishop, J.M. (1987) Science 235, 305-311
9 Bourre, F. and Sarasin, A. (1983) Nature 305, 68-70
10 Bradley, W.E.C. and Letovanec, D. (1982) Somat. Cell Genet. 8, 51-66
11 Bradshaw, H.D.J. (1983) Proc. Natl. Acad. Sci. USA 80, 5588-5591
12 Bredberg, A., Kraemer, K.H. and Seidman, M.M. (1986) Proc. Natl. Acad. Sci. USA 83, 8273-8277
13 Breimer, L.H., Nalbantoglu, J. and Meuth, M. (1986) J. Mol. Biol. 192, 669-674
14 Brown, T.C. and Jiricny, J. (1988) Cell 54, 705-711
15 Brunier, D., Michel, B. and Ehrlich, S.D. (1988) Cell 52, 883-892
16 Calos, M.P., Lebkowski, J.S. and Botchan, M.R. (1983) Proc. Natl. Acad. Sci. USA 80, 3015-3019
17 Canaani, E., Dreazen, O., Klar, A., Rechavi, G., Ryan, D., Cohen, J.B. and Givol, D. (1983) Proc. Natl. Acad. Sci. USA 80, 7118-7122
18 Carothers, A.M., Urlaub, G., Ellis, N. and Chasin, L.A. (1983) Nucleic Acids Res. 11, 1997-2012
19 Carothers, A.M., Urlaub, G., Steigerwalt, R.W., Chasin, L.A. and Grunberger, D. (1986) Proc. Natl. Acad. Sci. USA 83, 6519-6523
20 Carroll, S.M., De Rose, M.L., Gaudray, P., Moore, C.M., Needham-Vandevanter, D.R., Von Hoff, D.D. and Wahl, G.M. (1988) Mol. Cell. Biol. 8, 1525-1533
21 Collins, F.S. and Weissman, S.M. (1984) Prog. Nucl. Acids Res. Mol. Biol. 31, 315-436
22 Croce, C.M. (1987) Cell 49, 155-156
23 De Jong, P.J., Grosovsky, A.J. and Glickman, B.W. (1988) Proc. Natl. Acad. Sci. USA 85, 3499-3503
24 Dewyse, P. and Bradley, W.E.C. (1989) Somat. Cell Mol. Genet. 15, 19-28
25 Dixon, K., Hauser, J., Tuteja, N., Protic-Sabljic, M., Roilides, E., Munson, P.J. and Levine, A.S. (1987) in Banbury Report 28: Mammalian Mutagenesis (Moore, M.M., DeMarini, D.M., De Serres, F.J. and Tindall, K.R., eds.), pp. 315-323, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York
26 Drinkwater, N.R. and Klinedinst, D.K. (1986) Proc. Natl. Acad. Sci. USA 83, 3402-3406
27 Drobetsky, E.A., Grosovsky, A.J. and Glickman, B.W. (1987) Proc. Natl. Acad. Sci. USA 84, 9103-9107
28 Drobetsky, E. and Meuth, M. (1983) Mol. Cell. Biol. 3, 1882-1885
29 Dryja, T.P., Rapaport, J.M., Joyce, J.M. and Petersen, R.A. (1986) Proc. Natl. Acad. Sci. USA 83, 7391-7394
30 Eckert, K.A., Ingle, C.A., Klinedinst, D.K. and Drinkwater, N.R. (1988) Mol. Carcinogen. 1, 50-56
31 Elborough, K. and West, S.C. (1988) Nucleic Acids Res. 16, 3603-3614
32 Focher, F., Ferrari, E., Spadari, S. and Hubscher, U. (1988) FEBS Lett. 229, 6-10
33 Ford, M., Davies, B., Griffiths, M., Wilson, J. and Fried, M. (1985) Proc. Natl. Acad. Sci. USA 82, 3370-3374
34 Ford, M. and Fried, M. (1986) Cell 45, 425-430
35 Friend, S.H., Bernards, R., Rogelj, S., Weinberg, R.A., Rapaport, J.M., Albert, D.M. and Dryja, T.P. (1986) Nature 323
36 Fukuchi, K., Martin, G.M. and Monnat, R.J. (1989) Proc. Natl. Acad. Sci. USA 86, 5893-5897
37 Furie, B. and Furie, B.C. (1988) Cell 53, 505-518
38 Fuscoe, J.C., Fenwick, R.G., Ledbetter, D.H. and Caskey, C.T. (1983) Mol. Cell. Biol. 3, 1086-1096
39 Futcher, A.B. (1986) J. Theor. Biol. 119, 197-204
40 Gennett, I.N. and Thilly, W.G. (1988) Mutation Res. 201, 149-160
41 Gibbs, R.A., Nguyen, P., McBride, L.J., Koepf, S.M. and Caskey, C.T. (1989) Proc. Natl. Acad. Sci. USA 86, 1919-1923
42 Giulotto, E., Knights, C. and Stark, G.R. (1987) Cell 48, 837-845
43 Giulotto, E., Saito, I. and Stark, G.R. (1986) EMBO J. 5, 2115-2121
44 Griesser, H., Tkachuk, D., Reis, M.D. and Mak, T.W. (1989) Blood 73, 1402-1415
45 Groffen, J., Stephenson, J.R., Heisterkamp, N., de Klein, A., Bartram, C.R. and Grosveld, G. (1984) Cell 36, 93-99
46 Grosovsky, A.J., De Boer, J.G., De Jong, P.J., Drobetsky, E.A. and Glickman, B.W. (1988) Proc. Natl. Acad. Sci. USA 85, 185-188
47 Grosovsky, A.J., Drobetsky, E.A., De Jong, P.J. and Glickman, B.W. (1986) Genetics 113, 405-415
48 Handeli, S., Klar, A., Meuth, M. and Cedar, H. (1989) Cell 57, 909-920
49 Hawley, R.G., Shulman, M.J. and Hozumi, N. (1984) Mol. Cell. Biol. 4, 2565-2572
50 Henthorn, P.S., Mager, D.L., Huisman, T.H.J. and Smithies, O. (1986) Proc. Natl. Acad. Sci. USA 83, 5194-5198.
51 Hsia, H.C., Lebkowski, J.S., Leong, P., Calos, M.P. and Miller, J.H. (1989) J. Mol. Biol. 205, 103-113.
52 Huang, P., Siciliano, M.J. and Plunkett, W. (1989) Mutat. Res. 210, 291-301.
53 Hyrien, O., Debatisse, M., Buttin, G. and De Saint Vincent, B.R. (1988) EMBO J. 7, 407-417.
54 Hyrien, O., Debatisse, M., Buttin, G. and De Saint Vincent, B.R. (1987) EMBO J. 6, 2401-2408.
55 Jarman, A.P. and Higgs, D.R. (1988) EMBO J. 7, 3337-3344.
56 Kaden, D.A., Bardwell, L., Newmark, P., Anisowicz, A., Skopek, T.R. and Sager, R. (1989) Proc. Natl. Acad. Sci. USA 86, 2306-2310.
57 Katz, A. and Kahana, C. (1989) EMBO J. 8, 1163-1167.
58 Katzir, N., Rechavi, G., Cohen, J.B., Unger, T., Simoni, F., Segal, S., Cohen, D. and Givol, D. (1985) Proc. Natl. Acad. Sci. USA 82, 1054-1058.
59 Kavathas, P., Bach, F.H. and De Mars, R. (1980) Proc. Natl. Acad. Sci. USA 77, 4251-4255.
60 Kazazian, H.H., Wong, C., Youssoufian, H., Scott, A.F., Phillips, D.G. and Antonarakis, S.E. (1988) Nature 332, 164-166.
61 Koenig, M., Hoffman, E.P., Bertelson, A.P., Monaco, A.P., Feener, C. and Kunkel, L.M. (1987) Cell 50, 509-517.
62 Kuff, E.L., Smith, L.A. and Lueders, K.K. (1981) Mol. Cell. Biol. 1, 216-227.
63 Kunkel, T.A., Schaaper, R.M. and Loeb, L.A. (1983) Biochemistry 22, 2378-2384.
64 Langlois, S., Deeb, S., Brunzell, J.D., Kastelein, J.J. and Hayden, M.R. (1989) Proc. Natl. Acad. Sci. USA 86, 948-952.
65 Lebkowski, J.S., Clancy, S., Miller, J.H. and Calos, M.P. (1985) Proc. Natl. Acad. Sci. USA 82, 8606-8610.
66 Lebkowski, J.S., Miller, J.H. and Calos, M.P. (1986) Mol. Cell. Biol. 6, 1838-1842.
67 Lehrman, M.A., Russell, D.W., Goldstein, J.L. and Brown, M.S. (1987) J. Biol. Chem. 262, 3354-3361.
68 Lehrman, M.A., Schneider, W.L., Sudhof, T.C., Brown, M.S., Goldstein, J.L. and Russell, D.W. (1985) Science 227, 140-146.
69 Lemischka, I. and Sharp, P.A. (1982) Nature 300, 330-335.
70 Lilley, D.M. and Kemper, B. (1984) Cell 36, 413-422.
71 Lin, P., Zhao, S. and Ruddle, F.H. (1983) Proc. Natl. Acad. Sci. USA 80, 6528-6532.
72 Lindahl, T. (1982) Annu. Rev. Biochem. 51, 61-87.
73 Lindahl, T. and Nyberg, B. (1974) Biochemistry 13, 3405-3410.
74 Little, J.B., Yandell, D.W. and Liber, H.L. (1987) in Banbury Report 28: Mammalian Mutagenesis (Moore, M.M., Demarini, D.M., De Serres, F.J., Tindall, K.R., eds.), pp. 225-236, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
75 Loeb, L.A. and Preston, B.D. (1986) Annu. Rev. Genet. 20, 201-230.
76 Looney, J.E. and Hamlin, J.L. (1987) Mol. Cell. Biol. 7, 569-577.
77 Lowy, I., Pellicer, A., Jackson, J.F., Sim, G., Silverstein, S. and Axel, R. (1980) Cell 22, 817-823.
78 Mager, D.L., Henthorn, P.S. and Smithies, O. (1985) Nucleic Acids Res. 13, 6559-6575.
79 Mazur, M. and Glickman, B.W. (1988) Somat. Cell Mol. Genet. 14, 393-400.
80 Meuth, M. (1981) Mol. Cell. Biol. 1, 652-660.
81 Meuth, M., Nalbantoglu, J., Phear, G. and Miles, C. (1987) in Banbury Report 28: Mammalian Mutagenesis (Moore, M.M., Demarini, D.M., De Serres, F.J., Tindall, K.R., eds.), pp. 183-191, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
82 Miles, C. and Meuth, M. (1989) Mutat. Res. 227, 97-102.
83 Miles, C. and Meuth, M. (1989) Nucleic Acids Res. 17, 7221-7228.
84 Mitchell, P.J., Urlaub, G. and Chasin, L. (1986) Mol. Cell. Biol. 6, 1926-1935.
85 Mizuuchi, K., Kemper, B., Hays, J. and Weisberg, R.A. (1982) Cell 29, 357-363.
86 Monnat, R.J.J. (1989) Cancer Res. 49, 81-87.
87 Nalbantoglu, J., Goncalves, O. and Meuth, M. (1983) J. Mol. Biol. 167, 575-594.
88 Nalbantoglu, J., Hartley, D., Phear, G., Tear, G. and Meuth, M. (1986) EMBO J. 5, 1199-1204.
89 Nalbantoglu, J. and Meuth, M. (1986) Nucleic Acids Res. 14, 8361-8371.
90 Nalbantoglu, J., Miles, C. and Meuth, M. (1988) J. Mol. Biol. 200, 449-459.
91 Nalbantoglu, J., Phear, G. and Meuth, M. (1987) Mol. Cell. Biol. 7, 1445-1449.
92 Nicholls, R.D., Fischel-Ghodsian, N. and Higgs, D.R. (1987) Cell 49, 369-378.
93 Nicklas, J.A., O'Neill, J.P., Allegretta, M. and Albertini, R.J. (1987) in Banbury Report 28: Mammalian Mutagenesis (Moore, M.M., Demarini, D.M., De Serres, F.J., Tindall, K.R., eds.), pp. 15-24, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
94 Patel, P.I., Framson, P.E., Caskey, C.T. and Chinault, A.C. (1986) Mol. Cell. Biol. 6, 393-403.
95 Phear, G., Armstrong, W. and Meuth, M. (1989) J. Mol. Biol. 209, 577-582.
96 Phear, G. and Meuth, M. (1989) Mol. Cell. Biol. 9, 1810-1812.
97 Phear, G., Nalbantoglu, J. and Meuth, M. (1987) Proc. Natl. Acad. Sci. USA 84, 4450-4454.
98 Pious, D., Krangel, M.S., Dixon, L.L., Parham, P. and Strominger, J.L. (1982) Proc. Natl. Acad. Sci. USA 79, 7832-7836.
99 Protic-Sabljic, M., Tuteja, N., Munson, P.J., Hauser, J., Kraemer, K.H. and Dixon, K. (1986) Mol. Cell. Biol. 6, 3349-3356.
100 Razin, A. and Riggs, A.D. (1980) Science 210, 604-610.
101 Razzaque, A., Mizusawa, H. and Seidman, M.M. (1983) Proc. Natl. Acad. Sci. USA 80, 3010-3014.
102 Romak, S., Leong, P., Sockett, H. and Hutchinson, F. (1989) J. Mol. Biol. 209, 195-204.
103 Roth, D.B. and Wilson, J.H. (1986) Mol. Cell. Biol. 6, 4295-4304.
104 Rowley, J.D. (1973) Nature 243, 290-293.
105 Rydberg, B. and Lindahl, T. (1982) EMBO J. 1, 211-216.
106 Sagher, D. and Strauss, B. (1983) Biochemistry 22, 4518-4526.
107 Saiki, R.K., Gelfand, D.H., Stoffel, S., Scharf, S.J., Higuchi, R., Horn, G.T., Mullis, K.B. and Erlich, H.A. (1988) Science 239, 487-491.
108 Saito, I., Groves, R., Giulotto, E., Rolfe, M. and Stark, G.R. (1989) Mol. Cell. Biol. 9, 2445-2452.
109 Saito, I. and Stark, G.R. (1986) Proc. Natl. Acad. Sci. USA 83, 8664-8668.
110 Schimke, R.T. (1984) Cell 37, 705-713.
111 Schimke, R.T. (1988) J. Biol. Chem. 263, 5989-5992.
112 Seetharam, S., Protic-Sabljic, M., Seidman, M.M. and Kraemer, K.H. (1987) J. Clin. Invest. 80, 1613-1617.
113 Siminovitch, L. (1976) Cell 7, 1-11.
114 Singleton, C.K., Klysik, S., Stirdivant, S.M. and Wells, R.D. (1982) Nature 299, 312-316.
115 Skulimowski, A.W., Turner, D.R., Morley, A.A., Sanderson, B.J.S. and Haliandros, M. (1986) Mutat. Res. 162, 105-112.
116 So, A.G. and Downey, K.M. (1988) Biochemistry 27, 4591-4595.
117 Sparkes, R.S., Sparkes, M.C., Wilson, M.G., Towner, J.W., Benedict, W., Murphree, A.L. and Yunis, J.J. (1980) Science 208, 1042-1044.
118 Spitzner, J.R. and Muller, M.T. (1988) Nucleic Acids Res. 16.
119 Spradling, A.C. and Mahowald, A.P. (1980) Proc. Natl. Acad. Sci. USA 77, 1096-1100.
120 Spritz, R.A. and Orkin, S.H. (1982) Nucleic Acids Res. 10, 8025-8028.
121 Stark, G.R., Debatisse, M.E.G. and Wahl, G.M. (1989) Cell 57, 901-908.
122 Stark, G.R. and Wahl, G.M. (1984) Annu. Rev. Biochem. 53, 447-491.
123 Stout, J.T. and Caskey, C.T. (1985) Annu. Rev. Genet. 19, 127-148.
124 Takeshita, M., Chang, C., Johnson, F., Will, S. and Grollman, A.P. (1987) J. Biol. Chem. 262, 10171-10179.
125 Thacker, J. (1986) Mutat. Res. 160, 267-275.
126 Tindall, K.R., Stankowski, Jr., L.F., Machanoff, R. and Hsie, A.W. (1984) Mol. Cell. Biol. 4, 1411-1415.
127 Tindall, K.R. and Stankowski, L.F. (1989) Mutat. Res. 220, 241-253.
128 Tsujimoto, Y., Jaffe, E., Cossman, J., Gorham, J., Nowell, P.C. and Croce, C.M. (1985) Nature 315, 340-345.
129 Urlaub, G. and Chasin, L. (1980) Proc. Natl. Acad. Sci. USA 77, 4216.
130 Urlaub, G., Mitchell, P.J., Kas, E., Chasin, L.A., Funanage, V.L., Myoda, T.T. and Hamlin, J. (1986) Somat. Cell Mol. Genet. 12, 555-566.
131 Vanin, E.F., Henthorn, P.S., Kioussis, D., Grosveld, F. and Smithies, O. (1983) Cell 35, 701-709.
132 Varmus, H.E. (1984) Annu. Rev. Genet. 18, 553-612.
133 Volkert, F.C. and Broach, J.R. (1986) Cell 46, 541-550.
134 Vrieling, H., Simons, J.W.I.M., Arwert, F., Natarajan, A.T. and Van Zeeland, A.A. (1985) Mutat. Res. 144, 281-286.
135 Vrieling, H., Van Rooijen, M.L., Groen, N.A., Zdzienicka, M.Z., Simons, J.W.I.M., Lohman, P.H.M. and Van Zeeland, A.A. (1989) Mol. Cell. Biol. 9, 1277-1283.
136 West, S.C. and Korner, A. (1985) Proc. Natl. Acad. Sci. USA 82.
137 Yandell, D.W., Dryja, T.P. and Little, J.B. (1986) Somat. Cell Mol. Genet. 12, 255-263.
138 Yeung, C., Frayne, E.G., Al-Ubaidi, M.R., Hook, A.G., Ingolia, D.E., Wright, D.A. and Kellems, R.E. (1983) J. Biol. Chem. 258, 15179-15185.
139 Yun-Fai, L. and Wai Kan, Y. (1984) Proc. Natl. Acad. Sci. USA 81, 414-418.
140 Yunis, J.J. (1983) Science 221, 227-236.