Cell, Vol. 71, 363-366, October 30, 1992, Copyright © 1992 by Cell Press
Minireview
A Question of Time: Replication Origins of Eukaryotic Chromosomes
Walton L. Fangman and Bonita J. Brewer
Department of Genetics SK-50, University of Washington, Seattle, Washington 98195
Each chromosome of a eukaryotic organism contains many replication origins (Figure 1) from which DNA is efficiently duplicated during S phase of the cell cycle. Recent work raises new questions about the nature of these origins and their regulation.

What Are Origins?
Like initiation of transcription at a promoter, initiation of replication at an origin can be thought of as requiring three discrete steps: the recognition of one or more cis-acting elements by specific initiation proteins, the localized unwinding of the DNA helix, and the selection of a site for the initiation of polymerization. Continuing the analogy to a promoter, it is clear that these three steps do not necessarily have to occur at the same site and may differ greatly in their degree of sequence specificity. Assays to detect origin function may detect one or more of these steps of initiation and thus may give different results if the events occur at different sites.
Replication origins have been best defined in the yeast Saccharomyces cerevisiae. The autonomous replication sequence (ARS) assay, which detects the cis-acting sequences required for origin function in plasmids, has allowed mutational analysis with increasing resolution (Marahrens and Stillman, 1992). ARS elements consist of only 100-200 bp and include a conserved 11 bp core consensus sequence and several other less conserved elements that are required for or enhance the maintenance of a plasmid. Mapping of the site of initiation by two-dimensional gels (Brewer and Fangman, 1987; Huberman et al., 1987) reveals that replication initiation on plasmids occurs in the vicinity of the ARS element (within a few hundred base pairs of the ARS consensus). The exact size of the initiation zone and the locations of priming sites for DNA chain elongation are yet to be determined.
Two-dimensional gel analysis of replicating chromosomal DNA fragments has shown that many but not all ARS elements are active as origins in the chromosome, and that initiation is dependent on a functional ARS element (Dubey et al., 1991; Rivier and Rine, 1992; Deshpande and Newlon, 1992). Quantitative assessments show that specific origins can have efficiencies of activation that range from <0.20 to >0.90 per cell cycle (Dubey et al., 1991; Ferguson et al., 1991). The basis for differences in efficiency is not known.

Do All Eukaryotes Have Specific Origins?
A simple, reliable ARS assay has not been achieved for any eukaryote other than yeast. However, by physical means, examples of specific initiation have been observed in both Physarum and Tetrahymena (Benard and Pierron, 1992; Cech and Brehm, 1981). In these cases, the specificity of initiation can be interpreted as evidence for specific
cis-acting origin sequences. At the other extreme, Xenopus eggs and egg extracts replicate plasmids containing only prokaryotic vector sequences as efficiently as those with Xenopus inserts. While two-dimensional gel analyses suggest that a single initiation event occurs on each plasmid molecule, there is no evidence for sequence specificity in initiation (Mahbubani et al., 1992; Hyrien and Mechali, 1992).
The selective amplification of Drosophila chorion genes appears to combine both a small cis-acting element and dispersive initiation (see Orr-Weaver, 1991). The 450 bp ACE3 element directs initiation at sites distributed over several kilobase pairs with a preference for one locus within the region. However, there is no evidence that the ACE element is involved in replication in normal, dividing, diploid tissue. It may be that neither the amplification that occurs during terminal differentiation of the insect follicle cell nor the amphibian egg prepared for rapid cleavage divisions upon fertilization is relevant to the replication of the mammalian somatic chromosome.
The best-studied mammalian origin of replication resides near the dihydrofolate reductase (DHFR) locus of Chinese hamster ovary cells (see Hamlin, 1992; Held and Heintz, 1992, and references therein). Replication studies have been facilitated by using cell lines in which the DHFR gene, along with several hundred kilobase pairs of surrounding sequences, has been amplified 1000-fold. In vivo labeling experiments that detect the earliest replicated restriction fragments have shown that replication preferentially begins in two regions within the 30 kb intergenic region 3' to the DHFR gene. Taking advantage of the features of semidiscontinuous synthesis, Okazaki fragments were labeled in permeabilized nuclei, and the strand of DNA to which they hybridized was determined (Burhans et al., 1990).
On opposite sides of an origin of bidirectional replication, Okazaki fragments should hybridize to opposite strands. One of the early-replicated regions was found to contain a site at which a switch in strand preference occurred. However, conservative estimates of the magnitude of the bias on either side of this proposed origin show only a 2-fold difference, and as the distance from the origin increased, the bias increased to 5- to 6-fold. These results suggest that the site identified may serve as an origin in roughly one-third of the repeats, with the other two-thirds of the initiation events occurring in sequences that flank the origin.
Two-dimensional gel analysis of the replication intermediates generated in vivo shows that each restriction fragment within the 30 kb intergenic region initiates replication a small fraction of the time, as evidenced by a small proportion of intermediates with internal bubbles (Dijkwel and Hamlin, 1992). The predominant form of replication intermediate for each fragment is a simple forked (Y) structure indicative of initiation elsewhere. Because breakage of the DNA at a fork can convert bubbles to Y structures, it may not be meaningful to quantitate the ratio of bubbles to Ys from these two-dimensional gels. Nevertheless, the data are qualitatively consistent with a slight preference for initiation in the vicinity of the origin of bidirectional replication and dispersive initiation in flanking intergenic regions.
The broad zone of replication initiation in the DHFR locus does not rule out the possibility that there is a cis-acting site essential for initiation. There may be a single element (e.g., the origin of bidirectional replication) that directs initiation to take place anywhere within the 30 kb zone, or there may be many elements throughout the zone, each directing initiation over a narrower region. Alternatively, some broad feature of chromosome structure, such as higher order folding, may result in this zone being active in initiation. There is a great need for a functional analysis of cis-acting elements to distinguish among these possibilities. Such tests are feasible, as it has become increasingly easy to delete sequences and insert DNA fragments at specific locations in the mammalian chromosome. In addition, more mammalian origin regions need to be examined in detail.

Figure 1. A Replicating Yeast Chromosomal DNA Molecule
A molecule (~300 kb) was traced from an electron micrograph. Arrows indicate the eight replication forks. Assuming that the four replicating regions on the molecule each arose from a single origin of bidirectional replication and that forks move at a constant speed, then the second origin from the left was activated earlier in the S phase than were the other three.
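The link between the Okazaki-fragment strand bias measured next to the proposed DHFR origin and the estimate that the site is used in about one-third of the repeats can be illustrated with a simple model. The sketch below is an assumption made here for illustration and is not necessarily the calculation used in the original study: a fraction f of repeats initiates exactly at the site, and the remaining 1 - f initiate in flanking sequences, with equal probability on either side. Immediately beside the site, forks then move away from it in f + (1 - f)/2 of the repeats and toward it in (1 - f)/2, so the strand bias is (1 + f)/(1 - f). The function names are hypothetical.

```python
from fractions import Fraction


def bias_next_to_site(site_fraction):
    """Strand bias expected immediately beside the site (illustrative model).

    Assumes a fraction `site_fraction` of repeats initiates at the site and
    the rest initiate in flanking DNA, equally often on either side.
    """
    f = Fraction(site_fraction)
    if not 0 <= f < 1:
        raise ValueError("site_fraction must be in the range [0, 1)")
    away = f + (1 - f) / 2      # forks moving away from the site
    toward = (1 - f) / 2        # forks moving toward the site
    return away / toward        # equals (1 + f) / (1 - f)


def site_fraction_from_bias(bias):
    """Invert the model: fraction of repeats initiating at the site."""
    r = Fraction(bias)
    if r < 1:
        raise ValueError("bias must be at least 1")
    return (r - 1) / (r + 1)


fraction_at_site = site_fraction_from_bias(2)
print(fraction_at_site)                          # fraction of repeats using the site
print(bias_next_to_site(fraction_at_site))       # bias recovered from that fraction
```

Under these assumptions a bias of 2 next to the site corresponds to the site being used in one-third of the repeats. The model also makes clear why the bias should rise with distance: farther from the site, a larger share of all initiation events lies on the site-proximal side of the probe.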
Identification of cis-acting origin elements and mapping of the sites of initiation have been performed by transfecting human cells in culture with plasmids that contain fragments from human chromosomes. Random human genomic inserts but not bacterial inserts can promote replication if the insert is long enough (Heinzel et al., 1991); replication efficiency increases with size up to at least 20 kb. Initiation in these plasmids occurs at random sites but preferentially within the human inserts. These results are consistent with the idea that the human genome contains simple cis-acting origin elements that are present at high density. A given element may have a low probability of experiencing initiation, but the chance of a productive initiation event may increase as a plasmid acquires more elements. For example, in yeast, sequences very weakly related to the ARS consensus do not provide detectable ARS activity. But upon iteration of this element in a plasmid, ARS activity is created (Zweifel and Fangman, 1990). While it is possible that the mammalian plasmid system does not faithfully reflect origin function in mammalian chromosomes, the results are consistent with the dispersive pattern observed at the DHFR locus by two-dimensional gel analysis. They fit with the idea that some feature of chromosome structure other than its sequence may contribute to the specificity for preferential initiation.
Figure 2. Early and Late Replication Bands on Human Chromosome 14
The chromosome 14 prophase chromatids contain about 100,000 kb of DNA. In the chromosome on the left, the dark regions replicated during the first half of S phase; the unstained regions were late replicating. In the chromosome on the right, the dark regions replicated in the second half of S phase, and the light regions in early S phase. The analysis was made with primary lymphocyte cultures. Photograph provided by C.-L. Richer and modified from Drouin et al. (1990).
Chromosome replication from yeast to mammalian somatic cells may be directed by discrete cis-acting elements whose accessibility can be limited by the surrounding chromosomal context.

Why So Many Replication Origins?
The answers seem obvious. First, multiple origins increase the probability that a chromosome will be replicated during an S phase and thereby reduce the chance of chromosome loss at mitosis. Second, most eukaryotic cells have more DNA per chromosome than can be duplicated from a single origin in an S phase. For a human cell, the approximately 3,000,000 kb genome is replicated in about 8 h. Since the rate at which replication forks plow through chromatin is no greater than ~2 kb/min, bidirectional origins would need to be spaced at roughly 2000 kb intervals to finish replication in the allotted time. But this analysis, which views replication as a process that must accommodate the cell cycle rather than the other way around, underestimates the number of origins by 20-fold: the observed spacing between active origins is about 100 kb. Similar calculations can be made for S. cerevisiae (Rivin and Fangman, 1980), which has a much smaller genome (14,000 kb) and a shorter S phase (~30 min). The density of origins is similar to that of mammals and, again, there are more origins than seem necessary.
The apparent excess of active origins leads to a puzzle: if all origins were to start replication at the beginning of S phase, then S phase should be much shorter than it actually is. One solution to this conundrum is that not all origins are activated at the beginning of S phase. Cytological analysis of mammalian cells supports the deduction that different origins initiate replication at different times in S phase. The human chromosome set is composed of over 1000 discrete temporal domains that appear as replication bands on prophase chromosomes (Figure 2). Replication in these bands begins at different times during S phase (e.g., Drouin et al., 1990).
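The origin-spacing argument for the human genome can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming (as the text does) that origins are bidirectional, that forks move at a constant 2 kb/min, and that all origins fire at the start of S phase, so that each gap between neighboring origins is filled by two converging forks. The function and variable names are illustrative only.

```python
def max_origin_spacing_kb(s_phase_min, fork_rate_kb_per_min):
    """Widest origin spacing that can be replicated within S phase.

    Two forks, one from each flanking origin, converge on every gap,
    so the gap closes at twice the single-fork rate.
    """
    return 2 * fork_rate_kb_per_min * s_phase_min


def s_phase_if_synchronous_min(spacing_kb, fork_rate_kb_per_min):
    """Time to replicate the gaps if every origin fired at time zero."""
    return spacing_kb / (2 * fork_rate_kb_per_min)


# Values quoted in the text for a human cell
S_PHASE_MIN = 8 * 60            # about 8 h
FORK_RATE_KB_PER_MIN = 2.0      # upper estimate of fork movement
GENOME_KB = 3_000_000
OBSERVED_SPACING_KB = 100

required_spacing_kb = max_origin_spacing_kb(S_PHASE_MIN, FORK_RATE_KB_PER_MIN)
origin_excess = required_spacing_kb / OBSERVED_SPACING_KB
minimum_origins = GENOME_KB / required_spacing_kb
observed_origins = GENOME_KB / OBSERVED_SPACING_KB
synchronous_s_phase_min = s_phase_if_synchronous_min(
    OBSERVED_SPACING_KB, FORK_RATE_KB_PER_MIN
)

print(f"required spacing: {required_spacing_kb:.0f} kb")
print(f"origin excess: {origin_excess:.1f}-fold")
print(f"origins needed vs observed: {minimum_origins:.0f} vs {observed_origins:.0f}")
print(f"S phase if all origins fired together: {synchronous_s_phase_min:.0f} min")
```

Under these assumptions the required spacing is 1920 kb (roughly 2000 kb), about 19-fold wider than the observed 100 kb, and origins 100 kb apart that all fired together would complete replication in about 25 min rather than 8 h. This is the puzzle that staggered origin activation resolves.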
Since an average human chromosome with 120,000 kb of DNA contains about 50 replication bands, a replication band would contain 25 origins
that share a common time of activation. DNA fiber autoradiography experiments confirm that most adjacent origins begin synthesis at about the same time (Hand, 1976). High resolution mapping of replication foci within the mammalian S phase nucleus similarly reveals a discrete temporal and spatial organization of chromosome duplication (e.g., O'Keefe et al., 1992). These demonstrations of temporal and spatial control reinforce the notion that replication initiation is not a random process, at least in somatic cells.
Information about the acquisition of temporal regulation of origin activation is provided by work with Drosophila (McKnight and Miller, 1977). In early-cleavage embryos, active replication origins are closely spaced and activated nearly simultaneously. However, at the time of cellular blastoderm formation, fewer origins are used, and activation events appear to become asynchronous. Cellularization is also the time when zygotic transcription is activated and when heterochromatin is formed. Changes in higher order chromosome structure may lead to the late activation of origins in some chromosomal domains.

What Makes an Origin Late?
The late replication of one of the X chromosomes in a female mammal would seem to suggest that the time of activation of an origin cannot be an inherent property of the origin. Perhaps some aspect of the chromatin structure and/or methylation of the inactive X creates a context for late activation (e.g., Riggs and Pfeifer, 1992). The observation that the inactive X is late replicating is more than 25 years old, yet nothing is known about the mechanism that causes the switch to late replication. Also unknown is the mechanism that distinguishes early and late origins in the replication bands of mammalian chromosomes.
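The estimate of about 25 origins per replication band follows from figures quoted earlier in this section: an average chromosome of 120,000 kb, about 50 bands per chromosome, and an observed origin spacing of about 100 kb. A minimal sketch of that arithmetic is given below; the variable names are illustrative, and the numbers are the rounded values used in the text rather than independent measurements.

```python
# Rounded values quoted in the text
CHROMOSOME_KB = 120_000          # average human chromosome
BANDS_PER_CHROMOSOME = 50        # replication bands per chromosome
ORIGIN_SPACING_KB = 100          # observed spacing between active origins
GENOME_KB = 3_000_000            # approximate human genome size

band_size_kb = CHROMOSOME_KB / BANDS_PER_CHROMOSOME
origins_per_band = band_size_kb / ORIGIN_SPACING_KB
bands_per_genome = GENOME_KB / band_size_kb

print(f"average band size: {band_size_kb:.0f} kb")
print(f"origins per band: {origins_per_band:.0f}")
print(f"bands per genome: {bands_per_genome:.0f}")
```

With these inputs an average band spans 2400 kb and holds about 24 origins, in line with the 25 quoted, and the genome as a whole would contain about 1250 bands, consistent with the statement that there are over 1000 temporal domains.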
The small chromosomes of yeast (average size of 900 kb) show interspersion of early- and late-replicating domains: centromere regions replicate during the first half of S phase and telomeres at the end of S phase (McCarroll and Fangman, 1988). The deduction that there must be origins in yeast that are activated not just at the beginning of S phase but possibly throughout has led to a search for a temporally regulated origin. The chromosome V origin, ARS501, is activated in the second half of S phase and is responsible for the late replication of a 66 kb domain that includes the right telomere (Ferguson et al., 1991). The time of replication of this origin is regulated by position, in that late activation requires the presence of a telomere nearby (Ferguson and Fangman, 1992). These observations show that the time at which an origin is activated can depend upon the chromosomal context of the origin. The generality of this phenomenon is unknown; other regions of the genome need to be examined.
Context can also regulate origin activation in mammalian cells. One unique arrangement of sequences in an amplified Chinese hamster DHFR region results in the apparent activation of an origin (Leu and Hamlin, 1992). And a deletion of the locus control region in the human β-globin gene cluster results in both failure to activate the β-globin genes and a change from early to late replication in erythroid cell lines (Forrester et al., 1990).

Why Temporal Regulation?
Why are replication origins activated at different times?
Does the apparent excess of origins serve some useful purpose? While there are no answers to these questions, the fact that the temporal/spatial organization of origins is a highly conserved feature of the replication of eukaryotic chromosomes suggests that it is important for their maintenance.
One possible function that late replication may serve is to ensure that sister chromatids remain attached to one another until the duplicated centromeres have established fully functional kinetochores. If this model were correct, then the early replication of centromeres would be a requirement. In yeast, this prediction seems to be true. Mammalian centromeres had been thought to replicate toward the end of S phase, but high resolution experiments show that they actually replicate at mid-S (O'Keefe et al., 1992). As long as centromeres are not the last region on the chromosome to be replicated, then newly forming chromatids will remain in register, held together by unreplicated domains, until the end of S phase when the kinetochore can take over the maintenance of sister chromatid adhesion that is necessary for proper chromosome segregation at anaphase.
An intriguing and probably fundamental feature of replication timing is its connection with gene expression. It may be this connection that accounts for the existence of multiple origins as a conserved feature of all eukaryotic organisms. A striking case is that of the mammalian X chromosome in females where the transcriptionally inactive X replicates later than its active homolog. More generally, many tissue-specific genes change from late to early S phase replication when they become competent for expression (e.g., Hatton et al., 1988). It is unknown whether there is a causal relationship between activation of tissue-specific genes and early S phase replication, and what the mechanism is for the switch in time of replication.
A previously inactive origin(s) near the expressed gene may become active, or an origin may have its time of activation changed. Transcription factors are required for or enhance the function of some origins both in mammalian viruses (e.g., Guo and DePamphilis, 1992) and in yeast (e.g., Marahrens and Stillman, 1992). Transcription factor-binding sites are found near a preferential initiation site in the DHFR origin zone (Held and Heintz, 1992). Early origin activation may be the secondary consequence of the availability of specific transcription factors, either through the binding of these factors or through the act of transcription itself. Alternatively, activation of certain origins may be a primary event that leads to expression of nearby genes by allowing preferential capture of limiting transcription factors by new daughter duplexes created early in S phase (see Riggs and Pfeifer, 1992). A combination is also possible, in which transcription leads to early origin activation, which in turn reinforces the capture of transcription factors. The origins associated with tissue-specific genes that exhibit a switch in time need to be identified.
An association between time of replication and gene expression has also been found in yeast. The unexpressed HML and HMR loci, which contain repressed cell type-specific information, are located in subtelomeric regions that replicate in the last half of S phase. When copied into
the MAT locus in an early-replicating region of the same chromosome, the promoters are activated. The positional silencing at HMR requires an ARS element, and mutations in the ARS that abolish origin function also relieve transcriptional silencing of the locus (Rivier and Rine, 1992). In addition, genes inserted next to late-replicating telomeres are down-regulated compared with their normal locations in the genome (Aparicio et al., 1991). These late-replicating domains appear to provide an environment much like heterochromatin, which generally decreases gene expression.

What Proteins Activate Origins?
Understanding how origins are regulated will require identification of the proteins that activate them. These primary effectors have been identified and analyzed extensively in prokaryotic systems (especially the dnaA protein of Escherichia coli) and in eukaryotic viruses (especially the SV40 T antigen). They destabilize the DNA double helix in the origin region and promote assembly of the replication apparatus. Because of its well-characterized ARS elements and because of the ease of mutational analysis, S. cerevisiae has been used intensively in the search for initiation proteins. Looking for proteins that bind to the conserved 11 bp consensus sequence has been the natural focus of such efforts. The most promising results were reported recently (Bell and Stillman, 1992). A complex of six proteins, called the origin recognition complex (ORC), appears to bind specifically to ARS elements, since point mutations in the ARS consensus that eliminate function also prevent binding of the complex. Whether this complex is indeed the primary effector of origin activation will be answered when the genes that encode its proteins are identified and mutated.
However, the prospect that the ORC plays this role is greatly strengthened by finding a footprint over an ARS element in cell lysates that is essentially identical to the footprint mapped with the purified ORC (Diffley and Cocker, 1992).
Assuming that the ORC is a transactivator of origins, several questions become obvious. Are the footprints for early- and late-activated origins different? Do some S phase-specific alterations take place in both early- and late-activated origins at the beginning of S phase? Can the yeast proteins be used to find metazoan homologs? Finally, the analysis of the ORC may shed light on the mechanism that ensures that an origin is used no more than once during each cell cycle.

Prospects for the Future
The analysis of eukaryotic origins has just begun. We do not know whether there is one kind of origin or many. Our current understanding of origin structure and function may be similar to the early days of the molecular analysis of eukaryotic gene expression, when a promoter was a promoter was a promoter. Only slowly did the tremendous diversity in promoter structure and regulation become evident. Given the large number of origins and the possible functional connection of some of them to gene expression, a multiplicity of origin types seems possible. Many interesting questions about origin function, origin regulation, the relationship of origin activation to gene expression, and
the evolutionary conservation of high origin density and temporal control wait to be answered. Many of the tools for answering these questions are available, and the study of origins from a broad spectrum of organisms will contribute to piecing together the puzzle.

References
Aparicio, O. M., Billington, B. L., and Gottschling, D. E. (1991). Cell 66, 1279-1287.
Bell, S. P., and Stillman, B. (1992). Nature 357, 128-134.
Benard, M., and Pierron, G. (1992). Nucl. Acids Res. 20, 3309-3315.
Brewer, B. J., and Fangman, W. L. (1987). Cell 51, 463-471.
Burhans, W. C., Vassilev, L. T., Caddle, M. S., Heintz, N. H., and DePamphilis, M. L. (1990). Cell 62, 955-965.
Cech, T. R., and Brehm, S. L. (1981). Nucl. Acids Res. 9, 3531-3543.
Deshpande, A. M., and Newlon, C. S. (1992). Mol. Cell. Biol., in press.
Diffley, J. F. X., and Cocker, J. H. (1992). Nature 357, 169-172.
Dijkwel, P. A., and Hamlin, J. L. (1992). Mol. Cell. Biol. 12, 3715-3722.
Drouin, R., Lemieux, N., and Richer, C.-L. (1990). Chromosoma 99, 273-280.
Dubey, D. D., Davis, L. R., Greenfeder, S. A., Ong, L. Y., Zhu, J., Broach, J. R., Newlon, C. S., and Huberman, J. A. (1991). Mol. Cell. Biol. 11, 5348-5355.
Ferguson, B. M., and Fangman, W. L. (1992). Cell 68, 333-339.
Ferguson, B. M., Brewer, B. J., Reynolds, A. E., and Fangman, W. L. (1991). Cell 65, 507-515.
Forrester, W. C., Epner, E., Driscoll, M. C., Enver, T., Brice, M., Papayannopoulou, T., and Groudine, M. (1990). Genes Dev. 4, 1637-1649.
Guo, Z.-S., and DePamphilis, M. L. (1992). Mol. Cell. Biol. 12, 2514-2524.
Hamlin, J. L. (1992). Bioessays, in press.
Hand, R. (1976). J. Cell Biol. 64, 89-97.
Hatton, K. S., Dhar, V., Brown, E. H., Iqbal, M. A., Stuart, S., Didamo, V. T., and Schildkraut, C. L. (1988). Mol. Cell. Biol. 8, 2149-2168.
Heinzel, S. S., Krysan, P. J., Tran, C. T., and Calos, M. P. (1991). Mol. Cell. Biol. 11, 2263-2272.
Held, P. G., and Heintz, N. H. (1992). Biochim. Biophys. Acta 1130, 235-246.
Huberman, J. A., Spotila, L. D., Nawotka, K. A., El-Assouli, S. M., and Davis, L. R. (1987). Cell 51, 473-481.
Hyrien, O., and Méchali, M. (1992). Nucl. Acids Res. 20, 1463-1469.
Leu, T.-H., and Hamlin, J. L. (1992). Mol. Cell. Biol. 12, 2804-2812.
Mahbubani, H. M., Paull, T., Elder, J. K., and Blow, J. J. (1992). Nucl. Acids Res. 20, 1457-1462.
Marahrens, Y., and Stillman, B. (1992). Science 255, 817-823.
McCarroll, R. M., and Fangman, W. L. (1988). Cell 54, 505-513.
McKnight, S. L., and Miller, O. L., Jr. (1977). Cell 12, 795-804.
O'Keefe, R. T., Henderson, S. C., and Spector, D. L. (1992). J. Cell Biol. 116, 1095-1110.
Orr-Weaver, T. L. (1991). Bioessays 13, 97-105.
Riggs, A. D., and Pfeifer, G. P. (1992). Trends Genet. 8, 169-174.
Rivier, D. H., and Rine, J. (1992). Science 256, 659-663.
Rivin, C. J., and Fangman, W. L. (1980). J. Cell Biol. 85, 108-115.
Zweifel, S. G., and Fangman, W. L. (1990). Yeast 6, 179-186.