Evolution of CpG Islands within the myc Gene Family


Molecular Phylogenetics and Evolution Vol. 16, No. 3, September, pp. 475–481, 2000
doi:10.1006/mpev.2000.0783, available online at http://www.idealibrary.com

SHORT COMMUNICATION

Michael M. Miyamoto and Nicole P. Freire

Department of Zoology, P.O. Box 118525, University of Florida, Gainesville, Florida 32611-8525

Received January 25, 2000

CpG islands are discrete regions of DNA with significantly greater frequencies of CpG doublets than bulk genomic DNA. They are most frequently associated with the 5′-ends of housekeeping genes and are involved in the regulation of their expression. In this study, the structure and evolution of CpG islands within genes of the myc family were evaluated with the protein-coding sequences of animals and their transducing viruses. These evaluations relied on a gene tree for the entire myc family to test the origins of CpG islands within their two protein-coding exons. Overall, CG-very rich and CG-rich islands are associated with exon 2 of the different myc genes of warm-blooded vertebrates and with exon 3 of the N-myc and s-myc sequences of mammals, but not birds. These overall distributions of well-developed islands can be related to the major transitions of the CG-rich genomes of warm-blooded vertebrates from the CG-poor ones of other animals. In turn, the greater variability of well-developed islands within exon 3 of the N-myc gene and among the different retrogenes of the myc family can be attributed to their reduced functional constraints, as evidenced by their limited and very restricted patterns of expression, respectively. © 2000 Academic Press

In mammals and other groups, CpG islands, or HpaII tiny fragment islands, are discrete regions of unmethylated DNA with CpG dinucleotide frequencies that approach their expected levels derived from base composition (Larsen et al., 1992; Cross and Bird, 1995). These regions differ markedly from the rest of the genome, which reaches only about 20% of its expected CpG levels. On average, CpG islands are 1.0–2.0 kb in length and occur in the genome about every 100 kb. These islands most frequently overlap with the promoters and 5′-ends of housekeeping genes. They are less commonly associated with tissue-specific genes and, when present, are not restricted to their 5′-ends. The recognized function of CpG islands is to maintain a methylation-free environment for the expression of their associated genes.

The myc gene family consists of five duplicate protooncogenes that encode protein transcription factors for the regulation of cell proliferation and differentiation (Henriksson and Lüscher, 1996; Sugiyama et al., 1999). The c-myc protooncogene is the best-known member of the family and is widely expressed among different tissues in vertebrates. It was first discovered in transducing viruses that carry their own v-myc copies of the gene. In turn, the L-myc and N-myc genes show much more limited expression. These three genes all share the same three-exon/two-intron structure, with exon 1 important in their regulation and exons 2 and 3 encoding multiple transcripts for different primary and secondary protein products. The remaining two members of the family are the B-myc and s-myc genes. The former is widely expressed, but consists of a single protein-coding exon that corresponds to exon 2 of the other myc genes. The latter is most restricted in its tissue expression and represents an intronless retrogene.

Several previous studies of CpG islands have considered representatives of the myc family in their general surveys of island frequency and composition among vertebrates (Aïssani and Bernardi, 1991a,b; Larsen et al., 1992). These investigations have revealed the existence of CpG islands within exons 1, 2, and/or 3 of the myc genes of mammals and birds. However, these studies, with their genome-wide perspectives, have relied mostly on the c-myc gene and few vertebrate taxa and have rarely incorporated an explicit phylogenetic framework in the interpretation of their results. This study builds on these earlier surveys by increasing the sample of both animal taxa and myc genes and by reconstructing the origins and evolution of their CpG islands on a gene tree for the entire family (Atchley and Fitch, 1995).
The evolution of well-developed islands within this family is then related to patterns of gene expression and to the major transitions of the CG-rich genomes of warm-blooded vertebrates from the CG-poor ones of other animals (Bernardi et al., 1997; Hughes et al., 1999).

A complete set of DNA and RNA records for the protein-coding regions of myc genes was compiled for animals and their transducing viruses from GenBank (release 111.0). Similar sequences, representing minor allelic variants, were eliminated from this initial set in favor of the most recent and complete records. The corresponding amino acid sequences of the retained records were first aligned to each other with CLUSTAL W (Thompson et al., 1994). This initial alignment was then refined according to published sequence alignments, the known exon/intron boundaries of myc genes, and the conserved functional domains of their proteins (Atchley and Fitch, 1995; Henriksson and Lüscher, 1996).

Our sequence comparisons focused on those DNA and RNA segments corresponding to the protein-coding regions of exons 2 and 3 for the primary product of mammalian c-myc (p64) (Henriksson and Lüscher, 1996). These two segments, hereafter referred to as “exon 2” and “exon 3,” were separately compared according to their observed to expected CpG ratios (CpG O/E), total C+G frequencies (CG%), and C+G frequencies at third codon positions (CG 3%) (Aïssani and Bernardi, 1991a,b; Larsen et al., 1992). Exons 2 and 3 of the different myc genes were then scored for their CpG islands according to the following system: (0) exons with no CpG island (i.e., ≤0.6 CpG O/E), (1) exons with a CG-poor island (>0.6 CpG O/E, ≤55% CG%, and ≤70% CG 3%), (2) exons with a CG-rich island (>0.6 CpG O/E and >55% to ≤60% CG% and/or >70% to ≤75% CG 3%), and (3) exons with a CG-very rich island (>0.6 CpG O/E, >60% CG%, and >75% CG 3%) (Aïssani and Bernardi, 1991a,b).

Although sufficient for the circumscription of exons, the final full alignment of the Myc proteins was less appropriate for phylogenetic analysis, because of several unstable regions of poorly aligned sequence.
These poorly aligned regions were first excluded from the multiple alignment, thereby leaving only the most conserved functional domains for phylogenetic estimation (Atchley and Fitch, 1995). Phylogenetic analysis of these conserved domains was conducted with the PROTPARS method, as implemented in PAUP* 4.0 (Swofford, 1998). This PROTPARS analysis relied on a starting phylogeny with many topological constraints that focused the search for the most-parsimonious (MP) solutions on the unresolved relationships near the base of the tree. These topological constraints were derived from our recent phylogenetic comparisons of mammalian c-myc sequences (Miyamoto et al., 2000), the recognized relationships of vertebrates and nonvertebrates (Brusca and Brusca, 1990; Benton, 1997), and the published gene tree of the myc family (Atchley and Fitch, 1995). Given these topological constraints, the MP solution was identified by the branch-and-bound method. It was then rooted with the Myc proteins of Crassostrea and Drosophila.

This MP solution served as the reference phylogeny in the character state optimizations and ancestral reconstructions of the CpG islands. In these analyses, the final scores for the CpG islands of exons 2 and 3 were treated as two separate, ordered, multistate characters that were optimized by standard parsimony to the MP tree (Swofford, 1998). By identifying all equally parsimonious ancestral reconstructions, ambiguous character state changes (i.e., those with multiple MP assignments) were distinguished from unambiguous ones. These ambiguous transitions were assigned to the MP phylogeny by ACCTRAN optimization, a procedure that tends to concentrate change at the base of the tree.

As a check on our phylogenetic results, bootstrapping with 1000 replications was first conducted to assess the strength of support for the unconstrained groups of the MP solution (Swofford, 1998). This assessment was followed by further character state optimizations with all possible dichotomous trees for the major gene clusters of the myc family (i.e., the c-myc/B-myc, L-myc, N-myc/s-myc, echinoderm, and Crassostrea/Drosophila lineages). By repeating the optimizations with these alternative trees, our original best estimates of character state change were evaluated for their dependency on the MP phylogeny and for their corresponding stability.

A total of 47 myc genes was retained for phylogenetic and evolutionary analyses (Table 1). Eight classes from four animal phyla were represented by these sequences, with the c-myc gene of vertebrates most extensively covered with 22 orthologues (Fig. 1). This final set of 47 myc sequences also included one B-myc and two s-myc sequences of murid rodents, five v-myc genes of avian and feline viruses, and two intronless retrogenes of L-myc and N-myc (mycL2 and N-myc2 of Homo and Marmota, respectively).
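The three compositional statistics named in the methods (CpG O/E, CG%, and CG 3%) can be illustrated with a minimal Python sketch. The text does not spell out the formula for the expected CpG count, so the sketch assumes the common definition in which the expected count is (number of C × number of G) / sequence length. The function name `composition_stats` and the `phase` argument are hypothetical; `phase` stands for the number of leading bases that belong to a codon split by the upstream exon boundary.

```python
def composition_stats(seq, phase=0):
    """Compute bp, CG%, CG 3%, and CpG O/E for one protein-coding segment.

    `phase` is a hypothetical helper argument: the number of leading bases
    that precede the first complete codon of the segment.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA records as well as DNA
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    n_c = seq.count("C")
    n_g = seq.count("G")
    # "CG" cannot overlap with itself, so str.count gives the exact CpG count.
    n_cpg = seq.count("CG")
    # Bases at third codon positions, given the assumed reading phase.
    third = seq[phase + 2::3]
    cg3 = 100.0 * sum(base in "CG" for base in third) / len(third) if third else 0.0
    # Assumed definition of the expected CpG count: (C * G) / N.
    expected = n_c * n_g / n
    cpg_oe = n_cpg / expected if expected else 0.0
    return {
        "bp": n,
        "CG%": 100.0 * (n_c + n_g) / n,
        "CG3%": cg3,
        "CpG O/E": cpg_oe,
    }


if __name__ == "__main__":
    # Toy in-frame sequence, not a real myc record.
    print(composition_stats("ATGCGCGATCGA"))
```

For the toy sequence above, the 12 bases contain three C, four G, and three CpG doublets, so the assumed formula gives an O/E ratio of 3.0; real exons would of course be read from the GenBank records listed in Table 1.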
Within exon 2, CG-very rich islands (character state 3) were associated with the B-myc, c-myc, L-myc, and N-myc genes of mammals and birds and the v-myc copies of their transducing viruses (Fig. 1A). The two s-myc sequences of murids were variable in that exon 2 of Mus and Rattus was found with no island (0) and a CG-rich island (2), respectively. In contrast, only CG-poor islands (1) or no islands were associated with exon 2 of the other animals.

Within exon 3, no myc sequences were found with CG-very rich islands, including those of mammals and birds (Fig. 1B). Instead, the most highly developed islands were the CG-rich ones of the N-myc1 gene of Marmota and the N-myc and s-myc sequences of other mammals. Unlike exon 2, exon 3 of the s-myc gene of both Mus and Rattus was similar in that the two were associated with a CG-rich island. In contrast, the N-myc1 orthologues and N-myc2 retrogene of mammals were variable in that the latter lacked a CpG island. All other sequences of exon 3 were associated with CG-poor islands or no islands.

TABLE 1
Summary Statistics of Base Composition for the Two Protein-Coding Exons of the 47 myc Genes

| myc gene^a | Species (common name or viral strain; GenBank Accession No.) | Exon 2 bp | Exon 2 CG% | Exon 2 CG 3% | Exon 2 CpG O/E | Exon 3 bp | Exon 3 CG% | Exon 3 CG 3% | Exon 3 CpG O/E |
|---|---|---|---|---|---|---|---|---|---|
| B | Rattus norvegicus (Norway rat; X17455) | 537 | 65.3 | 84.9 | 0.920 | | | | |
| c | Callithrix jacchus (short-tusked marmoset; M88116) | 757 | 63.5 | 86.9 | 0.849 | 560 | 52.6 | 62.6 | 0.544 |
| c | Canis familiaris (domestic dog; X95367) | 757 | 67.2 | 93.2 | 0.929 | 563 | 54.4 | 66.4 | 0.625 |
| c | Felis silvestris (wild cat; M22727, M22728) | 757 | 67.3 | 94.0 | 0.918 | 563 | 54.4 | 68.0 | 0.673 |
| c | Gorilla gorilla (gorilla; Mohammad-Ali et al., 1995^b) | 757 | 63.5 | 87.3 | 0.825 | 563 | 51.6 | 59.6 | 0.561 |
| c | Homo sapiens (human; X00364) | 757 | 63.9 | 88.4 | 0.829 | 563 | 52.0 | 60.2 | 0.553 |
| c | Hylobates lar (white-handed gibbon; M88115) | 757 | 64.2 | 88.8 | 0.824 | 563 | 51.3 | 58.5 | 0.514 |
| c | Marmota monax (woodchuck; X13232) | 757 | 64.1 | 85.8 | 0.896 | 563 | 51.8 | 60.6 | 0.451 |
| c | Mus musculus (house mouse; L00038, L00039) | 757 | 61.1 | 81.8 | 0.815 | 563 | 54.0 | 68.7 | 0.563 |
| c | Ovis aries (domestic sheep; Z68501) | 757 | 65.7 | 89.3 | 0.915 | 563 | 51.8 | 61.2 | 0.663 |
| c | Pan troglodytes (chimpanzee; M38057) | 757 | 64.0 | 88.8 | 0.827 | 563 | 52.0 | 60.2 | 0.553 |
| c | Pongo pygmaeus (orangutan; Mohammad-Ali et al., 1995^b) | 757 | 63.8 | 88.1 | 0.819 | 563 | 51.8 | 60.2 | 0.583 |
| c | Rattus norvegicus (Norway rat; Y00396) | 757 | 61.2 | 81.4 | 0.808 | 563 | 51.9 | 61.2 | 0.478 |
| c | Sus scrofa (pig; X97040) | 757 | 64.5 | 88.5 | 0.897 | 563 | 52.6 | 61.2 | 0.489 |
| c | Gallus gallus (chicken; J00889) | 688 | 71.3 | 94.8 | 0.931 | 563 | 52.8 | 65.9 | 0.740 |
| c | Xenopus laevis cI (African clawed frog; X53717) | 683 | 55.4 | 68.3 | 0.462 | 577 | 52.4 | 65.8 | 0.911 |
| c | Xenopus laevis cII (African clawed frog; X56870) | 679 | 54.6 | 68.6 | 0.438 | 584 | 52.3 | 66.1 | 1.129 |
| c | Carassius auratus (goldfish; D31729) | 619 | 53.8 | 67.5 | 0.753 | 581 | 50.3 | 56.7 | 0.672 |
| c | Cyprinus carpio c1 (carp; D37887) | 619 | 52.9 | 64.5 | 0.702 | 566 | 51.4 | 60.4 | 0.648 |
| c | Cyprinus carpio c2 (carp; D37888) | 619 | 52.6 | 65.0 | 0.618 | 587 | 51.9 | 61.8 | 0.693 |
| c | Danio rerio (zebrafish; L11710) | 619 | 54.3 | 69.9 | 0.751 | 599 | 52.7 | 63.0 | 0.759 |
| c | Oncorhynchus mykiss, pituitary (rainbow trout; S79770) | 640 | 47.6 | 50.2 | 0.698 | 557 | 52.2 | 59.7 | 0.556 |
| c | Oncorhynchus mykiss, testes (rainbow trout; M13048) | 664 | 50.6 | 55.2 | 0.742 | 581 | 53.0 | 66.0 | 0.518 |
| L | Homo sapiens L (human; M19720) | 496 | 70.1 | 84.2 | 0.907 | 599 | 54.0 | 61.5 | 0.300 |
| L | Homo sapiens L2 (human; AC004081) | 481 | 63.5 | 82.5 | 0.871 | 596 | 52.8 | 60.8 | 0.531 |
| L | Mus musculus (house mouse; X13945) | 496 | 66.2 | 76.3 | 0.867 | 611 | 55.3 | 63.7 | 0.429 |
| L | Xenopus laevis L1 (African clawed frog; L11362) | 472 | 49.5 | 46.5 | 0.280 | 563 | 45.2 | 36.2 | 0.421 |
| L | Xenopus laevis L2 (African clawed frog; L11363) | 472 | 46.7 | 42.1 | 0.237 | 563 | 46.8 | 41.0 | 0.359 |
| N | Homo sapiens (human; Y00664) | 766 | 73.8 | 87.5 | 0.886 | 605 | 56.1 | 70.3 | 0.674 |
| N | Marmota monax N1 (woodchuck; X53673, X53674) | 754 | 71.2 | 80.9 | 0.861 | 605 | 57.3 | 74.3 | 0.628 |
| N | Marmota monax N2 (woodchuck; X53671) | 751 | 67.9 | 77.6 | 0.833 | 590 | 56.2 | 74.1 | 0.580 |
| N | Mus musculus (house mouse; M12731) | 760 | 67.5 | 75.9 | 0.743 | 605 | 57.7 | 73.2 | 0.796 |
| N | Rattus norvegicus (Norway rat; X63281) | 760 | 66.7 | 75.1 | 0.699 | 605 | 57.9 | 73.8 | 0.753 |
| N | Gallus gallus (chicken; D90071) | 727 | 71.4 | 87.6 | 0.908 | 599 | 44.8 | 45.0 | 0.734 |
| N | Serinus canaria (canary; M64251, M64598) | 679 | 69.8 | 85.0 | 0.933 | 605 | 42.6 | 37.2 | 0.584 |
| N | Xenopus laevis (African clawed frog; X58670, X58671) | 700 | 55.3 | 54.0 | 0.150 | 614 | 49.3 | 58.1 | 0.967 |
| s | Mus musculus (house mouse; AB016289) | 685 | 61.5 | 73.7 | 0.526 | 587 | 56.2 | 72.0 | 0.691 |
| s | Rattus norvegicus (Norway rat; M29069) | 682 | 62.1 | 73.6 | 0.640 | 584 | 56.6 | 72.3 | 0.728 |
| NV | Asterias vulgaris (sea star; M80364) | 577 | 50.4 | 49.4 | 0.608 | 626 | 49.8 | 52.6 | 0.931 |
| NV | Strongylocentrotus purpuratus (purple urchin; L37056) | 634 | 53.1 | 57.4 | 0.832 | 668 | 49.9 | 59.6 | 0.843 |
| NV | Crassostrea virginica (eastern oyster; S77334) | 619 | 50.0 | 51.0 | 0.366 | 554 | 47.2 | 57.3 | 0.227 |
| NV | Drosophila melanogaster (fruit fly; U77370) | 997 | 51.3 | 64.7 | 1.145 | 1,157 | 52.2 | 67.6 | 0.867 |
| v | Avian carcinoma virus (MH2; K02082) | 676 | 71.7 | 94.2 | 0.949 | 563 | 54.1 | 67.5 | 0.754 |
| v | Avian carcinoma virus (MH2 28-z; M16529) | 676 | 71.4 | 94.2 | 0.960 | 563 | 54.1 | 68.1 | 0.730 |
| v | Avian carcinoma virus (OK10; M11352) | 688 | 71.4 | 94.8 | 0.939 | 563 | 52.6 | 65.9 | 0.720 |
| v | Avian myelocytomatosis virus (MC29; V01174) | 688 | 71.0 | 94.8 | 0.915 | 563 | 53.2 | 67.0 | 0.755 |
| v | Feline leukemia virus (FTT; M25762) | 757 | 67.0 | 93.6 | 0.905 | 557 | 54.6 | 68.8 | 0.676 |

Note. These statistics include (bp) base pairs; (CG%) percentage C+G frequency; (CG 3%) CG% at third codon positions; and (CpG O/E) observed to expected ratio of CpG pairs.
a The B, c, L, N, s, and v designations of the myc genes correspond to their original descriptions, except for those of nonvertebrates. Their designations are left undefined as “NV,” since their relationships remain unresolved (see text).
b The original citation for the c-myc genes of Gorilla and Pongo, rather than accession numbers, is given, since these sequences have never been deposited in GenBank.

FIG. 1. Evolution of CpG islands within exons 2 (A) and 3 (B) of the 47 myc genes of vertebrates, nonvertebrates, and their transducing viruses. Exons are scored according to the following definitions of their CpG islands: (0) exons with no islands, (1) exons with CG-poor islands, (2) exons with CG-rich islands, and (3) exons with CG-very rich islands (see text and Table 1). The reference phylogeny represents the single MP solution from the PROTPARS analysis of the aligned protein sequences of the 47 myc genes. All branch points within this gene tree were fixed a priori, except for the four internal nodes with bootstrap scores. The final character states of the 47 sequences are given in parentheses, as are those of their ancestral reconstructions. Solid arrows highlight character state changes that are based on unambiguous ancestral reconstructions. In contrast, broken arrows refer to ambiguous changes that are supported by ACCTRAN optimization. Asterisks designate inferred gene duplications.
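As a worked check, the four-state scoring system from the methods can be applied to a few rows of Table 1. The sketch below is one possible reading of that system, not the authors' own procedure: the published thresholds leave some combinations unassigned (for example, CG% above 60 with CG 3% at or below 75), so the hypothetical function `score_island` assumes that any exon above the CpG O/E threshold that is neither CG-poor nor CG-very rich is scored as CG-rich. The input values are those given in Table 1 for the named sequences.

```python
def score_island(cpg_oe, cg, cg3):
    """Return the CpG island character state (0-3) for one exon.

    Assumption: exons with CpG O/E > 0.6 that meet neither the CG-poor (1)
    nor the CG-very rich (3) definition are treated as CG-rich (2).
    """
    if cpg_oe <= 0.6:
        return 0  # no CpG island
    if cg > 60 and cg3 > 75:
        return 3  # CG-very rich island
    if cg <= 55 and cg3 <= 70:
        return 1  # CG-poor island
    return 2      # CG-rich island (includes the "and/or" cases)


# (label, CpG O/E, CG%, CG 3%) as listed in Table 1
ROWS = [
    ("B-myc, Rattus, exon 2", 0.920, 65.3, 84.9),
    ("s-myc, Mus, exon 2", 0.526, 61.5, 73.7),
    ("s-myc, Rattus, exon 2", 0.640, 62.1, 73.6),
    ("N-myc1, Marmota, exon 3", 0.628, 57.3, 74.3),
    ("N-myc2, Marmota, exon 3", 0.580, 56.2, 74.1),
    ("N-myc, Gallus, exon 3", 0.734, 44.8, 45.0),
]

if __name__ == "__main__":
    for label, cpg_oe, cg, cg3 in ROWS:
        print(f"{label}: state {score_island(cpg_oe, cg, cg3)}")
```

Under this reading, the s-myc sequences of Mus and Rattus score 0 and 2 for exon 2, and the N-myc2 retrogene scores 0 for exon 3, which agrees with the character states described in the text.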

In all, 181 aligned positions of the three Myc boxes, exon 2/exon 3 junction, major casein kinase II phosphorylation site, and basic helix-loop-helix motif were used in the PROTPARS analysis (Atchley and Fitch, 1995; Henriksson and Lüscher, 1996). This analysis resulted in the identification of one MP solution of 789 steps (Fig. 1). This MP tree supported the union of the N-myc and s-myc genes and their subsequent grouping with the two echinoderm orthologues. In turn, this larger group was joined to the c-myc/B-myc cluster, thereby leaving the L-myc lineage as part of the earliest duplication.
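The stability checks described in the methods use all possible dichotomous trees for the five major gene lineages. If these arrangements are counted as unrooted bifurcating trees on five tips (an assumption, since the text does not state how they were enumerated), the standard count (2n − 5)!! gives 15 topologies, which is consistent with one MP arrangement plus the 14 alternatives reported in the results. A minimal sketch, with a hypothetical function name:

```python
def num_unrooted_binary_trees(n_tips):
    """Number of distinct unrooted bifurcating trees for n labeled tips.

    Uses the double factorial (2n - 5)!! = 3 * 5 * ... * (2n - 5).
    """
    if n_tips < 3:
        raise ValueError("need at least three tips")
    count = 1
    # Multiply the odd numbers from 3 up to 2n - 5 inclusive.
    for k in range(3, 2 * n_tips - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    lineages = ["c-myc/B-myc", "L-myc", "N-myc/s-myc",
                "echinoderm", "Crassostrea/Drosophila"]
    total = num_unrooted_binary_trees(len(lineages))
    print(total, "arrangements, of which", total - 1, "are alternatives")
```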

All character optimizations with exon 2 and this reference phylogeny were unambiguous in their support of three separate shifts to the CG-very rich islands of the c-myc/B-myc, L-myc, and N-myc clusters of mammals and birds from the nonexistent and CG-poor ones of other animals (Fig. 1A). In contrast, some character optimizations were ambiguous for exon 3 (Fig. 1B). Nevertheless, these uncertainties did not preclude the unambiguous recognition within exon 3 of two separate shifts to the CG-rich islands of the N-myc1 orthologues and s-myc retrogene of mammals from the nonexistent and CG-poor ones of birds and other animals.

The grouping of N-myc and s-myc genes was the only unconstrained one to receive a reasonable to strong bootstrap score of ≥85% (88%; Fig. 1). In light of this result, the follow-up tests of stability with the alternative trees took on special importance as checks on the character optimizations with the MP solution (Fig. 1). These 14 other arrangements were only 2 to 10 steps longer than the MP solution, thereby highlighting the overall lack of resolution for relationships among major gene lineages. Nevertheless, the same unambiguous transitions within exons 2 and 3, as summarized above, were supported by these 14 alternatives. Thus, these major shifts were not uniquely tied to the MP solution, but were rather dependent on the more certain arrangements of sequences within each gene cluster.

With the exception of the s-myc retrogene, CG-very rich islands are consistently found with exon 2 of the different myc genes of mammals and birds (Fig. 1A). Our character optimizations with exon 2 attribute the origins of these CG-very rich islands of the c-myc/B-myc, L-myc, and N-myc clusters to three separate events within the common ancestor of mammals and birds. However, this interpretation is based on an incomplete sample of amniotes [mammals, birds, crocodilians, lepidosaurimorphs (squamates and tuatara), and testudinates] and ignores the existing evidence that well-developed islands are rare or nonexistent in the genomes of cold-blooded vertebrates (Aïssani and Bernardi, 1991a,b; Cacciò et al., 1997). Taking these limitations into account, a more likely explanation for the evolution of the CG-very rich islands of the c-myc/B-myc, L-myc, and N-myc clusters attributes their origins to two separate sets of three parallel events within the common ancestors of both mammals and birds. Given this more likely explanation, these CG-very rich islands of exon 2 are seen as parts of the same major transitions that are responsible for the final increases of the CG-richest isochores in the genomes of warm-blooded vertebrates (Bernardi et al., 1997; Hughes et al., 1999). Critical tests of this hypothesis now await new myc sequences from other amniotes.

Our character optimizations with exon 3 indicate that this association may also apply to the CG-rich islands of the N-myc1 orthologues and s-myc retrogene of mammals (Fig. 1B). These CG-rich islands of exon 3 are attributed to two separate events within the common ancestor of mammals. Thus, as before, the origins of these CG-rich islands can be related to the final major rise in the CG-richest isochores of mammals. However, the lack of well-developed islands within exon 3 of the N-myc gene of birds and N-myc2 retrogene of mammals argues against the occurrences of related events within these two lineages.
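The character optimizations discussed above treat the island scores as ordered multistate characters, so that a change between states i and j costs |i − j| steps. The following sketch shows how the minimum number of steps for such a character can be computed on a small rooted tree with a Sankoff-style dynamic program. It is an illustration only: the analyses in the paper were run in PAUP*, the four-tip tree and its tip states are hypothetical, and the sketch does not enumerate alternative reconstructions or apply ACCTRAN.

```python
from math import inf

STATES = (0, 1, 2, 3)  # island scores, treated as an ordered character


def ordered_parsimony(tree, tip_states):
    """Return (minimum steps, cost vector at the root).

    `tree` is a nested tuple of tip names; `tip_states` maps tip name to state.
    The cost of a change from state s to state t is abs(s - t).
    """
    def cost_vector(node):
        if isinstance(node, str):
            # A tip is free in its observed state and impossible in any other.
            return [0 if s == tip_states[node] else inf for s in STATES]
        child_vectors = [cost_vector(child) for child in node]
        vector = []
        for s in STATES:
            total = 0
            for cv in child_vectors:
                # Cheapest way to reach this child from parent state s.
                total += min(cv[t] + abs(s - t) for t in STATES)
            vector.append(total)
        return vector

    root = cost_vector(tree)
    return min(root), root


if __name__ == "__main__":
    # Hypothetical toy topology and illustrative tip states (not read from Fig. 1).
    toy_tree = ((("mammal", "bird"), "frog"), "outgroup")
    exon3_like = {"mammal": 2, "bird": 1, "frog": 1, "outgroup": 1}
    exon2_like = {"mammal": 3, "bird": 3, "frog": 0, "outgroup": 1}
    print(ordered_parsimony(toy_tree, exon3_like))
    print(ordered_parsimony(toy_tree, exon2_like))
```

With the first set of states, a single step on the branch to the mammal tip explains the data; with the second, the ordered cost scheme requires three steps, because a shift between states 0 and 3 is counted as three unit changes.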

The most complex patterns of variation and evolution of well-developed islands are exhibited by the s-myc, mycL2, and N-myc2 retrogenes of murids, Homo, and Marmota, respectively (Fig. 1). The mycL2 retrogene is the only one of these four to share the same pattern of well-developed islands with the other mammalian sequences of its gene cluster. CG-rich islands are associated with both exons of the s-myc retrogene of Rattus, but only with exon 3 of Mus. As noted above, the N-myc2 retrogene varies from the N-myc1 orthologues of mammals by its lack of a well-developed island within exon 3. Interestingly, this variation within exon 3 exists even though the N-myc2 sequence and N-myc1 orthologues overlap broadly in their CG% and CG 3% values (Table 1). These close similarities reveal that the lack of a well-developed island in the N-myc2 retrogene is not related to a diminished CG composition.

The presence of a well-developed island within exon 2 of the s-myc retrogene of Rattus, but not Mus, illustrates that these structures can be gained or lost within the time scale of a mammalian subfamily. Our character optimizations with exon 2 indicate that the CG-rich island of the s-myc gene is due to a unique gain within Rattus (Fig. 1A). However, given the current sample of only two s-myc sequences, this explanation cannot be unequivocally accepted over the alternative one for an earlier origin of this CG-rich island within the common ancestor of mammals, followed by its subsequent loss within Mus. Nevertheless, whether due to a gain or loss, this difference between Mus and Rattus suggests an evolutionary event that occurred within the last 10 to 40 million years after their separation (Kumar and Hedges, 1998). As in the N-myc2 retrogene, this variation between the two s-myc sequences exists even though they are nearly identical in their CG compositions (Table 1).
The v-myc sequences of the avian and feline retroviruses offer a different perspective on the time scale in which well-developed islands are gained or lost. The amino acid sequences of these v-myc copies closely resemble those of the c-myc genes of their warm-blooded hosts (Atchley and Fitch, 1995). These close similarities are paralleled by the near identities of their CpG islands and base compositions (Table 1). They are indicative of recent transductions that have left relatively little time for major evolutionary change.

Why isn’t a CG-rich island associated with exon 3 of the N-myc sequences of birds and why are the various retrosequences of mammals so variable in their distributions of well-developed islands? The answers to these questions are most readily attributed to the greater variability in the frequency and location of CpG islands within genes with tissue-specific or limited expression (Larsen et al., 1992). Like the s-myc sequences, the mycL2 and N-myc2 retrogenes show a very restricted pattern of tissue expression, whereas the N-myc gene exhibits a limited one (Henriksson and Lüscher, 1996; Sugiyama et al., 1999). Thus, the more variable distributions of well-developed islands within the N-myc gene and different retrogenes of warm-blooded vertebrates may be related to their limited and highly restricted patterns of expression, respectively. This conclusion indicates that the presence and location of well-developed islands are less critical for the regulation of “luxury” genes than for housekeeping genes.

ACKNOWLEDGMENTS

We thank F.-G. R. Liu and C.-T. Phung for help with the statistical analyses; W. R. Atchley for providing us with a copy of his multiple alignment; B. Bolker, B. F. Koop, M. R. Tennant, and M. L. Wayne for valuable suggestions about the manuscript and research; and the Department of Zoology, University of Florida for financial assistance.

REFERENCES

Aïssani, B., and Bernardi, G. (1991a). CpG islands: Features and distribution in the genomes of vertebrates. Gene 106: 173–183.
Aïssani, B., and Bernardi, G. (1991b). CpG islands, genes and isochores in the genomes of vertebrates. Gene 106: 185–195.
Atchley, W. R., and Fitch, W. M. (1995). Myc and max: Molecular evolution of a family of proto-oncogene products and their dimerization partner. Proc. Natl. Acad. Sci. USA 92: 10217–10221.
Benton, M. J. (1997). “Vertebrate Paleontology,” 2nd ed., Chapman & Hall, London.
Bernardi, G., Hughes, S., and Mouchiroud, D. (1997). The major compositional transitions in the vertebrate genome. J. Mol. Evol. 44: S44–S51.
Brusca, R. C., and Brusca, G. J. (1990). “Invertebrates,” Sinauer, Sunderland, MA.
Cacciò, S., Jabbari, K., Matassi, G., Guermonprez, F., Desgrès, J., and Bernardi, G. (1997). Methylation patterns in the isochores of vertebrate genomes. Gene 205: 119–124.
Cross, S. H., and Bird, A. P. (1995). CpG islands and genes. Curr. Opin. Genet. Dev. 5: 309–314.
Henriksson, M., and Lüscher, B. (1996). Proteins of the Myc network: Essential regulators of cell growth and differentiation. Adv. Cancer Res. 68: 109–182.
Hughes, S., Zelus, D., and Mouchiroud, D. (1999). Warm-blooded isochore structure in Nile crocodile and turtle. Mol. Biol. Evol. 16: 1521–1527.
Kumar, S., and Hedges, S. B. (1998). A molecular timescale for vertebrate evolution. Nature 392: 917–920.
Larsen, F., Gundersen, G., Lopez, R., and Prydz, H. (1992). CpG islands as gene markers in the human genome. Genomics 13: 1095–1107.
Miyamoto, M. M., Porter, C. A., and Goodman, M. (2000). c-myc gene sequences and the phylogeny of bats and other eutherian mammals. Syst. Biol. 49, in press.
Mohammad-Ali, K., Eladari, M. E., and Galibert, F. (1995). Gorilla and orangutan c-myc nucleotide sequences: Inference on hominoid phylogeny. J. Mol. Evol. 41: 262–276.
Sugiyama, A., Noguchi, K., Kitanaka, C., Katou, N., Tashiro, F., Ono, T., Yoshida, M. C., and Kuchino, Y. (1999). Molecular cloning and chromosomal mapping of mouse intronless myc gene acting as a potent apoptosis inducer. Gene 226: 273–283.
Swofford, D. L. (1998). “PAUP*. Phylogenetic analysis using parsimony (*and other methods),” version 4, Sinauer, Sunderland, MA.
Thompson, J. D., Higgins, D. G., and Gibson, T. J. (1994). CLUSTAL W: Improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position specific gap penalties and weight matrix choice. Nucleic Acids Res. 22: 4673–4680.