Methylation of CpG-island-containing genes in human sperm, fetal and adult tissues


Gene, 114 (1992) 203-210. © 1992 Elsevier Science Publishers B.V. All rights reserved. 0378-1119/92/$05.00. GENE 06443



(Human DNA; 5-methylcytosine; methyltransferase; c-Ha-ras and c-myc oncogenes; HPRT; promoter; insulin)

Hamid Ghazi, Felicidad A. Gonzales and Peter A. Jones

Kenneth Norris Jr. Comprehensive Cancer Center, University of Southern California, Los Angeles, CA 90033 (U.S.A.)

Received by A.D. Riggs: September 1991 Revised/Accepted: 7 January/10 January 1992 Received at publishers: 17 February 1992

SUMMARY

The methylation of three human genes containing CpG islands and of one CpG-depleted gene was measured in sperm, fetal and adult tissues. In sperm, c-Ha-ras was extensively methylated in the 3' region, with a methylation-free region extending from the promoter to the third exon. The extent of methylation in the 3' region decreased in fetal cells; however, de novo methylation of sites closer to the island and within exon 1 was apparent. These sites were more completely methylated in adult lymphocytes and kidney. Essentially similar results were obtained with the CpG-island-containing genes c-myc and HPRT (encoding hypoxanthine phosphoribosyl transferase): sites near the CpG islands that were unmethylated in sperm became methylated in fetal and adult cells. The variations in methylation seen in the non-island regions of the c-Ha-ras gene were mirrored in the insulin-encoding gene, which does not contain a CpG island. The results show similar variations in methylation of non-island regions of DNA, which occur independently of expression, and show that regions of extensive methylation in sperm may move closer to CpG islands in fetal and adult somatic cells.

Correspondence to: Dr. P.A. Jones, Kenneth Norris Jr. Comprehensive Cancer Center, University of Southern California, 1441 Eastlake Ave., Los Angeles, CA 90033 (USA) Tel. (213) 224-6503; Fax (213) 224-6593.

Abbreviations: bp, base pair(s); c-Ha-ras, oncogene encoding c-Harvey Ras; DHFR, gene encoding dihydrofolate reductase; HPRT, gene encoding hypoxanthine phosphoribosyl transferase; kb, kilobase(s) or 1000 bp; MTase, methyltransferase; nt, nucleotide(s); SDS, sodium dodecyl sulfate; SSC, 0.15 M NaCl/0.015 M Na3·citrate pH 7.6; VNTR, variable number of tandem repeats.

INTRODUCTION

The methylation of cytosine residues in vertebrate DNA has apparently had a profound evolutionary effect on the distribution of the methyl acceptor site CpG. This dinucleotide occurs at 20% of the statistically expected frequency and is asymmetrically distributed in organisms which contain 5-methylcytosine in their DNA. Most vertebrate DNA is CpG poor, but about 1% contains the dinucleotide at the expected frequency in regions called 'CpG islands', which are often associated with genes (Bird et al., 1986; Gardiner-Garden and Frommer, 1987; Tykocinski and Max, 1984). It has been proposed that CpG has become depleted because of the tendency of 5-methylcytosine to deaminate to thymine (Coulondre et al., 1978), and that CpG islands exist because they are somehow protected from methylation in the germline. The mechanisms protecting CpG islands from germline methylation are not known, but certain DNA sequences seem to have the ability to resist de novo methylation (Kolsto et al., 1986; Szyf et al., 1990). The functions of CpG islands are not understood, but their frequent association with the promoters or the coding regions of genes suggests that they may serve to somehow distinguish genes from bulk DNA (Bird, 1986). The only known examples of methylation of CpG islands in normal cells are those on L1 dispersed repetitive elements (Crowther et al., 1991), human telomeres (DeLange et al., 1990) and on genes located on the inactive X chromosome in female mammalian cells, where the methylation may serve to stabilize permanently transcriptional incompetency (Yen et al., 1984; Pfeifer et al., 1990). In marked contrast, CpG islands on

[Fig. 1 maps. Panel A: c-Ha-ras, scale 1000-6000 bp. Panel B: c-myc, scale 1000-8000 bp. Tracks: GpC, CpG, MspI and TaqI sites; probe positions marked.]

Fig. 1. Maps of c-Ha-ras and c-myc. (A) Map of human c-Ha-ras (Capon et al., 1983). (B) c-myc (Colby et al., 1983). The positions of GpC, CpG, MspI and TaqI sites are shown by the vertical lines and the numbers at the top refer to bp. The exons are shown as open boxes and the VNTR region of c-Ha-ras as a stippled box. The symbol I denotes a CpG island-containing TaqI fragment and the symbol V denotes the VNTR-containing TaqI fragment. The probes within the c-Ha-ras CpG island which are referred to in Figs. 2 and 3 are indicated by P, Q and R. The MspI sites examined with the marked probe in c-myc are shown as S and T, respectively.
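The kind of map described in this legend (positions of CpG, GpC, MspI and TaqI sites along a sequence) can be illustrated with a minimal Python sketch. It is not the procedure used to draw Fig. 1. The sequence `toy` is a hypothetical 24-nt string, the function names are invented for illustration, and the TaqI recognition sequence TCGA is supplied here rather than taken from the text. The observed/expected CpG ratio is computed as (number of CpG × length) / (number of C × number of G), the form commonly attributed to Gardiner-Garden and Frommer (1987).

```python
def find_sites(seq, motif):
    """Return 1-based start positions of every occurrence of motif (overlaps allowed)."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)
        start = seq.find(motif, start + 1)
    return positions


def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CpG * length) / (#C * #G)."""
    seq = seq.upper()
    n_c, n_g = seq.count("C"), seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    return len(find_sites(seq, "CG")) * len(seq) / (n_c * n_g)


# Hypothetical toy sequence, not taken from c-Ha-ras or c-myc
toy = "TTCCGGAATCGATTGCAACCGGTT"
motifs = [("CpG", "CG"), ("GpC", "GC"), ("MspI/HpaII", "CCGG"), ("TaqI", "TCGA")]
for name, motif in motifs:
    print(name, find_sites(toy, motif))
print("CpG obs/exp:", cpg_obs_exp(toy))
```

A ratio near 1 corresponds to CpG occurring at the expected frequency, as in CpG islands, whereas bulk vertebrate DNA gives a much lower value. The toy string is deliberately CpG-rich.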

autosomal genes are often extensively methylated in immortalized cell lines or tumor cells (Antequera et al., 1990; Jones et al., 1990). Because of the demonstrated ability of CpG island methylation to silence gene activity (Borrello et al., 1987; Rachal et al., 1989), this de novo methylation may be functionally important in ensuring the heritable quiescence of genes capable of inducing cellular differentiation. The possibility that the methylation of CpG in promoter sequences can suppress gene activity has led to a large number of reports on the observed inverse relationship between methylation and expression (reviewed by Riggs and Jones, 1983 and Jones and Buckley, 1990). Precise mapping of methylation patterns in the promoter regions within the CpG islands of some X-linked genes, such as phosphoglycerate kinase, has shown that there is an excellent correlation between the presence of 5-methylcytosine in these regions and transcriptional inactivity (Pfeifer et al., 1990; Singer-Sam et al., 1990). However, considerably less is known of the modification patterns present on autosomal genes containing CpG islands. Exceptions are the hamster adenine phosphoribosyl transferase gene, which is highly methylated in the 3' region in sperm and somatic tissues with few expression-related methylation changes (Stein et al., 1983), and the human DHFR CpG island, which is unmethylated at all developmental stages (Migeon et al., 1991). It is unlikely that methylation plays a role in regulating gene activity of such genes, yet it is a major cause of human germline and somatic mutations (Cooper and Youssoufian, 1988).

We have examined the changes in methylation which occur on four human genes, which do or do not contain CpG islands, at different stages of human development. The results show a remarkably consistent variation in methylation in the non-CpG island regions of DNA, which occurs independently of expression. The data also show extensive methylation of dispersed CpG regions in sperm, with regions of methylation maintained at some distance from the islands. The degree of methylation in these regions decreased in fetal cells, with de novo methylation of sites closer to the CpG islands. The consistency of these variations, independent of expression, suggests a mechanism

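[Fig. 2. Panel A: map of the BglI fragment (2273-2638 bp) with the MspI site at 2527; fragments of 365 and 254 bp. Panel B: autoradiograph, lanes 1-8 and M.]

[Fig. 3, panel A: map of the RsaI fragment with MspI sites at 1224, 1639 and 1695; fragments of 815, 741, 415, 326 and 270 bp.]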

Fig. 2. Methylation of c-Ha-ras area R. (A) A map of the BglI fragment from nt position 2273-2638 bp with the CCGG site at 2527 is shown. The expected fragments generated by single and double digests are indicated below. (Panel B) Autoradiograph of Southern blot of 2.7% agarose gel containing 15 µg of DNA from human sperm (lanes 1-2), fetal muscle (lanes 3-4), adult kidney (lanes 5-6) and adult lymphocytes (lanes 7-8). All samples with the exception of the adult lymphocytes were obtained from male subjects. Odd-numbered lanes are single digests with BglI; even-numbered lanes are double digests with BglI+HpaII. Lane M is DNA from adult lymphocytes cut with BglI+MspI.

Methods. Genomic DNA was incubated at 37°C overnight with 4 units of restriction enzyme/µg of DNA. The following day an additional 1 unit/µg of enzyme was added and the reaction incubated for 4 h. For double digests, the DNA was ethanol precipitated after the first digestion. Digested DNA samples were size fractionated using agarose gel electrophoresis in 1 × TAE buffer (40 mM Tris-acetate/1 mM EDTA, pH 8.0). Molecular weights were calculated using HindIII-digested bacteriophage λ DNA and HaeIII-digested φX174 replicative form DNA as Mr standards. DNA was transferred to nylon membranes (Zetabind, Cuno) according to the procedure of Southern (1975). DNA was cross-linked to the membrane by ultraviolet illumination (Church and Gilbert, 1984). Gel-purified probe DNA fragments were labeled with [α-32P]dCTP according to the procedure of Feinberg and Vogelstein (1984). The membrane was prehybridized overnight at 42°C in 5 × SSC/10 × Denhardt's solution/0.05 M Na·phosphate/1% SDS/5% dextran sulfate/50% formamide/500 µg per ml of denatured sonicated salmon testis DNA, pH 6.7. The probe was hybridized overnight at 42°C in 5 × SSC/2 × Denhardt's solution/0.02 M Na·phosphate/1% SDS/10% dextran sulfate/50% formamide/100 µg per ml of denatured sonicated salmon testis DNA, pH 6.7. The membrane was washed at 42°C in 0.2 × SSC/0.2% SDS for 20 min, and then exposed to X-ray film at -80°C.
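The legend states that molecular weights were calculated from size standards but does not say how. One common approach, assumed here and not confirmed by the text, is to fit log10(fragment size) against migration distance for the standards and read unknown bands off the fitted line. In the sketch below the fragment sizes are a subset of the φX174/HaeIII ladder, while the migration distances (mm) and the function names are hypothetical.

```python
import math


def fit_semilog(standards):
    """Least-squares fit of log10(size_bp) = intercept + slope * distance.

    standards: list of (migration_distance, size_bp) pairs.
    """
    xs = [d for d, _ in standards]
    ys = [math.log10(s) for _, s in standards]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_size(distance, intercept, slope):
    """Fragment size (bp) predicted for a band at the given migration distance."""
    return 10 ** (intercept + slope * distance)


# phiX174/HaeIII fragment sizes (bp) paired with HYPOTHETICAL distances (mm)
standards = [(20.0, 1353), (24.0, 1078), (28.0, 872),
             (35.0, 603), (48.0, 310), (56.0, 234)]
intercept, slope = fit_semilog(standards)
print(round(estimate_size(40.0, intercept, slope)), "bp (illustrative only)")
```

The semi-log relationship holds only over a limited size range for a given agarose concentration, so in practice the standards should bracket the bands of interest.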

for the maintenance of CpG islands and changes in methylation patterns which are dependent on CpG distribution rather than expression.

RESULTS AND DISCUSSION

(a) Methylation at specific sites near the c-Ha-ras promoter

A map of the human c-Ha-ras (Capon et al., 1983) showing GpC and CpG distribution, location of appropriate

[Fig. 3, panel B: autoradiograph, lanes 1-12, with size markers at 815 and 270 bp.]

Fig. 3. Methylation analysis of area Q (Figs. 1, 4) in the CpG island of c-Ha-ras. (A) A map of the RsaI fragment from positions 1150-1965 bp with three CCGG sites at bp positions 1224, 1639 and 1695. The horizontal lines represent fragments generated in the single and double digests with RsaI and/or MspI. (Panel B) DNA (15 µg) from human sperm (lanes 1-3), fetal muscle tissue (lanes 4-6), adult kidney (lanes 7-9) or adult lymphocytes (lanes 10-12) was digested with RsaI. Following ethanol precipitation, 5 µg of restricted DNA was digested with a second enzyme, HpaII (lanes 2, 5, 8 and 11) or MspI (lanes 3, 6, 9 and 12). Lanes 1, 4, 7 and 10 were digested with RsaI only. Restricted DNA samples were electrophoresed on a 2% agarose gel and the Southern transfer was hybridized with probe Q as described in the legend to Fig. 2.
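The fragment sizes expected from these maps follow from simple arithmetic on the site coordinates. The sketch below is an illustration under stated assumptions, with an invented function name: it enumerates the HpaII products of the Fig. 3 RsaI fragment for every combination of methylated sites, treating each coordinate as the cut position and assuming that HpaII cuts every unmethylated CCGG and no methylated one. It does not model which fragments overlap probe Q, so not every computed fragment would be visible on the blot.

```python
from itertools import product


def hpaii_fragments(start, end, sites, methylated):
    """Fragment lengths (bp) when HpaII cuts every unmethylated CCGG site.

    start, end: coordinates of the flanking cuts (here RsaI or BglI).
    sites: CCGG coordinates; methylated: the subset HpaII cannot cut.
    """
    cuts = [start] + sorted(s for s in sites if s not in methylated) + [end]
    return [b - a for a, b in zip(cuts, cuts[1:])]


# Coordinates from the Fig. 3 legend: RsaI fragment 1150-1965, CCGG at 1224, 1639, 1695
sites = (1224, 1639, 1695)
for state in product([False, True], repeat=len(sites)):
    methylated = {s for s, is_meth in zip(sites, state) if is_meth}
    print("methylated:", sorted(methylated), "->",
          hpaii_fragments(1150, 1965, sites, methylated))
```

With no site methylated the products include the 415- and 270-bp fragments. Methylation of both 1639 and 1695 gives a 741-bp fragment, and methylation of 1695 alone gives 326 bp. The same function applied to the Fig. 2 BglI fragment, `hpaii_fragments(2273, 2638, (2527,), set())`, gives 254 and 111 bp.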

restriction enzyme sites, probes, exons and the variable number of tandem repeats (VNTR) region is shown in Fig. 1A. The gene is G+C rich (greater than 50%) and contains a CpG island in the 5' region which, by the definition of Gardiner-Garden and Frommer (1987), extends into the fourth exon. Human DNA obtained from sperm, 20-week-old fetal tissues and adult kidney and lymphocytes was analyzed for methylation of a subset of CpG sites using HpaII and MspI. Both enzymes cut CCGG sites, but HpaII is inhibited by methylation of the internal cytosine residue. Earlier studies using TaqI digests of DNA obtained from sperm, fetal tissue and adult lymphocytes and kidney showed that all of the MspI sites contained in fragment V (see Fig. 1A) were methylated in sperm and that there was substantial demethylation of these sites in fetal tissues (Ghazi et al., 1990). Most of these sites, with the exception of a specific cytosine at position 3600 bp, were methylated in lymphocytes, implying a demethylation followed by remethylation during development. In contrast, fragment I was largely destroyed by HpaII digestion of DNA from all tissues.

The methylation of specific sites within fragment R (Fig. 1A) was assessed in the experiment shown in Fig. 2. Single digests with BglI of DNA from sperm, fetal muscle, adult kidney or lymphocytes gave the expected bands at 365 bp when probed with probe R. Double digestion of lymphocyte DNA with BglI+MspI gave a 254-bp band, showing complete cutting of the CCGG site at nt position 2527 (lane M). Fig. 2 also shows that the majority of the band at 365 bp was cut by HpaII in sperm DNA (lane 2) but not in the DNA from any other tissue (lanes 4, 6 and 8), showing that this site was substantially unmethylated in sperm, unlike the other tissues (see Fig. 4).

The methylation of three sites contained within fragment Q was examined next (Fig. 3). An 815-bp band was visible in RsaI digests of sperm, fetal muscle, adult kidney and lymphocytes (lanes 1, 4, 7 and 10, respectively). This band was reduced to two bands of 415 and 270 bp in sperm DNA by digestion with HpaII or MspI, showing that sites 1224, 1639 and 1695 were unmethylated. Site 1224 was unmethylated in all tissues, but sites 1639 and 1695 were progressively more modified in fetal muscle, adult kidney and lymphocytes (see bands at 741 bp of increasing intensities). Experiments using fragment P as probe and double digests of genomic DNA with BglII+HpaII showed that sites 333 and 528 were unmethylated in all samples tested (results not shown). The region between sites 1224 and 1639 might therefore represent a boundary of methylation in somatic tissues, since no methylation was detected 5' of site 1224. Determination of the exact limits of a putative boundary will require detailed genomic sequencing of this area.

The results from Figs. 2 and 3 and our earlier experiments (Ghazi et al., 1990; Ghazi, 1990) are summarized in the diagram shown in Fig. 4. A band of methylation extends to at least nt position 3200 in c-Ha-ras in sperm. All MspI sites within this region are completely methylated, but site 2527 is almost completely unmethylated and no methylation is present in the five sites examined 5' to this position. The limit of extensive methylation seen in sperm is therefore located between nt 2527 and 3200. The completeness of methylation in the 3' region of the gene decreases in fetal cells, but partial methylation of sites nearer the 5' CpG island, as far as position 1639, is apparent. The degree of methylation of these sites increases in adult cells, with one area of partial undermethylation remaining in the vicinity of nt position 3500 in lymphocytes. Since DNA methylation patterns are known to be tissue specific and we were not able to compare identical fetal and adult tissues, the developmental significance of these observations is not yet clear.

[Fig. 4 diagram: rows for sperm, fetal, adult blood and adult kidney on a 1000-6000 bp scale with MspI sites.]

Fig. 4. Summary of the methylation of c-Ha-ras in different human tissues. A map of c-Ha-ras with its four exons (open boxes), VNTR region (stippled box) and MspI sites (vertical lines) is shown. The solid bars and circles represent complete methylation, the stippled bar and circles represent partial methylation and the open circles represent absence of methylation. The asterisks mark the CCGG sites that were analysed in this work; the data for the 3000-6000 bp bar regions are taken from Ghazi et al. (1990).

Fig. 5. Changes in the methylation of c-myc in human tissues. (A) A map of the 3' end of the c-myc gene is shown. MspI sites S and T are marked as in Fig. 1B and the EcoRI site is indicated as a vertical line. The horizontal lines represent the fragments generated in double digests with EcoRI and HpaII or MspI. (Panel B) DNA (10 µg) from sperm (lanes 1 and 2), fetal muscle (lanes 3 and 4) or adult lymphocytes (lanes 5 and 6) was digested with EcoRI. Following ethanol precipitation, 5 µg of restricted DNA was digested with a second enzyme, MspI (odd lanes) or HpaII (even lanes). Restricted samples were electrophoresed on a 0.8% agarose gel and the Southern transfer was hybridized with the probe marked in Fig. 1B. The 1.1-, 2.3- and 3.1-kb fragments (left margin) are shown in A.

The restriction enzymes used allowed us to examine only a subset of potential modification sites; however, recent data using direct genomic sequencing in a number of experimental systems have shown that HpaII/MspI analysis gives a remarkably accurate indication of the methylation status of a given area (Jones et al., 1990; Toth et al., 1990). Therefore the restriction enzyme results probably give a good representation of the overall modification levels in the regions examined.

The fact that all MspI sites located in the 3' region of c-Ha-ras are completely methylated in sperm (Ghazi, 1990; Ghazi et al., 1990) suggests that these relatively dispersed CpG sites are accessible to the MTase during spermatogenesis. However, it is unclear what prevents the methylation of sites 5' to nt position 2527, since some of these sites are clearly susceptible to modification in fetal and adult tissues. A possible explanation is that proteins bound to these regions during gametogenesis prevent access by the MTase. Recent results obtained with an in vitro methylated adenine phosphoribosyl transferase CpG island in transgenic mice have shown that the island became rapidly and extensively demethylated in early embryonic cells (Frank et al., 1991). There may also therefore be active mechanisms to remove island methylation in early development.
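The site-by-site results for the c-Ha-ras sites examined in Figs. 2 and 3 can be tabulated to check where the 5' limit of methylation falls in each tissue. The sketch below is an illustrative encoding, not part of the original analysis. Each tissue is reduced to the set of examined CCGG sites at which any methylation was reported; this coarse coding is an assumption of the sketch, and partial and complete methylation are not distinguished. Site 2527 in sperm is coded as unmethylated although the text describes it as almost completely unmethylated. The function and variable names are hypothetical.

```python
# CCGG sites examined in this work (nt positions), ordered 5' to 3'
SITES = [333, 528, 1224, 1639, 1695, 2527]

# Sites with detectable methylation per tissue, coded from the text (assumed coding)
methylation_detected = {
    "sperm": set(),
    "fetal muscle": {1639, 1695, 2527},
    "adult kidney": {1639, 1695, 2527},
    "adult lymphocytes": {1639, 1695, 2527},
}


def five_prime_limit(sites, methylated):
    """Scan 5' to 3' and return (last unmethylated site, first methylated site).

    The second element is None when no examined site is methylated; the first
    is None when the 5'-most site is itself methylated.
    """
    last_unmethylated = None
    for site in sorted(sites):
        if site in methylated:
            return last_unmethylated, site
        last_unmethylated = site
    return last_unmethylated, None


for tissue, methylated in methylation_detected.items():
    print(tissue, five_prime_limit(SITES, methylated))
```

For the somatic tissues the limit falls between sites 1224 and 1639. For sperm none of the six sites is methylated, so the limit lies 3' of site 2527, in agreement with the text, which places it between nt 2527 and 3200.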

(b) Methylation of c-myc and HPRT

The c-myc and HPRT genes are autosomal and X-linked genes, respectively, which both contain CpG islands

[Fig. 6, panel B: autoradiograph with size markers at 0.9 and 0.7 kb.]

Fig. 6. Methylation of HPRT. (A) A map of the MspI sites (vertical lines) in the promoter and the first exon of HPRT is shown (Wolf et al., 1984). The horizontal lines represent the 0.9-kb and 0.7-kb fragments expected to hybridize to the probe after MspI digestion. (Panel B) DNA (10 µg) from sperm (lanes 1 and 2), muscle obtained from a male fetus (lanes 3 and 4) and adult lymphocytes (female) (lanes 5 and 6) was digested with MspI (odd lanes) or HpaII (even lanes). Restricted DNA was electrophoresed on a 0.8% agarose gel and the Southern transfer hybridized with probe pPB1.7 (see A).

(Gardiner-Garden and Frommer, 1987; see Colby et al. (1983) and Fig. 1B for map of c-myc). The methylation of sites in close proximity to the islands was measured to see whether the strong exclusion of methylation from the CpG island region of the c-Ha-ras gene seen in sperm was a general property of CpG islands. Fig. 5 shows results of the digestion of human DNA with EcoRI and HpaII or MspI followed by Southern analysis probed with the c-myc probe shown in Fig. 1B. Bands of 2.3 and 1.1 kb, corresponding to cutting of sites S and T by MspI, were seen in DNA obtained from sperm, fetal muscle and adult lymphocytes (lanes 1, 3 and 5). Site T was partially methylated in sperm, as shown by the appearance of an additional 3.1-kb band in HpaII digests (lane 2). The extent of methylation at this site increased in fetal cells and was complete in human lymphocytes, since the 3.1-kb band was present predominantly, with little evidence of the 2.3- and 1.1-kb bands after digestion with HpaII. Other analysis (Ghazi, 1990) showed that the next MspI site, which is located 3.6 kb downstream from site T, is methylated in sperm, fetal tissue and adult lymphocytes. The results shown in Fig. 5 are therefore similar to the analysis of c-Ha-ras and showed de novo methylation of sites towards the c-myc island.

The methylation of sites close to the HPRT CpG island was measured next (Fig. 6). Digestion of human DNA with MspI showed bands of 0.9 and 0.7 kb which hybridized to the probe after Southern analysis. All three sites within the probe were partially unmethylated in sperm, since both bands were present on HpaII digestion. The absence of the 0.9-kb band in HpaII digests of fetal or adult tissues showed that the site close to the CpG island had become methylated in these tissues. It was also evident that methylation had moved into the CpG island in the adult lymphocyte sample, which had been obtained from a female, since distinct higher-molecular-weight bands were visible on the blot.

(c) Methylation of the insulin gene

The insulin gene is CpG depleted and is expressed in a highly tissue-specific manner. Fig. 7 shows a Southern blot of human DNA restricted with HpaII or MspI and hybridized with a probe corresponding to the coding region of the

[Fig. 7. Panel A: map of the insulin gene, scale 1000-4000 bp, with MspI sites and probe position. Panel B: autoradiograph, lanes 1-6, with size markers at 9.4, 4.4, 2.0, 1.1, 0.6 and 0.3 kb.]

Fig. 7. Changes in the overall methylation of the insulin-encoding gene during human development. (A) A map of the MspI sites within the insulin gene (vertical lines), with its three exons (open boxes) and VNTR region (stippled box) (Bell et al., 1981). (Panel B) DNA (5 µg) from sperm (lanes 1 and 2), fetal muscle tissue (lanes 3 and 4) or adult lymphocytes (lanes 5 and 6) was digested with MspI (odd lanes) or HpaII (even lanes). Restricted DNA was electrophoresed on a 0.8% agarose gel and the Southern transfer was hybridized with the probe marked in A.

gene. Three low-molecular-weight bands between 0.3 and 0.6 kb were present in all tissues, and sites flanking these were extensively methylated in sperm, since only fragments > 20 kb were present in HpaII digests of the same DNA. Methylation of these sites was considerably less in DNA extracted from muscle tissues of a 20-week-old fetus (lane 4) and was intermediate in adult lymphocyte DNA (lane 6). We have completed more detailed mapping of the methylation status of specific sites within the gene, and the data show essentially the same pattern, with extensive methylation of dispersed CpG sites in sperm, decreased methylation in fetal non-expressing tissues and intermediate modification in adult tissues (Ghazi et al., 1990).

(d) Conclusions

(1) The changes in methylation patterns seen in three CpG island-containing autosomal genes were quite similar to each other. Regions 3' of the islands were methylated extensively in sperm and became less methylated in fetal cells, with de novo methylation of sites closer to the islands.

(2) The above observations are similar to those recently reported for the human tissue-specific gene apolipoprotein AI (Shemer et al., 1990), the mouse MyoD1 (Jones et al., 1990) and several X-linked genes (Migeon et al., 1991), all of which contain CpG islands. These variations in methylation therefore seem to be linked to the island nature of the DNA rather than to the extent of expression.

(3) Examination of the CpG-depleted insulin-encoding gene showed essentially the same behavior as CpG-depleted regions of island-containing genes. The insulin-encoding gene was extensively methylated in sperm, was less methylated in fetal tissues independently of expression, and showed increased methylation in adult tissues. These data are in conflict with the idea that extensive methylation of genes in germ cells represents a ground state of methylation and that specific demethylation events are responsible for activating genes during development (Riggs and Jones, 1983).

(4) The biological significance of this methylation is unknown but, as mentioned above, it is unlikely to be linked to expression, since c-Ha-ras is an example of a 'housekeeping gene' and is expressed in a large number of different cell types including kidney and some lymphocytes (Furth et al., 1987).

(5) The exclusion of methylation from the CpG islands of all three genes in human sperm may be responsible for their maintenance, since it has been suggested that methylation of CpG-rich regions in the germline would inevitably lead to CpG depletion due to the tendency of 5-methylcytosine to deaminate to thymine (Bird, 1986). The mechanisms for this exclusion of methylation in sperm are not understood, but may be due to the presence of cis-acting sequences which are able to protect CpG islands such as Thy-1 from de novo methylation (Kolsto et al., 1986; Szyf et al., 1990).

(6) The level of 5-methylcytosine in the coding regions of genes is extensive (see for example Fig. 4), and its presence there has been hypothesized to be responsible for 30-35% of all human germline mutations (Cooper and Youssoufian, 1988). The methylation of coding sequences in sperm may therefore be a major contributor to genetic disease. The presence of 5-methylcytosine in expressed genes in somatic cells may also be of significance if the genes play a role in cellular growth control. Indeed, we have recently found, by direct genomic sequencing, that several hot spots for mutations in the p53 tumor suppressor gene contain 5-methylcytosine (Rideout et al., 1990). The presence of 5-methylcytosine in the coding sequences of human DNA is therefore likely to be important in the generation of genetic disease and cancer.
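Because deamination of 5-methylcytosine yields thymine, the mutations attributed to it appear as C to T changes at CpG dinucleotides, or as G to A changes when the event occurred on the opposite strand. The sketch below shows how a substitution could be screened for this signature. It is a hypothetical helper with an invented name and a made-up example sequence, and a positive result indicates only that a mutation is consistent with deamination, not that methylation was present at the site.

```python
def is_cpg_transition(ref_seq, pos, alt):
    """True if the substitution at 1-based pos is C>T or G>A within a CpG dinucleotide."""
    ref_seq, alt = ref_seq.upper(), alt.upper()
    i = pos - 1
    ref = ref_seq[i]
    if ref == "C" and alt == "T":
        # C of a CpG: the next base must be G (empty slice at the sequence end)
        return ref_seq[i + 1:i + 2] == "G"
    if ref == "G" and alt == "A":
        # G of a CpG: the deamination occurred on the complementary strand
        return i > 0 and ref_seq[i - 1] == "C"
    return False


# Made-up reference sequence for illustration
example = "ATCGTACCA"
for pos, alt in [(3, "T"), (4, "A"), (7, "T"), (3, "A")]:
    print(pos, example[pos - 1] + ">" + alt, is_cpg_transition(example, pos, alt))
```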

ACKNOWLEDGEMENTS

This work was supported by grant R35 CA49758 from the National Cancer Institute.

REFERENCES

Antequera, F., Boyes, J. and Bird, A.P.: High levels of de novo methylation and altered chromatin structure at CpG islands in cell lines. Cell 62 (1990) 503-514.

Bell, G.I., Karam, J.H. and Rutter, W.J.: Polymorphic DNA region adjacent to the 5' end of the human insulin gene. Proc. Natl. Acad. Sci. USA 78 (1981) 5759-5763.

Bird, A.P.: CpG-rich islands and the function of DNA methylation. Nature 321 (1986) 209-213.

Borrello, M.G., Pierotti, M.A., Bongarzova, I., Donghi, R., Mondellini, P. and Porta, G.D.: DNA methylation affecting the transforming activity of the human Ha-ras oncogene. Cancer Res. 47 (1987) 75-79.

Capon, D.J., Chen, E.Y., Levinson, A.D., Seeburg, P.H. and Goeddel, D.V.: Complete nucleotide sequences of the T24 human bladder carcinoma oncogene and its normal homologue. Nature 302 (1983) 33-37.

Church, G.M. and Gilbert, W.: Genomic sequencing. Proc. Natl. Acad. Sci. USA 81 (1984) 1991-1995.

Colby, W.W., Chen, E.Y., Smith, D.H. and Levinson, A.D.: Identification and nucleotide sequence of a human locus homologous to the v-myc oncogene of avian myelocytomatosis virus MC29. Nature 301 (1983) 722-725.

Cooper, D.N. and Youssoufian, Y.: The CpG dinucleotide and human genetic disease. Hum. Genet. 78 (1988) 151-155.

Coulondre, C., Miller, J.H., Farabaugh, P.J. and Gilbert, W.: Molecular basis of base substitution hotspots in Escherichia coli. Nature 274 (1978) 775-780.

Crowther, P.J., Doherty, J.P., Linsenmeyer, M.E., Williamson, M.R. and Woodcock, D.M.: Revised genomic consensus for the hypermethylated CpG island region of the human L1 transposon and integration sites of full length L1 elements from recombinant clones made using methylation-tolerant host strains. Nucleic Acids Res. 19 (1991) 2395-2401.

DeLange, T., Shine, L., Myers, R.M., Cox, D.R., Naylor, S., Killery, A.M. and Varmus, H.E.: Structure and variability of human chromosome ends. Mol. Cell. Biol. 10 (1990) 518-527.

Feinberg, A.P. and Vogelstein, B.: A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 137 (1984) 266-267.

Frank, D., Keshet, I., Shani, M., Levine, A., Razin, A. and Cedar, H.: Demethylation of CpG islands in embryonic cells. Nature 351 (1991) 239-241.

Furth, M.E., Aldrich, T.M. and Cordon-Cardo, C.: Expression of ras proto-oncogene proteins in normal human tissues. Oncogene 1 (1987) 47-58.

Gardiner-Garden, M. and Frommer, M.: CpG islands in vertebrate genomes. J. Mol. Biol. 196 (1987) 261-282.

Ghazi, H.: Changes of DNA Methylation Patterns during Human Development and Transformation. Ph.D. Thesis, University of Southern California, Los Angeles, CA, 1990.

Ghazi, H., Magewu, A.N., Gonzales, F. and Jones, P.A.: Changes in the allelic methylation patterns of c-H-ras-1, insulin and retinoblastoma genes in human development. Development Suppl. (1990) 115-124.

Jones, P.A. and Buckley, J.D.: The role of DNA methylation in cancer. Adv. Cancer Res. 54 (1990) 1-23.

Jones, P.A., Wolkowicz, M.J., Rideout, W.M., Gonzales, F.A., Marziasz, C.M., Coetzee, G.A. and Tapscott, S.J.: De novo methylation of the MyoD1 CpG island during the establishment of immortal cell lines. Proc. Natl. Acad. Sci. USA 87 (1990) 6117-6121.

Kolsto, A.B., Kollias, G., Giguere, V., Isobe, K.-I., Prydz, H. and Grosveld, F.: The maintenance of methylation-free islands in transgenic mice. Nucleic Acids Res. 14 (1986) 9667-9678.

Migeon, B.R., Holland, M.M., Driscoll, D.J. and Robinson, J.C.: Programmed demethylation in CpG islands during human fetal development. Somat. Cell. Mol. Genet. 17 (1991) 159-168.

Pfeifer, G.P., Steigerwald, S.D., Hansen, R.S., Gartler, S.M. and Riggs, A.D.: Polymerase chain reaction-aided genomic sequencing of an X chromosome-linked CpG island: methylation patterns suggest clonal inheritance, CpG site autonomy, and an explanation of activity state stability. Proc. Natl. Acad. Sci. USA 87 (1990) 8252-8256.

Rachal, M.J., Yoo, H., Becker, F.F. and Lapeyre, J.-N.: In vitro DNA cytosine methylation of cis-regulatory elements modulates c-Ha-ras promoter activity in vivo. Nucleic Acids Res. 17 (1989) 5135-5147.

Rideout, W.M., Coetzee, G.A., Olumi, A.F. and Jones, P.A.: 5-Methylcytosine as an endogenous mutagen in the human LDL receptor and p53 genes. Science 249 (1990) 1288-1290.

Riggs, A.D. and Jones, P.A.: 5-Methylcytosine, gene regulation and cancer. Adv. Cancer Res. 40 (1983) 1-30.

Shemer, R., Walsh, A., Eisenberg, S., Breslow, J.L. and Razin, A.: Tissue-specific methylation patterns and de-expression of the human apolipoprotein AI site gene. J. Biol. Chem. 265 (1990) 1010-1015.

Singer-Sam, J., Grant, M., LeBon, J.M., Okuyama, K., Chapman, V., Monk, V. and Riggs, A.D.: Use of a HpaII polymerase chain reaction assay to study DNA methylation in the Pgk-1 CpG island of mouse embryos at the time of X-inactivation. Mol. Cell. Biol. 10 (1990) 4987-4989.

Southern, E.M.: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98 (1975) 503-517.

Stein, R., Scialy-Gallini, N., Razin, A. and Cedar, H.: Pattern of methylation of two genes coding for housekeeping functions. Proc. Natl. Acad. Sci. USA 80 (1983) 2422-2426.

Szyf, M., Tanigawa, G. and McCarthy, P.L.: A DNA signal from the Thy-1 gene defines de novo methylation patterns in embryonic stem cells. Mol. Cell. Biol. 10 (1990) 4396-4400.

Toth, M., Muller, U. and Doerfler, W.: Establishment of de novo DNA methylation patterns. J. Mol. Biol. 214 (1990) 673-683.

Tykocinski, M.L. and Max, E.E.: CG clusters in MHC genes. Nucleic Acids Res. 12 (1984) 4385-4396.

Wolf, S.F., Jolly, D.J., Lunnen, K.D., Friedmann, T. and Migeon, B.R.: Methylation of the hypoxanthine phosphoribosyltransferase locus on the human X chromosome. Implication for X chromosome inactivation. Proc. Natl. Acad. Sci. USA 81 (1984) 2806-2810.

Yen, P.H., Patel, P., Chinault, A.C., Mohandas, T. and Shapiro, L.J.: Differential methylation of hypoxanthine phosphoribosyltransferase genes on active and inactive X chromosomes. Proc. Natl. Acad. Sci. USA 81 (1984) 1759-1763.