TIBTECH - DECEMBER
Two-dimensional DNA typing
André G. Uitterlinden and Jan Vijg

A. G. Uitterlinden and J. Vijg are at the Department of Molecular Biology, TNO Institute for Experimental Gerontology, PO Box 5815, 2280 HV Rijswijk, The Netherlands.

DNA typing based on gel electrophoretic separation of DNA fragments, followed by hybridization analysis, has become an important analytical tool in areas ranging from forensic science to population biology. This approach can be extended by combining size separation with sequence-specific separation in denaturing gradient gels; this creates a high resolution two-dimensional pattern. The high information content of this system means that very closely related individuals (even monozygotic twins) can be distinguished and that the genetic events associated with development or cancer, for instance, can be followed. Ultimately, 2-D DNA typing could lead to computerized matching of a single individual's genome to a database of genetic markers.

Southern hybridization analysis 1 allows specific DNA sequences to be analysed: total chromosomal DNA is digested with a restriction enzyme, the restriction fragments are separated by length by agarose gel electrophoresis, and a labelled, homologous DNA or RNA probe is then used to detect the sequence of interest. The length of a particular restriction fragment (and hence its electrophoretic migration) can vary among different individuals, depending on whether a restriction enzyme recognition site is present or absent, or on whether the fragment contains insertions or deletions. Such variations are referred to as restriction fragment length polymorphisms (RFLPs). A particular type of RFLP is due to variation at minisatellite sequences, regions in the genome containing multiple copies of a short sequence motif (the repeat unit) in a tandem array (Fig. 1a). A minisatellite region can be extremely polymorphic because the copy number of the repeat unit varies widely (from ten to several hundred); the alternative name for this type of sequence, Variable Number of Tandem Repeats (VNTRs) 2-7, perhaps reflects their nature better.
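The link between repeat copy number and restriction fragment length can be made concrete with a small sketch. Everything in it is an illustrative assumption rather than data from the article: the 15 bp repeat unit, the 20 bp flanks and the helper names `digest` and `vntr_allele` are made up, and the only real-world fact used is that HaeIII recognizes GGCC and cuts between GG and CC.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Return fragment lengths after cutting at every occurrence of `site`."""
    cuts = []
    start = sequence.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def vntr_allele(copies: int) -> str:
    """Hypothetical allele: site + 5' flank + tandem repeats + 3' flank + site."""
    return SITE + FLANK5 + UNIT * copies + FLANK3 + SITE


SITE = "GGCC"              # HaeIII recognition sequence, cut after the second G
UNIT = "AGGTGGGCAGGTGGA"   # made-up 15 bp repeat unit (contains no GGCC)
FLANK5 = "ATTA" * 5        # made-up 20 bp flanking sequences
FLANK3 = "TAAT" * 5

# Two alleles of the same hypothetical locus, differing only in copy number.
short_allele = max(digest(vntr_allele(10), SITE, cut_offset=2))
long_allele = max(digest(vntr_allele(25), SITE, cut_offset=2))
print(short_allele, long_allele)  # minisatellite-containing fragment lengths in bp
```

Because the enzyme does not cut inside the repeat unit, the whole array stays on one fragment and the length difference between the two alleles equals the difference in copy number times the unit length.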
DNA fingerprinting
Jeffreys et al. 6 showed that part of the repeat unit sequence was a 'core', homologous to corresponding parts of repeat units at other minisatellite loci (Fig. 1b). By using a core sequence as a probe in Southern hybridization analysis, many hypervariable minisatellites could be detected simultaneously 6. In this way, individual-specific 'DNA fingerprints' could be obtained. Several different core sequences have been identified 6,7, each of which detects a distinct set of minisatellite sequences with up to several hundred members. DNA fingerprinting has been very successful as an analytical tool for distinguishing individuals, in forensic science (to identify blood and semen traces), in paternity testing and in population biology (for a review, see Ref. 8). In addition, DNA fingerprinting has been applied to analyse simultaneously alleles of multiple polymorphic loci, spread through the genome, in mapping of genetic diseases by linkage analysis 9.
© 1989, Elsevier Science Publishers Ltd (UK) 0167-9430/89/$2.00
1989 [Vol. 7]
However, DNA fingerprinting, as described, has some limitations. The number of minisatellite loci detected by one core probe is far too large to allow all minisatellite-containing restriction fragments to be resolved in a single electrophoretic lane. The limited resolution of agarose gel electrophoresis means that only the largest fragments can be resolved. In addition, it is almost always impossible to identify particular bands in a DNA fingerprint as alleles from the same locus; only when both the fragments containing the alleles happen to be large can they be resolved in the high molecular weight area 9. Furthermore, not knowing the chromosomal origin of a given minisatellite-containing fragment from a DNA fingerprint may make establishing family relationships difficult 10. In short, then, DNA fingerprinting suffers from low resolution and a consequent inability to distinguish different alleles from the same locus. These factors have prevented its more widespread application in linkage analysis of heritable disorders.

To identify alleles it is possible to use locus-specific VNTR probes 2-7. These include both the core and non-core sequences of the repeat unit, and flanking sequences of the minisatellite region. Such a probe detects only a single locus in the genome, which will usually have multiple alleles in the human population. Provided the restriction enzyme does not cut within individual repeat units of the minisatellite locus, the probe will hybridize to only one or two restriction fragments per individual (the alleles from one locus). Although the use of locus-specific probes helps identify particular bands, only one locus at a time can be examined. As a consequence, each gel provides much more limited information than would be the case with a locus-nonspecific probe.
An alternative approach, which does not sacrifice information content for resolution, is to perform the separation of DNA fragments using two independent properties of DNA: size (agarose or polyacrylamide gel electrophoresis) and melting characteristics (denaturing gradient gel electrophoresis; see Box 1). Core probes can be used to identify fragments separated in this way. This is the basis of two-dimensional DNA typing.

Fig. 1. Minisatellite sequences and polymorphisms. (a) The structure of minisatellite sequences and their organization in the genome. (b) Restriction fragment length polymorphism of VNTR sequences. Homologous chromosomes may differ in the number of repeat units at the VNTR locus. Cleavage at recognition sites of a restriction enzyme (marked in the figure) can be used to generate DNA fingerprints. Symbols in the figure distinguish the repeat unit, the core and the non-core sequence.
Two-dimensional DNA typing
Separation of DNA by denaturing gradient gel electrophoresis (DGGE) is independent of fragment size (see Box 1 and Ref. 11), but highly dependent on sequence composition: identically sized fragments differing only in the identity of one base pair can be separated. In combination with size-based electrophoretic separation, DGGE separation can generate two-dimensional separation patterns (Fig. 2). The analytical power of this 2-D system was first demonstrated by comparing an EcoRI digest of the entire E. coli genome (4 × 10⁶ bp) with that of a lysogenic E. coli strain carrying phage lambda integrated in its chromosomal DNA. Among 350 spots, visualized under UV by fluorescence of the DNA-intercalating dye ethidium bromide, the four additional spots due to bacteriophage lambda could easily be identified 11. However, the application of 2-D DNA electrophoresis to larger genomes (e.g. the human genome at 3 × 10⁹ basepairs) poses technical problems. Stained, electrophoretically separated restriction enzyme digests of human genomic DNA are still essentially unresolvable. It occurred to us that hybridization with repetitive sequences as probes (a form of selective staining, in essence) could render the 2-D technique suitable for analysing the mammalian genome 21. This form of two-dimensional DNA typing could allow multiple genomic sites (e.g. those bearing the same minisatellite core sequence) to be analysed simultaneously for sequence variation. The amount of information that can be obtained from the genome by 2-D DNA typing is highly dependent on, among other things, the sequence characteristics, copy number and chromosomal distribution of the repeat sequences used as probes.
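A back-of-envelope estimate helps show why a stained digest of a human-sized genome is unresolvable while a bacterial one is not. The sketch below assumes an idealized random sequence with equal base frequencies, in which a recognition site of n basepairs occurs on average once every 4^n basepairs; real genomes deviate from this, so the numbers are orders of magnitude only. The function name `expected_fragments` is hypothetical; the genome sizes are those quoted in the text, and EcoRI is taken as a 6-bp cutter.

```python
def expected_fragments(genome_bp: float, site_len: int) -> float:
    """Expected fragment count for a random sequence with equal base frequencies."""
    mean_fragment_bp = 4 ** site_len  # one site per 4^n bp on average
    return genome_bp / mean_fragment_bp


# Genome sizes as quoted in the text; 6 bp is the length of the EcoRI site.
ecoli = expected_fragments(4e6, site_len=6)
human = expected_fragments(3e9, site_len=6)

print(f"E. coli: ~{ecoli:,.0f} fragments")
print(f"human:   ~{human:,.0f} fragments")
print(f"ratio:   {human / ecoli:.0f}x")
```

Under these assumptions the human digest contains several hundred thousand fragments, which is why a probe that lights up only a few hundred of them acts as the selective stain described above.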
Two-dimensional DNA typing of human individuals
Since minisatellites are thought to be dispersed throughout the human genome (albeit with a clustering near the telomeres), and a single core probe detects not more than a few hundred loci, these sequences were prime candidates for two-dimensional electrophoretic analysis. Indeed, the suitability of minisatellite core sequences 33.15 and 33.6 (core sequence nomenclature based on 33 bp repeat, see Ref. 6) as probes in 2-D DNA typing of the human genome was demonstrated by comparing members of a human pedigree 22 (Fig. 3a). Using the minisatellite core sequence 33.6, about 600 spots per individual can be discerned by eye (we also employ image analysis; e.g. Fig. 4). Thus, roughly 300 minisatellite loci (corresponding to 600 alleles) homologous to core sequence 33.6 can be analysed. About 150 spots were polymorphic between the mother and the father, and 105 of these were transmitted to the son (examples shown in Fig. 3b). Thus, many separate genetic loci can be studied in a single gel analysis.

Locus-specific migration patterns of minisatellite alleles in DGGE gels
In order to test whether the system can identify alleles from specific loci, locus-specific VNTR sequences were used as probes (unpublished data).
Box 1
The second dimension: size-independent separation of DNA fragments using denaturing gradient gel electrophoresis

In denaturing gradient gel electrophoresis (DGGE; reviewed in Ref. 12), DNA fragments are separated at an elevated temperature (50°C to 60°C) in a polyacrylamide gel which contains a gradient of urea and formamide 1~. These two chemicals facilitate separation of the two complementary strands of DNA (melting).

DGGE is based largely on three characteristics of double-stranded DNA molecules:
• DNA molecules contain 'melting domains' within which the melting of contiguous bases is closely coupled;
• the temperature at which the strands of a melting domain part is highly dependent on the sequence composition of the domain;
• partially melted DNA molecules (these will contain both double-stranded and single-stranded regions) are much less electrophoretically mobile in polyacrylamide gels than unmelted molecules.

Only low-melting domains can be screened by DGGE analysis for sequence variations.
For each basepair, the temperature at which 50% of the molecules will be in the single-stranded or melted state at that position (Tm50) can be calculated by computer algorithm 13,14. A plot of the Tm50 versus the basepair position along the sequence is termed a melting map and is useful in determining which part(s) of a sequence are amenable to DGGE analysis. DGGE has been used to screen low-melting domains for sequence alterations as small as single basepair changes, both in cloned DNA sequences 13,15-17 and directly in genomic DNA 18-20. In this respect, it is a useful tool in the diagnosis of genetic diseases, such as β-thalassaemia, which are associated with point mutations.
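To give a feel for what a melting map looks like, the sketch below computes a much cruder stand-in: a sliding-window profile based only on local G + C content. This is explicitly not the algorithm of Refs 13 and 14, which models the cooperative melting of domains; the window size, the function names and the toy sequence are assumptions made for illustration, and the linear formula is the classical Marmur-Doty approximation for long DNA in standard saline citrate, so the absolute values should not be compared with gel temperatures.

```python
def gc_fraction(window: str) -> float:
    """Fraction of G and C bases in a stretch of sequence."""
    return sum(base in "GC" for base in window.upper()) / len(window)


def melting_profile(sequence: str, window: int = 25) -> list[float]:
    """Crude melting estimate (deg C) for every window position along the sequence.

    Uses Tm = 69.3 + 0.41 * (%G+C), i.e. 69.3 + 41.0 * GC fraction.
    This proxy ignores cooperativity, stacking and the denaturant gradient.
    """
    if not 1 <= window <= len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        69.3 + 41.0 * gc_fraction(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


def lowest_melting_window(profile: list[float]) -> int:
    """Start index of the window with the lowest estimated melting temperature."""
    return min(range(len(profile)), key=profile.__getitem__)


# Toy sequence: a G+C-rich half followed by an A+T-rich half.
toy = "GGCGCCGCGG" + "ATTATAATAT"
profile = melting_profile(toy, window=10)
lowest = lowest_melting_window(profile)
print(lowest, round(profile[lowest], 1), round(max(profile), 1))
```

Even this proxy reproduces the qualitative point of the box: the A + T-rich end of the toy sequence forms the lowest-melting stretch, and that is the region in which DGGE would be expected to reveal sequence variation.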
[Box 1 figure. (a) Melting maps (Tm50 versus basepair position) for fragments PGP649, ML6, ML7 and ML8; the sequence 5'-TGTATTGATTCACTTGAAGTACGAA-3' is shown with the variant positions marked. (b) Gel lanes for PGP649, ML6, ML7 and ML8.]
DNA melting and denaturing gradient gel electrophoresis. (a) Melting maps of four sequence variants of a DNA restriction fragment of 192 bp derived from bacteriophage Mu 15. The fragments differ only by point mutations at the 3' end of the molecule. Therefore, only the Tm50 of the lowest melting domain (positions 160 to 192 at the 3' side of the molecule) differs. (b) The four fragments migrate to identical positions during electrophoresis in a neutral polyacrylamide gel (left), but to different ones in a denaturing gradient gel (right). The migration on DGGE corresponds to the order predicted by the algorithm. [Plasmids containing the sequence variants of the Mu DNA restriction fragment were kindly provided by Dr M. A. M. Groenen (Department of Biochemistry, University of Leiden, Leiden, The Netherlands).]

Fig. 2. Two-dimensional electrophoresis of DNA fragments (according to Fischer and Lerman; see Ref. 11). The first separation is on the basis of fragment size by neutral agarose or polyacrylamide gel electrophoresis; the second on the basis of sequence by denaturing gradient polyacrylamide gel electrophoresis.
Since locus-specific VNTR probes contain, in addition to the core sequence, the non-core sequence of the repeat unit and sequences adjacent to the minisatellite region, only two spots are expected when an individual's entire genome is analysed (Fig. 5). There is an order of magnitude difference in length between the two alleles shown in Fig. 5 and yet both migrate to very similar positions in the denaturing gradient. This is because any two minisatellite alleles will share the locus-specific 5' sequences adjacent to the minisatellite. In this case, because the specific 5' sequence is A + T rich relative to the repeat sequence, the specific sequence will act as a low melting-point domain and hence largely determine the position of the minisatellite-containing restriction fragment on the denaturing gradient gel (see Ref. 23 and Box 1). This provides a means of establishing relationships among the many spots which are present in the 2-D patterns when a non-specific core probe is used: both alleles from a given locus will be present within the same zone of concentration of denaturants (an isotherm) in the 2-D spot pattern of an individual. Theoretically, it follows that alleles of a certain locus drawn from different individuals, even from different families, will be present at a particular isotherm. This will facilitate identification of spots as alleles from specific loci.

Fig. 3. Pedigree analysis by two-dimensional DNA typing. Genomic DNA from a father, mother and their son [shown in (a)] were digested with HaeIII, separated in two dimensions, transferred to nylon membranes and hybridized with the minisatellite core probe 33.6. (b) Details from the areas shown in (a) indicating the transmission of polymorphic spots. □, paternal and ○, maternal polymorphic spots; →, non-transmitted polymorphic spots. Up to four samples can be analysed on a single 2-D gel. [Axis labels in panel (a): >10 to 0.54 kb; 10% to 75%.]

Two-dimensional DNA typing using other repetitive sequences
Repeat sequence motifs other than minisatellite core sequences 6,7 can be used in 2-D genomic analysis. G + C rich 'simple sequence' core motifs 24 are especially important. They consist of tandem arrays of repeat units of 2-4 basepairs and display a degree of polymorphism (variation in the number of repeat units) similar to that of the minisatellites. Since minisatellites and simple sequences have different distribution patterns in chromosomes, the use of both these types of probes would allow large parts of the genome to be analysed. Several other types of interspersed human repetitive sequences, such as the AluI and KpnI repeat families, do not readily display resolvable spot patterns when used as hybridization probes. The reason for this difference is not clear at the moment.

Applications
Currently, two-dimensional DNA typing requires specialized equipment and skilled personnel in order to obtain reproducible results from independent gels. A similar situation occurs with other complicated techniques such as 2-D protein electrophoresis or genomic sequencing 25. In our lab, up to four samples can be run per 2-D gel and about six gels can be produced per week per person. However, many steps in the experimental protocol are suitable for automation and this should make the system less labour-intensive and more reproducible. We anticipate that 2-D DNA typing could be applied in a number of fields.

Identity testing
In principle, identifying individuals by 2-D DNA typing is much more accurate than by one-dimensional fingerprinting since, due to improved resolution, well over a hundred polymorphisms can be studied simultaneously (compared with not more than 20), using a single core probe. This could be particularly important when one needs to distinguish between closely related individuals (a situation which is not uncommon in cattle-breeding, for instance). Indeed, 2-D DNA typing can even be used to distinguish monozygotic twins (on the basis of mutations during embryogenesis).

Fig. 4. Image analysis of 2-D patterns. The GEMINI system (Joyce-Loebl Ltd, UK) digitizes complete 2-D patterns. Automated pattern matching is interactive and generates datafiles defining both matching and non-matching spots. The digitized 'grey' image (left) and binary image (middle) from the gel shown in Fig. 3 are used to generate a match image (right). In the match image, two patterns (red and green) are compared: filled areas indicate non-matching spots and open areas, matching spots.
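Comparing two individuals comes down to deciding which spots in one digitized pattern have a counterpart in the other. The sketch below shows one simple way this could be done, a greedy nearest-neighbour match within a distance tolerance. It is not a description of the GEMINI software; the coordinate units, the tolerance and the function name `match_spots` are assumptions made for illustration, and the example coordinates are invented.

```python
from math import hypot

# A spot is (size-dimension position, denaturant-dimension position),
# in arbitrary gel units chosen for this illustration.
Spot = tuple[float, float]


def match_spots(a: list[Spot], b: list[Spot], tol: float):
    """Greedy one-to-one matching of spots in `a` to spots in `b`.

    Each spot in `a` is paired with the nearest still-unused spot in `b`
    if that spot lies within `tol`; otherwise it is reported as unmatched.
    Returns (matched pairs, spots only in a, spots only in b).
    """
    unused = list(b)
    matched, only_a = [], []
    for spot in a:
        best = min(
            unused,
            key=lambda s: hypot(s[0] - spot[0], s[1] - spot[1]),
            default=None,
        )
        if best is not None and hypot(best[0] - spot[0], best[1] - spot[1]) <= tol:
            matched.append((spot, best))
            unused.remove(best)
        else:
            only_a.append(spot)
    return matched, only_a, unused


# Invented example patterns for two individuals.
pattern_1 = [(1.0, 1.0), (2.0, 3.0), (5.0, 5.0)]
pattern_2 = [(1.1, 0.9), (2.0, 3.05), (7.0, 2.0)]
matched, only_1, only_2 = match_spots(pattern_1, pattern_2, tol=0.3)
print(len(matched), only_1, only_2)
```

The non-matching spots returned for each pattern are the polymorphisms on which an identity or relatedness call would rest; a greedy match is order-dependent, so a real system would probably need a more careful assignment step and gel-to-gel alignment first.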
Detection of genomic alterations
Being able to analyse numerous loci simultaneously by 2-D DNA typing means that the efficiency of detecting genomic changes can be increased. Such changes would, for example, include those occurring in the germline during early development, those which are the result of treatment with mutagens or those which arise during tumour development, ageing, etc. Some such changes have been detected by using one-dimensional DNA fingerprinting 27,28, indicating that the minisatellite sequences are involved in the somatic plasticity of the genome. However, using 2-D DNA typing, a much larger part of the genome can be scanned for slippage replication events in both minisatellite sequences 26 and simple sequences (mistakes during replication, due to the repetitive nature of the substrate, can cause slippage leading to the deletion/insertion of repeat units). This will make genomic screening for rearrangements relevant to plasticity much more efficient. The same applies to ageing, a process that has also been associated with genome changes 21,29,30.
Mapping genetic traits
Two-dimensional DNA typing could find its major application in the study of genetic traits such as heritable disorders. Mapping human genes currently involves the screening of pedigrees with hundreds of individual probes which detect polymorphic loci evenly spaced in the genome and with well-known chromosomal positions 31. Even with many such probes available for all the human chromosomes, this still represents a major task for any given monogenetic trait. Moreover, knowledge of the genetics of several
economically important domesticated animals and plants is less well developed than that of humans. As the accumulation of genetic information gathers pace, techniques which allow characteristics of total genomes to be rapidly assessed will be of great advantage. Using minisatellite core probes or simple sequence motifs 24 in 2-D DNA typing, a very large number of polymorphisms covering a substantial part of the genome could be screened in a single experiment. In principle, this could be done without previous information on the identity of the spots. For example, linkage analysis can be performed on pedigrees with genetic defects, and linked spots could be identified by punching them out of the gel followed by cloning and sequencing. The latter procedure has been described both for one-dimensional DNA fingerprinting 23 and for fragments separated by DGGE 32. In the longer term, another possibility would be to use 2-D DNA typing for scanning large numbers of defined DNA polymorphisms, thereby, ultimately, providing a way of rapidly evaluating (rather than simply identifying) an individual's DNA.

A number of technical developments will contribute to this goal. Each spot in a 2-D pattern can be identified by comparing core probes with locus-specific ones. Our finding that two alleles of a locus migrate to similar positions in the second dimension also facilitates such a venture. The number of human polymorphisms, including a large number of VNTRs and simple sequences, for which the chromosomal location has been established, is rapidly increasing 33. The resolution of the 2-D method will also be increased and the method will be automated. Given time, the genome could be saturated with polymorphic spots from 2-D separations. Two-dimensional separation patterns can be captured by image analysis (Fig. 4) and digitized to provide an entry on a database of polymorphisms. An individual's DNA could provide many entries to the database, each based on a different minisatellite or simple sequence core sequence. These could be combined with entries from locus-specific VNTR and simple sequence RFLPs. Each spot in a given 2-D pattern could then be identified as a particular allele from a chromosomal locus. The 2-D patterns would then truly represent comprehensive genomic maps of individuals. This approach will form a natural complement to the databases of human cellular proteins as analysed by two-dimensional protein electrophoresis 34. Ultimately, 2-D DNA typing could provide computerized matching of a single individual's genome to a database of genetic markers.

Fig. 5. Two-dimensional gel analysis of a single VNTR locus (plambdaG3; see Ref. 31) in a human individual. Both alleles (7.5 kb and 0.8 kb, respectively) of the locus have migrated to nearly identical positions in the denaturing gradient, despite a ten-fold difference in size. For clarity, the component one-dimensional separation patterns of DNA from the same individual by PAGE and DGGE are shown. The probe plambdaG3 was kindly provided by Dr A. J. Jeffreys (Department of Genetics, University of Leicester, Leicester, UK). [Axis labels: first dimension, PAGE; second dimension, DGGE.]
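The finding that two alleles of a locus migrate to similar positions in the second dimension suggests a simple first step towards assigning spots to loci in a digitized pattern: group spots that share an isotherm. The sketch below does this by sorting spots on their denaturant position and chaining neighbours that lie within a tolerance. It is only an illustration under stated assumptions: the denaturant positions and the tolerance are invented, the function name `group_by_isotherm` is hypothetical, and only the two fragment sizes (7.5 kb and 0.8 kb) are taken from the single-locus example in Fig. 5.

```python
# A spot is (fragment size in kb, denaturant position in % denaturant).
Spot = tuple[float, float]


def group_by_isotherm(spots: list[Spot], tol: float) -> list[list[Spot]]:
    """Group spots whose denaturant positions form a chain with gaps <= tol.

    Spots are sorted along the second (denaturant) dimension; a spot joins
    the current group if it lies within `tol` of the previous spot,
    otherwise it starts a new group.
    """
    groups: list[list[Spot]] = []
    for spot in sorted(spots, key=lambda s: s[1]):
        if groups and spot[1] - groups[-1][-1][1] <= tol:
            groups[-1].append(spot)
        else:
            groups.append([spot])
    return groups


# Sizes 7.5 and 0.8 kb follow the Fig. 5 example; all denaturant
# positions here are made up for illustration.
spots = [(7.5, 41.0), (0.8, 41.5), (3.2, 55.0), (1.1, 28.0)]
groups = group_by_isotherm(spots, tol=1.0)
for group in groups:
    print(group)
```

With a core probe detecting hundreds of loci, many unrelated spots will share an isotherm by chance, so grouping of this kind can only propose candidate allele pairs; confirmation would still rely on locus-specific probes or segregation in pedigrees, as the text describes.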
Acknowledgements
We thank Drs P. Eline Slagboom and E. Mullaart for helpful discussions during the course of this work, part of which was supported by Senetek plc. Locus-specific VNTR probes were kindly provided by Prof. A. J. Jeffreys (University of Leicester, Leicester, UK). We thank Prof. L. S. Lerman (Massachusetts Institute of Technology, Cambridge, USA) for the use of his melting program. The probes 33.6 and 33.15 are the subject of patent property and commercial enquiries regarding these probes should be directed to ICI Diagnostics, Gadbrook Park, Rudheath, Northwich, Cheshire CW9 7RA, UK. The two-dimensional DNA typing method is subject to a patent application.
References
1 Southern, E. M. (1975) J. Mol. Biol. 98, 503-517
2 Wyman, A. R. and White, R. (1980) Proc. Natl Acad. Sci. USA 77, 6754-6758
3 Higgs, D. R., Goodbourn, S. E. Y., Wainscoat, J. S., Clegg, J. B. and Weatherall, D. J. (1981) Nucleic Acids Res. 9, 4213-4214
4 Bell, G. I., Selby, M. J. and Rutter, W. J. (1982) Nature 295, 31-35
5 Wong, Z., Wilson, V., Patel, I., Povey, S. and Jeffreys, A. J. (1987) Ann. Hum. Genet. 51, 269-288
6 Jeffreys, A. J., Wilson, V. and Thein, S. L. (1985) Nature 314, 67-73
7 Nakamura, Y., Leppert, M., O'Connell, P. et al. (1987) Science 235, 1616-1622
8 Jeffreys, A. J. (1986) Biochem. Soc. Transact. 15, 309-317
9 Jeffreys, A. J., Wilson, V., Thein, S. L., Weatherall, D. J. and Ponder, B. A. J. (1986) Am. J. Hum. Genet. 39, 11-24
10 Lewin, R. (1989) Science 243, 1549-1551
11 Fischer, S. G. and Lerman, L. S. (1979) Cell 16, 191-200
12 Lerman, L. S., Fischer, S. G., Hurley, I., Silverstein, K. and Lumelsky, N. (1984) Annu. Rev. Biophys. Bioeng. 13, 399-423
13 Fischer, S. G. and Lerman, L. S. (1983) Proc. Natl Acad. Sci. USA 80, 1579-1583
14 Lerman, L. S. and Silverstein, K. (1987) Meth. Enzymol. 155, 482-501
15 Groenen, M. A. M. and van de Putte, P. (1986) J. Mol. Biol. 189, 597-602
16 Myers, R. M., Fischer, S. G., Maniatis, T. and Lerman, L. S. (1985) Nucleic Acids Res. 13, 3111-3129
17 Smith, F. I., Parvin, J. D. and Palese, P. (1986) Virology 150, 55-64
18 Noll, W. W. and Collins, M. (1987) Proc. Natl Acad. Sci. USA 84, 3339-3343
19 Cariello, N. F., Scott, J. K., Kat, A. G., Thilly, W. G. and Keohavong, P. (1988) Am. J. Hum. Genet. 42, 726-734
20 Borresen, A. L., Hovig, E. and Brogger, A. (1988) Mutat. Res. 202, 77-83
21 Vijg, J. and Uitterlinden, A. G. (1987) Mech. Ageing Dev. 41, 47-63
22 Uitterlinden, A. G., Slagboom, P. E., Knook, D. L. and Vijg, J. (1989) Proc. Natl Acad. Sci. USA 86, 2742-2746
23 Wong, Z., Wilson, V., Jeffreys, A. J. and Thein, S. L. (1986) Nucleic Acids Res. 14, 4605-4616
24 Tautz, D. and Renz, M. (1984) Nucleic Acids Res. 12, 4127-4138
25 Saluz, H. P. and Jost, J. P. (1989) Anal. Biochem. 176, 201-208
26 Jeffreys, A. J., Royle, N. J., Wilson, V. and Wong, Z. (1988) Nature 332, 278-281
27 Thein, S. L., Jeffreys, A. J., Gooi, H. C. et al. (1987) Brit. J. Cancer 55, 353-356
28 de Jong, D., Voetdijk, B. M. H., Klein-Nelemans, J. C., van Ommen, G. J. B. and Kluyn, Ph. M. (1988) Brit. J. Cancer 58, 773-775
29 Vijg, J., Uitterlinden, A. G., Mullaart, E., Lohman, P. H. M. and Knook, D. L. (1985) in Molecular Biology of Aging: Gene Stability and Gene Expression (Sohal, R. S. et al., eds), pp. 155-171, Raven Press
30 Schimke, R. T., Sherwood, S. W., Hill, A. B. and Johnston, R. N. (1986) Proc. Natl Acad. Sci. USA 83, 2157-2161
31 Botstein, D., White, R. L., Skolnick, M. and Davis, R. W. (1980) Am. J. Hum. Genet. 32, 314-331
32 Myers, R. M., Lerman, L. S. and Maniatis, T. (1985) Science 229, 242-247
33 Donis-Keller, H., Green, P., Helms, C. et al. (1987) Cell 51, 319-337
34 Celis, J. E. et al. (1989) FEBS Lett. 244, 247-254