trends in plant science Update
Combining genome sequences and new technologies for dissecting the genetics of complex phenotypes
One of the challenges in the analysis of genome sequences is determining the contribution of sequence variants (both within and between species) to hereditary phenotypes and diseases. Most of our knowledge about how sequence variants affect phenotype is based on the study of traits and diseases that are conditioned by single genes. However, most of the diversity in a population results from multigene variants. One of the major challenges for years to come will be identifying the individual genetic factors that condition multifactorial phenotypes. Given the number of sequence variants and individuals that need to be genotyped, high-throughput methods are essential to study these traits.
It has proved difficult to dissect the genetics of multifactorial traits because the phenotypes are often continuous and their inheritance follows non-mendelian or complex patterns. Each of the multiple underlying alleles is thought to contribute a different amount to the phenotype, and a similar phenotype could result from a different combination of any number of underlying alleles. In humans, the dissection of complex traits is difficult but in model organisms, in which genetic crosses between selected strains can be performed at will, complex traits can be dissected more easily (Ref. 1).
In plants, an estimated 2.5 million natural accessions (exotics) have been collected in seed banks (Ref. 2). These repositories harbor tremendous genetic potential that can be exploited to find genes important for quantitative characteristics such as plant height, seed size and crop yield (Ref. 3). Plant geneticists can map valuable alleles for quantitative traits, also known as quantitative trait loci (QTLs), by mating a wild, exotic accession with a laboratory strain (Refs 3,4). The cross places a few wild alleles into the genetic background of the laboratory strain.
The chromosomal position of interesting alleles can then be revealed with the help of phenotypic (Ref. 5) or, more recently, molecular (Ref. 6) markers that serve as landmarks in the genome. Only a few hundred linked markers are needed to identify the approximate position of QTLs with major effects in genomes the size of Arabidopsis, rice and tomato. However, to map minor-effect QTLs, higher density marker maps are needed; for example, for the human genome, up to 500 000 markers have been proposed (Refs 7,8). The present lack of large numbers of markers and the technical challenge of processing them in parallel have hindered the fine-structure mapping and cloning of QTLs.
Historically, visible changes in phenotype were the first genetic markers used for linkage analysis and the construction of the first generation of genetic maps (Ref. 5). However, the availability of phenotypic markers is limited in most organisms and it is difficult to analyze many such markers in a single cross. The realization that simple DNA sequence variants can be used as molecular markers led to a major advance in mapping genetic traits (Ref. 6): their abundance made linkage analysis feasible in practically any organism. The first molecular marker maps consisted of sets of single-copy DNA probes designed to recognize DNA polymorphisms by hybridization to restriction-enzyme-digested DNA [restriction-fragment-length polymorphism (RFLP) analysis] (Ref. 6). The advent of the polymerase chain reaction (PCR) (Ref. 9) led to the development of new marker systems including amplified fragment-length polymorphisms (AFLP) (Ref. 10), cleaved amplified polymorphic sequences (CAPS) (Ref. 11) and microsatellites (Ref. 12). Microsatellites in particular, with an average interval spacing of 1.6 cM in humans (Ref. 13), enabled the accelerated mapping and successful cloning of several mendelian trait loci (Ref. 14). Now, with the increasing availability of genome sequences and the advent of genome-scale technologies, the construction of even denser genome-wide marker maps based on single nucleotide polymorphisms (SNPs) is feasible (Refs 15–17).
In this article, we discuss two new strategies for using genome sequences to identify and map markers. The discussion will be divided according to biological utility. First, we review the use of denaturing high-performance liquid chromatography (DHPLC) and arrays to scan genomes for DNA sequence variation. Second, we cover the use of DHPLC and arrays to genotype individuals at sites of DNA variation. Third, we speculate about future directions and applications.
High-throughput marker identification
DNA sequence information can be used to amplify or to probe specific regions of the genome for high-throughput sequence comparisons between strains. The increasing availability of full genome sequences for many model organisms provides a framework to identify sequence variants in non-reference strains (e.g. wild accessions) and then to place them directly on a physical map. For organisms for which only partial sequence is available, such as from ESTs or bacterial artificial chromosome (BAC) ends, the sequence information can still be used to find markers, but other methods might be needed to order and position markers on a map (Ref. 18).
Among the tools that use reference sequence information to discover markers, resequencing of select genomic regions by means of dideoxy dye-terminator chemistry is the best established. However, as currently used, it is labor intensive and expensive. Single-strand sequencing reduces the cost but is inaccurate. In addition, reference sequences contain errors and therefore any apparent differences detected in non-reference strains need to be confirmed with additional sequencing reactions, including resequencing of the reference strain. As a result, it is unlikely that dideoxy sequencing will be widely used to sequence more than one strain of most species in an effort to find nucleotide variants on a genome-wide level. Methods that incorporate a comparative analysis of reference and non-reference strain sequences provide an alternative and, perhaps, more efficient tool. Such methods include DHPLC and high-density oligonucleotide arrays.
1360-1385/00/$ – see front matter © 2000 Elsevier Science Ltd. All rights reserved.
Denaturing high-performance liquid chromatography
DHPLC enables the detection of single-base substitutions and other simple sequence variants in DNA fragments as large as 1.0 kb in a highly automated fashion (Ref. 19). This method is often used to compare two or more chromosomes over a short distance as a mixture of denatured and annealed PCR products. The presence of one or more sequence differences results in the formation of heteroduplices. These are revealed under partial thermal denaturation by the appearance of one or more early eluting peaks in a chromatographic profile (Fig. 1). The number of peaks observed depends on the number of mismatches and their location and chemical nature. Column temperature and acetonitrile gradient conditions also modulate the degree of denaturation and peak separation. Using the available sequence information, accurate temperature and acetonitrile gradient recommendations can be made that optimize the sensitivity of detecting a sequence variant (http://insertion.stanford.edu/melt.html) (Ref. 20). Variation analysis of the Arabidopsis ecotypes Columbia (Col) and Landsberg erecta (Ler) showed that DHPLC detected 98.4% of single base variants (n = 61) and 100% of multiple base variants, insertions or deletions (n = 105) (J. Spiegelman et al., unpublished). Finally, all steps can be automated in multiwell microtiter plates, and column loading, separation and detection require only a few minutes.
September 2000, Vol. 5, No. 9
[Figure 1 graphic not reproduced. Labels: Columbia and Landsberg; non-polymorphic (no heteroduplices, one peak); polymorphic (heteroduplices at the SNP, two peaks including the heteroduplex peak).]
Fig. 1. Denaturing high-performance liquid chromatography (DHPLC) for the identification of markers between Arabidopsis ecotypes Columbia (Col) and Landsberg erecta (Ler). A single locus of up to 1.0 kb is amplified from each ecotype, mixed in equal proportion, denatured and reannealed. If there is a sequence difference between the two ecotypes, both homo- and heteroduplex DNAs are formed. Separation of the resulting duplices under partial heat denaturation in a column that binds double-stranded DNA results in the homo- and heteroduplex DNA species being retained differently. The heteroduplex is generally retained less and is evidence for the presence of a polymorphism. A similar analysis can be applied to identify regions of heterozygosity in non-inbred lines or to genotype heterozygous markers, in which case PCR amplification is performed on only a single sample, without mixing. Column loading and sample separation typically take less than 6 min.
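The mixing logic described in Fig. 1 can be sketched in a few lines of Python. This is only an illustration, not the software used by the authors: the function names and the toy amplicon sequences are hypothetical, and the expected duplex fractions rest on the simplifying assumption that strands reanneal at random.

```python
def mismatch_positions(allele_a: str, allele_b: str) -> list[int]:
    """Return 0-based positions at which two equal-length amplicons differ."""
    if len(allele_a) != len(allele_b):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    pairs = zip(allele_a.upper(), allele_b.upper())
    return [i for i, (a, b) in enumerate(pairs) if a != b]


def expect_heteroduplex(allele_a: str, allele_b: str) -> bool:
    """Heteroduplices can form only if the two amplicons differ somewhere."""
    return len(mismatch_positions(allele_a, allele_b)) > 0


def duplex_fractions(fraction_a: float) -> dict[str, float]:
    """Expected duplex fractions after denaturing and reannealing a mixture.

    fraction_a is the share of strands that carry allele A.
    Assumption (not from the article): strands pair at random.
    """
    if not 0.0 <= fraction_a <= 1.0:
        raise ValueError("fraction_a must lie between 0 and 1")
    fraction_b = 1.0 - fraction_a
    return {
        "homoduplex_a": fraction_a ** 2,
        "homoduplex_b": fraction_b ** 2,
        "heteroduplex": 2 * fraction_a * fraction_b,
    }


if __name__ == "__main__":
    col = "ACCTTGAGCTATG"  # hypothetical Col amplicon
    ler = "ACCTTGAACTATG"  # hypothetical Ler amplicon with one substitution
    print(mismatch_positions(col, ler))
    print(expect_heteroduplex(col, ler))
    print(duplex_fractions(0.5))  # equal mixing, as in Fig. 1
```

Under the random-pairing assumption, equal mixing maximizes the heteroduplex share, which is consistent with the caption's instruction to mix the two amplicons in equal proportion.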
DHPLC reveals the presence of a polymorphism in the form of a heteroduplex but it does not provide information about its chemical nature and location. Therefore, sequencing is required when DHPLC-discovered markers need to be processed with other genotyping tools or where individual variants within a single amplicon need to be reliably distinguished. Nevertheless, in some applications, the identification of the actual sequence difference is unnecessary (e.g. fine-structure mapping in model organisms, in which one does not need to know the sequence of a marker as long as it can be followed in genetic crosses). This situation allows the combined use of DHPLC for marker discovery and genotyping. In addition, when the frequency of polymorphism is low (e.g. in coding regions of DNA, or when detecting DNA variation in Arabidopsis), DHPLC-based marker discovery can save time and money because fragments in which DHPLC does not find a polymorphism do not need to be sequenced. EST sequences, BAC end sequences and sequences obtained from a shotgun library of Arabidopsis Ler were used in designing PCR primers to amplify more than 1800 fragments from both ecotypes, representing nearly 0.5% of the Arabidopsis genome. Polymorphic DHPLC peak profiles were obtained in a third of the fragments and confirmed by sequencing (Ref. 17).
High-density oligonucleotide arrays
High-density DNA arrays, better known for their application to genome-wide gene expression analysis, are also powerful new tools for DNA variation detection and subsequent parallel genotyping of large numbers of markers (Ref. 21). High-density arrays for variation detection typically consist of short oligonucleotide probes 20–25 bases long that are covalently bound to a solid surface such as glass. Currently, the highest density arrays are made by synthesizing oligonucleotides directly on the surface of glass plates using photolithography and photosensitive oligonucleotide synthesis chemistry (Ref. 22). Current standards allow for 300 000 different probes in a 1.28 × 1.28 cm² area (Ref. 23). The ability to analyze large numbers of probes in parallel lies in the specificity of DNA hybridization, which allows the detection of individual target sequences within complex mixtures. The ability to use hybridization to distinguish simple nucleotide differences is a further result of this specificity. Taken together, this means that one can search large stretches of sequence for DNA variation on a single high-density tiling array. Probes on the array are designed to complement a reference sequence with a set of four probes, each differing at only the central base position, tiling across the reference sequence one base at a time (Ref. 24; Fig. 2). In theory, an array of this design can detect all possible single base substitutions in 75 kb of sequence (at current array size and density). Because the target sample is interrogated at one base pair resolution, this strategy of finding polymorphisms resembles resequencing. Efficient resequencing on tiling arrays involves a comparison between the hybridization
patterns of a known reference and a polymorphic sample (the reference sample usually complements the probe sequences on the array). Focusing on the key differences in hybridization pattern limits the analysis to areas in which there is probably sequence variation. This approach overcomes difficulties experienced when sequence is directly inferred from hybridization rates. In practice, up to 30 kb have been analyzed by comparative hybridization on a single array (Ref. 15). One of the first practical demonstrations of this approach was the resequencing of human mitochondrial DNA in ten individuals (Ref. 24). More recent and larger human genome surveys have scanned 2 Mb (Ref. 15), 196 kb (Ref. 25) and 190 kb (Ref. 26) distributed over multiple arrays. Several hundred candidate polymorphisms were found. At present, the major limitations are cost and the difficulty of achieving marker identification that is both highly sensitive (sensitivity being the proportion of polymorphisms detected) and highly specific (specificity being the proportion of calls that are true positives). That current efforts can satisfy one demand but not the other stems from the difficulty of achieving the same hybridization behavior for probes with different sequences. Under a single hybridization condition, some probes have nonoptimal hybridization kinetics and thus cannot be used for sequence analysis. As a consequence, markers located near such sequences cannot be detected. In addition, it is difficult to identify markers that are present as heterozygotes or markers located close to other polymorphisms. As a result, most studies achieve sensitivities of 85–95%, with specificity in some cases as low as 55%. Although the computational analysis of hybridization signals allows specificities approaching 100% (Refs 25,26), this compromises sensitivity. Hybridization-based marker discovery is thus an extremely powerful tool if a complete identification of all markers is not necessary or where modest false-positive rates are acceptable.
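The two measures used above can be made concrete with a short Python sketch. It follows the article's definitions (sensitivity as the proportion of real polymorphisms detected, specificity as the proportion of calls that are true positives, a quantity often called positive predictive value elsewhere). The function name and the site coordinates are hypothetical and serve only as a worked example.

```python
def detection_metrics(true_sites: set[int], called_sites: set[int]) -> dict[str, float]:
    """Compare called polymorphic sites with a confirmed set of true sites."""
    if not true_sites or not called_sites:
        raise ValueError("both sets must be non-empty")
    true_positives = len(true_sites & called_sites)
    return {
        # Proportion of real polymorphisms that were detected.
        "sensitivity": true_positives / len(true_sites),
        # Proportion of calls that are real ("specificity" as used in the text).
        "true_positive_proportion": true_positives / len(called_sites),
    }


if __name__ == "__main__":
    # Hypothetical data: 20 confirmed sites, 18 of them called, plus 12 false calls.
    confirmed = set(range(0, 200, 10))
    called = (confirmed - {50, 120}) | set(range(3, 123, 10))
    print(detection_metrics(confirmed, called))
```

Tightening the calling threshold in such a scheme removes false calls but also drops real sites, which is the trade-off between the two measures that the text describes.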
For example, identifying all sequence variants is unnecessary for the construction of a genetic map in an organism with abundant polymorphisms. Arrays might therefore prove useful for the construction of linkage maps.
Automated and parallel marker genotyping
Often, the same technology that was used for marker detection can be adapted to marker genotyping. However, the challenge of genotyping DNA variants is rather different from that of identifying DNA variation. Hetero- and homozygotes need to be distinguished and it is important that the genotypes assigned to individuals are accurate. As a result, the success of marker genotyping is measured by the accuracy of homozygous and heterozygous genotype assignments.
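As a rough illustration of this success measure, the following Python sketch scores genotype calls against known genotypes and reports accuracy separately for homozygous and heterozygous markers. The function name, marker names and genotypes are hypothetical examples, not data from any study cited here.

```python
def genotype_accuracy(known: dict[str, str], called: dict[str, str]) -> dict[str, float]:
    """Fraction of correct calls, split by the zygosity of the known genotype.

    Genotypes are two-letter strings such as "GG" or "AG"; allele order is ignored.
    """
    correct = {"homozygous": 0, "heterozygous": 0}
    total = {"homozygous": 0, "heterozygous": 0}
    for marker, truth in known.items():
        zygosity = "homozygous" if truth[0] == truth[1] else "heterozygous"
        total[zygosity] += 1
        # A missing call counts as wrong; "AG" and "GA" count as the same genotype.
        if sorted(called.get(marker, "")) == sorted(truth):
            correct[zygosity] += 1
    return {z: correct[z] / total[z] for z in total if total[z] > 0}


if __name__ == "__main__":
    known = {"m1": "GG", "m2": "AG", "m3": "AA", "m4": "AG", "m5": "GG"}
    called = {"m1": "GG", "m2": "GA", "m3": "AA", "m4": "AA", "m5": "GG"}
    print(genotype_accuracy(known, called))
```

Reporting the two classes separately matters because a method can score homozygotes well while still miscalling heterozygotes, as marker m4 does in this toy data.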
[Figure 2 graphic not reproduced. (a) Probe sets of four probes per position (A, C, G or T at the central base) tiled across the region containing the polymorphism. Columbia, allele G: ...AGGACTAGTCTATACCTTGAGCTATGTGAACCAAATTAAAG... Landsberg, allele A: ...AGGACTAGTCTATACCTTGAACTATGTGAACCAAATTAAAG... (b) Hybridization patterns on the allele G and allele A tiling blocks for G–G homozygotes, A–G heterozygotes and A–A homozygotes.]
Fig. 2. Resequencing and genotyping SNP markers on a tiling array. (a) By knowing the reference sequence for a polymorphic DNA sample, one can design an array that contains four oligonucleotide probes for each base position of the reference sequence. Resequencing on the array is carried out by interrogating each base position of a target with four probes that have an A, C, G or T at their center position. A DNA sample (target) that is exposed to the collection of probes, spatially resolved on a solid surface, will hybridize most strongly to the probe that complements its sequence most closely. Therefore on the tiling array, the probe with the correct base at each center position will produce the strongest hybridization signal. The next set of four probes interrogates the next base and hence is offset by a single base position. With probes tiling through an entire reference sequence, single base substitutions can be identified at any position. This is illustrated here in the design and hybridization pattern of a tiling array designed to resequence a short region of the genomes of Arabidopsis ecotypes Columbia (Col) and Landsberg erecta (Ler) that is marked by a single G to A substitution polymorphism. (b) For genotyping this polymorphism, tiling array blocks that complement both alleles are synthesized on a single array and are hybridized to the same sample. G–G homozygotes, A–G heterozygotes and A–A homozygotes can be distinguished clearly by their hybridization patterns.
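The tiling design in Fig. 2a can be sketched in Python as follows. This is a simplified model under stated assumptions, not the array design software itself: probes are written in the same orientation as the reference rather than as its complement, the probe length of 25 is one choice within the 20–25 base range quoted in the text, and the hybridization signal is replaced by a hypothetical toy score that simply counts matching bases. The two sequences are the Col and Ler fragments shown in Fig. 2; all function names are illustrative.

```python
BASES = "ACGT"


def tiling_probe_sets(reference: str, probe_len: int = 25):
    """Yield (center_position, {base: probe}) for each interrogated position."""
    if probe_len % 2 == 0:
        raise ValueError("use an odd probe length so there is one central base")
    half = probe_len // 2
    reference = reference.upper()
    for center in range(half, len(reference) - half):
        window = reference[center - half : center + half + 1]
        # Four probes that differ only at the central base.
        yield center, {b: window[:half] + b + window[half + 1 :] for b in BASES}


def toy_signal(probe: str, target_window: str) -> int:
    """Stand-in for hybridization strength: the number of matching bases."""
    return sum(p == t for p, t in zip(probe, target_window))


def resequence(reference: str, target: str, probe_len: int = 25) -> dict[int, str]:
    """Call the target base at each position from the strongest of four probes."""
    if len(reference) != len(target):
        raise ValueError("this sketch handles substitutions only (equal lengths)")
    half = probe_len // 2
    target = target.upper()
    calls = {}
    for center, probes in tiling_probe_sets(reference, probe_len):
        window = target[center - half : center + half + 1]
        signals = {base: toy_signal(probe, window) for base, probe in probes.items()}
        calls[center] = max(signals, key=signals.get)
    return calls


def find_variants(reference: str, calls: dict[int, str]) -> list[tuple[int, str, str]]:
    """List (position, reference_base, called_base) where the call differs."""
    reference = reference.upper()
    return [(pos, reference[pos], base)
            for pos, base in sorted(calls.items()) if base != reference[pos]]


# Fragments from Fig. 2: Col carries G and Ler carries A at 0-based position 20.
COL = "AGGACTAGTCTATACCTTGAGCTATGTGAACCAAATTAAAG"
LER = COL[:20] + "A" + COL[21:]

if __name__ == "__main__":
    print(find_variants(COL, resequence(COL, LER)))
```

In this toy model every probe that overlaps the substitution loses one match, yet the probe carrying the target's base at its center still scores highest, so only the polymorphic position is reported. Real hybridization signals are far less uniform, which is why the article stresses comparison against a reference hybridization.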
Denaturing high-performance liquid chromatography
In the case of partially denaturing HPLC, the major drawback of using heteroduplex detection for marker genotyping is that it is not co-dominant. DHPLC analysis can distinguish between homo- and heterozygotes, but it does not distinguish the homozygous alleles from one another unless the sample is mixed with a known reference. Under completely denaturing conditions, however, DHPLC enables the direct genotyping of short amplicons without the addition of a reference chromosome (Ref. 27) and the analysis of primer extension products (Ref. 28) using ion-pair reversed-phase liquid chromatography on commonly used alkylated nonporous poly(styrene-divinylbenzene) chromatographic supports. Single-stranded molecules of identical size that differ in sequence by only a single nucleotide have been resolved to type all possible transitions and transversions other than C to G in amplicons less than 100 nucleotides long (Ref. 27). Knowing the exact genotype is important, especially for mapping schemes in diploid
organisms in which homozygotes of either allele need to be identified. However, for the fine mapping of recessive trait loci, partial denaturing HPLC can play a major role owing to its high automation and low cost, because only discrimination between hetero- and homozygous states is usually required for narrowing an interval. Once the appropriate temperature and acetonitrile conditions have been determined for a particular amplicon, hundreds of individuals can be genotyped at that locus with complete automation and high accuracy (100%) (Ref. 29). If the individual genotype does not need to be determined, the throughput of DHPLC can be increased by pooling up to five diploid DNA samples. This still allows the detection of a single base change in one out of ten chromosomes (Ref. 30).
Oligonucleotide arrays
[Figure 3 graphic not reproduced. Genotype plot with F2 strains (n = 26) on the y axis and chromosome 1 position (cM), marked from 0 to 120, on the x axis; a boxed region is labeled 7 cM.]
Fig. 3. Plot of marker genotypes along chromosome 1 for 26 Arabidopsis second generation progeny (F2) selected to be susceptible to the fungal pathogen Erysiphe orontii. Segregation data suggested that the susceptibility phenotype is the result of a single recessive mendelian locus. For mapping purposes, mutant Columbia (Col) plants were crossed with Landsberg erecta (Ler). First generation progeny were intercrossed and 26 F2 progeny that displayed the susceptibility phenotype were collected. For each susceptible F2 progeny, 235 markers were genotyped by hybridization to a tiling array. One of three possible genotypes was recorded for each marker: homozygous Col (green +), homozygous Ler (red ×), or heterozygous (blue *). Taking the position of each marker into account, an inheritance map of each chromosome could be constructed and is illustrated here for chromosome 1. The x axis represents the position of each marker and the y axis the genotype of each of the 26 F2 strains. The boxed region is highly biased towards inheritance from the mutant Col strain (green +) and represents the 7 cM interval to which the susceptibility factor was localized.
Tiling arrays do allow co-dominant marker scoring and, unlike DHPLC, allow massive parallel genotyping. Because tiling arrays for diploid plant genomes need to score heterozygotes, arrays meant for parallel genotyping of large numbers of markers contain a second tiling block to represent the alternate marker allele (Ref. 31; Fig. 2). By choosing only the sequence variants with favorable hybridization properties, accurate genotype assignments can be made repeatedly. For example, of 558 candidate human SNPs identified by tiling array hybridization, 378 (68%) were scored correctly in a test on samples with already known genotypes. When the 378 SNPs were then genotyped on new samples, 1611 of 1613 array-determined genotype assignments were accurate (Ref. 15). This high level of accuracy has enabled the use of tiling arrays for the genome-wide mapping of genetic factors in plants. In a recent study of 412 simple sequence polymorphisms identified between Arabidopsis ecotypes Col and Ler, 235 (57%) could be amplified successfully by multiplex PCR and allowed co-dominant marker discrimination. However, 75 sequence variants had to be excluded a priori because they met criteria known to result in poor hybridization. These include sequences with runs of more than four of any base or
more than 40% cytosines or adenines in a stretch of 20 bp. Nevertheless, the 235 robust markers sufficed to map a gene involved in the Arabidopsis defense response to the fungal pathogen Erysiphe orontii to a 7 cM interval on chromosome 1 (Ref. 17; Fig. 3). Although one array is needed for each individual, more markers can be analyzed in parallel than is possible with most other methods.
An alternative strategy for marker genotyping couples an enzymatic minisequencing reaction with hybridization. In a single-base primer extension reaction, an oligonucleotide primer just one base short of the polymorphic target position is annealed to target DNA and extended with a labeled nucleoside triphosphate (Ref. 32). This reaction has been carried out with oligonucleotides that have been immobilized to solid surfaces by their 5′ ends (Ref. 33), and primers arrayed onto glass have been extended with radioactive nucleoside triphosphates (Refs 34,35). In theory, it appears to be possible to use four different fluorescently labeled dideoxynucleoside triphosphates to genotype many markers in a single extension reaction. The advantages of this approach are that marker discrimination is carried out with a highly specific polymerase and one primer per SNP is sufficient. However, large-scale application of this technology is needed before its full utility can be assessed, including the proportion of markers that fail because the flanking sequence environment precludes the design of appropriate extension primers.
Future directions and applications
DHPLC and array hybridization are powerful tools for the automated, parallel analysis of many markers. The application of these technologies to the genome-wide mapping of complex traits has promise. Nevertheless, several technical issues remain to be solved. One of the main problems with sequence-based marker
identification and processing is the current requirement for PCR amplification. For DHPLC, PCR amplification seems to be a necessity, although hybridization-based approaches have been feasible in the yeast genome without PCR. Haploid yeast has been scanned for DNA variation on arrays by direct total genomic DNA hybridization. With an oligonucleotide array designed for gene expression analysis, containing mostly non-overlapping probes complementary to yeast open-reading-frames, 3714 markers were identified that generated a map with an average marker spacing of 3.5 kb (~1 cM) (Ref. 16). Using the same approach, all markers can be scored simultaneously for high-resolution mapping of genetic factors. With empirical improvements in hybridization conditions, it might be possible to extend the same ability to more complex genomes such as Arabidopsis. Alternatively, it might be possible to reduce the complexity of total genomic DNA to a level where direct genomic DNA hybridization becomes feasible. Similar strategies might be needed to overcome difficulties presented by the polyploidy of some plants. To date, there has been no demonstration of the feasibility of DHPLC and array hybridization for genomes that contain more than two alleles at a locus. Marker discovery by hybridization is limited by the fact that the sequence composition of some markers makes them difficult to score on the arrays. However, for marker genotyping, confidence scores can be determined for each marker and markers with high false-positive rates can be eliminated from second-generation tiling array designs. In theory, a genotyping chip can be designed with high confidence scores for every one of the ~3750 markers that fit on a single array of current size and density. The chip-based platform is highly scalable. In theory, a significant increase in throughput can be achieved by decreasing the number of probes per marker and/or by increasing the size or density of the array. 
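The capacity figures quoted in this article follow from simple division, sketched below in Python. The probe count of 300 000 per array, the four probes per resequenced base, and the use of at least 20 probes per allele, per strand are taken from the text. Reading the current per-marker cost as 20 probes × 2 alleles × 2 strands is our interpretation; it reproduces the ~3750 figure, but the article does not spell out that calculation. The function names are illustrative.

```python
ARRAY_PROBES = 300_000  # probes on one array at the density quoted in the text


def bases_resequenced(total_probes: int, probes_per_base: int = 4) -> int:
    """Bases that can be interrogated when each base needs a four-probe set."""
    return total_probes // probes_per_base


def markers_per_array(total_probes: int, probes_per_marker: int) -> int:
    """Markers that fit on one array for a given probe cost per marker."""
    return total_probes // probes_per_marker


# Assumed reading of the current design: 20 probes x 2 alleles x 2 strands.
PROBES_PER_MARKER_CURRENT = 20 * 2 * 2

if __name__ == "__main__":
    print(bases_resequenced(ARRAY_PROBES))                             # resequencing capacity in bases
    print(markers_per_array(ARRAY_PROBES, PROBES_PER_MARKER_CURRENT))  # current genotyping design
    print(markers_per_array(ARRAY_PROBES, 2))                          # theoretical two probes per SNP
```

The same division shows how throughput scales: halving the probes per marker or doubling the array's probe count each doubles the number of markers scored per array.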
Currently, to ensure robust and accurate marker scoring, at least 20 probes are used per allele, per strand. As demonstrated in haploid yeast, a single probe can be sufficient to identify a marker. Therefore, in theory, it should be possible to use as few as two probes per SNP for diploid genomes (Refs 36,37), in which case an array of 300 000 probes in a 1.28 × 1.28 cm² area could be used to process 150 000 markers in parallel.
Although the main advantage of a chip-based approach is that it is easy to run markers in parallel, the main advantage of DHPLC is its automation and high accuracy. The use of DHPLC is an important step towards the complete automation of sequence variation analysis. A practical demonstration of parallel analysis with DHPLC is still needed but, in theory, limited parallel analysis can be accomplished with amplicons of different size and/or labeling with fluorescent dyes (Ref. 19). Analogous to capillary sequencing, the recent introduction of capillary columns for DHPLC might enable the parallel use of multiple separation channels in a single instrument (Ref. 38). Ninety-six capillaries coupled to a four-color-fluorescence detector could enable the analysis of a few thousand samples per hour. Another advantage of capillary DHPLC is the use of low sample volumes (a few hundred nanoliters), which reduces the cost of PCR-based sample preparation.
By using DHPLC and array hybridization to scan large numbers of sequences for DNA variation and to process markers in many individuals, one can begin systematically to map the genetic factors underlying many traits. However, the applications of these marker discovery and genotyping tools might also reach further. In addition to genetic mapping, the comparison of marker distributions within and between species (Refs 39,40) allows the construction of phylogenetic maps. For humans, as for plants, this allows evolutionary origins to be traced, subsequent spread and population growth to be monitored, and the impact of factors such as disease and climate on the observed allele frequency to be assessed.
Acknowledgements
L.M.S. is a Howard Hughes Medical Institute predoctoral fellow. We are grateful to Richard Hyman for critical reading of the manuscript and to Ronald Sapolsky and Dan Richards for their help in preparing the figures.
Lars M. Steinmetz, Dept of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA (tel +1 650 723 6287; fax +1 650 725 6044; e-mail [email protected])
Michael Mindrinos, Stanford Genome Technology Center, Palo Alto, CA 94304, USA (tel +1 650 812 2004; fax +1 650 812 1975; e-mail [email protected])
Peter J. Oefner*, Stanford Genome Technology Center, Palo Alto, CA 94304, USA
*Author for correspondence (tel +1 650 812 1926; fax +1 650 812 1975; e-mail [email protected])
References
1 Lander, E.S. and Schork, N.J. (1994) Genetic dissection of complex traits. Science 265, 2037–2048
2 Plucknett, D.L. (1987) Gene Banks and the World's Food, Princeton University Press, Princeton, NJ, USA
3 Tanksley, S.D. and McCouch, S.R. (1997) Seed banks and molecular maps: unlocking genetic potential from the wild. Science 277, 1063–1066
4 Alonso-Blanco, C. and Koornneef, M. (2000) Naturally occurring variation in Arabidopsis: an underexploited resource for plant genetics. Trends Plant Sci. 5, 22–29
5 Lindsley, D.L. and Zimm, G.G. (1992) The Genome of Drosophila melanogaster, Academic Press
6 Botstein, D. et al. (1980) Construction of a genetic linkage map in man using restriction fragment length polymorphisms. Am. J. Hum. Genet. 32, 314–331
7 Risch, N. and Merikangas, K. (1996) The future of genetic studies of complex human diseases. Science 273, 1516–1517
8 Kruglyak, L. (1999) Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat. Genet. 22, 139–144
9 Mullis, K. et al. (1986) Specific enzymatic amplification of DNA in vitro: the polymerase chain reaction. Cold Spring Harbor Symp. Quant. Biol. 51, 263–273
10 Lin, J.J. et al. (1996) A PCR-based DNA fingerprinting technique: AFLP for molecular typing of bacteria. Nucleic Acids Res. 24, 3649–3650
11 Konieczny, A. and Ausubel, F.M. (1993) A procedure for mapping Arabidopsis mutations using co-dominant ecotype-specific PCR-based markers. Plant J. 4, 403–410
12 Weber, J.L. and May, P.E. (1989) Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44, 388–396
13 Dib, C. et al. (1996) A comprehensive genetic map of the human genome based on 5,264 microsatellites. Nature 380, 152–154
14 Collins, F.S. (1995) Positional cloning moves from perditional to traditional. Nat. Genet. 9, 347–350
15 Wang, D.G. et al. (1998) Large-scale identification, mapping and genotyping of single-nucleotide polymorphisms in the human genome. Science 280, 1077–1082
16 Winzeler, E.A. et al. (1998) Direct allelic variation scanning of the yeast genome. Science 281, 1194–1197
17 Cho, R.J. et al. (1999) Genome-wide mapping with biallelic markers in Arabidopsis thaliana. Nat. Genet. 23, 203–207
18 Lawrence, S. et al. (1991) Radiation hybrid mapping. Proc. Natl. Acad. Sci. U. S. A. 88, 7477–7480
19 Oefner, P.J. and Underhill, P.A. (1998) DNA mutation detection using denaturing high-performance liquid chromatography (DHPLC). In Current Protocols in Human Genetics (Dracopoli, N.C. et al., eds), pp. 7.10.1–7.10.12, John Wiley & Sons
20 Jones, A.C. et al. (1999) Optimal temperature selection for mutation detection by denaturing HPLC and comparison to single-stranded conformation polymorphism and heteroduplex analysis. Clin. Chem. 45, 1133–1140
21 Steinmetz, L.M. and Davis, R.W. High-density arrays and insights into genome function. Biotechnol. Genet. Eng. Rev. (in press)
22 Fodor, S.P. et al. (1991) Light-directed, spatially addressable parallel chemical synthesis. Science 251, 767–773
23 Lipshutz, R.J. et al. (1999) High density synthetic oligonucleotide arrays. Nat. Genet. 21, 20–24
24 Chee, M. et al. (1996) Accessing genetic information with high-density DNA arrays. Science 274, 610–614
25 Cargill, M. et al. (1999) Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22, 231–238
26 Halushka, M.K. et al. (1999) Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet. 22, 239–247
27 Oefner, P.J. (2000) Allelic discrimination by denaturing high-performance liquid chromatography. J. Chromatogr. B 739, 345–355
28 Hoogendoorn, B. et al. (1999) Genotyping single nucleotide polymorphisms by primer extension and high performance liquid chromatography. Hum. Genet. 104, 89–93
29 Schriml, L.M. et al. (2000) Use of denaturing HPLC to map human and murine genes and to validate single-nucleotide polymorphisms. Biotechniques 28, 740–745
30 McCallum, C.M. et al. (2000) Targeted screening for induced mutations. Nat. Biotechnol. 18, 455–457
31 Cronin, M.T. et al. (1996) Cystic fibrosis mutation detection by hybridization to light-generated DNA probe arrays. Hum. Mutat. 7, 244–255
32 Sokolov, B.P. (1990) Primer extension technique for the detection of single nucleotide in genomic DNA. Nucleic Acids Res. 18, 3671
33 Nikiforov, T.T. et al. (1994) Genetic bit analysis: a solid phase method for typing single nucleotide polymorphisms. Nucleic Acids Res. 22, 4167–4175
34 Shumaker, J.M. et al. (1996) Mutation detection by solid phase primer extension. Hum. Mutat. 7, 346–354
35 Pastinen, T. et al. (1997) Minisequencing: a specific tool for DNA analysis and diagnostics on oligonucleotide arrays. Genome Res. 7, 606–614
36 Wallace, R.B. et al. (1979) Hybridization of synthetic oligodeoxyribonucleotides to phi chi 174 DNA: the effect of single base pair mismatch. Nucleic Acids Res. 6, 3543–3557
37 Wallace, R.B. et al. (1981) The use of synthetic oligonucleotides as hybridization probes, II. Hybridization of oligonucleotides of mixed sequence to rabbit beta-globin DNA. Nucleic Acids Res. 9, 879–894
38 Huber, C.G. and Krajete, A. (1999) Analysis of nucleic acids by capillary ion-pair reversed-phase HPLC coupled to negative-ion electrospray ionization mass spectrometry. Anal. Chem. 71, 3730–3739
39 Hacia, J.G. et al. (1999) Determination of ancestral alleles for human single-nucleotide polymorphisms using high-density oligonucleotide arrays. Nat. Genet. 22, 164–167
40 Shen, P. et al. (2000) Population genetic implications from sequence variation in four Y chromosome genes. Proc. Natl. Acad. Sci. U. S. A. 97, 7354–7359