The genetics of complex diseases

Millennium issue

Glenys Thomson and Michael S. Esposito

Genetic factors influence virtually every human disorder, determining disease susceptibility or resistance and interactions with environmental factors. Our recent successes in the genetic mapping and identification of the molecular basis of mendelian traits have been remarkable. Now, attention is rapidly shifting to more-complex, and more-prevalent, genetic disorders and traits that involve multiple genes and environmental effects, such as cardiovascular disease, diabetes, rheumatoid arthritis and schizophrenia. Rather than being due to specific and relatively rare mutations, complex diseases and traits result principally from genetic variation that is relatively common in the general population. Unfortunately, despite extensive efforts by many groups, only a few genetic regions and genes involved in complex diseases have been identified. Completion of the human genome sequence will be a seminal accomplishment, but it will not provide an immediate solution to the genetics of complex traits.

If the Human Genome Project (HGP) is completed by 2003, this achievement will fall on the 50th anniversary of the discovery of the double-helical structure of DNA by Watson and Crick1. The genetic and physical maps, and technologies for gene identification that have already emerged from the HGP, have had a tremendous impact on the research community’s ability to discover the genes that underlie human genetic variation2,3. The goal of the HGP, a complete DNA sequence of the 3000 million base pairs of the human genome, will reveal the underlying architecture of our genetic endowment, including the estimated 80 000–100 000 genes therein. To understand the genetic and environmental factors that contribute to complex diseases, we must address the daunting prospect of documenting the genetic variation of human genomes at the population level within and across ethnic groups.
Traits or diseases caused by defects in a single major gene or biochemical pathway are called mendelian or single-gene traits. Complicating factors, such as incomplete penetrance (not all genetically predisposed individuals have the disease) and variable age of onset, are often present with single-gene traits, but they show basic mendelian segregation patterns3. By contrast, complex or multifactorial diseases result from the interaction of environmental factors and multiple genes, some of which might have a major disease effect but many of which have a relatively minor effect. Differential rates of disease in twins and other family members, compared with the rates in the general population, have shown that genetic and environmental factors both contribute to complex diseases3. The distinction in terminology between mendelian and complex traits is not meant to imply that the underlying genes in complex diseases violate the rules of mendelian inheritance, merely that the pattern of inheritance is not simple. Although the boundary between mendelian and complex traits is not precisely defined, a large number of diseases clearly fall into each category. It is also important to remember that, although we often use the shorthand notation of disease gene, or disease-predisposing gene, we are discussing variation in genes that are involved in normal human health and development, specific forms of which can lead to a disease state.


© 1999 Elsevier Science Ltd. All rights reserved. For article fee, see p. IV.

Mendelian traits There are three main approaches to mapping the genetic variants involved in a disease: functional cloning, the candidate-gene strategy and positional cloning3. In functional cloning, identification of the underlying protein defect leads to localization of the responsible gene (disease–function–gene–map). An example of functional cloning was the finding that individuals with sickle-cell anaemia carried an amino acid substitution in the β chain of haemoglobin. Isolation of the mutant molecule led to the cloning of the gene encoding β globin. However, this approach is only useful in a subset of mendelian traits where the biological basis is known, and in very few complex traits. In the candidate-gene approach, genes with a known or proposed function with the potential to influence the disease phenotype are investigated for a direct role in disease3. In a small number of type 2 diabetes cases, candidate-gene studies have identified mutations in, for example, the genes encoding insulin and the insulin receptor4. Positional cloning is used when the biochemical nature of a disease is unknown. Marker genes not related to disease physiology (Box 1) and genome-wide screens are the starting point for mapping the genetic components of the disease. The aim is first to identify the genetic region within which a disease-predisposing gene lies and, once this is found, to localize the gene and determine its functional and biological role in the disease (disease–map–gene–function). Landmarks in positional cloning have included genetic linkage of Huntington disease in 1983 to a restriction fragment length polymorphism (RFLP) marker (Box 1) on chromosome 4 (the gene was cloned in 1993)6 and the cloning of the cystic fibrosis gene on chromosome 7 in 1989. By 1990, when the HGP began, a handful of additional successes had accrued. In 1994, the early-onset familial breast/ovarian cancer gene (BRCA1) was cloned and, by 1997, close to 100 mendelian disease loci had been identified by positional cloning2,3. Considerable work still lies ahead, even with mendelian traits: documentation of the worldwide variation in mutations, identification of the effects of modifier genes with respect to age of onset and severity, measurement of the effects of different mutations on disease risk, and development of appropriate therapies.
However, it is encouraging to note at this point that, even if a disease has a strong genetic component, its onset can occasionally be prevented through environmental change. A classic example of this is phenylketonuria, where a special diet that is low in the amino acid phenylalanine completely prevents mental retardation3. Similarly, for complex diseases such as heart disease and type 2 diabetes, weight loss and exercise are important preventive measures.

Mapping mendelian and complex traits An array of approaches is available to uncover the different genetic facets of mendelian and complex traits and diseases7–10. Two complementary analytical methods, linkage analyses and association (linkage disequilibrium) mapping, are used to detect the specific genetic regions and genes that are involved in the disease process.

TCB, Vol. 9, No. 12 (0962-8924)

TIBS, Vol. 24, No. 12 (0968-0004)

TIG, Vol. 15, No. 12 (0168-9525)

PII: S0962-8924(99)01689-X

PII: S0968-0004(99)01499-1

PII: S0168-9525(99)01909-5

Glenys Thomson [email protected]. berkeley.edu Michael S. Esposito espogen@hotmail.com Dept of Integrative Biology, 3060 Valley Life Sciences Building, MC #3140, University of California, Berkeley, CA 94720-3140, USA.


BOX 1. DNA polymorphisms Markers such as protein and blood group loci were initially used in the analysis of genetic traits; however, they were of limited use owing to low variation. Early in the 1980s, these markers began to be replaced with DNA polymorphisms, following the discovery of the first restriction fragment length polymorphism (RFLP) detected by the ability of a segment of DNA to be cut, or not, by a specific restriction enzyme that recognizes between 4 and 6 specific DNA base pairs. Since then, two important advances in molecular biology have made possible the rapid development of highly informative markers for genetic mapping: the discovery of the polymerase chain reaction (PCR) methodology and the use of PCR to amplify microsatellite markers across the genome. Microsatellites consist of approximately 10–50 copies of particular DNA-sequence motifs ranging from 1 to 6 nucleotide base pairs that occur in tandem repetition. These repeat sequences occur frequently and randomly across the human genome. Microsatellite loci usually have multiple alleles and high levels of variation. Microsatellites have rapidly replaced RFLP markers in disease studies owing to these features and to their ease of typing, including the small amount of template DNA required. Microsatellites are named by analogy with larger minisatellite arrays (also referred to as variable number of tandem repeats, or VNTRs). VNTRs are also highly polymorphic and powerful markers in forensic and paternity studies (DNA fingerprinting). However, they are less common than microsatellites and are not distributed evenly throughout the genome. Their larger sequence motifs make them less amenable to PCR technology and use in genomic screening analyses. Single-nucleotide polymorphisms (SNPs) are the most common type of human DNA genetic variation, occurring on average 1 per 1000 base pairs2,5.
SNPs are mostly biallelic and less informative than microsatellites; however, they are more frequent and mutationally more stable than microsatellites, and more amenable to automation and DNA-chip technology.
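The contrast in informativeness between biallelic SNPs and multiallelic microsatellites can be made concrete with expected heterozygosity, H = 1 − Σp², a standard measure that Box 1 does not itself name. The following is a minimal sketch; the function name and the allele frequencies are hypothetical illustrations, not data from the article.

```python
def expected_heterozygosity(freqs):
    """Return H = 1 - sum(p_i^2) for a list of allele frequencies p_i."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in freqs)

# Hypothetical markers (assumed frequencies, chosen only for illustration):
snp = [0.5, 0.5]               # biallelic SNP at its most informative
microsatellite = [0.1] * 10    # ten equally frequent alleles

h_snp = expected_heterozygosity(snp)
h_micro = expected_heterozygosity(microsatellite)
print(f"SNP heterozygosity:            {h_snp:.2f}")
print(f"Microsatellite heterozygosity: {h_micro:.2f}")
```

Under these assumed frequencies a biallelic marker cannot exceed H = 0.5, whereas the ten-allele marker approaches 0.9, which is one way to read the statement that SNPs are individually less informative even though they are far more numerous.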

These approaches can be applied, without prior knowledge of the biological basis of the disease, using genome-wide studies, together with the candidate-gene approach and comparative analyses using animal models of disease. Linkage analyses test for cosegregation of a marker and disease phenotype within a pedigree to determine whether a genetic marker and a disease-predisposing gene are physically linked, that is, in close physical proximity to each other7,10. When large, multigeneration pedigrees are available, linkage analysis is a powerful technique for localizing disease genes. The usual practice in an initial screen is the genetic typing of about 300 highly polymorphic markers that are distributed approximately evenly over the genome. This approach, called a genome-wide screen, is particularly powerful in mapping mendelian traits such as Huntington disease6 and also for subsets of complex traits that show simple mendelian inheritance, for example early-onset Alzheimer disease11. For complex diseases, the involvement of many genes and the strong influence of environmental factors mean that large multigeneration pedigrees are rarely, or never, seen. Therefore, linkage analysis of nuclear families with both parents and two children affected with the disease, although less powerful, is more commonly used to map complex traits (Fig. 1). Association studies compare marker frequencies in unrelated cases and controls and test for the co-occurrence of a marker and disease at the population level; a significant association with disease might implicate a candidate gene in the aetiology of a disease7–10,12. Alternatively, an association can be the result of nonrandom association at the population level, termed linkage disequilibrium, of variants at the marker and disease genes. This usually implies close physical linkage of the marker and disease gene, reflecting the historical origin of a mutation on a specific chromosome with a characteristic set of variation. 
In association studies, it is essential that the patient and control groups be ethnically matched to avoid a spurious association of an unlinked marker with a disease12. Case/control, simplex families (with at least one affected child), and affected sib-pair families can all be used in association studies.
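The nonrandom association of alleles at two loci described above is commonly quantified by the linkage disequilibrium coefficient D = p(AB) − p(A)p(B), often reported alongside the squared correlation r². The sketch below assumes these conventional definitions, which the article does not spell out; the haplotype frequencies and function name are hypothetical.

```python
def linkage_disequilibrium(hap_freqs):
    """Compute D and r^2 from two-locus haplotype frequencies.

    hap_freqs maps the haplotypes 'AB', 'Ab', 'aB', 'ab' to frequencies
    that sum to 1 (A/a at the marker locus, B/b at the disease locus).
    """
    p_a = hap_freqs["AB"] + hap_freqs["Ab"]   # frequency of allele A
    p_b = hap_freqs["AB"] + hap_freqs["aB"]   # frequency of allele B
    d = hap_freqs["AB"] - p_a * p_b           # deviation from independence
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, r2

# Hypothetical haplotype frequencies, for illustration only:
d, r2 = linkage_disequilibrium({"AB": 0.4, "Ab": 0.1, "aB": 0.1, "ab": 0.4})
print(f"D = {d:.3f}, r^2 = {r2:.3f}")
```

With D = 0 the marker carries no population-level information about the disease allele; the further D departs from zero, the more a marker association can stand in for the unobserved disease variant.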


With nuclear family data, the parental marker alleles not transmitted to an affected child, or never transmitted to an affected sib pair (Fig. 1), form a control population referred to as affected family-based controls (AFBACs)12. Testing for a 50% transmission ratio from parents who are heterozygous for a marker allele [the transmission disequilibrium test (TDT)] detects significant differences with marker alleles that are both in linkage disequilibrium and linked to the disease predisposing gene14. Animal models of human disease can be informative for mapping and for physiological studies of genetic and environmental factors and can be used to test novel therapeutics7,15. An advantage with many animal and plant models is our ability to analyse large numbers of offspring from a single set of parents. Genome-wide quantitative trait locus (QTL) linkage analyses have identified genetic regions and genes involved in several multifactorial traits and diseases, leading to the identification of human homologues, for example, in hypertension7. Genes that modify disease severity can also be investigated in animal models by studying animal strains that are severely or mildly affected by the same gene defect. Given our increasing understanding of the unifying biological aspects of all life forms, including gene expression, protein synthesis and protein function, animal models can be used to identify key genes that influence human multifactorial traits. In the search for these genes, the mouse is the most commonly used mammalian experimental model. The recent ability to generate mice in which specific genes have been precisely inactivated was a major innovation, allowing disease-predisposing genes identified in humans to be studied in the mouse. For example, the mouse model of type 1 diabetes shows homology with some genes and genetic regions involved in the human disease16,17. 
Recently, a gene involved in the sleep disorder narcolepsy in dogs has been cloned18, and studies on the human homologue and the mouse knockout are eagerly awaited.
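The transmission disequilibrium test mentioned earlier in this section checks whether heterozygous parents transmit a marker allele to affected children more often than the 50% expected by chance. A minimal sketch follows, assuming the usual McNemar-type statistic (b − c)²/(b + c) with one degree of freedom; the article does not give the formula, and the counts and function name are hypothetical.

```python
import math

def tdt(transmitted, not_transmitted):
    """Transmission disequilibrium test for one marker allele.

    transmitted / not_transmitted: counts of heterozygous parents who did
    or did not pass the allele to an affected child.
    Returns (chi-square statistic, p-value) on 1 degree of freedom.
    """
    b, c = transmitted, not_transmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) parents")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt(60, 40)
print(f"TDT chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

Only heterozygous parents enter the counts, which is why the test is robust to the population stratification that can mislead unmatched case/control comparisons.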

Complex diseases Association studies have been applied most successfully in mapping complex diseases to the human leukocyte antigen (HLA) region on chromosome 6 (Ref. 19). The HLA region contains ~200 genes, many of which are involved in the immune response. Association (Table 1) and affected sib-pair linkage studies have implicated genes of the HLA region in the aetiology of more than 100 diseases8,19. The associations are often very strong, for example, over 90% of patients with ankylosing spondylitis carry a specific HLA variant, B27, compared with 9% of control subjects. Other HLA-associated diseases include: complex autoimmune diseases, such as type 1 diabetes16,17,20, rheumatoid arthritis8 and multiple sclerosis; cancers, for example Hodgkin disease; infectious diseases, such as malaria, tuberculosis and AIDS21; and other diseases including narcolepsy and haemochromatosis. In some cases, the HLA immune-response genes have been implicated directly in disease, for example ankylosing spondylitis, type 1 diabetes and narcolepsy, but, with haemochromatosis, the association was the result of linkage disequilibrium. Given the relative ease with which linkage was demonstrated for many HLA-associated diseases8,13, it seemed a logical progression to use genome-wide linkage-analysis screens on affected sib-pair families to investigate all complex diseases. Such studies of many complex disorders are in progress: to map the non-HLA genes in type 1 diabetes16, multiple sclerosis, rheumatoid arthritis and Crohn’s disease, and for many other complex diseases, such as type 2 diabetes, hypertension, coronary artery disease, alcoholism and schizophrenia. These studies have been successful to a limited extent, but progress has been exceedingly slow. Only a few genes and some genetic regions involved in complex diseases have been identified. For type 1 diabetes, six non-HLA genetic regions have been identified


definitely and another ten implicated provisionally16,17,20. It has often been difficult both to detect and replicate linkages, and considerable heterogeneity has been evident among data sets4,17,22. This occurs because of the large number of loci that can be involved in a complex disease, each of which might have a small effect overall caused by alleles in high frequency in the general population23. Study of many hundreds, and usually thousands, of affected sib-pairs is required to establish linkage. Despite these large sample sizes, the results are sometimes not consistent and might never attain the significance levels that we often find with mendelian traits.

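[Figure 1. Four affected sib-pair family configurations, each with father AB and mother CD: affected children AC and AC share 2 parental alleles (25%); AC and AD share 1 (25%); AC and BC share 1 (25%); AC and BD share 0 (25%).]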

The future Both functional and positional cloning of mendelian traits and diseases are now relatively straightforward, and, with the completion of the HGP, they will become practically routine3. An immediate benefit of the completion of the HGP will be an increase in the use of positional candidate-gene analyses. The ability to identify candidate genes in a region that has already been identified by positional cloning can greatly reduce the time required to target the actual gene involved in the disease. An additional outcome of the HGP will be the increased characterization of human gene-expression patterns, which will inform us greatly about the functional properties of their products24. A multistrategy approach to the mapping of complex diseases and traits is still appropriate: no single method is sufficient or optimal. Linkage analyses have proven useful, yet limited. The increasing availability of more markers across the genome, combined with multipoint analyses using closely linked markers22,25, will increase the power of linkage and association studies, including the follow-up of regions potentially involved in disease. The development of more-powerful statistical analysis tools, such as multilocus methods of linkage analysis22 and meta-analysis techniques26,27, will also provide more information. Association mapping for complex diseases might, in many cases, be more efficient than linkage analyses28,29, although the overall power of association studies to detect disease genes depends on several parameters and cannot be determined accurately30–32. Nevertheless, researchers are beginning to use genome-wide association screens in complex diseases. These have been made possible using DNA pooling of patient and control samples33, thus making feasible study of the 12 000, and preferably many more, markers needed. 
The development of single-nucleotide polymorphisms (SNPs; Box 1), combined with DNA-chip technology, will soon allow the rapid screening of vast amounts of genetic data. Technological advances in biocomputing will meet the needs for the storage, manipulation and analysis of these data. The full development of SNPs will eventually permit routine typing for variation in every human gene and its regulatory region – the ultimate association study5,30,34,35. This type of brute-force effort is essential but not sufficient; attention must also be given to genetic epidemiology and population genetics in study design36. The use of different disease phenotype definitions in linkage and association studies is a subject of debate. The issue is the trade-off between minimizing false results versus finding all the genes that contribute to a disease phenotype37. Success stories based on the study of a range of disease definitions, specific subsets of the disease phenotype, sex effects38 and age-of-onset effects11 support these approaches. We will need very large sample sizes of case/control, simplex and affected sib-pair family-based data, as well as multigeneration pedigrees when available, for our continuing studies of complex diseases. The success of the international consortium in mapping the Huntingtin gene is a model for all studies of complex diseases6. The development of permanent cell lines of type-1-diabetes-affected sib-pair families in the USA and the UK has greatly aided the mapping of non-HLA type 1 diabetes genes22.

FIGURE 1. Affected sib-pair families. Nuclear family pedigrees are shown with the father (blue square) and mother (yellow circle) in the first row, and the two affected children of either sex (black diamonds) in the second row. Assume, for simplicity, that we can distinguish all four parental alleles, denoted A, B, C and D, in the genetic region under study, with the parental alleles ordered such that A and C are transmitted from the father and mother, respectively, to the first affected child9. There are four possible configurations among the two offspring with respect to the alleles inherited from the parents: they can share both parental alleles (AC); they can share an allele from the father (A) but differ in the alleles received from the mother (C and D); they can share an allele from the mother (C) but differ in the alleles received from the father (A and B); or they can share no parental alleles in common. These four configurations are equally likely if there is no influence of the genetic region under consideration on the disease. Deviation from these mendelian expectations of 25%, 50% and 25%, that the affected sibs will share on average 2, 1 and 0 parental alleles in common, implicates a disease-predisposing gene in the region8,9. In a study of 711 sibpairs affected with type 1 diabetes13, the observed sharing of 2, 1 and 0 parental alleles was 52%, 40% and 8%: a mean sharing of 72%, compared with the 50% expected if this region were not involved in disease. The parental alleles that are never transmitted to the affected sib pair in each family type (indicated in blue, while the transmitted alleles are given in red) are used as a control population in association studies using nuclear family data, the so-called AFBAC (affected family-based control) sample12.
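The arithmetic in the caption can be checked directly: the mean sharing is the average proportion of parental alleles shared, and the departure from the 25%, 50%, 25% expectation can be summarized with a goodness-of-fit chi-square. This is a minimal sketch; the function name is hypothetical, the counts are reconstructed from the rounded percentages quoted above, and the chi-square summary is an assumed illustration rather than the test used in the cited study.

```python
def sharing_summary(n_pairs, prop_share):
    """Summarize allele sharing in affected sib pairs.

    prop_share maps the number of shared parental alleles (2, 1, 0)
    to the observed proportion of sib pairs.
    Returns (mean proportion of alleles shared, chi-square on 2 df).
    """
    expected_prop = {2: 0.25, 1: 0.50, 0: 0.25}   # no-linkage expectation
    mean_sharing = sum(k * p for k, p in prop_share.items()) / 2.0
    chi2 = 0.0
    for k, p_exp in expected_prop.items():
        observed = n_pairs * prop_share[k]   # approximate count
        expected = n_pairs * p_exp
        chi2 += (observed - expected) ** 2 / expected
    return mean_sharing, chi2

# Figures quoted in the caption: 711 sib pairs sharing 2, 1, 0 alleles.
mean_sharing, chi2 = sharing_summary(711, {2: 0.52, 1: 0.40, 0: 0.08})
print(f"mean sharing = {mean_sharing:.2f}, chi-square = {chi2:.1f}")
```

The mean sharing reproduces the 72% given in the caption, and the chi-square of roughly 300 far exceeds the 5.99 threshold for two degrees of freedom at the 5% level.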

Concomitant with further linkage and association studies, and investigation of animal models of disease, we require a population-level survey of human variation to understand the patterns of polymorphism and linkage disequilibrium across the genome, and across populations and ethnic groups30,36,39. By correlating this extensive genetic variation with disease and environmental factors, we will uncover the complete genetics of complex diseases and traits, determine the role of environmental factors on their expression and understand the mechanisms by which disease-predisposing genes can become relatively common in a population.

TABLE 1. Some examples of HLA-associated diseases(a)

Disease                  HLA          Patients   Controls
Ankylosing spondylitis   B27          90%        9%
Type 1 diabetes          DR3          52%        23%
                         DR4          74%        24%
                         DR3 or DR4   93%        43%
Multiple sclerosis       DR2          86%        33%
Rheumatoid arthritis     DR4          81%        24%
Narcolepsy               DR2          >95%       33%

(a) For each disease, the frequency of the associated human leukocyte antigen (HLA) allele is given in patients and controls. The letter designation denotes the HLA gene, while the number is assigned to a specific allele at the gene. For ease of reading, the data shown are older serological-level HLA typing, rather than more recent molecular typing. The data are modified from Ref. 19.
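The strength of the associations in Table 1 can be summarized with an odds ratio, a conventional case/control measure that the article itself does not report. The sketch below computes it from the patient and control frequencies; the function name is hypothetical, and narcolepsy is omitted because its patient frequency is given only as a lower bound.

```python
def odds_ratio(freq_patients, freq_controls):
    """Odds ratio for carrying an allele, from carrier frequencies."""
    odds_patients = freq_patients / (1.0 - freq_patients)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_patients / odds_controls

# (patient frequency, control frequency) taken from Table 1.
table1 = {
    ("Ankylosing spondylitis", "B27"): (0.90, 0.09),
    ("Type 1 diabetes", "DR3"): (0.52, 0.23),
    ("Type 1 diabetes", "DR4"): (0.74, 0.24),
    ("Type 1 diabetes", "DR3 or DR4"): (0.93, 0.43),
    ("Multiple sclerosis", "DR2"): (0.86, 0.33),
    ("Rheumatoid arthritis", "DR4"): (0.81, 0.24),
}

ratios = {key: odds_ratio(*freqs) for key, freqs in table1.items()}
for (disease, allele), value in ratios.items():
    print(f"{disease:<24}{allele:<12}{value:6.1f}")
```

Because the table gives rounded frequencies rather than counts, these values are only indicative and come without confidence intervals.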


Acknowledgements We thank M. Nelson for preparing Fig. 1 and L. Barcellos, M. Grote, J. Hollenbach, L. Louie, K. Mather, S. McWeeney, D. Meyer, E. Mignot, M. Nelson, H. Payami, D.C. Rao, J-X. She and A.M. Valdes for helpful input on the manuscript. The work was supported by NIH grant GM56688.

References

1 Goodman, L. (1998) The human genome project aims for 2003. Genome Res. 8, 997–999
2 Collins, F.S. et al. (1997) Variations on a theme: cataloguing the human DNA sequence variation. Science 278, 1580–1581
3 Gelehrter, T.D. et al. (1998) Principles of Medical Genetics, Williams and Wilkins
4 Thomson, G. (1997) Strategies involved in mapping diabetes genes: an overview. Diabetes Rev. 5, 106–115
5 Collins, F.S. et al. (1998) A DNA polymorphism discovery resource for research on human genetic variation. Genome Res. 8, 1229–1231
6 Huntington’s Disease Collaborative Research Group (1993) A novel gene containing a trinucleotide repeat that is expanded and unstable on Huntington’s disease chromosomes. Cell 72, 971–983
7 Lander, E.S. and Schork, N.J. (1994) Genetic dissection of complex traits. Science 265, 2037–2048
8 Thomson, G. (1995) HLA disease associations: models for the study of complex human genetic disorders. Crit. Rev. Clin. Lab. Sci. 32, 183–219
9 Thomson, G. (1995) Analysis of complex human genetic traits: an ordered notation method and new tests for mode of inheritance. Am. J. Hum. Genet. 57, 474–486
10 Elston, R.C. (1998) Linkage and association. Genet. Epidemiol. 15, 565–576
11 Levy-Lahard, E. et al. (1998) Recent advances in the genetics of Alzheimer’s disease. J. Geriatr. Neur. 11, 42–54
12 Thomson, G. (1995) Mapping disease genes: family based association studies. Am. J. Hum. Genet. 57, 487–498
13 Payami, H. et al. (1998) The affected sib method. IV. Sib trios. Ann. Hum. Genet. 49, 303–314
14 Martin, E.R. et al. (1997) Tests for linkage and association in nuclear families. Am. J. Hum. Genet. 61, 439–448
15 Wynshaw-Boris, A. (1996) Model mice and human disease. Nat. Genet. 13, 259–260
16 Pugliese, A. (1999) Unraveling the genetics of insulin-dependent type 1A diabetes: the search must go on. Diabetes Rev. 7, 39–54
17 Mein, C.A. et al. (1998) A search for type 1 diabetes susceptibility genes in families from the United Kingdom. Nat. Genet. 19, 297–300
18 Lin, L. et al. (1999) The sleep disorder canine narcolepsy is caused by a mutation in the hypocretin (orexin) receptor 2 gene. Cell 98, 365–376
19 Thorsby, E. (1997) Invited anniversary review: HLA associated diseases. Hum. Immunol. 53, 1–11
20 Concannon, P. et al. (1998) A second-generation screen of the human genome for susceptibility to insulin-dependent diabetes mellitus. Nat. Genet. 19, 292–296
21 Carrington, M. et al. (1999) HLA and HIV-1: heterozygote advantage and B*35-Cw*04 disadvantage. Science 283, 1748–1752
22 Lernmark, A. and Ott, J. (1998) Sometimes it’s hot, sometimes it’s not. Nat. Genet. 19, 213–214
23 Suarez, B.K. et al. (1994) in Genetic Approaches to Mental Disorders (Gershon, E.S. and Cloninger, C.R., eds), pp. 23–46, American Psychiatric Press
24 Brown, P.O. and Hartwell, L. (1998) Genomics and human disease – variations on variation. Nat. Genet. 18, 91–93
25 Barcellos, L.F. et al. (1997) Chromosome 19 single-locus and multilocus haplotype associations with multiple sclerosis: evidence of a new susceptibility locus in Caucasian and Chinese patients. J. Am. Med. Assoc. 278, 1256–1261
26 Gu, C. et al. (1998) Meta-analysis methodology for combining non-parametric sibpair linkage results: genetic homogeneity and identical markers. Genet. Epidemiol. 15, 609–626
27 Wise, L.H. et al. (1999) Meta-analysis of genome searches. Ann. Hum. Genet. 63, 263–272
28 Risch, N. and Merikangas, K. (1996) The future of genetic studies of complex human diseases. Science 273, 1516–1517
29 Morton, N.E. and Collins, A. (1998) Tests and estimates of allelic association in complex inheritance. Proc. Natl. Acad. Sci. U. S. A. 95, 11389–11393
30 Nickerson, D. et al. (1998) DNA sequence diversity in a 9.7 kb region of the human lipoprotein lipase gene. Nat. Genet. 19, 233–239
31 Escamilla, M.A. et al. (1999) Assessing the feasibility of linkage disequilibrium methods for mapping complex traits: an initial screen for bipolar disorder loci on chromosome 18. Am. J. Hum. Genet. 64, 1670–1678
32 Kruglyak, L. (1999) Prospects for whole-genome linkage disequilibrium mapping of common disease genes. Nat. Genet. 22, 139–144
33 Barcellos, L.F. et al. (1997) Association mapping of disease loci, by use of a pooled DNA genomic screen. Am. J. Hum. Genet. 61, 734–747
34 Cargill, M. et al. (1999) Characterization of single-nucleotide polymorphisms in coding regions of human genes. Nat. Genet. 22, 231–238
35 Halushka, M.K. et al. (1999) Patterns of single-nucleotide polymorphisms in candidate genes for blood-pressure homeostasis. Nat. Genet. 22, 239–247
36 Schork, N.A. et al. (1998) The future of genetic epidemiology. Trends Genet. 14, 266–272
37 Todorov, A.A. and Rao, D.C. (1997) Trade-off between false positives and false negatives in the linkage analysis of complex traits. Genet. Epidemiol. 14, 453–464
38 Paterson, A.D. and Petronis, A. (1999) Sex of affected sibpairs and genetic linkage to type 1 diabetes. Am. J. Med. Genet. 84, 15–19
39 Huttley, G.A. et al. (1999) A scan for linkage disequilibrium across the human genome. Genetics 152, 1711–1722

Challenges at the frontiers of structural biology

Andrej Šali [email protected] John Kuriyan* kuriyan@rockvax.rockefeller.edu

Andrej Šali and John Kuriyan

Knowledge of the three-dimensional structures of proteins is the key to unlocking the full potential of genomic information. There are two distinct directions along which cutting-edge research in structural biology is currently moving towards this goal. On the one hand, tightly focused long-term research in individual laboratories is leading to the determination of the structures of macromolecular assemblies of ever-increasing size and complexity. On the other hand, large consortia of structural biologists, inspired by the pace of genome sequencing, are developing strategies to determine new protein structures rapidly, so that it will soon be possible to predict reasonably accurate structures for most protein domains. We anticipate that a small number of complex systems, studied in depth, will provide insights across the field of biology with the aid of genome-based comparative structural analysis.

Laboratories of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, *Howard Hughes Medical Institute, The Rockefeller University, 1230 York Ave, New York, NY 10021, USA.

To understand fully the workings of the cell, we face the challenge of describing the three-dimensional structures of all the cellular components at an atomic level of detail, and relating these structures to molecular mechanisms. As work towards this aim progresses, the frontiers of structural biology will expand – but in two, almost orthogonal, directions. The newer theme, referred to as ‘structural genomics’, is motivated by the growing impact of genome-sequencing efforts and is aimed at accelerating the rate at which protein structures containing new folds are solved. The other thrust of research builds on


PII: S0962-8924(99)01685-2

PII: S0968-0004(99)01494-2

PII: S0168-9525(99)01908-3