COMMENT

Why map the rat?

MICHAEL R. JAMES* AND KLAUS LINDPAINTNER‡
[email protected] [email protected]
*WELLCOME TRUST CENTRE FOR HUMAN GENETICS, WINDMILL ROAD, OXFORD, UK OX3 7BN; ‡DEPARTMENT OF MEDICINE, BRIGHAM AND WOMEN'S HOSPITAL, HARVARD MEDICAL SCHOOL, BOSTON, MA 02115, USA.
As the Chinese lunar year of the rat comes to a close, has it been a good year for the rat? Certainly it seems to be coincident with an increased interest in and awareness of the rat as a model organism for genetic research, and with the commitment of substantial funding for the establishment of genomic resources for this organism. Major grants have been awarded by the National Institutes of Health and the European Community to consortia in the United States and in Europe, respectively. Consequently, there has already been a quickening of the pace at which new resources and reagents are being entered into the public domain, with large numbers of additional simple sequence length polymorphism (SSLP) markers forthcoming (H. Jacob, pers. commun.), with the availability of the first large-insert genomic library for the rat 1 and with the preparation under way of a radiation hybrid panel (L. McCarthy and P. Goodfellow, pers. commun.), large-insert genomic bacterial clone libraries (P.Y. Woon and P. deJong, pers. commun.) and arrayed cDNA libraries 2. Table 1 lists resources available on the Internet.

A bigger mouse?

The rat might appear to be not much different, except for its larger size, from the mouse, a species for which extensive and excellent genetic and genomic resources are available. This might lead one to question the wisdom of investing substantial effort into the development of genomic resources for a phylogenetically closely related organism. Yet, the role of these two species in biomedical research is so different that a separate genome project for the rat has been deemed justified. While scientific work in the mouse has traditionally maintained a strong focus on genetics, inherited-disease models in the rat were used primarily in physiological research, and the genetic aspects received little attention. This historical difference in the orientation of scientific discovery in the mouse and rat had profound effects on the tools and resources that have been developed for the two species over the past century. Because meaningful genetic experimentation was more or less restricted to monogenic traits for most of this time, an extensive repository of single-gene mutants has been developed in the mouse. While interest in models of complex disease in mice has increased lately (primarily diabetes, obesity and cancer; Ref. 3), it was through the availability of monogenic disease models that mouse genetics made its most significant contributions to our understanding of mammalian genetics and development. By contrast, the focus on physiological research with the rat has generated a wealth of experience and methodological sophistication for the accurate determination of quantitative phenotype measurements, taking advantage of the physical size of the rat compared with the mouse. Much of our current understanding of integrative physiology is based on these studies in rats.

Mapping complex traits

Although there was no concerted effort to develop monogenic rat mutants, a substantial number of genetic disease strains have been created over the years by selective breeding of animals expressing a desired phenotype. Inbreeding provided fixation of the trait of interest, but was almost never followed by backcrossing to establish whether a trait was monogenic or polygenic in character. Thus, a substantial number of genetic models for a broad spectrum of complex pathological traits, such as hypertension, arthritis, diabetes, renal disease, reactive airway disease, cancer, seizure, autoimmune, ophthalmic, dental and psycho-behavioural disorders, have been developed 4. These models have been used extensively and, until recently, almost exclusively for comparative studies, contrasting the disease strain with a strain that does not show the phenotype of interest, or that had been selection-bred for the opposite extreme of a quantitative phenotype. While most of this work is difficult to interpret, given the confounding effect of chance fixation of trait-unrelated differences between any two strains, the lack of genetic sophistication among physiologists unwittingly preserved a rich source of models for polygenic and common multifactorial traits. With the advent
TABLE 1. Rat genomic resources on the Internet

WWW address | Host site | Resource
http://ratmap.gen.gu.se | Göteborg Univ., Sweden | World consortium map, nomenclature, mouse-rat homologies
http://www.well.ox.ac.uk | Wellcome Trust Centre for Human Genetics, Oxford, UK | Oxford consortium genetic map, physical resources
http://www.genome.wi.mit.edu | Whitehead Institute Center for Genome Research, Boston, MA, USA | MIT and Wisconsin genetic map, physical resources
http://www.resgen.com | Research Genetics Inc., Huntsville, AL, USA | Commercial supplies of genetic markers, YAC clones
TIG MAY 1997 VOL. 13 NO. 5. Copyright © 1997 Elsevier Science Ltd. All rights reserved.
of powerful molecular genetic tools, these strains have now become a major asset in complex disease research. Thus, although the mouse has served to establish or to test many of the principles of classic mendelian genetics, the rat is quickly becoming the system of choice for exploring the paradigms of complex inheritance. A cross between a hypertensive and a normotensive rat strain was the first mammalian model in which quantitative trait loci (QTL) were mapped by genome screening, a critical proof that this approach towards new gene discovery, previously demonstrated only in plants, was applicable to vertebrates 5,6. Subsequently, a substantial number of similar studies have been published; while the majority maintained a focus on hypertension, others dealt with disorders such as renal disease 7, diabetes 8-10, arthritis 11,12, stroke 13 and a behavioural disorder 14. The findings encountered in these studies have established a paradigm for complex, multifactorial traits. Thus, there have been examples of epistatic interaction of two loci 15,16, as well as of ecogenetic (gene and environment) effects 13,15. Likewise, 'transgressive alleles' have been amply documented; that is, an allele can be protective or disease-causing depending on the host strain and the specific genetic context. The finding that, in crosses of different hypertensive and normotensive strains, very different loci are found to be significantly linked to a phenotype that appears physiologically uniform indicates the genetic heterogeneity of complex disease. It also illustrates how harnessing the power of a 'founder effect' can help to restrict heterogeneity and to yield greater statistical power, at the cost of representing only one segment of the genetic spectrum of a disease or trait.

The attainable mapping resolution in quantitative traits depends on the relative contribution of the locus to the overall genetic variance of the phenotype, on the number of meioses studied, on the number and spacing of informative markers and on the fidelity of the phenotyping methodology. Nevertheless, a key lesson learned from these studies has been that, for any given locus, only a certain, finite mapping resolution is attainable. To progress beyond localization of a gene to a fairly large chromosomal segment, the need for the production of programmed, nested congenic strains became apparent, as did the difficulty of phenotyping these strains, whose phenotypic difference from the wild-type background can be relatively small. With complex diseases, a positional-candidate approach to disease gene identification 17 suffers from the problem that many genes can be interpreted as potential candidates. Clearly, the congenic strategy provides an alternative to laboriously screening the perhaps hundreds of genes within a given mapped interval. As yet, no complex disease gene has been cloned using these approaches, but the first successes in using congenic strains for more precise localization of a QTL and evaluation of its phenotypic effect have been described 13,18.
Rat genome toolbox

These strategies implicitly assume the existence of adequate genomic resources. There is little doubt that many of these resources, which constitute the 'mammalian genome toolbox' 19, will be available to the rat genetics community in the near future. Being a late starter in the genomics area compared with its smaller rodent relative, the rat can gain advantage from lessons learned and experience gained, and possibly avoid some effort altogether by exploiting the close evolutionary relationship of the two species. Thus, comparative mapping should be particularly important in gene identification in the rat, saving time and resources, because of the significant programme to develop a transcript map for the mouse. Likewise, as long as the rat-mouse comparative map is adequate, the transcript map to emerge from the human genome project 20 will also provide a critical helping hand to the rat. Of course, this is a two-way process and the rat genetic map could point to human genes and regions to be examined more closely in human diseases. It must be said, however, that there is a fair distance to travel before this nirvana is reached. For example, the latest rat genetic map contains just under 800 SSLPs (M.T. Bihoreau et al., unpublished; see Ref. 21), and although this represents almost double the density of the map of just one year earlier 22, it is only equivalent to the first microsatellite-based human genetic map. While there is full expectation that these numbers will reach thousands in the near future, as has been repeatedly observed in genome projects, technology might come to the rescue. First, radiation hybrid (RH) maps are likely to displace the much more laborious genetic maps as the primary mapping tool. Because RH maps do not require polymorphic markers, they might also be the main means of creating a high-resolution comparative map, perhaps even using the same gene and expressed sequence tag (EST) primers created for the mouse transcript map project. It remains to be seen whether single nucleotide polymorphisms, probably developed from these same mapped genes, and the technology to screen them, will obviate the need to develop as many SSLPs as has been the case for the mouse and human. New approaches in transgenic technology 23 might overcome the failure to develop targeted gene replacement in the rat. This is a non-trivial advantage of the mouse because many of the changes that will be found associated with QTLs are likely to be subtle alterations. The ultimate proof that such alterations are biologically significant might have to come from such precise single-gene or single-nucleotide-replacement experiments to assess their effect in various genetic backgrounds. In some cases, this type of experiment might be informative in the mouse. In many cases, however, the complexity of interactions with other genes and specific allelic variants of other genes might work against reproducing a specific model in the mouse. Finally, in this context of whole-animal technology, it is difficult to resist mentioning a recent study in which the mouse has unarguably been of help to the rat, namely in acting as a host for rat spermatogenesis 24. The recent demonstration of long-term frozen storage of viable spermatogonial stem cells 25 might feasibly represent an alternative to standard transgenic methodology if the spermatogonial stem cells can be grown in culture.

Yet another promising approach for the future might be to close the 'phenotype gap' by saturation mutagenesis and screening for interesting mutants 26. In this way, one might identify genes involved in metabolic or regulatory pathways of specific interest. Large-scale screening systems developed by the pharmaceutical industry make this more realistic than it would have been in
the past. Combining this with chemical mutagenesis 26, or even transposon-tagging mutagenesis 27 (which allows the mutated gene to be readily identified), might make this a more direct route to identify genes as potential candidates for drug therapy.

Prospects

The epidemiological importance of common, complex diseases highlights the potential impact that a deeper understanding of their nature and the uncovering of the contributing genes will have on the future of medicine, and on the human condition in general. The next few decades are likely to see an explosion of knowledge in this area, and the availability of a model organism that offers reductionist, yet typical, models of complex multifactorial traits will undoubtedly play an important role in these advances. Thus, the current efforts at establishing a powerful set of genetic and genomic tools to make experiments in the laboratory rat accessible to efficient gene discovery seem timely and well poised to contribute significantly to the forefront of biomedical sciences.

Acknowledgements

M.R.J. thanks the Wellcome Trust for support. K.L. is supported by a Research Career Development Award (K04-HL03138-01) from the National Heart, Lung and Blood Institute. Laboratory work was partly supported by EC Biotechnology grant BIO4-CT96-0372.
References

1 Cai, L. et al. (1997) Genomics 39, 385-392
2 Bonaldo, M.F., Lennon, G. and Soares, M.B. (1996) Genome Res. 6, 791-806
3 Frankel, W.N. (1995) Trends Genet. 11, 471-477
4 Greenhouse, D.D., Festing, M.F.W., Hasan, S. and Cohen, A.L. (1990) in Genetic Monitoring of Inbred Strains of Rats. A Manual on Colony Management, Basic Monitoring Techniques, and Genetic Variants of the Laboratory Rat (Hedrich, H. and Adams, M., eds), pp. 411-480, Gustav Fischer Verlag
5 Hilbert, P. et al. (1991) Nature 353, 521-529
6 Jacob, H.J. et al. (1991) Cell 67, 213-224
7 Brown, D.M. et al. (1996) Nat. Genet. 12, 44-51
8 Galli, J. et al. (1996) Nat. Genet. 12, 31-37
9 Gauguier, D. et al. (1996) Nat. Genet. 12, 38-43
10 Jacob, H.J. et al. (1992) Nat. Genet. 2, 56-60
11 Remmers, E.F. et al. (1996) Nat. Genet. 14, 82-85
12 Kermarrec, N. et al. (1996) Genomics 31, 111-114
13 Kreutz, R. et al. (1995) Proc. Natl. Acad. Sci. U. S. A. 92, 8778-8782
14 Moisan, M-P. et al. (1996) Nat. Genet. 14, 471-473
15 Kreutz, R., Stock, P., Struk, B. and Lindpaintner, K. (1996) Hypertension 28, 895-897
16 Deng, A.Y. and Rapp, J.P. (1992) Nat. Genet. 1, 267-272
17 Collins, F.S. (1995) Nat. Genet. 9, 347-350
18 St Lezin, E.M. et al. (1996) J. Clin. Invest. 97, 522-577
19 Frankel, W.N. (1995) Nat. Genet. 9, 3-4
20 Schuler, G.D. et al. (1996) Science 274, 540-546
21 Bihoreau, M.T. et al. Hum. Mol. Genet. (in press)
22 Jacob, H.J. et al. (1995) Nat. Genet. 9, 63-69
23 Campbell, K.H.S., McWhir, J., Ritchie, W.A. and Wilmut, I. (1996) Nature 380, 64-66
24 Clouthier, D.E. et al. (1996) Nature 381, 418-421
25 Avarbock, M.R., Brinster, C.J. and Brinster, R.L. (1996) Nat. Med. 2, 693-696
26 Brown, S.D.M. and Peters, J. (1996) Trends Genet. 12, 433-434
27 Moran, J.V. et al. (1996) Cell 87, 917-927

MEETING REPORTS
Biologists and mathematicians: bridging the chasm

RECOMB97: FIRST ANNUAL COMPUTATIONAL MOLECULAR BIOLOGY CONFERENCE, SANTA FE, NM, USA, 20-23 JANUARY 1997.

Computational molecular biology, or bioinformatics as it is sometimes known, is a very rapidly growing discipline that is still in search of an identity. The discipline has emerged from a series of collaborations where biologists have wandered over to the mathematics, statistics, or computer science departments of their universities in search of someone who could help them with a problem. The field remains primarily collaborative rather than interdisciplinary. Each speaker was careful to introduce themselves either as a mathematician (including statisticians and computer scientists under this category) or as a biologist; no one claimed to be a 'computational biologist'. Numerous jokes were made about the conflict of cultures: biologists tend to present talks with elegant slides of gels, while mathematicians use hand-drawn overhead transparencies filled with equations. Another symptom of insufficient communication was the lack of understanding by those developing new computer algorithms about what computer tools most biologists are currently using in the laboratory. This is obviously a contributing factor to the frequent difficulties that biologists
have with the user interfaces and file format incompatibilities of sequence analysis and other bioinformatics software. Clearly, breaking down these communication barriers is what this conference was all about. Most of the talks fell into a few distinct categories: (1) biologists with new kinds of data in need of analysis [David Botstein (Stanford Univ., USA), Rob Lipshutz (Affymetrix, Santa Clara, CA, USA), Richard Roberts (New England Biolabs, Beverly, MA, USA)]; (2) extensions and improvement of current algorithms (multiple alignment, STS mapping, assembly of short