Cell, Vol. 85, 311–318, May 3, 1996, Copyright 1996 by Cell Press
Genetic Analysis of Autoimmune Disease
Timothy J. Vyse* and John A. Todd†
*Division of Basic Sciences, National Jewish Center for Respiratory Medicine and Immunology, 1400 Jackson Street, Denver, Colorado 80206
†The Wellcome Trust Centre for Human Genetics, Nuffield Department of Surgery, University of Oxford, Headington, Oxford OX3 7BN, United Kingdom

Introduction
The destruction or disruption of the body’s own tissues by the immune system results from a complex interaction of genetic and environmental factors. Inbred populations such as laboratory strains of mice and rats and outbred populations such as humans harbor a pool of DNA mutations or allelic variants that affect the expression/function of genes involved in control of the immune response. Individually, most mutations in the genome have mild, if not undetectable, effects, but in combination with “normal” alleles of other loci produce a measurable and occasionally lethal autoimmune phenotype. Many of these mutations may be normal variants, which in other circumstances act to maintain the health of the individual. They may have subtle quantitative effects, bestowing a relatively small risk of disease. We refer to primary, predisposing allelic variants as “etiological mutations” to distinguish them from other polymorphisms that play no direct role in disease. It is emphasized that the term mutation is used here generically and is not meant to imply function. The incomplete risk conferred by any single allele makes both the mapping and final identification of these etiological mutations difficult. This single characteristic is the main reason why most common diseases such as autoimmune conditions are not inherited in a simple Mendelian way, but instead have a complex or unknown mode of inheritance. Identification of these etiological mutations and an understanding of their functional consequences will impact the diagnosis, treatment, and prevention of autoimmune diseases, which affect about 4% of the population (see Table 1).
Genetic Analysis on a Whole-Genome Scale
The availability of easy-to-use maps of markers that cover the entire mouse (and rat) and human genomes (Dietrich et al., 1994; Gyapay et al., 1994) has enabled the chromosome localization of almost 40 genes that predispose to autoimmunity in animal models and in human disease (Figure 1). Human studies are in their infancy, but genome-wide linkage studies are underway for most of the diseases listed in Table 1. The markers are simple tandem repeats or microsatellites, which occur in mammalian genomes at fairly even intervals of about 10^5 bp. Microsatellite markers possess high levels
Review
of allelic variation (Love et al., 1990; Weber and May, 1989) so that parental chromosomes often bear different alleles allowing fully informative tracking of chromosomes from parents to affected progeny. Microsatellites are flanked by unique sequences to which polymerase chain reaction primers can be designed, and many aspects of large-scale microsatellite-assisted linkage mapping have been semi-automated (Reed et al., 1994). A whole-genome scan is the analysis of about 100 evenly spaced markers in mouse pedigrees, or about 300 markers in human families, in which there are affected and nonaffected progeny. The aim is to identify chromosome regions that are inherited by affected progeny more often than expected from Mendelian random segregation. Statistical methods have been reviewed recently (Ghosh and Schork, 1996; Lander and Kruglyak, 1995; Lander and Schork, 1994; Thomson, 1995; Weeks and Lathrop, 1995), as has the topic of genetic predisposition to autoimmunity (Theofilopoulos, 1995). We will, therefore, review recent results briefly, but focus on the implications and interpretation of current data. 
Animal Models of Autoimmunity
Some of the best-studied animal models of spontaneous and induced autoimmune disease have now been subjected to genome scans: the nonobese diabetic mouse (NOD; insulin-dependent diabetes [Idd] loci) (Wicker et al., 1995); the F1 hybrid of New Zealand black (NZB) and New Zealand white (NZW) mice (Theofilopoulos, 1995) and the MRL-lpr mouse (Watson et al., 1992), which develop a disease similar to human systemic lupus erythematosus (SLE; Sle or Lbw [lupus–NZB × NZW] loci) (Kono et al., 1994; Kotzin, 1996 [this issue of Cell]); experimental allergic encephalomyelitis (EAE; eae loci), a model of multiple sclerosis (Steinman, 1996 [this issue of Cell]); experimental allergic orchitis (EAO; Orch loci; anti-testis autoimmunity) (Meeker et al., 1995); and autoimmune ovarian dysgenesis (AOD; Aod loci) (Wardell et al., 1995), in which neonatal mice are thymectomized at 3 days of age (Figure 1). Mice are the model of choice for mammalian genetic analysis (Frankel, 1995). Moreover, the environment is controlled and is not a variable in the cumulative frequency of disease in the cross. Therefore, the reduction of disease frequency from the susceptible parental strain to that observed in the cross-progeny is solely a reflection of the genes segregating in the cross, their penetrances (the probability that an etiological mutation can cause the disease), and how the products of these genes relate.

Chromosome Mapping Genes in Mice
The first step is to ascertain the cumulative disease frequency in the parental strains, the F1 hybrids, the first backcross generation, which is the progeny of an F1 × parent mating, and the (F1 × F1)F2 generation. Marker allele frequencies are determined in either backcross progeny or F2 progeny, or preferably both (McAleer et al., 1995), in as many affected and unaffected progeny as possible; they will vary in disease progeny
Figure 1. Chromosome Locations of Mouse Genes Associated with Autoimmune Disease
We have positioned 38 murine disease loci (horizontal bars) according to the marker locus most linked to the trait from genome scanning experiments. Where appropriate, ±5 cM confidence intervals are indicated with vertical bars. Loci colored red are linked to type 1 diabetes, those colored purple to SLE or SLE-associated autoantibody production (or both), those colored blue to EAE, those colored pink to EAO, and those colored green to AOD. There are currently 13 separate type 1 diabetes non-MHC loci mapped (Idd2 through Idd15; Chesnut et al., 1993; de Gouyon et al., 1993; Ghosh et al., 1993; McAleer et al., 1995; Morahan et al., 1994; Serreze et al., 1994; Wicker et al., 1995) (Idd8 and Idd11 are probably the same locus). Nine separate loci (excluding lpr/Fas and gld/Fas ligand; Nagata and Suda, 1995) have been linked to SLE or to SLE-associated autoantibody production (or both) (Sle1–Sle3 [Morel et al., 1994], Lbw2, Lbw3, and Lbw5–Lbw7 [Kono et al., 1994], and nba1 [Drake et al., 1994]; it is probable that Sle1 is equivalent to Lbw7 and Sle3 is equivalent to Lbw5), five in EAE (eae) (Baker et al., 1995; Sundvall et al., 1995), three in EAO (Orch3, Orch4, and Orch5; Meeker et al., 1995), and two in AOD (Aod1 and Aod2; Teuscher et al., 1996; Wardell et al., 1995). Loci such as Lbw4 and Lbw8 (Kono et al., 1994), which had statistical support of p > 0.003, were not included. The loci eae4 (chromosome 7), eae5 (chromosome 11), and eae6 (chromosome 18) are named here for convenience; although linkage data were reported by Baker et al. (1995) with support at p < 0.003, the possibility of segregation distortion was not eliminated. Map distances are based on version 3.1 of the Mouse Genome Database, which contains additional information and references for all loci.
Additional genes shown are as follows: Il1r1 and Il1r2 (interleukin-1 receptor 1 and 2), Ctla4, Cd28, Bcl2, Nramp1 (natural resistance–associated macrophage protein), and Lshs (unofficial nomenclature) on chromosome 11, which controls T cell responsiveness to interleukin-12 and susceptibility to Leishmania infection (Guler et al., 1996). Periinsulitis, sialitis, hyperimmunoglobulin production, and resistance of T cells to apoptosis in NOD mice have been mapped to the Bcl2 locus, named here as Nod1 (Garchon et al., 1994). Nod1 overlaps with the SLE-associated Sbw1 locus (splenomegaly) (Kono et al., 1994), which affects lymphoid hyperplasia and hyperreactivity. Sbw2 on chromosome 4 (Kono et al., 1994) is probably the same as Sle2 (Morel et al., 1994). Interleukin-1 receptor antagonist (Il1rn) on chromosome 2 has been associated with, but not linked to, a number of human autoimmune diseases (Blakemore et al., 1995). The Idd13 locus maps to the interleukin-1 gene on chromosome 2 (Serreze et al., 1994). Il2 (interleukin-2) maps to the 4 cM congenic interval of Idd3 (Lord et al., 1995), overlapping with Aod2 (Teuscher et al., 1996). Aia1 is a locus controlling autoimmune hemolytic anemia on chromosome 4 that overlaps with the Idd9 and nba1 loci. The D6Nds1/Idd locus was detected in a NOD × Mus spretus backcross and maps close to Igk (immunoglobulin light chain locus) (de Gouyon et al., 1993) and Tcrb (T cell receptor β-chain). The D6Mit52/Idd linkage was observed in a cross of NOD with NON (McAleer et al., 1995) and might be distinct from Idd6. Nod3 signifies a locus linked to dexamethasone resistance/apoptosis of NOD T cells and colocalizes with Idd6 (Penha-Goncalves et al., 1995). Idd7 and Sle, a locus linked in the MRL-lpr mouse, colocalize at the centromere of chromosome 7 with the candidate gene Tgfb1 (Watson et al., 1992).
Nod2 is a subphenotype of NOD involving a T cell proliferative defect mapping to the marker D11Mit38, which is in a cluster of immunomodulatory genes, the β-chemokines (Gill et al., 1995). Igh is the immunoglobulin heavy chain locus on chromosome 12. H2 is the MHC on chromosome 17. Mbp is the myelin basic protein gene on chromosome 18 and does not overlap with eae6 (Baker et al., 1995). Note that the Sle1 10 cM interval overlaps with Fasl (Morel et al., 1994). Bphs is the Bordetella pertussis–induced histamine sensitization locus, involved in both EAO and EAE susceptibility (Sudweeks et al., 1993). For the published human IDDM2–IDDM5, IDDM7, and IDDM8 loci on chromosomes 11p15 (Bennett et al., 1995), 15q26 (Field et al., 1994), 11q13 (Davies et al., 1994; Hashimoto et al., 1994), 6q25 (Davies et al., 1994), 2q31 (Copeman et al., 1995; Owerbach and Gabbay, 1995), and 6q27 (Luo et al., 1995), respectively, the nearest genes that have been mapped in both mouse and humans were selected and placed in the figure according to the position of the mouse gene, e.g., the human thrombospondin gene is in the vicinity of IDDM8 on chromosome 6q27,
Table 1. Population Frequencies and Familial Clustering of Human Autoimmune and Immune-Related Diseases

Disease a                     Population Frequency (%)   Sibling Risk (%)   λs    MHC λs
Psoriasis                     2.8                        17                 6     ND
Rheumatoid arthritis b        1.0                        8                  8     1.6
Graves’ disease               0.5                        7.5                15    ND
Type 1 diabetes c             0.4                        6                  15    2.4
Ankylosing spondylitis d      0.13                       7                  54    4.2
Multiple sclerosis c          0.1                        2                  20    2.4
Ulcerative colitis e          0.1                        1.2                12    8.3
Lupus (SLE) f                 0.1                        2                  20    ND
Crohn’s disease e             0.06                       1.2                20    1.3
Narcolepsy g                  0.06                       1                  12    ND
Celiac disease c              0.05                       3                  60    5.2
Primary biliary cirrhosis h   0.008                      0.8                100   ND
Population prevalences and sibling risk values are averaged values from a wide literature studying different ethnic groups. The values have wide confidence intervals, as have the λs values. λs = sibling risk/population prevalence. ND, not determined.
a, Elder et al., 1994; b, Wordsworth, 1995; c, Risch, 1987; d, Brown and Wordsworth, 1996; e, Satsangi et al., 1996; f, Hochberg, 1987; g, Guilleminault et al., 1989; h, Gregory and Bassendine, 1994.
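The ratio defined in the footnote to Table 1 can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `lambda_s` and the choice of four rows are ours, and the numbers are the rounded values transcribed from Table 1, so the results agree with the tabulated ratios only approximately.

```python
# (population frequency %, sibling risk %) transcribed from Table 1.
TABLE1 = {
    "Psoriasis": (2.8, 17),
    "Type 1 diabetes": (0.4, 6),
    "Celiac disease": (0.05, 3),
    "Primary biliary cirrhosis": (0.008, 0.8),
}


def lambda_s(population_frequency_pct, sibling_risk_pct):
    """Sibling relative risk: sibling risk divided by population prevalence."""
    return sibling_risk_pct / population_frequency_pct


for disease, (frequency, risk) in TABLE1.items():
    print(f"{disease}: lambda_s is about {lambda_s(frequency, risk):.0f}")
```

Because both inputs are averages with wide confidence intervals, the computed ratio should be read as an order-of-magnitude indicator of familial clustering rather than a precise estimate.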
from the expected Mendelian ratios (1:1 for a backcross and 1:2:1 for an intercross) if a marker locus is near a disease locus. Cross-progeny without disease are essential controls, and it is also important to establish that the marker allele frequencies in all progeny (or a randomly selected subset) do not vary from the Mendelian ratios. Sample sizes are absolutely critical for the reliability of genome scan results. Small sample sizes, say, fewer than 50 mice, will have a high false-positive rate, where false-positive results are simply random fluctuations. False-positive results can be reduced or made highly unlikely by replicating the results in more than one cross and by taking a low probability (p) value as a statistical threshold for significance. In murine autoimmune disease models, p < 0.001 has been taken as sufficient evidence to name a locus in a genome scan and undertake more experiments to confirm it. For example, of 13 separate non–major histocompatibility complex (MHC) Idd loci named, there is additional, independent support for 12 of them (Idd15 has not yet been tested; Figure 1). Nevertheless, even more stringent statistical criteria have been suggested (Lander and Kruglyak, 1995).

Map Positions
In themselves, map positions are important because they suggest candidate genes already mapped to the disease gene interval. Figure 1 summarizes currently
and the mouse Thbs2 maps to mouse chromosome 17. These assignments are very approximate, and when the IDDM loci are identified and their homologs are mapped they may be different than proposed.
published data on non-MHC loci accrued from mouse genome scan studies of NOD type 1 diabetes, EAE, EAO, AOD, and the New Zealand and MRL models of SLE. In addition, all of the models require a gene or genes in the MHC on chromosome 17. The composite map shows that the intervals for genes for different diseases overlap. If each disease locus is considered to be positioned accurately within a 10 cM interval, then three of nine SLE loci colocalize with an Idd locus interval. This is unlikely to have occurred by chance (binomial p = 0.02). Moreover, two of three Orch (EAO) loci colocalize with Idd loci (p = 0.02). It seems likely, therefore, that the observed clustering of disease genes is not random. These data suggest that mutations exist that cause general defects in immune regulation (such as the polarization of T cells into Th1 and Th2 phenotypes; Charlton and Lafferty, 1995; Tisch and McDevitt, 1996 [this issue of Cell]) or apoptosis (Garchon et al., 1994; Nagata and Suda, 1995; Penha-Goncalves et al., 1995), while other genes that are unique to a disease determine the susceptibility of the target organ to autoimmune response (Sudweeks et al., 1993), such as the expression of an organ-specific antigen and its specific interaction with the class II and class I molecules of the MHC (Wicker et al., 1995). Note that in Figure 1 several loci are shown that are linked to autoimmune disease–associated phenotypes such as apoptosis (Nod3 on chromosome 6) (Penha-Goncalves et al., 1995), T cell and B cell activity (Nod1, Sbw1 and Sbw2, and Nod2 on chromosomes 1, 4, and 11; Garchon et al., 1994; Gill et al., 1995; Kono et al., 1994), and susceptibility to infectious diseases (Nramp1 [Vidal et al., 1995] and Lshs [Guler et al., 1996] on chromosomes 1 and 11), which may influence autoimmunity via dysregulation of the immune system, e.g., by perturbing the Th1/Th2 balance.
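As a rough illustration of the genome-scan test described under Chromosome Mapping Genes in Mice, the deviation of marker genotype counts from the 1:1 backcross ratio can be assessed with a chi-square goodness-of-fit test. This is a minimal sketch under our own assumptions: the function name and the genotype counts are hypothetical, and the reviewed studies may have used other statistics.

```python
import math


def backcross_chi_square(n_homozygous, n_heterozygous):
    """Chi-square statistic (1 df) and p value for deviation from a 1:1 ratio."""
    total = n_homozygous + n_heterozygous
    expected = total / 2
    chi2 = ((n_homozygous - expected) ** 2
            + (n_heterozygous - expected) ** 2) / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical genotype counts at one marker among affected backcross progeny.
chi2, p = backcross_chi_square(n_homozygous=70, n_heterozygous=30)
print(f"chi2 = {chi2:.2f}, p = {p:.2g}")
print("meets the p < 0.001 naming threshold:", p < 0.001)
```

The same calculation applied to unaffected or unselected progeny serves as the control mentioned in the text: there, the counts should not depart significantly from 1:1.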
Modes of Inheritance
Key information can be obtained about the relation between the action of disease gene products and the number and effect of loci linked to the disease observed in the genome scan by analysis of the frequency of disease in the parental strains and the F1, BC1, and F2 progeny. Only 13% of progeny of a (NOD × C57BL/10.H2g7) × NOD backcross developed diabetes, which is much less than the 45% frequency expected if a single, fully recessive-acting gene were segregating and 7-fold less than the 90% frequency observed in parental female NOD mice. If such a locus existed, it would have a genotypic risk ratio (GRR) (the total number of affected progeny/two times the number of homozygous affected progeny) of 0.5, i.e., all affected progeny would have the NOD homozygous genotype at the disease locus (Risch et al., 1993). Hence, there must be more than one gene segregating. Genome scanning revealed that Idd2 through Idd10 were segregating, but these genes still did not account for the 7-fold decrease in disease frequency from NOD parental mice to backcross progeny. The GRRs for these nine Idd loci were, respectively, 0.84, 0.51 (Idd3 and Idd10 are combined because they are linked on chromosome 3), 0.72, 0.85, 0.77, 1.47, 1.67, and 0.72. Note that none of the loci is fully recessive
with a GRR of 0.5 and that the GRRs of Idd7 and Idd8 exceed 1.0, indicative of genes in which the susceptibility is contributed from the resistant strain. Two main genetic models were fitted to the backcross data: the multiplicative model, which is a special model of epistasis with simple mathematical properties (relative penetrances are multiplied) and is approximately equivalent to the classic polygenic threshold liability model (in the liability model, the data from the measured phenotypes are transformed into a logarithmic or logistic scale so that they can be added together); and, second, the heterogeneity model, in which genes act independently of each other and can substitute for each other (Risch et al., 1993). Epistasis or gene interaction implies that the disease gene products interact, in contrast with heterogeneity, in which the defective proteins are causing etiologically distinct forms of the disease. At the outset, the heterogeneity model could be rejected because the frequency of disease in the backcross progeny was less than 45%. Statistically, the multiplicative model fit the data better than the heterogeneity model. Nevertheless, when the GRRs for all nine of the detected loci were multiplied together, the overall product was 0.36, which is greater than the experimental value of 0.14 (calculated by dividing the proportion of mice with diabetes in the backcross progeny [0.13] by the proportion of parental NOD mice that develop diabetes [0.9]). Therefore, there must be a large number of additional genes with small, undetectable effects segregating in the cross (Risch et al., 1993). Similar conclusions have been reached in plant genetics, in which relatively few genes account for the bulk of variation in many populations, with ever-larger numbers of genes contributing ever-smaller portions of the remaining variance (Paterson, 1995). 
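The arithmetic behind this comparison can be reproduced directly from the figures quoted in the text. The sketch below uses the eight reported GRR values (Idd3 and Idd10 combined) and the quoted disease frequencies; the variable names are ours, and the single-gene expectation assumes, as the text does, that half of the backcross progeny are homozygous at a fully recessive locus.

```python
from math import prod

# Genotypic risk ratios reported for Idd2 through Idd10
# (Idd3 and Idd10 are combined, so nine loci give eight values).
GRRS = [0.84, 0.51, 0.72, 0.85, 0.77, 1.47, 1.67, 0.72]

NOD_FREQUENCY = 0.90        # diabetes frequency in parental female NOD mice
BACKCROSS_FREQUENCY = 0.13  # diabetes frequency in backcross progeny

# One fully recessive gene: half of the backcross progeny are homozygous,
# and homozygotes are assumed to develop disease at the parental NOD rate.
single_gene_expected = 0.5 * NOD_FREQUENCY

# Multiplicative model: the relative penetrances of the loci are multiplied.
grr_product = prod(GRRS)

# Observed reduction in disease frequency from parental strain to backcross.
observed_ratio = BACKCROSS_FREQUENCY / NOD_FREQUENCY

print(f"single-gene expectation: {single_gene_expected:.2f}")
print(f"product of GRRs:         {grr_product:.2f}")
print(f"observed ratio:          {observed_ratio:.2f}")
```

The gap between the product of the detected GRRs and the observed ratio is the quantity that the text attributes to additional, undetected loci of small effect.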
In the genetic analyses of murine autoimmunity reported to date, there is no significant evidence for heterogeneity, and instead epistasis is inferred. It appears that the most likely mode of inheritance for autoimmune traits in mice, including EAE (Baker et al., 1995; Sundvall et al., 1995), type 1 diabetes (McAleer et al., 1995), and SLE (Morel et al., 1994), is an epistatic one, with the following characteristics: no single allele is sufficient or, in general, necessary to cause the disease; the more susceptibility alleles at unlinked loci an individual has, the greater the risk of developing the disease; the risks attributable to the main loci are roughly equal; combinations of genes cause the disease, but more than one combination may exist; and a significant component will probably remain unmapped owing to the existence of many genes with weak effects.
Mouse Congenic Strains
Not one of the mapped non-MHC susceptibility genes in Figure 1 has yet been identified. Owing to the reduced penetrance of susceptibility alleles, fine mapping by generating standard backcrosses or intercrosses is not productive unless an extreme penetrant trait is available. Definitive fine mapping can be achieved by the construction of congenic strains in which specific chromosome segments from the donor strain are introduced into the genome of the recipient strain by repetitive backcross of the donor strain to the recipient (Frankel, 1995). Thus, it is possible to compare two strains of mice differing only for a specific chromosome segment. The construction of congenic strains has proved that chromosome 3 has two separate but linked Idd loci (Wicker et al., 1994), Idd3 and Idd10, that interact in a synergistic way (Wicker et al., 1994) and has narrowed the Idd3-containing region down to 4 cM (Lord et al., 1995). Congenic strains for the MHC have proved invaluable in sorting out the relative contributions of MHC-linked genes and non-MHC-linked genes (Ikegami et al., 1995; Wicker et al., 1995). Once it is established that the specific region contains a disease gene, the gene can be mapped to 0.1 cM by developing further congenic strains.
Human Autoimmune Disease
Clustering of disease in families is caused by environmental factors, genetic factors, or both. The degree of clustering of disease in families can be estimated from the ratio of the risk for siblings of patients with a disease and the population prevalence of that disease (Risch, 1987) (Table 1). If this ratio, λs, is close to 1.0, then there is no evidence for familial clustering. Note, however, for the 12 diseases listed in Table 1, the λs value greatly exceeds 1.0, ranging from 6 to 100. For example, if a family has a child with type 1 diabetes, then the other siblings are at a 15 times greater risk of developing diabetes than a member of the general population. It cannot be assumed that genes are fully responsible, because environmental factors can also cause familial clustering. However, it has recently been shown for multiple sclerosis, by comparing the frequency of disease in biological and nonbiological relatives, that familial clustering has a genetic basis (Ebers et al., 1995). The alternative strategy is to embark on a genome scan, if large enough numbers of families with multiple affected cases are available. In type 1 diabetes, two genome scans (over 250 markers) have been completed on 61 (Hashimoto et al., 1994) and 96 (Davies et al., 1994) families with two or more siblings both affected by disease. If available, other affected relatives can be used, as can pedigrees with larger numbers of affected individuals in more than one generation, but affected sibling pairs are often the most common and the most easily collected. The main conclusion was that the MHC-linked locus IDDM1 is probably the major locus and that a number of other genes with lesser effects probably contribute to familial clustering. This conclusion is remarkably, but not unexpectedly, similar to that reached from genome scanning of the NOD mouse.
Moreover, in line with the epidemiological results from multiple sclerosis, it appears likely that familial clustering could well be explained by the sharing of susceptibility alleles at multiple loci (Todd, 1995). For affected sibling pairs (or other relative pairs), the contribution of a single locus to familial clustering can be estimated from the linkage data (Risch, 1987), which in the case of type 1 diabetes was the distribution of marker alleles in the affected sibling pairs. In families from the United Kingdom, the λs value for MHC is about 2.4, much less than the overall λs value of 15, implying that more genes may be involved. Similar conclusions
were reached for multiple sclerosis and celiac disease (Risch, 1987), and it is evident from Table 1 that the MHC cannot fully account for familial clustering of rheumatoid arthritis, ankylosing spondylitis, ulcerative colitis, and Crohn’s disease. If the λs of the disease is large (greater than 6, for example) and the linkage of the MHC region in families cannot account for it, then a genome scan is warranted. In human type 1 diabetes, we expect epistasis because the risk to first, second, and third degree relatives decreases more rapidly than would be expected for a heterogeneity model, in which risk would decrease in a linear fashion by 50% from first to second and 50% from second to third degree relatives (Rich, 1990; Risch, 1990). This is precisely analogous to the drastic reduction in NOD diabetes frequency from that observed in the parental strain (90%) to that in the backcross progeny (13%). Preliminary studies suggest that the MHC IDDM1 locus on chromosome 6p21 and the insulin gene minisatellite locus IDDM2 on chromosome 11p15 interact epistatically (Cordell et al., 1995). In genome scans, linkage results that have occurred through random fluctuation are expected, referred to as false positives (Lander and Kruglyak, 1995; Risch, 1991). They can be eliminated by testing additional, independent data sets. The proportion of false-positive results among true positives can be reduced by using large data sets, a map with denser markers, and multipoint programs (Kruglyak and Lander, 1995; Weeks and Lathrop, 1995), which make the map even more informative, and by correcting for the testing of multiple genetic and clinical models (Risch, 1991). Taking all the current mouse and human data together, environmental factors are probably not a major cause of familial clustering (Ebers et al., 1995; Todd, 1995). This does not mean that environmental factors are not important.
They must be important in type 1 diabetes to explain seasonal and geographical variations in incidence, variations in disease frequency associated with emigration, and the dramatic increase in disease incidence observed over the last 30 years. Hence, environmental factors exist but are probably common and ubiquitous, affecting the general population. We predict that individuals who carry susceptibility alleles at most disease loci will probably be at high risk of developing disease, in much the same way that the selected strains of NOD and (NZB × NZW)F1 are uniquely susceptible to disease because they carry susceptible alleles at many loci.

Fine Mapping in Humans
Ironically, after genome scanning has been carried out, fine mapping and etiological mutation identification in humans may proceed faster than in mice because humans are outbred. Consider a disease mutation that arose on a human chromosome 2000 years ago. It will be flanked by microsatellite markers with certain alleles, and hence if those microsatellites could be typed, their alleles would give a specific and recognizable signature to the chromosome region containing the disease mutation. If a marker is very close, within 1 cM (Jorde, 1995), to a disease mutation, then despite 2000 years of opportunities for recombination during meiosis between the
marker and the disease mutation, there will still be a strong chance that the same marker alleles are associated with the susceptibility allele of the disease mutation in present-day chromosomes. This is called linkage disequilibrium; alleles at marker loci more than 1 cM away from the disease mutation will not tend to be associated with alleles of the disease mutation, because recombination with other chromosomes bearing different alleles at the microsatellite markers will have occurred frequently enough to scramble the association of alleles such that they reach equilibrium and become randomly associated. For example, if alleles at two linked markers each have a population frequency of 0.25, then the chance of them occurring on the same chromosome is 0.25 × 0.25 (0.06). However, if they are very close to each other on the chromosome, they will tend to occur together on the same chromosome at a frequency greater than 0.06. In regions of the genome that show some evidence of linkage to disease detected in the genome scan, therefore, testing markers every centimorgan should yield a marker that shows linkage disequilibrium with disease. By characterizing new markers in the 1 cM region flanking the first disease-associated marker, the disease mutation can be located to regions smaller than 1 cM, or approximately 10^6 bp of DNA, by finding markers with alleles that show stronger and stronger association with the disease. For the IDDM2 mutation this process, called linkage disequilibrium mapping, reduced the maximal chromosome region to only 4 × 10^3 bp of DNA (Lucassen et al., 1993). This approach is very powerful and much more sensitive than linkage analysis in families.
For example, the statistical support in all the combined available type 1 diabetic families for linkage of IDDM2 is marginal (p = 0.005 in 568 affected sibling pair families; λs = 1.3), but the evidence for linkage disequilibrium at IDDM2 is overwhelming (p = 6.7 × 10^-14) (Bennett et al., 1996; Spielman et al., 1993). Currently, it is impractical and statistically problematical to scan the whole human genome for linkage disequilibrium with 3000 markers, but in regions of linkage spanning 20 cM it is a feasible strategy. Already, there is evidence for the association of a microsatellite allele with type 1 diabetes in a region of chromosome 2q that showed weak evidence of linkage in the genome scan (Copeman et al., 1995). The history of the populations studied and the age of the disease mutation relative to the age of marker alleles are important factors (Jorde, 1995). Linkage disequilibrium studies are best carried out in families, but these families need not have multiple affected children; one affected child is sufficient (Copeman et al., 1995; Spielman et al., 1993; Thomson, 1995). Families with one affected child are much more common than families with multiple cases and are therefore much easier to collect. Until recently the candidate gene approach dominated the genetic analysis of autoimmunity. Candidate genes are, however, numerous, and aside from the MHC in several diseases and the insulin gene in type 1 diabetes, associations of many candidate genes have been reported but not confirmed. Nevertheless, candidate genes in regions of linkage, or better still in regions of linkage disequilibrium, do warrant some priority.
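The distance dependence of linkage disequilibrium described in this section can be made concrete with the standard population-genetic approximation that an initial association decays by a factor of (1 − θ) per generation, where θ is the recombination fraction between marker and mutation. The sketch below is illustrative only: the 20-year generation time, the function name, and the treatment of 1 cM as θ = 0.01 (reasonable only for short distances) are our assumptions and are not taken from the text.

```python
def association_retained(recombination_fraction, generations):
    """Fraction of the initial linkage disequilibrium that remains,
    using the approximation (1 - theta) ** t."""
    return (1.0 - recombination_fraction) ** generations


YEARS_SINCE_MUTATION = 2000  # age of the mutation in the example in the text
GENERATION_TIME = 20         # years per generation; an assumption
generations = YEARS_SINCE_MUTATION // GENERATION_TIME

# Under random association, two alleles of frequency 0.25 each are found
# together on a chromosome at this frequency.
equilibrium_haplotype_frequency = 0.25 * 0.25

for distance_cm in (0.1, 1, 5, 20):
    theta = distance_cm / 100  # 1 cM taken as a recombination fraction of 0.01
    retained = association_retained(theta, generations)
    print(f"{distance_cm:>4} cM: {retained:.3f} of the initial association remains")
```

Under these assumptions, a substantial share of the original association survives at 1 cM, whereas at 5 cM or more it is essentially gone, which is consistent with the text's argument for testing markers at roughly 1 cM spacing within a linked region.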
Known Etiological Mutations
Except for IDDM2, none of the etiological mutations for non-MHC loci in humans has yet been identified (IDDM3–IDDM5, IDDM7, and IDDM8) (Copeman et al., 1995; Davies et al., 1994; Hashimoto et al., 1994; Luo et al., 1995; Owerbach and Gabbay, 1995). IDDM2 is due to allelic variation of a minisatellite tandem repeat sequence embedded in the 5′ regulatory region of the insulin gene on chromosome 11p15 (Bennett et al., 1995). The minisatellite affects insulin expression, but the mechanistic relevance of this to disease is unknown (Bennett et al., 1996). The MHC-encoded effects are complex (Cucca and Todd, 1996; Thomson, 1995; Thorsby, 1995). The MHC etiological mutations for celiac disease (Thorsby, 1995), rheumatoid arthritis (Wordsworth, 1995), and type 1 diabetes (Cucca and Todd, 1996) are likely to include variation in exon 2 of the MHC class II HLA-DQB1 and HLA-DRB1 genes. Exon 2 encodes the amino acid polymorphism of these antigen presentation molecules that determines peptide binding and T cell recognition (Tisch and McDevitt, 1996). MHC class II molecules probably determine the target organ specificity of the autoimmune response (Tisch and McDevitt, 1996; Wicker et al., 1995): the interaction of peptide autoantigen, T cell receptor, and MHC is a key checkpoint in autoimmunity. However, the peptides involved are unknown (Tisch and McDevitt, 1996), and the events distal to T cell recognition, such as MHC class II allele DQB1*0602-mediated protection from type 1 diabetes (Tisch and McDevitt, 1996) or the role of HLA class I B27 allele in ankylosing spondylitis (Brown and Wordsworth, 1996), remain to be established. The MHC class III region also plays a primary role in autoimmunity, with mutations in the complement genes C4A and C2 contributing to SLE.
Comparative Mapping in Mouse and Humans
Localization of disease genes in mouse leads directly to the chromosome location of their human homologs owing to the significant homology between the two genomes. Owing to the limited number of genome-wide mapping studies thus far conducted, particularly in humans, it is premature to draw conclusions. IDDM is the only autoimmune disease that has been investigated on a genome-wide basis in humans; none of the NOD Idd loci overlaps with IDDM loci (Figure 1), with the possible exception of Idd5 and IDDM7 (Copeman et al., 1995). The IDDM3 gene on human chromosome 15q (Field et al., 1994) could conceivably be the homolog of Sle3/Lbw5 (Figure 1). Given that the population frequency of the disease allele is an important determinant of the power to detect linkage (or disequilibrium) in human families (Risch et al., 1993), and that the mouse strains used have to vary at the locus, there is no a priori reason to expect exactly the same genes to show up in mouse and human. However, since biochemical pathways are encoded by many gene products and the mouse disease phenotypes closely model the corresponding human diseases, analysis of both mouse and human is desirable. For human diseases in which it is not possible to collect families with multiple living cases or measure key subphenotypes, or where the disease is clinically heterogeneous, mouse studies are essential.
Acknowledgments
We thank Martin Farrall, Chris Lord, Paul Lyons, Jared Lunceford, and Neil Risch for discussion. We also thank Paul Lyons for preparation of Figure 1. We apologize to authors of important references that we could not cite. Research in the laboratories of T. J. V. and J. A. T. was supported by the National Institutes of Health and by the Juvenile Diabetes Foundation, the British Diabetic Association, the United Kingdom Medical Research Council, and the Wellcome Trust. J. A. T. is a Wellcome Trust Principal Research Fellow.
References
Baker, D., Rosenwaser, O.A., O’Neill, J.K., and Turk, J.L. (1995). Genetic analysis of experimental allergic encephalomyelitis in mice. J. Immunol. 155, 4046–4051.
Bennett, S.T., Lucassen, A.M., Gough, S.C.L., Powell, E.E., Undlien, D.E., Pritchard, L.E., Merriman, M.E., Kawaguchi, Y., Dronsfield, M., Pociot, F., Nerup, J., Bouzekri, N., Cambon-Thomsen, A., Rønningen, K.S., Barnett, A.H., Bain, S.C., and Todd, J.A. (1995). Susceptibility to human type 1 diabetes at IDDM2 is determined by tandem repeat variation at the insulin gene minisatellite locus. Nature Genet. 9, 284–292.
Bennett, S.T., Wilson, A.J., Cucca, F., Nerup, J., Pociot, F., McKinney, P.A., Barnett, A.H., Bain, S.C., and Todd, J.A. (1996). IDDM2-VNTR-encoded susceptibility to type 1 diabetes: dominant protection and parental transmission of alleles of the insulin gene-linked minisatellite locus. J. Autoimmun., in press.
Blakemore, A.F., Watson, P.F., Weetman, A.P., and Duff, G.W. (1995). Association of Graves’ disease with an allele of the interleukin-1 receptor antagonist gene. J. Clin. Endocrinol. Metab. 80, 111–115.
Brown, M.A., and Wordsworth, B.P. (1996). Inflammatory Arthropathies: Principles and Practice of Medical Genetics, A. Emery and D. Rimoin, eds. (Edinburgh: Churchill Livingstone), in press.
Charlton, B., and Lafferty, K.J. (1995). The Th1/Th2 balance in autoimmunity. Curr. Opin. Immunol. 7, 793–798.
Chesnut, K., She, J.-X., Cheng, I., Muralidharan, K., and Wakeland, E.K. (1993). Characterizations of candidate genes for IDD susceptibility from the diabetes-prone NOD mouse strain. Mamm. Genome 4, 549–554.
Copeman, J.B., Cucca, F., Hearne, C.M., Cornall, R.J., Reed, P.W., Rønningen, K.S., Undlien, D.E., Nistico, L., Buzzetti, R., Tosi, R., Pociot, F., Nerup, J., Cornélis, F., Barnett, A.H., Bain, S.C., and Todd, J.A. (1995). Linkage disequilibrium mapping of a type 1 diabetes susceptibility gene (IDDM7) to human chromosome 2q31-q33. Nature Genet. 9, 80–85.
Cordell, H.J., Todd, J.A., Bennett, S.T., Kawaguchi, Y., and Farrall, M. (1995). Two-locus maximum lod score analysis of a polygenic trait: joint consideration of IDDM2 and IDDM4 with IDDM1 in type 1 diabetes. Am. J. Hum. Genet. 57, 920–934.
Cucca, F., and Todd, J.A. (1996). HLA susceptibility to type 1 diabetes: methods and mechanisms. In HLA/MHC: Genes, Molecules and Function, M.J. Browning and A.J. McMichael, eds. (Oxford: BIOS Scientific Publishers), in press.
Davies, J.L., Kawaguchi, Y., Bennett, S.T., Copeman, J.B., Cordell, H.J., Pritchard, L.E., Reed, P.W., Gough, S.C.L., Jenkins, S.C., Palmer, S.M., Balfour, K.M., Rowe, B., Farrall, M., Barnett, A.H., Bain, S.C., and Todd, J.A. (1994). A genome-wide search for human type 1 diabetes susceptibility genes. Nature 371, 130–136.
de Gouyon, B., Melanitou, E., Richard, M., Requarth, M., Hahn, I., Guenet, J., Demenais, F., Julier, C., Lathrop, G., Boitard, C., and Avner, P. (1993). Genetic analysis of diabetes and insulitis in an interspecific cross of the nonobese diabetic mouse with Mus spretus. Proc. Natl. Acad. Sci. USA 90, 1877–1881.
Dietrich, W., Miller, J., Steen, R., Merchant, M., Damron, D., Nahf, R., Gross, A., Joyce, D., Wessel, M., Dredge, R., Marquis, A., Stein, L., Goodman, N., Page, D., and Lander, E. (1994). A genetic map of the mouse with 4,006 simple sequence length polymorphisms. Nature Genet. 7, 220–245.
Drake, C.G., Babcock, S.K., Palmer, E., and Kotzin, B.L. (1994). Genetic analysis of the NZB contribution to lupus-like autoimmune disease in (NZB × NZW)F1 mice. Proc. Natl. Acad. Sci. USA 91, 4062–4066.
Ebers, G.C., Sadovnick, A.D., Risch, N.J., and Group, C.C.S. (1995). A genetic basis for familial aggregation in multiple sclerosis. Nature 377, 150–151.
Elder, J.T., Nair, R.P., Guo, S.-W., Henseler, T., Christophers, E., and Voorhees, J.J. (1994). The genetics of psoriasis. Arch. Dermatol. 130, 216–224.
Field, L., Tobias, R., and Magnus, T. (1994). A locus on chromosome 15q26 (IDDM3) produces susceptibility to insulin-dependent diabetes mellitus. Nature Genet. 8, 189–194.
Frankel, W.N. (1995). Taking stock of complex trait genetics in mice. Trends Genet. 11, 471–477.
Garchon, H.-J., Luan, J.-J., Eloy, L., Bedossa, P., and Bach, J.-F. (1994). Genetic analysis of immune dysfunction in non-obese diabetic (NOD) mice: mapping of a susceptibility locus close to the Bcl-2 gene correlates with increased resistance of NOD T cells to apoptosis induction. Eur. J. Immunol. 24, 380–384.
Ghosh, S., and Schork, N. (1996). Genetic analysis of NIDDM. Diabetes 45, 1–14.
Ghosh, S., Palmer, S.M., Rodrigues, N.R., Cordell, H.J., Hearne, C.M., Cornall, R., Prins, J.-B., McShane, P., Lathrop, G.M., Peterson, L.B., Wicker, L.S., and Todd, J.A. (1993). Polygenic control of autoimmune diabetes in nonobese diabetic mice. Nature Genet. 4, 404–409.
Gill, B.M., Jaramillo, A., Ma, L., Laupland, K.B., and Delovitch, T.L. (1995). Genetic linkage of thymic T-cell proliferative unresponsiveness to mouse chromosome 11 in NOD mice: a possible role for chemokine genes. Diabetes 44, 614–619.
Gregory, W.L., and Bassendine, M.F. (1994). Genetic factors in primary biliary cirrhosis. J. Hepatol. 20, 689–692.
Guilleminault, C., Mignot, E., and Grumet, F.G. (1989). Familial patterns of narcolepsy. Lancet 2, 1376–1379.
Guler, M.L., Gorham, J.D., Hsieh, C.-S., Mackey, A.J., Steen, R.G., Dietrich, W.F., and Murphy, K.M. (1996). Genetic susceptibility to Leishmania: IL-12 responsiveness in TH1 cell development. Science 271, 984–987.
Gyapay, G., Morissette, J., Vignal, A., Dib, C., Fizames, C., Millasseau, P., Marc, S., Bernardi, G., Lathrop, G.M., and Weissenbach, J. (1994). 1993–1994 Généthon human genetic linkage map. Nature Genet. 7, 246–339.
Hashimoto, L., Habita, C., Beressi, J., Delepine, M., Besse, C., Cambon-Thomsen, A., Deschamps, I., Rotter, J., Djoulah, S., James, M., Froguel, P., Weissenbach, J., Lathrop, G., and Julier, C. (1994). Genetic mapping of a susceptibility locus for insulin-dependent diabetes mellitus on chromosome 11q. Nature 371, 161–164.
Hochberg, M.C. (1987). The application of genetic epidemiology to systemic lupus erythematosus. J. Rheumatol. 14, 867–869.
Ikegami, H., Makino, S., Yamato, E., Kawaguchi, Y., Ueda, H., Sakamoto, T., Takekawa, K., and Ogihara, T. (1995). Identification of a new susceptibility locus for insulin-dependent diabetes mellitus by ancestral haplotype congenic mapping. J. Clin. Invest. 96, 1936–1942.
Jorde, L.B. (1995). Linkage disequilibrium mapping as a gene-mapping tool. Am. J. Hum. Genet. 56, 11–14.
Kono, D.H., Burlingame, R.W., Owens, D.B., Kuramochi, A., Balderas, R.S., Balomenos, D., and Theofilopoulos, A.N. (1994). Lupus susceptibility loci in New Zealand mice. Proc. Natl. Acad. Sci. USA 91, 10168–10172.
Kotzin, B.L. (1996). Systemic lupus erythematosus. Cell 85, this issue.
Kruglyak, L., and Lander, E.S. (1995). Complete multipoint sib-pair analysis of qualitative and quantitative traits. Am. J. Hum. Genet. 57, 439–454.
Lander, E., and Kruglyak, L. (1995). Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results. Nature Genet. 11, 241–247.
Lander, E.S., and Schork, N.J. (1994). Genetic dissection of complex traits. Science 265, 2037–2048.
Lord, C.J., Bohlander, S.K., Hopes, E.A., Montague, C.T., Hill, N.J., Prins, J.-B., Peterson, L.B., Wicker, L.S., Todd, J.A., and Denny, P. (1995). Mapping of the diabetes polygene Idd3 on mouse chromosome 3 using novel congenic strains. Mamm. Genome 6, 563–570.
Love, J.M., Knight, A.M., McAleer, M., and Todd, J.A. (1990). Towards construction of a high resolution map of the mouse genome using PCR-analysed microsatellites. Nucl. Acids Res. 18, 4123–4130.
Lucassen, A., Julier, C., Beressi, J.-P., Boitard, C., Froguel, P., Lathrop, G., and Bell, J. (1993). Susceptibility to insulin dependent diabetes mellitus maps to a 4.1 kb segment of DNA spanning the insulin gene and associated VNTR. Nature Genet. 4, 305–310.
Luo, D.-F., Bui, M.M., Muir, A., Maclaren, N.K., Thomson, G., and She, J.-X. (1995). Affected sibpair mapping of a novel susceptibility gene to insulin-dependent diabetes mellitus (IDDM8) on chromosome 6q25-q27. Am. J. Hum. Genet. 57, 911–919.
McAleer, M.A., Reifsnyder, P., Palmer, S.M., Prochazka, M., Love, J.M., Copeman, J.B., Powell, E.E., Rodrigues, N.R., Prins, J.-P., Serreze, D.V., DeLarato, N.H., Wicker, L.S., Peterson, L.B., Schork, N.J., Todd, J.A., and Leiter, E.H. (1995). Crosses of NOD mice with the related NON strain: a polygenic model for type 1 diabetes. Diabetes 44, 1186–1195.
Meeker, N.D., Hickey, W., Korngold, R., Hansen, K.W., Sudweeks, J.D., Wardell, B.B., Griffith, J.S., and Cory, T. (1995). Multiple loci govern the bone marrow-derived immunoregulatory mechanism controlling dominant resistance to autoimmune orchitis. Proc. Natl. Acad. Sci. USA 92, 5684–5688.
Morahan, G., McClive, P.J., Huang, D., Little, P., and Baxter, A. (1994). Genetic and physiological association of diabetes susceptibility with raised Na+/H+ exchange activity. Proc. Natl. Acad. Sci. USA 91, 5898–5902.
Morel, L., Rudofsky, U.H., Longmate, J.A., Schiffenbauer, J., and Wakeland, E.K. (1994). Polygenic control of susceptibility to murine systemic lupus erythematosus. Immunity 1, 219–229.
Nagata, S., and Suda, T. (1995). Fas and Fas ligand: lpr and gld mutations. Immunol. Today 16, 39–43.
Owerbach, D., and Gabbay, K.H. (1995). The HOXD8 locus (2q31) is linked to type 1 diabetes. Diabetes 44, 132–136.
Paterson, A.H. (1995). Molecular dissection of quantitative traits. Genome Res. 5, 321–333.
Penha-Goncalves, C., Leijon, K., Persson, L., and Holmberg, D. (1995). Type 1 diabetes and the control of dexamethazone-induced apoptosis in mice maps to the same region on chromosome 6. Genomics 28, 398–404.
Reed, P.W., Davies, J.L., Copeman, J.B., Bennett, S.T., Palmer, S.M., Pritchard, L.E., Gough, S.C.L., Kawaguchi, Y., Cordell, H.J., Balfour, K.M., Jenkins, S.C., Powell, E.E., Vignal, A., and Todd, J.A. (1994). Chromosome-specific microsatellite sets for fluorescence-based, semi-automated genome mapping. Nature Genet. 7, 390–395.
Rich, S.S. (1990). Mapping genes in diabetes. Diabetes 39, 1315–1319.
Risch, N. (1987). Assessing the role of HLA-linked and unlinked determinants of disease. Am. J. Hum. Genet. 40, 1–14.
Risch, N. (1990). Linkage strategies for genetically complex traits. I. Multilocus models. Am. J. Hum. Genet. 46, 222–228.
Risch, N. (1991). A note on multiple testing procedures in linkage analysis. Am. J. Hum. Genet. 48, 1058–1064.
Risch, N., Ghosh, S., and Todd, J.A. (1993). Statistical evaluation of multiple locus linkage data in experimental species and relevance to human studies: application to murine and human IDDM. Am. J. Hum. Genet. 53, 702–714.
Satsangi, J., Welsh, K.I., Bunce, M., Julier, C., Farrant, J.M., Bell, J.I., and Jewell, D.P. (1996). Relation of MHC gene to susceptibility and disease phenotype in inflammatory bowel disease. Lancet, in press.
Serreze, D.V., Prochazka, M., Reifsnyder, P.C., Bridgett, M.M., and Leiter, E.H. (1994). Use of recombinant inbred congenic and congenic strains of NOD mice to identify a new insulin-dependent diabetes resistance gene. J. Exp. Med. 180, 1553–1558.
Spielman, R., McGinnis, R., and Ewens, W. (1993). Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). Am. J. Hum. Genet. 52, 506–516.
Steinman, L. (1996). Multiple sclerosis: a coordinated immunological attack against myelin in the central nervous system. Cell 85, this issue.
Sudweeks, J., Todd, J., Blankenhorn, E., Wardell, B., Woodward, S., Meeker, N., Estes, S., and Teuscher, C. (1993). Locus controlling Bordetella pertussis-induced histamine sensitization (Bphs), an autoimmune disease-susceptibility gene, maps distal to T-cell receptor β-chain gene on mouse chromosome 6. Proc. Natl. Acad. Sci. USA 90, 3700–3704.
Sundvall, M., Jirholt, J., Yang, H.-T., Jansson, L., Engstrom, A., Pettersson, U., and Holmdahl, R. (1995). Identification of murine loci associated with susceptibility to chronic experimental autoimmune encephalomyelitis. Nature Genet. 10, 313–317.
Teuscher, C., Wardell, B.B., Lunceford, J.K., Michael, S.D., and Tung, K.S.K. (1996). Aod2, the locus controlling development of atrophy in neonatal thymectomy-induced autoimmune ovarian dysgenesis, colocalizes with Il2, Fgfb, and Idd3. J. Exp. Med. 183, 631–637.
Theofilopoulos, A.N. (1995). The basis of autoimmunity. II. Genetic predisposition. Immunol. Today 16, 150–159.
Thomson, G. (1995). HLA disease associations: models for the study of complex human genetic disorders. Crit. Rev. Clin. Lab. Sci. 32, 183–219.
Thorsby, E. (1995). HLA-associated disease susceptibility: which genes are primarily involved? Immunologist 3, 51–58.
Tisch, R., and McDevitt, H. (1996). Insulin-dependent diabetes mellitus. Cell 85, this issue.
Todd, J.A. (1995). Genetic analysis of type 1 diabetes using whole genome approaches. Proc. Natl. Acad. Sci. USA 92, 8560–8565.
Vidal, S., Tremblay, M.L., Govoni, G., Gauthier, S., Sebastiani, G., Malo, D., Skamene, E., Olivier, M., Jothy, S., and Gros, P. (1995). The Ity/Lsh/Bcg locus: natural resistance to infection with intracellular parasites is abrogated by disruption of the Nramp1 gene. J. Exp. Med. 182, 655–666.
Wardell, B.B., Michael, S.D., Tung, K.S.K., Todd, J.A., Blankenhorn, E.P., McEntee, K., Sudweeks, J.D., Hansen, W.K., Meeker, N.D., Griffith, J.S., Livingstone, K.D., and Teuscher, C. (1995). Aod1, the immunoregulatory locus controlling abrogation of tolerance in neonatal thymectomy-induced autoimmune ovarian dysgenesis, maps to mouse chromosome 16. Proc. Natl. Acad. Sci. USA 92, 4758–4762.
Watson, M.L., Rao, J.K., Gilkeson, G.S., Ruiz, P., Eicher, E.M., Pisetsky, D.S., Matsuzawa, A., Rochelle, J.M., and Seldin, M.F. (1992). Genetic analysis of MRL-lpr mice: relationship of the Fas apoptosis gene to disease manifestations and renal disease-modifying loci. J. Exp. Med. 176, 1645–1656.
Weber, J.L., and May, P.E. (1989). Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction. Am. J. Hum. Genet. 44, 388–396.
Weeks, D.E., and Lathrop, G.M. (1995). Polygenic disease: methods for mapping complex traits. Trends Genet. 11, 513–519.
Wicker, L.S., Todd, J.A., Prins, J.-B., Podolin, P.L., Renjilian, R.J., and Peterson, L.B. (1994). Resistance alleles at two non-major histocompatibility complex-linked insulin-dependent diabetes loci on chromosome 3, Idd3 and Idd10, protect nonobese diabetic mice from diabetes. J. Exp. Med. 180, 1705–1713.
Wicker, L.S., Todd, J.A., and Peterson, L.B. (1995). Genetic control of autoimmune diabetes in the NOD mouse. Annu. Rev. Immunol. 13, 179–200.
Wordsworth, P. (1995). Genes and arthritis. Br. Med. Bull. 51, 249–266.