GENOMICS 10, 417-424 (1991)
New Genes in the Class II Region of the Human Major Histocompatibility Complex

ISABEL M. HANSON,* ANNEMARIE POUSTKA,† AND JOHN TROWSDALE*
*Human Immunogenetics Laboratory, Imperial Cancer Research Fund, 44 Lincolns Inn Fields, London WC2A 3PX, United Kingdom; and †German Cancer Research Centre, Im Neuenheimer Feld 280, D-6900 Heidelberg, Federal Republic of Germany

Received October 19, 1990; revised February 1, 1991

A detailed map of the class II region of the human major histocompatibility complex has been constructed by pulsed-field gel electrophoresis. This map revealed clusters of sites for enzymes that cut preferentially in unmethylated CpG-rich DNA often found at the 5' ends of genes. Three of these clusters have been cloned by cosmid walking and chromosome jumping. Analysis of the clones encompassing these regions through the use of zoo blots, Northern blots, and cDNA libraries resulted in the discovery of four novel genes. The D6S111E and D6S112E genes are centromeric to the HLA-DPB2 gene, while D6S113E and D6S114E are between HLA-DNA and HLA-DOB. Preliminary characterization of the new genes indicates that they are unrelated to the class II genes themselves, although D6S114E expression, like class II expression, is inducible with interferon. In addition, the HLA-DNA gene has been accurately positioned and oriented for the first time. © 1991 Academic Press, Inc.

0888-7543/91 $3.00 Copyright © 1991 by Academic Press, Inc. All rights of reproduction in any form reserved.

INTRODUCTION

The major histocompatibility complex (MHC) is one of the more extensively characterized regions of the human genome. Situated at 6p21.3, it spans about 4 Mb of DNA and contains over 40 genes that have been physically mapped relative to one another (Trowsdale and Campbell, 1988; Spence et al., 1989). The first known products of this region were the classical MHC antigens: HLA-A, -B, and -C molecules encoded in the class I region, which spans approximately 2 Mb; and HLA-DR, -DQ, and -DP molecules encoded in the class II region, which spans 1 Mb (Strachan, 1987; Trowsdale, 1987). These class I and class II products play a central role in the regulation of the immune response by presenting peptide antigen to T lymphocytes. Additional class I- and class II-like genes, pseudogenes, and gene fragments have also been found in these regions. Between the class I and class II gene clusters is a 1-Mb interval, the class III region, which contains the genes for several proteins of diverse function including components C2, C4, and factor B of the complement cascade, steroid 21-hydroxylase, tumor necrosis factors α and β, and the heat shock protein HSP70 (Trowsdale and Campbell, 1988; Sargent et al., 1989a). More recently, it has been shown that the class III region also contains at least 17 genes of unknown function (Levi-Strauss et al., 1988; Spies et al., 1989a,b; Sargent et al., 1989b).

The discovery of new genes in the MHC region is of particular importance because numerous diseases are known to be associated with the MHC (Tiwari and Terasaki, 1985). Many of these diseases have a pathology in which an autoimmune response is implicated and the HLA gene products themselves may be involved in the development of the diseased state, for example, insulin-dependent diabetes mellitus (Todd et al., 1988). However, for most of these disorders the mechanism of disease development remains obscure and it is possible that genes closely linked to the HLA loci contribute to the disease phenotype. It is especially interesting in this respect that a locus mapping between the HLA-DP genes and the complement genes causes defects in the antigen-presenting function of HLA molecules when deleted (Cerundolo et al., 1990). The class II region may therefore encode other products that are essential for normal antigen presentation apart from the classical class II antigens themselves.

To identify potential sites of new genes in the class II region, we have exploited the observation that the 5' ends of genes are often associated with short stretches (1-2 kb) of unmethylated CpG-rich DNA (Bird, 1987). These "CpG islands" can be detected in genomic DNA as containing clusters of sites for certain "rare-cutter" restriction endonucleases that have target sequences containing one or more CpG dinucleotides and that cut only when the CpG is unmethylated (Brown and Bird, 1986). We have used pulsed-field gel electrophoresis (PFGE) to identify in the class II region the positions of clusters of sites for those rare-cutter enzymes that are expected in theory, and shown in practice, to be highly diagnostic for CpG islands (Lindsay and Bird, 1987; Bird, 1990). We have cloned three such clusters by cosmid walking and chromosome jumping, and by analyzing the cloned regions for conserved and expressed sequences we have isolated cDNAs corresponding to four novel genes.

MATERIALS AND METHODS
Preparation of DNA and RNA Blots

PFGE was carried out using LKB apparatus as described in Hanson et al. (1989). PFGE blots were prepared using DNA from the MHC homozygous cell line PGF. Northern blots were prepared from RNA isolated from the cell lines Molt4 (T-LCL), U937 (macrophage), K562 (erythroleukemia), Mann and Raji (B-LCL), HeLa, and SW1222, SW620, and CC20 (colon carcinoma) according to standard protocols (Ausubel et al., 1987). To test for γ-interferon-inducible gene expression in the colon carcinoma cell lines, cells were incubated with 300 U γ-interferon/ml for 36-48 h before RNA extraction.

Probes

HLA-DPB1, -DPA1, -DOB, -DQA2, -DRB, and -DRA probes were those described in the report of the Tenth International Histocompatibility Workshop (Marcadet et al., 1989). The HLA-DNA probe was 8ba1, a 1.8-kb PstI genomic fragment isolated from the cosmid JG8b (Trowsdale and Kelly, 1985). The COL11A2 probe was a 4.5-kb BamHI/EcoRI genomic fragment from the cosmid cosHcol.11 (Hanson et al., 1989).

Probes for the analysis of the regions containing the rare-cutter sites were obtained by purifying cosmid fragments from agarose gel slices using GeneClean (Bio 101). Probe 33X1 was a 1.7-kb XhoI fragment from cosmid HPB.ALL 42 (Fig. 4A). Probe 31K1 was an 0.8-kb KpnI fragment from cosmid HPB.ALL 42 (Fig. 4A). Probe 71Bs3 was a 580-bp BssHII fragment from cosmid HPB.ALL 71 (Fig. 4B). Probe U15KN was a 6.0-kb KpnI fragment from cosmid U15 (Fig. 4C; Blanck and Strominger, 1990). The probe for the jumping library was jBK, a 3.8-kb BssHII/KpnI fragment from cosmid HPB.ALL 31 (position shown in Fig. 4A).

The cDNA probes corresponding to the four novel genes were D6S111E, the 1.5-kb XhoI insert from the cDNA clone CEM15; D6S112E, the 1.5-kb XhoI insert from the cDNA clone CEM21; D6S113E, the 3-kb XhoI insert from the cDNA clone CEM41; and D6S114E, the 2.6-kb XhoI insert from the cDNA clone 2.1.

Probe DNA was labeled to high specific activity with [α-³²P]dCTP by random hexamer priming (Feinberg and Vogelstein, 1983). Hybridization to Southern blot filters was carried out overnight in 6× SSC, 5× Denhardt's solution, 0.5% SDS, and 10% dextran sulfate containing 50 µg/ml heat-denatured sheared salmon sperm DNA and 10⁶ cpm/ml probe at 65°C. Cosmid library filters, cDNA library filters, and blots of cosmid DNA were hybridized under similar conditions except that the probe concentration was reduced to 5 × 10⁵ cpm/ml. Hybridization to Northern blot filters was carried out overnight at 42°C in 6× SSPE, 5× Denhardt's solution, 10% dextran sulfate, and 50% deionized formamide containing 50 µg/ml salmon sperm DNA and 10⁶ cpm/ml probe. Filters were washed to a final stringency of 0.1× SSC, 0.1% SDS at 65°C except for zoo blots where the final washing stringency was 2× SSC.

Isolation of Cosmid Clones

The cosmid library, constructed from the human T-cell line HPB.ALL, was plated out on nylon filters and screened according to standard protocols (Ausubel et al., 1987). Cosmids HPB.ALL 25, 31, 33, and 42, encompassing cluster 1, were isolated by screening the library with a 7.5-kb EcoRI fragment from the proximal end of the cosmid cosHcol.11 (Hanson et al., 1989). The position of cosHcol.11 within the class II region is shown in Fig. 2 and a detailed restriction site map is shown in Fig. 3. Cosmid HPB.ALL 71, encompassing cluster 2, was isolated by screening the library with the distal genomic fragment from the jumping clone λj2 (see Results).

Isolation of Jumping Clones

The jumping library was constructed as described by Poustka and Lehrach (1988) from human genomic DNA cleaved with BssHII and recut with BamHI and HindIII. The λ phage jumping recombinants were plated out as temperature-sensitive lysogens in Escherichia coli MC1061/p3. Phage DNA was prepared from positive colonies according to standard protocols (Ausubel et al., 1987) after induction of lytic growth at 42°C.
Isolation of cDNA Clones The cDNA library (a gift from J. Dunne, Lymphocyte Molecular Biology Laboratory, ICRF) was constructed from the T-cell line CEM in the plasmid vector CDM8. Recombinants (5 X 10’) were plated out on nylon filters and screened in the same way as the cosmid library.
FIG. 1. Southern hybridization analysis of PFGE blots with MHC class II probes. DNA from the MHC homozygous cell line PGF was cut with BssHII (B), MluI (M), and NotI (N) in single and double digest combinations and resolved by PFGE. The autoradiographs show the results of sequential hybridization of the same filter with probes for the MHC class II genes HLA-DPB1, HLA-DNA, and HLA-DOB. Size markers are bacteriophage λ concatemers. Similar data were obtained using other class II gene probes as described in the text and were used to construct the map shown in Fig. 2.
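The logic of reading single and double digests, as in Fig. 1, can be illustrated with a minimal Python sketch. It assumes complete digestion of a linear region and uses invented site coordinates; the function name `fragment_around` and every number below are hypothetical and are not taken from the paper's data.

```python
def fragment_around(probe_kb, region_kb, *site_lists):
    """Return (start, end) of the fragment containing the probe after cutting
    a linear region at every site in the supplied lists (complete digestion)."""
    cuts = sorted({s for sites in site_lists for s in sites if 0 < s < region_kb})
    bounds = [0] + cuts + [region_kb]
    for start, end in zip(bounds, bounds[1:]):
        if start <= probe_kb < end:
            return start, end
    raise ValueError("probe lies outside the region")


# Hypothetical coordinates (kb), chosen only for illustration.
REGION = 1000
BSSHII = [100, 350, 560]
NOTI = [150, 650]
PROBE = 200

single_b = fragment_around(PROBE, REGION, BSSHII)          # BssHII alone
single_n = fragment_around(PROBE, REGION, NOTI)            # NotI alone
double_bn = fragment_around(PROBE, REGION, BSSHII, NOTI)   # BssHII + NotI

for label, (start, end) in [("BssHII", single_b), ("NotI", single_n),
                            ("BssHII + NotI", double_bn)]:
    print(f"{label}: {end - start} kb fragment")
```

In this toy case the double-digest fragment is shorter than either single-digest fragment, which places a site for each enzyme inside the other enzyme's fragment. This is the kind of inference from which a restriction map can be assembled.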
RESULTS
Clusters of Sites for Rare-Cutter Enzymes Identified by PFGE

PFGE blots prepared from human genomic DNA cleaved with the rare-cutter enzymes BssHII, EagI, NotI, and MluI in single and double digest combinations were sequentially hybridized with probes from the HLA-DPB1, -DPA1, -DNA, -DOB, -DQA2, -DRB1, and -DRA genes. Typical results are shown in Fig. 1. Fragment sizes obtained with each probe are listed in Table 1. These PFGE data were combined with previously published information concerning the relative positions of the genes within the individual class II subregions to construct the physical map shown in Fig. 2. This map reveals four clusters of rare-cutter sites: one centromeric to the HLA-DP genes, two between HLA-DNA and HLA-DOB, and one between HLA-DQB3 (formerly DVB) and HLA-DQB1. Each cluster contains two or more sites for EagI, BssHII, or NotI, all of which have been predicted from theoretical considerations and shown, in practice, to cut more often in CpG islands than in inter-island DNA (Lindsay and Bird, 1987; Bird, 1990). These clustered sites are therefore likely to mark the positions of genes and were selected as targets for further analysis.

TABLE 1
Sizes of Rare-Cutter Fragments Detected by MHC Class II Probes on PFGE Blots
Probes: DPB1, DPA1, DNA, DOB, DQA2ᵃ (S), DQA2ᵃ (W), DRB, DRA
Fragment size (kb) with BssHII, EagI, MluI, and NotI:
250 250 250 210
250 250 250 210
130 450 450 450
370 370 370 >900
210 500 500 500
210 500 500 500
450 570 570 570
>900 >900 >900 >900
ᵃ Hybridization of PFGE blots with the DQA2 probe revealed two bands in each track: one strong (S), corresponding to the fragment carrying the DQA2 gene, and the other weaker (W), corresponding to the fragment carrying the cross-hybridizing DQA1 gene.
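The cluster criterion used above (two or more sites for EagI, BssHII, or NotI lying close together) can also be applied to a known DNA sequence. The Python sketch below is only an illustration of that idea, not the PFGE-based procedure used in this study. The recognition sequences are the standard ones for these enzymes; the 2-kb grouping window (`max_gap`) is an assumed value motivated by the 1-2 kb size of a typical CpG island, and the function names and test sequence are hypothetical.

```python
import re

# Standard recognition sequences; each contains at least one CpG dinucleotide.
RARE_CUTTERS = {"BssHII": "GCGCGC", "EagI": "CGGCCG", "NotI": "GCGGCCGC"}


def find_sites(seq, enzymes=RARE_CUTTERS):
    """Return a sorted list of (position, enzyme) for every recognition site."""
    seq = seq.upper()
    hits = []
    for name, motif in enzymes.items():
        # A lookahead pattern also reports overlapping occurrences.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start(), name))
    return sorted(hits)


def cluster_sites(hits, max_gap=2000, min_sites=2):
    """Group sites separated by at most max_gap bp; keep groups of min_sites or more."""
    clusters, current = [], []
    for pos, name in hits:
        if current and pos - current[-1][0] > max_gap:
            if len(current) >= min_sites:
                clusters.append(current)
            current = []
        current.append((pos, name))
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters


# Hypothetical test sequence: AT-rich filler with three sites near the start
# and one isolated EagI site about 4 kb further on.
toy = "AT" * 50 + "GCGCGC" + "AT" * 10 + "GCGGCCGC" + "AT" * 2000 + "CGGCCG" + "AT" * 50
sites = find_sites(toy)
clusters = cluster_sites(sites)
print(sites)
print(clusters)
```

Every NotI site also contains an EagI site (CGGCCG lies inside GCGGCCGC), so a single NotI site is reported twice by this sketch. A sequence-based scan of this kind says nothing about methylation, which in the study had to be determined experimentally.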
Cosmid Walking and Chromosome Jumping

To clone cluster 1 we extended a previously published cosmid walk that extends centromeric from the HLA-DP subregion to encompass the COL11A2 gene (Hanson et al., 1989). A probe was obtained from the proximal end of the cosmid cosHcol.11, the most centromeric clone in this walk, and used to obtain new overlapping clones from the cosmid library (Figs. 2 and 3). These cosmids, HPB.ALL 25, 31, 33, and 42, were mapped with the rare-cutter enzymes used to construct the PFGE map. The region covered by these clones was shown to contain one BssHII site, one NotI site, one MluI site, and five EagI sites (Fig. 3). To determine the methylation status of the NotI, BssHII, and MluI sites in genomic DNA, PFGE blots were sequentially hybridized with genomic fragments mapping to the distal and proximal sides of these sites. The two probes hybridized to different NotI,
FIG. 2. Physical map of the class II region. The map was constructed from PFGE data using HLA-DPB1, -DPA1, -DNA, -DOB, -DQA2, -DRB1, and -DRA probes, in combination with previously published results describing the relative positions of genes within subregions. The position of the COL11A2 gene is from Hanson et al. (9); the positions of HLA-DPA2 and -DPB2 are from Trowsdale et al. (26); the positions of HLA-DQB1, -DQB2, and -DQB3 are from Blanck and Strominger (4); and the relative orientation of HLA-DRA to the -DRB genes is from Hardy et al. (10). The positions of the four clusters of rare-cutter sites are indicated at the top. Also shown are the positions of cosmids that enabled the cloning of these clusters as described in the text. CosHcol.11 is the most centromeric clone in a previously described walk (9), which was extended in this study to encompass cluster 1 by the isolation of clones HPB.ALL 25, HPB.ALL 31, HPB.ALL 33, and HPB.ALL 42. (Only the positions of clones 31 and 42 are shown on this map; see Fig. 3 for further details.) Cosmid HPB.ALL 71, encompassing cluster 2, was isolated by chromosome jumping from cluster 1. Cosmid U15, encompassing cluster 3, is the most centromeric clone in the cosmid walk of Blanck and Strominger (4). Scale bar, 100 kb.
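The inter-gene distances quoted in the Results for this map can be cross-checked with a few lines of arithmetic. The Python sketch below places the loci on a single axis using only distances stated in the text (HLA-DNA 75 kb from HLA-DPA1; D6S113E 35 kb telomeric of HLA-DNA; HLA-DOB 160 kb from HLA-DNA; D6S114E 25 kb centromeric of HLA-DOB). The choice of HLA-DPA1 as origin, the sign convention (positive toward the telomere), the assumption that HLA-DOB lies telomeric of HLA-DNA as drawn in the map, and the helper name `place` are all choices made for illustration.

```python
# (reference locus, placed locus, offset in kb; positive = toward the telomere)
RELATIONS = [
    ("HLA-DPA1", "HLA-DNA", 75),
    ("HLA-DNA", "D6S113E", 35),
    ("HLA-DNA", "HLA-DOB", 160),   # assumes DOB is telomeric of DNA
    ("HLA-DOB", "D6S114E", -25),
]


def place(relations, origin):
    """Assign a coordinate to each locus by chaining the pairwise offsets."""
    positions = {origin: 0}
    for reference, locus, offset in relations:
        positions[locus] = positions[reference] + offset
    return positions


positions = place(RELATIONS, "HLA-DPA1")
order = sorted(positions, key=positions.get)
for locus in order:
    print(f"{locus:9s} {positions[locus]:4d} kb")
```

Under these assumptions D6S113E falls 110 kb from HLA-DPA1, which agrees with the figure given independently in the text, and both D6S113E and D6S114E fall between HLA-DNA and HLA-DOB.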
BssHII, and MluI fragments, showing that these sites are cut, and are therefore unmethylated, in genomic DNA and that they must correspond to the sites in cluster 1. To determine the methylation status of the EagI sites, genomic (PGF) DNA and cosmid (HPB.ALL 42) DNA were digested with EagI and resolved by conventional agarose gel electrophoresis. Blots of these gels were hybridized with cosmid fragments mapping between the EagI sites and the sizes of the EagI fragments obtained in genomic or cosmid DNA were compared to determine which sites were cleaved in the genomic DNA. In this way it was shown that all but one of the EagI sites in cluster 1 were cut in DNA from the cell line PGF, and are therefore unmethylated in the genome (Fig. 4A).

The fact that the unmethylated rare-cutter sites in cluster 1 are spread out over 20 kb, whereas a single CpG island typically spans 1-2 kb, suggested that there may be more than one gene in this region. To test this, we isolated cosmid fragments close to the unmethylated rare-cutter sites as shown in Fig. 4 and used these to probe Northern blots and zoo blots to obtain evidence for transcribed sequences, as described below.

The cloning of the unmethylated rare-cutter sites in cluster 1 made feasible the use of rare-cutter jumping libraries to clone adjacent clusters (Poustka and
FIG. 3. Restriction maps of overlapping cosmid clones extending centromeric of the COL11A2 gene and encompassing the region predicted from the PFGE map to contain cluster 1. The positions of rare-cutter restriction enzyme sites (BssHII, EagI, MluI, and NotI) in this region are shown. The position of the 7.5-kb EcoRI fragment from cosHcol.11 used to screen the cosmid library is also indicated (WP). Genes are represented by black rectangles, with arrows showing the direction of transcription.
FIG. 4. Maps of the rare-cutter sites in clusters 1, 2, and 3. The positions of rare-cutter sites in cosmid clones HPB.ALL 42 (cluster 1, A), HPB.ALL 71 (cluster 2, B), and U15 (cluster 3, C) are shown. B, BssHII; M, MluI; E, EagI; N, NotI. The methylation status of each site in genomic DNA is also indicated: o, site unmethylated; x, site methylated; -, not tested. The positions of the cosmid fragments 33X1, 31K1, 71Bs3, and U15KN used to probe the cDNA library filters are shown. The position of the cluster 1 cosmid fragment jBK used to probe the jumping library is also indicated. Boxed regions show the positions of the four genes, D6S111-4E, as defined by mapping the cDNA clones back onto the cosmids. The maps are oriented with the centromere at left.
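The comparison used in the text to score each EagI site (fragment sizes in cloned cosmid DNA, where every site is cut, versus fragment sizes in genomic DNA, where methylated sites are not cut) can be sketched in Python as follows. The site coordinates, the probe position, and the 0.2-kb size tolerance are hypothetical values for illustration only, and the function names are not from the paper.

```python
from bisect import bisect_right


def cloned_fragment(sites_kb, probe_kb):
    """Fragment expected around the probe when every mapped site is cut,
    as in unmethylated cosmid DNA. Returns (left_site, right_site)."""
    sites = sorted(sites_kb)
    i = bisect_right(sites, probe_kb)
    if i == 0 or i == len(sites):
        raise ValueError("probe must lie between two mapped sites")
    return sites[i - 1], sites[i]


def flanking_sites_cut(sites_kb, probe_kb, genomic_size_kb, tolerance_kb=0.2):
    """True if the genomic fragment matches the cosmid fragment, i.e. both
    sites flanking the probe are cut (unmethylated) in genomic DNA."""
    left, right = cloned_fragment(sites_kb, probe_kb)
    return abs(genomic_size_kb - (right - left)) <= tolerance_kb


# Hypothetical EagI map (kb) and a probe lying between the second and third sites.
EAGI_SITES = [2.0, 5.5, 9.0, 14.0]
PROBE = 7.0

same_size = flanking_sites_cut(EAGI_SITES, PROBE, genomic_size_kb=3.5)
larger = flanking_sites_cut(EAGI_SITES, PROBE, genomic_size_kb=8.5)
print(same_size, larger)
```

A genomic fragment larger than the cosmid fragment indicates that at least one flanking site is uncut in the genome; deciding which one requires a second probe on the other side of the site, as was done with the distal and proximal probes described in the text.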
Lehrach, 1988). In this technique, the two ends of large genomic DNA fragments generated by cleavage with a rare-cutter enzyme are co-cloned, so that a probe next to a site for that rare-cutter enzyme in the genome can be used to obtain cloned DNA at an adjacent site for the same enzyme, which may be several hundred kilobases away. Cluster 2 was successfully cloned in this way by jumping from cluster 1. A genomic fragment (jBK) adjacent to, and on the distal side of, the BssHII site in cluster 1 (see Fig. 4A) was used to probe a jumping library constructed from human DNA cut with BssHII. A positive clone (λj2) was mapped with appropriate restriction enzymes and shown to contain two genomic inserts as expected. One of these inserts was shown by hybridization back to DNA from the cosmid HPB.ALL 31 to be derived from the starting probe in cluster 1. The other insert was shown by PFGE to map to cluster 2 (data not shown). The fragment mapping to cluster 2 was used to isolate a new clone, HPB.ALL 71, from the cosmid library (Fig. 2). Cosmid HPB.ALL 71 was mapped with rare-cutter enzymes and was shown to contain five BssHII sites and two EagI sites (Fig. 4B). Hybridization of probes from this region to Southern blots of genomic DNA cut with BssHII and EagI and resolved by conventional agarose gel electrophoresis revealed that at least four of these sites were unmethylated in the genome (Fig. 4B). Again, fragments from this cluster were used to search for potential coding sequences on zoo blots, RNA blots, and cDNA libraries.

To clone cluster 3 we took advantage of a previously established cosmid walk that spans 280 kb and links the HLA-DOB gene to the DQ subregion (Blanck and Strominger, 1988). These cosmid clones extend 60 kb centromeric of HLA-DOB. Since our PFGE map suggested that cluster 3 was just centromeric of the HLA-DOB gene, we mapped the cosmids that extend proximally from HLA-DOB with the rare-cutter enzymes NotI, BssHII, and EagI. Cosmid U15, the most proximal clone in the cosmid walk (position shown in Fig. 2), contained two clusters of rare-cutter sites mapping about 12 kb apart (Fig. 4C). The two NotI sites and the two BssHII sites were shown to be unmethylated in genomic DNA. A fragment containing the NotI site closest to HLA-DOB was isolated from U15 and used to screen the cDNA library (Fig. 4C).

Identification of Potential Coding Sequences

Transcribed DNA sequences are more highly conserved during evolution than noncoding regions, and it therefore follows that genomic fragments that cross-hybridize with the genomic DNA of other species may indicate the presence of genes (Monaco et al., 1986). Restriction fragments mapping close to the unmethylated rare-cutter enzyme sites identified above were isolated from the cosmid clones and probed onto zoo blots of EcoRI-cut genomic DNA from different vertebrate species. This approach revealed the presence of conserved sequences in clusters 1 and 2. The probes used are shown in Figs. 4A and 4B and the results are summarized in Table 2. (Fragments from cluster 3 were not tested on zoo blots.) The genomic fragments that detected conserved sequences on zoo blots were then probed onto Northern blots of RNA from a range of different cell lines to test whether they detected transcribed sequences. All three of the conserved fragments gave signals on
TABLE 2
Cross-Hybridization of Probes from Clusters 1 and 2 to Genomic DNA from Other Species
Probes: 33X1, 31K1, 71Bs3
Species: Rhesus monkey, Pig, Rat, Mouse, Whale, Chicken
+ + + + + -
+ + -
+ + -
+ + -
+ + -
Note. (+) Cross-hybridization detected; (-) no cross-hybridization detected. The probes used were cosmid fragments as shown in Fig. 4. Zoo blots were washed to a final stringency of 2× SSC, 0.1% SDS, at 65°C.

Northern blots (see below for expression data). These fragments were then used to probe the cDNA library.

Isolation of cDNA Clones

A T-cell cDNA library was probed with the genomic fragments from cluster 1 (33X1, 31K1) and cluster 2 (71Bs3) which were positive on zoo and Northern blots, and a fragment from cluster 3 (U15KN) which contained an unmethylated NotI site (Fig. 4). cDNA clones corresponding to each of these four probes were isolated. The insert from each cDNA was hybridized to PFGE blots to prove that the clones mapped back to the class II region and to show that they did not detect any related sequences elsewhere in the genome. The four cDNA probes were also hybridized back onto the relevant cosmid clones to define the positions of the four cognate genes (Fig. 4). We have designated these genes RING1-4 and their official symbols are D6S111E (RING1), D6S112E (RING2), D6S113E (RING3), and D6S114E (RING4). The D6S111E and D6S112E genes, in cluster 1, are respectively 95 and 90 kb proximal to the HLA-DPB2 gene. D6S113E, in cluster 2, is 110 kb distal of HLA-DPA1. D6S114E, in cluster 3, is 25 kb proximal of HLA-DOB.

Northern blots hybridized with cDNA probes from each of the genes are shown in Fig. 5. The D6S111E probe detected a 1.6-kb RNA species that was expressed in all cell lines tested but at a significantly higher level in T-lymphoblastoid cells. The D6S112E probe detected a 1.1-kb transcript that was expressed in T- and B-lymphoblastoid cell lines (LCL). The D6S113E probe detected two large transcripts, a major species of about 3.5 kb and a less abundant species of about 4.5 kb, in all cell lines tested. The D6S114E probe detected a transcript of between 2.5 and 3.0 kb which was expressed at high levels in the B-LCL tested and at lower levels in the T-LCL. No D6S114E expression was detectable in resting colon carcinoma cell lines, but the transcript was strongly induced in γ-interferon-treated cells. In addition, D6S114E expression was both γ- and α-interferon inducible in cells of the fibrosarcoma line HT1080 (J. John and G. Stark, personal communication). A summary of the Northern blot data is shown in Table 3. Limited sequencing of each of the cDNA clones indicated that none was related to the classical class II genes (data not shown).

Refinement of the Physical Map of the Class II Region
Following the cloning of cluster 2, we tested whether this region fell within a recently described cosmid walk around the HLA-DNA gene (Blanck and Strominger, 1990). A 3.9-kb BssHII fragment from the cluster 2 cosmid HPB.ALL 71 was found to hybridize to the overlapping clones 027, U22, and HA14 from this cosmid walk. From the published map, this cross-hybridizing region is 35 kb from the HLA-DNA gene. When these cosmids were mapped with BssHII, it was found that they contained the same pattern of rare-cutter sites as HPB.ALL 71, indicating that cluster 2 is indeed contained within the cosmid walk. Since PFGE data show that HLA-DNA is on the centromeric side of cluster 2 (Fig. 2), it follows that HLA-DNA is 35 kb centromeric to D6S113E and that the cosmid walk of Blanck and Strominger (1990) is oriented within the class II region such that the HLA-DNA gene has the same transcriptional orientation as HLA-DPA1, with the 5' end toward the centromere and the 3' end toward the telomere. From our PFGE map we estimate that HLA-DNA is 75 kb from HLA-DPA1.

The discovery of cluster 3 rare-cutter sites in the cosmid U15, which maps just centromeric to HLA-DOB, shows that HLA-DOB is 25 kb telomeric of D6S114E, and anchors the cosmid walk of Blanck and Strominger (1988) within the class II region. According to our physical map, HLA-DOB is 160 kb away from HLA-DNA.

DISCUSSION
We have described the cloning of four novel genes in and around the class II region of the human MHC. D6S111E and D6S112E map centromeric of the HLA-DP genes; D6S113E is 35 kb telomeric of HLA-DNA, and D6S114E maps 25 kb centromeric of HLA-DOB. Preliminary nucleotide sequence data reveal that these genes are not related to the class II genes or to one another. The class II region, which is thought to have evolved through a series of gene duplication events, was not previously known to contain any
FIG. 5. Northern analyses of D6S111-4E. The autoradiographs show the results of hybridizing blots of poly(A)+ (first three panels) or total (right-hand panel) RNA from a range of cell lines with cDNA probes from the genes D6S111-4E (see Methods). In the first three panels, T is T-LCL (Molt4), B is B-LCL (Mann), E is erythroleukemia (K562), and M is macrophage (U937). In the right-hand panel, B is B-LCL (Raji), T is T-LCL (Molt4), and 1, 2, and 3 are colon carcinoma lines SW1222, SW620, and CC20, respectively, in the absence (-) or presence (+) of γ-interferon. Approximate transcript sizes are shown in kilobases.
genes other than the classical class II genes and pseudogenes. It is of great interest to find non-class II genes interspersed among the known sequences, not only from the point of view of the evolution of the region, but also because of the large number of diseases known to be associated with the MHC. To test the possible role of the new genes in MHC-associated diseases, we are determining the complete sequence of each of the cDNA clones with the aim of finding clues to the function of the encoded proteins. We are also searching for polymorphisms in the new genes with which the disease association studies could be extended.

The finding of new genes in the class II region is also of interest given the recent report of a mutant cell line that is defective in antigen presentation by class I molecules but which has a deletion of the region between the HLA-DP genes and the complement genes (Cerundolo et al., 1990). Indeed, the nucleotide sequence of the interferon-inducible gene D6S114E (RING4), which maps into the region deleted in the mutant cell line, indicates that this gene encodes a protein in the "ABC" superfamily of transporters (Trowsdale et al., 1990). Other members of this family include bacterial peptide transporters, which raises the intriguing possibility that the D6S114E (RING4) gene product transports peptides from the cytoplasm to the lumen of the endoplasmic reticulum, where binding to class I molecules is thought to occur (Cerundolo et al., 1990).

The zoo blot data presented in Table 2 suggest that sequences homologous to D6S111-3E are present in the DNA of other vertebrates. We have also obtained evidence that sequences homologous to D6S111-4E are present in the same relative positions in the mouse MHC class II region (I. Hanson and J. Trowsdale, manuscript submitted).

The strategy of using PFGE to identify clusters of unmethylated rare-cutter sites followed by the cloning and analysis of these clusters has proved highly successful in the identification of new genes in the human class II region. As shown in Fig. 4, the three regions we have characterized contain several unmethylated CpG sites spread over several kilobases of DNA and most likely encode additional genes. We are also analyzing the cluster of rare-cutter enzyme sites that maps between HLA-DQB3 and HLA-DQB1 (cluster 4 in Fig. 2). Not all genes, however, are associated with CpG islands and there could be several other coding sequences in the uncharacterized stretches of DNA between the clusters of rare-cutter sites. We have evidence for a fifth novel gene in the region centromeric of the DP subregion.

TABLE 3
Summary of Expression Patterns of Novel Genes
Genes: D6S111E, D6S112E, D6S113E, D6S114E
Columns: RNA approx size (kb), T, B, M, H, E, IFN-γ, IFN-α
1.6 1.1 3.5 and 4.5 2.8
++ + + +
+ + +
+ + NT
+ NT + NT
+ + NT
NT NT -
NT NT NT +
+
Note. cDNA probes were used in each case. (+) Expression detected; (-) no expression detected. NT, not tested; T, T-lymphoblastoid cell line; B, B-lymphoblastoid cell line; M, macrophage cell line; H, HeLa; E, erythroleukemic cell line; IFN-γ, γ-interferon inducible; IFN-α, α-interferon inducible. For further details of cell lines used, see text and legend to Fig. 5.

ACKNOWLEDGMENTS

We thank Jenny Dunne for the CEM cDNA library, George Blanck for the generous gift of human class II region cosmids, Pat Miller for laboratory services, and Adrian Kelly and Ruth Lovering for advice and assistance with Northern blots.
REFERENCES

1. AUSUBEL, F. M., BRENT, R., KINGSTON, R. E., MOORE, D. D., SMITH, J. A., SEIDMAN, J. G., AND STRUHL, K. (1987). "Current Protocols in Molecular Biology," Wiley Interscience, New York.
2. BIRD, A. P. (1987). CpG islands as gene markers in the vertebrate nucleus. Trends Genet. 3: 342-347.
3. BIRD, A. P. (1990). Two classes of observed frequency for rare-cutter sites in CpG islands. Nucleic Acids Res. 17: 9485.
4. BLANCK, G., AND STROMINGER, J. L. (1988). Molecular organization of the DQ subregion (DO-DX-DV-DQ) of the human MHC and its evolutionary implications. J. Immunol. 141: 1734-1737.
5. BLANCK, G., AND STROMINGER, J. L. (1990). Cosmid clones in the HLA-DZ and -DP subregions. Hum. Immunol. 27: 265-268.
6. BROWN, W. R. A., AND BIRD, A. P. (1986). Long range restriction site mapping of mammalian genomic DNA. Nature 322: 477-481.
7. CERUNDOLO, V., ALEXANDER, J., ANDERSON, K., LAMB, C., CRESSWELL, P., MCMICHAEL, A., GOTCH, F., AND TOWNSEND, A. (1990). Presentation of viral antigen controlled by a gene in the major histocompatibility complex. Nature 345: 449-452.
8. FEINBERG, A. P., AND VOGELSTEIN, B. (1983). A technique for radiolabeling DNA restriction endonuclease fragments to high specific activity. Anal. Biochem. 132: 6-13.
9. HANSON, I. M., GORMAN, P., LUI, V. C. H., CHEAH, K. S. E., SOLOMON, E., AND TROWSDALE, J. (1989). The human α2(XI) collagen gene (COL11A2) maps to the centromeric border of the major histocompatibility complex on chromosome 6. Genomics 5: 925-931.
10. HARDY, D. A., BELL, J. I., LONG, E. O., LINDSTEN, T., AND MCDEVITT, H. O. (1986). Mapping of the class II region of the human major histocompatibility complex by pulsed field gel electrophoresis. Nature 323: 453-455.
11. LEVI-STRAUSS, M., CARROLL, M. C., STEINMETZ, M., AND MEO, T. (1988). A previously undetected MHC gene with an unusual periodic structure. Science 240: 201-204.
12. LINDSAY, S., AND BIRD, A. P. (1987). Use of restriction enzymes to detect potential gene sequences in mammalian DNA. Nature 327: 336-338.
13. MARCADET, A., DUPONT, B., AND COHEN, D. (1989). Organization and design of the Southern blot component of the tenth histocompatibility workshop. In "Immunobiology of HLA" (B. Dupont, Ed.), Vol. 1, pp. 560-566, Springer-Verlag, New York.
14. MONACO, A. P., NEVE, R. L., COLLETTI-FEENER, C., BERTELSON, C. J., KURNIT, D. M., AND KUNKEL, L. M. (1986). Isolation of candidate cDNAs for the Duchenne muscular dystrophy gene. Nature 323: 646-650.
15. POUSTKA, A., AND LEHRACH, H. (1988). Chromosome jumping: A long range cloning technique. In "Genetic Engineering" (J. K. Setlow, Ed.), Vol. 10, pp. 169-193, Plenum Press, New York.
16. SARGENT, C. A., DUNHAM, I., TROWSDALE, J., AND CAMPBELL, R. D. (1989a). Human major histocompatibility complex contains genes for the major heat shock protein HSP-70. Proc. Natl. Acad. Sci. USA 86: 1968-1972.
17. SARGENT, C. A., DUNHAM, I., AND CAMPBELL, R. D. (1989b). Identification of multiple HTF-associated genes in the human major histocompatibility complex class III region. EMBO J. 8: 2305-2312.
18. SPENCE, M. A., SPURR, N. K., AND FIELD, L. L. (1989). Report of the committee on the genetic constitution of chromosome 6. Cytogenet. Cell Genet. 51: 149-165.
19. SPIES, T., BLANCK, G., BRESNAHAN, M., SANDS, J., AND STROMINGER, J. L. (1989a). A new cluster of genes within the human major histocompatibility complex. Science 243: 214-217.
20. SPIES, T., BRESNAHAN, M., AND STROMINGER, J. L. (1989b). Human major histocompatibility complex contains a minimum of 19 genes between the complement cluster and HLA-B. Proc. Natl. Acad. Sci. USA 86: 8955-8958.
21. STRACHAN, T. (1987). Genetics and polymorphism: Class I antigens. Brit. Med. Bull. 43: 1-14.
22. TIWARI, J. L., AND TERASAKI, P. I. (1985). "HLA and Disease Associations," Springer-Verlag, New York.
23. TODD, J. A., ACHA-ORBEA, H., BELL, J. I., CHAO, N., FRONEK, Z., JACOB, C. O., MCDERMOTT, M., SINHA, A. A., TIMMERMAN, L., STEINMAN, L., AND MCDEVITT, H. O. (1988). A molecular basis for MHC class II-associated autoimmunity. Science 240: 1003-1009.
24. TROWSDALE, J. (1987). Genetics and polymorphism: Class II antigens. Brit. Med. Bull. 43: 15-36.
25. TROWSDALE, J., AND KELLY, A. (1985). The human HLA class II α chain gene DZα is distinct from genes in the DP, DQ and DR subregions. EMBO J. 4: 2231-2237.
26. TROWSDALE, J., YOUNG, J. A. T., KELLY, A. P., AUSTIN, P. J., CARSON, S., MEUNIER, H., SO, A., ERLICH, H. A., SPIELMAN, R. S., BODMER, J., AND BODMER, W. F. (1985). Structure, sequence and polymorphism in the HLA-D region. Immunol. Rev. 85: 5-43.
27. TROWSDALE, J., AND CAMPBELL, R. D. (1988). Physical map of the human HLA region. Immunol. Today 9: 34-35.
28. TROWSDALE, J., HANSON, I., MOCKRIDGE, I., BECK, S., TOWNSEND, A., AND KELLY, A. (1990). Sequences encoded in the class II region of the MHC related to the 'ABC' superfamily of transporters. Nature 348: 741-744.