The structural basis of the multiple forms of human complement component C4


Cell, Vol. 36, 907-914, April 1984, Copyright © 1984 by MIT 0092-8674/84/040907-08 $02.00/0


The Structural Basis of the Multiple Forms of Human Complement Component C4

K. Tertia Belt, Michael C. Carroll, and Rodney R. Porter
MRC Immunochemistry Unit, Department of Biochemistry, Oxford University, Oxford OX1 3QU, England

Summary

cDNA clones of human complement component C4A and C4B alleles were prepared from mRNA obtained from the liver of a donor heterozygous at both loci. cDNA from one C4A allele was sequenced to give the derived complete amino acid sequence of 1722 amino acid residues of the C4 single chain precursor molecule and the estimated sequences of the three peptide chains of secreted C4. Comparison with partial sequences of a second C4A allele and a C4B allele has led to the tentative identification of some class differences in nucleotide sequences between C4A and C4B and of allelic differences between C4A alleles in this highly polymorphic system.

Introduction

Three components of the complement system, C2, C4, and factor B, are coded by genes in the major histocompatibility complex (MHC) of man and mouse (for review see Porter, 1983a). All are polymorphic, and in man the genes have been mapped between HLA-B and HLA-D, possibly closer to HLA-B (Barnstable et al., 1979; Olaisen et al., 1983). Restriction maps of overlapping cosmid clones have shown that the genes of C2 and factor B are less than 2 kb apart and are separated by about 30 kb from two C4 genes, which are themselves separated by about 10 kb (Carroll et al., 1984). C4 is exceptionally polymorphic with at least two loci, C4A and C4B, and so far 13 alleles of C4A and 22 of C4B have been detected (Mauff et al., 1983a), including a null allele at each locus. Allelic differences are recognized by electrophoresis, after removal of sialic acid with neuraminidase, and development of the gel with anti-C4 antiserum (Awdeh and Alper, 1980). C4A and C4B can also be distinguished by specific antisera, anti-Rodgers and anti-Chido respectively (O’Neill et al., 1978a), but the correlation is not absolute (Rittner et al., 1983). The polymorphism of C4 appears to be substantially higher than that of the adjacent complement genes of C2 and factor B but probably less than that of the HLA Class I and Class II antigens.
The biological significance of this high degree of polymorphism is unknown, but one suggestion is that it may relate to the role of C4 in forming a covalent bond, during activation, with a wide range of antigenic structures in many different pathogens (Porter, 1983b). This section of the genome, HLA-B to HLA-D, is of considerable clinical interest because susceptibility to a number of diseases, generally autoimmune in character, correlates with the presence of particular haplotypes in this region (Barnstable et al., 1979).

Resolution of the many forms of C4 is of both theoretical and practical importance, but typing of C4 in blood by electrophoresis is difficult because of the complexity, which is increased because each allele shows two minor bands as well as one major band. In individuals heterozygous at both loci, separation of all four forms of C4 present is often incomplete. Typing by comparison of the length of restriction enzyme digest fragments that hybridize with C4 cDNA probes has been attempted. Palsdottir et al. (1983) found a good correlation between a restriction enzyme length polymorphism and the presence of the C4A 6 allele, but Whitehead et al. (1984) showed that individuals typed as homozygous C4A 3, B 1 showed two different restriction enzyme length polymorphisms, indicating further subdivision of C4 types not recognized by electrophoresis. It is apparent that full resolution of the many different forms of C4 can only be achieved by study of their complete amino acid sequences, and a start has been made by sequencing the cDNA of a C4A mRNA.

C4 is present in serum as a three polypeptide chain protein, α chain Mr 95,000, β Mr 75,000, and γ Mr 30,000, but it is synthesized as a single chain precursor of about Mr 200,000 (Hall and Colten, 1977; Roos et al., 1978) (Figure 1). Most of the differences between C4 alleles have been shown to be in the α chain (Roos et al., 1982), particularly in the degradation product C4d of about 40,000 Mr, which comes from the center of the α chain (Figure 1) (Tilley et al., 1978; Mevag et al., 1981). The only amino acid sequence differences reported so far in C4 are also found in the C4d fragment (Chakravarti et al., 1983; Lundwall et al., 1981). In order to identify specific differences between C4A and C4B, the cDNA of the C4d section of a second C4A allele and of a C4B allele have also been sequenced. Differences in nucleotide sequence have been found and have tentatively been identified as class differences between C4A and C4B and as allelic differences between the two C4A alleles.

Results and Discussion

The Structure of C4

C4 is synthesized as a single peptide chain to give a secreted three chain protein (Figure 1) of about 200,000 Mr, of which 7% is carbohydrate (Gigli et al., 1977). It is expected to contain about 1700 amino acid residues and to have an mRNA of about 5.5 kb. Two cDNA libraries were

Figure 1. Diagram of pro C4, which is split before secretion into three chains; α 95,000 Mr, β 75,000 Mr, and γ 30,000 Mr are contained in the original molecule in the order shown. When C4 is activated by C1, C4a, 7,000 Mr, is split from the N terminal of the α chain. When C4b is inactivated by factor I and C4bp, the C4d fragment is released.

Figure 2. The cDNA and Deduced Amino Acid Sequence of pro C4 Showing the N Terminal Ends of the β, α, and γ Chains and the C Terminal Ends of the β and γ Chains

Also shown is the position of the thioester bond CGEQ and the N and C terminal ends of the C4d section (in brackets). Amino acid residues underlined were determined by amino acid sequencing (Gigli et al., 1977; Moon et al., 1981; Press and Gagnon, 1981; Campbell et al., 1981; Chakravarti et al., 1983; Law and Gagnon, unpublished data; Chakravarti, Campbell, and Gagnon, unpublished data). The seven nucleotides underlined were deleted in the pAT-A sequence relative to the pAT-42 and pAT-F sequences and also relative to the known amino acid sequence.

constructed selecting for higher molecular weight fractions. cDNA prepared from the 28S fraction of total liver RNA, shown previously to be enriched for C4 mRNA (Carroll and Porter, 1983), was fractionated on a sucrose gradient. Libraries were prepared from DNA of 2-4 kb (II) and greater than 4 kb (I). Two complexities of each library were screened using the C4d specific cDNA probe, pAlu-7 (Carroll and Porter, 1983). Both libraries gave about one positive clone per thousand colonies, and the inserts sized on agarose gel ranged from 2 to 5.5 kb. Recombinant plasmid pAT-A, with an insert of 5.5 kb, came from library I, while the recombinants pAT-F and pAT-42, with inserts of 4.8 and 2.5 kb, respectively, were isolated from library II. These three plasmids were purified for further analysis. To confirm that the 5.5 kb insert of pAT-A contained the whole of the coding sequence for C4, the two ends were sequenced by the Maxam and Gilbert method, making use of the unique Cla I and Bam HI restriction enzyme sites in the plasmid vector. The 3’ end showed a polyadenylation signal and a poly(A) tail. The 5’ end nucleotide sequence included bases coding for an amino acid sequence corresponding to that of the N terminal end of the β chain (Gigli et al., 1977), which is also the N terminal end of pro C4 (Figure 1). This 5.5 kb cDNA clone was sequenced by the random dideoxy technique. Figure 2 shows the results together with the derived protein sequence.
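The size figures quoted for C4 can be cross-checked with simple arithmetic. The sketch below is illustrative only: the mean residue mass of about 110 Da is a common rule of thumb and an assumption here, not a value from the paper, and the function names are hypothetical.

```python
# Rough consistency check of the sizes quoted for C4.
# ASSUMPTION: an average amino acid residue weighs about 110 Da
# (a rule-of-thumb value, not taken from the paper).
MEAN_RESIDUE_MASS = 110.0


def estimate_residues(total_mr, carbohydrate_fraction,
                      residue_mass=MEAN_RESIDUE_MASS):
    """Estimate the residue count from total Mr and carbohydrate fraction."""
    protein_mass = total_mr * (1.0 - carbohydrate_fraction)
    return protein_mass / residue_mass


def coding_length_nt(residues):
    """Nucleotides needed to encode the residues plus one stop codon."""
    return 3 * residues + 3


# About 200,000 Mr with 7% carbohydrate, as quoted in the text.
approx_residues = estimate_residues(200_000, 0.07)
print(round(approx_residues))   # close to the "about 1700" expected

# The derived precursor sequence has 1722 residues.
print(coding_length_nt(1722))   # coding region in nucleotides
```

Under these assumptions the coding region is a little over 5.1 kb, which fits within the 5.5 kb insert once the 3’ noncoding region and poly(A) tail are included.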


Comparison of the known protein sequence (Gigli et al., 1977; Moon et al., 1981; Press and Gagnon, 1981; Campbell et al., 1981; Chakravarti et al., 1983; Chakravarti, Campbell, and Gagnon, unpublished data; Law and Gagnon, unpublished data) with the derived amino acid sequence shows good agreement, with minor differences such as asparagine at residue 708 rather than aspartic acid (Moon et al., 1981) and several differences at positions 1320-1325 with a sequence towards the C terminal end of the α chain (Press and Gagnon, 1981). As discussed below, some differences may arise from the polymorphism of C4. The N terminal ends of the α, β, and γ chains, known from the protein sequences, are shown in Figure 2, as are the C terminal ends of the β and γ chains established by Gagnon and Law (unpublished data). The C terminal end of the α chain has not yet been established, but Domdey et al. (1982) found that the amino acid sequence derived from a cDNA sequence of mouse C3 showed a tetra-arginine peptide preceding the N terminal end of the α chain and, from the known C terminal sequence of the human C3 β chain (Tack et al., 1979), suggested that it was cut out in the proteolytic processing of pro C3. Similarly, Whitehead et al. (1983) and Ogata et al. (1983) found, respectively, that human and mouse C4 cDNA coded for a tetra-arginine sequence before the N terminal of the γ chain. Our results confirm this and show that the tetra-arginine is part of a very basic heptapeptide with six arginine and one asparagine residue. Amino acid sequencing in progress will determine how large a peptide is excised in the processing between the α and γ chains. Immediately before the N terminal of the α chain there is an arg-lys-lys-arg sequence (positions 657-660), but the C terminal sequence of the β chain is lys-thr-thr (Gagnon and Law, unpublished data), showing again that the basic sequence is eliminated. The reactions may be catalyzed by the same endopeptidase of trypsin-like specificity followed by the action of a carboxypeptidase B type exopeptidase, as has been shown to occur in the release of some peptide hormones from precursor molecules (Lazure et al., 1983).

The position of the thioester bond in the α chain is shown in Figure 2. The glutamyl residue is coded as glutamine, as was found in the thioester bond position in mouse C3 (Domdey et al., 1982).

Following the stop codon at the C terminal end of the γ chain, there is a stretch of 140 nucleotides of noncoding region before the poly(A) tail. The polyadenylation signal ATTAAA in positions 5371-5376 occurs 29 nucleotides before the poly(A). This differs from the reported consensus sequence (Proudfoot and Brownlee, 1976) but agrees with that found in factor B mRNA (Campbell and Porter, 1983).

Preceding the β chain, there is a stretch of amino acids in phase with the rest of the sequence that are hydrophobic or uncharged and may be part of a leader peptide, but these follow 19 amino acids more hydrophilic in character. No initiation codon is present, possibly because the loop-back method of cDNA synthesis generates artifacts, but the sequencing of a C4 gene now in progress should clarify the sequence at the 5’ end.
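The two-step processing suggested above (a cut by a trypsin-like endopeptidase at a run of basic residues, followed by carboxypeptidase B type trimming of the exposed basic residues) can be sketched in a few lines. This is a minimal illustration, not a model of the actual enzymes: the function name, the rule of four or more consecutive Arg/Lys residues, and the short test peptide are all assumptions made for the example. Only the lys-thr-thr C terminus and the arg-lys-lys-arg run are taken from the text.

```python
import re


def process_precursor(sequence, min_basic_run=4):
    """Split a precursor at runs of basic residues (R/K).

    Step 1 (endopeptidase, assumed rule): cut on the C terminal side of
    any run of at least `min_basic_run` Arg/Lys residues.
    Step 2 (carboxypeptidase B type trimming): remove Arg/Lys residues
    from the new C terminus. Note that rstrip would also remove a basic
    residue lying directly before the run; that simplification is
    acceptable for this sketch.
    """
    chains = []
    start = 0
    pattern = r"[RK]{%d,}" % min_basic_run
    for match in re.finditer(pattern, sequence):
        fragment = sequence[start:match.end()]
        chains.append(fragment.rstrip("RK"))
        start = match.end()
    chains.append(sequence[start:])
    return chains


# Hypothetical junction: "...KTT" followed by the RKKR run; the residues
# on either side are placeholders chosen for illustration.
toy_junction = "EKTTRKKRNVNFQKA"
print(process_precursor(toy_junction))
```

With this toy input the first product ends in lys-thr-thr and the basic tetrapeptide is absent from both products, which is the outcome the text describes for the β/α junction.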
An unexpected feature of the sequence was the finding of a seven nucleotide deletion (3941-3947) in the pAT-A insert relative to the pAT-F and pAT-42 inserts and to the known protein sequence (Chakravarti et al., 1983), as shown in Figure 2. It may be an artifact arising from a section missed by the reverse transcriptase enzyme, or possibly a correct copy of an mRNA derived from a pseudogene, which would give a truncated pro C4 molecule ending prematurely on translation after nucleotide residue 3956. It could also result from aberrant splicing at the GT signal, nucleotides 3942-3943. The other nucleotide differences found cannot have arisen from sequencing artifacts, as most have been confirmed by independent amino acid sequencing in this and other laboratories.

Comparison of the sequence of C4 with the available sequences of C2 (Bentley and Porter, 1984) and factor B (Campbell and Porter, 1983) shows no homology, nor is there evidence of any internal homology such as has been observed in factor B (Morley and Campbell, 1984). The C4 gene appears to be unrelated to the adjacent genes, although C2 and factor B, which are very close to each other, are only 30 kb from C4 (Carroll et al., 1984) and have a related biological function. As has been noted previously (Reid and Porter, 1981), the structures of C3, C4, and α2M, the serum proteinase inhibitor, are related. All have similar molecular weights (but different peptide chain numbers arising from posttranslational proteolysis), all have intrachain thioester bonds, and the homology between their amino acid sequences becomes more apparent as the sequence data available increase (Wiebauer et al., 1982; Domdey et al., 1982; Sottrup-Jensen et al., 1983). In man, the gene coding for C3 is on chromosome 19 (Whitehead et al., 1982), but the α2M gene has not yet been placed.
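The consequence of a seven nucleotide deletion, as discussed earlier in this section, is a shifted reading frame, since seven is not a multiple of three, and this can lead to premature termination. The sketch below illustrates the effect with a toy sequence. The sequence is hypothetical and is not taken from pAT-A; the helper names are likewise illustrative. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate in frame 0 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete(dna, start, length):
    """Remove `length` nucleotides beginning at 0-based index `start`."""
    return dna[:start] + dna[start + length:]


# Hypothetical 27 nucleotide open reading frame (illustration only).
toy = "GCTGGTCGTGATGATAAGCTGGTGAAC"
mutant = delete(toy, 3, 7)

print(translate(toy))      # full-length toy peptide
print(translate(mutant))   # frame-shifted peptide ending at a new stop codon
```

In this toy case the deletion brings a TGA codon into frame a few codons downstream, which is the kind of early termination the text proposes for a transcript carrying the deletion.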

Polymorphism of C4

Lundwall et al. (1981) were able to distinguish between large homologous tryptic peptides from different forms of C4, approximately equivalent to the C4d fragment (Figure 1), by size and by ability to inhibit anti-Rodgers or anti-Chido antiserum. The tryptic C4d fragment of 30,000 Mr carried the Rodgers antigen and that of 28,000 Mr carried the Chido antigen, i.e., the 30,000 Mr peptide came from a C4A molecule and the 28,000 Mr peptide from a C4B molecule. Subsequent amino acid sequencing (U. Hellman, unpublished data) showed C4A to contain the sequence Asp-Pro-Cys-Pro-Val-Leu-Asp-Arg, and the equivalent sequence in C4B was Asp-Leu-Ser-Pro-Val-Ile-His-Arg. The complete nucleotide sequence of pAT-A has been compared with the sequences of pAT-42 and pAT-F from residue 2906 to 4048, that is, the section corresponding to the C4d fragment of C4. Comparison of the derived amino acid sequences of pAT-A, pAT-42, and pAT-F shows that pAT-A and pAT-42 contain the C4A sequence and pAT-F contains the C4B sequence between residues 1100 and 1107. This therefore defines pAT-A and pAT-42 as C4A clones and pAT-F as a C4B clone. These results provide a structural basis for the earlier observations (Tilley et al., 1978) that the antigenic sites distinguished by the Rodgers and Chido antisera are in the C4d fragment, and the histidine-aspartic acid change may explain the more acidic behavior of C4A compared to C4B. Four other amino acid differences between the C4A and C4B sequences have been found in positions 1157, 1188, 1191, and 1267 (Figure 3), but it is not possible to decide whether all eight changes or only some characterize C4A and C4B, as the data of Hellman et al. do not extend this far. As mentioned in the Introduction, there are many alleles of both C4A and C4B, and some differences may be due to allelic changes.
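The class assignment described above amounts to comparing the eight-residue segment of each derived sequence with the two peptide sequences reported for C4A and C4B. A minimal sketch follows. It assumes that the eight residues map one-to-one onto positions 1100-1107, which the text implies but does not state residue by residue; the function and constant names are illustrative.

```python
# Three-letter to one-letter codes for the residues in the two peptides.
THREE_TO_ONE = {"Asp": "D", "Pro": "P", "Cys": "C", "Val": "V", "Leu": "L",
                "Arg": "R", "Ser": "S", "Ile": "I", "His": "H"}


def to_one_letter(peptide):
    """Convert a hyphen-separated three-letter peptide to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in peptide.split("-"))


C4A_MOTIF = to_one_letter("Asp-Pro-Cys-Pro-Val-Leu-Asp-Arg")
C4B_MOTIF = to_one_letter("Asp-Leu-Ser-Pro-Val-Ile-His-Arg")
MOTIF_START = 1100  # ASSUMPTION: first residue of the motif is position 1100


def differing_positions(a, b, first_position=MOTIF_START):
    """Return the sequence positions at which two aligned peptides differ."""
    return [first_position + i
            for i, (x, y) in enumerate(zip(a, b)) if x != y]


def classify(segment):
    """Assign a derived 1100-1107 segment to C4A, C4B, or neither."""
    if segment == C4A_MOTIF:
        return "C4A"
    if segment == C4B_MOTIF:
        return "C4B"
    return "unclassified"


print(C4A_MOTIF, C4B_MOTIF)
print(differing_positions(C4A_MOTIF, C4B_MOTIF))
print(classify(C4B_MOTIF))
```

The two peptides differ at four of the eight positions, including the Asp/His exchange that the text links to the more acidic behavior of C4A.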
However, as the amino acid sequencing was carried out using a C4 preparation from pooled serum, in which many allelic forms were likely to be present, it is probable that the differences between residues 1100 and 1107 are class specific for C4A and C4B. One presumed allelic change was found between the sequences of pAT-A and pAT-42 coding for C4A. While they were identical in the 1100-1107 section, a threonine replaced a serine at position 1182. However, the C4B sequence in pAT-F also coded for threonine in this position. The liver donor was heterozygous at both the C4A and C4B loci, and four different cDNA sequences would be expected. Two C4A sequences have been found but only one C4B sequence has been found thus far. Electrophoretic typing suggests that 35 or more different allelic forms of C4 exist and different individuals may have more than one C4 locus (O’Neill et al., 1978b; Olaisen et al., 1979; Bruun-Petersen et al., 1982). Restriction enzyme mapping suggests that more alleles not recognized by charge differences will be found (Whitehead et al., 1984). It is noteworthy that of the 12 nucleotide differences found between the C4A and C4B sequences, none would have been recognized by the restriction enzymes presently available. It appears therefore that full resolution of the different forms of C4 can only be achieved by nucleotide sequencing. When these differences are found, synthesis of nucleotide probes corresponding to the different sequences should offer a rapid method of typing C4, as has been shown for haemoglobin (Orkin et al., 1983; Conner et al., 1983) and α1-antitrypsin (Kidd et al., 1983). This procedure should have a higher resolving power than other methods, and its feasibility has been demonstrated by synthesizing nucleotides specific for C4A and C4B and using them to identify cosmid clones containing either C4A or C4B inserts (Carroll, Belt, and Porter, unpublished data).

Sequences have been compared thus far only for the C4d section of C4, as this appears to contain most of the different structures recognized by electrophoresis and serology, but a difference in the β chain between C4A and C4B has recently been reported (Mauff et al., 1983b). The present results confirm this, as amino acid sequencing of pooled C4 β chain (Gagnon and Law, unpublished data) has shown that serine predominates in position 616 but the derived amino acid sequence from insert pAT-A shows cysteine. Other variants recognized by comparison of the derived sequences and amino acid sequences, but not represented in the three inserts described here, are in position 1054 (asp/gly), 1090 (ile/ser), and 1281 (arg/val). From the combined cDNA and amino acid sequences, 13 variant positions have been found with one alternative in each position. Only the four positions apparently responsible for the C4A, C4B difference are close, the remainder being scattered in the linear sequences over several hundred residues, although possibly more closely associated in the native structure. Much more information will be required before the pattern of variability in this complex system becomes apparent.

Figure 3. Polymorphisms in the Nucleotide and Derived Amino Acid Sequence from the C4d Section

Numbering is as shown in Figure 2, and gives the sequence of pAT-A, C4A. The alternative residues are those of pAT-F, C4B, but the residue in brackets is also present in pAT-42, suggesting that this may be an allelic variation. Clones pAT-F and pAT-42 were sequenced from residue 2908 to 4048.

Experimental Procedures

Isolation of RNA from a human post-mortem liver has been described (Carroll and Porter, 1983) and the donor has been typed as C4A 3,4; C4B 1,2.

Preparation of cDNA Libraries

Two full-length cDNA libraries were constructed using human 28S RNA following the loop-back (Wickens et al., 1978) and S1 nuclease procedures (Efstratiadis et al., 1978). After first strand synthesis, primed with oligo(dT)12-18 and transcribed with AMV reverse transcriptase, the RNA was removed by heating the mixture at 100°C for 1.5 min and cooling on ice. Second strand synthesis using DNA polymerase I was followed by S1 nuclease digestion to digest single-stranded ends and loop structures. Overhanging 5’ ends were repaired by “filling in” with the Klenow fragment of DNA polymerase I. Blunt-ended double-stranded cDNA was size fractionated on a 15%-30% sucrose gradient. DNA between 2-4 kb and DNA greater than 4 kb were pooled and blunt-end ligated into the pAT 153 Pvu II-8 plasmid vector previously cleaved at its unique Pvu II site and treated with calf intestinal phosphatase to minimize self ligation. The ligation mixtures were used to transform competent E. coli MC1061, which were then amplified and subsequently stored at -70°C.

Bentley, D. R., and Porter, R. R. (1984). Isolation of cDNA clones for human complement component C2. Proc. Nat. Acad. Sci. USA, in press.

cDNA Probe

The 301 bp C4 cDNA probe pAlu-7 (Carroll and Porter, 1983) was used for screening the libraries.

Screening of cDNA Libraries

Approximately two complexities of each library were plated, transferred to Whatman 541 filter paper by blotting, and processed for hybridization (Gergen et al., 1979). Filters were prehybridized for 3-6 hr at 65°C and hybridized overnight at 65°C with ~10^6 dpm/ml of 32P-labeled cDNA probe. The filters were washed at 68°C and autoradiographed at -70°C. Plasmid DNA was extracted from positive C4 cDNA clones with alkaline SDS (Birnboim and Daly, 1979). The inserts were excised from the plasmid vector by Cla I/Bam HI double digests and sized by agarose gel electrophoresis.

Preparation of DNA for Sequencing

Following digestion of the recombinant plasmid pAT-A with Hind III and Sal I enzymes, the 5.5 kb C4 insert was purified by electrophoresis onto a dialysis membrane (Maniatis et al., 1982). Sonication was used to produce randomly sheared fragments of C4 cDNA (Deininger, 1983), which were subsequently made blunt ended using the Klenow fragment of DNA polymerase I. The randomly sheared and repaired DNA molecules were size fractionated by electrophoresis onto a dialysis membrane (Maniatis et al., 1982), and fragments 500-1600 bp in length were ligated into Sma I cut and phosphatased M13 mp8 vector (Messing and Vieira, 1982). Single-stranded DNA templates were prepared from the M13 subclone library (Sanger et al., 1980).

DNA Sequence Analysis

Restriction enzyme fragments covering the 5’ and 3’ ends of the cDNA insert were prepared and sequenced by the chemical degradation method of Maxam and Gilbert (1977). The remaining sequence was determined by the dideoxy-nucleotide chain termination method (Sanger et al., 1977; Sanger and Coulson, 1978). A complementary oligonucleotide was used as a primer and [α-35S]dATP was used to label the products. Fractionation of the single-stranded DNA products of the primer elongation reaction was carried out on a TBE buffer gradient polyacrylamide gel (Biggin et al., 1983), which increases the length of DNA sequence data that can be read off a single gel. The sequence data were aligned and overlapped by computer (Staden, 1982). The total length of sequence data read was 25999 nucleotides, and on average each nucleotide of sequence was determined 4.8 times. Completion of the sequence and subsequent sequence analysis of specific regions of other C4 cDNA clones was achieved by nonrandom cloning.
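The average redundancy quoted above is the total number of nucleotides read divided by the length of the finished sequence. The short check below reproduces that arithmetic. The finished length of about 5,420 nucleotides is an assumption based on the approximate extent of the numbered cDNA sequence; the text itself gives only "about 5.5 kb". The function names are illustrative.

```python
def mean_redundancy(total_bases_read, sequence_length):
    """Average number of times each position was read."""
    return total_bases_read / sequence_length


def implied_length(total_bases_read, redundancy):
    """Sequence length implied by a total read length and a redundancy."""
    return total_bases_read / redundancy


TOTAL_READ = 25999          # from the text
ASSUMED_LENGTH = 5420       # ASSUMPTION: approximate finished length

print(round(mean_redundancy(TOTAL_READ, ASSUMED_LENGTH), 1))
print(round(implied_length(TOTAL_READ, 4.8)))
```

Under this assumption the two figures in the text are mutually consistent: a 4.8-fold average corresponds to a finished sequence of roughly 5.4 kb.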

Acknowledgments

We thank Professor G. G. Brownlee for advice and generous provision of facilities and Dr. Jean Gagnon for computer sequence comparisons. We also thank Dr. U. Hellman of Uppsala University, and Drs. J. Gagnon, R. D. Campbell, D. N. Chakravarti, and A. S.-K. Law of this Department for essential amino acid sequence data prior to publication. Dr. M. C. Carroll is an Investigator of the American Arthritis Foundation and Miss Tertia Belt holds an MRC Studentship. The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.

Received November 25, 1983; revised January 5, 1984

References

Awdeh, Z. L., and Alper, C. A. (1980). Inherited structural polymorphism of the fourth component of human complement. Proc. Nat. Acad. Sci. USA 77, 3576-3580.

Barnstable, C. J., Jones, E. A., and Bodmer, W. F. (1979). Genetic structure of major histocompatibility regions. In Defense and Recognition II A, E. S. Lennox, ed. (Baltimore, Maryland: University Park Press), pp. 151-226.

Biggin, M. D., Gibson, T. J., and Hong, G. F. (1983). Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc. Nat. Acad. Sci. USA 80, 3963-3965.

Birnboim, H. C., and Doly, J. (1979). A rapid alkaline extraction procedure for screening recombinant plasmid DNA. Nucl. Acids Res. 7, 1513-1523.

Bruun-Petersen, G., Lamm, L. U., Jacobsen, B. K., and Kristensen, T. (1982). Genetics of complement C4. Two homoduplication haplotypes C4SC4S and C4FC4F in a family. Human Genet. 61, 36-38.

Campbell, R. D., and Porter, R. R. (1983). Molecular cloning and characterisation of the gene coding for human complement component factor B. Proc. Nat. Acad. Sci. USA 80, 4464-4468.

Campbell, R. D., Gagnon, J., and Porter, R. R. (1981). Amino acid sequence around the thiol and reactive acyl groups of human complement component C4. Biochem. J. 199, 359-370.

Carroll, M. C., and Porter, R. R. (1983). Cloning of a human complement component C4 gene. Proc. Nat. Acad. Sci. USA 80, 264-267.

Carroll, M. C., Campbell, R. D., Bentley, D. R., and Porter, R. R. (1984). A molecular map of the major histocompatibility complex Class III region of man linking the complement genes C4, C2 and factor B. Nature 307, 237-241.

Chakravarti, D. N., Campbell, R. D., and Gagnon, J. (1983). Amino acid sequence of a polymorphic segment from C4d of human complement component C4. FEBS Lett. 154, 387-390.

Conner, B. J., Reyes, A. A., Morin, C., Itakura, K., Teplitz, R. L., and Wallace, R. B. (1983). Detection of sickle cell βS-globin allele by hybridisation with synthetic oligonucleotides. Proc. Nat. Acad. Sci. USA 80, 278-282.

Deininger, P. L. (1983). Random subcloning of sonicated DNA: application to shotgun DNA sequence analysis. Anal. Biochem. 129, 216-223.

Domdey, H., Wiebauer, K., Kazmaier, M., Muller, V., Odink, K., and Fey, G. (1982). Characterisation of the mRNA and cloned cDNA specifying the third component of mouse complement. Proc. Nat. Acad. Sci. USA 79, 7619-7623.
Efstratiadis, A., Kafatos, F. C., Maxam, A. M., and Maniatis, T. (1976). Enzymatic in vitro synthesis of globin genes. Cell 7, 279-288.

Gergen, J. P., Stern, R. H., and Wensink, P. C. (1979). Filter replicas and permanent collections of recombinant DNA plasmids. Nucl. Acids Res. 7, 2115-2136.

Gigli, I., Von Zabern, I., and Porter, R. R. (1977). The isolation and structure of C4, the fourth component of human complement. Biochem. J. 165, 439-446.

Hall, R. E., and Colten, H. R. (1977). Molecular size and subunit structure of the fourth component of guinea pig complement. J. Immunol. 118, 1903-1905.

Kidd, V. J., Wallace, R. B., Itakura, K., and Woo, S. L. (1983). Human α1-antitrypsin deficiency detection by direct analysis of the mutation in the gene. Nature 304, 230-234.

Lazure, C., Seidah, N. G., Pelaprat, D., and Chretien, M. (1983). Proteases and post-translational processing of prohormones: a review. Canad. J. Biochem. Cell Biol. 61, 501-515.

Lundwall, A., Hellman, U., Eggersten, G., and Sjoquist, J. (1981). Isolation of tryptic fragments of human C4 expressing Chido and Rodgers antigens. Mol. Immunol. 19, 1655-1665.

Maniatis, T., Fritsch, E. F., and Sambrook, J. (1982). Electrophoresis onto a dialysis membrane. In Molecular Cloning, A Laboratory Manual (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory), p. 168.

Mauff, G., Alper, C. A., Awdeh, Z., Batchelor, J. R., Bertrams, T., Bruun-Petersen, G., Dawkins, R. L., Demant, P., Edwards, J., Grosse-Wilde, H., Hauptmann, G., Klouda, P., Lamm, L., Mollenhauer, E., Nerl, C., Olaisen, B., O’Neill, G. J., Rittner, C., Roos, M. H., Skanes, V., Teisberg, P., and Wells, L. (1983a). Statement on the nomenclature of human C4 allotypes. Immunobiol. 164, 184-191.


Mauff, G., Stener, M., Week, M., and Bender, K. (1983b). The C4 β-chain: evidence for genetically determined polymorphism. Human Genet. 64, 186-188.

Maxam, A. M., and Gilbert, W. (1977). A new method for sequencing DNA. Proc. Nat. Acad. Sci. USA 74, 560-564.

Messing, J., and Vieira, J. (1982). A new pair of M13 vectors for selecting either DNA strand of double-digest restriction fragments. Gene 19, 269-276.

Mevag, B., Olaisen, B., and Teisberg, P. (1981). Electrophoretic polymorphism of human C4 is due to charge differences in the α chain presumably in the C4a fragment. Scand. J. Immunol. 14, 303-307.

Moon, K. E., Gorski, J. P., and Hugli, T. E. (1981). Complete primary structure of human C4a anaphylatoxin. J. Biol. Chem. 256, 8685-8692.

Morley, B. J., and Campbell, R. D. (1984). Internal homologies of the Ba fragment from human complement component factor B, a Class III MHC antigen. EMBO J. 3, 153-157.

Ogata, R. T., Schreffler, D. C., Sepich, D. S., and Lilly, S. P. (1983). cDNA clone spanning the α-γ subunit junction in the precursor of the murine fourth complement component. Proc. Nat. Acad. Sci. USA 80, 5061-5065.

Olaisen, B., Teisberg, P., Nordhagen, R., Michaelsen, T., and Gedde-Dahl, T. (1979). Human complement C4 is duplicated on some chromosomes. Nature 279, 736-737.

Olaisen, B., Teisberg, P., Jonassen, R., Thorsby, E., and Gedde-Dahl, T. (1983). Gene order and gene distances in the HLA regions studied by the haplotype method. Ann. Human Genet. 47, 285-292.

O’Neill, G. J., Yang, S. Y., Tegoli, J., Berger, R., and DuPont, B. (1978a). Chido and Rodgers blood groups are distinct antigenic components of human complement C4. Nature 273, 668-670.

O’Neill, G. J., Yang, S. Y., and DuPont, B. (1978b). Two HLA-linked loci controlling the fourth component of human complement. Proc. Nat. Acad. Sci. USA 75, 5165-5169.

Orkin, S. H., Markham, A. F., and Kazazian, H. H. (1983). Direct detection of the common Mediterranean β thalassemia gene with synthetic DNA probes. J. Clin. Invest. 71, 775-779.

Palsdottir, A., Cross, S. J., Edwards, J. H., and Carroll, M. C. (1983). Correlation between a DNA restriction fragment length polymorphism and the C4 A6 protein. Nature 306, 615-616.

Porter, R. R. (1983a). The complement components of the major histocompatibility locus. CRC Crit. Rev. Biochem., in press.

Porter, R. R. (1983b). Complement polymorphism, the major histocompatibility complex and associated diseases: a speculation. Mol. Biol. Med. 1, 161-168.

Press, E. M., and Gagnon, J. (1981). Human complement component C4. Biochem. J. 199, 351-357.

Proudfoot, N. J., and Brownlee, G. G. (1976). 3’ non-coding region sequences in eukaryotic messenger RNA. Nature 263, 211-214.

Reid, K. B. M., and Porter, R. R. (1981). The proteolytic activation systems of complement. Ann. Rev. Biochem. 50, 433-464.

Rittner, C. L., Tippett, P., Giles, C. M., Mollenhauer, E., Berger, R., Nordhagen, R., Buskjoer, L., Petersen, G. B., Lamm, L., and Roos, M. H. (1983). An international reference typing for Ch and Rg determinants on rare human C4 allotypes. Vox Sang., in press.

Roos, M. H., Atkinson, J. P., and Schreffler, D. C. (1978). Molecular size and characterisation of the Ss and Slp (C4) proteins of the mouse H2 complex. J. Immunol. 121, 1106-1115.

Roos, M. H., Mollenhauer, E., Demant, P., and Rittner, C. (1982). A molecular basis for the two locus model of human complement component C4. Nature 298, 854-856.

Sanger, F., and Coulson, A. R. (1978). The use of thin acrylamide gels for DNA sequencing. FEBS Lett. 87, 107-110.

Sanger, F., Nicklen, S., and Coulson, A. R. (1977). DNA sequencing with chain terminating inhibitors. Proc. Nat. Acad. Sci. USA 74, 5463-5468.

Sanger, F., Coulson, A. R., Barrell, B. G., Smith, A. J. H., and Roe, B. (1980). Cloning in single-stranded bacteriophage as an aid to rapid DNA sequencing. J. Mol. Biol. 143, 161-178.

Sottrup-Jensen, L., Stepanik, T. M., Wierzbicki, D. M., Jones, C. M., Lonblad, P. B., Kristensen, T., Petersen, T. E., and Magnusson, S. (1983). Ann. New York Acad. Sci., in press.

Staden, R. (1982). Automation of the computer handling of gel reading data produced by the shotgun method of DNA sequencing. Nucl. Acids Res. 10, 4731-4751.

Tack, B. F., Morris, S. E., and Prahl, J. W. (1979). Third component of human complement: structural analysis of the polypeptide chains of C3 and C3b. Biochemistry 18, 1497.

Tilley, C. A., Romans, D. G., and Crookston, M. C. (1978). Localisation of Chido and Rodgers determinants to the C4d fragment of C4. Nature 276, 713-715.

Whitehead, A. S., Solomon, E., Chambers, S., Bodmer, W. F., Povey, S., and Fey, G. (1982). Assignment of the structural gene for the third component of human complement to chromosome 19. Proc. Nat. Acad. Sci. USA 79, 5021-5025.

Whitehead, A. S., Goldberger, G., Woods, D. E., Markham, A. F., and Colten, H. R. (1983). Use of a cDNA clone for the fourth component of human complement (C4) for analysis of a genetic deficiency of C4 in guinea pig. Proc. Nat. Acad. Sci. USA 80, 5387-5391.

Whitehead, A. S., Woods, D. E., Fleischnick, E., Chin, J. E., Yunis, E. J., Katz, A. J., Gerald, P. S., Alper, C. A., and Colten, H. R. (1984). DNA polymorphisms of the C4 genes: a new marker for analysis of the major histocompatibility complex. N. Eng. J. Med. 310, 88-91.

Wickens, M. P., Buell, G. N., and Schimke, R. T. (1978). Synthesis of double-stranded DNA complementary to lysozyme, ovomucoid, and ovalbumin mRNAs. J. Biol. Chem. 253, 2483-2495.

Wiebauer, K., Domdey, H., Diggelmann, H., and Fey, G. (1982). Isolation and analysis of genomic DNA clones encoding the third component of mouse complement. Proc. Nat. Acad. Sci. USA 79, 7077-7081.