THROMBOSIS RESEARCH, Supplement VIII; 91-97,1988 0049-3848/88 $3.00 + .00 Printed in the USA. Copyright (c) 1988 Pergamon Press plc. All rights reserved.
HUMAN HISTIDINE-RICH GLYCOPROTEIN GENE: EVIDENCE FOR EVOLUTIONARY RELATEDNESS TO CYSTATIN SUPERGENE FAMILY
Takehiko Koide
Department of Biochemistry, School of Medicine, Niigata University, Niigata 951, Japan
Department of Biochemistry, University of Washington, Seattle, Washington 98195, USA
ABSTRACT
The human chromosomal histidine-rich glycoprotein (HRG) gene has been isolated and its molecular structure partially characterized. The gene is approximately 11 kb in length and contains nine exons and eight introns. The locations of the introns in the regions of the HRG gene coding for the cystatin domains are essentially identical to those in the genes for cystatin SN, SA and C, and kininogen. These results provide direct evidence that HRG belongs to a supergene family that includes cystatin SN, SA and C, and kininogen, and also demonstrate that the intron-exon organization is highly conserved within this supergene family.
INTRODUCTION
Histidine-rich glycoprotein (HRG) is a plasma glycoprotein with many biological properties, such as binding to heparin (1-5), plasminogen (6,7), thrombospondin (8), fibrinogen (9), and others (10). Hence, HRG has been suggested, on the one hand, to modulate the antithrombotic functions of antithrombin III and heparin cofactor II by competing for binding to heparin and, on the other hand, to modulate fibrinolysis by preventing plasminogen from binding to fibrin and by enhancing plasminogen activation by tissue plasminogen activator (10).
Key words: Histidine-rich glycoprotein, Histidine-rich glycoprotein gene, Gene structure, Cystatin superfamily, Supergene family, Evolution
A recent case report of familial elevation of plasma HRG in a family with thrombophilia (11) strongly suggests the physiological importance of HRG in thrombosis and fibrinolysis. Recently, we reported the amino acid sequence of HRG, deduced from the nucleotide sequence of its cDNA, which revealed that HRG is composed of several domains with four different types of internal repeats (10, 12). More recently, we suggested that HRG may belong to the cystatin (cysteine proteinase inhibitor) superfamily by showing that the N-terminal half (residues 1 through 229) of the polypeptide consists of two cystatin-like sequences in tandem (13). In the present investigation, the cDNA for human HRG has been used to isolate overlapping genomic clones from a λ Charon 4A phage library, and the isolated genomic clones have been partially characterized by comparing their intron-exon organization with that of the genes for cystatin superfamily proteins such as cystatin SA, SN and C (14), and kininogens (15).
MATERIALS AND METHODS
Enzymes and Chemicals: Restriction endonucleases, T4 DNA ligase, E. coli DNA polymerase, polynucleotide kinase, and bacterial alkaline phosphatase were purchased from Bethesda Research Laboratories, Toyobo Co. (Osaka), or Takara Shuzo Co. (Kyoto), and used according to the manufacturers' instructions. M13 vectors mp18 and mp19, Bal-31 exonuclease, and the Klenow fragment of DNA polymerase I were obtained from Takara Shuzo Co., and M13 sequencing kits were from NEN Research Products. [35S]dATPαS was obtained from Amersham or NEN Research Products.
Screening of the Gene Library: A human genomic library constructed in λ Charon 4A phage (16) was kindly provided by Dr. Tom Maniatis. Approximately 1.6 x lOB phage were screened for genomic clones of human HRG by the plaque hybridization technique of Benton and Davis as modified by Woo (17), using a cDNA for human HRG (12) as the hybridization probe. The cDNA was radiolabeled by nick-translation to a specific activity of 2 x lOB cpm/µg. Positive clones were isolated, and each was plaque-purified.
DNA Sequence Analysis: Phage DNA was prepared from positive clones by the liquid culture lysis method described by Silhavy et al. (18). The genomic DNA inserts in the purified phage were excised by digestion with EcoRI and then subcloned into pUC9 or pUC19 for subsequent restriction mapping and sequencing. The sequence of the genomic fragments containing the gene for HRG was determined by direct cloning of specific restriction fragments into the M13 phage cloning vectors mp18 and mp19, as well as by the Bal-31 exonuclease method described by Guo et al. (19) and Yoshitake et al. (20). Dideoxy chain-termination sequencing reactions were carried out with deoxyadenosine 5'-(α-[35S]thiotriphosphate) ([35S]dATPαS) and run on buffer gradient gels as described by Biggin et al. (21).
RESULTS AND DISCUSSION
Isolation of the gene for human HRG. A human genomic DNA library (1.6 x lOB phage) in λ Charon 4A phage was screened with a radiolabeled cDNA probe for human HRG. A total of nine positive clones were isolated, and each was plaque-purified. Seven clones exhibited unique patterns of EcoRI fragments upon electrophoresis in 0.7% agarose but also contained fragments in common with each other. Southern blot hybridization of digests of these clones with probes made from the 5' and 3' ends of the cDNA established that one of the clones (HRG λ18) corresponded to the 5' region of the gene for HRG, two clones (HRG λ22 and λ23) to the 3' region, and one clone (HRG λ32) hybridized to both sets of probes. The genomic DNA inserts in HRG λ18 and HRG λ32 were mapped by single- and double-restriction enzyme digestion followed by agarose gel electrophoresis, Southern blotting, and hybridization experiments employing various fragments of the HRG cDNA as probes. This analysis suggested that the gene for HRG was present in five EcoRI fragments of 5.5, (1.1, 0.7), 0.6, and 6.7 kilobases (kb) oriented 5' to 3' in the genome. The 5.5-kb fragment was isolated from phage HRG λ18, and the other fragments were isolated from phage HRG λ32; each was subcloned into the EcoRI site of pUC9 or pUC19. A detailed restriction map, as well as the approximate placement of the exon regions within the subcloned fragments, was established by further restriction analysis and Southern blotting. After the 5' and 3' ends of the gene were established, the nucleotide sequence of the gene was determined by the dideoxy chain-termination method. Comparison of the genomic sequence with that of the cDNA (12) revealed that the HRG gene consists of at least, and probably exactly, nine exons and eight introns.
Structural Domains and Location of Introns in Human HRG. HRG consists of several structural domains with four different types of internal repeats. The eight introns so far localized in HRG are shown in Fig. 1.
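The exon and intron layout reported in this subsection can be summarized as a small lookup table, which may help when cross-checking the counts (nine exons, eight introns) against the domain assignments. The sketch below is only an illustration: the names EXON_REGIONS, intron_map and exons_for are hypothetical, each exon lists only the regions the text explicitly assigns to it (exon II presumably also carries other sequence that the text does not name), and the placement of intron H between exons VIII and IX is inferred from the exon count rather than stated directly.

```python
from string import ascii_uppercase

# Regions assigned to each exon, transcribed from the text.
# Only explicitly named regions are listed; this is not a full annotation.
EXON_REGIONS = {
    "I": ["5' untranslated region"],
    "II": ["cystatin domain 1"],
    "III": ["cystatin domain 1"],
    "IV": ["cystatin domain 1"],
    "V": ["cystatin domain 2"],
    "VI": ["cystatin domain 2"],
    "VII": ["cystatin domain 2"],
    "VIII": [
        "proline-rich domain 1",
        "histidine-rich domain",
        "proline-rich domain 2",
    ],
    "IX": ["C-terminal region", "3' untranslated region"],
}


def intron_map(exon_names):
    """Label the intron between each pair of consecutive exons A, B, C, ..."""
    pairs = zip(exon_names, exon_names[1:])
    return {ascii_uppercase[i]: pair for i, pair in enumerate(pairs)}


def exons_for(region):
    """Return the exons (in 5' to 3' order) that encode a named region."""
    return [name for name, regions in EXON_REGIONS.items() if region in regions]


# Assumption: introns are lettered in 5' to 3' order, so H falls
# between exons VIII and IX.
INTRONS = intron_map(list(EXON_REGIONS))

if __name__ == "__main__":
    for letter, (upstream, downstream) in INTRONS.items():
        print(f"intron {letter}: between exon {upstream} and exon {downstream}")
    print("cystatin domain 1 exons:", exons_for("cystatin domain 1"))
```

With this table, intron D falls between exons IV and V (the boundary between the two cystatin domains) and intron G between exons VII and VIII (the end of cystatin domain 2), matching the description in the text.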
The first intron (intron A) occurs in the 5' untranslated region of the gene. Two introns (introns B and C) are present within cystatin domain 1, separating this domain into three exons (exons II, III and IV), and intron D occurs at the boundary between cystatin domains 1 and 2. Cystatin domain 2 is also encoded by three exons (exons V, VI and VII) separated by two introns (E and F), and intron G is present at the end of cystatin domain 2. Contrary to our expectations, there appear to be no introns within the histidine-rich domain, which consists of 12 tandem repetitions of a five-amino-acid segment with a consensus sequence of Gly-His-His-Pro-His (12). Furthermore, the histidine-rich domain and the two proline-rich domains appear to be encoded by one large exon, exon VIII. The C-terminal region and the 3' untranslated region are encoded by the last exon, exon IX.
Evolutionary Relatedness to Cystatin Superfamily. There are several evolutionary implications of the structure of the HRG gene and protein. The most obvious is that the many tandem repetitions in HRG strongly suggest that the HRG gene arose by endoduplication of primordial minidomains or modules,
corresponding with the present-day types I to IV internal repeats (10,12). This supposition was supported by analysis of the intron-exon organization of the gene. Each cystatin domain (type I internal repeat) of HRG is encoded by three exons separated by two introns, the same arrangement as in the genes for cystatin SN, SA and C and in the three cystatin domains of the kininogen heavy chain. The locations of the introns in the two cystatin domains of HRG are compared with those of cystatin SN, SA and C (14), and kininogens (15) in Fig. 2. The introns in the HRG gene occur in essentially the same positions as those in the human cystatin SN, SA and C, and kininogen genes. These results support our previous proposal that HRG should belong to the cystatin superfamily (13) and provide conclusive evidence that the HRG gene is evolutionarily related to the cystatin supergene family. Furthermore, the locations of the introns in all genes of the cystatin superfamily so far characterized are essentially identical, showing that the intron-exon organization of the cystatin supergene family is highly conserved. A model to explain the triplication of the primordial domain of kininogen has been proposed by Kitamura et al. (15). An analogous model may be applied to explain the duplication of the primordial domain of HRG.
Fig. 1. Domain structure of HRG and locations of the introns in the gene. The domains of HRG are represented by open boxes, with one domain in each box: Cystatin 1 & 2, cystatin domains 1 & 2; Pro-rich 1 & 2, proline-rich domains 1 & 2; His-rich, histidine-rich domain; C-Term, C-terminal domain. Numbers in each box indicate the amino acid residue number from the N-terminal end of HRG. The signal sequence is represented by a broken box, and the 5' and 3' untranslated regions by straight lines. Positions of the eight introns are shown by solid triangles (A to H).
ACKNOWLEDGMENTS
I would like to thank Drs. Earl W. Davie, Don C. Foster, Shinji Yoshitake and Shoji Odani for their helpful discussions and advice. I also thank Dr.
Tom Maniatis for kindly providing the human genomic library constructed in λ Charon 4A bacteriophage. Finally, I wish to thank Drs. Eiichi Saitoh and Satoko Isemura for kindly making available the DNA sequences of the genes for human cystatin SN, SA and C prior to publication. This work was supported in part by a Grant-in-Aid for Scientific Research from the Ministry of Education, Science, and Culture of Japan and by a research grant (HL 16919) from the National Institutes of Health.
Fig. 2. Comparison of the locations of the introns in genes of the cystatin superfamily. Intron positions for each individual gene are indicated by solid triangles. Gaps are included for the best alignment of the amino acid sequences (13). a) human HRG cystatin domain 1, b) human HRG cystatin domain 2, c) human kininogen heavy-chain domain 1, d) human kininogen heavy-chain domain 2, e) human kininogen heavy-chain domain 3 (15), f) human cystatin SN, g) human cystatin SA, h) human cystatin C (14). The disulfide bridges in kininogen (22) and cystatin C (23) are shown by solid lines, and potential disulfide bridges are shown by bold broken lines.
REFERENCES
1. HEIMBURGER, N., HAUPT, H., KRANZ, T. and BAUDNER, S. Human serum proteins with a high affinity for carboxymethylcellulose, II: Physicochemical and immunological characterization of a histidine-rich 3.8S α2-glycoprotein (CM protein I). Hoppe-Seyler's Z. Physiol. Chem. 353, 1133-1140, 1972.
2. KOIDE, T., ODANI, S. and ONO, T. The N-terminal sequence of human plasma histidine-rich glycoprotein homologous to antithrombin with high affinity for heparin. FEBS Lett. 141, 222-224, 1982.
3. LIJNEN, H.R., HOYLAERTS, M. and COLLEN, D. Heparin binding properties of human histidine-rich glycoprotein. Mechanism and role in the neutralization of heparin in plasma. J. Biol. Chem. 258, 3803-3808, 1983.
4. LIJNEN, H.R., van HOEF, B. and COLLEN, D. Interaction of heparin with histidine-rich glycoprotein and with antithrombin III. Thromb. Haemostas. 50, 560-562, 1983.
5. NIWA, M., YAMAGISHI, R., KONDO, S., SAKURAGAWA, N. and KOIDE, T. Histidine-rich glycoprotein inhibits the antithrombin activity of heparin cofactor II in the presence of heparin or dermatan sulfate. Thromb. Res. 37, 237-240, 1985.
6. LIJNEN, H.R., HOYLAERTS, M. and COLLEN, D. Isolation and characterization of a human plasma protein with affinity for the lysine binding sites in plasminogen. Role in the regulation of fibrinolysis and identification as histidine-rich glycoprotein. J. Biol. Chem. 255, 10214-10222, 1980.
7. ICHINOSE, A., MIMURO, J., KOIDE, T. and AOKI, N. Histidine-rich glycoprotein and α2-plasmin inhibitor in inhibition of plasminogen binding to fibrin. Thromb. Res. 33, 401-407, 1984.
8. LEUNG, L.L.K., NACHMAN, R.L. and HARPEL, P.C. Complex formation of platelet thrombospondin with histidine-rich glycoprotein. J. Clin. Invest. 73, 5-12, 1984.
9. LEUNG, L.L.K. Interaction of histidine-rich glycoprotein with fibrinogen and fibrin. J. Clin. Invest. 77, 1305-1311, 1986.
10. KOIDE, T. The primary structure of human histidine-rich glycoprotein and its functions as a modulator of coagulation and fibrinolysis. In Fibrinolysis: Current Prospects, P.J. Gaffney et al. (eds.), John Libbey & Co., London, p. 55-63, 1988.
11. ENGESSER, L., KLUFT, C., BRIET, E. and BROMMER, E.J.P. Familial elevation of plasma histidine-rich glycoprotein in a family with thrombophilia. Brit. J. Haematol. 67, 355-358, 1987.
12. KOIDE, T., FOSTER, D., YOSHITAKE, S. and DAVIE, E.W. Amino acid sequence of human histidine-rich glycoprotein derived from the nucleotide sequence of its cDNA. Biochemistry 25, 2220-2225, 1986.
13. KOIDE, T. and ODANI, S. Histidine-rich glycoprotein is evolutionarily related to the cystatin superfamily. Presence of two cystatin domains in the N-terminal region. FEBS Lett. 216, 17-21, 1987.
14. SAITO, E., ISEMURA, S., SANADA, K., KIM, H.-S., SMITHIES, O. and MAEDA, N. Cystatin superfamily: Evidence that family II cystatin genes are evolutionarily related to family III cystatin genes. Biol. Chem. Hoppe-Seyler 369, 1988 (in press).
15. KITAMURA, N., KITAGAWA, H., FUKUSHIMA, D., TAKAGAKI, Y., MIYATA, T. and NAKANISHI, S. Structural organization of the human kininogen gene and a model for its evolution. J. Biol. Chem. 260, 8610-8617, 1985.
16. MANIATIS, T., HARDISON, R.C., LACY, E., LAUER, J., O'CONNELL, C., QUON, D., SIM, G.K. and EFSTRATIADIS, A. The isolation of structural genes from libraries of eucaryotic DNA. Cell 15, 687-701, 1978.
17. WOO, S.L.C. A sensitive and rapid method for recombinant phage screening. Methods Enzymol. 68, 389-395, 1979.
18. SILHAVY, T.J., BERMAN, M.L. and ENQUIST, L.W. Large-scale isolation of λ DNA. In Experiments with Gene Fusions, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY, p. 140-141, 1984.
19. GUO, L.-H., YANG, R.C.A. and WU, R. An improved strategy for rapid direct sequencing of both strands of long DNA molecules cloned in a plasmid. Nucleic Acids Res. 11, 5521-5540, 1983.
20. YOSHITAKE, S., SCHACH, B.G., FOSTER, D.C., DAVIE, E.W. and KURACHI, K. Nucleotide sequence of the gene for human factor IX (antihemophilic factor B). Biochemistry 24, 3736-3750, 1985.
21. BIGGIN, M.D., GIBSON, T.J. and HONG, G.F. Buffer gradient gels and 35S label as an aid to rapid DNA sequence determination. Proc. Natl. Acad. Sci. USA 80, 3963-3965, 1983.
22. SUEYOSHI, T., MIYATA, T., HASHIMOTO, N., KATO, H., HAYASHIDA, H., MIYATA, T. and IWANAGA, S. Bovine high molecular weight kininogen. The amino acid sequence, positions of carbohydrate chains and disulfide bridges in the heavy chain portion. J. Biol. Chem. 262, 2768-2779, 1987.
23. GRUBB, A., LOFBERG, H. and BARRETT, A.J. The disulfide bridges of human cystatin C (gamma-trace) and chicken cystatin. FEBS Lett. 170, 370-374, 1984.