Gene 196 (1997) 121–125
Structure of the gene encoding human alpha2-HS glycoprotein (AHSG)
Motoki Osawa a,*, Kazuo Umetsu b, Michihiko Sato c, Tamotsu Ohki a, Nobuhiro Yukawa a, Tsuneo Suzuki b, Sanae Takeichi a
a Department of Forensic Medicine, Tokai University School of Medicine, Kanagawa 259-11, Japan
b Department of Forensic Medicine, Yamagata University School of Medicine, Yamagata 990-23, Japan
c Central Laboratory for Research and Education, Yamagata University School of Medicine, Yamagata 990-23, Japan
Received 16 December 1996; accepted 15 March 1997; Received by J.L. Slightom
Abstract
Alpha2-HS glycoprotein (AHSG) is a human plasma glycoprotein, and fetuin is its homologue in the calf. In this report, we present the structure and organization of the AHSG gene. Introns and the 5′- and 3′-flanking regions were obtained by polymerase chain reaction (PCR) and inverted PCR, respectively, from genomic DNA using AHSG cDNA-specific oligonucleotide primers. The sequence of the PCR products shows that the coding region spans approximately 8.2 kb and is composed of seven exons interrupted by six introns. The exon–intron splice junctions agree with the consensus sequence, and the positions interrupted by introns are identical to those of the rat insulin receptor tyrosine kinase inhibitor (fetuin) gene. The 5′-promoter region contains several characteristic sequences, such as an A+T-rich sequence of TAAATAA, a C/EBP-binding site, and hepatocyte nuclear factor-5 (HNF-5) and serum response factor (SRF) sites. © 1997 Elsevier Science B.V.
Keywords: Plasma protein gene; Nucleotide sequence; Gene organization; Regulatory sequences
* Corresponding author. Tel. +81 463 931121 (ext 2630); Fax +81 463 920284; e-mail: [email protected]
Abbreviations: AHSG, alpha2-HS glycoprotein; AHSG, gene (DNA, RNA) encoding AHSG; pp63, phosphorylated 63 kDa N-glycoprotein; C/EBP, CCAAT enhancer binding protein; HNF, hepatocyte nuclear factor; SRF, serum response factor; IL, interleukin; PCR, polymerase chain reaction; TNF, tumor necrosis factor.
0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0378-1119(97)00216-3
1. Introduction
Alpha2-HS glycoprotein (AHSG), a 49 kDa glycoprotein present in human serum, is synthesized by hepatocytes. The cDNA encoding AHSG has been cloned, and the gene has been chromosomally mapped to 3q (Lee et al., 1987). The AHSG molecule consists of two polypeptide chains of 321 and 27 amino acid residues, which are cleaved from a proprotein encoded by a single mRNA (Kellermann et al., 1989). Based on the homology of the nucleotide and amino acid sequences, proteins homologous to fetuin have been described in other species (Brown et al., 1992a). Fetuin is a fetal protein found in fetal calf serum, in which it is the predominant protein (Pedersen, 1944). However, the identity between AHSG and fetuin had not been defined until recently because the plasma level of AHSG in the human fetus is much lower than that in bovine and sheep fetuses, and because antibodies to bovine, sheep and pig fetuins do not crossreact with AHSG (Brown et al., 1992a). AHSG and fetuin are involved in several functions, such as endocytosis, brain development and the formation of bone tissue; however, there are differences among species, which have been summarized by Brown et al. (1992b). The protein is commonly present in the cortical plate of the immature cerebral cortex and in the bone marrow hemopoietic matrix in many species, and it has therefore been postulated that it participates in the development of these tissues (Saunders et al., 1994). However, its exact significance is still obscure.
AHSG exhibits extensive genetic polymorphism arising from many alleles, including the two common alleles AHSG*1 and AHSG*2, in the human population (Boutin et al., 1985). A comparison of the AHSG cDNA sequences of phenotypes 1, 2-1 and 2 revealed two potential amino acid residue replacements in the AHSG*2 allele: Thr (ACG) to Met (ATG) at amino acid position 230, and Thr (ACC) to Ser (AGC) at position 238 (Osawa et al., 1997). In this study, we also found a homologous structure of the AHSG gene
to the rat insulin receptor tyrosine kinase inhibitor (pp63) gene (Falquerho et al., 1991), the homologue of fetuin in rats (Rauth et al., 1992). The complete sequence of a total of 8748 bp for the gene encoding AHSG was then determined. In this report, we show the gene structure and organization, including the 5′- and 3′-flanking regions.
2. Experimental and discussion
2.1. Isolation of the human AHSG gene
By comparing the human AHSG cDNA sequence (Lee et al., 1987) to the genomic sequence and organization of the rat pp63 (fetuin) gene (Falquerho et al., 1991), putative exon/intron boundaries were postulated for the human AHSG gene. Fragments of the human AHSG gene were obtained by PCR from genomic DNA, using cDNA sequence-specific primers whose sequences were selected from adjacent postulated exons (Table 1). For introns 1 to 6, we obtained single products of 2.4, 0.7, 0.8, 1.3, 1.3 and 1.1 kb, respectively. They were larger than the sizes estimated from human AHSG cDNA, suggesting the presence of introns between the primer pairs. After the amplified DNA fragments were subcloned into a vector, all DNA sequences were completely determined from both strands. Furthermore, PCR using intron-specific primers was performed to exclude the possibility of additional intron interruptions.

Table 1
Oligonucleotides used for genomic PCR

Product        Primer (a)   Sequence (5′→3′)          Site (b)
Intron 1       1F1          CACACCTTGAACCAGATTGATG    220
               2R1          TTCAATCTCAAACAGCTCTCCG    291
Intron 2       2F1          TGAAATAGACACCCTGGAAACC    288
               3R1          ATACCACGGAAAACTTGCCATC    433
Intron 3       3F1          TAGATGGCAAGTTTTCCGTGGT    410
               4R1          GGCAGTCTTGGCACACCTTG      496
Intron 4       4F1          CGCTCAGAACAACGGCTCCA      564
               5R1          AACTCCACATAGGTAGAAGGTG    650
Intron 5       5F1          GTGTTGCTAAAGAGGCCACAG     671
               6R1          AACCTCTGCCCCACCAAGCT      777
Intron 6       6F1          GTAAGGCAACACTCAGTGA       737
               7R1          TCATCTCTGCCATGTCTAG       1172
5′-flanking    1F2          TATAGACAACCGAACTGCG       127
               1R1          AGGCAGGCGTGCAGGTGGTT      25
3′-flanking    7F1          TGCCTGTTAATGAAAGTGCC      1483
               7R1          TCATCTCTGCCATGTCTAG       1172

(a) In the primers, 'F' and 'R' indicate forward (5′ to 3′) and reverse (3′ to 5′) directions, respectively.
(b) The site indicates the 5′-end nucleotide number of each primer, in which the nucleotide numbers correspond to the cDNA sequence reported by Lee et al. (1987).

The inverted PCR method was employed to obtain
the 5′- and 3′-flanking regions of the gene (Triglia et al., 1988). Briefly, genomic DNA (8 µg) was digested at 37°C overnight with 300 U of the restriction enzyme Sau3A and circularized at 16°C with T4 DNA ligase. For amplification of the 5′-flanking region, the circularized DNA was digested with the restriction enzyme EcoO109I at 37°C for 2 h, and PCR was then carried out with 1.5 µg of the re-cut DNA using the AHSG cDNA-specific primers 1F2 and 1R1 (Table 1). For the 3′-flanking region, the circularized DNA from the Sau3A digest was digested with HindIII, and PCR was then carried out using the primers 7F1 and 7R1. Single products of 0.4 and 0.3 kb were obtained from the 5′- and 3′-flanking regions, respectively. This method was useful for amplifying a region of unknown DNA flanking a known sequence.
2.2. Sequence analysis of the AHSG gene
For the sequence of the human AHSG gene, a total of 8748 bp was determined from the PCR fragments, as shown in Fig. 1. Sequence analysis of the PCR fragments revealed that the human AHSG gene contains seven exons and six introns, which span 8.2 kb. Exons 1 to 7 are 261, 111, 85, 164, 102, 84 and 731 bp in length, respectively. The first exon contains the ATG start codon; however, the transcription initiation site could not be determined at this time by the primer extension method using liver tissue. The last exon, exon 7, contains the TAG stop codon at nucleotide positions 7787 to 7789 relative to the ATG start codon, and the putative polyadenylation signal AATAAA at positions 8155 to 8160. The sequence of the exons was consistent with the originally reported human cDNA sequence (Lee et al., 1987). Introns 1 to 6 are 2329, 647, 659, 1187, 1219 and 644 bp in length, respectively. The GT and AG boundary dinucleotides are conserved at the donor and acceptor splice sites, respectively, of every intron.
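As a quick arithmetic check on the figures above, the following Python sketch lays the exons and introns end to end and estimates the size of each intron-spanning PCR product from the primer sites in Table 1. It is an illustration under stated assumptions, not part of the reported method: coordinates are counted from the first nucleotide of exon 1 (not from the ATG), each product is taken to be the cDNA distance between the two primer 5′ ends plus the intervening intron, and all function and variable names are hypothetical.

```python
# Exon and intron lengths (bp) as reported in Section 2.2.
EXON_BP = [261, 111, 85, 164, 102, 84, 731]
INTRON_BP = [2329, 647, 659, 1187, 1219, 644]

# 5'-end cDNA positions of the forward and reverse primers flanking each
# intron (Table 1; numbering of the cDNA of Lee et al., 1987).
PRIMER_SITES = {1: (220, 291), 2: (288, 433), 3: (410, 496),
                4: (564, 650), 5: (671, 777), 6: (737, 1172)}


def gene_layout(exons, introns):
    """Return (name, start, end) for each exon and intron, 1-based inclusive.

    Assumption: position 1 is the first nucleotide of exon 1, not the ATG.
    """
    if len(exons) != len(introns) + 1:
        raise ValueError("expected exactly one more exon than introns")
    layout, pos = [], 1
    for i, exon_len in enumerate(exons):
        layout.append((f"exon {i + 1}", pos, pos + exon_len - 1))
        pos += exon_len
        if i < len(introns):
            layout.append((f"intron {i + 1}", pos, pos + introns[i] - 1))
            pos += introns[i]
    return layout


def expected_product_bp(intron_number):
    """Estimate the genomic PCR product size for one intron-spanning pair."""
    forward, reverse = PRIMER_SITES[intron_number]
    cdna_span = reverse - forward + 1  # both primer 5' ends are included
    return cdna_span + INTRON_BP[intron_number - 1]


if __name__ == "__main__":
    for name, start, end in gene_layout(EXON_BP, INTRON_BP):
        print(f"{name:9s} {start:5d}-{end:5d}")
    print("exons:", sum(EXON_BP), "bp; introns:", sum(INTRON_BP), "bp")
    for n in sorted(PRIMER_SITES):
        print(f"intron {n}: about {expected_product_bp(n)} bp")
```

Under these assumptions the exons total 1538 bp and the introns 6685 bp, giving 8223 bp from the start of exon 1 to the end of exon 7, in line with the stated span of about 8.2 kb. The estimated products (2401, 793, 746, 1274, 1326 and 1080 bp) are close to the reported fragment sizes of 2.4, 0.7, 0.8, 1.3, 1.3 and 1.1 kb.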
The positions of intron interruptions are precisely consistent with those of the rat pp63 (fetuin) gene (Falquerho et al., 1991), even though the sizes of the introns differ somewhat from those of the rat gene. Southern blot analysis of human, bovine, rat and mouse DNA shows that AHSG and fetuin are encoded by a single gene in their respective genomes (Lee et al., 1987; Dziegielewska et al., 1990; Rauth et al., 1992; Yang et al., 1992). These equivalent exon–intron structures between the human AHSG and rat pp63 (fetuin) genes strongly indicate the identity of the two proteins.
Fetuin is characterized by a three-domain structure, which consists of two homologous cystatin-like domains of 116 to 118 amino acid residues at the amino terminus and a variable domain at the carboxyl terminus (Dziegielewska et al., 1990). The proposed structure of fetuin was also observed in the organization of the human AHSG gene. The gene for the cystatin and cystatin-like domains in other gene family members commonly consists of three exons. In contrast, the last domain is encoded by one exon (exon 7) of a non-cystatin-like sequence. This indicates that the AHSG/fetuin gene may have originated from a duplication of a cystatin-like gene and an addition of a unique gene. This kind of gene organization is also evident in other members of the cystatin superfamily, such as kininogen and histidine-rich glycoprotein (Salvesen et al., 1986; Koide and Odani, 1987). Furthermore, the last exon was less homologous among the species (Brown et al., 1992a). Only the human AHSG molecule is post-translationally cleaved by proteolysis at amino acid position 322 (Kellermann et al., 1989). These heterogeneities in the carboxyl-terminal domain may generate different functions among the species and explain the variable crossreactivity of antibodies to AHSG and fetuin.

Fig. 1. Nucleotide sequence of the human AHSG gene. A total of 8748 bp are shown, including 409 bp of the 5′-flanking region, 8175 bp of exon and intron sequences, and 164 bp of the 3′-flanking region. The sequences of the 5′- and 3′-flanking regions, including the non-coding region of exon 1, and the introns are indicated by small letters. In the AHSG sequence, the first nucleotide of the ATG start codon was assigned +1. The ATG start codon, A+T-rich region similar to a TATA box, C/EBP binding site, HNF-5 site, SRF site, TAG stop codon, and polyadenylation signal are all underlined. The sequence data reported in this paper have been deposited in the DDBJ, EMBL and GenBank databases under the accession number D67013.
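The numbering used in Fig. 1 (first nucleotide of the ATG = +1, upstream positions negative, no position 0) can be related to offsets in the full 8748-bp sequence with a small conversion routine. The Python sketch below is a hypothetical helper, not taken from the paper; it assumes that the 409 bp of 5′-flanking sequence lie immediately upstream of the ATG and that the sequence is held as a string whose first character is position −409.

```python
FLANK_5_BP = 409      # 5'-flanking region (Fig. 1), assumed to end just before the ATG
GENE_BODY_BP = 8175   # exon and intron sequence (Fig. 1), assumed to start at +1
FLANK_3_BP = 164      # 3'-flanking region (Fig. 1)
TOTAL_BP = FLANK_5_BP + GENE_BODY_BP + FLANK_3_BP


def position_to_index(position):
    """Convert an ATG-relative position to a 0-based string index."""
    if position == 0:
        raise ValueError("there is no position 0 in this numbering")
    index = FLANK_5_BP + (position - 1 if position > 0 else position)
    if not 0 <= index < TOTAL_BP:
        raise IndexError(f"position {position} lies outside the sequence")
    return index


def index_to_position(index):
    """Convert a 0-based string index to an ATG-relative position."""
    if not 0 <= index < TOTAL_BP:
        raise IndexError(f"index {index} lies outside the sequence")
    offset = index - FLANK_5_BP
    return offset + 1 if offset >= 0 else offset


def feature_slice(first, last):
    """Return a slice covering positions first..last inclusive."""
    return slice(position_to_index(first), position_to_index(last) + 1)
```

For example, if `seq` holds the full sequence, `seq[feature_slice(-114, -108)]` would select the seven nucleotides reported for the A+T-rich element, and `seq[feature_slice(7787, 7789)]` the stop codon. Keeping the skip over position 0 in one place avoids off-by-one errors when comparing upstream and downstream features.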
2.3. Sequence analysis for the putative regulatory elements
For the 5′-flanking region, we obtained a 409 bp sequence upstream of the first nucleotide of the ATG start codon (Fig. 1). A typical TATA box is not present in the sequence; however, an A+T-rich sequence of TAAATAA is present at nucleotide positions −114 to −108 relative to the ATG codon. The C/EBP-binding sequence of TTATGCAAT is found at positions −145 to −137. C/EBP participates in specific gene expression in hepatocytes and adipocytes (Poli et al., 1989), which is in agreement with the observation that AHSG is mainly produced by the liver. A putative binding site for HNF-5 of TGTTTGC, which is involved in the transcriptional regulation of several liver-specific genes (Grange et al., 1990), is present at positions −203 to −196. In addition to these elements, an SRF sequence of GATGTCC was found at positions −236 to −229. The SRF site was identified in the oncogene c-fos and is involved in transient transcription activation in response to growth factor (Treisman, 1985). This sequence is also conserved in the rat pp63 (fetuin) gene (Falquerho et al., 1991).
The expression of AHSG/fetuin is unique. In bovine and sheep fetuses, the content of fetuin in the plasma reaches up to half of the total protein. The AHSG level in the human fetus is also approximately three times higher than the adult level of 0.4–0.6 mg/ml (Brown et al., 1992b), although this is much less than that seen in bovine and sheep fetuses. Only a small number of plasma proteins show decreased production after birth. Although the mechanism of fetal expression is not clear from this experiment, regulatory elements such as SRF may provide clues to elucidate it.
In another respect, AHSG is one of the few negative acute phase reactants in the human and the rat (Arnaud et al., 1988). The expression of fetuin is reduced in vitro by inflammatory cytokines such as IL-6, TNF-α and IL-1 in rat hepatocytes in culture (Daveau et al., 1990; Ohnishi et al., 1994). Ohnishi et al. (1994) speculated that the molecular mechanism is as follows. The inflammatory cytokines IL-1, IL-6 and TNF stimulate the expression of a nuclear factor for IL-6 (NF-IL6). NF-IL6, which is highly homologous to C/EBP (Akira et al., 1990), occupies the C/EBP regulatory site, thereby interfering with the transcription of the fetuin gene (Falquerho et al., 1992).
Amplifying each exon by PCR is a prerequisite for identifying mutations characterized by changes in exons. Each exon can now be amplified using primers designed from the identified flanking intron sequences.
Acknowledgement
We thank Prof. M. Kimura, Tokai University School of Medicine, for his helpful advice. This work is supported by a grant from the Uehara Memorial Foundation, Tokyo, Japan.
References
Akira, S., Isshiki, H., Sugita, T., Tanabe, O., Kinoshita, S., Nishio, Y., Nakajima, T., Hirano, T., Kishimoto, T., 1990. A nuclear factor for IL-6 expression (NF-IL6) is a member of a C/EBP family. EMBO J. 9, 1897–1906.
Arnaud, P., Miribel, L., Emerson, D.L., 1988. α2HS-glycoprotein. Methods Enzymol. 163, 431–441.
Boutin, B., Feng, S.H., Arnaud, P., 1985. The genetic polymorphism of alpha2-HS glycoprotein: study by ultrathin-layer isoelectric focusing. Am. J. Hum. Genet. 37, 1089–1105.
Brown, W.M., Dziegielewska, K.M., Saunders, N.R., Christie, D.L., Nawratil, P., Müller-Esterl, W., 1992a. The nucleotide and deduced amino acid structures of sheep and pig fetuin. Eur. J. Biochem. 205, 321–331.
Brown, W.M., Saunders, N.R., Møllgård, K., Dziegielewska, K.M., 1992b. Fetuin - an old friend revisited. Bioessays 14, 749–755.
Daveau, M., Davrinche, C., Djelassi, N., Lemetayer, J., Julen, N., Hiron, M., Arnaud, P., Lebreton, J.P., 1990. Partial hepatectomy and mediators of inflammation decrease the expression of liver α2HS-glycoprotein gene in rats. FEBS Lett. 273, 79–81.
Dziegielewska, K.M., Brown, W.M., Casey, S.J., Christie, D.L., Foreman, R.C., Hill, R.M., Saunders, N.R., 1990. The complete cDNA and amino acid sequence of bovine fetuin. J. Biol. Chem. 265, 4354–4357.
Falquerho, L., Patey, G., Paquereau, L., Rossi, V., Lahuna, O., Szpirer, J., Szpirer, C., Levan, G., LeCam, A., 1991. Primary structure of the rat gene encoding an inhibitor of the insulin receptor tyrosine kinase. Gene 98, 209–216.
Falquerho, L., Paquereau, L., Vilarem, M.J.V., Galas, S., Patey, G., LeCam, A., 1992. Functional characterization of the promoter of pp63, a gene encoding a natural inhibitor of the insulin receptor tyrosine kinase. Nucleic Acids Res. 20, 1983–1990.
Grange, T., Roux, J., Rigaud, G., Pictet, R., 1990. Cell-type specific activity of two glucocorticoid responsive units of rat tyrosine aminotransferase gene is associated with multiple binding sites for C/EBP and a novel liver-specific nuclear factor. Nucleic Acids Res. 19, 131–139.
Kellermann, J., Haupt, H., Auerswald, E.A., Müller-Esterl, W., 1989. The arrangement of disulfide loops in human α2HS-glycoprotein. J. Biol. Chem. 264, 14121–14128.
Koide, T., Odani, S., 1987. Histidine-rich glycoprotein is evolutionarily related to the cystatin superfamily. FEBS Lett. 216, 17–21.
Lee, C.C., Bowman, B.H., Yang, F., 1987. Human α2HS-glycoprotein: the A and B chains with a connecting sequence are encoded by a single mRNA transcript. Proc. Natl. Acad. Sci. USA 84, 4403–4407.
Ohnishi, T., Nakamura, O., Arakaki, N., Miyazaki, H., Daikuhara, Y., 1994. Effects of cytokines and growth factors on phosphorylated fetuin biosynthesis by adult rat hepatocytes in primary culture. Biochem. Biophys. Res. Commun. 200, 598–605.
Osawa, M., Umetsu, K., Ohki, T., Nagasawa, T., Suzuki, T., Takeichi, S., 1997. Molecular evidence for human alpha2-HS glycoprotein (AHSG) polymorphism. Hum. Genet. 99, 18–21.
Pedersen, K.O., 1944. Fetuin, a new globulin isolated from serum. Nature 154, 575.
Poli, V., Silengo, L., Altruda, F., Cortese, R., 1989. The analysis of the human hemopexin promoter defines a new class of liver-specific genes. Nucleic Acids Res. 17, 9351–9365.
Rauth, G., Pöschke, O., Fink, E., Eulitz, M., Tippmer, S., Kellerer, M., Häring, H.U., Nawratil, P., Haasemann, M., Jahnen-Dechent, W., Müller-Esterl, W., 1992. The nucleotide and partial amino acid sequences of rat fetuin. Eur. J. Biochem. 204, 523–529.
Salvesen, G., Parkes, C., Abrahamson, M., Grubb, A., Barrett, A.J., 1986. Human low-Mr kininogen contains three copies of a cystatin sequence that are divergent in structure and in inhibitory activity for cysteine proteinases. Biochem. J. 234, 429–434.
Saunders, N.R., Sheardown, S.A., Deal, A., Møllgård, K., Reader, M., Dziegielewska, K.M., 1994. Expression and distribution of fetuin in the developing sheep fetus. Histochemistry 102, 457–475.
Treisman, R., 1985. Transient accumulation of c-fos RNA following serum stimulation requires a conserved 5′ element and c-fos 3′ sequences. Cell 42, 889–902.
Triglia, T., Peterson, M.G., Kemp, D.J., 1988. A procedure for in vitro amplification of DNA segments that lie outside the boundaries of known sequences. Nucleic Acids Res. 16, 8186.
Yang, F., Chen, Z.L., Bergeron, J.M., Cupples, R.L., Friedrichs, W.E., 1992. Human α2HS-glycoprotein/bovine fetuin homologue in mice: identification and developmental regulation of the gene. Biochim. Biophys. Acta 1130, 149–156.