Gene 194 (1997) 301–303
Short Communication
Structure of the gene encoding human colligin-2 (CBP2)
Shiro Ikegawa *, Yusuke Nakamura
Laboratory of Molecular Medicine, Institute of Medical Science, University of Tokyo, 4-6-1 Shirokanedai, Minato-ku, Tokyo 108, Japan
Received 2 December 1996; accepted 12 March 1997; Received by T. Sekiya
Abstract
Colligins are collagen-binding proteins localized to the endoplasmic reticulum that belong to the superfamily of serine protease inhibitors and play a role in collagen biosynthesis. Previously, we cloned the human colligin-2 gene (CBP2) and mapped it to chromosome 11q13.5. To further characterize the CBP2 gene, we have determined its genomic structure and the 5′-flanking sequence. The CBP2 gene spanned approximately 11 kb of genomic DNA and consisted of five exons. The promoter sequence of the human gene showed significant homology to that of its murine counterpart, which contained several regulatory sequences including heat-shock and retinoic acid-responsive elements. These findings suggest that colligin may function as a collagen-specific molecular chaperon and play a role in the process of retinoic acid-induced differentiation. © 1997 Elsevier Science B.V.
Keywords: Colligin; Genomic structure; Promoter sequence; Heat-shock protein; Retinoic acid-responsive element
1. Introduction
Colligins are proteins that bind specifically to Collagen I, Collagen IV and gelatin (Kurkinen et al., 1984; Cates et al., 1987). The first member of the colligin group to be described was a 47-kDa glycoprotein from murine parietal endoderm cells (Kurkinen et al., 1984); similar proteins have since been identified in various species including mouse (Wang and Gudas, 1990), rat (Clarke et al., 1991), chicken (Hirayoshi et al., 1991) and human (CBP1: Clarke and Sanwal, 1992; CBP2: Ikegawa et al., 1995). The high degree of amino acid sequence homology and the conservation of characteristic motifs among the colligins allow them to be classified as a distinct family of proteins. All colligins contain a reactive site common to the serpin (serine protease inhibitor) superfamily of proteins and a C-terminal RDEL sequence that acts as an ER (endoplasmic reticulum) retention signal.
* Corresponding author. Tel. +81 3 54495372; Fax +81 3 54495433; e-mail: [email protected]
Abbreviations: bp, base pair(s); cDNA, DNA complementary to RNA; ER, endoplasmic reticulum; HS, heat shock; HSE, HS element; HSP, HS protein; kDa, kilodalton; kb, kilobase(s) or 1000 bp; nt, nucleotide(s); RARE, retinoic acid-responsive element.
0378-1119/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII S0378-1119(97)00209-6
The promoter sequences of murine colligins have been reported (J6, Wang, 1992; Hsp47, Hosokawa et al., 1993); the murine promoters contain several regulatory elements including a heat-shock element (HSE), an Sp1 binding site, an AP-1 binding site, a GATA binding site, and a retinoic acid-responsive element (RARE). Neither the promoter sequences nor the genomic structures of human colligins, however, have been reported to date. We recently cloned the human colligin-2 (CBP2) cDNA and mapped it to 11q13.5 (Ikegawa et al., 1995). As a step toward examining the regulation of its expression in human tissues and determining a possible association of this gene with diseases, we have determined the exon–intron organization of the gene and characterized the 5′-flanking region.
2. Experimental
A human genomic library constructed in the cosmid vector pWEX15 was screened by a standard procedure, using as a probe a 1.5-kb EcoRI fragment from CBP2 cDNA (DDBJ accession No. 83174) that contained the
S. Ikegawa, Y. Nakamura / Gene 194 (1997) 301–303
Fig. 1. Restriction mapping of the CBP2 gene. The gene consists of 5 exons (I–V) separated by four introns. Open rectangles indicate the 5′- and 3′-untranslated regions, and filled rectangles the coding regions. Restriction sites: E, EcoRI; X, XhoI. The scale in base pairs is given below.
entire coding region of the gene. A cosmid clone was isolated that contained the entire coding sequence for CBP2, as well as the 5′-flanking region. A restriction map of the clone was constructed (Fig. 1). The genomic structure of CBP2 is very similar to that of its murine counterpart except in the 5′-untranslated region, where alternatively spliced variants have been reported in mouse homologs (Wang, 1992; Hosokawa et al., 1993). The CBP2 gene spanned approximately 11 kb of genomic DNA, and consisted of five exons that were contained in four genomic EcoRI fragments (5, 5, 4 and 2 kb, respectively). The genomic DNA sequence of the CBP2 gene was compared with the cDNA sequence to determine the exon–intron boundaries as well as the 5′- and 3′-flanking regions of the gene (Table 1). Exons I–V were composed of 53, 656, 99, 234 and 1005 nucleotides, respectively, and introns 1–4 were 4.0, 2.0, 0.1 and 2.5 kb, respectively. Sequences at exon–intron junctions were consistent with the consensus sequences for splice junctions. More than 1 kb of nucleotide sequence of the 5′-flanking region of CBP2 was determined (Fig. 2). The promoter sequence (DDBJ accession number: 83751) had 74% identity to its murine counterpart and contained several regulatory elements. A TATA box was positioned at
Fig. 2. Nucleotide sequence of the 1-kb region upstream of the CBP2 gene. Sequence motifs are underlined as follows: 1, TATA box; 2, heat-shock response element; 3, purine-rich stretch; 4, GATA binding site; 5, possible retinoic acid-responsive element.
nucleotide (nt) −45. Neither the Sp1 binding site nor the AP-1 binding site present in the murine homolog was found. A potential HSE spanned positions −91 to −78. A purine-rich stretch extended for 35 nucleotides between nts −770 and −804. A GATA binding site (WGATAR) was found at nt −914 (5′-CTATCA-3′, complementary to TGATAG on the other strand). Direct repeats of TGACC or TGACC-like sequence, possible RAREs (Vasios et al., 1989; Wang, 1992)
Table 1
Exon–intron structure of the CBP2 gene

Exon No.   Exon size (bp)   Intron location   5′ splice donor    Intron size (kb)   3′ splice acceptor
1          53               53/54             GACCCAG gtgaggg    (4)                ctcacag GCCCACC
2          656              709/710           TTCAAGC gtgagtc    (2)                actacag CACACTG
3          99               808/809           CGGACAG gtaggtg    (0.109)            ctcccag GCCTCTA
4          234              1042/1043         CCTGCAG gtaaggg    (2.5)              cccacag AAACACC
5          1005

The donor, intron size and acceptor columns together give the sequence at each exon–intron junction. Exon sequences are in capital letters.
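The intron locations in Table 1 follow directly from the cumulative exon sizes, and the intron-side sequences can be checked against the GT–AG splice rule. The short Python sketch below is an illustration only, not part of the original analysis: the function names are hypothetical, the intron sizes are assumed to be in kb (as stated in the text), and the span estimate ignores flanking sequence.

```python
from itertools import accumulate

# Values transcribed from Table 1. Intron sizes are assumed to be in kb.
EXON_SIZES_BP = [53, 656, 99, 234, 1005]
INTRON_SIZES_KB = [4.0, 2.0, 0.109, 2.5]

# Intron-side heptamers at each junction: (5' donor, 3' acceptor).
JUNCTIONS = [
    ("gtgaggg", "ctcacag"),
    ("gtgagtc", "actacag"),
    ("gtaggtg", "ctcccag"),
    ("gtaaggg", "cccacag"),
]


def intron_locations(exon_sizes):
    """Return (last nt of exon, first nt of next exon) in cDNA coordinates."""
    exon_ends = list(accumulate(exon_sizes))[:-1]  # no intron after the last exon
    return [(end, end + 1) for end in exon_ends]


def follows_gt_ag(donor, acceptor):
    """True if the intron starts with 'gt' and ends with 'ag'."""
    return donor.startswith("gt") and acceptor.endswith("ag")


def approximate_span_kb(exon_sizes, intron_sizes_kb):
    """Sum of exon and intron lengths in kb (flanking regions not included)."""
    return sum(exon_sizes) / 1000 + sum(intron_sizes_kb)


if __name__ == "__main__":
    print(intron_locations(EXON_SIZES_BP))
    print(all(follows_gt_ag(d, a) for d, a in JUNCTIONS))
    print(round(approximate_span_kb(EXON_SIZES_BP, INTRON_SIZES_KB), 1))
```

Under these assumptions the computed locations reproduce the 53/54, 709/710, 808/809 and 1042/1043 entries of Table 1, all four introns satisfy the GT–AG rule, and the summed length is about 10.7 kb, in line with the approximately 11 kb reported for the gene.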
existed at nts −1012 to −983, −708 to −682 and −657 to −630.
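The GATA site above is reported on the complementary strand: CTATCA reads as TGATAG on the other strand, which fits the WGATAR consensus (W = A or T, R = A or G). A minimal Python sketch of such a two-strand motif search is given below. It is an illustration under stated assumptions, not the procedure used in this study; the helper names are hypothetical and only a subset of the IUPAC ambiguity codes is included.

```python
import re

# Subset of IUPAC nucleotide codes, mapped to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "R": "[AG]", "Y": "[CT]", "S": "[CG]", "N": "[ACGT]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif(seq, motif):
    """Return sorted (0-based start on the given strand, strand) hits.

    Hits on the complementary strand are reported with the start
    coordinate converted back to the given (forward) strand.
    """
    seq = seq.upper()
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in motif) + ")")
    n, k = len(seq), len(motif)
    hits = [(m.start(), "+") for m in pattern.finditer(seq)]
    hits += [(n - m.start() - k, "-")
             for m in pattern.finditer(reverse_complement(seq))]
    return sorted(hits)


if __name__ == "__main__":
    print(reverse_complement("CTATCA"))
    print(find_motif("CTATCA", "WGATAR"))
```

Running the sketch on the hexamer CTATCA reports a single hit on the minus strand, which is the situation described for nt −914.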
3. Discussion
The presence of an HSE would place CBP2 in the heat-shock protein (HSP) family. HSPs are thought to act as "molecular chaperons" in the folding or assembly of newly synthesized or malformed proteins. As colligin binds specifically to collagen, and the expression of colligin always correlates with that of collagen genes (Clarke et al., 1993), colligin is likely to function as a collagen-specific molecular chaperon. The presence of RAREs is of particular interest. Retinoic acid responsiveness of a similar sequence in murine colligin has been demonstrated by transfection experiments (Wang, 1992). In laboratory animals, regulation of colligin by retinoic acid is developmental: Gp46, rat colligin, is absent in undifferentiated F9 embryonal carcinoma cells, but the protein is produced when cellular differentiation is induced by retinoic acid (Nandan et al., 1990). The induction of colligin parallels that of Collagen IV (Wang and Gudas, 1990). The role of colligin in the process of retinoic acid-induced differentiation remains to be determined.
Acknowledgement
This work was supported in part by grants from the Ministry of Education, Culture, Sports and Science of Japan.
References
Cates, G.A., Nandan, D., Brickenden, A.M., Sanwal, B.D., 1987. Differentiation defective mutants of skeletal myoblasts altered in a gelatin binding glycoprotein. Biochem. Cell Biol. 65, 767–775.
Clarke, E.P., Cates, G.A., Ball, E.H., Sanwal, B.D., 1991. A collagen-binding protein in the endoplasmic reticulum of myoblasts exhibits relationship with serine proteinase inhibitors. J. Biol. Chem. 266, 17230–17235.
Clarke, E.P., Jain, N., Brickenden, A.M., Lorimer, I.A., Sanwal, B.D., 1993. Parallel regulation of collagen I and colligin, a collagen-binding protein and a member of the serine proteinase inhibitor family. J. Cell Biol. 121, 193–199.
Clarke, E.P., Sanwal, B.D., 1992. Cloning of a human collagen-binding protein, and its homology with rat gp46, chick hsp47 and mouse J6 proteins. Biochim. Biophys. Acta 1129, 246–248.
Hirayoshi, K., Kudo, H., Takechi, H., Nakai, A., Iwamatsu, K., Yamada, K.M., Nagata, K., 1991. HSP47, a tissue-specific transformation-sensitive, collagen-binding heat shock protein of chicken embryonal fibroblasts. Mol. Cell Biol. 11, 4036–4044.
Hosokawa, N., Takechi, H., Yokota, S., Hirayoshi, K., Nagata, K., 1993. Structure of the gene encoding the mouse 47-kDa heat shock protein (HSP47). Gene 126, 187–193.
Ikegawa, S., Sudo, K., Okui, K., Nakamura, Y., 1995. Isolation, characterization and chromosomal assignment of human colligin-2. Cytogenet. Cell Genet. 71, 182–186.
Kurkinen, M., Taylor, A., Garrels, J.I., Hogan, B.L.M., 1984. Cell surface-associated proteins which bind native type IV collagen or gelatin. J. Biol. Chem. 259, 5915–5922.
Nandan, D., Cates, G.A., Ball, E.H., Sanwal, B.D., 1990. Partial characterization of a collagen-binding, differentiation-related glycoprotein from skeletal myoblasts. Arch. Biochem. Biophys. 278, 291–296.
Vasios, G.W., Gold, J.D., Petkovich, M., Chambon, P., Gudas, L.J., 1989. A retinoic acid responsive element is present in the 5′ flanking region of the laminin B1 gene. Proc. Natl. Acad. Sci. USA 86, 9099–9103.
Wang, S.-Y., 1992. Structure of the gene and its retinoic acid-regulatory region for murine J6 serpin. J. Biol. Chem. 267, 15362–15366.
Wang, S.-Y., Gudas, L.J., 1990. A retinoic acid-inducible mRNA from F9 teratocarcinoma cells encodes a novel proteinase inhibitor homologue. J. Biol. Chem. 265, 15818–15822.