Gene 243 (2000) 161–166 www.elsevier.com/locate/gene
Exon–intron organization of the human gp130 gene ☆
Csaba Szalai a,*, Sára Tóth b, András Falus b
a Central Laboratory, Heim Pál Pediatric Hospital Budapest, PO Box 66, Budapest, H-1958 Hungary
b Department of Genetics, Cell and Immunobiology, Semmelweis University of Medicine, Nagyvárad tér 4, Budapest, H-1089 Hungary
Received 7 September 1999; accepted 29 November 1999
Received by H. Cooke
Abstract
The exon–intron organization and the sequences of the exon–intron boundaries of the human gp130 transmembrane receptor gene have been determined using genomic DNA samples. The gp130 gene comprises 17 exons and 16 introns. The positions of the exon–intron boundaries show good correlation to the functional/homology regions of gp130. Exons 3–17 code for the gp130 protein, and each subdomain of the receptor is encoded by a set of exons. The coding potential of exons and the intron phasing of the human gp130 gene conform to the patterns observed previously for other cytokine receptor genes. This supports the notion that the gp130 gene evolved from the same ancestral gene that gave rise to other members of the cytokine receptor family. © 2000 Elsevier Science B.V. All rights reserved.
Keywords: Cytokine class I receptors
1. Introduction
Gp130 is a transmembrane receptor that is required for signal transduction by a group of cytokines, including interleukin-6 (IL-6), IL-11, leukemia inhibitory factor (LIF), ciliary neurotrophic factor (CNTF), oncostatin M (OSM) and cardiotrophin-1 (CT-1) (Taga, 1997). IL-6, after binding to its specific α-receptor subunit (IL-6R), induces the homodimerization of gp130 (Murakami et al., 1993). LIF, CNTF, OSM and CT-1 induce heterodimerization of gp130 and the LIF receptor (LIFR) (Gearing et al., 1992). The homo- or heterodimerization of gp130 induces phosphorylation
Abbreviations: CHR, cytokine-binding homology region; CNTF, ciliary neurotrophic factor; CT-1, cardiotrophin-1; IL, interleukin; JAK, Janus kinase; LIF, leukemia inhibitory factor; OSM, oncostatin M; STAT, signal transducers and activators of transcription.
☆ Names of the sequences of the gp130 gene, and their accession numbers, are given in Appendix A.
* Corresponding author. Tel.: +36-1-210-0712; fax: +36-1-333-0167.
E-mail addresses:
[email protected] (C. Szalai),
[email protected] (A. Falus)
and activation of the receptor-associated JAK/Tyk kinases, the STAT (signal transducers and activators of transcription) family of transcription factors and Src-family tyrosine kinase pathways (e.g., Hck, Fyn, and Lyn) (Murakami et al., 1993; Stahl et al., 1995; Ernst et al., 1994; Hallek et al., 1997). The entire sequence of the gp130 cDNA has an open reading frame capable of encoding 918 amino acids (Hibi et al., 1990). The first 22 amino acids are a signal peptide. Thus, the mature gp130 consists of an extracellular region of 597 amino acids, a membrane-spanning region of 22 amino acids, and a cytoplasmic region of 277 amino acids. The extracellular part of gp130 consists of six fibronectin-type III-like domains; the first one is predicted to adopt a seven-stranded immunoglobulin-like conformation, while the second and the third ones are designated the cytokine-binding homology region (CHR) (Bravo et al., 1998) and are characterized by four conserved cysteine residues and the WSXWS motif (codons 310–314), respectively. The WSXWS motif and the spacing of cysteine residues are characteristic of the class I cytokine receptor family (Bazan, 1990). Deletion studies have revealed that the gp130 CHR is sufficient for interaction with IL-6 and its receptor (Horsten et al., 1997). The transmembrane
0378-1119/00/$ - see front matter © 2000 Elsevier Science B.V. All rights reserved. PII: S0378-1119(99)00536-3
domain is not only a membrane anchor; it is also necessary for signal transduction and determines the interaction with members of the JAK family (Kim and Baumann, 1997). There are two well-conserved motifs in the membrane-proximal cytoplasmic region, called box1 (codons 652–659) and box2 (codons 691–702). Studies with deletion mutants showed that JAK kinases interact with the box1 region, suggesting that the conserved Pro-X-Pro motif (codons 656–658) is essential for this interaction (Kishimoto et al., 1995). The box2 region is important for gp130-mediated upregulation of DNA synthesis (Murakami et al., 1991). The distal part of the cytoplasmic region includes box3 (codons 761–771), which attracts the SH2 domain of STAT3. In this part, a YXXQ motif is suggested to be important for STAT3 activation (Stahl et al., 1995). A region between codons 782 and 787 (STQPLL) plays an important role in ligand-induced endocytosis and down-regulation of the IL-6–IL-6R–gp130 receptor complex (Dittrich et al., 1996). In this study, we describe the exon–intron organization and the sequences of the exon–intron boundaries of the human gp130 gene.
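The codon positions quoted above can be related to the exon structure reported in Table 2. The following is a minimal sketch, not part of the original analysis: it assumes that cDNA numbering starts at the A of the ATG start codon (as primer 1S in Table 1 suggests) and copies the exon end positions from Table 2. The function names are hypothetical.

```python
from bisect import bisect_left

# Position of the last cDNA nucleotide of exons 3-16, copied from Table 2.
# Assumption: nucleotide 1 is the A of the ATG start codon.
EXON_LAST_NT = {3: 64, 4: 370, 5: 491, 6: 658, 7: 813, 8: 973, 9: 1056,
                10: 1267, 11: 1450, 12: 1552, 13: 1699, 14: 1840,
                15: 1937, 16: 2019}
LAST_EXON = 17  # everything downstream of exon 16 lies in exon 17


def exon_of_nucleotide(nt):
    """Return the exon that contains coding nucleotide `nt` (1-based)."""
    if nt < 1:
        raise ValueError("only coding positions (>= 1) are handled")
    exons = sorted(EXON_LAST_NT)
    ends = [EXON_LAST_NT[e] for e in exons]
    i = bisect_left(ends, nt)  # first exon whose end is >= nt
    return exons[i] if i < len(exons) else LAST_EXON


def exons_of_codons(first_codon, last_codon):
    """Return the list of exons spanned by a codon range (inclusive)."""
    first_nt = 3 * (first_codon - 1) + 1
    last_nt = 3 * last_codon
    return list(range(exon_of_nucleotide(first_nt),
                      exon_of_nucleotide(last_nt) + 1))


# Motifs and codon ranges as given in the Introduction.
MOTIFS = {"WSXWS": (310, 314), "box1": (652, 659), "box2": (691, 702),
          "box3": (761, 771), "STQPLL": (782, 787)}

if __name__ == "__main__":
    for name, (first, last) in MOTIFS.items():
        print(name, exons_of_codons(first, last))
```

Under these assumptions the WSXWS motif falls in exon 8, box1 in exon 16, and box2 and box3 in exon 17, in agreement with the assignments given in Results and discussion.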
Table 1
Oligonucleotide primers used for PCR and sequencing (a)

Sense primers:
−86S (−86–(−66)): AGC ATG ACA TTT AGA AGT AGA
1S (1–27): ATG TTG ACG TTG CAG ACT TGG GTA GTG
211S (211–231): CAT TTT ACT ATT CCT AAG GAG
390S (390–410): TTG AGT TGC ATT GTG AAC GAG
604S (604–628): AAT GCC CTT GGG AAG GTT ACA TCA G
793S (793–813): GAT GCC TCA ACT TGG AGC CAG
916S (916–939): GGT AAG GGA TAC TGG AGT GAC TGG
1045S (1045–1069): CTC GTG TGG AAG ACA TTG CCT CCT T
1597S (1597–1621): AAC GAA GCT GTC TTA GAG TGG GAC C
1777S (1777–1800): GCA GCA TAC ACA GAT GAA GGT GGG
1849S (1849–1869): GAA ATT GAA GCC ATA GTC GTG
1990S (1990–2012): GCC CAG TGG TCA CCT CAC ACT CC
2212S (2212–2235): TCT AGG CCA AGC ATT TCT AGC AGT

Antisense primers:
64−21AS (in intron 3): AGG GTA TTT CAT GAA CCT TAC
233AS (210–233): TTG CTC CTT AGG AAT AGT AAA ATG
444AS (424–444): ACC ATC CCA CTC ACA CCT CAT
543AS (520–543): TGA GGT GGG GGT GTC ACG TTT TGC
682AS (661–682): ATT ATG TGG CGG ATT GGG CTT C
980AS (960–980): GGT GGA TGC TGT GTC TTC AGG
1095AS (1075–1095): ATC CAA GAT TTT TCC ATT GGC
1623AS (1599–1623): TTG GTC CCA CTC TAA GAC AGC TTC G
1860AS (1840–1860): GGC TTC AAT TTC TCC TTG AGC
2014AS (1991–2014): AGG AGT GTG AGG TGA CCA CTG GGC
2699AS (2675–2699): CCT TCA TCA GTC GCA GCC TCC ATG C
2830AS (2813–2830): GAA TTC ACA GAT AAA ATC

Primers used for inverse PCR:
GPEX2S (−79–(−59)): CAT TTA GAA GTA GAA GAC TTA
GPEX2AS (−102–(−81)): CAT GCT TTT TCC ATT GGG TTT C

(a) The primers were selected on the basis of known structures of other members of the cytokine receptor family. In parentheses are the positions of the oligonucleotides in the gp130 cDNA according to Hibi et al. (1990).
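Because sense and antisense primers are both given 5′→3′ in Table 1, an antisense primer should equal the reverse complement of the cDNA over its stated coordinates. Where a sense and an antisense primer overlap, this can be checked without the full cDNA sequence. The sketch below illustrates the idea for the pair 1597S/1623AS. The helper names are hypothetical, and the check only compares the two primers with each other, not with GenBank entry M57230.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence; spaces between triplets are ignored."""
    cleaned = seq.replace(" ", "").upper()
    return cleaned.translate(COMPLEMENT)[::-1]


def overlap_agrees(sense, sense_start, antisense, antisense_start):
    """Compare a sense primer with an antisense primer over their shared cDNA positions.

    `sense_start` and `antisense_start` are the lowest cDNA coordinates
    covered by each primer (the first number in the Table 1 parentheses).
    """
    s = sense.replace(" ", "").upper()
    a = reverse_complement(antisense)  # now reads in the sense orientation
    lo = max(sense_start, antisense_start)
    hi = min(sense_start + len(s), antisense_start + len(a))  # exclusive end
    if lo >= hi:
        raise ValueError("primers do not overlap at the stated coordinates")
    return s[lo - sense_start:hi - sense_start] == a[lo - antisense_start:hi - antisense_start]


if __name__ == "__main__":
    ok = overlap_agrees("AAC GAA GCT GTC TTA GAG TGG GAC C", 1597,
                        "TTG GTC CCA CTC TAA GAC AGC TTC G", 1599)
    print("1597S and 1623AS agree over positions 1599-1621:", ok)
```

The same helper can be applied to other overlapping pairs in Table 1; a mismatch would indicate either a transcription error in a sequence or an offset in the stated coordinates.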
2. Materials and methods
Total genomic DNA was extracted from white blood cells using the method of Miller et al. (1988). To investigate sequence variations, DNA from five healthy volunteers was sequenced. Oligonucleotide primers for PCR analysis were selected using the gp130 sequence submitted to GenBank by Hibi et al. (1990) (GenBank accession No. M57230) (Table 1). PCR was carried out with the Expand Long Template PCR System (Boehringer Mannheim) in system 3, as proposed by the supplier. The reaction conditions were as follows: initial denaturation for 2 min at 94°C; then 10 cycles of 94°C for 10 s, 56–63°C for 30 s and 68°C for 3 min; then 20 cycles of 94°C for 10 s, 56–63°C for 30 s and 68°C for 3 min, with a time increment of 20 s per cycle. The annealing temperature depended on the melting points of the primers. The lengths of the PCR products were calculated by linear regression against DNA molecular weight markers VI and VII (Boehringer Mannheim) after separation in a 1% agarose gel. Intron 1 could not be amplified using this conventional protocol; in this case, inverse PCR was applied (Ochman et al., 1990). The genomic DNA was digested with RsaI, ligated with T4 DNA ligase at 15°C overnight, and amplified with the above-mentioned protocol using primers synthesized in the opposite orientations to those normally employed for PCR (Table 1).
The PCR products were purified, cloned into the pCR-XL-TOPO plasmid vector with the TOPO PCR Cloning Kit (Invitrogen) and transformed into MOSBlue competent cells (Amersham). After propagation of the bacteria on LB agar medium containing kanamycin, single colonies were picked and propagated overnight in LB medium containing kanamycin. The plasmids were purified with the FlexiPrep Kit (Pharmacia Biotech) and sequenced by the dideoxy chain-termination method, using the enzyme Sequenase (Amersham) and 35S-dATP as labelled nucleotide (Institute of Isotopes, Hungary). The primers used to sequence both strands were specific either to the plasmid or to the insert.
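The size estimation of PCR products by linear regression against molecular weight markers can be sketched as follows. The text does not specify which variables were regressed; this example assumes the common semi-logarithmic approach, in which log10 of the fragment length is fitted against migration distance. The function names and the ladder values are hypothetical and do not correspond to markers VI and VII.

```python
import math


def fit_semilog(marker_bp, marker_mm):
    """Least-squares fit of log10(fragment size in bp) against migration distance (mm).

    Returns (slope, intercept) of the fitted line.
    """
    if len(marker_bp) != len(marker_mm) or len(marker_bp) < 2:
        raise ValueError("need at least two marker bands with matching distances")
    ys = [math.log10(bp) for bp in marker_bp]
    n = len(ys)
    mean_x = sum(marker_mm) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in marker_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(marker_mm, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_bp(distance_mm, slope, intercept):
    """Estimate the size of a band from its migration distance."""
    return 10 ** (slope * distance_mm + intercept)


# Synthetic, purely illustrative ladder (NOT the real marker sizes).
LADDER_BP = [10000, 1000, 100]
LADDER_MM = [10.0, 20.0, 30.0]

if __name__ == "__main__":
    slope, intercept = fit_semilog(LADDER_BP, LADDER_MM)
    print(round(estimate_bp(25.0, slope, intercept)))
```

In practice the fit is only reliable within the size range spanned by the marker bands, which is one reason the intron sizes in Table 2 are reported as estimates.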
3. Results and discussion
In our study we amplified most parts of the human gp130 gene and sequenced the exon–intron boundaries. The primer pairs used for PCR amplification were selected on the basis of known structures of other members of the cytokine receptor family, and with the
prediction that they were specific to adjacent exons. All introns could be amplified with the given protocol except for intron 1, where inverse PCR was applied with primers in exon 2 (Table 1) synthesized in the opposite orientations to those normally employed for PCR (Ochman et al., 1990). This technique allowed us to sequence regions outside exon 2. The gp130 gene is located on the long arm of chromosome 5 (q11) (Rodriguez et al., 1995) and comprises 17 exons and 16 introns (Fig. 1). The sequences and the positions of the exon–intron boundaries, and the estimated lengths of the introns, are summarized in Table 2. The positions of the exon–intron boundaries show good correlation to the functional/homology regions of gp130. The first translated exon (exon 3) encodes the signal sequence. As found in most Ig-superfamily molecules (Williams and Barclay, 1988), the Ig-like domain of gp130 is encoded by a single exon (exon 4). Exons 5–8 encode the cytokine-binding homology region, which consists of two fibronectin III modules. Intron 6 divides the CHR into two parts, one containing the cysteine-rich region and the other containing the WSXWS motif. Exons 9–14 encode three fibronectin III modules, with every two exons encoding one module. Exon 15 encodes the transmembrane region, and exons 16–17 encode the cytoplasmic region. Exon 16 contains the box1 motif; exon 17 contains the box2 and box3 motifs, all tyrosines, the signals that cluster gp130 in clathrin-coated pits, allowing it to be internalized by receptor-mediated
endocytosis, and the entire 3′-noncoding region. The estimated length of the gene without intron 1 is approx. 41 kb. The DNA sequence of the exons matched that of the cDNA from human placenta (Hibi et al., 1990), except in two chromosomes where there was a cytosine instead of a guanine at position 22. Since 10 chromosomes from five individuals were sequenced, this indicates a 20% frequency for this polymorphism in these samples. The substitution changes codon 8, in the signal peptide, from a valine codon to a leucine codon (GTA→CTA). The valine in this position is not conserved in the cytokine receptor family, and the two amino acids are very similar, differing only in a methyl group; thus this conservative substitution probably does not cause any alteration in the function of gp130. Moreover, there is also a leucine at this position in the mouse gp130 gene (Saito et al., 1992). No other polymorphism or sequence variation was found in the samples. All of the introns had typical splice donor and acceptor sequences, conforming to the consensus sequences (Breathnach and Chambon, 1981). The numbers and the coding potential of exons and the intron phasing of the human gp130 gene are identical to those of some other members of the cytokine receptor family [e.g. human and murine granulocyte colony-stimulating factor receptor (Ito et al., 1994; Seto et al., 1992)] and also conform to the patterns observed previously for other cytokine receptor genes
Table 2
Sequences and positions of the exon–intron boundaries and the sizes of the introns and exons of the human gp130 gene (a)

Exon  Size (bp)  Position of the  Intron donor       Intron     Acceptor
                 last nucleotide                     size (kb)
1     152        −104             TGAG/gtgaggaggg    >10        ctctttacag/TGAA
2     88         −16              TTGG/gtaagtaatt    6.3        tctgttgcag/AAAT
3     80         64               ACA G/gtaaggttca   6.4        tcctttttag/GT GA
4     306        370              GGC T/gtaagttttg   1.0        tcccttttag/TG CC
5     121        491              AA TG/gtacgttatt   3.8        ttgtatttag/G GCA
6     167        658              AAA G/gtagttataaa  0.7        atttttcag/TG AA
7     155        813              C CAG/gtaaattgga   2.5        tttgtttcag/ATT C
8     160        973              GAT A/gtaagtacat   3.0        ttttctttag/GA CC
9     84         1056             G AAG/gtaaaaaatc   1.0        tttattacag/ACA T
10    210        1267             CAA G/gtttgtatct   1.0        tgttttaaag/CT AC
11    183        1450             AGA G/gtatacctgg   2.2        cttgctttag/GG AA
12    103        1552             gct C/gtaagtccaa   0.18       tctgttacag/CA CC
13    147        1699             ACT G/gtaataaaac   0.5        tttcttccag/CT GT
14    141        1840             TTTG/gtaagaaata    3.6        tttttttaag/CT CA
15    97         1937             AC CT/gtaagtaact   5.0        ttctcttcag/A ATT
16    82         2019             A AGG/gtaagagaaa   0.8        tcttccttag/CAC A
17    735 translated + 70 untranslated

(a) Each row lists an exon and the intron that follows it. Exon sequences are in upper case, intron sequences are in lower case. Intron 1 donor site is according to O'Brien and Manolagas (1997); accession No.: U70617.
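The estimate of approx. 41 kb for the gene without intron 1 can be reproduced from the exon and intron sizes in Table 2. The short sketch below simply sums the tabulated values; the variable and function names are illustrative, and intron 1 is excluded because only a lower bound (>10 kb) is reported for it.

```python
# Exon sizes in bp for exons 1-17, copied from Table 2
# (exon 17 = 735 translated + 70 untranslated nucleotides).
EXON_BP = [152, 88, 80, 306, 121, 167, 155, 160, 84,
           210, 183, 103, 147, 141, 97, 82, 735 + 70]

# Estimated intron sizes in kb for introns 1-16, copied from Table 2.
# Intron 1 is only known to exceed 10 kb, so it is stored as None.
INTRON_KB = [None, 6.3, 6.4, 1.0, 3.8, 0.7, 2.5, 3.0,
             1.0, 1.0, 2.2, 0.18, 0.5, 3.6, 5.0, 0.8]


def gene_length_kb_without_intron1():
    """Sum of all exons plus introns 2-16, in kb."""
    exon_kb = sum(EXON_BP) / 1000.0
    intron_kb = sum(size for size in INTRON_KB if size is not None)
    return exon_kb + intron_kb


if __name__ == "__main__":
    print(f"{gene_length_kb_without_intron1():.2f} kb")
```

The exons contribute about 3.1 kb and introns 2–16 about 38 kb, so the total rounds to the 41 kb stated in the text. The intron sizes are gel-based estimates, so the sum carries the same uncertainty.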
Fig. 1. (A) The exon–intron organization of the human gp130 gene. The vertical bars and boxes indicate the exons, while the horizontal lines represent the introns. Distances between exons show the approximate lengths of the introns. Numbers between exons indicate the phase of the introns. A phase 1 intron lies between the first and second nucleotides of a codon, a phase 2 intron lies between the second and third nucleotides of a codon, and a phase 0 intron lies between codons. The arrow indicates an 85 bp exon inserted in the soluble form. (B) The composite structure of the gp130 protein. Each box indicates a coding exon. Conserved cysteine residues (CC; in exons 5 and 6) and the WSDWS motif (in exon 8) are indicated. Abbreviations: S, signal peptide; Ig, immunoglobulin-like domain; FNIII, fibronectin-type III-like domain; CHR D1,2, cytokine-binding homology region domains 1 and 2; TM, transmembrane region.
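The intron phases defined in the legend can be derived from the positions in Table 2. As a sketch, assuming that coding nucleotide 1 is the A of the ATG start codon, the phase of the intron following an exon is the position of the last exon nucleotide modulo 3. The names below are hypothetical, and introns 1 and 2 are omitted because they lie in the 5′-untranslated region.

```python
# Position of the last coding nucleotide of exons 3-16, copied from Table 2.
# Assumption: numbering starts at the A of the ATG start codon.
EXON_LAST_NT = {3: 64, 4: 370, 5: 491, 6: 658, 7: 813, 8: 973, 9: 1056,
                10: 1267, 11: 1450, 12: 1552, 13: 1699, 14: 1840,
                15: 1937, 16: 2019}


def intron_phase(last_exon_nt):
    """Phase of the intron that follows an exon ending at `last_exon_nt`.

    0: the intron lies between two codons
    1: the intron lies between the first and second nucleotides of a codon
    2: the intron lies between the second and third nucleotides of a codon
    """
    if last_exon_nt < 1:
        raise ValueError("intron lies in the 5' untranslated region")
    return last_exon_nt % 3


def all_phases():
    """Map each intron number (3-16) to its phase."""
    return {exon: intron_phase(nt) for exon, nt in EXON_LAST_NT.items()}


if __name__ == "__main__":
    for intron, phase in all_phases().items():
        print(f"intron {intron}: phase {phase}")
```

Under this numbering most introns within the coding region are phase 1, with phase 0 for introns 7, 9 and 16 and phase 2 for introns 5 and 15. This agrees with the codon spacing shown in the donor sequences of Table 2 wherever a codon boundary is marked.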
(Nakagawa et al., 1994). This supports the notion that the gp130 gene evolved from the same ancestral gene that gave rise to other members of the cytokine receptor family. There is an intronless pseudogene for the human gp130 gene on chromosome band 17p11 (Rodriguez et al., 1995). It probably differs significantly in sequence from the active gene, because we obtained a product that might correspond to the pseudogene only when we attempted to amplify the whole 5′-UTR part of the gp130 gene with a single primer pair. This artifact could be avoided with an intron-specific primer. No other primer pairs amplified the intronless pseudogene. In addition to the cell membrane-anchored form of gp130, naturally occurring soluble forms of the receptor molecule have been reported in biological fluids (Narazaki et al., 1993). One form of soluble gp130 has been cloned and sequenced (Diamant et al., 1997). This alternatively spliced variant has a new 85 bp exon inserted just after exon 14 (Fig. 1). The new exon is located in intron 14 and gives rise to a frame-shift resulting in a stop codon in exon 15, 1 bp
before the beginning of the transmembrane coding part of the gp130 gene. Homozygous gp130 mutant mice exhibit embryonic lethality, possibly as a consequence of cardiac and hematopoietic defects (Yoshida et al., 1996), and postnatally induced inactivation of gp130 in mice results in neurological, cardiac, hematopoietic, immunological, hepatic and pulmonary defects (Betz et al., 1998). Nevertheless, no human disease has so far been associated with a mutation in the gp130 gene. Gp130 is expressed in almost all cells and is shared as a signal transducer by many cytokines. The chromosomal organization and the sequences of the splice junctions reported herein will be useful for identifying the functions of gp130 in health and disease.
Acknowledgements
This study was supported by grants from the Hungarian Scientific Research Foundation: OTKA T-016111, T-13225 and from the Hungarian Ministry of Welfare: ETT T-07726/93.
Appendix A:
References
Bazan, J.F., 1990. Haemopoietic receptors and helical cytokines. Immunol. Today 11, 350–354.
Betz, U.A.K., Bloch, W., van den Broek, M., Yoshida, K., Taga, T., Kishimoto, T., Addicks, K., Rajewsky, K., Müller, W., 1998. Postnatally induced inactivation of gp130 in mice results in neurological, cardiac, hematopoietic, immunological, hepatic and pulmonary defects. J. Exp. Med. 188, 1955–1965.
Bravo, J., Staunton, D., Heath, J.K., Jones, E.Y., 1998. Crystal structure of a cytokine-binding region of gp130. EMBO J. 17, 1665–1674.
Breathnach, R., Chambon, P., 1981. Organization and expression of eukaryotic split genes coding for proteins. Annu. Rev. Biochem. 50, 349–383.
Diamant, M., Rieneck, K., Mechti, N., Zhang, X.-G., Svenson, M., Bendtzen, K., Klein, B., 1997. Cloning and expression of an alternatively spliced mRNA encoding a soluble form of the human interleukin-6 signal transducer gp130. FEBS Lett. 412, 379–384.
Dittrich, E., Haft, C.R., Muys, L., Heinrich, P.C., Graeve, L., 1996. A di-leucine motif and an upstream serine in the interleukin-6 (IL-6) signal transducer gp130 mediate ligand-induced endocytosis and down-regulation of the IL-6 receptor. J. Biol. Chem. 271, 5487–5494.
Ernst, M., Gearing, D.P., Dunn, A.R., 1994. Functional and biochemical association of Hck with the LIF/IL-6 receptor signal transducing subunit gp130 in embryonic stem cells. EMBO J. 13, 1574–1584.
Gearing, D.P., Comeau, M.R., Friend, D.J., Gimpel, S.D., Thut, C.J., McGourty, J., Brasher, K.K., King, J.A., Gillis, S., Mosley, B., Ziegler, S.F., Cosman, D., 1992. The IL-6 signal transducer, gp130: an oncostatin M receptor and affinity converter for the LIF receptor. Science 255, 1434–1437.
Hallek, M., Neumann, C., Schaffer, M., Danhauser-Riedl, S., von Bubnoff, N., de Vos, G., Druker, B.J., Yasukawa, K., Griffin, J.D., Emmerich, B., 1997. Signal transduction of interleukin-6 involves tyrosine phosphorylation of multiple cytosolic proteins and activation of Src-family kinases Fyn, Hck and Lyn in multiple myeloma cell lines. Exp. Hematol. 13, 1367–1377.
Hibi, M., Murakami, M., Saito, M., Hirano, T., Taga, T., Kishimoto, T., 1990. Molecular cloning and expression of an IL-6 signal transducer, gp130. Cell 63, 1149–1157.
Horsten, U., Müller-Newen, G., Gerhartz, C., Wollmer, A., Wijdenes, J., Heinrich, P.C., Grötzinger, J., 1997. Molecular modeling-guided mutagenesis of the extracellular part of gp130 leads to the identification of contact sites in the interleukin-6 (IL-6)·IL-6 receptor·gp130 complex. J. Biol. Chem. 272, 23748–23757.
Ito, Y., Seto, Y., Brannan, C.I., Copeland, N.G., Jenkins, N.A., Fukunaga, R., Nagata, S., 1994. Structural analysis of the functional gene and pseudogene encoding the murine granulocyte colony-stimulating-factor receptor. Eur. J. Biochem. 220, 881–891.
Kim, H., Baumann, H.J., 1997. Transmembrane domain of gp130 contributes to intracellular signal transduction in hepatic cells. J. Biol. Chem. 272, 30741–30747.
Kishimoto, T., Akira, S., Narazaki, M., Taga, T., 1995. Interleukin-6 family of cytokines and gp130. Blood 86, 1243–1254.
Miller, S.A., Dykes, D.D., Polesky, H.F., 1988. A simple salting out procedure for extracting DNA from human nucleated cells. Nucleic Acids Res. 16, 1215.
Murakami, M., Hibi, M., Nakagawa, T., Yasukawa, K., Yamanishi, K., Taga, T., Kishimoto, T., 1993. IL-6-induced homodimerization of gp130 and associated activation of a tyrosine kinase. Science 260, 1808–1810.
Murakami, M., Narazaki, M., Hibi, M., Yawata, H., Yasukawa, K., Hamaguchi, M., Taga, T., Kishimoto, T., 1991. Critical cytoplasmic region of the interleukin 6 signal transducer gp130 is conserved in the cytokine receptor family. Proc. Natl. Acad. Sci. USA 88, 11349–11353.
Nakagawa, Y., Kosugi, H., Miyajima, A., Arai, K.-i., Yokota, T., 1994. Structure of the gene encoding the α subunit of the human granulocyte–macrophage colony stimulating factor receptor. J. Biol. Chem. 269, 10905–10912.
Narazaki, M., Yasukawa, K., Saito, T., Ohsugi, Y., Fukui, H., Koishihara, Y., Yancopoulos, G.D., Taga, T., Kishimoto, T., 1993. Soluble forms of the interleukin-6 signal-transducing receptor component gp130 in human serum possessing a potential to inhibit signals through membrane anchored gp130. Blood 82, 1120–1126.
O'Brien, C.A., Manolagas, S.C., 1997. Isolation and characterization of the human gp130 promoter. J. Biol. Chem. 272, 15003–15010.
Ochman, H., Medhora, M.M., Garza, D., Hartl, D.L., 1990. Amplification of flanking sequences by inverse PCR. In: Innis, M.A., Gelfand, D.H., Sninsky, J.J., White, T.J. (Eds.), PCR Protocols. Academic Press, San Diego, pp. 219–222.
Rodriguez, C., Grosgeorge, J., Nguyen, V.C., Gaudray, P., Theillet, C., 1995. Human gp130 transducer chain gene (IL6ST) is localized to chromosome band 5q11 and possesses a pseudogene on chromosome band 17p11. Cytogenet. Cell Genet. 70, 64–67.
Saito, M., Yoshida, K., Hibi, M., Taga, T., Kishimoto, T., 1992. Molecular cloning of a murine IL-6 receptor-associated signal transducer, gp130, and its regulated expression in vivo. J. Immunol. 148, 4066–4071.
Seto, Y., Fukunaga, R., Nagata, S., 1992. Chromosomal gene organization of the human granulocyte colony-stimulating factor receptor. J. Immunol. 148, 259–266.
Stahl, N., Farrugella, T.J., Boulton, T.G., Zhong, Z., Darnell, J.E., Yancopoulos, D.G., 1995. Choice of STATs and other substrates specified by modular tyrosine-based motifs in cytokine receptors. Science 267, 1349–1353.
Taga, T., 1997. Gp130 and the interleukin-6 family of cytokines. Annu. Rev. Immunol. 15, 797–819.
Williams, A.F., Barclay, A.N., 1988. The immunoglobulin superfamily: domains for cell surface recognition. Annu. Rev. Immunol. 6, 381–385.
Yoshida, K., Taga, T., Saito, M., Suematsu, S., Kumanogoh, A., Tanaka, T., Fujiwara, H., Hirata, M., Yamagami, T., Nakahata, T., Hirabayashi, T., Yoneda, Y., Tanaka, K., Wang, W.-Z., Mori, C., Shiota, K., Yoshida, N., Kishimoto, T., 1996. Targeted disruption of gp130, a common signal transducer for the interleukin 6 family of cytokines, leads to myocardial and hematological disorders. Proc. Natl. Acad. Sci. USA 93, 407–411.