Biochimica et Biophysica Acta, 1087 (1990) 235-240 Elsevier BBAEXP 90192
Islet amyloid polypeptide: structure and upstream sequences of the IAPP gene in rat and man
A.D.M. van Mansfeld 1, S. Mosselman 1, J.W.M. Höppener 1, J. Zandberg 1, H.A.A.M. van Teeffelen 1, P.D. Baas 1, C.J.M. Lips 2 and H.S. Jansz 1
1 Laboratory for Physiological Chemistry and Institute of Molecular Biology and Medical Biotechnology, University of Utrecht, Utrecht and 2 Department of Internal Medicine, University Hospital, Utrecht (The Netherlands)
(Received 28 May 1990)
Key words: Islet amyloid polypeptide; Amylin; IAPP gene; Type 2 diabetes mellitus; Diabetes mellitus; Gene structure; (Rat); (Human)
Islet amyloid polypeptide (IAPP) or amylin is a pancreatic islet hormone which was first found in amyloid in insulinomas and in pancreases of patients with type 2 diabetes. In rat a similar polypeptide occurs; however, pancreatic amyloid in this species has not been described. Here we report the structure of the rat and human IAPP genes. Both consist of three exons and two introns which are very similar. The upstream sequence of the rat IAPP gene contains a TATA-box, a CCAAT-sequence and a GT-element, whereas the upstream sequence of the human IAPP gene contains a TATA-box and a rat insulin enhancer-like sequence. This suggests that the rat and human IAPP genes may be controlled differently at the transcriptional level.
Introduction
Islet amyloid polypeptide (IAPP) or amylin, a 37-amino acid polypeptide, is the major protein component of amyloid deposits found in insulinomas [1] and in pancreatic islets of type 2 or non-insulin-dependent diabetes mellitus (NIDDM) patients [2]. IAPP is synthesized in the β cells of the islets of Langerhans [3] from where it is secreted [4] together with insulin [5]. Human IAPP (hIAPP) has been shown to inhibit both basal and insulin-stimulated glucose uptake in rat skeletal muscle in vitro [6,7]. Administration of rat IAPP (rIAPP) antagonizes insulin action in liver and peripheral tissues in rat in vivo [8], and administration of hIAPP causes peripheral insulin resistance in vivo in dogs [9] and antagonizes insulin in vivo in rat liver [10]. Elevated levels of IAPP may thus cause insulin resistance. Both insulin resistance and pancreatic amyloid formation are characteristic of type 2 diabetes and may be due to enhanced expression of the hIAPP gene.
The nucleotide sequence data have been submitted to the EMBL/GenBank Data Libraries under the accession numbers X52818 (human IAPP ex1, ex2); X52819 (human IAPP ex3); X52820 (rat IAPP ex1, ex2); X52821 (rat IAPP ex3).
Correspondence: A.D.M. van Mansfeld, Institute of Molecular Biology and Medical Biotechnology, Padualaan 8, 3584 CH Utrecht, The Netherlands.
Type 2 diabetes and pancreatic islet amyloid have been described in man but not in rat. Therefore, it is interesting to compare the structure and expression of the IAPP genes in man and rat. The amino acid sequences of the middle part (residues 18-29) of IAPP from different species differ considerably [11-15]. These differences may relate to the amyloidogenic properties of IAPP [12,16,17]. The major amyloidogenic determinants in hIAPP are thought to be the Ala and Ile residues at positions 25 and 26 [18]. In rat and hamster IAPP, which are not known to form amyloid, there are Pro and Val residues at these positions. When Pro-Val is changed to Ala-Ile in the synthetic peptide corresponding to residues 20-29 of hamster IAPP, the peptide forms amyloid-like fibrils [17,19]. Analysis of IAPP-specific human genomic DNA and insulinoma cDNA clones in our laboratory [20,21] and by others [22,23] has shown that the hIAPP gene contains three exons. In this study we have determined the structure of the rat IAPP gene and extended our previous analysis of the human IAPP gene. As a first step to investigate the transcriptional regulation of the IAPP gene in rat and man we have determined the nucleotide sequences of the upstream regions of both genes. hIAPP shows 43% and 46% homology with the human calcitonin gene-related peptides (hCGRP-I and -II, respectively) [24,25]. The CALC genes, which encode the CGRPs, and the IAPP gene are thus considered to belong to one gene family. To explore this relationship further we investigated whether the IAPP gene contains regions which show sequence homology with the calcitonin-encoding part of the CALC-I gene.
0167-4781/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)
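The species comparison of residues 18-29 described in the Introduction can be illustrated with a short sketch. The two 37-residue sequences below are not given in this paper; they are written out from the general literature as an assumption and should be checked against the cited reports [11-15] before any use. The function name `differing_positions` is purely illustrative.

```python
# Minimal sketch: list positions where human and rat IAPP differ.
# ASSUMPTION: the two peptide sequences are taken from the general
# literature, not from this paper; verify them against refs. [11-15].
HUMAN_IAPP = "KCNTATCATQRLANFLVHSSNNFGAILSSTNVGSNTY"
RAT_IAPP = "KCNTATCATQRLANFLVRSSNNLGPVLPPTNVGSNTY"


def differing_positions(seq_a, seq_b):
    """Return (1-based position, residue_a, residue_b) for each mismatch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    for pos, human_res, rat_res in differing_positions(HUMAN_IAPP, RAT_IAPP):
        print(f"position {pos}: human {human_res}, rat {rat_res}")
```

Under the assumed sequences, every difference falls within residues 18-29, and positions 25 and 26 carry Ala-Ile in the human peptide and Pro-Val in the rat peptide, in line with the text.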
Materials and Methods
Isolation of rat IAPP-specific cDNA and genomic DNA clones
RNA was isolated from rat (strain WAG) pancreas essentially as described by Han et al. [26]. Poly(A)-containing RNAs were selected by chromatography on oligo(dT)-cellulose [27]. A cDNA library in λgt10 was prepared using a cDNA synthesis kit (Amersham, U.K.) and a cDNA cloning kit (Amersham, U.K.) according to the instructions of the supplier. A total of 7.5 × 10^5 independent clones was obtained from 150 ng cDNA. Replica filters were prepared on Hybond-N (Amersham, U.K.). A DNA fragment which contains the major part of the hIAPP-encoding sequence (the 167 bp PstI-BamHI fragment [20]) was labeled by random priming with [α-32P]dCTP and used to screen the cDNA library under low stringency conditions (see below). A rIAPP-encoding part (nucleotides 13 to 125, Fig. 1b) of the obtained cDNA clone was used as a probe to isolate an IAPP-specific genomic DNA clone from a rat genomic DNA library in λGEM-11 (Promega, U.S.A.).
Hybridizations
Southern and Northern blots were prepared on Hybond-N following the standard procedures as described by Maniatis et al. [28]. Hybridization and washing were performed as described previously [20] at 35°C and 42°C, respectively (low stringency conditions), or at 42°C and 65°C, respectively (high stringency conditions). Filters were exposed to Fuji X-ray films.
Subcloning and nucleotide sequence analysis
Recombinant phages RPC-1 (rIAPP cDNA), RG-1 (rIAPP genomic DNA) and λh201 (hIAPP genomic DNA [20]) were plaque-purified, propagated on Petri dishes and the DNA was purified (all according to Maniatis et al. [28]). The insert DNAs from these recombinants were subcloned in plasmid pEMBL8 or phage M13mp9. A rIAPP-encoding part of the cDNA
[Fig. 1 sequence listing, panels A and B, is not legible in this copy; the rat IAPP sequences are deposited under EMBL/GenBank accession numbers X52820 and X52821.]
Fig. 1. Nucleotide sequence of the rat IAPP gene. (A) The upstream region, exons 1 and 2 (a and b, underlined), intron 1 and part of intron 2. Also indicated are the TATA-box (c), a CCAAT-box (d), a GT-element (e) and the translation initiation codon (f). The numbering is relative to the 5' end of rat IAPP mRNA. (B) Part of intron 2, exon 3 (underlined) and the 3' flanking region. Also indicated are polyadenylation signals (boxed), 3' ends of cDNAs (asterisks) and the translation termination signal (boxed). The numbering is relative to the 5' end of exon 3. The nucleotide sequence, obtained from our cDNA and genomic DNA clones, is identical to the corresponding part of rat IAPP cDNA reported previously [11], except for a stretch of 30 nucleotides (positions 278 to 307) which are lacking in that sequence [11] and two C residues instead of three at positions 713 and 714.
(see above) and 5'-labeled oligonucleotides, complementary to nucleotides -53 to -32 and -5 to +17 in the cDNA reported by Leffert et al. [11], were used to identify subclones containing different parts of the rIAPP gene. Nucleotide sequences were determined using the chemical degradation method [29] or the dideoxynucleotide chain termination method [30] using a T7 DNA polymerase sequencing kit (Pharmacia, Sweden).
Results and Discussion
Analysis of the rat IAPP gene
A rat pancreatic cDNA library was screened with a hIAPP-specific DNA probe (the 167-nucleotide PstI-BamHI fragment, Ref. 20). One clone (RPC-1) was positive with this probe and was shown to contain a rat IAPP cDNA sequence corresponding to exon 3 and part of the preceding intron 2 (nucleotides 1 to 632, Fig. 1b). Using a rIAPP-specific fragment from this clone we isolated a rIAPP-specific DNA clone (RG-1) from a rat genomic DNA library. The rIAPP-encoding part of the cDNA and oligonucleotides complementary to different parts of the cDNA (Ref. 11, Materials and Methods) were used to identify subclones containing different parts of the rIAPP gene. Alignment of the cDNA sequence with the genomic DNA sequences (Fig. 1) shows that the rIAPP gene contains two introns, 289 nucleotides and approx. 4.9 kilobases in length, respectively. Consensus splice site sequences [31] are present at the intron-exon junctions.
[Fig. 2 sequence listing, panels A and B, is not legible in this copy; the human IAPP sequences are deposited under EMBL/GenBank accession numbers X52818 and X52819.]
Fig. 2. Nucleotide sequence of the human IAPP gene. (A) The upstream region, exons 1 and 2 (a and b, underlined), intron 1 and part of intron 2. Also indicated are the TATA-box (c), an insulin enhancer-like sequence (d) and the translation initiation codon (e). The numbering is relative to the 5' end of exon 1. The 5' part of the sequence contains part of an Alu-repetitive sequence [40] which ends at nucleotide -897. Comparison of our hIAPP gene nucleotide sequence with that in Ref. 23 shows no differences within the exons. In the 5' upstream sequence there is a C → T difference at position -260 and in intron 1 we find two additional A residues at positions 307 and 337. (B) Part of intron 2, exon 3 (underlined) and the 3' flanking region. Also indicated are polyadenylation signals (boxed), 3' ends of cDNAs (asterisks) and the translation termination signal (boxed). The numbering is relative to the 5' end of exon 3. Nucleotides 558 to 837 correspond to an Alu-repetitive sequence [40]. The eight A → G differences in the Alu sequence within exon 3 we reported previously [21] might be the result of a substitution mechanism involving antisense RNA [41].
Rat IAPP mRNA may have two alternate 3' ends. The cDNA sequence reported by Leffert et al. [11] and our genomic DNA sequence diverge downstream from nucleotide 716 (Fig. 1b). This position is preceded at 15 nucleotides by the sequence GATAAA, which can serve as a polyadenylation signal [32], and is followed by a poly(A) sequence in the cDNA [11]. The alternative 3' end may be located at or close to nucleotide 632, the end of our cDNA, or nucleotide 638, the end of two of the three cDNAs for which Leffert et al. [11] have determined the 3' end. These sites are preceded at 21 or 27 nucleotides, respectively, by the polyadenylation signal AATAAA. The 5' terminal nucleotide (G) of the rat IAPP mRNA was determined following the so-called RACE protocol [33] (data not shown). This position is preceded by a TATA-box in the genomic DNA sequence at a distance of 26 nucleotides (Fig. 1a). Transcription initiation of the rIAPP gene therefore probably takes place at this G residue. The length of the rIAPP-specific mRNA in rat pancreas, approx. 1050 bases (b) (Fig. 3, lane a), corresponds to the length of an RNA containing exons 1, 2 and 3 and a poly(A) tail at either one of the polyadenylation sites. The DNA upstream from exon 1 contains sequences homologous to elements known to be involved in transcription initiation and regulation (Fig. 1a): a TATA-box (TATATAA, -32 to -26), a CCAAT-box (GCCAAT, -112 to -107, the binding site for the CTF/NFI protein family [34]) and, more upstream, a GT-element [(CA)16(CAAT)4, -808 to -761]. A GT-element is present at a similar position upstream from the pancreas B-cell-specific transcription initiation region of the glucokinase gene in rat [35]. GT-elements can adopt a Z-DNA structure [36] and thus can be the target for specific Z-DNA binding factors which may have a regulatory function [37].
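The statement that the observed mRNA length matches exons 1, 2 and 3 plus a poly(A) tail can be checked with simple arithmetic. The sketch below sums the exon lengths given in Fig. 4 for each candidate 3' end and reports what remains for the poly(A) tail. The dictionary layout and function names are illustrative assumptions, and the "implied tail" values are only as precise as the Northern blot size estimates.

```python
# Minimal sketch: spliced transcript length for each candidate 3' end.
# Exon lengths (bases) are those shown in Fig. 4; the last exon is listed
# once per reported polyadenylation site or cDNA 3' end.
GENES = {
    "rat": {"exon1": 83, "exon2": 95, "exon3_ends": [632, 638, 716],
            "observed_mrna": [1050]},
    "human": {"exon1": 103, "exon2": 95, "exon3_ends": [390, 1246],
              "observed_mrna": [1600, 2100]},
}


def spliced_lengths(gene):
    """Map each exon 3 end position to the length of exons 1 + 2 + 3."""
    return {
        end: gene["exon1"] + gene["exon2"] + end
        for end in gene["exon3_ends"]
    }


def implied_tail(observed_length, spliced_length):
    """Bases left over for a poly(A) tail (approximate, blot-derived)."""
    return observed_length - spliced_length


if __name__ == "__main__":
    for species, gene in GENES.items():
        for end, length in spliced_lengths(gene).items():
            print(f"{species}: exon 3 end {end} -> {length} b without tail")
```

For the rat gene the three candidate ends give 810, 816 and 894 bases before polyadenylation. For the human gene the proximal site gives 588 bases, which equals the 588 bp cDNA insert named in the legend of Fig. 3, and the distal site gives 1444 bases.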
Analysis of the human IAPP gene
The nucleotide sequence analysis of the IAPP-specific human genomic DNA clone λh201 [20] was extended. The nucleotide sequences of the entire hIAPP gene and 1100 nucleotides upstream from exon 1 were determined (Fig. 2). Alignment of the cDNA sequences with the genomic DNA sequences [20-23] shows the location and size of the exons and introns (Fig. 2). Polyadenylation of hIAPP mRNA can take place at two sites [21,22], located at positions 390 and 1246, respectively (Fig. 2b). Transcription initiation of the hIAPP gene occurs at the T or G residue, 24 or 25 nucleotides downstream from the TATA-box (Ref. 23, J.W.M.H., unpublished data). In insulinomas we find IAPP-specific mRNAs of approx. 1600 b and 2100 b (Fig. 3, lane b and Ref. 21). The length of the 1600 b hIAPP mRNA corresponds to the length of an RNA containing exons 1, 2 and 3 and a poly(A) tail at position 1246. Polyadenylation at the other site, position 390, may occur less frequently and therefore the corresponding mRNA may escape detection on Northern blots. The composition of the 2100 b hIAPP mRNA cannot be explained with the available data.
[Fig. 3 image: Northern blot, lanes a and b; size markers at 2100 and 1600 b.]
Fig. 3. Northern blot of rat pancreas RNA and human insulinoma RNA. Lane a: 5 μg rat pancreas poly(A) RNA was hybridized with a rat IAPP-specific DNA fragment (nucleotides 13 to 125, Fig. 1b); lane b: 3 μg human insulinoma poly(A) RNA was hybridized with a human IAPP-specific cDNA (the 588 bp insert of λhIAPP-c1, Ref. 21). The lengths of the IAPP-specific RNAs were deduced from their positions relative to the rRNAs.
The DNA upstream from exon 1 of the hIAPP gene (Fig. 2a) contains sequences homologous to elements involved in transcription initiation and regulation: a TATA-box (TATATAA, -31 to -25) and the sequence AAGATGGC (-245 to -238, complementary sequence GCCATCTT) which is similar to the sequence GCCATCTG, a major element in the insulin enhancer, crucial for the activity of the rat insulin I gene promoter in insulinoma cells [38]. This sequence in the rat insulin promoter is the binding site for (a) B-cell-specific protein(s) [39]. A similar element (GCCATCCAG) is also present in the pancreatic B-cell-specific promoter of the rat glucokinase gene [35]. hIAPP has a considerable homology with the CGRPs [1]. The CGRPs are encoded by exons 5 of the CALC genes [24,25]. Exon 4 of the CALC-I gene encodes calcitonin. Since exons 2 and 3 of the hIAPP gene correspond to exons 2 and 5 of the CALC genes [21], we have determined the complete nucleotide sequence of hIAPP intron 2 and searched for sequences homologous with exons 3 and 4 of the CALC-I gene. No significant homology was found.
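The relation between the human upstream element and the rat insulin enhancer element can be made explicit with a short sketch: take the reverse complement of AAGATGGC and count mismatches against GCCATCTG. The helper names are illustrative assumptions; the three sequences are those quoted in the text.

```python
# Minimal sketch: compare the human IAPP upstream element (read on the
# complementary strand) with the rat insulin enhancer element.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(seq_a, seq_b):
    """Count positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


HUMAN_ELEMENT = "AAGATGGC"        # hIAPP gene, -245 to -238
INSULIN_ENHANCER = "GCCATCTG"     # rat insulin I gene element [38]

if __name__ == "__main__":
    complementary = reverse_complement(HUMAN_ELEMENT)
    print(complementary)
    print(count_mismatches(complementary, INSULIN_ENHANCER))
```

The complementary strand reads GCCATCTT, which differs from the insulin enhancer element only at its last position.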
[Fig. 4 diagram. Rat IAPP gene: E1 83 b; I1 289 b; E2 95 b; I2 approx. 4.9 kb; E3 716 b (632/638 b). Human IAPP gene: E1 103 b; I1 332 b; E2 95 b; I2 approx. 4.800 kb; E3 390/1246 b, containing an Alu sequence.]
Fig. 4. Comparison of the rat and human IAPP gene structure. Indicated are the exons and introns and their lengths, the pre-pro-IAPP-encoding regions (hatched), the IAPP-encoding regions (inverse contrast), an Alu-repetitive sequence (spotted), TATA-boxes and polyadenylation signals.
Comparison of the IAPP gene structure in rat and man
Fig. 4 shows that the rat and human IAPP genes have a very similar size and exon-intron organization. For both genes two polyadenylation sites have been described. The sizes of the rat and human IAPP mRNAs differ considerably. Comparison of the upstream sequences of the rat and human IAPP genes (Fig. 5) shows considerable homology, including a very well conserved TATA-box region. However, a CCAAT-box and a GT-element are present in this region of the rIAPP gene, whereas an insulin enhancer-like sequence is present in this region of the hIAPP gene. The differences suggest that transcription of the rIAPP gene and the hIAPP gene is controlled differently. Functional analysis of the promoter regions, however, is needed to reveal which elements are important in β-cell-specific expression of the IAPP gene in rat and man. These studies may also contribute to the elucidation of the mechanism(s) involved in the development of type 2 diabetes and/or pancreatic amyloid formation, which occur in man and not in rat.
[Fig. 5 sequence alignment: rat (upper lines, -523 to exon 1) and human (lower lines, -522 to exon 1) upstream sequences; the alignment is not legible in this copy.]
Fig. 5. Promoter regions of the rat and human IAPP genes. The sequences of the first 523 (rat) and 522 (man) nucleotides upstream from exon 1 are compared. The TATA-box and CCAAT-box in the rat DNA (upper lines) and the TATA-box and insulin enhancer-like sequence in the human DNA (lower lines) are indicated (boxes). Small gaps were introduced to maximize homology. From nucleotides -468 (rat) and -465 (man) downstream to exon 1 the homology is approx. 66%.
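A percentage such as the approx. 66% quoted for Fig. 5 depends on how gapped columns are counted, and the paper does not state the convention used. The sketch below shows one possible convention (identical columns divided by all columns in which at least one sequence has a base), applied to a short made-up pair of aligned strings rather than to the actual rat and human sequences. The function name and the toy input are assumptions for illustration only.

```python
# Minimal sketch: percent identity over a gapped pairwise alignment.
# ASSUMPTION: gap-containing columns count as mismatches; other
# conventions (for example, ignoring gap columns) give other values.
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared columns that hold the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # column empty in both sequences, skip it
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


if __name__ == "__main__":
    # Toy alignment, not taken from Fig. 5.
    print(percent_identity("TATATAA-GC", "TATATAAGGC"))
```

With the toy input, nine of ten columns are identical, so the function returns 90.0.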
Acknowledgements
This research was supported by the Netherlands Organization for Chemical Research (SON) with financial aid from the Netherlands Organization for Scientific Research (NWO). We would like to thank H.A.M.B. van Nunen for his enthusiastic participation in the analysis of the rat cDNA and E.D. Kluis and P. van der Most for assistance in the preparation of the figures.
References
1 Westermark, P., Wernstedt, C., Wilander, E. and Sletten, K. (1986) Biochem. Biophys. Res. Commun. 140, 827-831.
2 Cooper, G.J.S., Willis, A.C., Clark, A., Turner, R.C., Sim, R.B. and Reid, K.B.M. (1987) Proc. Natl. Acad. Sci. USA 84, 8628-8632.
3 Johnson, K.H., O'Brien, T.D., Hayden, D.W., Jordan, K., Ghobrial, H.K.G., Mahoney, W.C. and Westermark, P. (1988) Am. J. Pathol. 130, 1-8.
4 Kanatsuka, A., Makino, H., Oshawa, Y., Yamaguchi, T., Yoshida, S. and Adachi, M. (1989) FEBS Lett. 259, 199-201.
5 Ogawa, A., Harris, V., McCorkle, S.K., Unger, R.H. and Luskey, K.L. (1990) J. Clin. Invest. 85, 973-976.
6 Leighton, B. and Cooper, G.J.S. (1988) Nature 335, 632-635.
7 Cooper, G.J.S., Leighton, B., Dimitriadis, G.D., Parry-Billings, M., Kowalchuk, J.M., Howland, K., Rothbard, J.B., Willis, A.C. and Reid, K.B.M. (1988) Proc. Natl. Acad. Sci. USA 85, 7763-7766.
8 Koopmans, S-J., van Mansfeld, A.D.M., Jansz, H.S., Krans, H.M.J., Radder, J.K., Frölich, M., de Boer, S.F., Kreutter, D.K., Andrews, G.C. and Maassen, J.A. (1990) Diabetes 39, 121A.
9 Sowa, R., Sanke, T., Hirayama, H., Furuta, H., Nishimura, S. and Nanjo, K. (1990) Diabetologia 33, 118-120.
10 Molina, J.M., Cooper, G.J.S., Leighton, B. and Olefsky, J.M. (1990) Diabetes 39, 260-265.
11 Leffert, J.D., Newgard, C.B., Okamoto, H., Milburn, J.L. and Luskey, K.L. (1989) Proc. Natl. Acad. Sci. USA 86, 3127-3130.
12 Nishi, M., Chang, S.J., Nagamatsu, S., Bell, G.I. and Steiner, D.F. (1989) Proc. Natl. Acad. Sci. USA 86, 5738-5742.
13 Ferrier, G.J.M., Pierson, A.M., Jones, P.M., Bloom, S.R., Girgis, S.I. and Legon, S. (1989) J. Mol. Endocrinol. 3, R1-R4.
14 Betsholtz, C., Christmanson, L., Engström, U., Rorsman, F., Svensson, V., Johnson, K.H. and Westermark, P. (1989) FEBS Lett. 251, 261-264.
15 Asai, J., Nakazato, M., Kangawa, K., Matsukura, S. and Matsuo, H. (1989) Biochem. Biophys. Res. Commun. 164, 400-405.
16 Glenner, G.G., Eanes, E.D. and Willey, C.A. (1988) Biochem. Biophys. Res. Commun. 155, 608-614.
17 Betsholtz, C., Svensson, V., Rorsman, F., Engström, U., Westermark, G.T., Wilander, E., Johnson, K.H. and Westermark, P. (1989) Exp. Cell Res. 183, 484-493.
18 Cooper, G.J.S., Day, A.J., Willis, A.C., Roberts, A.N., Reid, K.B.M. and Leighton, B. (1989) Biochim. Biophys. Acta 1014, 247-258.
19 Betsholtz, C., Christmanson, L., Engström, U., Rorsman, F., Jordan, K., O'Brien, T.D., Murtaugh, M., Johnson, K.H. and Westermark, P. (1990) Diabetes 39, 118-122.
20 Mosselman, S., Höppener, J.W.M., Zandberg, J., van Mansfeld, A.D.M., Geurts van Kessel, A.H.M., Lips, C.J.M. and Jansz, H.S. (1988) FEBS Lett. 239, 227-232.
21 Mosselman, S., Höppener, J.W.M., Lips, C.J.M. and Jansz, H.S. (1989) FEBS Lett. 247, 154-158.
22 Sanke, T., Bell, G.I., Sample, C., Rubenstein, A.H. and Steiner, D.F. (1988) J. Biol. Chem. 263, 17243-17246.
23 Nishi, M., Sanke, T., Seino, S., Eddy, R.L., Fan, Y.-S., Byers, M.G., Shows, T.B., Bell, G.I. and Steiner, D.F. (1989) Mol. Endocrinol. 3, 1775-1781.
24 Steenbergh, P.H., Höppener, J.W.M., Zandberg, J., Lips, C.J.M. and Jansz, H.S. (1985) FEBS Lett. 183, 403-407.
25 Steenbergh, P.H., Höppener, J.W.M., Zandberg, J., Visser, A., Lips, C.J.M. and Jansz, H.S. (1986) FEBS Lett. 209, 97-10.
26 Han, J.H., Stratowa, C. and Rutter, W.J. (1987) Biochem. 26, 1617-1625.
27 Aviv, H. and Leder, P. (1972) Proc. Natl. Acad. Sci. USA 69, 1402-1408.
28 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning, Cold Spring Harbor Laboratory, Cold Spring Harbor.
29 Maxam, A. and Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
30 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
31 Padgett, R.A., Grabowski, P.J., Konarska, M.M., Seiler, S. and Sharp, P.A. (1986) Annu. Rev. Biochem. 55, 1119-1150.
32 Manley, J.L. (1988) Biochim. Biophys. Acta 950, 1-12.
33 Frohman, M.A., Dush, M.K. and Martin, G.R. (1988) Proc. Natl. Acad. Sci. USA 85, 8998-9002.
34 Jones, K.A., Kadonaga, J.T., Rosenfeld, P.J., Kelly, T.J. and Tijan, R.T. (1987) Cell 48, 79-89.
35 Magnusson, M.A. and Shelton, K.D. (1989) J. Biol. Chem. 264, 15936-15942.
36 Hamada, H., Petrino, M.G. and Kakunaga, T. (1982) Proc. Natl. Acad. Sci. USA 79, 6465-6469.
37 Berg, D.T., Walls, J.D., Reifel-Miller, A.E. and Grinnell, B.W. (1989) Mol. Cell. Biol. 9, 5248-5253.
38 Karlsson, O., Edlund, T., Moss, J.B., Rutter, W.J. and Walker, M.D. (1987) Proc. Natl. Acad. Sci. USA 84, 8819-8823.
39 Ohlsson, H., Karlsson, O. and Edlund, T. (1988) Proc. Natl. Acad. Sci. USA 85, 4228-4231.
40 Labuda, D. and Striker, G. (1989) Nucleic Acids Res. 17, 2477-2491.
41 Bass, B.L., Weintraub, H., Cattaneo, R. and Billeter, M.A. (1989) Cell 56, 331.