Islet amyloid polypeptide: Structure and upstream sequences of the IAPP gene in rat and man

Biochimica et Biophysica Acta, 1087 (1990) 235-240 Elsevier BBAEXP 90192

A.D.M. van Mansfeld 1, S. Mosselman 1, J.W.M. Höppener 1, J. Zandberg 1, H.A.A.M. van Teeffelen 1, P.D. Baas 1, C.J.M. Lips 2 and H.S. Jansz 1

1 Laboratory for Physiological Chemistry and Institute of Molecular Biology and Medical Biotechnology, University of Utrecht, Utrecht and 2 Department of Internal Medicine, University Hospital, Utrecht (The Netherlands)

(Received 28 May 1990)

Key words: Islet amyloid polypeptide; Amylin; IAPP gene; Type 2 diabetes mellitus; Diabetes mellitus; Gene structure; (Rat); (Human)

Islet amyloid polypeptide (IAPP) or amylin is a pancreatic islet hormone which was first found in amyloid in insulinomas and in pancreases of patients with type 2 diabetes. In rat a similar polypeptide occurs; however, pancreatic amyloid in this species has not been described. Here we report the structure of the rat and human IAPP genes. Both consist of three exons and two introns which are very similar. The upstream sequence of the rat IAPP gene contains a TATA-box, a CCAAT-sequence and a GT-element, whereas the upstream sequence of the human IAPP gene contains a TATA-box and a rat insulin enhancer-like sequence. This suggests that the rat and human IAPP genes may be controlled differently at the transcriptional level.

Introduction

Islet amyloid polypeptide (IAPP) or amylin, a 37-amino acid polypeptide, is the major protein component of amyloid deposits found in insulinomas [1] and in pancreatic islets of type 2 or non-insulin-dependent diabetes mellitus (NIDDM) patients [2]. IAPP is synthesized in the β cells of the islets of Langerhans [3] from where it is secreted [4] together with insulin [5]. Human IAPP (hIAPP) has been shown to inhibit both basal and insulin-stimulated glucose uptake in rat skeletal muscle in vitro [6,7]. Administration of rat IAPP (rIAPP) antagonizes insulin action in liver and peripheral tissues in rat in vivo [8], and administration of hIAPP causes peripheral insulin resistance in vivo in dogs [9] and antagonizes insulin in vivo in rat liver [10]. Elevated levels of IAPP may thus cause insulin resistance. Both insulin resistance and pancreatic amyloid formation are characteristic of type 2 diabetes and may be due to enhanced expression of the hIAPP gene.

The nucleotide sequence data have been submitted to the EMBL/GenBank Data Libraries under the accession numbers X52818 (human IAPP ex1, ex2); X52819 (human IAPP ex3); X52820 (rat IAPP ex1, ex2); X52821 (rat IAPP ex3).

Correspondence: A.D.M. van Mansfeld, Institute of Molecular Biology and Medical Biotechnology, Padualaan 8, 3584 CH Utrecht, The Netherlands.

Type 2 diabetes and pancreatic islet amyloid have been described in man but not in rat. Therefore, it is interesting to compare the structure and expression of the IAPP genes in man and rat. The amino acid sequences of the middle part (residues 18-29) of IAPP from different species differ considerably [11-15]. These differences may relate to the amyloidogenic properties of IAPP [12,16,17]. The major amyloidogenic determinants in hIAPP are thought to be the Ala and Ile residues at positions 25 and 26 [18]. In rat and hamster IAPP, which are not known to form amyloid, there are Pro and Val residues at these positions. Upon change of Pro-Val to Ala-Ile in the synthetic peptide corresponding to residues 20-29 of hamster IAPP, the peptide forms amyloid-like fibrils [17,19]. Analysis of IAPP-specific human genomic DNA and insulinoma cDNA clones in our laboratory [20,21] and by others [22,23] has shown that the hIAPP gene contains three exons. In this study we have determined the structure of the rat IAPP gene and extended our previous analysis of the human IAPP gene. As a first step to investigate the transcriptional regulation of the IAPP gene in rat and man we have determined the nucleotide sequences of the upstream regions of both genes. hIAPP shows 43% and 46% homology with the human calcitonin gene-related peptides (hCGRP-I and -II, respectively) [24,25]. The CALC genes, which encode the CGRPs, and the IAPP gene are thus considered to belong to one gene family. To explore this relationship further we investigated whether the IAPP gene contains regions which show sequence homology with the calcitonin-encoding part of the CALC-I gene.

0167-4781/90/$03.50 © 1990 Elsevier Science Publishers B.V. (Biomedical Division)

Materials and Methods

Isolation of rat IAPP-specific cDNA and genomic DNA clones

RNA was isolated from rat (strain WAG) pancreas essentially as described by Han et al. [26]. Poly(A)-containing RNAs were selected by chromatography on oligo dT cellulose [27]. A cDNA library in λgt10 was prepared using a cDNA synthesis kit (Amersham, U.K.) and a cDNA cloning kit (Amersham, U.K.) according to the instructions of the supplier. A total of 7.5 × 10^5 independent clones was obtained from 150 ng cDNA. Replica filters were prepared on Hybond-N (Amersham, U.K.). A DNA fragment which contains the major part of the hIAPP-encoding sequence (the 167 bp PstI-BamHI fragment [20]) was labeled by random priming with [α-32P]dCTP and used to screen the cDNA library under low stringency conditions (see below). A rIAPP-encoding part (nucleotides 13 to 125, Fig. 1b) of the obtained cDNA clone was used as a probe to isolate an IAPP-specific genomic DNA clone from a rat genomic DNA library in λGEM-11 (Promega, U.S.A.).

Hybridizations

Southern and Northern blots were prepared on Hybond-N following the standard procedures as described by Maniatis et al. [28]. Hybridization and washing were performed as described previously [20] at 35°C and 42°C, respectively (low stringency conditions), or at 42°C and 65°C, respectively (high stringency conditions). Filters were exposed to Fuji X-ray films.

Subcloning and nucleotide sequence analysis

Recombinant phages RPC-1 (rIAPP cDNA), RG-1 (rIAPP genomic DNA) and λh201 (hIAPP genomic DNA [20]) were plaque-purified and propagated on Petri dishes, and the DNA was purified (all according to Maniatis et al. [28]). The insert DNAs from these recombinants were subcloned in plasmid pEMBL8 or phage M13mp9. A rIAPP-encoding part of the cDNA

[Fig. 1, panels a and b: nucleotide sequence listing of the rat IAPP gene. The listing is not reproduced here; the sequence data are deposited in the EMBL/GenBank Data Libraries under accession numbers X52820 (exons 1 and 2) and X52821 (exon 3).]

Fig. 1. Nucleotide sequence of the rat IAPP gene. (A) The upstream region, exons 1 and 2 (a and b, underlined), intron 1 and part of intron 2. Also indicated are the TATA-box (c), a CCAAT-box (d), a GT-element (e) and the translation initiation codon (f). The numbering is relative to the 5' end of rat IAPP mRNA. (B) Part of intron 2, exon 3 (underlined) and the 3' flanking region. Also indicated are polyadenylation signals (boxed), 3' ends of cDNAs (asterisks) and the translation termination signal (boxed). The numbering is relative to the 5' end of exon 3. The nucleotide sequence, obtained from our cDNA and genomic DNA clones, is identical to the corresponding part of rat IAPP cDNA reported previously [11], except for a stretch of 30 nucleotides (positions 278 to 307) which are lacking in that sequence [11] and two C residues instead of three at positions 713 and 714.

(see above) and 5'-labeled oligonucleotides, complementary to nucleotides -53 to -32 and -5 to +17 in the cDNA reported by Leffert et al. [11], were used to identify subclones containing different parts of the rIAPP gene. Nucleotide sequences were determined using the chemical degradation method [29] or the dideoxynucleotide chain termination method [30] using a T7 DNA polymerase sequencing kit (Pharmacia, Sweden).

Results and Discussion

Analysis of the rat IAPP gene

A rat pancreatic cDNA library was screened with a hIAPP-specific DNA probe (the 167-nucleotide PstI-BamHI fragment, Ref. 20). One clone (RPC-1) was positive with this probe and was shown to contain a rat IAPP cDNA sequence corresponding to exon 3 and part of the preceding intron 2 (nucleotides 1 to 632, Fig. 1b). Using a rIAPP-specific fragment from this clone we isolated a rIAPP-specific DNA clone (RG-1) from a rat genomic DNA library. The rIAPP-encoding part of the cDNA and oligonucleotides complementary to different parts of the cDNA (Ref. 11, Materials and Methods) were used to identify subclones containing different parts of the rIAPP gene. Alignment of the cDNA sequence with the genomic DNA sequences (Fig. 1) shows that the rIAPP gene contains two introns, 289 nucleotides and approx. 4.9 kilobases in length, respectively. Consensus splice site sequences [31] are present at the intron-exon junctions.

[Fig. 2, panels a and b: nucleotide sequence listing of the human IAPP gene. The listing is not reproduced here; the sequence data are deposited in the EMBL/GenBank Data Libraries under accession numbers X52818 (exons 1 and 2) and X52819 (exon 3).]

Fig. 2. Nucleotide sequence of the human IAPP gene. (A) The upstream region, exons 1 and 2 (a and b, underlined), intron 1 and part of intron 2. Also indicated are the TATA-box (c), an insulin enhancer-like sequence (d) and the translation initiation codon (e). The numbering is relative to the 5' end of exon 1. The 5' part of the sequence contains part of an Alu-repetitive sequence [40] which ends at nucleotide -897. Comparison of our hIAPP gene nucleotide sequence with that in Ref. 23 shows no differences within the exons. In the 5' upstream sequence there is a C → T difference at position -260 and in intron 1 we find two additional A residues at positions 307 and 337. (B) Part of intron 2, exon 3 (underlined) and the 3' flanking region. Also indicated are polyadenylation signals (boxed), 3' ends of cDNAs (asterisks) and the translation termination signal (boxed). The numbering is relative to the 5' end of exon 3. Nucleotides 558 to 837 correspond to an Alu-repetitive sequence [40]. The eight A → G differences in the Alu sequence within exon 3 we reported previously [21] might be the result of a substitution mechanism involving antisense RNA [41].

Rat IAPP mRNA may have two alternate 3' ends. The cDNA sequence reported by Leffert et al. [11] and our genomic DNA sequence diverge downstream from nucleotide 716 (Fig. 1b). This position is preceded at 15 nucleotides by the sequence GATAAA, which can serve as a polyadenylation signal [32], and is followed by a poly(A) sequence in the cDNA [11]. The alternative 3' end may be located at or close to nucleotide 632, the end of our cDNA, or nucleotide 638, the end of two of the three cDNAs for which Leffert et al. [11] have determined the 3' end. These sites are preceded at 21 or 27 nucleotides, respectively, by the polyadenylation signal AATAAA.

The 5' terminal nucleotide (G) of the rat IAPP mRNA was determined following the so-called RACE protocol [33] (data not shown). This position is preceded by a TATA-box in the genomic DNA sequence at a distance of 26 nucleotides (Fig. 1a). Transcription initiation of the rIAPP gene therefore probably takes place at this G residue. The length of the rIAPP-specific mRNA in rat pancreas, approx. 1050 bases (b.) (Fig. 3, lane a), corresponds to the length of an RNA containing exons 1, 2 and 3 and a poly(A) tail at either one of the polyadenylation sites.

The DNA upstream from exon 1 contains sequences homologous to elements known to be involved in transcription initiation and regulation (Fig. 1a): a TATA-box (TATATAA, -32 to -26), a CCAAT-box (GCCAAT, -112 to -107, the binding site for the CTF/NF-I protein family [34]) and, more upstream, a GT-element [(CA)16(CAAT)4, -808 to -761]. A GT-element is present at a similar position upstream from the pancreas B-cell-specific transcription initiation region of the glucokinase gene in rat [35]. GT-elements can adopt a Z-DNA structure [36] and thus can be the target for specific Z-DNA binding factors which may have a regulatory function [37].
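The coordinates quoted above for the rat promoter elements can be checked with simple arithmetic. The following Python sketch rebuilds the GT-element from the formula (CA)16(CAAT)4 and confirms that each quoted span has the same length as its motif. The helper name `span_length` is introduced here for illustration only, and the sketch assumes that the quoted coordinates are inclusive and lie entirely upstream of the transcription start.

```python
# Motifs and coordinates exactly as quoted in the text for the rat IAPP gene.
GT_ELEMENT = "CA" * 16 + "CAAT" * 4   # (CA)16(CAAT)4
TATA_BOX = "TATATAA"
CCAAT_BOX = "GCCAAT"


def span_length(start: int, end: int) -> int:
    """Length of an inclusive span given in upstream (negative) coordinates.

    Assumption: both coordinates are negative, so the span never crosses
    the transcription start and no missing position 0 has to be handled.
    """
    if not (start <= end < 0):
        raise ValueError("expected start <= end < 0")
    return end - start + 1


quoted_spans = {
    "GT-element": (GT_ELEMENT, -808, -761),
    "TATA-box": (TATA_BOX, -32, -26),
    "CCAAT-box": (CCAAT_BOX, -112, -107),
}

for name, (motif, start, end) in quoted_spans.items():
    consistent = len(motif) == span_length(start, end)
    print(f"{name}: motif {len(motif)} nt, span {span_length(start, end)} nt, "
          f"consistent={consistent}")
```

All three motifs have the same length as the spans quoted for them, so the coordinates in the text are internally consistent.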

Analysis of the human IAPP gene

The nucleotide sequence analysis of the IAPP-specific human genomic DNA clone λh201 [20] was extended. The nucleotide sequences of the entire hIAPP gene and 1100 nucleotides upstream from exon 1 were determined (Fig. 2). Alignment of the cDNA sequences with the genomic DNA sequences [20-23] shows the location and size of the exons and introns (Fig. 2). Polyadenylation of hIAPP mRNA can take place at two sites [21,22], located at positions 390 and 1246, respectively (Fig. 2b). Transcription initiation of the hIAPP gene occurs at the T or G residue, 24 or 25 nucleotides downstream from the TATA-box (Ref. 23, J.W.M.H., unpublished data). In insulinomas we find IAPP-specific mRNAs of approx. 1600 b. and 2100 b. (Fig. 3, lane b and Ref. 21). The length of the 1600 b. hIAPP mRNA corresponds to the length of an RNA containing exons 1, 2 and 3 and a poly(A) tail at position 1246. Polyadenylation at the other site, position 390, may occur less frequently and therefore the corresponding mRNA may escape detection on Northern blots. The composition of the 2100 b. hIAPP mRNA cannot be explained with the available data.

Fig. 3. Northern blot of rat pancreas RNA and human insulinoma RNA. Lane a: 5 μg rat pancreas poly(A) RNA was hybridized with a rat IAPP-specific DNA fragment (nucleotides 13 to 125, Fig. 1b); lane b: 3 μg human insulinoma poly(A) RNA was hybridized with a human IAPP-specific cDNA (the 588 bp insert of λhIAPP-c1, Ref. 21). The lengths of the IAPP-specific RNAs were deduced from their positions relative to the rRNAs.

The DNA upstream from exon 1 of the hIAPP gene (Fig. 2a) contains sequences homologous to elements involved in transcription initiation and regulation: a TATA-box (TATATAA, -31 to -25) and the sequence AAGATGGC (-245 to -238, complementary sequence GCCATCTT), which is similar to the sequence GCCATCTG, a major element in the insulin enhancer, crucial for the activity of the rat insulin I gene promoter in insulinoma cells [38]. This sequence in the rat insulin promoter is the binding site for (a) B-cell-specific protein(s) [39]. A similar element (GCCATCCAG) is also present in the pancreatic B-cell-specific promoter of the rat glucokinase gene [35].

hIAPP has a considerable homology with the CGRPs [1]. The CGRPs are encoded by exons 5 of the CALC genes [24,25]. Exon 4 of the CALC-I gene encodes calcitonin. Since exons 2 and 3 of the hIAPP gene correspond to exons 2 and 5 of the CALC genes [21], we have determined the complete nucleotide sequence of hIAPP intron 2 and searched for sequences homologous with exons 3 and 4 of the CALC-I gene. No significant homology was found.
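The relation between the human upstream sequence AAGATGGC, its complementary sequence GCCATCTT and the insulin enhancer element GCCATCTG can be verified directly. The Python sketch below is an illustration only: the function names `reverse_complement` and `count_matches` are introduced here, and "similar" is taken to mean the number of identical positions in an ungapped comparison, which is our assumption about how the similarity was judged.

```python
# Translation table for complementing a DNA string (A<->T, C<->G).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_matches(a: str, b: str) -> int:
    """Count identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b))


human_upstream = "AAGATGGC"        # hIAPP gene, -245 to -238
insulin_enhancer = "GCCATCTG"      # rat insulin I gene enhancer element [38]

complementary = reverse_complement(human_upstream)
matches = count_matches(complementary, insulin_enhancer)

print(complementary)                        # GCCATCTT
print(f"{matches} of {len(insulin_enhancer)} positions identical")
```

The complementary strand reads GCCATCTT, as stated in the text, and differs from the insulin enhancer element only at the last position.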

[Fig. 4 diagram. Lengths as labelled in the figure:
Rat IAPP gene: exon 1, 83 b; intron 1, 289 b; exon 2, 95 b; intron 2, approx. 4.9 kb; exon 3, 716 b (632/638 b to the alternative poly(A) site).
Human IAPP gene: exon 1, 103 b; intron 1, 332 b; exon 2, 95 b; intron 2, approx. 4.8 kb; exon 3, 390/1246 b to the two poly(A) sites, containing an Alu sequence.]

Fig. 4. Comparison of the rat and human IAPP gene structure. Indicated are the exons and introns and their lengths, the pre-pro-IAPP-encoding regions (hatched), the IAPP-encoding regions (inverse contrast), an Alu-repetitive sequence (spotted), TATA-boxes and polyadenylation signals.
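The exon lengths in Fig. 4 allow the spliced mRNA lengths to be compared with the sizes estimated from the Northern blot (Fig. 3). The Python sketch below adds up the exon lengths for each polyadenylation site and reports the difference from the observed size. The names are introduced here for illustration, and the difference is only a rough indication of poly(A) tail length, since the observed sizes are approximate.

```python
# Exon 1 and exon 2 lengths in bases, as labelled in Fig. 4.
EXONS_1_AND_2 = {"rat": (83, 95), "human": (103, 95)}

# Length of exon 3 up to each reported 3' end / polyadenylation site.
EXON_3_TO_POLYA = {"rat": (632, 638, 716), "human": (390, 1246)}


def spliced_length(species: str, exon3_length: int) -> int:
    """Sum of exons 1, 2 and 3, without any poly(A) tail."""
    exon1, exon2 = EXONS_1_AND_2[species]
    return exon1 + exon2 + exon3_length


def remainder(observed: int, species: str, exon3_length: int) -> int:
    """Observed mRNA size minus the spliced length (tail plus sizing error)."""
    return observed - spliced_length(species, exon3_length)


for species, sites in EXON_3_TO_POLYA.items():
    for exon3_length in sites:
        print(species, exon3_length, spliced_length(species, exon3_length))

# Approximate sizes from the Northern blot: rat 1050 b., human 1600 b.
print(remainder(1050, "rat", 716))      # 156
print(remainder(1600, "human", 1246))   # 156
```

For the distal sites the spliced lengths are 894 b (rat) and 1444 b (human), which leaves roughly 150 b in both species for a poly(A) tail.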

Comparison of the IAPP gene structure in rat and man

Fig. 4 shows that the rat and human IAPP genes have a very similar size and exon-intron organization. For both genes two polyadenylation sites have been described. The sizes of the rat and human IAPP mRNAs differ considerably. Comparison of the upstream sequences of the rat and human IAPP genes (Fig. 5) shows considerable homology, including a very well conserved TATA-box region. However, a CCAAT-box and a GT-element are present in this region of the rIAPP gene, whereas an insulin enhancer-like sequence is present in this region of the hIAPP gene. The differences suggest that transcription of the rIAPP gene and the hIAPP gene is controlled differently. Functional analysis of the promoter regions, however, must reveal which elements are important in β-cell-specific expression of the IAPP gene in rat and man. These studies may also contribute to the elucidation of the mechanism(s) involved in the development of type 2 diabetes and/or pancreatic amyloid formation, which occur in man and not in rat.

[Fig. 5 alignment: rat upstream sequence (upper lines, starting at -523) aligned with human upstream sequence (lower lines, starting at -522), with asterisks marking identical positions. The alignment is not reproduced here.]

Fig. 5. Promoter regions of the rat and human IAPP genes. The sequences of the first 523 (rat) and 522 (man) nucleotides upstream from exon 1 are compared. The TATA-box and CCAAT-box in the rat DNA (upper lines) and the TATA-box and insulin enhancer-like sequence in the human DNA (lower lines) are indicated (boxes). Small gaps were introduced to maximize homology. From nucleotides -468 (rat) and -465 (man) downstream to exon 1 the homology is approx. 66%.
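A homology percentage such as the one quoted for Fig. 5 is obtained by counting identical positions in a gapped alignment. The Python sketch below shows one way to do this. It is an illustration under stated assumptions: the text does not say whether gap columns were counted in the denominator, so the function offers both conventions, and the two short sequences are invented toy data, not part of the rat or human IAPP sequences.

```python
def percent_identity(top: str, bottom: str, count_gap_columns: bool = False) -> float:
    """Percentage of identical columns in a pairwise alignment.

    Gaps are written as '-'. By default, columns containing a gap are
    excluded from the denominator; set count_gap_columns=True to count
    them as mismatches instead. Which convention the original analysis
    used is not stated, so both are provided.
    """
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    columns = 0
    for a, b in zip(top, bottom):
        has_gap = a == "-" or b == "-"
        if has_gap and not count_gap_columns:
            continue
        columns += 1
        if not has_gap and a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no columns to compare")
    return 100.0 * matches / columns


# Hypothetical toy alignment (10 columns, one gap, one mismatch).
toy_top = "ACGT-ACGTA"
toy_bottom = "ACGTTACCTA"

print(round(percent_identity(toy_top, toy_bottom), 1))
print(round(percent_identity(toy_top, toy_bottom, count_gap_columns=True), 1))
```

The two conventions give different values for the same alignment, which is worth keeping in mind when comparing homology figures between reports.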


Acknowledgements

This research was supported by the Netherlands Organization for Chemical Research (SON) with financial aid from the Netherlands Organization for Scientific Research (NWO). We would like to thank H.A.M.B. van Nunen for his enthusiastic participation in the analysis of the rat cDNA and E.D. Kluis and P. van der Most for assistance in the preparation of the figures.

References

1 Westermark, P., Wernstedt, C., Wilander, E. and Sletten, K. (1986) Biochem. Biophys. Res. Commun. 140, 827-831.
2 Cooper, G.J.S., Willis, A.C., Clark, A., Turner, R.C., Sim, R.B. and Reid, K.B.M. (1987) Proc. Natl. Acad. Sci. USA 84, 8628-8632.
3 Johnson, K.H., O'Brien, T.D., Hayden, D.W., Jordan, K., Ghobrial, H.K.G., Mahoney, W.C. and Westermark, P. (1988) Am. J. Pathol. 130, 1-8.
4 Kanatsuka, A., Makino, H., Oshawa, Y., Yamaguchi, T., Yoshida, S. and Adachi, M. (1989) FEBS Lett. 259, 199-201.
5 Ogawa, A., Harris, V., McCorkle, S.K., Unger, R.H. and Luskey, K.L. (1990) J. Clin. Invest. 85, 973-976.
6 Leighton, B. and Cooper, G.J.S. (1988) Nature 335, 632-635.
7 Cooper, G.J.S., Leighton, B., Dimitriadis, G.D., Parry-Billings, M., Kowalchuk, J.M., Howland, K., Rothbard, J.B., Willis, A.C. and Reid, K.B.M. (1988) Proc. Natl. Acad. Sci. USA 85, 7763-7766.
8 Koopmans, S-J., van Mansfeld, A.D.M., Jansz, H.S., Krans, H.M.J., Radder, J.K., Frölich, M., de Boer, S.F., Kreutter, D.K., Andrews, G.C. and Maassen, J.A. (1990) Diabetes 39, 121A.
9 Sowa, R., Sanke, T., Hirayama, H., Furuta, H., Nishimura, S. and Nanjo, K. (1990) Diabetologia 33, 118-120.
10 Molina, J.M., Cooper, G.J.S., Leighton, B. and Olefsky, J.M. (1990) Diabetes 39, 260-265.
11 Leffert, J.D., Newgard, C.B., Okamoto, H., Milburn, J.L. and Luskey, K.L. (1989) Proc. Natl. Acad. Sci. USA 86, 3127-3130.
12 Nishi, M., Chang, S.J., Nagamatsu, S., Bell, G.I. and Steiner, D.F. (1989) Proc. Natl. Acad. Sci. USA 86, 5738-5742.
13 Ferrier, G.J.M., Pierson, A.M., Jones, P.M., Bloom, S.R., Girgis, S.I. and Legon, S. (1989) J. Mol. Endocrinol. 3, R1-R4.
14 Betsholtz, C., Christmanson, L., Engström, U., Rorsman, F., Svensson, V., Johnson, K.H. and Westermark, P. (1989) FEBS Lett. 251, 261-264.
15 Asai, J., Nakazato, M., Kangawa, K., Matsukura, S. and Matsuo, H. (1989) Biochem. Biophys. Res. Commun. 164, 400-405.
16 Glenner, G.G., Eanes, E.D. and Willey, C.A. (1988) Biochem. Biophys. Res. Commun. 155, 608-614.
17 Betsholtz, C., Svensson, V., Rorsman, F., Engström, U., Westermark, G.T., Wilander, E., Johnson, K.H. and Westermark, P. (1989) Exp. Cell Res. 183, 484-493.
18 Cooper, G.J.S., Day, A.J., Willis, A.C., Roberts, A.N., Reid, K.B.M. and Leighton, B. (1989) Biochim. Biophys. Acta 1014, 247-258.
19 Betsholtz, C., Christmanson, L., Engström, U., Rorsman, F., Jordan, K., O'Brien, T.D., Murtaugh, M., Johnson, K.H. and Westermark, P. (1990) Diabetes 39, 118-122.
20 Mosselman, S., Höppener, J.W.M., Zandberg, J., van Mansfeld, A.D.M., Geurts van Kessel, A.H.M., Lips, C.J.M. and Jansz, H.S. (1988) FEBS Lett. 239, 227-232.
21 Mosselman, S., Höppener, J.W.M., Lips, C.J.M. and Jansz, H.S. (1989) FEBS Lett. 247, 154-158.
22 Sanke, T., Bell, G.I., Sample, C., Rubenstein, A.H. and Steiner, D.F. (1988) J. Biol. Chem. 263, 17243-17246.
23 Nishi, M., Sanke, T., Seino, S., Eddy, R.L., Fan, Y.-S., Byers, M.G., Shows, T.B., Bell, G.I. and Steiner, D.F. (1989) Mol. Endocrinol. 3, 1775-1781.
24 Steenbergh, P.H., Höppener, J.W.M., Zandberg, J., Lips, C.J.M. and Jansz, H.S. (1985) FEBS Lett. 183, 403-407.
25 Steenbergh, P.H., Höppener, J.W.M., Zandberg, J., Visser, A., Lips, C.J.M. and Jansz, H.S. (1986) FEBS Lett. 209, 97-10.
26 Han, J.H., Stratowa, C. and Rutter, W.J. (1987) Biochem. 26, 1617-1625.
27 Aviv, H. and Leder, P. (1972) Proc. Natl. Acad. Sci. USA 69, 1402-1408.
28 Maniatis, T., Fritsch, E.F. and Sambrook, J. (1982) Molecular Cloning, Cold Spring Harbor Laboratory, Cold Spring Harbor.
29 Maxam, A. and Gilbert, W. (1977) Proc. Natl. Acad. Sci. USA 74, 560-564.
30 Sanger, F., Nicklen, S. and Coulson, A.R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
31 Padgett, R.A., Grabowski, P.J., Konarska, M.M., Seiler, S. and Sharp, P.A. (1986) Annu. Rev. Biochem. 55, 1119-1150.
32 Manley, J.L. (1988) Biochim. Biophys. Acta 950, 1-12.
33 Frohman, M.A., Dush, M.K. and Martin, G.R. (1988) Proc. Natl. Acad. Sci. USA 85, 8998-9002.
34 Jones, K.A., Kadonaga, J.T., Rosenfeld, P.J., Kelly, T.J. and Tijan, R.T. (1987) Cell 48, 79-89.
35 Magnusson, M.A. and Shelton, K.D. (1989) J. Biol. Chem. 264, 15936-15942.
36 Hamada, H., Petrino, M.G. and Kakunaga, T. (1982) Proc. Natl. Acad. Sci. USA 79, 6465-6469.
37 Berg, D.T., Walls, J.D., Reifel-Miller, A.E. and Grinnell, B.W. (1989) Mol. Cell. Biol. 9, 5248-5253.
38 Karlsson, O., Edlund, T., Moss, J.B., Rutter, W.J. and Walker, M.D. (1987) Proc. Natl. Acad. Sci. USA 84, 8819-8823.
39 Ohlsson, H., Karlsson, O. and Edlund, T. (1988) Proc. Natl. Acad. Sci. USA 85, 4228-4231.
40 Labuda, D. and Striker, G. (1989) Nucleic Acids Res. 17, 2477-2491.
41 Bass, B.L., Weintraub, H., Cattaneo, R. and Billeter, M.A. (1989) Cell 56, 331.