Isolation and complete nucleotide sequence of the gene for bovine parathyroid hormone


Gene, 28 (1984) 319-329
Elsevier

GENE 1002

(Recombinant DNA; Southern blot; S1 mapping; comparison with human gene; plasmid vector pBR322; M13 and λ phage; Charon 30; introns; exons)

Christine A. Weaver, David F. Gordon*, Martin S. Kissil*, David A. Mead and Byron Kemper**

Department of Physiology and Biophysics and College of Medicine at Urbana-Champaign, University of Illinois at Urbana-Champaign, Urbana, IL 61801 (U.S.A.) Tel. (217)333-1146

(Received December 6th, 1983)
(Revision received February 1st, 1984)
(Accepted February 13th, 1984)

SUMMARY

The structure of the bovine parathyroid hormone (PTH) gene has been analyzed by Southern blot hybridization of genomic DNA and by nucleotide sequence analysis of a cloned PTH gene. In the Southern analysis, several restriction enzymes produced single fragments that hybridized to PTH cDNA, suggesting that there is a single bovine PTH gene. The restriction map of the cloned gene is the same as that determined by Southern blot analysis of bovine DNA. The sequence of 3154 bp of the cloned gene has been determined including 510 bp and 139 bp in the 5' and 3' flanking regions, respectively. The gene contains two introns which separate three exons that code primarily for: (i) the 5' untranslated region, (ii) the pre-sequence of preProPTH, and (iii) PTH and the 3' untranslated region. The gene contains 68% A + T and unusually long stretches of 100- to 150-bp sequences containing alternating A and T nucleotides in the 5' flanking region and intron A. The 5' flanking region contains two TATA sequences, both of which appear to be functional as determined by S1 nuclease mapping. Compared to the rat and human genes, the locations of the introns are identical but the sizes differ. Comparable human and bovine sequences in the flanking regions and introns are about 80% homologous.

* Present addresses: (D.F.G.) Department of Physiology and Biophysics, University of Iowa, Iowa City, IA 52242 (U.S.A.) Tel. (319)353-5149; (M.S.K.) Amersham Corporation, 2636 S. Clearbrook Drive, Arlington Heights, IL 60005 (U.S.A.) Tel. (312)593-6300.
** To whom reprint requests should be addressed.

Abbreviations: bp, base pairs; kb, 1000 bp; pfu, plaque-forming units; preProPTH, preproparathyroid hormone; proPTH, proparathyroid hormone; PTH, parathyroid hormone; SDS, sodium dodecyl sulfate.

0378-1119/84/$03.00 © 1984 Elsevier Science Publishers

INTRODUCTION

PTH is an 84 amino acid protein which is involved in maintaining extracellular concentrations of calcium within a narrow range. The secretion and, to a lesser extent, biosynthesis of PTH is regulated by the concentration of extracellular calcium in a simple feedback loop (Kemper, 1984). PTH is initially synthesized as a precursor, preProPTH, which is converted by two sequential proteolytic cleavages to ProPTH and then PTH (Kemper, 1984). Bovine PTH mRNA has been partially purified and contains about 700 nucleotides (Stolarsky and Kemper, 1978) and the complete sequence of the mRNA has been derived from cloned bovine PTH cDNA (Gordon and Kemper, 1980; Weaver et al., 1982). Similarly, a partial sequence of human PTH mRNA has been derived from cloned cDNA (Hendy et al., 1981).

Reverse transcription of bovine PTH mRNA, that was primed by restriction fragments hybridized near the 5' terminus, suggested that PTH mRNA was heterogeneous in length with two clusters of 5' termini about 30 nucleotides apart (Weaver et al., 1982). The derived sequence of the larger mRNA molecules began with the sequence UAUAUAAA, a consensus TATA sequence, which was in the correct location to direct initiation of transcription to produce the smaller forms of PTH mRNA. A partial sequence of the human PTH gene has been determined and demonstrates that the gene contains two TATA sequences about 30 bp apart in the 5' flanking region (Vasicek et al., 1983). The downstream human TATA sequence is analogous to the TATA sequence detected in the bovine PTH mRNA and a second upstream TATA sequence is present in the appropriate position to direct the initiation of the transcription of the larger forms of bovine PTH mRNA if an analogous sequence is present in the bovine gene. We now report the complete sequence of the bovine PTH gene which contains two TATA sequences analogous to the human PTH gene. S1 nuclease mapping of the 5' terminus of PTH mRNA confirms that both TATA sequences are functional in vivo.

MATERIALS AND METHODS

(a) Hybridization probes

Hybridization probes for the PTH gene were restriction fragments of bovine PTH cDNA that were isolated from pPTHi4, which contains a near full-length PTH cDNA insert (Weaver et al., 1982). For most of the Southern blot experiments a 380-bp fragment generated by SacI cleavage was used (Fig. 2). This probe corresponds to most of the coding and the 3' untranslated region of PTH cDNA. For plaque and colony filter hybridizations either the entire pPTHi4 plasmid or the 380-bp SacI fragment was used as a probe. Restriction fragments were isolated by preparative electrophoresis on 4% or 6% polyacrylamide gels (Maxam and Gilbert, 1980). The DNA fragments were labeled with [α-32P]dCTP by nick translation to a specific activity of 1 to 4 x 10’ cpm/µg of DNA (Rigby et al., 1977).

(b) DNA blot hybridization

DNA was isolated from the livers of individual 17-month-old Angus Hereford steers (Blin and Stafford, 1976; Maniatis et al., 1978). Analyses by electrophoresis on 1% agarose gels indicated that the DNA was probably larger than bacteriophage λ DNA and was > 100000 bp. Bovine liver DNA, at 30 µg/400 µl of the appropriate buffer, was digested overnight with excess restriction endonucleases.

DNA blots were prepared as described by Southern (1975). For hybridization, 10’ cpm of nick translated heat-denatured cDNA probe (1 to 4 x 10’ cpm/µg) in 5 ml of the hybridization buffer was added to the filter and incubated at 68°C for 24-48 h. The final most stringent wash of the filters before autoradiography was in 0.15 M NaCl, 0.12 M Tris·HCl, pH 8.0, 5 mM sodium EDTA, 0.1% SDS and 0.1% sodium pyrophosphate.

(c) Preparation of a partial bovine DNA library

High-Mr bovine DNA obtained from a single animal was digested to completion with EcoRI. DNA fragments from 4000 to 12000 bp were isolated by sucrose-gradient centrifugation. The DNA was used to produce a partial bovine library in Charon 30 as described by Maniatis et al. (1978). Recombinant phage DNA was packaged in vitro into phage with Packagene as specified by Biotec, Inc. (Madison, WI), and plated on 25 15-cm plates as described by Williams and Blattner (1979). A total of 4.3 x 10^5 pfu were obtained.

(d) Screening the phage library

The partial bovine library was screened using in situ plaque filter hybridization as described by Maniatis et al. (1978). Two positive plaques were plaque-purified and the presence of the PTH gene confirmed by Southern blot analysis of the phage DNA.
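The complete EcoRI digestion and 4000 to 12000 bp size selection described in section c can be illustrated in silico. The following is a minimal sketch, not the authors' procedure: the function names and the toy sequence are hypothetical, and it assumes a linear DNA molecule cut at every GAATTC site (EcoRI cleaves between G and AATTC).

```python
import re

ECORI_SITE = "GAATTC"  # EcoRI recognition sequence; cleavage is G^AATTC


def ecori_fragments(seq: str) -> list[int]:
    """Fragment lengths from a complete EcoRI digest of a linear sequence."""
    seq = seq.upper()
    # The cut falls one base into each recognition site (after the G).
    cuts = [m.start() + 1 for m in re.finditer(f"(?={ECORI_SITE})", seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def size_select(fragments: list[int], lo: int = 4000, hi: int = 12000) -> list[int]:
    """Keep fragments in the size window used for the partial library."""
    return [f for f in fragments if lo <= f <= hi]


# Hypothetical toy molecule with two EcoRI sites (not bovine sequence).
toy = "A" * 5000 + ECORI_SITE + "C" * 6994 + ECORI_SITE + "T" * 100
fragments = ecori_fragments(toy)
selected = size_select(fragments)
```

In this toy case the 7000-bp fragment falls inside the selection window, which mirrors why a size-fractionated library could be expected to contain the 7000-bp EcoRI fragment detected on the Southern blots.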

(e) Subcloning of the PTH gene into pBR322

The bovine DNA inserts in the recombinant phage DNA were recovered by annealing the cohesive ends of the phage, digestion with EcoRI and separation of the ligated arms from the insert by sucrose gradient centrifugation (Maniatis et al., 1978). The PTH gene fragments were ligated into the EcoRI site of pBR322 and Escherichia coli RR1 was transformed with the recombinant plasmids by the calcium shock technique (Weaver et al., 1982).

(f) Determination of the DNA sequence

The nucleotide sequence was determined by the chemical method of Maxam and Gilbert (1980) except that acetylacetone was added after the first ethanol precipitation of the pyrimidine reactions to eliminate residual hydrazine (Jay et al., 1983). The 1500-bp SacI fragment that contains part of intron A, exon 2, intron B and part of exon 3 was inserted into the SacI site of M13mp10 and sequenced by the dideoxynucleotide termination method (Sanger et al., 1977; Messing et al., 1981). To obtain nested subfragments of the SacI fragment, 120 µg of DNA from each of two recombinant phage containing the SacI fragment in opposite orientations were hybridized and digested with S1 nuclease followed by BAL 31. The blunt-ended fragments were inserted into the SmaI site of M13mp11 for sequencing.

(g) S1 nuclease mapping

The 5' end of PTH mRNA was analyzed by S1 mapping as described by Weaver and Weissmann (1979). The 5'-32P-labeled probe was a fragment labeled at the PvuII site (nucleotide 53 in Fig. 3) extending to a PstI site at about -600. PTH mRNA (0.25 µg) about 50% pure (Stolarsky and Kemper, 1978) was annealed to 4.5 x 10^5 cpm (0.89 µg) of the DNA probe for 18 h at 48°C. After digestion with S1 nuclease at 30°C for 0.75, 4 or 6 h, the samples were analyzed on an 8% polyacrylamide-7 M urea DNA gel. The 5'-32P-labeled PvuII-PstI fragment that had been chemically degraded by the A + G sequencing reaction (Maxam and Gilbert, 1978) was analyzed in parallel to determine the size of the protected fragments.

RESULTS

(a) Restriction map of bovine PTH gene

For Southern blot analysis of the bovine PTH gene, 26 different restriction enzymes or combinations were used to determine a map of the gene. A representative autoradiogram is shown in Fig. 1 and the restriction map is shown at the top of Fig. 2A. Most of the enzymes which did not cleave within the cDNA probe, as indicated by the dots above the lanes in Fig. 1, produced a single major fragment that hybridized to the cDNA. In addition to those shown in Fig. 1, HaeIII (2100 bp), RsaI (705 bp), and HhaI (22000 bp) also produced single bands. For the enzymes TaqI (Fig. 1, lane 10) and XbaI (Fig. 1, lane 13) which cleave the cDNA once between the two SacI sites, two major bands were seen. These data suggest that the PTH gene exists as a single copy within the haploid bovine genome since flanking regions in multiple PTH genes would probably diverge and produce more than one band. Comparison of the restriction map of bovine DNA with that of bovine PTH cDNA indicated that intervening sequences of greater than 1500 bp were present in the 5' portion but not in the 3' portion of the PTH gene.

The Southern blot analysis indicated that the PTH gene was contained within a 7000-bp EcoRI fragment (Fig. 1, lane 16). Consequently, a partial library of EcoRI-digested bovine DNA was established using Charon 30 as a vector. Four positive plaques out of 400000 screened were obtained and two were plaque-purified and analyzed. A partial restriction map of the cloned PTH gene is shown in Fig. 2A and is compared to the map obtained by the genomic Southern blot analysis. The two maps are basically the same within experimental error for determining the sizes of DNA fragments, particularly for the Southern blots. In addition, new sites are shown in the cloned gene which could not be detected in the Southern analysis since only those sites for a restriction enzyme bordering the probe could be detected.

Fig. 1. Southern blot analysis of bovine liver DNA. DNA prepared from the liver of a single animal was digested with the indicated restriction endonucleases. 30 µg of each DNA digest were fractionated by electrophoresis through a 1.0% agarose gel and transferred to a nitrocellulose sheet as described in MATERIALS AND METHODS, section b. PTH gene fragments were detected by hybridization to a 32P probe (see Fig. 2A) consisting of a SacI fragment of the cDNA insert of pPTHi4, which was nick-translated to a specific activity of 2 x 10’ cpm/µg. The dots above the lane numbers indicate digestions with single enzymes that do not cleave within the SacI fragment. Slash indicates that both enzymes were used for double digestion. RF DNA of φX174 digested with HaeIII and λ DNA digested with HindIII served as Mr standards (lane 6). The sizes of the fragments in bp are shown on the right.

Fig. 2. Partial restriction map of the bovine PTH gene and strategy for sequencing. (A) The restriction map derived from the Southern blot analysis is shown in the second line and compared to the restriction map of the cloned PTH gene in the third line. The SacI fragment used as a probe for the Southern blot analysis (see Fig. 1) is shown above. Solid boxes represent the exons of the PTH gene and are referred to in the text as 1 through 3 from the left. The large intron is intron A and the small one is intron B. In the bottom line the PTH gene region is expanded and the regions of mRNA that are encoded in each exon are indicated. (B) Restriction sites used in the sequencing are indicated. The arrows above the line represent fragments sequenced by the Maxam and Gilbert (1980) method with the circles indicating the 5' radioactive end and the arrow indicating the direction of sequencing. The arrows below the line indicate sequences determined by the dideoxynucleotide chain termination method. The letters represent restriction enzymes as follows: T, TaqI; R, RsaI; H, HindIII; A, AccI; S, SacI; HF, HinfI; HC, HincII.

(b) Sequence of bovine PTH gene

The complete nucleotide sequence of the bovine PTH gene (Fig. 3) was determined by the strategy shown in Fig. 2B. Except for exon 3, over 90% of the molecule was sequenced at least twice with 75% sequenced either on both strands or by both methods. The total 3154-bp sequence includes 510 bp in the 5' flanking region and 139 bp in the 3' flanking region. The gene contains two introns, one of 1714 bp which interrupts the 5' untranslated region of PTH mRNA and one of 119 bp which interrupts the region coding for the pro-sequence of preProPTH (Fig. 3). The exons crudely represent functional domains. Exon 1 contains 95 bp that correspond to all of the 5' untranslated region except for 5 bp preceding the initiator methionine codon. Exon 2 contains 91 bp that correspond to 5 bp from the 5' untranslated region and the region that codes for the pre-sequence and the first four amino acids of the pro-sequence of preProPTH. Exon 3 contains 486 bp that correspond to the region coding for the remainder of the pro-sequence, PTH and the 3' untranslated region. The sequences of the exons were identical to that determined for bovine PTH cDNA except for T2372 in the 3' noncoding region which was a G in the cDNA sequence. This difference is probably the result of polymorphism at this site or an error introduced during cloning.

The overall nucleotide composition of the PTH gene contains 68% A + T (Table I). The flanking regions of the gene, intron A, and the 3' noncoding region of the gene are particularly rich in A + T, ranging from 69 to 76%. The 3' flanking region shares with the neighboring 3' untranslated region a high percentage of T's and very low percentage of C's. Intron B is unusual in having a high percentage of G's (29%), a property it shares with its neighboring regions, exon 2 and the translated region of exon 3. A 100-bp core in intron A is also G (35%) and C (21%) rich.

Regions of the gene contain long sequences characterized by alternating A and T in both the 5' flanking region (-420 to -260) and intron A (1548 to 1643) as shown in Fig. 3. This region in the intron consists of three tandem nearly perfect copies of a 28-bp sequence. Repeating units within this type of sequence in the 5' flanking region are less obvious, but four sequences that are flanked by small strings of A at the 5' end and T at the 3' end are partly homologous to each other. Each of these four sequences contains a 12-bp core that matches 12 bp present in the 28-bp repeats in the intron.

TABLE I

Nucleotide composition of the bovine PTH gene

                     Length    Percent of nucleotides
                     (bp)       A     T     C     G    A+T
5' flanking           510      37    36    13    14    73
Exon 1                 95      32    31    20    18    63
Intron A             1714      35    35    15    15    70
Exon 2                 91      29    35    13    23    64
Intron B              119      22    32    17    29    54
Exon 3-coding         262      34    23    18    26    57
Exon 3-noncoding      224      31    38    17    14    69
3' flanking           139      36    40    15     9    76
Total                3154      34    34    15    16    68
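The composition figures in Table I can be cross-checked with a short calculation. This is a minimal sketch under stated assumptions: `base_composition` is a hypothetical helper for computing percentages from any sequence string, and the region lengths and A+T percentages are copied from Table I. Because the tabulated percentages are rounded, the length-weighted mean can only be expected to agree with the reported overall 68% to within rounding.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Percent A, T, C, G and A+T for a DNA sequence (other symbols ignored)."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    pct = {base: 100.0 * counts[base] / total for base in "ATCG"}
    pct["A+T"] = pct["A"] + pct["T"]
    return pct


# (length in bp, percent A+T) for each region, as listed in Table I.
TABLE_I = {
    "5' flanking": (510, 73),
    "Exon 1": (95, 63),
    "Intron A": (1714, 70),
    "Exon 2": (91, 64),
    "Intron B": (119, 54),
    "Exon 3-coding": (262, 57),
    "Exon 3-noncoding": (224, 69),
    "3' flanking": (139, 76),
}

total_length = sum(length for length, _ in TABLE_I.values())
weighted_at = sum(length * at for length, at in TABLE_I.values()) / total_length

# Toy input only; this is not a stretch of the PTH gene.
example = base_composition("ATATATGC")
```

The region lengths sum to the 3154 bp stated in the text, and the length-weighted A+T content comes out at about 68.6%, consistent with the tabulated total given that each regional value is rounded to a whole percent.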

Fig. 3. Nucleotide sequence of the bovine PTH gene. The sequence of the DNA strand corresponding to the mRNA sequence is shown. In the 5' flanking region potential stem and loop structures are indicated by solid lines (stem) connected by a dotted line (loop). The two potential TATA sequences are underlined. Sequences of exons are shown as separated groups of three nucleotides, and in the coding regions, the appropriate amino acid is indicated above the nucleotide sequence. A possible transcription start point is numbered 1. In the large intron (1545-1630 bp) three adjacent direct repeat sequences are indicated by arrows. In the 3' flanking region the polyadenylation signal, AATAAA, is underlined and a potential transcription terminator is underlined twice.

(c) Comparison with human PTH gene

Comparison of the nucleotide sequence of the bovine PTH gene with that of the human PTH gene (Vasicek et al., 1983) by dot matrix homology indicates strong homology throughout the gene, except in the 3' noncoding region (Fig. 4). The sequences of the bovine and human genes in the flanking regions and introns are compared directly in Fig. 5. The 238 bp preceding a possible start point for transcription are remarkably similar. With minimal introduction of gaps, the two sequences are 84% homologous, which is comparable to the 90% homology in the protein coding regions. Also conserved are two TATA sequences about 30 bp apart (-58 to -54 and -24 to -19) which would serve to direct the initiation of transcription (Breathnach and Chambon, 1981). Notably, the region between the TATA sequences is only 50% conserved.

The introns and flanking regions are also conserved (Fig. 5). Homology of about 80% between the human and bovine sequences is present in the introns and 3' flanking region if gaps are not considered. In contrast, the 3' noncoding region requires multiple gaps to maximize homology (Hendy et al., 1981). The two introns follow the standard GT-AG rule at the splicing junctions (Breathnach and Chambon, 1981). The polyadenylation signal and 14 bp in the region of the sites of polyadenylation are completely conserved between the human and bovine (Fig. 5). However, the actual site of polyadenylation in the human gene is displaced 4 or 5 bp from the bovine site. A 5-bp gap must be introduced in the human sequence between the polyadenylation signal and the polyadenylation site to maximize homology. This finding indicates that the distance from the polyadenylation signal is more important than the actual sequence at the polyadenylation site for determining the site of polyadenylation. A similar conclusion has been reached on the basis of deletion mutations between the signal and site for polyadenylation in SV40 mRNAs (Fitzgerald and Shenk, 1981). About 80 bp 3' to the polyadenylation site is a sequence of dyad symmetry preceding a string of Ts. The sequence is similar to rho-independent termination sites in prokaryotes (Platt, 1981). A change from C in the bovine gene to T in the human gene is accompanied by a G to A change which maintains the dyad symmetry.

Fig. 4. Comparison of bovine and human PTH gene by dot matrix analysis. A computer program generated a plot by placing a dot when an identical sequence of 7 nucleotides in the bovine gene along the vertical axis was detected in the human gene along the horizontal axis (Novotny, 1982). Exons of each gene are indicated by the solid boxes along the axis. The break in the axis for the human gene is the intron A region for which sequence information is not available. The human sequence is from Vasicek et al. (1983). Axis label: HUMAN SEQUENCE (bp x 10^-3).

(d) 5' termini of PTH mRNA

We have shown previously by reverse transcription that the 5' ends of bovine PTH mRNA are heterogeneous and that the initiation of transcription of the shorter mRNAs could be directed by a TATA sequence corresponding to the sequence, -24 to -19, that was present in the longer mRNA sequences. The finding of another TATA sequence further upstream confirms the prediction (Weaver et al., 1982) that a second sequence should be present in the gene to direct the initiation of the longer mRNAs. We have further confirmed the presence of heterogeneous 5' termini of PTH mRNA by S1 mapping as shown in Fig. 6. The arrows indicate the locations on the sequencing ladder expected to be start sites for PTH mRNA. At the shortest time of S1 nuclease digestion (lane b), bands corresponding to these sites are observed if a correction of about 1.5 nucleotides is made because of the different nature of cleavage by S1 nuclease and chemical degradation (Sollner-Webb and Reeder, 1979). The fragments marked by the top and bottom arrows are apparently 'nibbled' by the S1 nuclease at the longer times. In both cases regions of A and T follow the expected initial A and would be more susceptible to the nibbling. These studies differ from the previous reverse transcriptase studies in that the intensity of the largest band is much greater relative to the smaller bands.
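The dot matrix comparison of section c (Fig. 4) places a dot wherever an identical run of 7 nucleotides occurs in both sequences. The following is a minimal sketch of that idea, not the program of Novotny (1982): the function name, the indexing strategy and the toy sequences are assumptions made for illustration.

```python
def dot_matrix(seq_a: str, seq_b: str, window: int = 7) -> set[tuple[int, int]]:
    """Return (i, j) pairs where seq_a[i:i+window] equals seq_b[j:j+window].

    Positions are 0-based; i indexes seq_a (vertical axis in Fig. 4 terms)
    and j indexes seq_b (horizontal axis).
    """
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every window of seq_b so each window of seq_a is a dictionary lookup.
    index: dict[str, list[int]] = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], ()):
            dots.add((i, j))
    return dots


# Toy sequences (hypothetical, not taken from the bovine or human gene).
dots = dot_matrix("ACGTACGTTT", "TTACGTACGA")
```

Runs of consecutive dots along a diagonal correspond to conserved stretches, while breaks or shifts between diagonals correspond to gaps, which is how the plot exposes the poorly conserved 3' noncoding region. Requiring exact 7-mers is a deliberately strict criterion; it suppresses chance matches at the cost of missing regions interrupted by frequent single-base changes.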

Fig. 5. Comparison of the bovine and human PTH gene sequences in the flanking and intron regions. Identical nucleotides in the two sequences are indicated by asterisks. The dashes indicate gaps introduced in the sequences to maximize homology. In the 5' flanking region a potential Z DNA region (-127 to -111) and the two TATA sequences are underlined. In the 3' flanking region the polyadenylation signal is underlined and the arrows indicate the polyadenylation sites. A small potential transcription termination stem and loop structure preceding a string of Ts is indicated by the solid lines (stem) connected with a dotted line (loop). The nucleotides are numbered according to the bovine sequence in Fig. 3. Panels: 5' FLANKING REGION; 3' FLANKING REGION; INTRON A; INTRON B.

Fig. 6. S1 nuclease mapping of the 5' end of the bovine PTH mRNA. Partly purified bovine PTH mRNA was hybridized to a PvuII-PstI fragment (53 to -600), digested with S1 nuclease and analyzed by electrophoresis on a 8% polyacrylamide-7 M urea DNA sequencing gel as described in MATERIALS AND METHODS, section g. In lane a is an A + G sequencing reaction of the PvuII-PstI fragment used as a marker. Extra bands in this lane are probably due to a contaminating fragment. The sequence present in this region is indicated and is the complement of that of the A + G reacted fragment. The arrows indicate the expected 5' termini of PTH mRNA based on previous reverse transcription studies (Weaver et al., 1982). Extensive nibbling of the DNA by S1 nuclease probably occurred in the AT-rich regions at the ends of the expected largest and smallest fragments. Incubation with S1 was for 45 min (lane b), 4 h (lane c) and 6 h (lane d).

DISCUSSION

We have cloned an EcoRI fragment from bovine DNA that contains the PTH gene. The restriction map of the cloned bovine PTH gene is the same as that determined by Southern blot analysis of total bovine DNA. The generation of single fragments containing the PTH gene by digestion of bovine DNA with several restriction enzymes suggests that a single copy of the PTH gene is present in the bovine genome. The complete sequence of the gene for bovine PTH has been determined. The gene contains two introns which occur in the same location as those in the human (Vasicek et al., 1983) PTH gene. Intron A, which interrupts the 5' noncoding region, is 1714 bp in the bovine gene as compared to 3400 bp in the human. The small difference in size of intron B between the human and bovine gene can be accounted for mainly by deleted sequences in a region 8 bp from the 5' splice junction of the intron. Interestingly, a region about 12 bp from the 5' splice junction in intron A also shows little homology between human and bovine. The remainder of the intron B and the remainder of the small portion of the human intron A that has been sequenced are about 80% homologous in the bovine and human. This conservation of intron sequence is greater than the homology that is retained in the 3' untranslated region of the bovine and human PTH genes which suggests that these internal intron sequences are important in some way for the structure or function of the PTH pre-mRNA. RNA transcribed in vivo from the PTH gene has heterogeneous 5' termini as analyzed by reverse transcription (Weaver et al., 1982) and S1 mapping. The probable heterogeneity falls into two classes, a macroheterogeneity derived from initiation of transcription in two regions of the gene about 30 bp apart and a microheterogeneity within these two regions.
The basis of the macroheterogeneity is probably the presence of two TATA sequences about 30 bp apart in the 5’ flanking region of the bovine gene which are conserved in the human gene and which are at the correct location to direct initiation of RNA synthesis in the two regions. Multiple initiation sites for RNA transcription controlled by multiple TATA sequences are unusual but have been observed in the chicken lysozyme gene (Grez et al., 1981) and in genes for avian very low density apolipoprotein II (Hache et al., 1983). Potentially, multiple TATA sequences could increase the efficiency of transcription or could be differentially functional depending on the physiological state of the parathyroid cell, which would explain the conservation of these sequences in the human and bovine genes. The microheterogeneity in RNA initiation starts directed by a single TATA sequence has been observed in several genes transcribed by RNA polymerase II (see Baker and Ziff, 1981 and references therein). The downstream TATA sequence in the bovine gene is unusual in containing 10 alternating TA nucleotides and this may contribute to the microheterogeneity of initiation starts over 8 bp directed by this sequence. It is also possible that the heterogeneity, particularly the microheterogeneity, at the 5’ terminus could have been introduced by RNase activity during the isolation of the mRNA. Direct studies of the transcription of the gene in vitro will be required to eliminate this possibility.

In addition to the TATA sequences other regions in the 5’ flanking region may be important in the regulation of transcription. The 240 bp before initiation sites of transcription is strongly conserved in the human and bovine gene. Within this region in the human gene is a region of alternating purines and pyrimidines which potentially could form Z DNA (Vasicek et al., 1983) and is partially conserved in the bovine gene. A second striking sequence containing long stretches of alternating A and T (-260 to -420) is present in the 5’ flanking region of the bovine gene. Sequence information for the human gene is not available in this region so it is not known if homologous sequences occur in the human gene. However, an AT-rich region about 500 bp upstream of the RNA initiation site in the silk fibroin gene (Tsuda and Suzuki, 1981) and the AT-rich spacers between histone genes (Grosschedl and Birnstiel, 1980) are required for optimal rates of transcription. In the bovine PTH gene there are four repeats of a core sequence in this region. Because of the alternating A’s and T’s, potential single-stranded loops could form in this region and, since the sequences are repeated, several alternative loops could be formed which might be related to transcriptional activity of the gene. Alternatively, these sequences may be residual sequences from the initial formation of the PTH gene. A similar AT region is present near the 3’ end of intron A (1540 to 1630) forming an approximation of direct repeats in the 5’ flanking region and intron A. These direct repeats may have been generated if exon 1 and the 5’ regulatory region were inserted as part of a mobile element near the other PTH exons to form the ancestral PTH gene.

ACKNOWLEDGEMENTS

We thank K. Bruce, C. Jackson and D. Shapiro for advice and procedures for the Southern blot analysis and genomic cloning, C. Bauer for aid in synthesizing [32P]ATP, J. Gardner for supplying T4 DNA ligase and O. Uhlenbeck for supplying T4 polynucleotide kinase. We acknowledge the assistance of N. Browne in the initial sequencing studies. This research was supported by grant AM18866 from the National Institutes of Health.

REFERENCES

Baker, C.C. and Ziff, E.B.: Promoters and heterogeneous 5’ termini of the messenger RNAs of adenovirus serotype 2. J. Mol. Biol. 149 (1981) 189-221.

Blin, N. and Stafford, D.W.: A general method for isolation of high molecular weight DNA from eukaryotes. Nucl. Acids Res. 3 (1976) 2303-2308.

Breathnach, R. and Chambon, P.: Organization and expression of eukaryotic split genes coding for protein. Annu. Rev. Biochem. 50 (1981) 349-383.

Fitzgerald, M. and Shenk, T.: The sequence 5’-AAUAAA-3’ forms part of the recognition site for polyadenylation of late SV40 mRNA. Cell 24 (1981) 251-260.

Gordon, D.F. and Kemper, B.: Synthesis, restriction analysis, and molecular cloning of near full-length DNA complementary to bovine parathyroid hormone mRNA. Nucl. Acids Res. 8 (1980) 5669-5683.

Grez, M., Land, H., Giesecke, K., Schütz, G., Jung, A. and Sippel, A.E.: Multiple mRNAs are generated from the chicken lysozyme gene. Cell 25 (1981) 743-752.

Grosschedl, R. and Birnstiel, M.L.: Spacer DNA sequences upstream of the T-A-T-A-A-A-T-A sequence are essential for promotion of H2A histone gene transcription in vivo. Proc. Natl. Acad. Sci. USA 77 (1980) 7102-7106.

Hache, R.J.G., Wiskocil, R., Vasa, M., Roy, R.N., Lau, P.C.K. and Deeley, R.G.: The 5’ noncoding and flanking regions of the avian very low density apolipoprotein II and serum albumin genes. Homologies with the egg white protein genes. J. Biol. Chem. 258 (1983) 4556-4564.

Hendy, G.N., Kronenberg, H.M., Potts Jr., J.T. and Rich, A.: Nucleotide sequence of cloned cDNAs encoding human preproparathyroid hormone. Proc. Natl. Acad. Sci. USA 78 (1981) 7365-7369.

Jay, E., Seth, A.K., Rommens, J., Sood, A. and Jay, G.: Gene expression: chemical synthesis of E. coli ribosome binding sites and their use in directing the expression of mammalian proteins in bacteria. Nucl. Acids Res. 10 (1983) 6319-6329.

Kemper, B.: Biosynthesis and secretion of parathyroid hormone, in Cantin, M. (Ed.), Cell Biology of the Secretory Process. Karger, Basel, 1984, pp. 443-480.

Maniatis, T., Hardison, R.C., Lacy, E., Lauer, J., O’Connell, C., Quon, D., Sim, G.K. and Efstratiadis, A.: The isolation of structural genes from libraries of eukaryotic DNA. Cell 15 (1978) 687-701.

Maxam, A.M. and Gilbert, W.: Sequencing end-labelled DNA with base-specific chemical cleavages, in Grossman, L. and Moldave, K. (Eds.), Methods in Enzymology, Vol. 65. Academic Press, New York, 1980, pp. 499-560.

Messing, J., Crea, R. and Seeburg, P.H.: A system for shotgun DNA sequencing. Nucl. Acids Res. 9 (1981) 309-321.

Novotny, J.: Matrix program to analyze primary structure homology. Nucl. Acids Res. 10 (1982) 127-131.

Platt, T.: Termination of transcription and its regulation in the tryptophan operon of E. coli. Cell 24 (1981) 10-23.

Poncz, M., Solowiejczyk, D., Ballantine, M., Schwartz, E. and Surrey, S.: “Nonrandom” DNA sequence analysis in bacteriophage M13 by the dideoxy chain-termination method. Proc. Natl. Acad. Sci. USA 79 (1982) 4298-4302.

Rigby, P.W.J., Dieckmann, M., Rhodes, C. and Berg, P.: Labelling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. 113 (1977) 237-251.

Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.

Sollner-Webb, B. and Reeder, R.H.: The nucleotide sequence of the initiation and termination sites for ribosomal RNA transcription in X. laevis. Cell 18 (1979) 485-499.

Southern, E.M.: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98 (1975) 503-517.

Stolarsky, L. and Kemper, B.: Characterization and partial purification of parathyroid hormone messenger RNA. J. Biol. Chem. 253 (1978) 7194-7201.

Tsuda, M. and Suzuki, Y.: Faithful transcription initiation of fibroin gene in a homologous cell-free system reveals an enhancing effect of 5’ flanking sequence far upstream. Cell 27 (1981) 175-182.

Vasicek, T.J., McDevitt, B.E., Freeman, M.W., Fennick, B.J., Hendy, G.N., Potts Jr., J.T., Rich, A. and Kronenberg, H.M.: Nucleotide sequence of the human parathyroid hormone gene. Proc. Natl. Acad. Sci. USA 80 (1983) 2127-2131.

Weaver, C.A., Gordon, D.F. and Kemper, B.: Nucleotide sequence of bovine parathyroid hormone messenger RNA. Mol. Cell. Endocrinol. 28 (1982) 411-424.

Weaver, R.F. and Weissmann, C.: Mapping of RNA by a modification of the Berk-Sharp procedure: the 5’ termini of 15S β-globin mRNA precursor and mature 10S β-globin mRNA have identical map coordinates. Nucl. Acids Res. 7 (1979) 1175-1193.

Williams, B.G. and Blattner, F.R.: Construction and characterization of the hybrid bacteriophage lambda Charon vectors for DNA cloning. J. Virol. 29 (1979) 555-575.

Communicated by S.R. Jaskunas.