Gene, 28 (1984) 319-329
Elsevier

GENE 1002
Isolation and complete nucleotide sequence of the gene for bovine parathyroid hormone

(Recombinant DNA; Southern blot; S1 mapping; comparison with human gene; plasmid vector pBR322; M13 and λ phage; Charon 30; introns; exons)
Christine A. Weaver, David F. Gordon*, Martin S. Kissil*, David A. Mead and Byron Kemper**

Department of Physiology and Biophysics and College of Medicine at Urbana-Champaign, University of Illinois at Urbana-Champaign, Urbana, IL 61801 (U.S.A.) Tel. (217)333-1146

(Received December 6th, 1983)
(Revision received February 1st, 1984)
(Accepted February 13th, 1984)
SUMMARY
The structure of the bovine parathyroid hormone (PTH) gene has been analyzed by Southern blot hybridization of genomic DNA and by nucleotide sequence analysis of a cloned PTH gene. In the Southern analysis, several restriction enzymes produced single fragments that hybridized to PTH cDNA, suggesting that there is a single bovine PTH gene. The restriction map of the cloned gene is the same as that determined by Southern blot analysis of bovine DNA. The sequence of 3154 bp of the cloned gene has been determined, including 510 bp and 139 bp in the 5’ and 3’ flanking regions, respectively. The gene contains two introns which separate three exons that code primarily for: (i) the 5’ untranslated region, (ii) the pre-sequence of preProPTH, and (iii) PTH and the 3’ untranslated region. The gene contains 68% A + T and unusually long stretches of 100- to 150-bp sequences containing alternating A and T nucleotides in the 5’ flanking region and intron A. The 5’ flanking region contains two TATA sequences, both of which appear to be functional as determined by S1 nuclease mapping. Compared to the rat and human genes, the locations of the introns are identical but the sizes differ. Comparable human and bovine sequences in the flanking regions and introns are about 80% homologous.
* Present addresses: (D.F.G.) Department of Physiology and Biophysics, University of Iowa, Iowa City, IA 52242 (U.S.A.) Tel. (319)353-5149; (M.S.K.) Amersham Corporation, 2636 S. Clearbrook Drive, Arlington Heights, IL 60005 (U.S.A.) Tel. (312)593-6300.
** To whom reprint requests should be addressed.

Abbreviations: bp, base pairs; kb, 1000 bp; pfu, plaque-forming units; preProPTH, preproparathyroid hormone; proPTH, proparathyroid hormone; PTH, parathyroid hormone; SDS, sodium dodecyl sulfate.

0378-1119/84/$03.00 © 1984 Elsevier Science Publishers

INTRODUCTION
PTH is an 84-amino acid protein which is involved in maintaining extracellular concentrations of calcium within a narrow range. The secretion and, to a lesser extent, biosynthesis of PTH are regulated by the concentration of extracellular calcium in a simple feedback loop (Kemper, 1984). PTH is initially synthesized as a precursor, preProPTH, which is converted by two sequential proteolytic cleavages to ProPTH and then PTH (Kemper, 1984). Bovine PTH mRNA has been partially purified and contains about 700 nucleotides (Stolarsky and Kemper, 1978) and the complete sequence of the mRNA has been derived from cloned bovine PTH cDNA (Gordon and Kemper, 1980; Weaver et al., 1982). Similarly, a partial sequence of human PTH mRNA has been derived from cloned cDNA (Hendy et al., 1981).

Reverse transcription of bovine PTH mRNA that was primed by restriction fragments hybridized near the 5’ terminus suggested that PTH mRNA was heterogeneous in length with two clusters of 5’ termini about 30 nucleotides apart (Weaver et al., 1982). The derived sequence of the larger mRNA molecules began with the sequence UAUAUAAA, a consensus TATA sequence, which was in the correct location to direct initiation of transcription to produce the smaller forms of PTH mRNA. A partial sequence of the human PTH gene has been determined and demonstrates that the gene contains two TATA sequences about 30 bp apart in the 5’ flanking region (Vasicek et al., 1983). The downstream human TATA sequence is analogous to the TATA sequence detected in the bovine PTH mRNA, and a second upstream TATA sequence is present in the appropriate position to direct the initiation of the transcription of the larger forms of bovine PTH mRNA if an analogous sequence is present in the bovine gene. We now report the complete sequence of the bovine PTH gene, which contains two TATA sequences analogous to those of the human PTH gene. S1 nuclease mapping of the 5’ terminus of PTH mRNA confirms that both TATA sequences are functional in vivo.

MATERIALS AND METHODS

(a) Hybridization probes

Hybridization probes for the PTH gene were restriction fragments of bovine PTH cDNA that were isolated from pPTHi4, which contains a near full-length PTH cDNA insert (Weaver et al., 1982). For most of the Southern blot experiments a 380-bp fragment generated by SacI cleavage was used (Fig. 2). This probe corresponds to most of the coding and the 3’ untranslated region of PTH cDNA. For plaque and colony filter hybridizations either the entire pPTHi4 plasmid or the 380-bp SacI fragment was used as a probe. Restriction fragments were isolated by preparative electrophoresis on 4% or 6% polyacrylamide gels (Maxam and Gilbert, 1980). The DNA fragments were labeled with [α-32P]dCTP by nick translation to a specific activity of 1 to 4 x 10’ cpm/µg of DNA (Rigby et al., 1977).

(b) DNA blot hybridization

DNA was isolated from the livers of individual 17-month-old Angus Hereford steers (Blin and Stafford, 1976; Maniatis et al., 1978). Analyses by electrophoresis on 1% agarose gels indicated that the DNA was probably larger than bacteriophage λ DNA and was > 100000 bp. Bovine liver DNA, at 30 µg/400 µl of the appropriate buffer, was digested overnight with excess restriction endonucleases. DNA blots were prepared as described by Southern (1975). For hybridization, 10’ cpm of nick-translated heat-denatured cDNA probe (1 to 4 x 10’ cpm/µg) in 5 ml of the hybridization buffer was added to the filter and incubated at 68°C for 24-48 h. The final most stringent wash of the filters before autoradiography was in 0.15 M NaCl, 0.12 M Tris·HCl, pH 8.0, 5 mM sodium EDTA, 0.1% SDS and 0.1% sodium pyrophosphate.

(c) Preparation of a partial bovine DNA library

High-Mr bovine DNA obtained from a single animal was digested to completion with EcoRI. DNA fragments from 4000 to 12000 bp were isolated by sucrose-gradient centrifugation. The DNA was used to produce a partial bovine library in Charon 30 as described by Maniatis et al. (1978). Recombinant phage DNA was packaged in vitro into phage with Packagene as specified by Biotec, Inc. (Madison, WI), and plated on 25 15-cm plates as described by Williams and Blattner (1979). A total of 4.3 x 10^5 pfu were obtained.

(d) Screening the phage library

The partial bovine library was screened using in situ plaque filter hybridization as described by Maniatis et al. (1978). Two positive plaques were plaque-purified and the presence of the PTH gene confirmed by Southern blot analysis of the phage DNA.
(e) Subcloning of the PTH gene into pBR322

The bovine DNA inserts in the recombinant phage DNA were recovered by annealing the cohesive ends of the phage, digestion with EcoRI and separation of the ligated arms from the insert by sucrose gradient centrifugation (Maniatis et al., 1978). The PTH gene fragments were ligated into the EcoRI site of pBR322 and Escherichia coli RR1 was transformed with the recombinant plasmids by the calcium shock technique (Weaver et al., 1982).

(f) Determination of the DNA sequence

The nucleotide sequence was determined by the chemical method of Maxam and Gilbert (1980) except that acetylacetone was added after the first ethanol precipitation of the pyrimidine reactions to eliminate residual hydrazine (Jay et al., 1983). The 1500-bp SacI fragment that contains part of intron A, exon 2, intron B and part of exon 3 was inserted into the SacI site of M13mp10 and sequenced by the dideoxynucleotide termination method (Sanger et al., 1977; Messing et al., 1981). To obtain nested subfragments of the SacI fragment, 120 µg of DNA from each of two recombinant phage containing the SacI fragment in opposite orientations were hybridized and digested with S1 nuclease followed by BAL 31. The blunt-ended fragments were inserted into the SmaI site of M13mp11 for sequencing.

(g) S1 nuclease mapping

The 5’ end of PTH mRNA was analyzed by S1 mapping as described by Weaver and Weissmann (1979). The 5’-32P-labeled probe was a fragment labeled at the PvuII site (nucleotide 53 in Fig. 3) extending to a PstI site at about -600. PTH mRNA (0.25 µg) about 50% pure (Stolarsky and Kemper, 1978) was annealed to 4.5 x 10^5 cpm (0.89 µg) of the DNA probe for 18 h at 48°C. After digestion with S1 nuclease at 30°C for 0.75, 4 or 6 h, the samples were analyzed on an 8% polyacrylamide-7 M urea DNA gel. The 5’-32P-labeled PvuII-PstI fragment that had been chemically degraded by the A + G sequencing reaction (Maxam and Gilbert, 1978) was analyzed in parallel to determine the size of the protected fragments.

RESULTS

(a) Restriction map of bovine PTH gene
For Southern blot analysis of the bovine PTH gene, 26 different restriction enzymes or combinations were used to determine a map of the gene. A representative autoradiogram is shown in Fig. 1 and the restriction map is shown at the top of Fig. 2A. Most of the enzymes which did not cleave within the cDNA probe, as indicated by the dots above the lanes in Fig. 1, produced a single major fragment that hybridized to the cDNA. In addition to those shown in Fig. 1, HaeIII (2100 bp), RsaI (705 bp), and HhaI (22000 bp) also produced single bands. For the enzymes TaqI (Fig. 1, lane 10) and XbaI (Fig. 1, lane 13), which cleave the cDNA once between the two SacI sites, two major bands were seen. These data suggest that the PTH gene exists as a single copy within the haploid bovine genome, since flanking regions in multiple PTH genes would probably diverge and produce more than one band. Comparison of the restriction map of bovine DNA with that of bovine PTH cDNA indicated that intervening sequences of greater than 1500 bp were present in the 5’ portion but not in the 3’ portion of the PTH gene.

The Southern blot analysis indicated that the PTH gene was contained within a 7000-bp EcoRI fragment (Fig. 1, lane 16). Consequently, a partial library of EcoRI-digested bovine DNA was established using Charon 30 as a vector. Four positive plaques out of 400000 screened were obtained and two were plaque-purified and analyzed. A partial restriction map of the cloned PTH gene is shown in Fig. 2A and is compared to the map obtained by the genomic Southern blot analysis. The two maps are basically the same within experimental error for determining the sizes of DNA fragments, particularly for the Southern blots. In addition, new sites are shown in the cloned gene which could not be detected in the Southern analysis, since only those sites for a restriction enzyme bordering the probe could be detected.

(b) Sequence of bovine PTH gene

The complete nucleotide sequence of the bovine PTH gene (Fig. 3) was determined by the strategy shown in Fig. 2B. Except for exon 3, over 90% of the molecule was sequenced at least twice, with 75% sequenced either on both strands or by both methods. The total 3154-bp sequence includes 510 bp in the 5’ flanking region and 139 bp in the 3’ flanking region. The gene contains two introns, one of 1714 bp which interrupts the 5’ untranslated region of PTH mRNA and one of 119 bp which interrupts the region coding for the pro-sequence of preProPTH (Fig. 3). The exons crudely represent functional domains. Exon 1 contains 95 bp that correspond to all of the 5’ untranslated region except for 5 bp preceding the initiator methionine codon. Exon 2 contains 91 bp that correspond to 5 bp from the 5’ untranslated region and the region that codes for the pre-sequence and the first four amino acids of the pro-sequence of preProPTH. Exon 3 contains 486 bp that correspond to the region coding for the remainder of the pro-sequence, PTH and the 3’ untranslated region. The sequences of the exons were identical to that determined for bovine PTH cDNA except for T2372 in the 3’ noncoding region, which was a G in the cDNA sequence. This difference is probably the result of polymorphism at this site or an error introduced during cloning.

The overall nucleotide composition of the PTH gene is 68% A + T (Table I). The flanking regions of the gene, intron A, and the 3’ noncoding region of the gene are particularly rich in A + T, ranging from 69 to 76%. The 3’ flanking region shares with the neighboring 3’ untranslated region a high percentage of T’s and very low percentage of C’s. Intron B is unusual in having a high percentage of G’s (29%), a property it shares with its neighboring regions, exon 2 and the translated region of exon 3. A 100-bp core in intron A is also G (35%) and C (21%) rich.

Regions of the gene contain long sequences characterized by alternating A and T in both the 5’ flanking region (-420 to -260) and intron A (1548 to 1643) as shown in Fig. 3. This region in the intron consists of three tandem nearly perfect copies of a 28-bp sequence. Repeating units within this type of sequence in the 5’ flanking region are less obvious, but four sequences that are flanked by small strings of A at the 5’ end and T at the 3’ end are partly homologous to each other. Each of these four sequences contains a 12-bp core that matches 12 bp present in the 28-bp repeats in the intron.

Fig. 1. Southern blot analysis of bovine liver DNA. DNA prepared from the liver of a single animal was digested with the indicated restriction endonucleases. 30 µg of each DNA digest were fractionated by electrophoresis through a 1.0% agarose gel and transferred to a nitrocellulose sheet as described in MATERIALS AND METHODS, section b. PTH gene fragments were detected by hybridization to a 32P probe (see Fig. 2A) consisting of a SacI fragment of the cDNA insert of pPTHi4, which was nick-translated to a specific activity of 2 x 10’ cpm/µg. The dots above the lane numbers indicate digestions with single enzymes that do not cleave within the SacI fragment. Slash indicates that both enzymes were used for double digestion. RF DNA of φX174 digested with HaeIII and λ DNA digested with HindIII served as Mr standards (lane 6). The sizes of the fragments in bp are shown on the right.

Fig. 2. Partial restriction map of the bovine PTH gene and strategy for sequencing. (A) The restriction map derived from the Southern blot analysis is shown in the second line and compared to the restriction map of the cloned PTH gene in the third line. The SacI fragment used as a probe for the Southern blot analysis (see Fig. 1) is shown above. Solid boxes represent the exons of the PTH gene and are referred to in the text as 1 through 3 from the left. The large intron is intron A and the small one is intron B. In the bottom line the PTH gene region is expanded and the regions of mRNA that are encoded in each exon are indicated. (B) Restriction sites used in the sequencing are indicated. The arrows above the line represent fragments sequenced by the Maxam and Gilbert (1980) method, with the circles indicating the 5’ radioactive end and the arrow indicating the direction of sequencing. The arrows below the line indicate sequences determined by the dideoxynucleotide chain termination method. The letters represent restriction enzymes as follows: T, TaqI; R, RsaI; H, HindIII; A, AccI; S, SacI; HF, HinfI; HC, HincII.

TABLE I
Nucleotide composition of the bovine PTH gene

                     Length    Percent of nucleotides
Region               (bp)      A     T     C     G     A+T
5’ flanking           510      37    36    13    14    73
Exon 1                 95      32    31    20    18    63
Intron A             1714      35    35    15    15    70
Exon 2                 91      29    35    13    23    64
Intron B              119      22    32    17    29    54
Exon 3-coding         262      34    23    18    26    57
Exon 3-noncoding      224      31    38    17    14    69
3’ flanking           139      36    40    15     9    76
Total                3154      34    34    15    16    68
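The region lengths and A + T percentages reported in Table I can be cross-checked with a few lines of arithmetic. The following is a minimal sketch in Python, not part of the original analysis: the names `TABLE_I`, `total_length` and `weighted_at_percent` are hypothetical, and the values are simply transcribed from the table. Because the per-region percentages are rounded to whole numbers, the length-weighted A + T value is only expected to agree with the reported total to within about one percentage point.

```python
# Rows transcribed from Table I: (region, length in bp, %A, %T, %C, %G, %A+T).
# All names here are illustrative; the paper itself reports only the table.
TABLE_I = [
    ("5' flanking",      510, 37, 36, 13, 14, 73),
    ("Exon 1",            95, 32, 31, 20, 18, 63),
    ("Intron A",        1714, 35, 35, 15, 15, 70),
    ("Exon 2",            91, 29, 35, 13, 23, 64),
    ("Intron B",         119, 22, 32, 17, 29, 54),
    ("Exon 3-coding",    262, 34, 23, 18, 26, 57),
    ("Exon 3-noncoding", 224, 31, 38, 17, 14, 69),
    ("3' flanking",      139, 36, 40, 15,  9, 76),
]


def total_length(rows):
    """Sum the region lengths (bp)."""
    return sum(row[1] for row in rows)


def weighted_at_percent(rows):
    """Length-weighted mean of the per-region A+T percentages."""
    weighted_sum = sum(row[1] * row[6] for row in rows)
    return weighted_sum / total_length(rows)


if __name__ == "__main__":
    print("total length (bp):", total_length(TABLE_I))
    print("length-weighted A+T (%):", round(weighted_at_percent(TABLE_I), 1))
```

The summed lengths reproduce the 3154-bp total, and the weighted A + T value falls close to the 68% given for the whole gene, which is as much agreement as rounded inputs allow.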
Fig. 3. Nucleotide sequence of the bovine PTH gene. The sequence of the DNA strand corresponding to the mRNA sequence is shown. In the 5’ flanking region potential stem and loop structures are indicated by solid lines (stem) connected by a dotted line (loop). The two potential TATA sequences are underlined. Sequences of exons are shown as separated groups of three nucleotides, and in the coding regions, the appropriate amino acid is indicated above the nucleotide sequence. A possible transcription start point is numbered 1. In the large intron (1545-1630 bp) three adjacent direct repeat sequences are indicated by arrows. In the 3’ flanking region the polyadenylation signal, AATAAA, is underlined and a potential transcription terminator is underlined twice.

(c) Comparison with human PTH gene

Comparison of the nucleotide sequence of the bovine PTH gene with that of the human PTH gene (Vasicek et al., 1983) by dot matrix homology indicates strong homology throughout the gene, except in the 3’ noncoding region (Fig. 4). The sequences of the bovine and human genes in the flanking regions and introns are compared directly in Fig. 5. The 238 bp preceding a possible start point for transcription are remarkably similar. With minimal introduction of gaps, the two sequences are 84% homologous, which is comparable to the 90% homology in the protein coding regions. Also conserved are two TATA sequences about 30 bp apart (-58 to -54 and -24 to -19) which would serve to direct the initiation of transcription (Breathnach and Chambon, 1981). Notably, the region between the TATA sequences is only 50% conserved.

The introns and flanking regions are also conserved (Fig. 5). Homology of about 80% between the human and bovine sequences is present in the introns and 3’ flanking region if gaps are not considered. In contrast, the 3’ noncoding region requires multiple gaps to maximize homology (Hendy et al., 1981). The two introns follow the standard GT-AG rule at the splicing junctions (Breathnach and Chambon, 1981). The polyadenylation signal and 14 bp in the region of the sites of polyadenylation are completely conserved between the human and bovine (Fig. 5). However, the actual site of polyadenylation in the human gene is displaced 4 or 5 bp from the bovine site. A 5-bp gap must be introduced in the human sequence between the polyadenylation signal and the polyadenylation site to maximize homology. This finding indicates that the distance from the polyadenylation signal is more important than the actual sequence at the polyadenylation site for determining the site of polyadenylation. A similar conclusion has been reached on the basis of deletion mutations between the signal and site for polyadenylation in SV40 mRNAs (Fitzgerald and Shenk, 1981). About 80 bp 3’ to the polyadenylation site is a sequence of dyad symmetry preceding a string of Ts. The sequence is similar to rho-independent termination sites in prokaryotes (Platt, 1981). A change from C in the bovine gene to T in the human gene is accompanied by a G to A change which maintains the dyad symmetry.

(d) 5’ termini of PTH mRNA

We have shown previously by reverse transcription that the 5’ ends of bovine PTH mRNA are heterogeneous and that the initiation of transcription of the shorter mRNAs could be directed by a TATA sequence corresponding to the sequence, -24 to -19, that was present in the longer mRNA sequences. The finding of another TATA sequence further upstream confirms the prediction (Weaver et al., 1982) that a second sequence should be present in the gene to direct the initiation of the longer mRNAs. We have further confirmed the presence of heterogeneous 5’ termini of PTH mRNA by S1 mapping as shown in Fig. 6. The arrows indicate the locations on the sequencing ladder expected to be start sites for PTH mRNA. At the shortest time of S1 nuclease digestion (lane b), bands corresponding to these sites are observed if a correction of about 1.5 nucleotides is made because of the different nature of cleavage by S1 nuclease and chemical degradation (Sollner-Webb and Reeder, 1979). The fragments marked by the top and bottom arrows are apparently ‘nibbled’ by the S1 nuclease at the longer times. In both cases regions of A and T follow the expected initial A and would be more susceptible to the nibbling. These studies differ from the previous reverse transcriptase studies in that the intensity of the largest band is much greater relative to the smaller bands.

Fig. 4. Comparison of bovine and human PTH gene by dot matrix analysis. A computer program generated a plot by placing a dot when an identical sequence of 7 nucleotides in the bovine gene along the vertical axis was detected in the human gene along the horizontal axis (Novotny, 1982). Exons of each gene are indicated by the solid boxes along the axis. The break in the axis for the human gene is the intron A region for which sequence information is not available. The human sequence is from Vasicek et al. (1983).
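The dot matrix comparison used for Fig. 4 places a dot wherever an identical run of 7 nucleotides occurs in both sequences. The following Python sketch illustrates that idea under stated assumptions: it is not the program of Novotny (1982), the function name `dot_matrix` and the toy input sequences are hypothetical, and only exact 7-nucleotide matches on the same strand are reported.

```python
def dot_matrix(seq_a, seq_b, window=7):
    """Return (i, j) pairs where seq_a[i:i+window] equals seq_b[j:j+window].

    Each pair corresponds to one dot: i is the 0-based position along the
    first sequence (vertical axis), j the position along the second
    (horizontal axis).
    """
    seq_a = seq_a.upper()
    seq_b = seq_b.upper()

    # Index every window-length word of seq_b by its start positions.
    positions = {}
    for j in range(len(seq_b) - window + 1):
        positions.setdefault(seq_b[j:j + window], []).append(j)

    # Look up each window of seq_a in that index.
    dots = []
    for i in range(len(seq_a) - window + 1):
        for j in positions.get(seq_a[i:i + window], []):
            dots.append((i, j))
    return dots


if __name__ == "__main__":
    # Toy sequences for illustration only; they are not taken from Fig. 3 or Fig. 5.
    for dot in dot_matrix("ACGTACGTAA", "TTACGTACGT"):
        print(dot)
```

Regions of strong homology then appear as diagonal runs of consecutive dots, while gaps or poorly conserved stretches, such as the 3’ noncoding region, interrupt or shift the diagonal.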
Fig. 5. Comparison of the bovine and human PTH gene sequences in the flanking and intron regions. Identical nucleotides in the two sequences are indicated by asterisks. The dashes indicate gaps introduced in the sequences to maximize homology. In the 5’ flanking region a potential Z DNA region (-127 to -111) and the two TATA sequences are underlined. In the 3’ flanking region the polyadenylation signal is underlined and the arrows indicate the polyadenylation sites. A small potential transcription termination stem and loop structure preceding a string of Ts is indicated by the solid lines (stem) connected with a dotted line (loop). The nucleotides are numbered according to the bovine sequence in Fig. 3.

Fig. 6. S1 nuclease mapping of the 5’ end of the bovine PTH mRNA. Partly purified bovine PTH mRNA was hybridized to a PvuII-PstI fragment (53 to -600) and digested with S1 nuclease and analyzed by electrophoresis on an 8% polyacrylamide-7 M urea DNA sequencing gel as described in MATERIALS AND METHODS, section g. In lane a is an A + G sequencing reaction of the PvuII-PstI fragment used as a marker. Extra bands present in this lane are due to a contaminating fragment. The sequence of the DNA in this region is indicated and is the complement of that of the A + G reacted fragment. The arrows indicate the expected largest and smallest 5’ termini of PTH mRNA relative to the sequencing ladder based on previous reverse transcription studies (Weaver et al., 1982). Extensive nibbling of the DNA by S1 nuclease probably occurred in the AT-rich regions at the ends of the expected fragments. Incubation with S1 nuclease was for 45 min (lane b), 4 h (lane c) and 6 h (lane d).

DISCUSSION
We have cloned an EcoRI fragment from bovine DNA that contains the PTH gene. The restriction map of the cloned bovine PTH gene is the same as that determined by Southern blot analysis of total bovine DNA. The generation of single fragments containing the PTH gene by digestion of bovine DNA with several restriction enzymes suggests that a single copy of the PTH gene is present in the bovine genome. The complete sequence of the gene for bovine PTH has been determined. The gene contains two introns which occur in the same location as those in the human PTH gene (Vasicek et al., 1983). Intron A, which interrupts the 5’ noncoding region, is 1714 bp in the bovine gene as compared to 3400 bp in the human. The small difference in size of intron B between the human and bovine gene can be accounted for mainly by deleted sequences in a region 8 bp from the 5’ splice junction of the intron. Interestingly, a region about 12 bp from the 5’ splice junction in intron A also shows little homology between human and bovine. The remainder of intron B and the remainder of the small portion of the human intron A that has been sequenced are about 80% homologous in the bovine and human. This conservation of intron sequence is greater than the homology that is retained in the 3’ untranslated region of the bovine and human PTH genes, which suggests that these internal intron sequences are important in some way for the structure or function of the PTH pre-mRNA.

RNA transcribed in vivo from the PTH gene has heterogeneous 5’ termini as analyzed by reverse transcription (Weaver et al., 1982) and S1 mapping. The probable heterogeneity falls into two classes, a macroheterogeneity derived from initiation of transcription in two regions of the gene about 30 bp apart and a microheterogeneity within these two regions.
The basis of the macroheterogeneity is probably the presence of two TATA sequences about 30 bp apart in the 5’ flanking region of the bovine gene which are conserved in the human gene and which are at the correct location to direct initiation of RNA synthesis in the two regions. Multiple initiation sites for RNA transcription controlled by multiple TATA sequences are unusual but have been observed in the chicken lysozyme gene (Grez et al., 1981) and in genes for avian very low density apolipoprotein II (Hache et al., 1983). Potentially, multiple TATA sequences could increase the efficiency of transcription or could be differentially functional depending on the physiological state of the parathyroid cell, which would explain the conservation of these sequences in the human and bovine genes. The microheterogeneity in RNA initiation starts directed by a single TATA sequence has been observed in several genes transcribed by RNA polymerase II (see Baker and Ziff, 1981 and references therein). The downstream TATA sequence in the bovine gene is unusual in containing 10 alternating TA nucleotides and this may contribute to the microheterogeneity of initiation starts over 8 bp directed by this sequence. It is also possible that the heterogeneity, particularly the microheterogeneity, at the 5’ terminus could have been introduced by RNase activity during the isolation of the mRNA. Direct studies of the transcription of the gene in vitro will be required to eliminate this possibility.

In addition to the TATA sequences other regions in the 5’ flanking region may be important in the regulation of transcription. The 240 bp before initiation sites of transcription is strongly conserved in the human and bovine gene. Within this region in the human gene is a region of alternating purines and pyrimidines which potentially could form Z DNA (Vasicek et al., 1983) and is partially conserved in the bovine gene. A second striking sequence containing long stretches of alternating A and T (-260 to -420) is present in the 5’ flanking region of the bovine gene. Sequence information for the human gene is not available in this region so it is not known if homologous sequences occur in the human gene. However, an AT-rich region about 500 bp upstream of the RNA initiation site in the silk fibroin gene (Tsuda and Suzuki, 1981) and the AT-rich spacers between histone genes (Grosschedl and Birnstiel, 1980) are required for optimal rates of transcription. In the bovine PTH gene there are four repeats of a core sequence in this region. Because of the alternating A’s and T’s, potential single-stranded loops could form in this region and, since the sequences are repeated, several alternative loops could be formed which might be related to transcriptional activity of the gene. Alternatively, these sequences may be residual sequences from the initial formation of the PTH gene. A similar AT region is present near the 3’ end of intron A (1540 to 1630) forming an approximation of direct repeats in the 5’ flanking region and intron A. These direct repeats may have been generated if exon 1 and the 5’ regulatory region were inserted as part of a mobile element near the other PTH exons to form the ancestral PTH gene.

ACKNOWLEDGEMENTS

We thank K. Bruce, C. Jackson and D. Shapiro for advice and procedures for the Southern blot analysis and genomic cloning, C. Bauer for aid in synthesizing [32P]ATP, J. Gardner for supplying T4 DNA ligase and O. Uhlenbeck for supplying T4 polynucleotide kinase. We acknowledge the assistance of N. Browne in the initial sequencing studies. This research was supported by grant AM18866 from the National Institutes of Health.
REFERENCES

Baker, C.C. and Ziff, E.B.: Promoters and heterogeneous 5’ termini of the messenger RNAs of adenovirus serotype 2. J. Mol. Biol. 149 (1981) 189-221.
Blin, N. and Stafford, D.W.: A general method for isolation of high molecular weight DNA from eukaryotes. Nucl. Acids Res. 3 (1976) 2303-2308.
Breathnach, R. and Chambon, P.: Organization and expression of eukaryotic split genes coding for protein. Annu. Rev. Biochem. 50 (1981) 349-383.
Fitzgerald, M. and Shenk, T.: The sequence 5’-AAUAAA-3’ forms part of the recognition site for polyadenylation of late SV40 mRNA. Cell 24 (1981) 251-260.
Gordon, D.F. and Kemper, B.: Synthesis, restriction analysis, and molecular cloning of near full-length DNA complementary to bovine parathyroid hormone mRNA. Nucl. Acids Res. 8 (1980) 5669-5683.
Grez, M., Land, H., Giesecke, K., Schütz, G., Jung, A. and Sippel, A.E.: Multiple mRNAs are generated from the chicken lysozyme gene. Cell 25 (1981) 743-752.
Grosschedl, R. and Birnstiel, M.L.: Spacer DNA sequences upstream of the T-A-T-A-A-A-T-A sequence are essential for promotion of H2A histone gene transcription in vivo. Proc. Natl. Acad. Sci. USA 77 (1980) 7102-7106.
Hache, R.J.G., Wiskocil, R., Vasa, M., Roy, R.N., Lau, P.C.K. and Deeley, R.G.: The 5’ noncoding and flanking regions of the avian very low density apolipoprotein II and serum albumin genes. Homologies with the egg white protein genes. J. Biol. Chem. 258 (1983) 4556-4564.
Hendy, G.N., Kronenberg, H.M., Potts Jr., J.T. and Rich, A.: Nucleotide sequence of cloned cDNAs encoding human preproparathyroid hormone. Proc. Natl. Acad. Sci. USA 78 (1981) 7365-7369.
Jay, E., Seth, A.K., Rommens, J., Sood, A. and Jay, G.: Gene expression: chemical synthesis of E. coli ribosome binding sites and their use in directing the expression of mammalian proteins in bacteria. Nucl. Acids Res. 10 (1983) 6319-6329.
Kemper, B.: Biosynthesis and secretion of parathyroid hormone, in Cantin, M. (Ed.), Cell Biology of the Secretory Process. Karger, Basel, 1984, pp. 443-480.
Maniatis, T., Hardison, R.C., Lacy, E., Lauer, J., O’Connell, C., Quon, D., Sim, G.K. and Efstratiadis, A.: The isolation of structural genes from libraries of eukaryotic DNA. Cell 15 (1978) 687-701.
Maxam, A.M. and Gilbert, W.: Sequencing end-labelled DNA with base-specific chemical cleavages, in Grossman, L. and Moldave, K. (Eds.), Methods in Enzymology, Vol. 65. Academic Press, New York, 1980, pp. 499-560.
Messing, J., Crea, R. and Seeburg, P.H.: A system for shotgun DNA sequencing. Nucl. Acids Res. 9 (1981) 309-321.
Novotny, J.: Matrix program to analyze primary structure homology. Nucl. Acids Res. 10 (1982) 127-131.
Platt, T.: Termination of transcription and its regulation in the tryptophan operon of E. coli. Cell 24 (1981) 10-23.
Poncz, M., Solowiejczyk, D., Ballantine, M., Schwartz, E. and Surrey, S.: “Nonrandom” DNA sequence analysis in bacteriophage M13 by the dideoxy chain-termination method. Proc. Natl. Acad. Sci. USA 79 (1982) 4298-4302.
Rigby, P.W.J., Dieckmann, M., Rhodes, C. and Berg, P.: Labelling deoxyribonucleic acid to high specific activity in vitro by nick translation with DNA polymerase I. J. Mol. Biol. 113 (1977) 237-251.
Sanger, F., Nicklen, S. and Coulson, A.R.: DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74 (1977) 5463-5467.
Sollner-Webb, B. and Reeder, R.H.: The nucleotide sequence of the initiation and termination sites for ribosomal RNA transcription in X. laevis. Cell 18 (1979) 485-499.
Southern, E.M.: Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98 (1975) 503-517.
Stolarsky, L. and Kemper, B.: Characterization and partial purification of parathyroid hormone messenger RNA. J. Biol. Chem. 253 (1978) 7194-7201.
Tsuda, M. and Suzuki, Y.: Faithful transcription initiation of fibroin gene in a homologous cell-free system reveals an enhancing effect of 5’ flanking sequence far upstream. Cell 27 (1981) 175-182.
Vasicek, T.J., McDevitt, B.E., Freeman, M.W., Fennick, B.J., Hendy, G.N., Potts Jr., J.T., Rich, A. and Kronenberg, H.M.: Nucleotide sequence of the human parathyroid hormone gene. Proc. Natl. Acad. Sci. USA 80 (1983) 2127-2131.
Weaver, C.A., Gordon, D.F. and Kemper, B.: Nucleotide sequence of bovine parathyroid hormone messenger RNA. Mol. Cell. Endocrinol. 28 (1982) 411-424.
Weaver, R.F. and Weissmann, C.: Mapping of RNA by a modification of the Berk-Sharp procedure: the 5’ termini of 15S β-globin mRNA precursor and mature 10S β-globin mRNA have identical map coordinates. Nucl. Acids Res. 7 (1979) 1175-1193.
Williams, B.G. and Blattner, F.R.: Construction and characterization of the hybrid bacteriophage lambda Charon vectors for DNA cloning. J. Virol. 29 (1979) 555-575.

Communicated by S.R. Jaskunas.