GENOMICS 8, 641-647 (1990)

Characterization of the 5’-Flanking Region of the Human and Rat Na,K-ATPase α3 Gene¹

BHAVANI G. PATHAK, DIANA G. PUGH, AND JERRY B. LINGREL

Department of Molecular Genetics, Biochemistry, and Microbiology, University of Cincinnati College of Medicine, 237 Bethesda Avenue, Cincinnati, Ohio 45267-0524

Received June 27, 1990; revised August 14, 1990
Genomic clones containing the 5’-flanking region and exon 1 of the human and rat Na,K-ATPase α3 isoform gene have been isolated and characterized. The nucleotide sequences of 1.6 kb of the rat gene and 2.6 kb of the human gene in the 5’-flanking region were determined. Mapping of transcription initiation sites by primer extension and S1 nuclease protection analyses indicates that transcription is initiated in the same region in both genes, although the rat gene has a greater number of initiation sites. Neither gene has a canonical TATA box, having instead an ATAT sequence preceding the transcription initiation sites. There is a perfect CCAAT sequence, in the reverse orientation, approximately 30 bp upstream of the potential TATA box in both genes. We have identified potential binding sites for transcription factors Sp-1, AP-1, AP-2, and AP-4, as well as for glucocorticoid and thyroid hormone receptors in the 5’-flanking regions. These are conserved in both human and rat α3 isoform genes. © 1990 Academic Press, Inc.

INTRODUCTION

The Na,K-ATPase is an integral plasma membrane enzyme found in nearly all animal cells. The enzyme maintains transmembrane gradients of Na+ and K+ across the plasma membrane in an ATP-dependent manner. The maintenance of these gradients is important for regulating various cellular functions such as osmotic balance and cell volume, Na+-coupled transport of various nutrients into cells, and restoration of resting membrane potential (reviewed in De Weer, 1985). In addition, Na,K-ATPase serves specialized functions in different cell types. These include fluid movement across the kidney and transport epithelia in the gastrointestinal tract and electrical activity of muscle and nerve. It is also the receptor for cardiac glycosides (Schwartz et al., 1975). The enzyme is a heterodimer consisting of a catalytic α subunit (112,000 Da), which is responsible for all known functions of the enzyme, and a glycosylated β subunit (35,000 Da), whose function is currently unknown (Jorgensen, 1986). The α subunit has three known functional isoforms (α1, α2, and α3) that are expressed in a tissue- and developmental stage-specific manner (reviewed in Lingrel et al., 1990; Sweadner, 1989). These isoforms are encoded by separate genes (Shull and Lingrel, 1987; Sverdlov et al., 1987) and exhibit differences in hormonal regulation (reviewed in Gick et al., 1988; Lingrel et al., 1990), sensitivity to cardiac glycosides (Sweadner, 1979), Na+ affinity (Lytton, 1985), and regulation by insulin (Lytton et al., 1985).

To study the regulation of α-isoform expression, it is essential to obtain genomic sequences corresponding to these isoforms. Genomic sequences encoding the human α1, α2, and α3 isoforms and β subunit (Shull and Lingrel, 1987; Sverdlov et al., 1987; Lane et al., 1990) and the horse α1 isoform (Kano et al., 1989) have been isolated. In addition, two human α-like genomic clones have been identified (Shull and Lingrel, 1987; Sverdlov et al., 1987). The entire human α2 isoform gene has been sequenced, including 1500 nucleotides of the 5’-flanking region (Shull et al., 1989). The sequences of the 5’-flanking region and exons 1, 2, and 3 of the human α1 gene (Shull et al., 1990) and of the entire horse α1 isoform gene (Kano et al., 1989) have been determined as well. While the sequence of most of the coding regions of the human α3 gene is known (Ovchinnikov et al., 1988), the sequence of the 5’ region of this gene has not been reported.

Here, we describe the genomic organization, DNA sequence, and analysis of the 5’-flanking region of the human and rat α3 isoform genes.
Sequence data from this article have been deposited with the EMBL/GenBank Data Libraries under Accession Nos. J04775 and J04776.
¹ This work was supported by National Institutes of Health Grant HL-28573.
0888-7543/90 $3.00 Copyright © 1990 by Academic Press, Inc. All rights of reproduction in any form reserved.
MATERIALS AND METHODS
Isolation and Characterization of Genomic Clones

The human genomic phage clone (λ32-2) was isolated from a human genomic library obtained from Stratagene. Approximately 1.5 × 10^6 recombinants were screened by filter hybridization with two end-labeled synthetic oligonucleotides (5’-CTTGGGCTGGGAGCCTCTGCAGCGCCCGCGCCGTCGGTAGGTGCGCCGTCCGTCCGTCCGTCCGTT-3’ and 5’-CTTGGCGGCTCCGGCAGAGAGCCAGGCGGGCGGGGCGGGGACCTCGGGGCGGGCTCAGGCTCAGG-3’) corresponding to positions -66 to +6 and -133 to -65 relative to the translation initiation ATG codon of the human α3 isoform cDNA (Ovchinnikov et al., 1988). The rat genomic cosmid clone was isolated from a rat genomic library (a gift from Dr. Richard Akeson and Dr. Antonio Reyes, Children’s Hospital Research Center, Cincinnati, OH). Approximately 3 × 10^5 clones were screened with a random primer-labeled DNA fragment corresponding to the 5’ untranslated region of the rat α3 isoform cDNA (Shull et al., 1986). DNA from the human phage clone was digested with EcoRI, and DNA from the rat cosmid clone was digested with BamHI; the digests were fractionated by electrophoresis on 1% agarose gels and transferred to Nytran membranes. These membranes were successively hybridized to random primer-labeled restriction fragments from the rat α3 isoform cDNA (Shull et al., 1986) and end-labeled synthetic oligonucleotides (5’-GGACCGACAGACGCAC-3’ in rat and 5’-ACGGACGGACGGACGGACGGC-3’ in human), which were synthesized based on the rat and human α3 isoform cDNAs (Shull et al., 1986; Ovchinnikov et al., 1988).

DNA Sequencing

Two overlapping restriction fragments from each of the genes were identified by hybridization with synthetic oligonucleotide probes derived from the 5’ untranslated regions of the rat and human cDNAs (5’-CTTGTCATCTTTTTTG-3’ and 5’-GTGCGTCTGTCGGTCC-3’ in rat and 5’-GTCGGTAGGTGCGCCGTCGGTCCGTCCGTCCGTT-3’ and 5’-ACGGACGGACGGACGGACGGC-3’ in human).
In the human gene, a 2.8-kb AvaI fragment consisting entirely of 5’-flanking sequence and an overlapping 0.8-kb PstI fragment containing part of the 5’ untranslated region, exon 1, and part of intron 1 were subcloned into M13mp18 and M13mp19 vectors. In the rat gene, a 1.6-kb AvaI fragment containing the 5’-flanking region and an overlapping 1.35-kb BamHI fragment containing the entire 5’ untranslated region, exon 1, and part of intron 1 were also subcloned into M13 vectors. All the genomic fragments were subjected to nested deletion cloning using the Cyclone kit from International Biotechnologies, Inc. Single-stranded DNA was purified from these clones and sequenced by the dideoxy chain termination method (Sanger et al., 1977) using either the T7 sequencing kit (Pharmacia) or the Sequenase kit (United States Biochemical Corp.). The 5’-flanking region, the 5’ untranslated region, and exon 1 of both rat and human genes were sequenced in both strands. In addition, portions of the rat gene that could potentially form Z-DNA (underlined in Fig. 2) were sequenced by the chemical cleavage method (Maxam and Gilbert, 1977). DNA sequences were analyzed using the program DNANALYZE, Version 2.1 (Wernke and Thompson, 1989). The 5’ regions of both genes were aligned to give the best fit using the program NucAln (Wilbur and Lipman, 1983).

RNA Isolation

Total RNA was isolated from adult human brain and heart and adult rat brain using the method of Chomczynski and Sacchi (1987). Human brain tissue was a gift from Dr. Frank Zemlan, University of Cincinnati.

Primer Extension Analysis

Two synthetic oligonucleotides complementary to DNA sequences in the 5’ untranslated regions of the human and rat cDNAs, -54 to -82 in human and -94 to -123 in rat relative to the translation initiation site (Fig. 2), were end-labeled with [γ-32P]ATP. The appropriate primer was annealed to 50 µg of total RNA in 5 mM NaPO4, pH 7.0, 5 mM EDTA, and 0.15 mM NaCl. With the use of 40 U of AMV reverse transcriptase (Boehringer-Mannheim) and 0.08 mM of each dNTP in 50 mM Tris, pH 7.0, 5 mM DTT, and 7.5 mM MgCl2, the primer was extended to the 5’ end of the RNA. The resultant labeled fragments were analyzed by electrophoresis in 6% denaturing polyacrylamide gels.

S1 Nuclease Mapping

S1 nuclease protection was performed by hybridizing single-stranded end-labeled DNA probes to 50 µg of total RNA.
The human probe was prepared by annealing an end-labeled oligonucleotide (29-mer, Fig. 2) to single-stranded M13mp19 containing a 2.8-kb AvaI insert that covered the putative transcription start site. The primer was extended with the Klenow fragment of DNA polymerase I and the resultant double-stranded product cleaved with SacII. The probe (142 bases in length) was isolated by electrophoresis in a 1.2% alkaline agarose gel. The rat probe was prepared by annealing an end-labeled oligonucleotide (30-mer, Fig. 2) to single-stranded M13mp19 containing a 1.6-kb AvaI insert that included the transcription start site. The primer was extended with reverse transcriptase and the double-stranded product cleaved with BamHI. The resultant probe (267 bases in length) was isolated as described above. The probes (5 × 10^4 cpm) were hybridized to 50 µg of total RNA from the respective species. The samples were digested with 300 U of S1 nuclease (Bethesda Research Laboratories) at 30°C for 1 h. The protected fragments were analyzed by electrophoresis in a 6% denaturing polyacrylamide gel.

[Figure 1 panel labels: rat cosmid clone (Cos 34), total size ~33.5 kb, BamHI restriction sites (sizes in kb); human phage clone (λ32-2), total size ~18.5 kb, EcoRI restriction sites (sizes in kb).]

FIG. 1. Restriction map of rat and human genomic clones of the Na,K-ATPase α3 isoform. Both genomic clones contain regions in the 5’ end of the gene. The top figure is a BamHI restriction map of the cosmid clone, Cos 34, representing the rat gene. The bottom figure is an EcoRI restriction map of the phage clone, λ32-2, representing the human gene. Exon 1 is indicated by a vertical line. The hatched boxes below the genomic clones represent sequenced areas of the genes.

RESULTS
Isolation of Genomic Clones Containing the 5’ End of Human and Rat Genes
Human and rat genomic libraries were screened as described under Materials and Methods to obtain clones corresponding to the 5’ end of the α3 isoform gene. One clone from each library was identified that contained 5’-flanking sequence. Rat clone Cos 34, isolated from a cosmid library, spans approximately 33.5 kb, of which 28 kb is located 5’ of the first exon (indicated by a vertical bar on Fig. 1). Hybridization with a 16-mer oligonucleotide probe (described under Materials and Methods) indicated that the 5’-coding region of the gene was located in a 1.35-kb BamHI fragment (as shown in Fig. 1). Approximately 1.6 kb of the 5’-flanking region of the rat gene (Fig. 2) and 1 kb of intron 1 (data not shown) were sequenced. The human clone λ32-2, containing the 5’ end of the α3 isoform gene, spans 18.5 kb, of which 10 kb corresponds to the 5’-flanking region of the gene (exon 1 is indicated by a vertical bar on Fig. 1). A total of 3.6 kb of the human α3 isoform gene was sequenced, of which 2.8 kb is in the 5’-flanking region (Fig. 2) and 0.8 kb is in intron 1 (data not shown).
DNA Sequence Analysis

DNA sequence analysis (Fig. 2) of the 5’ end of the rat and human α3 genes shows that exon 1 consists of six bases coding for the amino acids methionine and glycine. Exon 1 is followed by a consensus 5’ splice donor site (Breathnach and Chambon, 1981). This donor site is conserved in both human and rat genes. Upstream of the transcription initiation sites, both the human and rat genes contain a sequence ATAT (located at -27 in rat and -26 in human), which may function as a TATA box even though it does not have perfect homology to the consensus sequence (Breathnach and Chambon, 1981). Approximately 30 bases upstream of the potential TATA box is a CCAAT sequence in reverse orientation in both genes. CCAAT boxes are known to be functional in either orientation (McKnight and Tjian, 1986). Based on the location of this element and its homology to the consensus sequence, it is likely to be functional in both genes. The 5’-flanking regions of both genes contain a number of potential trans-acting factor binding sites (Fig. 2). A more detailed analysis of these elements is presented in the next section. The sequence of the 5’ untranslated region of exon 1 obtained from the rat genomic clone is in perfect agreement with that previously reported for the rat cDNA (Shull et al., 1986). However, the sequence of the same region of the human gene differs at four positions from the reported sequence of the human cDNA (Ovchinnikov et al.,
[Figure 2 sequence alignment: the aligned human (H) and rat (R) nucleotide sequences did not survive text extraction and are not reproduced here. Labels recoverable from the figure include GRE, AP-1, PRIMER, the exon 1 codons MetGly, and the start of intron 1.]
FIG. 2. Nucleotide sequence of the 5’ end of human and rat α3 isoform genes. The upper sequence is that of the human gene (marked H) and the lower sequence is the rat gene (marked R). The two sequences have been aligned to give the best comparison. The colons indicate nucleotides that are identical in both sequences. Transcribed regions are in uppercase letters; 5’-flanking regions and intronic sequences are in lowercase letters. Transcription initiation sites of each of the genes are indicated by arrows, larger arrows representing the predominantly used sites. Numbers to the right of the figure refer to nucleotide positions relative to the transcription initiation site. Potential TATA and CCAAT sequences are boxed. Potential transcription factor and hormone binding sites that are conserved in both genes are underlined. An alternating purine-pyrimidine stretch of nucleotides in the rat sequence is also underlined. A homopurine · homopyrimidine region found in the human gene is marked in parentheses with a region of mirror symmetry underlined. Oligonucleotide primers used for primer extension analysis are marked.
1988). These differences include extra bases at positions -14(G), -16(C), and -116(C) and a deletion at -101(C), relative to the translation initiation codon.

Mapping of Transcription Initiation Sites

Transcription initiation sites were identified by primer extension and S1 nuclease protection analysis in both genes. The results from these two analyses indicate that the transcription initiation sites are located at -153, -154, and -156 relative to the ATG codon in the human gene (Fig. 3). The rat α3 isoform gene contains multiple initiation sites clustered between -146 and -154 relative to the translation initiation site, with the predominant ones located at positions -147 and -153 (Fig. 4). These sites are in close agreement with the start sites used in the human gene. In both the human and rat genes, transcription initiates predominantly at an adenine, in agreement with observations for other eukaryotic genes (Breathnach and Chambon, 1981). Since this base is also the 5’-most predominant start site in both genes, we have designated it as +1 and have numbered all 5’ elements relative to this site.

[Figure 3 gel images: bands labeled -156, -154, and -153 in both panels.]

FIG. 3. Transcription initiation sites of the human α3 isoform gene. (A) Primer extension analysis: An end-labeled 29-mer oligonucleotide complementary to sequences in the 5’ untranslated region of human α3 RNA was hybridized to 50 µg of total RNA from human brain (lane 1) and human heart (lane 2) and reverse transcribed with AMV reverse transcriptase. The products were resolved in a 6% denaturing polyacrylamide gel. (B) S1 nuclease mapping: A single-stranded end-labeled probe extending from the primer binding region up to the SacII site (Fig. 2) and complementary to human RNA was hybridized to 50 µg of total RNA from human brain (lane 1) and human heart (lane 2) and digested with S1 nuclease. The resultant products were analyzed in a 6% denaturing polyacrylamide gel. A tRNA control was performed and it showed no protected bands (data not shown). In both primer extension and S1 reactions, dideoxy sequencing reactions of a single-stranded template containing this region of the gene were performed with the same primer and electrophoresed alongside as a marker. The sequence shown is in the antisense strand. End-labeled fragments of a 1.0-kb DNA ladder were also used as markers (data not shown).

[Figure 4 gel images: band labels include -154, -153, -152, -151, -147, and -146.]

FIG. 4. Transcription initiation sites of the rat α3 isoform gene. The results from primer extension analysis (A) and S1 nuclease protection (B) are shown. The procedures were performed in essentially the same fashion as for the human gene (Fig. 3). An end-labeled 30-mer oligonucleotide probe (indicated in Fig. 2) was used for primer extension. A single-stranded end-labeled probe extending from the primer to the BamHI site (shown on Fig. 2) was used for S1 analysis. These probes were hybridized to 50 µg of total RNA from rat brain and treated accordingly. The products were analyzed in the same way as described in Fig. 3. Size markers included sequencing reactions (in the antisense direction) and end-labeled fragments of a 1.0-kb DNA ladder (data not shown).

Comparative Analysis of the 5’ Ends of Human and Rat α3 Genes

A comparison of the 5’ ends of the human and rat α3 genes (Fig. 2) shows blocks of highly conserved sequences. The first of these occurs in the 5’ untranslated region of exon 1. In this region, the two genes exhibit 82% nucleotide similarity. The next stretch of highly conserved sequences occurs just upstream of the transcription initiation sites. This region consists of a total of 130 nucleotides exhibiting 86% similarity. It contains a number of conserved potential transcription factor and hormone receptor binding sites (indicated on Fig. 2). Upstream of the TATA and CCAAT box-containing region there are a number of sequences that are conserved in both genes. Three large regions of homology containing conserved potential binding sites (underlined on Fig. 2) start at -563, -1090, and -1411 in rat and -594, -1118, and -1462 in human, respectively. The first region consists of 84 nucleotides exhibiting 70% similarity, the second contains 146 nucleotides exhibiting 73% similarity, and the third is a total length of 68 nucleotides exhibiting 72% similarity. Two shorter regions exhibiting similarity but not containing any known binding elements occur at -967 in rat and -985 in human and at -1256 in rat and -1301 in human. The first of these is a 64-bp region exhibiting 73% similarity and the second is a 48-bp region exhibiting 83% similarity. Since these two regions are highly conserved and contain no known regulatory sequence elements, it will be of interest to determine their functional significance.
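The percent similarity figures quoted for each conserved block can be illustrated with a small calculation over two aligned rows. This is a minimal sketch, not the procedure used with NucAln: the function name `percent_identity`, the choice to count gap columns in the denominator, and the toy sequences are all assumptions made for illustration, and the text does not state how gapped columns were treated.

```python
def percent_identity(row_a: str, row_b: str) -> float:
    """Percent of alignment columns that hold the same base in both rows.

    Assumption: gap columns ('-') count toward the block length but never
    count as matches. Other conventions exist and give different values.
    """
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    if not row_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for a, b in zip(row_a.lower(), row_b.lower())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(row_a)


# Toy aligned block (hypothetical, not taken from Fig. 2).
human_row = "ggtgtcgg-tag"
rat_row = "ggtgccggatag"
print(round(percent_identity(human_row, rat_row), 1))  # prints 83.3
```

With this convention, a block of 12 columns containing one mismatch and one gap scores 10 matches out of 12.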
DISCUSSION
Our objective is to characterize the 5’ end of the Na,K-ATPase α3 isoform gene and use this information to study the regulated expression of this gene. To this end we have isolated and sequenced this region of the human and rat α3 isoform genes, identified the start of transcription of both genes, and performed a comparative analysis of the two genes to identify potential conserved transcription factor binding sites and hormone response elements. Primer extension and S1 analyses indicate that the human α3 isoform gene has three transcription initiation sites at -153(G), -154(A), and -156(G), whereas the rat α3 isoform gene has a cluster of initiation sites with the most predominant ones at -147(T) and -153(A) relative to the translation initiation site. Thus, the two genes initiate transcription at approximately the same sites. Examination of the 5’ end of human and rat α3 isoform genes for potential conserved trans-acting factor binding elements has revealed the presence of a number of these sequences. There are two potential Sp-1 binding sites located at -50 and -100 in the human gene and at -48 and -98 in the rat gene. Two AP-1 sites are found in both the human and rat α3 isoform genes. In the rat gene they are located at -1108 and -1149 and in the human gene they are located at -1136 and -1176 relative to the transcription start site. A conserved AP-2 site (located at -198 in the rat gene and -179 in the human gene) and a conserved AP-4 site (located at -935 in the rat gene and -932 in the human gene) were identified as well. Since the α3 isoform message levels are regulated by hormones such as 3,5,3’-triiodo-L-thyronine (T3) and dexamethasone (Orlowski and Lingrel, 1990), we have examined the 5’ end of the α3 isoform genes for potential glucocorticoid and thyroid response elements.
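A search of this kind, for a short degenerate consensus on either strand, can be sketched as follows. The example uses the thyroid response element consensus 5’-GGG(A,T)C(C,G)-3’ quoted in this Discussion. The function name `find_sites`, the 0-based coordinates counted from the 5’ end of the input string, and the test sequence are illustrative assumptions; this is not the DNANALYZE procedure and the coordinates are not the numbering used in Fig. 2.

```python
import re

# Consensus 5'-GGG(A,T)C(C,G)-3' written as a regular expression.
TRE_FORWARD = "GGG[AT]C[CG]"
# Its reverse complement, 5'-(C,G)G(A,T)CCC-3', which marks a site on the
# opposite strand when read along the given strand.
TRE_REVERSE = "[CG]G[AT]CCC"


def find_sites(seq, forward=TRE_FORWARD, reverse=TRE_REVERSE):
    """Return (start, strand, matched_text) for every exact consensus match.

    start is a 0-based offset into seq (an assumption for this sketch).
    A zero-width lookahead is used so overlapping matches are all reported.
    """
    seq = seq.upper()
    hits = []
    for strand, pattern in (("+", forward), ("-", reverse)):
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((match.start(), strand, match.group(1)))
    return sorted(hits)


# Invented test sequence with one site on each strand.
example = "ttGGGACCaaGGTCCCtt"
print(find_sites(example))
# prints [(2, '+', 'GGGACC'), (10, '-', 'GGTCCC')]
```

Only exact matches are reported here. Elements described as conserved "to a high degree" would need a mismatch-tolerant comparison instead.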
There are a number of sequences that exhibit similarity to the hexanucleotide core, TGTTCT, or its reverse complement, of the 15-bp dyad symmetry sequence constituting the glucocorticoid response element [GRE, reviewed in Yamamoto (1985)]. Three of these potential GREs are conserved in both human and rat genes (indicated in Fig. 2). CACCC boxes, present in upstream regions of genes, bind a protein factor that interacts with the glucocorticoid receptor (Dierks et al., 1983; Schüle et al., 1988). There are two conserved CACCC sequences in the human and rat α3 isoform genes (Fig. 2), one of which (present at -199 in rat and -180 in human) is located approximately 70 bp from a potential conserved GRE and could interact with it. Examination of the 5’ ends of both human and rat α3 genes with the consensus binding sequence for the thyroid hormone receptor (thyroid response element, TRE), 5’-GGG(A,T)C(C,G)-3’, and its reverse complement (Norman et al., 1989) revealed that there are a total of seven elements in the human gene and nine in the rat gene that exhibit 100% sequence identity to this element. One of these, at position +15, is conserved in both genes. In addition, the potential TREs at -399 and -398 in human are conserved to a high degree in the rat gene. Conservation of binding sites for trans-acting factors and hormone response elements indicates that they may be important in regulation of gene expression. However, to ascertain this, functional assays coupled with mutagenesis studies need to be performed.

A tract consisting of alternating purine-pyrimidines (-694 to -761), which could potentially form Z-DNA, exists in the 5’-flanking region of the rat α3 isoform gene. This tract is not conserved in the human gene. A homopurine · homopyrimidine region, containing an imperfect mirror repeat sequence, is located in the 5’-flanking region of only the human α3 isoform gene (-2410 to -2441). Such regions are capable of adopting H-DNA configurations, which exist as triple helical structures with a single-stranded region that exhibits S1 nuclease hypersensitivity (Wells et al., 1988). The biological role for Z- and H-DNA configurations is not clear but they are believed to be involved in gene regulation (Rich et al., 1984; Wells et al., 1988).

Comparison of the nucleotide sequences in the 5’-flanking regions of the human α1 (Shull et al., 1990), α2 (Shull et al., 1989), and α3 isoform genes revealed that, in contrast to the α1 isoform gene and similar to the α2 isoform gene, the number of CpG dinucleotides in the 5’-flanking region of the human α3 isoform gene is small.
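Counting CpG dinucleotides in a flanking sequence, as in the comparison above, can be sketched as below. The text reports only that the number of CpG dinucleotides is small. The observed/expected ratio computed here is a widely used normalization added for illustration and is an assumption, not a quantity reported in this article; the function name `cpg_stats` and the test sequence are likewise hypothetical.

```python
def cpg_stats(seq: str) -> dict:
    """Count CpG dinucleotides and normalize by base composition.

    obs_exp = (number of CpG * length) / (number of C * number of G).
    This ratio is an illustrative addition, not a value from the article.
    """
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    expected = (c_count * g_count / n) if n else 0.0
    return {
        "length": n,
        "cpg": cpg_count,
        "gc_fraction": (c_count + g_count) / n if n else 0.0,
        "obs_exp": cpg_count / expected if expected else 0.0,
    }


# Invented 12-base sequence: 3 C, 4 G, 3 CpG.
stats = cpg_stats("ACGCGTTACGGA")
print(stats["cpg"], stats["obs_exp"])  # prints 3 3.0
```

A raw count depends on the length of the region compared, so any comparison between genes should be made over flanking regions of similar size or on a normalized value.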
As CpG dinucleotides occur in greater frequencies in the 5’ ends of housekeeping genes as compared to genes expressed in a tissue-specific manner (reviewed in Bird, 1986), the number of CpG dinucleotides in the 5’ ends of each of the human α-isoform genes correlates well with their pattern of expression in different tissues.

ACKNOWLEDGMENTS

The authors thank Dr. M. M. Shull, Dr. G. E. Shull, Dr. J. Orlowski, and Dr. E. M. Price for helpful discussions and critical review of the manuscript, Dr. J. Lloyd for advice on primer extension analysis, G. Wernke for help with computer analysis, and Jean Russell and Jennifer Schroeder for help in preparing the manuscript.
REFERENCES

1. BIRD, A. P. (1986). CpG-rich islands and the function of DNA methylation. Nature (London) 321: 209-213.
2. BREATHNACH, R., AND CHAMBON, P. (1981). Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50: 349-383.
3. CHOMCZYNSKI, P., AND SACCHI, N. (1987). Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162: 156-159.
4. DE WEER, P. (1985). Cellular sodium-potassium transport. In “The Kidney: Physiology and Pathophysiology” (D. W. Seldin and G. Giebisch, Eds.), pp. 31-48, Raven Press, New York.
5. DIERKS, P., VAN OOYEN, A., COCHRAN, [...] REISER, J., AND WEISSMANN, C. (1983). [...]stream from the cap site are required for [...]rate transcription of the rabbit β-globin [...] cells. Cell 32: 695-706.
[...] GICK, [...] F., AND EDELMAN, I. S. (1988). Hormonal regulation of Na,K-ATPase. In “The Na+,K+ Pump. Part B: Cellular Aspects” (J. C. Skou, J. G. Nørby, A. B. Maunsbach, and M. Esmann, Eds.), Prog. Clin. Biol. Res. [...]
[...] JORGENSEN, P. L. (1986). Structure, function and regulation of Na,K-ATPase in the kidney. Kidney Int. 29: 10-20.
11. LINGREL, J. B., ORLOWSKI, J., SHULL, M. M., AND PRICE, E. M. (1990). Molecular genetics of Na,K-ATPase. Prog. Nucl. Acids Res. Mol. Biol. 38: 37-89.
12. LYTTON, J. (1985). Insulin affects the sodium affinity [...] rat adipocyte (Na+,K+)-ATPase. J. Biol. Chem. 260: [...]10080.
13. LYTTON, J., LIN, J. C., AND GUIDOTTI, G. (1985). Identification of two molecular forms of (Na+,K+)-ATPase in rat adipocytes. J. Biol. Chem. 260: 1177-1184.
16. NORMAN, [...]
18. OVCHINNIKOV, [...] Na,K-ATPase genes: Structure of the gene of the catalytic subunit (αIII-form) and its relationship with structural features of the protein. FEBS Lett. 233: 87-94.
[...] SANGER, F., NICKLEN, S., AND COULSON, A. R. (1977). [...] sequencing with chain-terminating inhibitors. Proc. [...] Acad. Sci. USA 74: 5463-5467.
[...] SCHWARTZ, A., LINDENMAYER, [...] (1975). The sodium-potassium [...] Pharmacological, physiological [...] Pharm. Rev. 27: 3-134.
[...] SHULL, M. M., AND LINGREL, J. B. (1987). [...] encode the human Na+,K+-ATPase catalytic [...] Natl. Acad. Sci. USA 84: 4039-4043.
26. SHULL, M. M., PUGH, D. G., AND LINGREL, J. B. (1990). The human Na,K-ATPase α1 gene: Characterization of the 5’-flanking region and identification of a restriction fragment length polymorphism. Genomics 6: 451-460.
27. SVERDLOV, E. D., MONASTYRSKAYA, G. S., BROUDE, N. E., USHKARYOV, Y. A., ALLIKMETS, R. L., MELKOV, A. M., SMIRNOV, Y. V., MALYSHEV, I. V., DULOBOVA, I. E., PETRUKHIN, K. E., GINSHIN, A. V., KIJATKIN, N. I., KOSTINA, M. B., SVERDLOV, V. E., MODYANOV, N. N., AND OVCHINNIKOV, Y. A. (1987). The family of human Na+,K+-ATPase genes: No less than five genes and/or pseudogenes related to the α-subunit. FEBS Lett. 217: 275-278.
28. SWEADNER, K. J. (1979). Two molecular forms of (Na+ + K+) stimulated ATPase in brain. J. Biol. Chem. 254: 6060-6067.
29. SWEADNER, K. J. (1989). Isozymes of the Na+,K+-ATPase. Biochim. Biophys. Acta 988: 185-220.
30. WELLS, R. D., COLLIER, D. A., HANVEY, J. C., SHIMIZU, M., AND WOHLRAB, F. (1988). The chemistry and biology of unusual DNA structures adopted by oligopurine · oligopyrimidine sequences. FASEB J. 2: 2939-2949.
31. WERNKE, [...]
Y. A., MONASTRYSKAYA, G. S., BROUDE, N. E., USHKARYOV, Y. A., MELKOV, A. M., SMIRNOV, Y. V., MALYSHEV, I. V., ALLIKMETS, R. L., KOSTINA, M. B., DULUBOVA, I. E., KNATKIN, N. I., GRISHIN, A. V., MODYANOV, N. N., AND SVERDLOV, E. D. (1988). Family of human
genes Proc.
M. M., PUGH, D. G., AND LINGREL, J. B. (1989). Characterization of the human Na+, K+-ATPase a2 gene and identification of intragenic restriction fragment length polymorphisms. J. Biol. Chem. 264: 17532-17543.
of the
F., LAVIN, T. N., BAXTER, J. D., AND WEST, B. L. (1989). The rat growth hormone gene contains multiple thyroid response elements. J. Biol. Chem. 264: 12063-12073. ORLOWSKI, J., AND LINGREL, J. B. (1990). Thyroid and glucocorticoid hormone regulate the expression of multiple Na+,K+-ATPase genes in cultured neonatal rat cardiac myocytes. J. Biol. Chem. 265: 3462-3470.
Multiple subunit.
25. SHULL,
14. MAXAM, 15.
SH~~LE, R., MULLER, M., OSTUKA-MURAKAMI, H., AND RENKAWITZ, R. (1988). Cooperativity of the glucocorticoid receptor and the CACCC-box binding factor. Nature (London)
24. SHULL,
10075-
A. M., AND GILBERT, W. (1977). A new method for sequencing DNA. Proc. Natl. Acad. Sci. USA 74: 560-564. MCKNIGHT, S., AND TJIAN, R. (1986). Transcriptional selectivity of viral genes in mammalian cells. Cell 46: 795-805.
G. E., AND ALLEN, J. C. adenosine triphosphatase: and biochemical aspects.
J. B. (1986). Molecular cloning of three distinct forms of the Na+, K+-ATPase a-subunit from rat brain. Biochemistry 25: 8125-8132.
8. KADONAGA,
J. T., JONES, K. A., AND TJIAN, R. (1986). Promoter-specific activation of RNA polymerase II transcription by SPI. Trends Biochem. Sci. 11: 20-23. KANO, I., NAGAI, F., SATOH, K., USHIYAMA, K., NAKAO, T., AND KANO, K. (1989). Structure of the 01~ subunit of horse Na,K-ATPase gene. FEBS Lett. 250: 91-98. LANE, L. K., SHULL, M. M., WHITMER, K. R., AND LINGREL, J. B. (1990). Characterization of two genes for the human Na,K-ATPase p subunit. Genomics 5: 445-453.
DNA Natl.
332:87-90. 23. SHULL, G. E., GREEB, J., AND LINGREL,
266B: 277-295. 7. JORGENSEN,
A., AND ANDREW, H.-J. (1984). The of left handed Z-DNA. Annu. Reu. Bio-
20. SANGER,
6. GICK, G. G., ISMAIL-BEIGI,
9.
RICH, A., NORDHEIM, chemistry and biology them. 53: 791846.
21. SCHWARTZ,
M. D., DOBKIN, D., Three regions upefficient and accugene in mouse 3T6
647
(~3 GENE
33.
G. R., AND THOMPSON, R. L. (1989). DNANALYZE: A comprehensive nucleotide and amino acid sequence similarity comparisons. Biophys. J. 55: 390a. WILBUR, W. J., AND LIPMAN, D. J. (1983). Rapid similarity searches of nucleic acid and protein data banks. Proc. Natl. Acad. Sci. USA 80: 726-730. YAMAMOTO, K. R. (1985). Steroid receptor regulated transcription of specific genes and gene networks. Annu. Reu. Genet. 19:209-252.