Gene 212 (1998) 269–278
The gene and processed pseudogenes of the rat mitochondrial single-strand DNA-binding protein: structure and promoter strength analyses
Seema Gupta 1, Glenn C. Van Tuyle *
Department of Biochemistry and Molecular Biophysics, Virginia Commonwealth University, Richmond, VA 23298, USA
Received 14 January 1998; accepted 3 March 1998; Received by J.A. Engler
Abstract
The gene for the rat mitochondrial single-stranded DNA-binding protein (mtSSB) was amplified by PCR and isolated as several overlapping genomic clones. The clones encompassed the 5′ untranslated sequence and all intron/exon junctions. The gene contained seven exons and six introns. The first exon contained only 5′ untranslated sequence. The 16-amino acid mitochondrial targeting presequence, encoded by the second and third exons, was precisely bisected by intron 2. All intron donor and acceptor sites were consistent with the GT/AG consensus. The transcription start site was determined by primer-extension analysis to be 69 bp upstream of the translation start codon. The upstream sequence lacked TATA and CCAAT boxes at the expected locations, but did contain several other potential regulatory elements, including a GC box (Sp1-binding site) and three NRF-2 sites, one of which was located immediately beside the transcription start site. An imperfect NRF-1 site (10 of 12 positions matching the consensus) was located within the first exon. The 5′ flanking sequence (−546 to +30) was shown to have strong promoter activity in transient transfection assays in primary rat hepatocytes and HepG2 cells. In addition, evidence for the existence of several mtSSB processed pseudogenes was obtained. These pseudogenes lacked introns and contained substitution and deletion mutations compared to the cDNA sequence. The 5′ upstream region of one of the pseudogenes was analyzed and found to have negligible promoter activity. © 1998 Elsevier Science B.V. All rights reserved.
Keywords: PCR; Transient transfections; DNA sequence; Nuclear respiratory factors; Regulatory elements
* Corresponding author. Tel: +1 804 828 0772; Fax: +1 804 828 1473; e-mail: [email protected]
1 Present address: Division of Gastroenterology, Department of Medicine, Virginia Commonwealth University, Richmond, VA 23298, USA.
Abbreviations: bp, base pair(s); cDNA, complementary DNA; D-loop, displacement loop; D. melanogaster, Drosophila melanogaster; dNTP, deoxyribonucleoside triphosphates; dsDNA, double-strand DNA; kb, kilobase(s); mtDNA, mitochondrial DNA; mtSSB, mitochondrial single-stranded DNA-binding protein; NRF-1, nuclear respiratory factor-1; NRF-2, nuclear respiratory factor-2; PCR, polymerase chain reaction; RLU, relative light unit(s); S. cerevisiae, Saccharomyces cerevisiae; SEM, standard error of the mean; U, unit(s); X. laevis, Xenopus laevis.
0378-1119/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII: S0378-1119(98)00148-6

1. Introduction
Mitochondrial biogenesis is subject to both nuclear and mitochondrial genetic control (Clayton, 1991a; Scarpulla, 1997). Mammalian mitochondrial DNA (mtDNA) contains 37 genes that encode 13 protein subunits of the electron transport and oxidative phosphorylation complexes, as well as 22 tRNAs and two rRNAs that are sufficient to support mitochondrial protein synthesis. For the production of all other mitochondrial proteins, the mitochondria rely on nuclear genes. These additional proteins are imported from the cytosol and serve in energy metabolism as well as in the processes of mtDNA replication, transcription and translation.

A well-characterized, nuclear-encoded component of the mtDNA replication apparatus is the mitochondrial single-strand DNA-binding protein (mtSSB), which has been purified from rat (Pavco and Van Tuyle, 1985; Hoke et al., 1990), X. laevis (Mignotte et al., 1985; Ghrir et al., 1991), human (Genuario and Wong, 1993), D. melanogaster (Thommes et al., 1995) and S. cerevisiae (Van Dyck et al., 1992). This protein binds to the single-stranded regions of displacement loops (D-loops), expanding D-loops and gapped circles (Van Tuyle and Pavco, 1985) and protects against branch-migrational
loss of nascent strands (Pavco and Van Tuyle, 1985). Binding of the mtSSB to primed single-strand templates dramatically enhances the rate of the DNA polymerase synthetic reaction (Hoke et al., 1990; Genuario and Wong, 1993; Thommes et al., 1995; Mikhailov and Bogenhagen, 1996) and inhibits the 3′–5′ exonuclease activity (Mikhailov and Bogenhagen, 1996). The mtSSB also inhibits hybridization of complementary DNA strands (Hoke et al., 1990), and it has been suggested that this activity may prevent slipped mispairing during mtDNA replication, thus protecting against rearrangements via recombination or slipped replication during the formation of expanded D-loop structures (Tiranti et al., 1993).

The cDNA clones of the mtSSB from rat and human (Tiranti et al., 1993), X. laevis (Tiranti et al., 1991), and mouse (Li and Williams, 1997) have been isolated. The deduced amino acid sequences showed the rat, human and mouse proteins to be more than 90% identical (Tiranti et al., 1993; Li and Williams, 1997). The X. laevis mtSSB was found to be about 70% identical to the mammalian forms, and many of the differences were conservative changes (Tiranti et al., 1993). The structure of recombinant human mtSSB has been analyzed by X-ray crystallography (Yang et al., 1997). The protein crystallizes with a dimer in the asymmetric unit and exhibits electropositive patches presumed to serve as DNA-binding regions. Binding is likely to be aided by aromatic residues located in several of the positive patches.

In X. laevis, there are two related forms of the mtSSB (Ghrir et al., 1991). Recently, the genes for these two forms were cloned and sequenced (Champagne et al., 1997), and both were found to consist of seven exons and six introns with conserved junction positions. Several regulatory elements (NRF-2, Sp1, and a CCAAT box) were identified in the 5′-flanking region of each gene.

In this paper, we report the cloning and sequencing of the gene for the rat mtSSB. The 5′-flanking sequence of the gene was evaluated for promoter activity using transient transfections in homologous cells. In addition, evidence for the existence of processed pseudogenes of rat mtSSB is provided.
2. Materials and methods

2.1. Molecular cloning of the rat mtSSB gene
Total rat liver genomic DNA was prepared from freshly excised tissue using a genomic DNA isolation kit (Gentra Systems Inc.). Three sets of oligonucleotides (GIBCO BRL) (Table 1) corresponding to known positions in the published rat mtSSB cDNA sequence (Tiranti et al., 1993) were used as primers in long extension PCR reactions on 1–2 μg of rat genomic DNA using the Expand Long Template PCR System (Boehringer-Mannheim). PCR with primer set P3/P6 was carried out in 50 mM Tris–HCl, pH 9.2, 16 mM (NH4)2SO4, 1.75 mM MgCl2, 350 μM each dNTP, 300 nM each primer and 0.75 μl of the supplied thermostable Taq/Pwo DNA polymerase mix. The thermal cycling conditions were: one cycle of 2 min at 94°C; 30 cycles of 10 s at 94°C, 30 s at 65°C and 10 min at 68°C; followed by a final elongation step of 7 min at 68°C. Amplification with primer sets P1/P4 and P2/P5 was carried out using 50 mM Tris–HCl, pH 9.2, 16 mM (NH4)2SO4, 2.25 mM MgCl2, 500 μM each dNTP, 300 nM each primer and 0.75 μl of the polymerase mix. The cycling conditions were the same as above except that the 30 cycles of amplification had 12-min elongation steps. The PCR products were either TA-cloned directly into the pCR 2.1 vector (Invitrogen) or, for the product of interest, first gel-purified on agarose and then TA-cloned. The plasmid inserts were sequenced with M13 forward and reverse universal primers using the Taq DyeDeoxy Terminator Cycle Sequencing System (Perkin Elmer) and analyzed by automated DNA sequencing (Virginia Commonwealth University Nucleic Acid Analysis Facility).

2.2. Isolation of the rat mtSSB promoter
The rat mtSSB promoter was isolated by PCR with the PromoterFinder DNA Walking Kit (Clontech). The two gene-specific primers employed for this procedure, GSP1 and GSP2 (Table 1), were designed from the previously determined rat mtSSB gene sequence. Primary PCR reactions were performed with adaptor primer AP1 of the PromoterFinder kit and GSP1 according to the manufacturer's instructions. A secondary PCR reaction was carried out with 1 μl of a 50-fold dilution of the primary PCR using nested adaptor primer AP2 and nested gene-specific primer GSP2. The PCR products from libraries 3 and 5 were TA-cloned into the pCR 2.1 vector and sequenced by the ABI automated DNA sequencing system (Perkin Elmer).

Table 1
Primers used to PCR-amplify portions of the rat mtSSB gene

Primer name   Primer sequence
P1            5′-GCGCGTCAGGAAAAGCCTAAAGATTAGG (7–34) a
P2            5′-GACAGGAGTCTGAAGTAGCCAGCAGTTTG (98–126)
P3            5′-CAGGACCCTGTCATGAGACAGGTGGAG (175–201)
P4            5′-ATCGTTCAAGAACCAAACTGCTGGCTAC (139–112)
P5            5′-CTCCACCTGTCTCATGACAGGGTCCTG (201–175)
P6            5′-CATCCGTTCAATGGCTTTTCTCTTGCCTG (509–481)
P7            5′-CGAAGACCTGTGTTACAGGTTATTT (61–84)
GSP1          5′-CATGATGGATAGGCCTAAGATACTTGTGC
GSP2          5′-CAGACTATGAAAACACGGTTGAGCGTTCAC
GSP3          5′-cgaagatctCTGACGCGCAAGCCCAGAC b

a Numbers in parentheses correspond to the nucleotide positions in the published rat mtSSB cDNA sequence (Tiranti et al., 1993).
b Lower-case letters designate an added sequence containing a BglII restriction site (agatct).

2.3. Construction of the reporter plasmids for transfection studies
A 576-bp fragment containing the immediate 5′-flanking region of the rat mtSSB gene and the first exon was generated by PCR using a genomic clone as template. PCR was performed with the T7 vector promoter primer and the gene-specific primer GSP3 (Table 1), which corresponds to the 3′ end of exon 1 and bears a BglII restriction site to facilitate cloning. The resulting PCR product was digested with BglII and MluI (restriction site present in the amplified
fragment) (New England Biolabs) and ligated into similarly cleaved pGL3-Basic reporter vector (Promega) to yield the pGL3–576 construct. Proper directional insertion of the fragment was confirmed by sequencing through the insertion site. The pGL3–576 construct thus contained the rat mtSSB gene promoter fused upstream of the firefly luciferase reporter gene.

Plasmid constructs containing the upstream sequence of a pseudogene were prepared as follows. The libraries of the rat liver PromoterFinder DNA Walking Kit (Clontech) were amplified by nested PCR using the primers 5′-TGACGAAATACCTGTAACACAGGTCTTCG (nucleotides 89–61 of the cDNA) and 5′-CCTAATCTTTAGGCTTTTCCTGACGCGC (nucleotides 34–7 of the cDNA), paired with the kit primers AP1 and AP2, respectively. A 1.5-kb fragment obtained from library 1 was TA-cloned into the pCR2.1 vector (Invitrogen) and was shown to lack intron 1 by DyeDeoxy Terminator Cycle sequencing analysis (Perkin Elmer). The fragment was released with SacI and XhoI, and ligated into similarly cleaved pGL3-Basic reporter vector (Promega). This construct was then cleaved with KpnI and SpeI and digested for varying times with exonuclease III (New England Biolabs) (Henikoff, 1986). The resulting overhangs were digested with S1 nuclease, and the ends were made blunt with Klenow enzyme. The products were ligated with T4 DNA ligase and transformed into TOP10F′ cells (Invitrogen). Plasmids were isolated from individual colonies by mini-preparation and sequenced by Taq DyeDeoxy Terminator Cycle sequencing (Perkin Elmer). Two of these constructs were used in promoter-strength assays. These constructs contained nucleotide pairs 1–34 of cDNA sequence and either 396 nucleotide pairs (pGL3–Y396) or 902 nucleotide pairs (pGL3–Y902) of 5′-flanking sequence of the intron-lacking pseudogene.

2.4. Cell culture and transfections
Primary rat hepatocytes used were kindly provided by Dr P.B. Hylemon (Department of Microbiology and
Immunology, Virginia Commonwealth University). Briefly, hepatocytes were isolated from male Sprague Dawley rats by in-situ collagenase perfusion according to previously described methods (Bissell and Guzelian, 1980). Hepatocytes (8.5 × 10^5) were plated in 1.5 ml of William's E medium containing 10% fetal calf serum (GIBCO BRL), L-thyroxine (1 μM), dexamethasone (50 nM) and penicillin (100 U/ml) in 35-mm Primaria culture dishes (Falcon). After a 6-h attachment period, hepatocytes were transfected with 2 μg of test plasmid and 1.5 μg of the β-galactosidase expression plasmid pCMV β-gal (Stratagene) using the calcium-phosphate precipitation method as provided in the MBS Mammalian Transfection Kit (Stratagene). Following addition of the DNA suspension, cells were incubated for 3 h at 35°C in 3% CO2. Cells were then washed three times with phosphate-buffered saline, refed with fresh serum-free William's E medium containing penicillin and incubated for 40 h at 37°C in 5% CO2.

For transfections in HepG2 cells, the cells were plated in 35-mm culture dishes in Eagle's minimal essential medium containing L-glutamine supplemented with 100 U/ml penicillin, 100 μg/ml streptomycin, 10% fetal bovine serum and 1× non-essential amino acids solution. The cells were grown for 48 h such that they were 60–70% confluent at the time of transfection. After incubation with the DNA template, the cells were washed with phosphate-buffered saline, refed with serum-free medium and incubated for 40 h at 37°C in 5% CO2.

2.5. Reporter enzyme assays
After the 40-h incubation period, cells were washed with phosphate-buffered saline and harvested in 200–300 μl of reporter lysis buffer according to the manufacturer's protocol (Promega). Luciferase activity was assayed in 5–20 μl samples of cell extract for 20 s by the addition of 100 μl luciferase substrate using an automated luminometer (Berthold LB 9501) (Wood,
1991). Luciferase activity was calculated as relative light units (RLU) per microgram of protein. β-galactosidase activity was measured in 50–100 μl of cell extract using o-nitrophenyl β-D-galactopyranoside as substrate (Nielson et al., 1983). The product was then detected by measuring absorbance at 420 nm using a spectrophotometer. β-galactosidase activity was expressed as units (nanomoles of o-nitrophenol formed per minute) per microgram of protein. Protein concentrations were determined using the BCA reagent (Pierce). To normalize for the transfection efficiency of each individual transfection, the luciferase activity expressed as RLU per microgram of protein was divided by the β-galactosidase activity expressed as units per microgram of protein.

2.6. Primer extension analysis to determine the transcription initiation site
Poly(A)+ RNA was isolated directly from 5 × 10^6 primary rat hepatocytes with the micro Poly(A) Pure Kit (Ambion) as described by the manufacturer. The P4 oligonucleotide was end-labeled with [γ-32P]ATP (NEN Research Products) using T4 polynucleotide kinase (New England Biolabs). One picomole of the labeled primer was annealed to 2.4 μg of poly(A)+ RNA in a 14-μl reaction volume for 10 min at 70°C and then kept on ice for 1 min. Primer extension was performed with 2 units of Superscript II reverse transcriptase (GIBCO BRL) at 45°C for 50 min in a total volume of 20 μl containing 1× synthesis buffer (20 mM Tris–HCl, pH 8.4, 50 mM KCl, 2.5 mM MgCl2, and 0.1 mg/ml bovine serum albumin), 0.5 mM each dNTP, and 10 mM dithiothreitol. The reaction was terminated at 70°C for 10 min. Ten microliters of the primer extension reaction were diluted to 100 μl with water and precipitated by the addition of 2 μg of glycogen, 100 μl of 4 M ammonium acetate, and 400 μl of 100% ethanol. The reaction products were then resuspended in 3 μl of gel-loading dye to be run on a sequencing gel.
A sequencing ladder was generated with a dsDNA cycle sequencing kit (GIBCO BRL), using the pCR1000 vector containing the full-length rat mtSSB cDNA (pP16cDNA) and the same primer as in the extension reaction, so that the distance of the start site from the primer site could be deduced by direct comparison. These sequencing reactions and the primer-extension reaction were run in parallel lanes of the same gel. The gel was dried and exposed to film at −70°C without intensifying screens for 2 days.
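The normalization described in Section 2.5 amounts to a ratio of two protein-corrected activities, and the fold activity reported relative to vector control (Table 2) is a further ratio. The following is a minimal sketch under stated assumptions: the function and parameter names are illustrative and not taken from the methods, and separate protein amounts are passed for the two assays because they were run on different volumes of extract.

```python
def per_microgram(activity: float, protein_ug: float) -> float:
    """Express an enzyme activity per microgram of protein in the assayed sample."""
    if protein_ug <= 0:
        raise ValueError("protein amount must be positive")
    return activity / protein_ug


def normalized_luciferase(rlu: float, luc_protein_ug: float,
                          bgal_units: float, bgal_protein_ug: float) -> float:
    """Luciferase (RLU per ug protein) divided by beta-galactosidase (units per ug protein)."""
    return per_microgram(rlu, luc_protein_ug) / per_microgram(bgal_units, bgal_protein_ug)


def fold_activity(construct_ratio: float, vector_ratio: float) -> float:
    """Normalized activity of a test construct relative to the vector control."""
    return construct_ratio / vector_ratio
```

With placeholder readings (not measured values), `normalized_luciferase(50000, 10, 4.0, 8)` divides 5000 RLU/μg by 0.5 U/μg, and `fold_activity` then compares that ratio with the ratio obtained in the same way for the empty vector.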
3. Results

3.1. Molecular cloning of the rat mtSSB gene by PCR
The rat PromoterFinder libraries were analyzed for the rat mtSSB gene using cDNA-specific primers P1 and P7 (Table 1). The conditions were the same as described in 'Materials and methods' for the cloning of the rat mtSSB promoter. When the PCR products were analyzed by agarose gel electrophoresis, an intense band of about 1.1 kb was observed from the nested PCR reaction obtained with library 5 (data not shown). Sequence analysis of the cloned 1.1-kb product (p16Y1) showed a high sequence homology with the rat mtSSB cDNA, including the absence of introns (Fig. 1). However, several differences including substitutions, deletions, and insertions were observed relative to the published cDNA sequence (Tiranti et al., 1993). A single adenine nucleotide insertion in the p16Y1 sequence led to a shift in the reading frame, resulting in an in-frame termination codon 26 bases downstream of the point of insertion. A short poly(A) tail was also present at the 3′-end of the cDNA region. These sequence characteristics are typical of processed pseudogenes (Vanin, 1985), and we tentatively concluded that the 1.1-kb fragment was the product of such a dysfunctional gene.

Therefore, the PCR-based strategy was modified such that the putative intron-containing gene for rat mtSSB could be directly isolated from genomic DNA in the presence of at least one processed pseudogene. This method depended on the design of primers that were expected to span at least one putative intron in the gene of interest, resulting in the amplification of a larger product from the intron-containing gene in addition to the smaller PCR product amplified from the intron-lacking pseudogene(s). To this end, oligonucleotide P3, derived from the middle of the cDNA, and oligonucleotide P6 from the 3′ end of the cDNA were used as PCR primers with a rat genomic DNA preparation as template. Agarose gel electrophoresis of the PCR products revealed two bands of about 350 bp and 5 kb in length (Fig. 2A).
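The size expected for an intron-lacking product follows directly from the primer coordinates in Table 1. The sketch below is a back-of-the-envelope check rather than part of the published method; the dictionary and function names are illustrative, and the 5000-bp figure is only the approximate gel estimate quoted above.

```python
# 5'-end and 3'-end cDNA coordinates of each primer, as listed in Table 1.
# Reverse primers are written high-to-low (for example P6 = 509-481).
PRIMER_COORDS = {
    "P1": (7, 34), "P2": (98, 126), "P3": (175, 201),
    "P4": (139, 112), "P5": (201, 175), "P6": (509, 481),
}


def intronless_size(forward: str, reverse: str) -> int:
    """Product length (bp) expected from an intron-lacking, cDNA-like template."""
    fwd_5prime = PRIMER_COORDS[forward][0]
    rev_5prime = PRIMER_COORDS[reverse][0]
    return rev_5prime - fwd_5prime + 1


def intron_content(observed_bp: int, forward: str, reverse: str) -> int:
    """Approximate total intron length implied by a larger genomic product."""
    return observed_bp - intronless_size(forward, reverse)


for pair in (("P3", "P6"), ("P2", "P5"), ("P1", "P4")):
    print(pair, intronless_size(*pair))

print(intron_content(5000, "P3", "P6"))
```

For P3/P6 the predicted intron-lacking product is 335 bp, in line with the band of about 350 bp, and a band of roughly 5 kb would then carry on the order of 4.7 kb of intron sequence.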
Fig. 1. Comparison of the pseudogene p16Y1 sequence with the mtSSB cDNA. The pseudogene sequence was PCR-amplified from the PromoterFinder Library 5 (Clontech) and TA-cloned into the pCR2.1 vector (Invitrogen). The sequence was obtained in both directions from vector primers by automated DyeDeoxy Terminator Cycle sequencing (Perkin-Elmer). The numbering corresponds to the published cDNA (Tiranti et al., 1993). Bold letters signify nucleotide differences in p16Y1. The boxed nucleotide identifies a nucleotide insertion, and dots identify nucleotide deletions. The comparison was performed with the PILEUP alignment program (Feng and Doolittle, 1987) in the Genetics Computer Group Program (Madison, WI).

The size of the smaller product was consistent with the size expected for a processed pseudogene product. This was confirmed by cloning and sequencing of the 350-bp band, which was identical to the pseudogene clone p16Y1, except for a single base change (T→C) (data not shown). It was concluded that the 350-bp clone was an allelic variant of p16Y1. Sequencing of the 5-kb PCR fragment from either end revealed the presence of exon–intron junctions at positions 280 and 458 in the cDNA sequence. The presence of additional exon–intron junctions was established by sequencing in both directions using primers designed around positions 281 and 457 of the cDNA. Using this approach, a third intron was found at position 368 of the cDNA.

In order to amplify the remaining 5′-end of the rat mtSSB gene, PCRs were performed on rat genomic DNA using the same strategy as described above. As shown in Fig. 2B, a dominant 1.6-kb band and several smaller bands were amplified using primers P2 and P5. The sequence of the 1.6-kb band revealed a single intron
located between nucleotides 139 and 140 of the cDNA. Primers P1 and P4 produced a dominant band of about 2.8 kb as well as several shorter species (Fig. 2C). Sequencing of the 2.8-kb band revealed an intron that split the rat mtSSB leader peptide between amino acids Gln–Val (cDNA positions 9 and 8).

3.2. Isolation of the 5′-flanking region of the rat mtSSB gene
To isolate the genomic clone for the remaining 5′-end of the cDNA and the 5′-upstream region of the rat mtSSB gene, two reverse primers, GSP1 and GSP2 (Table 1), were derived from intronic sequence, thereby circumventing the problem of amplifying processed pseudogenes. These two primers were used in a PCR with the rat PromoterFinder DNA Walking Kit. Agarose gel electrophoresis of the PCR products revealed a 350-bp fragment amplified from library 3 and a 1.3-kb fragment amplified from library 5 (data not shown). The 1.3-kb fragment clone overlapped the 350-bp clone and extended in the upstream direction. The sequence of the 1.3-kb fragment revealed the presence of a sixth intron interrupting the 5′-untranslated region between cDNA nucleotides 15 and 16. Also, comparison of the 1.3-kb fragment sequence with the published rat mtSSB cDNA sequence revealed two single base differences corresponding to positions 1 and 49 of the cDNA and representing an A to G transition in both cases. Since all of our clones encompassing this region, as well as several different pseudogenes, showed these identical base changes, we believe that they are probably polymorphic differences, perhaps due to strain variability between the rats used as the source of the cDNA (Tiranti et al., 1993) and those used in our experiments. Sequences at the exon–intron junctions were completely consistent with the 'AG/GT' rule (Breathnach and Chambon, 1981). The complete nucleotide sequence of the rat mtSSB gene, including the 5′-flanking region, is shown in Fig. 3.

Fig. 2. Agarose gel analysis of overlapping PCR products containing the 3′ portion (lane A) and the 5′ portions (lanes B and C) of the rat mtSSB gene. The primers used were P3 and P6 (lane A), P2 and P5 (lane B), and P1 and P4 (lane C) (Table 1).

Table 2
Transient transfection assays of promoter strengths using luciferase and β-galactosidase reporter genes

Construct    Cells         Fold activity a (mean ± SEM)
pGL3-576     Hepatocytes   226 ± 20 b
pGL3-576     HepG2         280 ± 32 b
pGL3-Y936    HepG2         0.53 ± 0.03 c
pGL3-Y430    HepG2         2.06 ± 0.15 c

a Fold activity of construct (RLU/β-galactosidase) versus vector control (RLU/β-galactosidase).
b Mean of seven different assays.
c Mean of three different assays.

3.3. Determination of the transcription initiation site
The transcription initiation site of the rat mtSSB gene was determined by primer extension of poly(A)+ RNA
derived from primary rat hepatocytes. A 28-base oligonucleotide corresponding to a region of exon 3 (nucleotides 139–112 of the cDNA) was used for both primer extension and sequencing reactions. We chose not to use a primer for exon 1 because the product would have been too short to evaluate accurately on a sequencing gel. The exon 3 primer extension product is shown in Fig. 4 (lane P) in parallel with lanes containing the sequence of a cDNA clone. The predominant transcription start site was determined to lie 69 nucleotides upstream of the ATG translation start site. Because the transcript extended 15 bases upstream of the published cDNA sequence, the exact nucleotide corresponding to the start site was deduced to be an adenine nucleotide by comparison with our known genomic clone sequence.

3.4. Regulatory elements in the proximal promoter region
A 576-bp fragment known to contain the first exon and a portion of the 5′-flanking region (see Fig. 3) was analyzed for potential regulatory sequences involved in rat mtSSB expression. First, neither a TATA box nor a CCAAT box, both of which are common cis-acting elements for RNA polymerase II genes (Dynan and Tjian, 1985; Santoro et al., 1988), was evident in the expected location upstream of the transcription start site. Second, the 5′-flanking region of the gene and the first exon (positions +30 to −131) were GC-rich (58%). Third, three nuclear respiratory factor-2 (NRF-2) binding motifs (Virbasius et al., 1993a) were identified between nucleotide positions −1 and −117, and a nuclear respiratory factor-1 (NRF-1) recognition site (Evans and Scarpulla, 1990; Virbasius et al., 1993b) was located between nucleotides +15 and +26. It has been reported that the activity of the proximal promoter of the mtTFA gene is highly dependent on NRF-1 and NRF-2 recognition sites (Virbasius and Scarpulla, 1994). It appears that transcription of the rat mtSSB gene may also be facilitated by these two cis-acting elements. In addition, a potential Sp1-binding site was present at position −328 and contained the core recognition sequence GGGCGG (Kadonaga et al., 1986).

Fig. 3. The nucleotide sequence of the rat mtSSB gene. The sequence shown is the coding strand in the 5′ to 3′ direction and includes the 5′ untranslated region in lower-case letters, the seven exons in capital letters, and the 5′ and 3′ ends of the six introns in italicized lower-case letters. The downward arrow is the transcription start site, the star designates the translation start codon, and the dots above the sequence identify two nucleotides that differ from the published cDNA sequence (Tiranti et al., 1993). The numbering is with respect to the transcription start site (+1) and does not include the intron sequence. Several potential regulatory elements (Sp1, NRF-1, NRF-2) are boxed. The two underlined nucleotides in the NRF-1 sequence are deviations from the consensus (Virbasius et al., 1993b).

Fig. 4. Determination of the mtSSB gene transcription start site by primer-extension analysis of hepatocyte poly(A)+ RNA. The methods used are described in the Materials and methods. Lanes marked A, C, G, and T are the sequencing reactions of pP16cDNA with primer P4 (Table 1). Lane P shows the P4 primer-extension product from poly(A)+ RNA. The arrowhead designates the 5′-most base of the published cDNA sequence (Tiranti et al., 1993). The sequence above the arrowhead corresponds to vector sequence used to measure the distance between the primer and the transcription start site.

3.5. Functional analysis of the rat mtSSB gene promoter
To analyze the 5′-flanking region of the rat mtSSB gene for promoter activity, the −546 to +30 region was inserted into the transient expression vector pGL3-Basic, which contains firefly luciferase as the reporter gene. The resulting pGL3–576 construct was transiently transfected into primary rat hepatocytes and HepG2 cells and assayed for luciferase activity. As shown in Table 2, the activity of the pGL3–576 construct averaged 226 ± 20 times basal luciferase activity in hepatocytes and 280 ± 32 times basal activity in HepG2 cells. These data suggest that the 576-bp region contained powerful cis-acting elements required for gene expression.

To provide further evidence that this 576-bp region was indeed the bona fide upstream region of the rat mtSSB gene, a region of a rat mtSSB processed pseudogene, comprising its upstream sequence plus the +1 to +34 sequence of the cDNA, was compared with pGL3–576 for promoter activity in HepG2 cells. As shown in Table 2, 902 bases of the upstream region of the processed pseudogene (pGL3–Y902) exhibited a decrease in relative luciferase activity to about half that of the control. Interestingly, deletion of the upstream sequence from −936 to −431 (pGL3–Y396) resulted in a twofold increase in the relative luciferase activity. However, the promoter activity exhibited by this latter construct was not significant when compared to the promoter activity of pGL3–576 in HepG2 cells. These results suggest that the region upstream of the pseudogene is unrelated to that of the gene and is probably located in this position due to the adventitious insertion of the pseudogene at this site.
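The coordinate bookkeeping used in Sections 3.2 to 3.5 can be cross-checked with a few lines of arithmetic. This is a minimal sketch with hypothetical helper names; it assumes only the figures stated in the text (a start site 15 nt upstream of cDNA nucleotide 1, and a numbering scheme with no position 0).

```python
TSS_OFFSET = 15  # transcription start site lies 15 nt upstream of cDNA nucleotide 1


def cdna_to_tss(cdna_pos: int) -> int:
    """Convert a published-cDNA coordinate to numbering relative to the start site (+1)."""
    return cdna_pos + TSS_OFFSET


def span_length(start: int, end: int) -> int:
    """Length of start..end in a numbering scheme that has no position 0."""
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # skip the non-existent position 0
    return length


print(cdna_to_tss(15))        # last nucleotide of exon 1 (intron 1 follows cDNA nt 15)
print(span_length(-546, 30))  # promoter fragment carried by pGL3-576
print(span_length(15, 26))    # NRF-1 site, +15 to +26
print(902 + 34, 396 + 34)     # pseudogene flanking sequence plus cDNA nt 1-34
```

On these assumptions exon 1 ends at +30, which is the 3′ boundary of the 576-bp fragment, and the NRF-1 site spans 12 positions. The totals 936 and 430 appear to correspond to the construct labels used in Table 2 for the inserts described in the text as pGL3–Y902 and pGL3–Y396.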
4. Discussion
The rat mtSSB gene was found to contain seven exons and six introns. The intron/exon junctions were located in positions that were virtually identical to those found in the two mtSSB genes of X. laevis (Champagne et al., 1997). Based on the sequences of our overlapping clones, no evidence of multiple genes for rat mtSSB was found, although the possibility of more than one gene copy was not rigorously ruled out. In X. laevis, it has been postulated that the two genes that have been identified are the result of evolutionary gene duplication in this organism (Bisbee et al., 1977). Interestingly, both the rat and X. laevis genes contained an intron that essentially bisects the short amphipathic presequence. Intron interruption of mitochondrial presequences has also been
found for bovine ATP synthase subunit 9 (Dyer et al., 1989), the yeast and bovine subunit alpha (Falson et al., 1991; Pierce et al., 1992), and for human (Suzuki et al., 1989) and yeast (Ohashi et al., 1982) cytochrome c1. Each of these other enzyme subunits is a component of the membrane-bound complexes and contains a longer presequence than those of the mtSSBs. It has been postulated that the intron in the cytochrome c1 presequence separates two different targeting domains, one involved in intracellular matrix targeting and the other in intramitochondrial sorting (Suzuki et al., 1989). This is not likely to be the case for the mtSSB genes because the presequences are only 16 (rat) or 17 (X. laevis) amino acids long, and the mitochondrial location of the protein is predominantly in the soluble matrix compartment with the mtDNA (Pavco and Van Tuyle, 1985).

Primer-extension analysis of rat poly(A)+ RNA identified a unique transcription start site located 15 bp upstream of the published cDNA sequence. This precise localization made it possible to define more accurately the 5′ untranscribed region to be evaluated for genetic regulatory elements. In keeping with the genes for several nuclear-encoded mitochondrial proteins (Evans and Scarpulla, 1990; Wang et al., 1997) as well as a class of other cellular constitutive proteins (Azizkhan et al., 1993), a TATA box was not present in the proximal upstream region. Perhaps the most notable controlling elements found were NRF-2 binding sites, one of which was immediately beside the transcription start site. Two others (one in opposite orientation) were located in the near upstream region. NRF-2 binding sites have also been identified in the promoter regions of cytochrome oxidase subunits IV (Virbasius and Scarpulla, 1994) and Vb (Basu and Avadhani, 1991), ATP synthase beta subunit (Ohta et al., 1988), and mtTFA (Virbasius and Scarpulla, 1994).
In addition, there was an Sp1-binding site (Kadonaga et al., 1986) at about −330 and a 10/12-bp imperfect NRF-1 site in exon 1 near the transcription start site. The X. laevis mtSSB genes also contained potential NRF-2 and Sp1-binding sites in the near upstream sequence or in exon 1 (Champagne et al., 1997).

The region of the rat mtSSB gene from position −546 to +30, which contains the potential genetic regulatory elements described above, exhibited unusually robust promoter strength, relative to vector control, in transient transfection assays. Although the activity of the complete promoter in vivo may also depend on additional modulating elements that may lie further upstream than base pair −546, or even within introns (Evans and Scarpulla, 1989; Oshima et al., 1990), our finding that the immediate 5′ flanking sequence exhibits substantial promoter activity provides strong evidence that this gene is likely to be functional. It is logical that the regulation of the mtSSB gene
would be coordinated with that of other replication-related proteins, thus ensuring adequate production of mtSSB for the demands of the asymmetrical replication scheme. Comparison of the promoter region of mtSSB with that of human mtTFA (Virbasius and Scarpulla, 1994), thought to be involved in replication as the transcription activator of heavy strand primer synthesis (Clayton, 1991b), revealed the presence of NRF-1 and NRF-2 elements in both cases. This strongly suggests that these genes are similarly regulated and that, at the least, nuclear respiratory factors are important in their regulation. Further comparisons with other functional mtDNA replication genes, as they become available, should help to confirm which regulatory signals are important for coordinating mtDNA replication.

Curiously, the mtSSB protein, or a variant of this protein, has recently been implicated in the nuclear expression of the Aα fibrinogen gene in rat liver (Liu et al., 1997). This activator protein (Aα-core protein) has a similar molecular weight to mtSSB, has the same five N-terminal amino acids as the mature mtSSB (lacking the mitochondrial targeting sequence), and binds constitutively to the IL-6 response element. Furthermore, recombinant mtSSB and the Aα-core protein exhibit similar binding preferences and antigenic properties. Thus, if this protein is confirmed to be a true variant of mtSSB that is expressed from the same gene, then the regulatory elements of this gene are likely to be more complex than previously expected in order to serve this alternative transcription pathway. Interestingly, the mtTFA gene has also been shown to have alternative modes of regulation in mouse and human testis, where variant transcripts (also lacking the mitochondrial targeting sequence) are generated from alternative transcription sites.

In the course of this study, either complete or partial pseudogene-like sequences (Vanin, 1985) were PCR-amplified and at least partially sequenced.
One of these, p16Y1 (Fig. 1), exhibited processed pseudogene characteristics, including a lack of introns, the presence of a 3′ oligo(A) tail, and a single-base insertion that resulted in an in-frame stop codon. This sequence also had a 15-bp deletion within the reading frame. Two additional PCR products (unpublished data, S. Gupta and G.C. Van Tuyle) derived from primers that targeted the 5′ end of the gene also contained intron-lacking, cDNA-like sequences. One of these exhibited two base substitutions and a 1-bp deletion in the upstream untranslated region; the other had three base changes and a 13-base deletion in the untranslated region as well as three missense mutations in the reading frame. We believe these three examples were derived from different processed pseudogenes that are silent, mutated vestiges of reverse-transcribed mtSSB mRNA. This notion is supported by the fact that the 5′ upstream sequence of one of the pseudogenes exhibited negligible promoter activity in
comparison to that of the intron-containing mtSSB gene (Table 2).
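The reading-frame consequences described for the pseudogenes can be illustrated with a small sketch. A single-base insertion shifts the frame and can expose a stop codon. A 15-bp deletion, being a multiple of three, removes five codons without shifting the frame. The toy sequence below is hypothetical and is not the mtSSB or p16Y1 sequence. The function names are likewise illustrative.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(orf: str):
    """Return the 0-based codon index of the first in-frame stop, or None."""
    orf = orf.upper()
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


def insert_base(orf: str, index: int, base: str) -> str:
    """Insert a single base before the given 0-based index."""
    return orf[:index] + base + orf[index:]


def shifts_frame(indel_length: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0


# Hypothetical toy reading frame: ATG GCT GAC CTG AAG (no stop codon).
intact = "ATGGCTGACCTGAAG"
# Inserting one T after the second codon gives ATG GCT TGA CCT GAA G.
mutated = insert_base(intact, 6, "T")

print(first_stop_codon(intact))    # no in-frame stop in the intact toy frame
print(first_stop_codon(mutated))   # the shifted frame now reads TGA as the third codon
print(shifts_frame(15), shifts_frame(1))
```

The sketch only checks the frame. It says nothing about whether the shortened or truncated product would be stable or functional.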
References
Azizkhan, J.C., Jensen, D.E., Pierce, A.J., Wade, M., 1993. Transcription from TATA-less promoters: dihydrofolate reductase as a model. Crit. Rev. Eukaryot. Gene Expr. 3, 229–254.
Basu, A., Avadhani, N.G., 1991. Structural organization of the nuclear gene for subunit Vb of mouse mitochondrial cytochrome c oxidase. J. Biol. Chem. 266, 15450–15456.
Bisbee, C.A., Baker, A.C., Wilson, I., Haji-Azimi, I., Fishberg, M., 1977. Albumin phylogeny for clawed frogs (Xenopus). Science 195, 785–787.
Bissell, D.M., Guzelian, P.S., 1980. Phenotypic stability of adult rat hepatocytes in primary monolayer culture. Ann. NY Acad. Sci. 349, 85–98.
Breathnach, R., Chambon, P., 1981. Organization and expression of eucaryotic split genes coding for proteins. Annu. Rev. Biochem. 50, 349–383.
Champagne, A.M., Dufresne, C., Viney, L., Gueride, M., 1997. Cloning, sequencing and expression of the two genes encoding the mitochondrial single-stranded DNA-binding protein in Xenopus laevis. Gene 184, 65–71.
Clayton, D.A., 1991a. Replication and transcription of vertebrate mitochondrial DNA. Annu. Rev. Cell Biol. 7, 453–478.
Clayton, D.A., 1991b. Nuclear gadgets in mitochondrial DNA replication and transcription. Trends Biochem. Sci. 16, 107–111.
Dyer, M.R., Gay, N.J., Walker, J.E., 1989. DNA sequences of a bovine gene and of two related pseudogenes for the proteolipid subunit of mitochondrial ATP synthase. Biochem. J. 260, 249–258.
Dynan, W.S., Tjian, R., 1985. Control of eukaryotic messenger RNA synthesis by sequence-specific DNA-binding proteins. Nature 316, 774–778.
Evans, M.J., Scarpulla, R.C., 1989. Interaction of nuclear factors with multiple sites in the somatic cytochrome c promoter. J. Biol. Chem. 264, 14361–14368.
Evans, M.J., Scarpulla, R.C., 1990. NRF-1: A trans-activator of nuclear encoded respiratory genes in animal cells. Genes Dev. 4, 1023–1034.
Falson, P., Maffery, L., Conrath, K., Boutry, M., 1991. Alpha subunit of mitochondrial F1-ATPase from the fission yeast. Deduced sequence of the wild type and identification of a mutation that alters apparent negative cooperativity. J. Biol. Chem. 266, 287–293.
Feng, D.-F., Doolittle, R.F., 1987. A multiple sequence alignment using a simplification of the progressive alignment method. J. Mol. Evol. 35, 351–360.
Genuario, R., Wong, T.W., 1993. Stimulation of DNA polymerase γ by a mitochondrial single-strand DNA binding protein. Cell. Mol. Biol. Res. 39, 625–634.
Ghrir, R., Lecaer, J.-P., Dufresne, C., Gueride, M., 1991. Primary structure of the two variants of Xenopus laevis mtSSB, a mitochondrial DNA binding protein. Arch. Biochem. Biophys. 291, 395–400.
Henikoff, S., 1986. Unidirectional digestion with exonuclease III in DNA sequence analysis. Meth. Enzymol. 155, 156–165.
Hoke, G.D., Pavco, P.A., Ledwith, B.J., Van Tuyle, G.C., 1990. Structural and functional studies of the rat mitochondrial single strand DNA binding protein P16. Arch. Biochem. Biophys. 282, 116–124.
Kadonaga, J.T., Jones, K.A., Tjian, R., 1986. Promoter-specific activation of RNA polymerase II transcription by Sp1. Trends Biochem. Sci. 11, 20–23.
Li, K., Williams, R.S., 1997. Tetramerization and ssDNA binding properties of native and mutated forms of murine mtSSB proteins. J. Biol. Chem. 272, 8686–8694.
Liu, Z., Fuentes, N.L., Jones, S.A., Hagood, J.S., Fuller, G.M., 1997. A unique transcription factor for the Aα fibrinogen gene is related to the mitochondrial single-stranded DNA binding protein P16. Biochemistry 36, 14799–14806.
Mignotte, B., Barat, M., Mounolou, J.C., 1985. Characterization of a mitochondrial protein binding to single-stranded DNA. Nucleic Acids Res. 13, 1703–1716.
Mikhailov, V.S., Bogenhagen, D.F., 1996. Effects of Xenopus laevis mitochondrial single-stranded DNA-binding protein on primer-template binding and 3′→5′ exonuclease activity of DNA polymerase γ. J. Biol. Chem. 271, 18939–18946.
Nielson, D.A., Chou, J., Mackrell, A.J., Casadaban, M.J., Steiner, D.F., 1983. Expression of a preproinsulin-beta-galactosidase gene fusion in mammalian cells. Proc. Natl. Acad. Sci. USA 80, 5198–5202.
Ohashi, A., Gibson, J., Gregor, I., Schatz, G., 1982. Import of proteins into mitochondria. The precursor of cytochrome c1 is processed in two steps, one of them heme-dependent. J. Biol. Chem. 257, 13042–13047.
Ohta, S., Tomura, H., Matsuda, K., Kagawa, Y., 1988. Gene structure of the human mitochondrial adenosine triphosphate synthase beta subunit. J. Biol. Chem. 263, 11257–11262.
Oshima, R.G., Abrams, L., Kulesh, D., 1990. Activation of an intron enhancer within the keratin 18 gene by expression of c-fos and c-jun in undifferentiated F9 embryonal carcinoma cells. Genes Dev. 4, 835–848.
Pavco, P.A., Van Tuyle, G.C., 1985. Purification and general properties of the DNA-binding protein (P16) from rat liver mitochondria. J. Cell Biol. 100, 258–264.
Pierce, D.J., Jordan, E.M., Breen, G.A.M., 1992. Structural organization of a nuclear gene for the alpha subunit of the bovine mitochondrial ATP synthase complex. Biochim. Biophys. Acta 1132, 265–275.
Santoro, C., Mermod, N., Andrews, P.C., Tjian, R., 1988. A family of human CCAAT-box-binding proteins active in transcription and DNA replication: cloning and expression of multiple cDNAs. Nature 334, 218–224.
Scarpulla, R.C., 1997. Nuclear control of respiratory chain expression in mammalian cells. J. Bioenerg. Biomembr. 29, 109–119.
Suzuki, H., Hosokawa, Y., Nishikimi, M., Ozawa, T., 1989. Structural organization of the human mitochondrial cytochrome c1 gene. J. Biol. Chem. 264, 1368–1374.
Thommes, P., Farr, C.L., Marton, R.F., Kaguni, L.S., Cotterill, S., 1995. Mitochondrial single-stranded DNA-binding protein from Drosophila embryos. J. Biol. Chem. 270, 21137–21143.
Tiranti, V., Barat-Gueride, M., Bijl, J., DiDonato, S., Zeviani, M., 1991. A full-length cDNA encoding a mitochondrial DNA-specific single-stranded DNA binding protein from Xenopus laevis. Nucleic Acids Res. 19, 4291.
Tiranti, V., Rocchi, M., DiDonato, S., Zeviani, M., 1993. Cloning of human and rat cDNAs encoding the mitochondrial single-stranded DNA-binding protein (SSB). Gene 126, 219–225.
Van Dyck, E., Foury, F., Stillman, B., Brill, S.J., 1992. A single-stranded DNA binding protein required for mtDNA replication in S. cerevisiae is homologous to E. coli SSB. EMBO J. 11, 3421–3430.
Vanin, E.F., 1985. Processed pseudogenes: characterization and evolution. Annu. Rev. Genet. 19, 253–272.
Van Tuyle, G.C., Pavco, P.A., 1985. The rat liver mitochondrial DNA–protein complex: Displaced single strands of replicative intermediates are protein coated. J. Cell Biol. 100, 251–257.
Virbasius, J.V., Virbasius, C.A., Scarpulla, R.C., 1993a. Identity of GABP with NRF-2, a multisubunit activator of cytochrome oxidase expression, reveals a cellular role for an ETS domain activator of viral promoters. Genes Dev. 7, 2431–2445.
Virbasius, C.A., Virbasius, J.V., Scarpulla, R.C., 1993b. NRF-1, an activator involved in nuclear–mitochondrial interactions, utilizes a new DNA-binding domain conserved in a family of developmental regulators. Genes Dev. 7, 2431–2445.
Virbasius, J.V., Scarpulla, R.C., 1994. Activation of the human mtTFA gene by nuclear respiratory factors: A potential regulatory link between nuclear and mitochondrial gene expression in organelle biogenesis. Proc. Natl. Acad. Sci. USA 91, 1309–1313.
Wang, Y., Farr, C.L., Kaguni, L.S., 1997. Accessory subunit of mitochondrial DNA polymerase from Drosophila embryos. J. Biol. Chem. 272, 13640–13646.
Wood, K.V., 1991. In: Stanley, P., Kricka, L. (Eds.), Bioluminescence and Chemiluminescence: Current Status. Wiley, Chichester, UK, p. 543.
Yang, C., Curth, U., Burbanke, C., Kang, C-H., 1997. Crystal structure of human mitochondrial single-stranded DNA binding protein at 2.4 Å resolution. Nature Struct. Biol. 4, 153–157.