ARCHIVES
OF BIOCHEMISTRY
AND BIOPHYSICS
Vol. 237, No. 2, March, pp. 465-476, 1985
Gene Structure and Nucleotide Sequence for Rat Cytochrome P-450c¹
RONALD N. HINES,² JOAN B. LEVY,³ ROBERT D. CONRAD, PATRICK L. IVERSEN, MEI-LING SHEN, ANN M. RENLI, AND EDWARD BRESNICK
The Eppley Institute for Research in Cancer and Allied Diseases, and the Departments of Biochemistry and Pharmacology, The University of Nebraska Medical Center, Omaha, Nebraska 68105
Received August 27, 1984, and in revised form November 16, 1984
Two clones from rat genomic libraries that contain the entire gene for rat cytochrome P-450c have been isolated. λMC4, the first clone isolated from an EcoRI library, contained a 14-kb insert. A single 5.5-kb EcoRI fragment from λMC4, the EcoRI A fragment, hybridized to a partial cDNA clone for the 3' end of the cytochrome P-450c mRNA. This fragment was sequenced using the dideoxynucleotide chain termination methodology with recombinant M13 bacteriophage templates. Comparison of this sequence with the complete cDNA sequence of cytochrome P-450MC [Yabusaki et al. (1984) Nucleic Acids Res. 12, 2929-2938] revealed that the EcoRI A fragment contained the entire cytochrome P-450c gene with the exception of a 90-bp leader sequence. The gene sequence is in perfect agreement with the cDNA sequence except for two bases in exon 2. A second genomic clone, λMC10, which was isolated from a HaeIII library, contains the missing leader sequence as well as 5' regulatory sequences. The entire gene is about 6.1 kb in length with seven exons separated by six introns, all of the intron/exon junctions being defined by GT/AG. Amino- and carboxy-terminal information is contained in exons 2 and 7, respectively. These exons contain the highly conserved DNA sequences that have been observed in other cytochrome P-450 species. Potential regulatory sequences have been located both 5' to the gene and within intron I. A comparison of the coding information for cytochrome P-450c with the sequence of murine cytochrome P3-450 and rat cytochrome P-450d revealed a 70% homology in both the DNA and amino acid sequence, suggesting a common ancestral gene. Genomic blot analyses of rat DNA indicated that the 3-methylcholanthrene-inducible family of cytochrome P-450 isozymes is more limited in number than the phenobarbital-inducible isozymes.
Cross-hybridization studies with human DNA suggest a high degree of conservation between rat cytochrome P-450c and its human homolog, although gross structural differences do exist between the two genes. © 1985 Academic Press, Inc.

¹ This study was supported in part by grants from the NIH (ES-01974) and the Council for Tobacco Research (No. 1369). J. Levy was supported under a training grant from the National Cancer Institute (CA-09286).
² To whom correspondence should be addressed.
³ Present address: Rockefeller University, New York, N.Y. 10021.

0003-9861/85 $3.00 Copyright © 1985 by Academic Press, Inc. All rights of reproduction in any form reserved.

The cytochrome P-450-dependent monooxygenases are a family of isozymes that play a prominent role in the biotransformation of many endogenous as well as foreign substances (1, 2). Oxidative metabolism by these isozymes generally facilitates the elimination of xenobiotics by increasing their hydrophilicity. However, this process often results in the formation of intermediates that are capable of interacting with critical cellular macromolecules and eliciting a pathological response. For this reason, the cytochromes P-450 have become the subject of intense
research by numerous investigators.

The exact number of cytochrome P-450 isozymes has yet to be determined. Although at least nine distinctive cytochrome P-450 isozymes have been purified from rat liver to date (3-7), this value probably represents a lower limit. Animals exposed to any one of a number of chemicals may exhibit both an induction of the specific content of cytochrome P-450 and an alteration in the spectrum of expressed isozymes (1, 8, 9). Recent studies have also shown that this process is tissue specific and may include repression as well as induction (10, 11). Investigations in our laboratory (12) as well as others (13-16) have verified earlier suggestions (17-21) that transcription is a primary regulatory point for alterations in cytochrome P-450 expression.

Our current research efforts have concentrated on the regulation of expression of cytochrome P-450c, the major rat isozyme induced by polycyclic hydrocarbons such as 3-methylcholanthrene (3). We have previously examined the kinetics of induction of cytochrome P-450c mRNA and have characterized a partial cDNA clone for cytochrome P-450c (12, 22). A report describing the characterization of nuclear precursors to the mature mRNA has been communicated elsewhere (23). Yabusaki et al. (24) have recently described the sequence of a near full-length cDNA clone coding for cytochrome P-450c. We report here the sequence for the cytochrome P-450c gene, including some possible regulatory sequences, as well as hybridization data with rat and human genomic DNA. While this manuscript was being revised for publication, Sogawa et al. (25) described the structure and sequence for this same gene. With few exceptions, both sequences are in good agreement.

EXPERIMENTAL PROCEDURES
Materials. [α-32P]dCTP (800 Ci/mmol) was obtained from Amersham Corporation (Arlington Heights, Ill.). Deoxy- and dideoxynucleotide triphosphates were purchased from Pharmacia P-L Biochemicals (Milwaukee, Wis.). With the exception of PstI, restriction endonucleases were purchased from New England Biolabs (Beverly, Mass.). Escherichia coli DNA polymerase I and PstI were obtained from Boehringer Mannheim (Indianapolis, Ind.). Large-fragment E. coli DNA polymerase I (Klenow enzyme) was obtained from Bethesda Research Laboratories (Gaithersburg, Md.). Electrophoresis-grade acrylamide and bisacrylamide were purchased from Bio-Rad (Richmond, Calif.) and ultrapure urea from Schwarz/Mann (Cambridge, Mass.). BamHI linkers and DNA sequencing primers were purchased from Collaborative Research, Inc. (Lexington, Mass.). Nitrocellulose BA85 membranes were purchased from Schleicher and Schuell, Inc. (Keene, N.H.). Formamide (EM Science, Gibbstown, N.J.) was deionized prior to use by swirling with AG 501-X8 mixed bed resin (Bio-Rad). Autoradiography was performed at -70°C using Kodak XAR-5 film with or without DuPont Cronex III intensifier screens. The rat genomic libraries, which were constructed by cloning partial EcoRI- or HaeIII-restricted rat DNA into the bacteriophage vector Charon 4A, were provided by Dr. T. Sargent of the NIH (Bethesda, Md.). E. coli LE392 was used as a host strain in our studies. Clone 46, a cDNA clone for mouse cytochrome P1-450 (26), was provided by Dr. D. Nebert and Dr. M. Negishi, NIH (Bethesda, Md.).

Isolation of genomic clones. The EcoRI and HaeIII rat genomic libraries were screened by in situ plaque hybridization (27) with a probe that had been radiolabeled by nick-translation (28). Plaques showing positive hybridization were rescreened at increasingly lower densities until they were plaque-purified. The initial screen of the EcoRI library was carried out using clone 46 as a probe. Phage DNA was purified for further characterization essentially as described by Maniatis et al. (29).

Restriction map analysis. Bacteriophage DNA containing the genomic inserts of interest was restricted to completion with a variety of endonucleases under conditions recommended by the supplier.
Fragments were fractionated by agarose gel electrophoresis, visualized by ethidium bromide staining, and sized by comparison to the mobility of the HindIII restriction fragments of λ DNA.

DNA sequencing. For the initial round of DNA sequencing, Sau3AI restriction fragments from the genomic clones were subcloned into BamHI-restricted M13mp701 (D. R. Bentley, University of Oxford, England). Overlap sequences were obtained from HaeIII, AvaII, AccI, HindIII, or EcoRI/PstI fragments subcloned into either M13mp701, M13mp8, or M13mp9 (30). Where necessary, fragments were prepared with flush ends using T4 DNA polymerase and BamHI linkers were added to facilitate cloning. After transfection into E. coli JM101, recombinant plaques were selected on the basis of insertional inactivation of the β-galactosidase gene and phage DNA was isolated essentially as described by Sanger et al. (31). DNA sequencing was performed using the dideoxy chain termination technique with recombinant M13 templates (32). The data were analyzed on a Digital Equipment Corporation PDP 11/23+ minicomputer using programs provided by Dr. R. Staden (33, 34).

Genomic blot hybridization. High-molecular-weight DNA was isolated from rat liver as described by Blin and Stafford (35), while human lymphocyte DNA was provided by Dr. W. J. Kimberling, Creighton University (Omaha, Neb.). The DNA was exhaustively digested with EcoRI, BamHI, HindIII, PstI, or TaqI and fractionated on 0.8% agarose gels. The samples were transferred to nitrocellulose (36, 37), hybridized to a nick-translated probe, and analyzed by autoradiography. Hybridizations were carried out essentially as described by Thomas (38). DNA fragments were sized by comparing their mobility to end-labeled HindIII restriction digests of λ DNA.

All experiments concerned with the manipulation of recombinant DNA material in bacterial hosts were carried out under P1 containment conditions as recommended by NIH guidelines.

RESULTS
Isolation of genomic clones for cytochrome P-450c. Initial attempts to use pEB339 (22) for screening the genomic library failed, presumably due to technical problems with the long A-T homopolymer tails added during the cloning of this fragment. Clone 46, a partial cDNA clone for mouse cytochrome P1-450 (26), was subsequently used. From 500,000 plaques screened in the EcoRI library, a single clone, λMC4, was isolated that exhibited a strong positive hybridization signal to clone 46. DNA from a purified plaque was isolated and mapped by restriction endonucleases as shown in Fig. 1. The total insert size was determined to be 14 kb. When λMC4 was restricted with EcoRI, and the fragments were fractionated by agarose gel electrophoresis and analyzed by Southern blot hybridization to radiolabeled clone 46 DNA, a single band was observed corresponding to the 5.5-kb EcoRI fragment. Since clone 46 represents 3' information for mouse cytochrome P1-450, and assuming an analogous situation in the rat, it was concluded that at least the 3' end of the cytochrome P-450c gene must lie within this fragment. To facilitate further study, this 5.5-kb fragment was subcloned into the EcoRI site of pBR322 (pA8).
Sequence analysis (to be described later) together with Southern and Northern blot hybridization data suggested that some exon/intron information as well as 5' regulatory sequences were upstream from λMC4. To obtain this information, pA8 was used to screen the HaeIII library and isolate an overlapping clone, λMC10. As shown in Fig. 1, λMC10 extends λMC4 12.6 kb in the 5' direction. The 1.9-kb BamHI/EcoRI fragment directly upstream from pA8 was subcloned into pUC8 (termed pA9) and used to complete the analysis of gene structure.

DNA sequence. The DNA sequence of pA8 and pA9 was determined as described under Experimental Procedures. When compared to the cDNA sequence for cytochrome P-450MC reported by Yabusaki et al. (24), it became apparent that nearly the entire gene, and all of the coding information for cytochrome P-450c, was contained within pA8. The missing upstream information was obtained from pA9. The complete DNA sequence of the cytochrome P-450c gene is shown in Fig. 2 and a summary intron/exon map is presented in Fig. 3. We were unable to obtain an M13 subclone of the Sau3A fragment from -64 to +420 for sequence analysis, presumably due to the stem-loop structure supported by sequences within this fragment (positions +183 to +297). Consequently, the sequence depicted for this fragment in Fig. 2 is that reported by Sogawa et al. (25).

FIG. 1. Restriction endonuclease map of λMC4 and λMC10. Bacteriophage DNA or plasmid subclones were restricted with several enzymes and the digests were analyzed by agarose gel electrophoresis. Fragment size was determined by comparing mobility to the HindIII restriction fragments of λ DNA. The relative position of the genomic subclones pA8 and pA9 is depicted.

FIG. 2. DNA sequence of the cytochrome P-450c gene. DNA sequence was determined as described under Experimental Procedures and summarized in Fig. 3. Intron/exon acceptor and donor sites were determined by comparison with the cDNA sequence as well as the consensus GT/AG dinucleotides. Several features have been highlighted by underlining: (a) Enhancer core sequences are located 5' to the transcription start site and within intron I. Several enhancer "core-like" sequences are also noted within intron I. The orientation of the enhancer element relative to transcription is indicated by arrows. (b) Stretches of alternating base pairs, some capable of supporting the Z conformation of DNA, with short, flanking direct repeats are also found both upstream from the gene and within intron I. (c) TATA and modified CCAAT boxes, commonly associated with promoter function, are located just upstream from exon 1. (d) Two highly conserved sequences among the different cytochrome P-450 species studied to date, HR1 and HR2, are located within exon 2 and exon 7, respectively. (e) Two polyadenylation signals are found in the 3' noncoding information, the one 19 bp 5' to the end of the gene being the only one that apparently functions.

[Fig. 2 sequence panels: nucleotide sequence with the deduced amino acid sequence beneath each exon. The rows shown here run from position 2581 to position 4491 and cover exon 2 (with HR1), intron II, exon 3, intron III, exon 4, intron IV, exon 5, and the start of intron V.]
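The stretches of alternating base pairs marked in Fig. 2 lend themselves to a simple pattern search. The following is a minimal sketch, not the analysis used in this study: the function names, the six-unit threshold, and the toy sequence are illustrative assumptions, and the real gene sequence would have to be supplied separately.

```python
import re

PYRIMIDINES = set("CT")

def find_dinucleotide_repeats(seq, min_units=6):
    """Return (start, unit, n_units) for each run of one repeated dinucleotide.

    start is a 0-based index into seq; min_units is an arbitrary threshold.
    """
    pattern = re.compile(r"([ACGT]{2})\1{%d,}" % (min_units - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        if unit[0] == unit[1]:
            continue  # ignore homopolymer runs such as AAAAAA...
        hits.append((match.start(), unit, len(match.group(0)) // 2))
    return hits

def alternates_pyrimidine_purine(unit):
    """True when one base of the unit is a pyrimidine and the other a purine."""
    return (unit[0] in PYRIMIDINES) != (unit[1] in PYRIMIDINES)

# Toy sequence (hypothetical, not taken from the gene): a (TG)10 run and an (AG)8 run.
TOY = "GATTACA" + "TG" * 10 + "CCGTA" + "AG" * 8 + "TTC"

for start, unit, n in find_dinucleotide_repeats(TOY):
    kind = "pyrimidine/purine" if alternates_pyrimidine_purine(unit) else "other"
    print(start, f"({unit}){n}", kind)
```

Only repeats such as (TG)n or (CA)n alternate pyrimidine with purine; a run such as (AG)n is repetitive but is reported as "other" by this sketch.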
[FIG. 2-Continued (panel d): nucleotide sequence with the deduced amino acid sequence, in rows beginning at positions 4581 through 5018, covering exon 6, intron VI, and exon 7 (with HR2).]
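The GT/AG rule used to assign the intron/exon junctions in Fig. 2 can be expressed as a short check. The sketch below is only an illustration under stated assumptions: the function names and the toy gene are hypothetical, and intron coordinates are taken as 0-based, end-exclusive pairs.

```python
def obeys_gt_ag(intron):
    """True when an intron begins with GT and ends with AG."""
    intron = intron.upper()
    return len(intron) >= 4 and intron[:2] == "GT" and intron[-2:] == "AG"

def splice_exons(genomic, introns):
    """Join the exons left after removing introns given as (start, end) pairs."""
    exons, previous_end = [], 0
    for start, end in sorted(introns):
        if not obeys_gt_ag(genomic[start:end]):
            raise ValueError(f"intron {start}-{end} does not begin with GT and end with AG")
        exons.append(genomic[previous_end:start])
        previous_end = end
    exons.append(genomic[previous_end:])
    return "".join(exons)

# Toy gene (hypothetical): exon, one 12-bp intron, exon.
TOY_GENE = "ATGGCC" + "GTAAGTTTTCAG" + "GATTAA"
print(splice_exons(TOY_GENE, [(6, 18)]))
```

Comparing the spliced product with a cDNA sequence, as was done here against the sequence of Yabusaki et al. (24), is what fixes the junction positions; the dinucleotide check only confirms that each proposed intron has consensus ends.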
Although the gene sequence reported here is in near perfect agreement with the reported cDNA sequence of Yabusaki et al. (24) and corrected amino- and carboxy-terminal protein sequences (Dr. W. Levin, personal communication), two exceptions are noted. Position 2669 of the exon 2 sequence (equivalent to position
124 of the cDNA sequence and amino acid 17 of the protein sequence) reads cytosine while the cDNA sequence reads thymine, and position 2776 of exon 2 (equivalent of cDNA position 231 and amino acid 52) reads adenine while the cDNA reads guanine. The first discrepancy would not alter the amino acid sequence, i.e., leucine.
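The consequence of each single-base difference for the protein can be checked against the standard genetic code. In the minimal sketch below, the leucine codon context (CTG in the gene, TTG in the cDNA) is an assumption made for illustration, since the text specifies only the differing base. The second pair is constrained by the code itself: ATG is the only methionine codon, so an adenine-for-guanine difference that yields isoleucine corresponds to ATA. The dictionary and function name are hypothetical.

```python
# Minimal slice of the standard genetic code covering the codons discussed.
CODON_TO_AA = {"CTG": "Leu", "TTG": "Leu", "ATG": "Met", "ATA": "Ile"}

def compare_codons(gene_codon, cdna_codon):
    """Return (gene residue, cDNA residue, True if the difference is silent)."""
    gene_aa = CODON_TO_AA[gene_codon.upper()]
    cdna_aa = CODON_TO_AA[cdna_codon.upper()]
    return gene_aa, cdna_aa, gene_aa == cdna_aa

# Position 2669: cytosine in the gene, thymine in the cDNA (codon context assumed).
print(compare_codons("CTG", "TTG"))
# Position 2776: adenine in the gene, guanine in the cDNA.
print(compare_codons("ATA", "ATG"))
```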
FIG. 3. Intron/exon map and sequencing strategy for the cytochrome P-450c gene. The locations of the amino- and carboxy-terminal sequences are indicated within exon 2 and exon 7, respectively. Subcloning of fragments and sequencing were accomplished as described under Experimental Procedures.
However, the second changes the amino acid from methionine in the cDNA to isoleucine for the gene. The gene sequence of Sogawa et al. (25) also indicates isoleucine at this position. All other discrepancies lie within the 3' noncoding region of the gene. The highly conserved HR1 and HR2 sequences proposed by Gotoh et al. (39) as possible heme binding sites are also indicated in Fig. 2 (HR1 within exon 2, HR2 within exon 7).

Several potential regulatory sequences are noted (Fig. 2). Stretches of repetitive, alternating base pairs are found at positions -524, -212 [(TG)n], +184 [a compound repeat that includes (TG)n(AG)n], and +2271 [(TG)n]. The sequences at -212, +2271, and part of the sequence at +184 consist of alternating pyrimidine/purine base pairs, which have been shown to stabilize the Z conformation of DNA (40). This alternative conformation has been suggested to play a role in the regulation of gene expression (41). As noted earlier, the sequence at +184 would also support a stem-loop structure. Interestingly, each of these stretches of alternating base pairs is flanked by short direct repeats typical of insertion elements. A Goldberg-Hogness TATA box (42) is found 30 bp upstream of exon 1, and a possible CCAAT box (CCAT) is at position -124. Two enhancer core sequences (43) are found at positions -334 and +527, the latter oriented opposite to the direction of transcription of the cytochrome P-450c gene. Several other enhancer "core-like" sequences can be found at +501, +857, +1304, and +1974. The significance of some of these sequences was also noted by Sogawa et al. (25).

Genomic blot analyses. To ascertain whether cytochrome P-450c is a member of a highly homologous multigene family, as has been reported for cytochrome P-450b (44-47), and to determine whether or not pseudogenes might be represented in the genome, rat liver genomic DNA was isolated, restricted with a variety of endonucleases, and analyzed by Southern blot hybridization with radiolabeled pA8 (Fig. 4). For each endonuclease, the hybridization pattern can be accounted for by the restriction map of the cytochrome P-450c gene (Fig. 1). Upon longer exposure of the autoradiograms, some additional bands are observed (data not shown), which most likely correspond to the gene for cytochrome P-450d, the protein most homologous to cytochrome P-450c noted to date (24). Attempts to use radiolabeled pA9 as a hybridization probe resulted in high levels of background on the autoradiograms. This most likely is a result of the alternating base pair sequences (48).

In an attempt to begin extending these studies to the human homolog of rat cytochrome P-450c, human lymphocyte DNA was obtained from 20 donors and analyzed in a similar fashion to the rat liver DNA. In all cases, cross-hybridization to the rat probe was observed even under high-stringency hybridization conditions. Although no differences were observed among the different human samples, significant differences were observed between the rat and human hybridization patterns (summarized in Table I).

DISCUSSION
As a prerequisite to studies on specific mechanisms involved in the regulation of expression of rat cytochrome P-450c, we have isolated and characterized two genomic fragments which contain the entire gene for cytochrome P-450c. Complete sequence analysis of these genomic fragments led to several observations. With the exception of two single-base differences, the coding sequence reported here for cytochrome P-450c agrees completely with the recently published cDNA sequence for cytochrome P-450MC (24) as well as with the corrected amino- and carboxy-terminal protein sequences for cytochrome P-450c (Dr. W. Levin, personal communication) and the gene sequence reported by Sogawa et al. (25). Several aspects of the sequence found either 5' to the gene or within intron I are intriguing relative to possible regulation of expression of cytochrome P-450c.

FIG. 4. Genomic blot hybridization data with rat DNA. Forty-microgram aliquots of rat genomic DNA were exhaustively digested with either EcoRI, BamHI, PstI, or HindIII restriction endonuclease, and size fractionated by agarose gel electrophoresis. After Southern transfer to a nitrocellulose membrane, the digests were hybridized with radiolabeled pA8 and autoradiographed. Fragment size was determined by comparison with the mobility of the HindIII restriction fragments of λ DNA, indicated by arrows. Faint bands not listed in Table I most likely are indicative of cross-hybridization with the cytochrome P-450d gene.
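Fragment sizes on the genomic blots were assigned by comparison with the HindIII fragments of λ DNA. One common way to make such a comparison, shown here only as a sketch and not necessarily the procedure followed in this study, is to interpolate log10(size) linearly against migration distance between the two flanking markers. The marker sizes are the standard λ HindIII fragment lengths; the migration distances and the function name are hypothetical.

```python
import math

# Standard sizes (bp) of the six largest HindIII fragments of lambda DNA,
# paired with hypothetical migration distances in mm.
MARKERS = [
    (10.0, 23130),
    (18.0, 9416),
    (23.0, 6557),
    (29.0, 4361),
    (40.0, 2322),
    (43.0, 2027),
]

def estimate_size_bp(distance_mm, markers=MARKERS):
    """Interpolate log10(size) linearly between the two flanking markers."""
    ordered = sorted(markers)
    for (d0, s0), (d1, s1) in zip(ordered, ordered[1:]):
        if d0 <= distance_mm <= d1:
            fraction = (distance_mm - d0) / (d1 - d0)
            log_size = math.log10(s0) + fraction * (math.log10(s1) - math.log10(s0))
            return 10 ** log_size
    raise ValueError("band lies outside the range covered by the markers")

# A band that migrates 26 mm falls between the 6557- and 4361-bp markers.
print(round(estimate_size_bp(26.0)))
```

Interpolating piecewise rather than fitting one line across all markers allows for the poor resolution of large fragments, where the relation between log size and mobility is no longer linear.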
Enhancer elements are cis-essential but orientation-independent sequences shown to play a critical role in increasing the transcriptional efficiency of several viral as well as eucaryotic genes (49). In some cases these elements have exhibited tissue specificity (50, 51) and at least one case of a hormone-responsive enhancer element has been described (52). An enhancer core sequence defined by Weiher et al. (43) and found previously in enhancer elements of eucaryotic genes (50, 51, 53) can also be found both 5' to the cytochrome P-450c gene and within intron I. The latter sequence is oriented opposite to the direction of transcription of the gene. Several other enhancer "core-like" sequences have been identified as shown in Fig. 2. The stretches of alternating base pairs both upstream from the cytochrome P-450c gene and within intron I also bear a striking resemblance to those seen within a DNA fragment with enhancer activity from the Eβ gene of the murine major histocompatibility complex (53). This type of repeating element has also been shown to exist in high copy number in widely diverse eucaryotic genomes (48). Since enhancers can act upon a promoter from either the 5' or 3' orientation, it is interesting to speculate that the sequences within intron I and the sequences 5' to the gene are indeed enhancer elements and act to coordinately regulate the expression of cytochrome P-450c as well as neighboring genes.

TABLE I
SOUTHERN BLOT HYBRIDIZATION ANALYSIS OF RAT AND HUMAN DNA WITH A RAT CYTOCHROME P-450c GENOMIC PROBE

                              Fragment size (kb)
DNA source   EcoRI       BamHI       PstI                 HindIII     TaqI
Rat          5.5         10.5, 4.5   5.3, 2.2, 2.2, 0.6   3.3, 2.0    5.4, 3.0
Human        13.5, 2.9   11.1, 5.3   8.7, 2.0, 1.7, 0.8   11.2, 8.0   7.8, 5.9, 2.1

Note. High-molecular-weight genomic DNA (20 μg) from rat liver or human lymphocytes was exhaustively restricted with EcoRI, BamHI, PstI, HindIII, or TaqI endonuclease, size fractionated by agarose gel electrophoresis, and analyzed by Southern blot hybridization to radiolabeled λMC4 EcoRI A fragment. Fragment size was determined by comparison to the mobility of the HindIII restriction fragments of λ DNA. Two PstI fragments, each 2.2 kb in length, are listed for the rat based on the relative intensity of this band compared to others in this lane.

The restriction map as well as the intron/exon organization of the cytochrome P-450c gene is significantly different from that reported by Nakamura et al. for mouse cytochrome P1-450 (54). These investigators reported a total gene size of 4.6 kb containing five exons, in contrast to the gene for cytochrome P-450c, which is 6.1 kb and contains seven exons. These differences exist despite the fact that considerable sequence homology is present in at least the 3' ends of the two genes, since clone 46, a cDNA clone for the 3' end of the cytochrome P1-450 gene, was used to isolate the genomic fragments described in this report.

Comparisons were also made with the cDNA sequence of cytochrome P3-450 (the murine homolog to rat cytochrome P-450d), which has been recently reported by Kimura et al. (55). To maximize the homology, two compensating alterations in the published sequence of the cDNA were required: the cytosine at position 142 had to be omitted and a guanosine had to be added at position 202. These changes alter the reading frame of the cDNA for amino acids 27 through 46, but most likely represent mistakes in the cDNA sequence and not actual differences in the genes. With these changes, the cDNA for cytochrome P3-450 and the coding exon information for cytochrome P-450c are 70% homologous in both the DNA and amino acid sequence. Of the differences observed, 25% of those in the DNA and 45% of those in the amino acid sequence are conservative in nature (i.e., the altered codon does not translate into a different amino acid or the altered amino acid belongs to the same functional group).
The greatest homology is observed within exon 2 (77%) and exon 5 (85%) while the least is observed within exon 3 (45%).
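Homology figures of this kind can be reproduced from an in-frame alignment by counting identical nucleotides and asking, for each codon that differs, whether the encoded amino acid changes. The sketch below is illustrative only: it counts only silent codon differences, not substitutions within the same amino acid functional group, and the function names and the four-codon toy sequences are hypothetical rather than taken from either gene.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order (one-letter amino acid codes).
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def compare_coding(seq_a, seq_b):
    """Compare two aligned, in-frame, gap-free coding sequences of equal length."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be equal in length and a multiple of 3")
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    same_bases = sum(x == y for x, y in zip(seq_a, seq_b))
    codons = [(seq_a[i:i + 3], seq_b[i:i + 3]) for i in range(0, len(seq_a), 3)]
    changed = [(x, y) for x, y in codons if x != y]
    silent = sum(CODON_TABLE[x] == CODON_TABLE[y] for x, y in changed)
    same_residues = sum(CODON_TABLE[x] == CODON_TABLE[y] for x, y in codons)
    return {
        "dna_identity_pct": 100.0 * same_bases / len(seq_a),
        "protein_identity_pct": 100.0 * same_residues / len(codons),
        "changed_codons": len(changed),
        "silent_changes": silent,
    }

# Toy example: Met-Leu-Ala-Ile versus Met-Leu-Ala-Met.
print(compare_coding("ATGCTGGCCATA", "ATGTTGGCTATG"))
```

Applied exon by exon, the same calculation would give per-exon values comparable to those quoted above.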
Little or no homology is observed within the 3' noncoding portions of the two genes except for the presence of A-T-rich stretches of unknown function. Similar results are observed when the cDNA sequence of rat cytochrome P-450d (56) is compared to that for cytochrome P-450c. A graphic representation of these similarities in both the coding and noncoding DNA sequences of cytochromes P-450c and P-450d is given in Fig. 5A. These data clearly show the striking degree of homology in the coding regions of the two genes, particularly at the amino- and carboxy-terminal ends. Less homology is apparent between the middle of the two genes and is clearly diminished in the 3' noncoding sequences. A similar comparative plot between the DNA sequences of cytochromes P-450c and P-450e is presented in Fig. 5B. A slight degree of
[Fig. 5, panels A and B: dot-matrix plots; horizontal axis, rat P-450c sequence position (ticks at 500, 1000, 1500, and 2000).]

FIG. 5. A graphic representation of the homology between cytochrome P-450c and cytochromes P-450d and P-450e. The PCMATRIX program of Lagrimini et al. (58) was used to compare the DNA sequence of the cytochrome P-450c gene versus that of the cytochrome P-450d (A) and cytochrome P-450e (B). Both plots were determined using a window setting of nine bases, i.e., a perfect match of nine nucleotides is required to score a match. Identical sequences appear as a diagonal line with a positive slope.
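The scoring rule described in the legend to Fig. 5, a perfect match over a window of nine bases, can be sketched in a few lines. This is not the PCMATRIX program, only an illustration of the same rule; the function name and the toy sequences are hypothetical.

```python
def dot_matrix(seq_a, seq_b, window=9):
    """Return (i, j) pairs where seq_a[i:i+window] exactly equals seq_b[j:j+window]."""
    seq_a, seq_b = seq_a.upper(), seq_b.upper()
    # Index every window of seq_b once so that each window of seq_a is a lookup.
    positions = {}
    for j in range(len(seq_b) - window + 1):
        positions.setdefault(seq_b[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in positions.get(seq_a[i:i + window], ()):
            hits.append((i, j))
    return hits

# Toy sequences (hypothetical) sharing one 15-base segment at different offsets.
SHARED = "GATTACAGGCTTAAC"
SEQ_A = "CCC" + SHARED + "TTTT"
SEQ_B = "A" + SHARED + "GG"

for i, j in dot_matrix(SEQ_A, SEQ_B):
    print(i, j)
```

Plotting each (i, j) pair as a point gives the picture in Fig. 5: a run of hits with constant i - j is the diagonal line of positive slope produced by a shared segment.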
homology, approximately 30%, is observed. Based on sequence homology similar to that reported here, Kimura et al. (47) and Sogawa et al. (25) concluded that the 3-methylcholanthrene-inducible (murine P3-450, and rat P-450d and c) and the phenobarbital-inducible (rat P-450e) cytochrome P-450 isozymes most likely arose from a common ancestral gene by divergent evolution. However, as stated by Sogawa et al. (25), the gross differences in exon/intron structure between the cytochrome P-450e and cytochrome P-450c genes are not in accordance with the theory proposed by Gilbert (57) for the evolutionary function of intervening sequences, and suggest the possibility of convergent evolution for these two families of cytochrome P-450 isozymes.

Since the rat cytochrome P-450c genomic fragment cross-hybridizes with human DNA under relatively high stringency conditions, one must also conclude that considerable homology exists between at least portions of the cytochrome P-450c gene and its human homolog. However, gross structural differences do exist, based on the differences observed in the genomic blot hybridization patterns.

Several investigators have demonstrated that the phenobarbital-inducible family of cytochrome P-450 isozymes contains several highly homologous members encoded at closely linked loci in the rat (44-47). This does not appear to be the case for the 3-methylcholanthrene-inducible isozymes, since a single band is observed at 5.5 kb when pA8 is hybridized to EcoRI-digested rat genomic DNA. All of the hybridization patterns can be explained by the restriction map of this fragment (Table I). This observation also argues against the presence of pseudogenes for cytochrome P-450c.

With the elucidation of gene structure at the sequence level, it will now be possible to define specific sequences involved in the regulation of expression of the cytochrome P-450 isozymes, beginning with those described earlier in this report.
Furthermore, with the observation that the human homolog to cytochrome P-450c has been conserved enough to allow cross-hybridization with the rat probe, we will be able to expand our studies into the regulation of expression of the human cytochrome P-450-dependent monooxygenases.

ACKNOWLEDGMENTS

The authors acknowledge the technical assistance of Ann M. Renli, and the helpful comments and discussion of Dr. W. Levin, Dr. P. Thomas, and Dr. D. Ryan, Hoffman-LaRoche, Inc.
REFERENCES
1. CONNEY, A. H. (1967) Pharmacol. Rev. 19, 317-366.
2. GILLETTE, J. R., DAVIS, D. C., AND SASAME, H. A. (1972) Annu. Rev. Pharmacol. 12, 57-84.
3. RYAN, D. E., THOMAS, P. E., KORZENIOWSKI, D., AND LEVIN, W. (1979) J. Biol. Chem. 254, 1365-1374.
4. RYAN, D. E., THOMAS, P. E., AND LEVIN, W. (1980) J. Biol. Chem. 255, 7941-7955.
5. RYAN, D. E., THOMAS, P. E., AND LEVIN, W. (1982) Arch. Biochem. Biophys. 216, 272-288.
6. GUENGERICH, F. P., DANNAN, G. A., WRIGHT, S. T., MARTIN, M. V., AND KAMINSKY, L. S. (1982) Biochemistry 21, 6019-6030.
7. RYAN, D. E., IIDA, S., WOOD, A. W., THOMAS, P. E., LIEBER, C. S., AND LEVIN, W. (1984) J. Biol. Chem. 259, 1239-1250.
8. WATERMAN, M. R., AND ESTABROOK, R. W. (1983) Mol. Cell. Biochem. 53/54, 267-278.
9. BRESNICK, E., FOLDES, R. L., AND HINES, R. N. (1984) Pharmacol. Rev. 36, 435-515.
10. PHILLIPS, I. R., SHEPHARD, E. A., BAYNEY, R. M., PIKE, S. F., RABIN, B. R., HEATH, R., AND CARTER, N. (1983) Biochem. J. 212, 55-64.
11. SERABJIT-SINGH, C. J., ALBRO, P. W., ROBERTSON, G. C., AND PHILPOT, R. M. (1983) J. Biol. Chem. 258, 12827-12834.
12. BRESNICK, E., BROSSEAU, M., LEVIN, W., REIK, L., RYAN, D. E., AND THOMAS, P. E. (1981) Proc. Natl. Acad. Sci. USA 78, 4083-4087.
13. ADESNIK, M., BAR-NUN, S., MASCHIO, F., ZUNICH, M., LIPPMAN, A., AND BARD, E. (1981) J. Biol. Chem. 256, 10340-10345.
14. HARDWICK, J. P., GONZALEZ, F. J., AND KASPER, C. B. (1983) J. Biol. Chem. 258, 8081-8085.
15. TUKEY, R. H., NEGISHI, M., AND NEBERT, D. W. (1982) Mol. Pharmacol. 22, 779-786.
16. MORVILLE, A. L., THOMAS, P. E., LEVIN, W., REIK, L., RYAN, D. E., RAPHAEL, C., AND ADESNIK, M. (1983) J. Biol. Chem. 258, 3901-3906.
17. CONNEY, A. H., AND GILMAN, A. G. (1963) J. Biol. Chem. 238, 3682-3685.
18. GELBOIN, H. V., AND BLACKBURN, N. R. (1964) Cancer Res. 24, 356-360.
19. CUTRONEO, K. E., AND BRESNICK, E. (1973) Biochem. Pharmacol. 22, 675-687.
20. JACOB, S. T., SCHARS, M. B., AND VESELL, E. S. (1974) Proc. Natl. Acad. Sci. USA 71, 704-707.
21. WHITLOCK, J. R., JR., AND GELBOIN, H. V. (1974) J. Biol. Chem. 249, 2616-2623.
22. BRESNICK, E., LEVY, J., HINES, R. N., LEVIN, W., AND THOMAS, P. E. (1981) Arch. Biochem. Biophys. 212, 501-507.
23. HINES, R. N., FOLDES, R. L., LEVY, J. B., OMIECINSKI, C. J., HO, K-L., SHEN, M-L., AND BRESNICK, E. (1984) in Banbury Report No. 16: Genetic Variability in Responses to Chemical Exposure (Omenn, G., and Gelboin, H. V., eds.), pp. 37-49, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York.
24. YABUSAKI, Y., SHIMIZU, M., MURAKAMI, H., NAKAMURA, K., OEDA, K., AND OHKAWA, H. (1984) Nucleic Acids Res. 12, 2929-2938.
25. SOGAWA, K., GOTOH, O., KAWAJIRI, K., AND FUJII-KURIYAMA, Y. (1984) Proc. Natl. Acad. Sci. USA 81, 5066-5070.
26. NEGISHI, M., SWAN, D. C., ENQUIST, L. W., AND NEBERT, D. W. (1981) Proc. Natl. Acad. Sci. USA 78, 800-804.
27. BENTON, W. D., AND DAVIS, R. W. (1977) Science (Washington, D. C.) 196, 180-182.
28. RIGBY, P. W. J., DIEKMANN, M., RHODES, C., AND BERG, P. (1977) J. Mol. Biol. 113, 237-251.
29. MANIATIS, T. (1978) Cell 15, 687-701.
30. MESSING, J., AND VIEIRA, J. (1982) Gene 19, 269-276.
31. SANGER, F., COULSON, A. R., BARRELL, B. G., SMITH, A. J. H., AND ROE, B. A. (1980) J. Mol. Biol. 143, 161-178.
32. SANGER, F., NICKLEN, S., AND COULSON, A. R. (1977) Proc. Natl. Acad. Sci. USA 74, 5463-5467.
33. STADEN, R. (1977) Nucleic Acids Res. 4, 4037-4051.
34. STADEN, R. (1980) Nucleic Acids Res. 8, 817-824.
35. BLIN, N., AND STAFFORD, D. W. (1976) Nucleic Acids Res. 3, 2303-2308.
36. SOUTHERN, E. M. (1975) J. Mol. Biol. 98, 503-517.
37. SMITH, G. E., AND SUMMERS, M. D. (1980) Anal. Biochem. 109, 123-129.
38. THOMAS, P. (1980) Proc. Natl. Acad. Sci. USA 77, 5021-5205.
39. GOTOH, O., TAGASHIRA, Y., IIZUKA, T., AND FUJII-KURIYAMA, Y. (1983) J. Biochem. 93, 807-817.
40. WANG, A. H.-J., QUIGLEY, G. J., KOLPAK, F. J., CRAWFORD, J. L., VAN BOOM, J. H., VAN DER MAREL, G., AND RICH, A. (1979) Nature (London) 282, 680-686.
41. NORDHEIM, A., AND RICH, A. (1983) Nature (London) 303, 674-679.
42. BREATHNACH, R., AND CHAMBON, P. (1981) Annu. Rev. Biochem. 50, 349-383.
43. WEIHER, H., KONIG, M., AND GRUSS, P. (1983) Science (Washington, D. C.) 219, 626-631.
44. WALZ, F. G., JR., VLASUK, C. P., OMIECINSKI, C. J., BRESNICK, E., THOMAS, P. E., RYAN, D. E., AND LEVIN, W. (1982) J. Biol. Chem. 257, 4023-4026.
45. MIZUKAMI, Y., FUJII-KURIYAMA, Y., AND MURAMATSU, M. (1983) Biochemistry 22, 1223-1229.
46. SIMMONS, D. L., AND KASPER, C. B. (1983) J. Biol. Chem. 258, 9585-9588.
47. RAMPERSAUD, A., AND WALZ, F. G., JR. (1983) Proc. Natl. Acad. Sci. USA 80, 6542-6546.
48. HAMADA, H., PETRINO, M. G., AND KAKUNAGA, T. (1982) Proc. Natl. Acad. Sci. USA 79, 6465-6469.
49. KHOURY, G., AND GRUSS, P. (1983) Cell 33, 313-314.
50. BANERJI, J., OLSON, L., AND SCHAFFNER, W. (1983) Cell 33, 729-740.
51. GILLIES, S. D., MORRISON, S. L., OI, V. T., AND TONEGAWA, S. (1983) Cell 33, 717-728.
52. OSTROWSKI, M. C., HUANG, A. L., KESSEL, M., WOLFORD, R. G., AND HAGER, G. L. (1984) EMBO J. 3, 1891-1899.
53. GILLIES, S. D., FOLSOM, V., AND TONEGAWA, S. (1984) Nature (London) 310, 594-597.
54. NAKAMURA, M., NEGISHI, M., ALTIERI, M., CHEN, Y-T., IKEDA, T., TUKEY, R. H., AND NEBERT, D. W. (1983) Eur. J. Biochem. 134, 19-25.
55. KIMURA, S., GONZALEZ, F. J., AND NEBERT, D. W. (1984) Nucleic Acids Res. 12, 2917-2928.
56. KAWAJIRI, K., GOTOH, O., SOGAWA, K., TAGASHIRA, Y., MURAMATSU, M., AND FUJII-KURIYAMA, Y. (1984) Proc. Natl. Acad. Sci. USA 81, 1649-1653.
57. GILBERT, W. (1978) Nature (London) 271, 501.
58. LAGRIMINI, M. L., BRENTANO, S. T., AND DONELSON, J. E. (1984) Nucleic Acids Res. 12, 605-614.