Gene 294 (2002) 239–247 www.elsevier.com/locate/gene
Genomic structure and identification of a truncated variant message of the mouse estrogen receptor α gene
Deborah L. Swope*, J. Chuck Harrell, Dipak Mahato, Kenneth S. Korach
Receptor Biology Section, Laboratory of Reproductive and Developmental Toxicology, National Institute of Environmental Health Sciences, 111 T.W. Alexander Drive, Room E460, Research Triangle Park, NC 27709, USA
Received 7 February 2002; received in revised form 6 June 2002; accepted 19 June 2002
Received by K. Gardiner
Abstract

Estrogen receptor alpha (ERα) is a ligand-dependent transcription factor that directs the transcription of a large number of estrogen-regulated genes. ERα mediates the effects of 17-β-estradiol in both males and females, and was the first estrogen receptor identified. Despite the cloning of the mouse ERα cDNA over 15 years ago, the precise genomic organization of the mouse ERα gene has not yet been elucidated. In order to determine the structure of this gene, overlapping BAC and P1 clones containing partial genomic sequences of the mouse ERα cDNA were obtained from a mouse ES cell genomic library. Using standard restriction fragment analysis followed by Southern blotting, the mouse ERα gene was determined to be greater than 220 kb in length. The introns vary widely in size, from 1.8 to 60 kb in length. Sequencing of intron–exon boundaries shows that these boundaries are highly conserved between the human and mouse ERα genes. Additionally, we have identified a splice variant message of mouse ERα arising from a failure to properly splice at the 3′ end of exon 4; the resulting message is predicted to produce a protein lacking the ligand-binding domain. Variant message was detected by RT-PCR in several tissues, including uterus, ovary, mammary gland, placenta and testis. Published by Elsevier Science B.V.

Keywords: Estradiol; Splice variant; Steroid hormone receptor; Transcription
Abbreviations: αERKO, estrogen receptor alpha knockout; AF-2, activation function-2; BAC, bacterial artificial chromosome; bp, base pair(s); cDNA, complementary DNA; DBD, DNA binding domain; ERα, estrogen receptor alpha; ERβ, estrogen receptor beta; EST, expressed sequence tag(s); kDa, kilodalton; LBD, ligand binding domain; PCR, polymerase chain reaction; RACE, rapid amplification of cDNA ends; RT–PCR, reverse transcription–PCR; UTR, untranslated region
* Corresponding author. Tel.: +1-919-541-3860; fax: +1-919-541-7935. E-mail address: [email protected] (D.L. Swope).
0378-1119/02/$ - see front matter. Published by Elsevier Science B.V. PII: S0378-1119(02)00796-5

1. Introduction

The mouse estrogen receptor alpha (ERα) is a member of the large superfamily of steroid hormone receptors that show a high degree of conservation at the protein level. Generally, these receptors contain a variable amino-terminal A/B domain; a highly conserved C domain, which contains two zinc fingers and is required for DNA binding; the hinge, or D domain; and a large hydrophobic E domain, which contains the dimerization domain, binds steroids and also contains the major ligand-dependent activation domain (Weatherman et al., 1999; Escriva et al., 2000). The greatest degree of homology between receptors in this family lies in the DNA-binding domain (DBD), and in the ligand-responsive AF-2 (activation function-2) domain, which is contained within the ligand-binding domain (LBD) and interacts with coactivators to activate transcription when ligand is bound (Weatherman et al., 1999). Nuclear hormone receptors regulate transcription of a variety of genes involved in reproduction, metabolism, and growth and differentiation in response to hormonal ligands as well as to ligand-independent stimuli. Estrogen receptors have been identified in species from Caenorhabditis elegans to human (Hood et al., 2000; Escriva et al., 2000). In a number of species, two estrogen receptors (ERα and ERβ) have been identified. In vitro comparisons of ERα versus ERβ show that while ERα and ERβ are highly related, they differ in the ability to activate transcription through their amino-terminal and carboxyl-terminal domains (Cowley and Parker, 1999). Also, while these two receptors are often found expressed in a single tissue, there is evidence that their expression may not be overlapping in many cell types, suggesting that their functions may not be redundant (Jefferson et al., 2000; Skynner et al., 1999). In vivo studies of the αERKO mouse (estrogen receptor alpha knockout) model demonstrated that ERα is critical for fertility in females and
D.L. Swope et al. / Gene 294 (2002) 239–247
males, and for the uterotropic response to estradiol in females (Korach, 1994; Couse and Korach, 1999). Studies of ERβ knockout animals have shown that while the mice exhibit a reproductive phenotype, they are still fertile, suggesting that ERβ is not required for some aspects of reproduction (Krege et al., 1998). The genomic structure of the upstream noncoding exons of the mouse ERα gene has been described by others (Kos et al., 2000); these studies showed that several splice variants exist within the 5′-UTR of mouse ERα, revealing the use of multiple tissue-specific promoters for transcription of mERα. However, despite the cloning of mouse ERα cDNA sequence in 1987 and the prevalent use of the αERKO mouse model to study the role of ERα in vivo, the genomic organization of the coding exons of the mouse ERα gene has not yet been described. The lack of this information has presented difficulties for a number of studies, including the study of alternative splicing of the coding regions of ERα in mouse and comparison to the known splice variants of ERα in human (Lu et al., 1999). Here we detail the genomic structure of mouse ERα, the intron/exon boundaries of this gene, and comparison to the genomic structure of the human ERα gene. In addition, we describe a novel splice variant of mouse ERα message lacking sequences coding for the ligand-binding domain.
2. Materials and methods

2.1. Screening for ERα genomic clones

Three mouse genomic clone libraries (P1 129/Ola, BAC 129/SvJ, and BAC C57Bl/6) were screened by PCR and hybridization by Incyte Genomics (St. Louis, MO) for clones containing the coding exons 1–8 of mouse ERα and the non-coding exon C (Kos et al., 2000). For purposes of simplicity, exons are numbered as in the human gene. Sequences of primer pairs used for PCR screening and generation of fragments for hybridization screening are as follows; note that exon 1 was not used for screening due to repetitive sequences. Nucleotide sequence numbers are from GenBank accession number M38651. 5′-UTR (includes exon C) primers: forward primer (nt 2351 to 2324) 5′-GCACACTTTGACTGGCATTC-3′, reverse primer (nt 60–80) 5′-GGTTTGTAGAAGTCAAGGGC-3′; exon 2 primers: forward primer (nt 660–680) 5′-ATTCTGACAATCGACGCCAGA-3′, reverse primer (nt 820–794) 5′-CTTGCAGCCTTCGCAGGACCAGACCCC-3′; exon 3 primers: forward primer (nt 863–880) 5′-TGTCCAGCTACAAACCAA-3′, reverse primer (nt 940–920) 5′-GTAACACTTGCGCAGCCGACA-3′; exon 4 primers: forward primer (nt 980–1000) 5′-CGAGGAGGGAGAATGTTGAAG-3′, reverse primer (nt 1273–1253) 5′-CATATGAACCAGCTCCCTATC-3′; exon 5 primers: forward primer (nt 1300–1319) 5′-CTTTGGGGACTTGAATCTCC-3′, reverse primer (nt 1437–1415) 5′-CTGTCCAGGAGCAAGTTAGGAGC-3′; exon 6 primers: forward primer (nt 1442–1466) 5′-CAAGGTAAATGTGTGGAAGGCATGG-3′, reverse primer (nt 1555–1532) 5′-GATGGATTTGAGGCACACAAACTC-3′; exon 7 primers: forward primer (nt 1567–1587) 5′-TTCCGGAGTGTACACGTTTCT-3′, reverse primer (nt 1719–1700) 5′-TGAGCTAGGCGGCGATGCTGC-3′; exon 8 primers: forward primer (nt 1768–1785) 5′-GGAGCATCTCTACAACAT-3′, reverse primer (nt 1879–1862) 5′-GGGCACTCCCATGCGACT-3′.

2.2. DNA isolation and Southern blotting of genomic clones

Clones from BAC and P1 libraries positive for mouse ERα exon sequences were obtained from Incyte Genomics. DNA for further analysis was prepared using the Kb100 Magnum Genomic DNA preparation kit (Incyte Genomics). All genomic clones were restriction digested and analysed by Southern blotting. Genomic DNA from 129/SvJ mice was obtained from Incyte Genomics and subjected to restriction analysis followed by Southern blot analysis to verify results obtained from the BAC and P1 clones.

2.3. Sequencing of BAC and P1 clones

Exon/intron boundaries were determined by sequencing using the ThermoSequenase Thermocycle Sequencing Kit (USB). Briefly, 500 ng of purified BAC or P1 clone DNA was used as template. Reactions were cycled according to manufacturer's instructions for 30 cycles, then run on sequencing gels (Invitrogen). A minimum of 110 bp of sequence was obtained for each analysis, and each intron/exon boundary was independently sequenced at least two times. Sequencing primers used were as follows. 5′-UTR (splice junction to promoters upstream of exon 1): 3′-end (nt 92–113) 5′-TCTGGAAAGACGCTCTTGAACC-3′. Intron 1: 5′-end (nt 591–610) 5′-TACTACCTGGAGAACGAGCC-3′; 3′-end (nt 680–657) 5′-TCTGGCGTCGATTGTCAGAATTAG-3′. Intron 2: 5′-end (nt 812–838) 5′-GGCTGCAAGGCTTTCTTTAAGAGAAGC-3′; 3′-end (nt 883–862) 5′-GCATTGGTTTGTAGCTGGACAC-3′. Intron 3: 5′-end (nt 921–941) 5′-GTCGGCTGCGCAAGTGTTACG-3′; 3′-end (nt 1001–979) 5′-GCTTCAACATTCTCCCTCCTCGG-3′. Intron 4: 5′-end (nt 1257–1283) 5′-GGGAGCTGGTTCATATGATCAACTGGG-3′; 3′-end (nt 1361–1234) 5′-GAATCTCCAGCCAGGCACACTCGAGAAG-3′. Intron 5: 5′-end (nt 1405–1423) 5′-GCTCCTGTTTGCTCCTAAC-3′; 3′-end (nt 1474–1452) 5′-GATCTCCACCATGCCTTCCACAC-3′. Intron 6: 5′-end (nt 1509–1532) 5′-GCATGATGAACCTGCAGGGTGAAG-3′; 3′-end (nt 1719–1700) 5′-TGAGCTAGGCGGCGATGCTGC-3′. Intron 7: 5′-end (nt 1567–1587) 5′-TTCCGGAGTGTACACGTTTCT-3′; 3′-end (nt 1817–1798) 5′-GGTCATAGAGGGGCACAACG-3′.
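Several of the primers listed in Section 2 cover overlapping cDNA coordinates on opposite strands, so their sequences can be cross-checked against one another by reverse complementation. The following is a minimal Python sketch of such a check, not part of the original methods. The function and variable names are illustrative assumptions; only the primer sequences and nucleotide coordinates are taken from the text.

```python
# Sketch: cross-check primers that cover overlapping cDNA coordinates on
# opposite strands. Names are illustrative; sequences are from Section 2.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Intron 3 5'-end sequencing primer, sense strand, nt 921-941.
SENSE_921_941 = "GTCGGCTGCGCAAGTGTTACG"
# Exon 3 reverse screening primer, antisense strand, nt 940-920.
ANTISENSE_940_920 = "GTAACACTTGCGCAGCCGACA"

# Exon 4 forward screening primer, sense strand, nt 980-1000.
SENSE_980_1000 = "CGAGGAGGGAGAATGTTGAAG"
# Intron 3 3'-end sequencing primer, antisense strand, nt 1001-979.
ANTISENSE_1001_979 = "GCTTCAACATTCTCCCTCCTCGG"


def shared_921_940() -> bool:
    """Compare the two exon 3 primers over their shared region, nt 921-940."""
    as_sense = reverse_complement(ANTISENSE_940_920)  # now reads nt 920-940
    return as_sense[1:] == SENSE_921_941[:-1]


def shared_980_1000() -> bool:
    """Compare the two exon 4 primers over their shared region, nt 980-1000."""
    as_sense = reverse_complement(ANTISENSE_1001_979)  # now reads nt 979-1001
    return as_sense[1:-1] == SENSE_980_1000


if __name__ == "__main__":
    print("exon 3 primers agree:", shared_921_940())
    print("exon 4 primers agree:", shared_980_1000())
```

The same comparison can be applied to any other pair of primers whose stated coordinates overlap.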
2.4. 5′- and 3′-RACE reactions

Total mouse uterine RNA was used in conjunction with the SMART RACE cDNA Amplification Kit (Clontech) according to manufacturer's protocol to isolate the 5′- and 3′-ends of the mERα variant message. The gene-specific primer used in the 3′-RACE reaction was 5′-GTAACACTTGCGCAGCCGACA-3′ (nt 940–920); the gene-specific primer for the 5′-RACE reaction was 5′-CATCTCCCTAATCCCATAGC-3′. Fragments derived from the RACE reactions were subcloned and sequenced.

2.5. Northern blots

Two micrograms of mouse poly(A)+ RNA (Clontech) from ovary, testis, brain, heart, lung and embryo were electrophoresed on a 1.2% agarose gel, transferred to BrightStar nylon membrane (Ambion) overnight and UV crosslinked. Blots were prehybridized in UltraHyb hybridization solution (Ambion) at 42 °C overnight, probed overnight with a DNA probe labeled using asymmetric PCR against a portion of exon 4 and the variant sequence (nt 1257 to variant nt 1363) at 42 °C, washed according to protocols provided by the membrane manufacturer, and analysed by autoradiography.

2.6. Isolation of total RNA

Total RNA was isolated from C57Bl/6 mouse tissues using Trizol (Invitrogen) according to manufacturer's protocol. Briefly, tissue was collected from mice and flash frozen in liquid nitrogen. Tissue was homogenized in Trizol reagent and extracted with chloroform. RNA was precipitated from the aqueous phase in isopropanol, dried and resuspended in RNase-free water for further analysis.
2.7. RT–PCR analysis

RNA was treated to remove contaminating DNA using the DNA-Free kit (Ambion); comparison of these samples to untreated samples showed no difference in results for RT–PCR analysis. All results shown here are from samples not treated with DNA-Free. cDNA was generated using reverse transcriptase and 500 ng of total RNA. Each 50-µl reaction contained 25 mM MgCl2, 1× PE buffer II (Applied Biosystems), 10 mM dNTPs, 1.25 µl RNase inhibitor, 1.25 µl random hexamers, 2.5 µl MuLV Reverse Transcriptase (Applied Biosystems) and RNA. RT–PCR was then carried out by adding 8 µl of the RT reaction to forward and reverse gene-specific PCR primers, PCR buffer, Platinum Taq polymerase (Invitrogen), 10 mM dNTPs, and water to a total of 15 µl per reaction. The common forward primer was anchored in exon 3 (nt 921–941; 5′-GTCGGCTGCGCAAGTGTTACG-3′); reverse primers were located in exon 5 for full-length message (nt 1437–1415; 5′-CTGTCCAGGAGCAAGTTAGGAGC-3′) and within the variant sequence (variant nt 1401–1382; 5′-CATCTCCCTAATCCCATAGC-3′). All PCR reactions were designed to span at least one intron to eliminate products arising from genomic DNA contamination. Reactions were cycled for 20, 25, 30 and 35 cycles and analysed by gel electrophoresis followed by transfer to BrightStar nylon membrane (Ambion), and Southern blotting to detect the PCR fragments. Product sizes are as follows: full-length ERα, 516 bp product; variant, 480 bp product; and 18S control, 324 bp product (see Fig. 5). An oligo common to both the full-length and variant sequences (nt 980–1000; 5′-CGAGGAGGGAGAATGTTGAAG-3′) located within exon 4 was end-labeled using [γ-33P]ATP and T4 polynucleotide kinase. To confirm that the products were indeed those of the full-length and variant, blots were stripped and then probed with a full-length specific probe, followed by probing with a variant specific probe. RT–PCR products were detected using RapidHyb (Amersham) and oligo hybridization protocols as provided by the manufacturer. The QuantumRNA Classic II 18S Internal Standards (Ambion) were used as a PCR control to normalize reactions, as 18S message is expressed at a constant level and can be used as a control between different tissue and cell types. 18S product was probed with an 18S specific primer (GenBank accession number X00686; oligo probe (nt 1281–1302) 5′-GACAGGATTGACAGATTGATAG-3′) labeled as above. All experiments included 'no template' (in the RT reaction) and 'no RT' samples to demonstrate that the signals were not derived from genomic DNA. Southern blots were exposed to PhosphorImager screens and visualized on a Storm PhosphorImager (Molecular Dynamics); relative expression was determined by volume analysis as the ratio of the intensity of the variant band to that of the full-length band.

3. Results

3.1. Genomic organization of the mouse ERα gene
Genomic clones containing multiple exons of the coding regions of mouse ERα cDNA sequence were identified in BAC and P1 mouse genomic libraries by PCR and hybridization screening (Genome Systems, Inc./Incyte Genomics, St. Louis, MO) using primers described in Section 2. Eight overlapping clones positive by both PCR and hybridization for the selected exons of mouse ERα were chosen for restriction mapping. No single BAC clone contained the entire mouse ERα sequence. A total of 885 kb of overlapping sequence was subsequently analysed by restriction enzyme digestion. Digestion products were analysed by Southern blotting using probes to the predicted exons of mouse ERα to determine the total size and genomic structure of the mouse ERα gene. These results were subsequently confirmed by restriction digestion and Southern blotting of whole genomic mouse 129/SvJ DNA in order to rule out errors that may have arisen during generation of BAC clones.
Fig. 1. Genomic structure of the mouse ERα gene. The structure of the coding regions of mouse ERα gene is shown, along with sizes of intervening introns. Structure of the cDNA and domain structure of the mouse ERα protein are shown for reference. Sequence of the mouse ERα has been previously described (White et al., 1987).
From these analyses, we determined that the region of the mouse ERα gene encompassing the eight coding exons and the untranslated exon C (Kos et al., 2000) spans a minimum of 220 kb (Fig. 1). The mouse ERα gene is substantially larger than the mouse ERβ gene, which contains eight exons in a total of 40 kb (Enmark et al., 1997). Studies of the ERα homolog in other species reveal that fish ERα genes are much smaller, in the range of 30–40 kb (Le Roux et al., 1993). However, sequencing of the human ERα gene by the human genome project reveals that the human gene is well over 250 kb in length (GenBank accession number NT_007295).
3.2. Intron/exon structure of the mouse ERα gene

Analysis of genomic clones containing at least two exons ensured complete coverage of the ERα gene as well as the presence of at least one intact intron in each clone, enabling us to accurately analyse the size of the introns. Southern blotting analysis of these clones revealed that introns range in size from 1.8 to 60 kb (Fig. 1). The larger introns are located in the middle of the gene, with the smaller introns at the 5′- and 3′-ends of the gene. Taken together, the central introns account for over one-half of the overall length of the gene. There is one major structural difference noted when comparing the mouse and human genes; this difference is in the size of the intron that lies between mouse exons 6 and 7. The mouse intron is approximately 6 kb in size, while the human intron is 33.4 kb. Note that for purposes of simplifying comparison to the human gene, the mouse exons here have been numbered the same as the homologous exons in the human gene.

We also studied the intron/exon boundaries of this gene. The boundaries were initially estimated by comparing mouse ERα cDNA sequence to human ERα cDNA, for which the intron/exon boundaries had been previously determined (Ponglikitmongkol et al., 1988). Sequencing primers were generated that flanked the predicted locations of intron/exon boundaries in the mouse gene. Boundaries were identified by sequence comparison to predicted donor-acceptor splice sites, comparison to the cDNA sequence, and by comparison to the human ERα intron/exon boundaries. Sequencing of flanking regions of mERα reveals that intron/exon boundaries are well conserved between the mouse and human ERα genes (Fig. 2). This level of conservation is maintained despite separation of splice sites by even the largest introns of this gene.

3.3. Identification of a truncated splice variant in mERα
Fig. 2. Intron/exon junctions in mouse vs. human ERα genes. Sequences flanking the intron/exon boundaries of mouse ERα were determined by comparison to known boundaries in the human gene, and by sequencing of BAC clones. Sequencing primers are described in Section 2. Exonic sequences are capitalized; intronic sequences are shown in lowercase. Splice junctions that conform to the canonical AG-GT rule are underlined.

In examining known mouse ERα cDNA sequences, we noted a number of cDNA ESTs (expressed sequence tags) in the databases that contained partial mouse ERα sequences. Several of the ESTs were found to contain sequences from mERα exon 4, but the sequence that followed did not correspond to any known sequence in the mERα cDNA. We explored the possibility that these EST sequences were derived from splice variants of mouse ERα. 3′-RACE (rapid amplification of cDNA ends) reactions using a 5′ forward primer anchored in exon 3 were used to isolate the 3′-end of the variant. Subsequent sequencing of the 3′-end demonstrated that the cDNA isolated by RACE was identical to an EST from mouse four-cell embryo containing exon 4 of mouse ERα. This exon 4 sequence was followed by a unique sequence that is identical to the genomic sequences in the intron following exon 4 (Fig. 3; GenBank accession number AU041270), suggesting a failure to properly splice exon 4 to exon 5. Our RACE analysis also confirmed that this transcript contains a consensus polyadenylation signal, a CA dinucleotide that normally precedes the poly(A) tail, and a poly(A) tract. This variant is therefore composed of mouse ERα exons 1–4, and the 3′-end of the variant message is then derived from intronic sequence that contains several stop codons and a polyadenylation signal, which is followed by a poly(A) tract that is not contained within the mERα genomic sequences, suggesting that this transcript is genuinely polyadenylated (Fig. 3). These analyses predict a message of at least 1.7 kb in length.

Subsequent database searches determined there were several ESTs that matched the variant sequence (GenBank accession numbers BB097271, BB200714, AV311149). Since a number of these ESTs were derived from mouse embryos, we used Northern blotting with a probe against a portion of exon 4 plus the variant sequence to detect the variant message in embryonic poly(A)+ RNA. Fig. 4 shows a Northern blot confirming the RACE results, demonstrating that this transcript is approximately 1.8 kb in length. However, in order to visualize this transcript, we were required to expose the blot for an extended period, suggesting that this transcript is not expressed at a high level.

In order to determine the tissue distribution of this variant, we used a more sensitive assay, RT–PCR, to detect the variant message in mouse tissues. A modified triple primer RT–PCR analysis followed by Southern blotting (Gallacchi et al., 1998) was used to detect the messages for the full-length ERα (516 bp product), variant (480 bp product), and 18S control (324 bp product) in a number of mouse tissues. By this method, the variant message was observed in a number of tissues, including
uterus, ovary, mammary gland, placenta and liver (Fig. 5). Full-length message was also detected in these tissues. Expression of this variant message relative to the full-length message was determined using a probe common to both products; since this probe binds to the full-length and variant ERα products with the same affinity, their relative expression levels were determined by calculating the ratio of variant-to-full-length product intensities using a PhosphorImager and ImageQuant software (Molecular Dynamics). Using this method to quantify the expression levels, the variant levels were seen to vary from almost 40% of full-length message in placenta to extremely low levels in hypothalamus (levels are detailed in the legend to Fig. 5). In general, the variant message was expressed at higher levels (relative to the full-length message) in female reproductive tissues than in male reproductive tissues.
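The variant-to-full-length ratio described above is a simple quotient of band volumes, and the comparison between tissue groups can be reproduced from the percentages given in the legend to Fig. 5. The following Python sketch illustrates both steps. The function names and the band-volume arguments are hypothetical, and grouping ovary, uterus, mammary gland and placenta as female reproductive tissues (and testis and prostate as male reproductive tissues) is an assumption made here for illustration; only the percentages come from the Fig. 5 legend.

```python
# Sketch: relative expression as a ratio of band volumes, plus a group
# comparison using the percentages reported in the legend to Fig. 5.

# Variant message as a percentage of full-length message (Fig. 5 legend).
FEMALE = {
    "ovary": 27.2, "uterus": 26.0, "liver": 10.0, "kidney": 7.2,
    "mammary gland": 18.8, "hypothalamus": 2.5, "placenta": 39.7,
    "heart": 15.7,
}
MALE = {"testis": 14.3, "prostate": 14.1, "heart": 3.0, "liver": 13.7}

# Assumption: this grouping is illustrative, not defined by the authors.
FEMALE_REPRODUCTIVE = ("ovary", "uterus", "mammary gland", "placenta")
MALE_REPRODUCTIVE = ("testis", "prostate")


def relative_expression(variant_volume: float, full_length_volume: float) -> float:
    """Variant band volume as a percentage of the full-length band volume."""
    if full_length_volume <= 0:
        raise ValueError("full-length band volume must be positive")
    return 100.0 * variant_volume / full_length_volume


def mean_percent(levels: dict, tissues: tuple) -> float:
    """Arithmetic mean of the reported percentages for the named tissues."""
    return sum(levels[t] for t in tissues) / len(tissues)


if __name__ == "__main__":
    print("female reproductive mean:", mean_percent(FEMALE, FEMALE_REPRODUCTIVE))
    print("male reproductive mean:", mean_percent(MALE, MALE_REPRODUCTIVE))
    print("highest female tissue:", max(FEMALE, key=FEMALE.get))
```

Because the same probe detects both products, no correction for probe affinity is applied in this sketch; the band volumes are compared directly.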
4. Discussion

This study presents the first description of the complete genomic organization of the coding region of the mouse estrogen receptor alpha (mERα) gene. From the beginning of exon 1 to the end of exon 8, the gene encompasses a minimum of 220 kb. This is the largest nuclear receptor gene mapped in mouse to date. Analysis of the mouse ERα splice site junctions revealed that despite the large sizes of the mouse ERα introns, splice
Fig. 3. The truncated splice variant of mouse ERα is formed by failure to splice at the exon 4 donor site. Sequence of the 3′-end of the ERα splice variant as determined by 3′-RACE analysis is shown. Exon 4 sequence is indicated; the bold arrow delineates the border between the end of exon 4 and the variant sequence. The first stop codon (TAA) is also bold. The consensus polyadenylation signal is highlighted by the box; the CA dinucleotide is bold and underlined. The forward and reverse primers used to amplify RT–PCR reactions in Fig. 5 are italicized and boldfaced. The reverse primer was also used to confirm the sequence of the 5′-end of the variant by 5′-RACE analysis.
Fig. 4. Variant message is expressed in embryonic mouse tissue. Poly(A)+ RNA from mouse embryo was probed for the variant transcript as detailed in Section 2. A message of approximately 1.8 kb (as predicted by RACE analysis) was detected using a variant specific probe.
sites are well conserved from mouse to human. The mouse introns generally are larger than the published human intron sizes; however, the human gene was mapped using clones that were mostly non-overlapping, so the exact size of the gene was unknown until the recent release of the sequence of the human genome (Ponglikitmongkol et al., 1988). Human ERα gene sequences identified by the human genome project suggest that the human gene is a minimum of 290 kb in length. Previous studies by others detailing the genomic mapping of the 5′ untranslated regions of the mouse, rat and human ERα genes have shown that there are several exons upstream of the eight coding exons of the ERα gene, called exons A, B, C, F1, F2, H in the mouse (Kos et al., 2000; Schuur et al., 2001). These upstream exons are alternatively spliced to a common acceptor site on exon 1 of both genes
(the first coding exon). This alternative splicing is tissue-specific and represents usage of a minimum of five different promoters to transcribe the ERα gene in mouse, human and rat. The most commonly used exons in mouse and human are exons C (also known as exon 1′ in human; Schuur et al., 2001) and F (Kos et al., 2000); our studies presented here include exon C in our maps and use exons C and 1′ for comparison of the splice junctions in Fig. 2. It is interesting that the large introns in the ERα genes are conserved in the mouse and human genes. The presence of large introns in a gene has been demonstrated to slow the rate of transcription of these genes (O'Farrell, 1992), adding an additional level of gene regulation. In addition, several nonconsensus splice site junctions (canonical splice site = AG-GT) are conserved in the mouse and human genes (Burset et al., 2000). One of these nonconsensus splice junctions occurs between the last two exons in mouse and human ERα, and interestingly, the sequence of this junction is conserved in mouse and human ERβ as well, despite little general conservation at the nucleotide level between the ERα and ERβ genes. The presence of this conserved and unusual splice site, which occurs just upstream of the estrogen-dependent AF-2 activation domain, may represent an evolutionary preservation of this region. In addition, nonconforming splice sites have been described in only a small number of genes (Shapiro and Senapathy, 1987). It has been suggested that nonconsensus splice sites occur in genes critical to regulation of cell growth or differentiation. They may signal the use of alternative splicing mechanisms that differ from the normal mechanism in their accuracy, or they may act as a rate-limiting step to control the levels of gene expression where tight control is needed (Shapiro and Senapathy, 1987). Since loss of appropriate expression of
Fig. 5. RT–PCR analysis of full-length and variant message. Full-length and variant messages were detected by a modified triple-primer RT–PCR assay (Leygue et al., 1999; Gallacchi et al., 1998) followed by Southern blotting using a forward primer anchored in exon 3 (5′-GTCGGCTGCGCAAGTGTTACG-3′); reverse primers were located in exon 5 for full-length message (5′-GATGGATTTGAGGCACACAAACTC-3′) and within the variant sequence (5′-CATCTCCCTAATCCCATAGC-3′). 18S message was used as control for loading. An oligo located within exon 4 common to both sequences was used to detect both full-length and variant messages (5′-CGAGGAGGGAGAATGTTGAAG-3′). Expression levels of the variant relative to full-length ERα message are as follows. Female tissues: ovary, 27.2%; uterus, 26%; liver, 10%; kidney, 7.2%; mammary gland, 18.8%; hypothalamus, 2.5%; placenta, 39.7%; heart, 15.7%. Male tissues: testis, 14.3%; prostate, 14.1%; heart, 3%; liver, 13.7%.
Fig. 6. Proposed structure of the truncated variant ERα protein compared to full-length ERα protein. Translational start site is delineated by an arrow; stop codon in full-length ERα protein is shown by an asterisk. Functional domains are shown (A–E) for reference. The unique sequence added by the variant is shown by the black striped region.
ERα in mice causes male and female infertility (Lubahn et al., 1993), as does overexpression of ERα (Couse et al., 1997), alternative splice site junctions and large introns may provide additional mechanisms for regulating levels of this gene product. A number of alternatively spliced transcripts have been identified to date for the human ERα message. A number of human variants have been described only in tumor-derived cell lines, but several have been detected in normal human tissue (Pfeffer et al., 1996; Lu et al., 1999). Variants have been identified in the 5′-UTR of the mouse ERα gene (Kos et al., 2000); these are expressed in a tissue-specific manner and represent tissue-specific usage of promoter sequences to regulate the transcription of the mERα gene. Another recent report showed a low but detectable level of an ERα splice variant lacking mouse exon 4 expressed in mammary, ovarian, uterine and other tissues (Lu et al., 1999). In this report, we have identified another alternatively spliced variant of mouse ERα. Identification of a number of ESTs of mouse ERα in the GenBank database containing mouse ERα exon 4 sequences spliced to unknown sequences led us to investigate the possibility that these ESTs represented a previously unidentified alternative splice product. RACE analysis was used to generate the complete cDNA of this variant message; further analysis of the transcripts derived by RACE revealed that failure to properly splice at the 3′ end of exon 4 caused intronic sequence to be incorporated at the 3′-end of exon 4, thereby deleting exon 5 and all further downstream exons. Interestingly, the sequence of this splice site is not unusual and conforms to the canonical AG-GT splice site. The resulting alternative message contains a number of stop codons downstream of the 3′-end of exon 4, which would result in a protein of 42.3 kDa lacking most of the ligand-binding domain (Fig. 6).
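The way a retained intron truncates the open reading frame can be illustrated with a short Python sketch: translation continues in the exon's reading frame into the intronic sequence until the first in-frame stop codon. The sequences below are invented toy examples, not mouse ERα sequence, and the function name is a hypothetical choice made for illustration.

```python
# Sketch: locate the first in-frame stop codon when an intron is retained.
# The sequences are toy examples, NOT mouse ERalpha sequence.

STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_in_frame_stop(transcript: str):
    """Return the 0-based index of the first in-frame stop codon, or None.

    The reading frame is assumed to start at index 0 of `transcript`.
    """
    seq = transcript.upper()
    for start in range(0, len(seq) - 2, 3):
        if seq[start:start + 3] in STOP_CODONS:
            return start
    return None


# Hypothetical coding exon ending on a complete codon (ATG GCC AAG).
TOY_EXON = "ATGGCCAAG"
# Hypothetical retained intron beginning with the canonical GT donor.
TOY_INTRON = "GTAAGTTAATAG"

if __name__ == "__main__":
    read_through = TOY_EXON + TOY_INTRON
    stop = first_in_frame_stop(read_through)
    intron_codons = (stop - len(TOY_EXON)) // 3
    print("stop codon index:", stop)
    print("intron-derived codons before the stop:", intron_codons)
```

In the toy read-through transcript, the TAA embedded in the GTAAGT donor sequence is out of frame and is skipped; only a stop codon in the exon's reading frame terminates the predicted protein.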
Using RT–PCR analyses, we found the variant message expressed most highly in the reproductive tissues in normal C57Bl/6 mice, although it was also expressed in other tissues (Fig. 5). An ERα message lacking the ligand-binding domain was previously identified in human tissues (Fuqua et al., 1991; Castles et al., 1993). However, the mechanism for producing this protein does not involve failure to splice or splicing of an alternative exon, but instead deletes exon 5 in the human gene (Fuqua et al., 1991). This alternative splicing event produces a protein that lacks the ligand-binding domain, as the splice of exon 4 to exon 6 results in a frameshift, introducing stop codons. This message was identified initially in cell lines, but was subsequently found in normal tissues, suggesting that it is not a mutation arising only in cancer cells (Pfeffer et al., 1996; Chaidarun and Alexander, 1998). Further study of the possible function of this protein by transfection analysis demonstrated that although the truncated protein had no transcriptional activity by itself, when cotransfected with full-length ERα, it is able to either augment or repress the estradiol-mediated transcriptional activity of the full-length protein in a cell-type dependent manner (Bollig and Miksicek, 2000). Studies to determine whether the truncated mouse ERα message described here functionally represents a mouse homolog of the exon 5-deleted human ERα are ongoing in our laboratory. Alternatively, expression of such a truncated protein would produce a form of ERα which is responsive specifically to ligand-independent activation, as exemplified by growth factor mitogen-activated protein kinase signaling (Ignar-Trowbridge et al., 1993; Kato et al., 1995). Such expression could provide alternative ER-mediated signal transduction mechanisms towards manifesting estrogen-specific responses in target tissues.
Acknowledgements The authors would like to thank Sylvia Hewitt, John Couse and Mariana Yates for technical advice and assistance with Northern blots. We are grateful to Drs. Diane Klotz and Kimberly McAllister for critical reading of the manuscript.
References
Bollig, A., Miksicek, R.J., 2000. An estrogen receptor-alpha splicing variant mediates both positive and negative effects on gene transcription. Mol. Endocrinol. 14, 634–649.
Burset, M., Seledtsov, I.A., Solovyev, V.V., 2000. Analysis of canonical and non-canonical splice sites in mammalian genomes. Nucleic Acids Res. 28, 4364–4375.
Castles, C.G., Fuqua, S.A., Klotz, D.M., Hill, S.M., 1993. Expression of a constitutively active estrogen receptor variant in the estrogen receptor-negative BT-20 human breast cancer cell line. Cancer Res. 53, 5934–5939.
Chaidarun, S.S., Alexander, J.M., 1998. A tumor-specific truncated estrogen receptor splice variant enhances estrogen-stimulated gene expression. Mol. Endocrinol. 12, 1355–1366.
Couse, J.F., Korach, K.S., 1999. Reproductive phenotypes in the estrogen receptor-alpha knockout mouse. Ann. Endocrinol. (Paris) 60, 143–148.
Couse, J.F., Davis, V.L., Hanson, R.B., Jefferson, W.N., McLachlan, J.A., Bullock, B.C., Newbold, R.R., Korach, K.S., 1997. Accelerated onset of uterine tumors in transgenic mice with aberrant expression of the estrogen receptor after neonatal exposure to diethylstilbestrol. Mol. Carcinog. 19, 236–242.
Cowley, S.M., Parker, M.G., 1999. A comparison of transcriptional activation by ER alpha and ER beta. J. Steroid Biochem. Mol. Biol. 69, 165–175.
Enmark, E., Pelto-Huikko, M., Grandien, K., Lagercrantz, S., Lagercrantz, J., Fried, G., Nordenskjold, M., Gustafsson, J.A., 1997. Human estrogen receptor beta-gene structure, chromosomal localization, and expression pattern. J. Clin. Endocrinol. Metab. 82, 4258–4265.
Escriva, H., Delaunay, F., Laudet, V., 2000. Ligand binding and nuclear receptor evolution. Bioessays 22, 717–727.
Fuqua, S.A., Fitzgerald, S.D., Chamness, G.C., Tandon, A.K., McDonnell, D.P., Nawaz, Z., O’Malley, B.W., McGuire, W.L., 1991. Variant human breast tumor estrogen receptor with constitutive transcriptional activity. Cancer Res. 51, 105–109.
Gallacchi, P., Schoumacher, F., Eppenberger-Castori, S., Von Landenberg, E.-M., Kueng, W., Eppenberger, U., Mueller, H., 1998. Increased expression of estrogen-receptor exon-5-deletion variant in relapse tissues of human breast cancer. Int. J. Cancer 79, 44–48.
Hood, T.E., Calabrese, E.J., Zuckerman, B.M., 2000. Detection of an estrogen receptor in two nematode species and inhibition of binding and development by environmental chemicals. Ecotoxicol. Environ. Saf. 47, 74–81.
Ignar-Trowbridge, D.M., Teng, C.T., Ross, K.A., Parker, M.G., Korach, K.S., McLachlan, J.A., 1993. Peptide growth factors elicit estrogen receptor-dependent transcriptional activation of an estrogen-responsive element. Mol. Endocrinol. 7, 992–998.
Jefferson, W.N., Couse, J.F., Banks, E.P., Korach, K.S., Newbold, R.R., 2000. Expression of estrogen receptor beta is developmentally regulated in reproductive tissues of male and female mice. Biol. Reprod. 62, 310–317.
Kato, S., Endoh, H., Masuhiro, Y., Kitamoto, T., Uchiyama, S., Sasaki, H., Masushige, S., Gotoh, Y., Nishida, E., Kawashima, H., Metzger, D., Chambon, P., 1995. Activation of the estrogen receptor through phosphorylation by mitogen-activated protein kinase. Science 270, 1491–1494.
Korach, K.S., 1994. Insights from the study of animals lacking functional estrogen receptor. Science 266, 1524–1527.
Kos, M., O’Brien, S., Flouriot, G., Gannon, F., 2000. Tissue-specific expression of multiple mRNA variants of the mouse estrogen receptor alpha gene. FEBS Lett. 477, 15–20.
Krege, J.H., Hodgin, J.B., Couse, J.F., Enmark, E., Warner, M., Mahler, J.F., Sar, M., Korach, K.S., Gustafsson, J.A., Smithies, O., 1998. Generation and reproductive phenotypes of mice lacking estrogen receptor beta. Proc. Natl. Acad. Sci. USA 95, 15677–15682.
Le Roux, M.G., Theze, N., Wolff, J., Le Pennec, J.P., 1993. Organization of a rainbow trout estrogen receptor gene. Biochim. Biophys. Acta 1172, 226–230.
Leygue, E., Dotzlaw, H., Watson, P.H., Murphy, L.C., 1999. Expression of estrogen receptor β1, β2, and β5 messenger RNAs in human breast tissue. Cancer Res. 59, 1175–1179.
Lu, B., Dotzlaw, H., Leygue, E., Murphy, L.J., Watson, P.H., Murphy, L.C., 1999. Estrogen receptor-alpha mRNA variants in murine and human tissues. Mol. Cell. Endocrinol. 158, 153–161.
Lubahn, D.B., Moyer, J.S., Golding, T.S., Couse, J.F., Korach, K.S., Smithies, O., 1993. Alteration of reproductive function but not prenatal sexual development after insertional disruption of the mouse estrogen receptor gene. Proc. Natl. Acad. Sci. USA 90, 11162–11166.
O’Farrell, P.H., 1992. Developmental biology. Big genes and little genes and deadlines for transcription. Nature 359, 366–367.
Pfeffer, U., Fecarotta, E., Arena, G., Forlani, A., Vidali, G., 1996. Alternative splicing of the estrogen receptor primary transcript normally occurs in estrogen receptor positive tissues and cell lines. J. Steroid Biochem. Mol. Biol. 56, 99–105.
Ponglikitmongkol, M., Green, S., Chambon, P., 1988. Genomic organization of the human oestrogen receptor gene. EMBO J. 7, 3385–3388.
Schuur, E.R., McPherson, L.A., Yang, G.P., Weigel, R.J., 2001. Genomic structure of the promoters of the human estrogen receptor-α gene demonstrate changes in chromatin structure induced by AP2γ. J. Biol. Chem. 276, 15519–15526.
Shapiro, M.B., Senapathy, P., 1987. RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucleic Acids Res. 15, 7155–7174.
Skynner, M.J., Sim, J.A., Herbison, A.E., 1999. Detection of estrogen receptor alpha and beta messenger ribonucleic acids in adult gonadotropin-releasing hormone neurons. Endocrinology 140, 5195–5201.
Weatherman, R.V., Fletterick, R.J., Scanlan, T.S., 1999. Nuclear-receptor ligands and ligand-binding domains. Annu. Rev. Biochem. 68, 559–581.
White, R., Lees, J.A., Needham, M., Ham, J., Parker, M., 1987. Structural organization and expression of the mouse estrogen receptor. Mol. Endocrinol. 1, 735–744.