Molecular & Biochemical Parasitology 128 (2003) 67–75
Analysis of the Brugia malayi HSP70 promoter using a homologous transient transfection system☆
Limin Shu a, Charles R. Katholi b, Tarig Higazi a, Thomas R. Unnasch a,∗
a Division of Geographic Medicine, University of Alabama at Birmingham, BBRB 203, 1530 3rd Avenue South, Birmingham, AL, USA
b Department of Biostatistics, University of Alabama at Birmingham, RPHB 327F, 1530 3rd Avenue South, Birmingham, AL, USA
Received 5 December 2002; received in revised form 19 February 2003; accepted 19 February 2003
Abstract
Biolistic transient transfection of Brugia malayi embryos with constructs driving the expression of a luciferase reporter gene was used to identify regions of the upstream sequence of the heat shock protein 70 (HSP70) gene of B. malayi necessary for transgene expression. Analysis of 1160 nucleotides upstream of the start codon of the HSP70 gene identified several potentially important elements, including putative CAAT and TATA boxes, a core promoter domain, a polypurine stretch, and a spliced leader addition site. Nested deletion analysis of the HSP70 upstream domain mapped the promoter of the HSP70 gene to the region 396 to 31 nucleotides upstream of the start codon. This region encompassed the putative CAAT and TATA boxes and the putative core promoter. Deletion of the putative CAAT box did not result in any diminution of reporter activity, while constructs in which the TATA box or core promoter was deleted retained roughly half of the activity of the undeleted construct. Unlike the native gene, transcripts derived from constructs containing the HSP70 upstream sequences were not trans-spliced. However, incorporation of the 495 nucleotides downstream of the start codon (encompassing exon 1, intron 1 and part of exon 2) resulted in the production of transcripts that were correctly cis- and trans-spliced. Similarly, transcripts from a construct containing the 495 downstream nucleotides, in which most of exon 1 was deleted, were correctly cis- and trans-spliced. This finding suggests that downstream intron sequences, in addition to the spliced leader addition site, are necessary for trans-splicing in B. malayi.
© 2003 Elsevier Science B.V. All rights reserved.
Keywords: Filariasis; Nematode; Transfection; Promoter; Spliced leader
Abbreviations: bp, base pairs; HSP, heat shock protein; luc, luciferase; nt, nucleotides; ORF, open reading frame; RACE, rapid amplification of cDNA ends; SL, spliced leader; TESS, Transcription Element Search System; UTR, untranslated region
☆ Note: Nucleotide sequence data reported in this paper are available in the Genbank™ database under the accession no. AJ508355.
∗ Corresponding author. Tel.: +1-205-975-7601; fax: +1-205-933-5671. E-mail address: [email protected] (T.R. Unnasch).
0166-6851/03/$ – see front matter © 2003 Elsevier Science B.V. All rights reserved. doi:10.1016/S0166-6851(03)00052-5
1. Introduction
Filarial parasites continue to represent a significant public health problem worldwide. Studies of gene regulation and expression in these parasites have been hampered by the difficulty in culturing these organisms and by the lack of classical genetic methods that can be brought to bear on these systems. With the success of the filarial genome project in identifying large numbers of expressed sequence tags in both Brugia malayi and Onchocerca volvulus [1–3], and the ongoing effort to sequence the genome of B. malayi [4], it is important to develop methods to study cis-acting elements that are important for gene expression and gene regulation in these parasites. Given the historical lack of effective methods to genetically manipulate the filaria, it is perhaps not surprising that little is known concerning promoter structure and function in these organisms. Some studies have attempted to address the question of what makes a promoter in filarial parasites by using upstream domains derived from various filarial genes to drive the expression of reporter genes in Caenorhabditis elegans [5]. However, it is not clear that gene regulatory mechanisms will function normally in a heterologous system, making it difficult to draw conclusions about filarial gene regulation using such an approach. This problem is likely to be most acute for genes specifically associated with the parasitic lifestyle, which are likely to be the genes whose regulation will be the most interesting to study. Thus, a need exists for the development of methods to study filarial promoter structure and function in a homologous system. Recently, it has been demonstrated that B. malayi can be transiently transfected by biolistics and intrauterine
microinjection [6]. Both green fluorescent protein and luciferase were shown to function as effective reporter genes in these transfected parasites. Using luciferase as a reporter, the most efficient transfection method was found to be biolistic transfection of isolated embryos [6]. These findings opened the possibility of using biolistics to develop a homologous system for the study of B. malayi promoters. The complete sequence of the HSP70 gene of B. malayi, including 1160 bp upstream of the start codon, has been reported by Rothstein and Rajan [7]. These authors noted a number of potentially significant features in the upstream domain of the gene, including putative CAAT and TATA boxes and a stretch of 14 G residues [7]. In the current study, we demonstrate that the B. malayi HSP70 upstream domain is capable of driving the expression of a reporter gene to high levels in transiently transfected embryos, and we exploit this observation to study the B. malayi HSP70 promoter in a homologous system.
2. Materials and methods
2.1. Preparation of promoter constructs
The HSP70 promoter sequence was initially amplified by PCR from B. malayi genomic DNA using 20-mer primers developed from the published sequence (Genbank accession #M68933). For example, to amplify the −659 to −1 domain, primer sequences were derived from positions −659 to −640 and −20 to −1 of the upstream domain of the HSP70 gene sequence. Amplification reactions contained 20 mM Tris–HCl (pH 8.0), 2 mM MgSO4, 10 mM KCl, 6 mM (NH4)2SO4, 0.1% Triton X-100, 1 µg ml−1 bovine serum albumin, 200 µM each dATP, dGTP, dCTP, dTTP, 0.5 µM of each primer, 5 U Pfu 1 polymerase (Stratagene, La Jolla, CA) and 10 ng of B. malayi genomic DNA. Cycling conditions consisted of denaturation at 94 °C for 2 min, 28 cycles consisting of 94 °C for 1 min, 55 °C for 1 min, and 72 °C for 1 min, followed by a final extension for 10 min at 72 °C. Following amplification, the PCR products were cloned into a PCR cloning vector (pCR™ 2.1, Invitrogen, Carlsbad, CA) and clones containing inserts of the correct size were isolated. The DNA sequences of the individual clones were determined, and one clone matching the published sequence was chosen. The insert in the chosen clone was excised and cloned into the reporter vector pGL-3 Basic (Promega, Madison, WI). Nested deletions were prepared by PCR amplification using a clone of the −659 to −1 upstream domain as the template, essentially as described above. The primers used in the nested amplification reactions were 20 nt in length, and were developed from the sequence of the HSP70 upstream domain. The boundaries of the primers can be determined by reference to the sequence shown in Fig. 1. For example, the construct −518 to −1 was amplified using primers developed from positions −518 to −499 and −20 to −1 upstream of the HSP70 start codon. Following amplification, the PCR products were cloned, sequence verified and subcloned into pGL-3 Basic as described above. 
Internal deletions were prepared using an inverse PCR amplification procedure as previously described [8]. In brief, the −659 to −1 promoter construct was subcloned into the
Fig. 1. Map of the upstream domain of the B. malayi HSP70 gene: the putative heat shock factor binding sites referred to in the text are highlighted by boxed text. The putative CAAT box, TATA box, and poly G stretch are shaded. The putative core promoter domain is marked with underlining. The start codon of the HSP70 ORF is marked by double underlining. The single arrow above the sequence marks the position of the SL addition site detected in the EST sequence, while the double arrow marks the position of the experimentally determined SL addition site. The two arrows below the sequence mark the ends of the 5′-RACE clones discussed in the text.
plasmid vector pGEM3Z. This plasmid was then used in inverse PCR reactions to prepare the internal deletions. To accomplish this, outward facing primers were identified that flanked the region to be deleted. For example, to prepare the internal deletion −209 to −202, outward facing primers corresponding to positions −229 to −210 and −201 to −182 were prepared. The outward facing primers contained synthetic Spe1 sites at their 5′ ends. The entire plasmid sequence minus the deleted area flanked by the primers was then amplified by inverse PCR employing the pGEM3Z subclone as a template. The resulting PCR fragment was digested with Spe1, gel purified and self-ligated. Following transformation of the ligated DNA into Escherichia coli, clones were recovered and their DNA sequence determined. Plasmid DNA prepared from clones with the correct sequence was then digested with EcoR1 to liberate the mutated promoter fragment, and the released fragment was cloned into pGL-3 Basic as described above.
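The primer coordinates quoted in this section follow a simple pattern, which the short Python sketch below reproduces. It is only an illustration of the arithmetic, not the authors' design procedure: the function names are hypothetical, and it assumes that all positions lie upstream of the start codon (negative coordinates), so that no primer crosses the −1/+1 boundary.

```python
PRIMER_LEN = 20  # primer length stated in the text (20 nt)


def nested_deletion_primers(five_prime_end, three_prime_end=-1,
                            primer_len=PRIMER_LEN):
    """Primer spans for a construct covering five_prime_end..three_prime_end.

    Positions are relative to the start codon (negative = upstream).
    Returns ((fwd_start, fwd_end), (rev_start, rev_end)).
    """
    forward = (five_prime_end, five_prime_end + primer_len - 1)
    reverse = (three_prime_end - primer_len + 1, three_prime_end)
    return forward, reverse


def inverse_pcr_primers(del_start, del_end, primer_len=PRIMER_LEN):
    """Outward-facing primer spans flanking an internal deletion.

    The upstream primer ends immediately before the deleted block and the
    downstream primer begins immediately after it.
    """
    upstream = (del_start - primer_len, del_start - 1)
    downstream = (del_end + 1, del_end + primer_len)
    return upstream, downstream


if __name__ == "__main__":
    print(nested_deletion_primers(-518))    # construct -518 to -1
    print(inverse_pcr_primers(-209, -202))  # internal deletion -209 to -202
```

For the two examples given in the text, the sketch returns −518 to −499 and −20 to −1 for the nested construct, and −229 to −210 and −201 to −182 for the internal deletion.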
2.2. RT-PCR analyses
RNA to be used in the SL and 5′ rapid amplification of cDNA ends (RACE) analyses was prepared from transfected embryos or adult female parasites using the Absolute RNA kit (Stratagene, La Jolla, CA), following the manufacturer’s instructions. 5′-RACE amplifications were carried out using a commercially available kit (Invitrogen), following the manufacturer’s instructions. Following reverse transcription and tailing, the cDNA products were amplified by hemi-nested PCR, following the recommendations of the manufacturer. The first amplification reaction employed the 5′ adaptor primer recommended by the manufacturer and a gene specific primer derived from the luciferase open reading frame (5′-TTTGCAACCCCTTTTTGGAA-3′). Nested amplification reactions employed the adaptor primer recommended by the manufacturer and a second gene specific primer derived from the luciferase open reading frame (5′-CGACGATTCTGTGATTTGTA-3′). PCR fragments resulting from this process were then cloned and their DNA sequence determined as described above.
SL1-mediated RT-PCRs were carried out on DNase-treated RNA prepared as described above, essentially as previously described [9]. In brief, RNA isolated from adult female parasites or from transfected embryos was used as a template to prepare first strand cDNA with gene specific primers, as previously described. The oligonucleotide used to prime cDNA synthesis from the transfected embryo RNA was derived from positions 505–524 of the luciferase open reading frame encoded in pGL-3 Basic (5′-CCGGGAGGTAGATGAGATGT-3′). The primer used to prepare the cDNA from the adult female RNA was derived from positions 42–61 downstream of the start of the HSP70 ORF. Amplification of the cDNA from untransfected parasites employed a primer derived from the SL1 sequence (5′-CTCAAACTTGGGTAATTAAACC-3′) and a primer derived from positions −1 to −20 of the HSP70 gene. Amplification conditions were as previously described [9]. cDNA from transfected parasites was subjected to a hemi-nested amplification procedure. The initial amplification reaction was carried out using the SL1 primer and a non-coding primer derived from positions 384–403 of the pGL-3 Basic luciferase open reading frame (5′-TTTGCAACCCCTTTTTGGAA-3′). Following the initial amplification reaction, the products were used as a template in a hemi-nested PCR, employing the SL1 primer and a second gene specific primer. The second gene specific primer was derived from positions 219–238 of the luciferase ORF of pGL-3 Basic (5′-CGACGATTCTGTGATTTGTA-3′). The PCR products were visualized by agarose gel electrophoresis, cloned into the pCR™ 2.1 vector and the DNA sequence of the resulting clones determined as described above.
2.3. Transient transfection and luciferase assays
Transient transfections were carried out on isolated B. malayi intrauterine embryos essentially as previously described [6]. In brief, plasmid DNA of each construct was precipitated onto the surface of 0.6 µm gold particles at a loading ratio of 16 µg of DNA/mg of gold, as previously described [6]. Embryos were isolated from gravid female parasites by dissection, placed in 30 µl of tissue culture medium and transfected in a PDS 1000/He apparatus (BioRad, Hercules, CA). The embryos were maintained under a vacuum of 12 in. of mercury during the bombardment, and were transfected with 500 µg of coated gold particles per bombardment at a pressure of 1100 PSI. Following bombardment, the embryos were maintained in RPMI tissue culture medium containing 25 mM HEPES, 20% fetal calf serum, 20 mM glucose, 24 mM sodium bicarbonate, 2.5 µg ml−1 amphotericin B, 10 U ml−1 penicillin, 10 U ml−1 streptomycin and 40 µg ml−1 gentamicin, at 37 °C and 5% CO2 for 48 h. The transfected embryos were harvested by centrifugation, and extracts were prepared as previously described [6].
Two assay methods were used to monitor firefly luciferase expression in the transfected embryos. In the first method, embryos were transfected with a single experimental construct driving the expression of firefly luciferase. Firefly luciferase activity in extracts prepared from the transfected embryos was measured using the Luciferase Assay System from Promega (Madison, WI) following the manufacturer’s instructions. The net activity was determined by subtracting the number of counts in a negative control from the gross units obtained in each sample. Protein content was determined in the extracts using the Bradford method (BioRad, Hercules, CA) and the data expressed as specific activity (net light units/mg of cell protein). All constructs were tested in triplicate. In each series of experiments, an internal standard was also included consisting of triplicate bombardments of embryos with −659 to −1/luc. To allow comparisons among experiments carried out on different days, the data were expressed as a percentage of the luciferase specific activity in
each experimental construct relative to that of the −659 to −1/luc standards bombarded on the same day. The specific activity assay described above normalized the data for protein content. However, it did not control for differences in transfection efficiency, resulting in rather high standard deviations in the activity estimate for the different constructs. To control for inter-sample differences in transfection efficiency, a dual luciferase assay containing an internal transfection control was devised. To accomplish this, the −659 to −1 region was cloned upstream of the renilla luciferase gene of the reporter vector phRL-null (Promega). Transfections were then carried out with 500 µg of 0.6 µm beads coated with a mixture consisting of 6.7 µg of DNA containing the experimental construct driving the expression of the firefly luciferase reporter and 1.7 µg of plasmid DNA containing the −659 to −1 promoter driving the expression of the renilla luciferase. All constructs were tested in triplicate. Extracts prepared from the transiently transfected embryos were assayed for both firefly and renilla luciferase activity using the Dual Luciferase Assay System (Promega, Madison, WI). Net levels of firefly luciferase activity (gross counts minus those in a concurrent negative control) produced from each experimental construct were then normalized to the net level of renilla luciferase activity produced in each sample from the internal standard. As was done for the single assay, each series of transfections carried out on a given day included a set of three control reactions employing −659 to −1/luc. To permit comparisons of data collected on different days, the data were expressed as the percentage of the firefly luciferase activity in each experimental construct (normalized to its internal renilla control) relative to that of the −659 to −1/luc standards (normalized to their internal renilla controls) bombarded on the same day.
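The normalization described above can be written out as a short calculation. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the replicate values are invented for demonstration only, and averaging the triplicate ratios before taking the percentage is an assumption, since the text does not say how replicates were combined.

```python
from statistics import mean


def normalized_ratio(firefly_gross, renilla_gross, firefly_bg, renilla_bg):
    """Net firefly counts divided by net renilla counts for one sample.

    Net counts are gross counts minus those of the concurrent negative
    control, as described in the text.
    """
    return (firefly_gross - firefly_bg) / (renilla_gross - renilla_bg)


def percent_of_control(experimental_ratios, control_ratios):
    """Mean experimental ratio as a percentage of the same-day control mean.

    Assumption: replicates are averaged before the percentage is taken.
    """
    return 100.0 * mean(experimental_ratios) / mean(control_ratios)


if __name__ == "__main__":
    # Invented counts (not data from the paper): gross firefly, gross renilla.
    background = {"firefly": 100, "renilla": 100}
    control = [(2100, 1100), (4100, 2100), (1100, 600)]      # -659 to -1/luc
    experimental = [(1100, 1100), (2100, 2100), (600, 600)]  # test construct
    ctrl = [normalized_ratio(f, r, background["firefly"], background["renilla"])
            for f, r in control]
    expt = [normalized_ratio(f, r, background["firefly"], background["renilla"])
            for f, r in experimental]
    print(percent_of_control(expt, ctrl))
```

Because each sample is divided by its own renilla signal, the threefold spread in raw counts among the invented replicates disappears from the ratios, which is the reason given in the text for preferring the dual assay.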
In mixing experiments carried out to validate the dual luciferase assay, no difference was seen in the specific activity (net luciferase light units/mg of protein) of embryos transfected with the firefly constructs alone when compared to embryos transfected with a mixture of the renilla and firefly constructs (data not shown). These data confirmed that promoter interference was not occurring in cells transfected with both the firefly and renilla constructs. Furthermore, all constructs were assayed using both the single and dual luciferase assays, and the data obtained from both assays were found to be completely concordant. However, as expected, the standard deviations surrounding the activity estimates were smaller in the dual luciferase assay data than in the data derived from the specific activity assay. For this reason, although all constructs were tested with both assays, only the data obtained from the dual luciferase assay are presented below. For analysis purposes, the data from each day were treated as separate data sets, i.e. data from each construct tested were compared only to the control assays carried out on the same day. The statistical significance of the differences between the experimental and control (−659 to −1/luc) replicates was evaluated on the population marginal means (least
squares means) using Dunnett’s test. All calculations were performed using the SAS system, version 9, PROC GLM.
3. Results
3.1. Sequence analysis of the upstream domain of the HSP70 gene of B. malayi
The complete sequence of the HSP70 gene of B. malayi, including 1160 nt upstream of the initiation codon, was previously reported by Rothstein and Rajan [7]. They reported the presence of a putative CAAT box located 372–375 bp upstream of the initiation codon (i.e. positions −375 to −372) and a putative TATA box at position −210 to −206 [7]. Based upon these observations, the upstream domain of the HSP70 gene was analyzed in detail to determine if it contained other features of a typical eucaryotic promoter. Surprisingly, little is known concerning the structure of the core promoters of nematodes, including C. elegans, and algorithms to predict nematode promoter sequences are not available. For this reason, the upstream domain was analyzed using promoter prediction algorithms based upon data obtained from human [10] or Drosophila melanogaster genes (www.fruitfly.org/cgi-bin/seq_tools/promoter.pl). Analysis of the HSP70 upstream domain using the algorithm based upon human gene sequences did not identify any sequence typical of a human core promoter. In contrast, analysis of the sequence using the D. melanogaster trained algorithm identified a putative core promoter sequence located from −227 to −171 (Fig. 1). The upstream domain was also analyzed for putative transcription factor binding sites, using the Transcription Element Search System (TESS) of the University of Pennsylvania [11]. TESS identified four potential heat shock factor binding sites in the B. malayi upstream domain. These were located at positions −449 to −435, −399 to −385, −251 to −242 and −231 to −222 (Fig. 1). In addition to the putative CAAT and TATA boxes, heat shock factor binding sites and the core promoter domain, the upstream region of the HSP70 sequence contained two additional interesting features (Fig. 1). The first of these was the presence of a string of 14 G residues located at positions −119 to −106. 
Such polypurine/polypyrimidine tracts have been noted in several other eucaryotic promoters where they have been shown to act as either enhancers or repressors of transcription [12,13]. Second, a comparison of the genomic sequence of the HSP70 gene to the EST database derived from B. malayi suggested that the HSP70 mRNA was likely to be trans-spliced. An EST containing a partial sequence of the 22 nt spliced leader spliced at position −53 was identified in the EST database (Genbank accession #AW409476, Fig. 1). To confirm that the HSP70 mRNA was trans-spliced, SL1-HSP70 primed RT-PCR reactions were carried out as described in Section 2. This produced a PCR product whose sequence confirmed that the HSP70
mRNA was indeed trans-spliced. However, the sequence of this PCR product suggested that the spliced leader was added at position −47, 6 nt downstream from the splice site in the EST clone (Fig. 1).
3.2. Experimental analysis of the putative B. malayi HSP70 promoter
The analyses summarized in Fig. 1 suggested that the upstream domain of the HSP70 gene might contain sequences capable of acting as a promoter in B. malayi. To test this hypothesis, a gene fragment consisting of the 659 bp upstream of the HSP70 initiation codon (−659 to −1) was isolated by PCR and cloned upstream of the firefly luciferase gene in the vector pGL-3 Basic, as described in Section 2. The resulting construct (−659 to −1/luc) was introduced into isolated B. malayi embryos by biolistic bombardment and the transiently transfected embryos were assayed for luciferase activity as described in Section 2. Embryos transiently transfected with −659 to −1/luc displayed high levels of luciferase activity, in the range of 10,000 times background (data not shown). In contrast, untransfected parasites displayed luciferase activities that were at or below background levels, while embryos transfected with the pGL-3 Basic vector alone displayed luciferase activities that were roughly two times background levels ([6] and data not shown). These data, taken together, suggest that the 659 bp upstream domain of the HSP70 gene of B. malayi encodes a promoter capable of
driving high levels of expression of the firefly luciferase reporter gene in transiently transfected B. malayi embryos. As a first step in mapping the cis elements necessary for the promoter activity of the HSP70 upstream domain, a series of nested deletions was prepared from the 5′ end of the core promoter domain described above. These truncated constructs were assayed for luciferase activity following transient transfection into B. malayi embryos using the specific activity and dual luciferase assays described in Section 2. The results of these experiments are summarized in Fig. 2. Constructs containing the 396 bp upstream of the start codon (e.g. −396 to −1/luc) produced levels of activity that were not statistically different from those seen with −659 to −1/luc (Fig. 2). However, deletion of the region containing the putative CAAT box (construct −295 to −1/luc) resulted in a statistically significant reduction in luciferase activity (roughly 90%) when compared to −659 to −1/luc (Fig. 2). Deletion of both the putative CAAT and TATA boxes (construct −178 to −1/luc) resulted in a further reduction in activity, to roughly 5% of the activity of −659 to −1/luc (Fig. 2). A series of 3′ deletions was similarly constructed and assayed for luciferase activity. Deletion of positions −31 to −1 (downstream of both the predicted SL addition sites at −47 and −53) resulted in luciferase activities that were significantly lower than those seen with −659 to −1/luc (Fig. 2). Deletion of the first 83 nt upstream of the start codon (i.e. construct −659 to −84/luc) resulted in complete
Fig. 2. Deletion analysis of the HSP70 upstream domain: the position of the putative CAAT box, TATA box, core promoter, poly R/Y stretch, and SL addition sites are indicated schematically in Panel A. Panel B contains a schematic representation of the deletion constructs tested, and the level of luciferase activity detected in embryos transfected with each construct. Luciferase activity was assayed using the dual luciferase assay described in Section 2. (∗) Luciferase activity in embryos transfected with this construct was significantly different from that seen in embryos transfected with −659 to −1/luc (0.05 ≥ P > 0.01). (∗∗) Luciferase activity in embryos transfected with this construct was significantly different from that seen in embryos transfected with −659 to −1/luc (P ≤ 0.01).
loss of luciferase activity, as did all of the more extensive 3′ deletions tested (Fig. 2). As described above, previous studies and our analysis of the HSP70 upstream domain identified four domains that might be important in the promoter activity exhibited by the HSP70 upstream domain. These included the putative CAAT and TATA boxes, the core promoter domain and the polypurine domain. The hypothesis that these domains were essential for activity received support from the nested deletion studies that mapped the core promoter domain to positions −396 to −31 in the upstream region. To determine if the predicted CAAT and TATA boxes were essential for activity, targeted internal deletion constructs were prepared and tested in the transient transfection system. Surprisingly, deletion of the putative CAAT box (construct ∇ −379 to −370/luc) resulted in luciferase activity levels that were not significantly different from those seen with −659 to −1/luc (Fig. 2). In contrast, deletion of the putative TATA box (∇ −210 to −203/luc) resulted in luciferase activities that were significantly lower than (roughly 50% of) those seen with −659 to −1/luc (Fig. 2). Similar levels of luciferase activity were seen in constructs in which the putative core promoter domain was deleted (i.e. construct ∇ −221 to −170/luc; Fig. 2). In contrast, a clone containing only the region encompassing the putative TATA and CAAT boxes (construct −377 to −190/luc) exhibited no luciferase activity (Fig. 2). As described above, polypurine/polypyrimidine tracts have been found in other eucaryotic promoters where they often function to modulate transcription. To test if this was the case in the HSP70 promoter, a construct was prepared deleting the poly G tract found at positions −119 to −106. This internal deletion (construct ∇ −123 to −105/luc) resulted in luciferase levels that were significantly higher than (roughly 160% of) those seen in the full length construct (Fig. 2). 
Previous studies had demonstrated that HSP70 mRNA levels were upregulated roughly five-fold in infective larvae (L3) and adult parasites exposed to heat shock [7]. To determine if expression of the luciferase reporter driven by the HSP70 promoter was also induced by heat shock, isolated embryos transfected with −659 to −1/luc alone were incubated at 40 and 28 °C for 48 h following transfection, and the specific activity (net light units/mg of protein) was determined. Little or no luciferase activity was detected in the embryos subjected to the 40 and 28 °C incubation temperatures (data not shown). However, examination of the embryos cultured at these temperatures revealed that most of the embryos were non-viable after 48 h of incubation. Embryos were then transiently transfected with −659 to −1/luc and held at 37 °C for 24 h. They were then subjected to temperature shocks for 4 h at 28 and 40 °C, returned to 37 °C for 4 h and then assayed for luciferase activity. No significant difference was noted in the luciferase specific activity in the embryos exposed to the different temperatures (Fig. 3). One possible explanation for the lack of a heat shock response in the transfected embryos was that the regulatory
Fig. 3. Effect of heat treatment on luciferase activity in transiently transfected embryos: embryos transiently transfected with −659 to −1/luc or −1160 to −1/luc were subjected to a 4-h temperature shock as described in the text. Luciferase activity was determined using the single luciferase specific activity-based assay described in Section 2.
elements responsible for the heat shock response were actually encoded upstream from the −659 to −1 region. To test this hypothesis, the entire 1160 nt upstream domain of the HSP70 ORF was isolated and cloned upstream of the luciferase reporter of pGL-3 Basic. As expected, this construct (−1160 to −1/luc) produced levels of luciferase that were statistically indistinguishable from those produced by −659 to −1/luc (Fig. 3). The heat shock experiments were then repeated with embryos transfected with −1160 to −1/luc. Again, no significant difference was found in the luciferase specific activity of the embryos exposed to the different temperature conditions (Fig. 3).
3.3. Trans-splicing of transgene transcripts in transiently transfected B. malayi embryos
As the native HSP70 message is trans-spliced and the −659 to −1/luc construct contained the native splice acceptor site present in the HSP70 gene, it was of interest to determine if the luciferase transgene RNAs expressed in embryos transiently transfected with −659 to −1/luc were trans-spliced as well. To accomplish this, RNA was prepared from embryos transiently transfected with −659 to −1/luc and used as a template in a hemi-nested RT-PCR with an SL1 primer and nested primers derived from the luciferase ORF. No PCR products were obtained from this experiment, suggesting that the RNA derived from the luciferase transgene was not trans-spliced (data not shown). To confirm this, the 5′ end of the message derived from the transgene was determined by 5′-RACE, as described in detail in Section 2. Two classes of cloned products were recovered from this experiment. Neither class of clone
contained sequences homologous to the SL1 at their 5′ ends. Instead, the 5′ ends of the two clone classes mapped to positions −64 and −72, 17 and 24 nt upstream of the SL1 addition site identified in the native message (Fig. 1). The finding that the transgene-derived mRNAs were not trans-spliced was unexpected, and suggested that sequences in addition to the native splice acceptor site might be necessary for trans-splicing to occur. To test this hypothesis, a construct was prepared containing nucleotides −659 to 495 of the HSP70 gene sequence (−659 to 495/luc). This sequence contained the promoter domain identified in the experiments described above, as well as the first exon, first intron and six nucleotides of the second exon of the native HSP70 gene. This construct was cloned in frame with the luciferase ORF of pGL-3 Basic and transfected into B. malayi embryos as described above. Luciferase activity in embryos transfected with −659 to 495/luc was not statistically different from that seen in embryos transfected in parallel with −659 to −1/luc (Fig. 4, Panel B). RNA was prepared from embryos transfected with −659 to 495/luc and used in a hemi-nested RT-PCR with the SL1 primer and two primers derived from the luciferase ORF, as described in Section 2. This resulted in the production of a PCR product that was similar in size to that predicted for a product derived from a correctly cis- and trans-spliced mRNA (444 bp; Fig. 4, Panel C). DNA sequence analysis of this product confirmed that it had the expected structure for a correctly cis- and trans-spliced product. It consisted of the luciferase ORF fused to a correctly cis-spliced product derived from exon 1 and the portion of exon 2 of the HSP70 gene contained within −659 to 495/luc. A spliced leader sequence was present at the 5′ end of the message, attached at position −47, identical to the SL1 addition site mapped in transcripts derived from the native HSP70 message. 
To determine whether exon 1 or intron 1 encoded the sequences necessary to permit trans-splicing, an additional construct designated −659 to 495 ∇7 −87 was prepared. In this construct, the majority of exon 1 was deleted, retaining only the 5′ 6 nt and 3′ 10 nt of exon 1. Embryos were transiently transfected with −659 to 495 ∇7 −87 and assayed for luciferase activity and for trans-spliced transgene-derived mRNA, as described above. Embryos transiently transfected with this construct expressed luciferase activities that were not statistically different from those seen in the control embryos transfected with −659 to −1/luc (Fig. 4, Panel B). Furthermore, hemi-nested RT-PCRs carried out as described above yielded a product similar in size to that predicted to have been derived from a correctly cis- and trans-spliced mRNA derived from the −659 to 495 ∇7 −87 construct (363 bp; Fig. 4, Panel C). DNA sequence analysis of this product confirmed that it had the expected structure for a correctly cis- and trans-spliced product derived from −659 to 495 ∇7 −87, with a spliced leader attached at position −47 and the 16 nt derived from the 5′ and 3′ ends of the deleted version of exon 1 correctly
Fig. 4. Luciferase activity and analysis of transgene mRNAs in embryos transiently transfected with −659 to 495/luc and −659 to 495 Δ7–87. Panel A: Schematic representation of the structure of the HSP70-derived and downstream vector-encoded sequences contained in −659 to 495/luc that were expected to be amplified in the SL-mediated hemi-nested RT-PCR. Panel B: Schematic representation of the sequences contained in −659 to 495/luc and −659 to 495 Δ7–87 and the level of luciferase activity detected in embryos transfected with each construct. Panel C: Products detected in SL/luc-mediated hemi-nested RT-PCRs carried out on RNA isolated from embryos transfected with −659 to 495/luc and −659 to 495 Δ7–87.
spliced to the six nucleotides derived from exon 2 contained in the construct (data not shown).

It is not clear from the data presented above whether specific sequences in the first intron of the HSP70 gene were important in facilitating trans-splicing, or whether any intron sequence, when present downstream of an SL addition site, would be sufficient to permit trans-splicing to occur. However, if the former hypothesis were true, one would predict that the first introns of trans-spliced genes would contain motifs that were lacking in downstream introns. To test this hypothesis, six B. malayi genomic sequences were identified for which explicit evidence exists to document that their transcripts are trans-spliced. These included HSP-70 (Genbank Accession #M68933), chs-1 (Genbank Accession #AF274754), alt-1 (Genbank Accession #), mif-1 (Genbank Accession
#AF002699), shp-2 (Genbank Accession #Z35444) and the gene coding for aspartyl tRNA transferase (Genbank Accession #J03971). The first introns from these genomic sequences were then searched to identify motifs that they shared. Shared motifs were then used to search downstream introns derived from the same genes, to identify motifs that were specific to the first introns. This analysis revealed the presence of a 7 bp motif with the sequence RGATRAA (R = A or G) that was found in five of the six first intron sequences examined, but was absent from all six of the downstream intron sequences examined (P < 0.01; Fisher’s exact test). The single first intron lacking this motif, from the shp-2 gene, contained a closely related sequence (AAATGAA).
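The motif comparison and significance test described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function names (`has_motif`, `fisher_one_sided`) and the example sequence are hypothetical, and the 2 × 2 table assumes the counts stated in the text (five of six first introns containing the motif versus none of six downstream introns). The text does not state whether the test was one- or two-sided; the sketch computes the one-sided tail.

```python
import re
from math import comb

# IUPAC code R stands for A or G, so RGATRAA becomes a character-class regex.
MOTIF = re.compile(r"[AG]GAT[AG]AA")


def has_motif(seq: str) -> bool:
    """Return True if the sequence contains at least one RGATRAA match."""
    return MOTIF.search(seq.upper()) is not None


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing a count of `a`
    or greater in the top-left cell, with all margins held fixed.
    """
    row1 = a + b          # e.g. number of first introns examined
    col1 = a + c          # total number of introns containing the motif
    n = a + b + c + d     # total number of introns examined
    total = comb(n, col1)
    tail = 0
    for x in range(a, min(row1, col1) + 1):
        tail += comb(row1, x) * comb(n - row1, col1 - x)
    return tail / total


# Hypothetical test sequence containing GGATAAA, which fits RGATRAA.
print(has_motif("TTGGATAAACC"))   # True
# The related shp-2 sequence quoted in the text does not fit the motif.
print(has_motif("AAATGAA"))       # False

# Counts assumed from the text: first introns 5 with / 1 without,
# downstream introns 0 with / 6 without.
print(round(fisher_one_sided(5, 1, 0, 6), 4))   # 0.0076
```

With the counts given in the text, the one-sided probability is 6/792 ≈ 0.0076, which is consistent with the reported P < 0.01.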
4. Discussion
The data presented above demonstrate that the sequence located roughly 400 nt upstream of the start codon of the HSP70 gene of B. malayi functions as a strong promoter in transiently transfected isolated embryos. Surprisingly, few studies have been carried out to define the essential promoter elements in C. elegans, and promoter prediction algorithms based upon nematode sequences have not been developed. However, sequence analysis of the 400 nt upstream domain of the B. malayi HSP70 gene did suggest that it contained putative CAAT and TATA boxes, and a sequence similar to conserved promoter domains of D. melanogaster. Nested deletions of the upstream domain suggested that the regions containing the putative CAAT and TATA boxes were important in driving transcription, as deletion of the roughly 140 bp region containing the putative CAAT box resulted in loss of roughly 90% of the reporter gene activity, while deletion of the region containing both the putative CAAT and TATA boxes resulted in a 95% decrease in reporter gene activity. Interestingly, however, internal deletions of the putative CAAT or TATA boxes, or the entire predicted core promoter domain, resulted in constructs that exhibited from 40 to 100% of the reporter gene activity seen with the intact construct. Furthermore, a construct containing just the region encoding the putative CAAT box and TATA box alone produced no luciferase activity. These data, when taken together, suggest that the promoters of B. malayi may have features that set them apart from those of better studied eucaryotes, making it difficult to use algorithms trained on DNA sequences derived from other organisms to accurately predict B. malayi promoter sequences.

Analysis of the 3′ end of the HSP70 upstream domain demonstrated that deletion of the 82 nt upstream of the start codon resulted in a complete loss of luciferase expression.
This region contained the splice leader addition sites, and also encompassed the sequences that map to the 5′ end of the luciferase mRNAs in embryos transfected with −659 to −1/luc. Since the luciferase mRNA from these parasites is not trans-spliced, these latter sites may represent the start sites of transcription, or they may represent sites involved
in mRNA processing. Thus, it is perhaps not surprising that deletion of the −82 to −1 domain resulted in a complete loss of luciferase activity. However, deletion of the −31 to −1 region, which is downstream of both the spliced leader addition and putative transcription start sites, also resulted in a significant decrease in luciferase activity. This finding suggests that as yet unidentified cis-acting factors may map within these 31 nt. It is possible that these sequences do not affect the level of transcription from the transgene, but encode sequences that are necessary for maximizing translation from the transgene-derived mRNA. Alternatively, it is possible that the distance from the SL addition site and/or the transcriptional start sites to the start codon influences either the level of transcription or the translational efficiency of the gene. Further experiments will be needed to resolve this question.

The upstream domain of the HSP70 promoter contains a stretch of 14 G residues. Such polypurine/polypyrimidine tracts have been noted in the promoters of other organisms, where they function as transcriptional repressors [13], or in some cases as enhancers [12,14]. Deletion of the poly G tract from the HSP70 promoter resulted in a small but statistically significant increase in the level of luciferase production relative to that seen in the full length construct containing the tract. This suggests that the poly G tract may act as a negative regulator of transcription in the B. malayi HSP70 gene.

Previous studies have demonstrated a roughly five-fold upregulation of the native HSP70 gene in B. malayi parasites exposed to heat shock [7]. However, no upregulation of luciferase activity in response to temperature shock was noted in parasites transiently transfected with either the −659 to −1/luc or −1160 to −1/luc HSP70 constructs. There are several possible explanations for this difference. 
First, it is possible that elements that mediate the heat shock response are actually located outside of the 1160 nt examined here. Second, it is possible that the transgenes tested might be capable of responding to a heat shock or stress response, but that technical difficulties prevented us from detecting this response. For example, we found that long-term exposure to either high or low temperatures resulted in the death of the transiently transfected embryos. This fact necessitated a fairly short temperature shift, which may have been insufficient to trigger a detectable upregulation in transcription from the transgene. Finally, in most organisms, HSP70 expression can be triggered by stresses other than temperature shock. It is possible that removal and in vitro culture of the isolated embryos was stressful enough to trigger an upregulation in HSP70 expression. In this case, further stressing the organisms by temperature shock might not have resulted in a further upregulation in HSP70 expression. Both an analysis of the EST database and SL1-mediated RT-PCR amplification data suggest that the native HSP70 message is trans-spliced in vivo. Interestingly, the EST and RT-PCR data resulted in the identification of two separate SL addition sites in the HSP70 message. These sites both contain the typical AG splice acceptor dinucleotide, but are
separated by 6 nt. However, the downstream splice leader addition site more closely matches the UUUCAG trans-splice acceptor consensus for C. elegans SL1 splice sites [15]. It is possible that both splice sites are employed in vivo, and that the EST and SL1-PCR experiments fortuitously detected differently spliced forms. However, several independent clones derived from the SL1-mediated RT-PCR of adult female RNA all contained spliced leaders attached to the downstream addition site (data not shown). These data suggest that, at least in adult females, if both sites are indeed used, the downstream splice leader addition site is more commonly used than is the upstream addition site. Interestingly, while the SL1-mediated RT-PCR analysis was carried out on RNA isolated from gravid adult female parasites, the EST containing the partial SL sequence was derived from young adult parasites that had not yet become gravid. It is therefore possible that the splice leader addition sites are differentially utilized in different life cycle stages (e.g. developing embryos versus adult parasites). Experiments are currently underway to test this hypothesis.

The data presented above suggest that transgenes containing only the upstream domain of the HSP70 gene are not trans-spliced. However, inclusion of the 495 nt located downstream of the start codon resulted in the production of a correctly trans-spliced product. Furthermore, deletion of most of the exon-derived sequences from the 495 nt domain did not affect either luciferase expression or trans-splicing. These data, when taken together, suggest that the presence of a downstream intron may be necessary to permit trans-splicing. Previous studies by Shiwaku and Donelson have provided some indirect evidence to suggest that this might be the case. In an analysis of cDNAs and RT-PCR products derived from transcripts of the actin gene of the human filarial parasite O. 
volvulus, these investigators were unable to detect trans-spliced RNAs that still contained intron sequences [16]. Based upon these data, the authors concluded that cis-splicing and polyadenylation preceded trans-splicing in O. volvulus. The data presented above support this conclusion, and further suggest that cis-splicing may be an obligatory step in the trans-splicing pathway in the human filarial parasites.

As discussed above, it is not yet clear whether the first introns of trans-spliced genes contain specific signals necessary to direct trans-splicing at an upstream SL addition site, or whether any intron sequence would be sufficient to facilitate this process. The discovery that first introns appear to contain a conserved 7 bp motif that is lacking in downstream introns may provide indirect support for the hypothesis that a specific motif in the first intron sequence is involved. Experiments are currently underway to test this hypothesis.
Acknowledgements
We would like to thank Drs. Naomi Lang-Unnasch and Con Beckers for critical reading of the manuscript. Parasite material used in this project was provided by the Filariasis Repository at the University of Georgia with funds provided from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, under Contract number NO1-AI-65283. This work was supported by a grant from the US National Institutes of Health (project #R01-AI48562).
References
[1] Williams SA, Laney SJ, Lizotte-Waniewski M, Bierwert LA, Unnasch TR. The river blindness genome project. Trends Parasitol 2002;18:86–90.
[2] Williams SA, Lizotte-Waniewski MR, Foster J, Guiliano D, Daub J, Scott AL, et al. The filarial genome project: analysis of the nuclear, mitochondrial and endosymbiont genomes of Brugia malayi. Int J Parasitol 2000;30:411–9.
[3] Unnasch TR, Williams SA. The genomes of Onchocerca volvulus. Int J Parasitol 2000;30:543–52.
[4] Foster JM, Johnston DA. Helminth genomics: from gene discovery to genome sequencing. Trends Parasitol 2002;18:241–2.
[5] Thompson FJ, Cockroft AC, Wheatley I, Britton C, Devaney E. Heat shock and developmental expression of hsp83 in the filarial nematode Brugia pahangi. Eur J Biochem 2001;268:5808–15.
[6] Higazi TB, Merriweather A, Shu L, Davis R, Unnasch TR. Brugia malayi: transient transfection by microinjection and particle bombardment. Exp Parasitol 2002;100:95–102.
[7] Rothstein N, Rajan TV. Characterization of an hsp70 gene from the human filarial parasite, Brugia malayi (Nematoda). Mol Biochem Parasitol 1991;49:229–37.
[8] Yung S, Unnasch TR, Lang-Unnasch N. Analysis of apicoplast targeting and transit peptide processing in Toxoplasma gondii by deletional and insertional mutagenesis. Mol Biochem Parasitol 2001;118:11–21.
[9] Wilson WR, Tuan RS, Shepley KJ, Freedman DO, Greene BM, Awadzi K, et al. The Onchocerca volvulus homologue of the multifunctional polypeptide protein disulfide isomerase. Mol Biochem Parasitol 1994;68:103–17.
[10] Zhang MQ. A discrimination study of human core-promoters. Pac Symp Biocomput 1998;240–51.
[11] TESS: Transcription Element Search Software on the WWW Rep. #CBIL-TR-1997-1001-v0.0, 1997. Computational Biology and Informatics Laboratory, School of Medicine, University of Pennsylvania.
[12] Carlini LE, Getz MJ, Strauch AR, Kelm Jr RJ. Cryptic MCAT enhancer regulation in fibroblasts and smooth muscle cells suppression of TEF-1 mediated activation by the single-stranded DNA-binding proteins, Pur alpha, Pur beta, and MSY1. J Biol Chem 2002;277:8682–92.
[13] Xu G, Goodridge AG. Function of a C-rich sequence in the polypyrimidine/polypurine tract of the promoter of the chicken malic enzyme gene depends on promoter context. Arch Biochem Biophys 1999;363:202–12.
[14] Demaison C, Parsley K, Brouns G, Scherr M, Battmer K, Kinnon C, et al. High-level transduction and gene expression in hematopoietic repopulating cells using a human immunodeficiency virus type 1-based lentiviral vector containing an internal spleen focus forming virus promoter. Hum Gene Ther 2002;13:803–13.
[15] Blumenthal T. Trans-splicing and polycistronic transcription in Caenorhabditis elegans. Trends Genet 1995;11:132–6.
[16] Shiwaku K, Donelson JE. Cis-splicing and polyadenylation of actin RNA can precede 5′ trans-splicing in nematodes. 1995;211:49–53.