Gene 362 (2005) 57 – 69 www.elsevier.com/locate/gene
Molecular cloning and tissue-specific transcriptional regulation of the first peroxidase family member, Udp1, in stinging nettle (Urtica dioica) Triantafyllia G. Douroupi, Issidora S. Papassideri ⁎, Dimitrios J. Stravopodis, Lukas H. Margaritis Department of Cell Biology and Biophysics, Faculty of Biology, University of Athens, Panepistimiopolis, Zografou, 15784, Athens, Greece Received 3 March 2005; received in revised form 2 June 2005; accepted 16 June 2005 Available online 10 October 2005 Received by G. Theissen
Abstract
A full-length cDNA clone, designated Udp1, was isolated from Urtica dioica (stinging nettle), using a polymerase chain reaction based strategy. The putative Udp1 protein is characterized by a cleavable N-terminal signal sequence, likely responsible for entry into the rough endoplasmic reticulum, and a mature protein of 310 amino acids, containing all the important residues that are evolutionarily conserved among different members of the plant peroxidase family. A unique structural feature of the Udp1 peroxidase is its short carboxyl-terminal extension, which could be associated with the vacuolar targeting process. Udp1 peroxidase is differentially regulated at the transcriptional level and is specifically expressed in the roots. Interestingly, wounding and ultraviolet radiation stress cause an ectopic induction of Udp1 gene expression in the aerial parts of the plant. A genomic DNA fragment encoding the Udp1 peroxidase was also cloned and fully sequenced, revealing a structural organization of three exons and two introns. The phylogenetic relationships of the Udp1 protein to the Arabidopsis thaliana peroxidase family members were also examined and, in combination with the homology modelling approach, indicated the presence of distinct structural elements, which could be specifically involved in the determination of substrate recognition and subcellular localization of the Udp1 peroxidase. © 2005 Elsevier B.V. All rights reserved.
Keywords: Cloning; Nettle peroxidase; PCR; Root expression; Transcriptional induction
1. Introduction Peroxidases are capable of utilizing hydrogen peroxide (H2O2) to oxidize a wide variety of hydrogen donors, such as phenolic substances, nitrite, leuco-dyes, ascorbic acid, indole, amines and certain inorganic ions and only minor differences in substrate specificity are observed among isoenzymes for peroxidation. Like catalases, peroxidases are heme-containing proteins. They consist of an apoenzyme, which contains both carbohydrate and protein, bound to an iron porphyrin. The enzymatic reaction mechanism for peroxidation has been
Abbreviations: ER, endoplasmic reticulum; N-terminal, amino-terminal; PE, positioning element; PCR, polymerase chain reaction; RACE, rapid amplification of cDNA ends; RT, reverse transcription; SDS, sodium dodecyl sulphate; SSC, 3 M NaCl, 0.3 M trisodium citrate solution; Udp1, Urtica dioica peroxidase 1; UTR, untranslated region; UE, upstream efficiency element. ⁎ Corresponding author. Tel.: +30 210 7274546; fax: +30 210 7274742. E-mail address:
[email protected] (I.S. Papassideri). 0378-1119/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.gene.2005.06.039
described in detail and shown to involve three consecutive redox stages of the enzyme, resulting in the consumption of one equivalent of H2O2 and the dehydrogenation (oxidation) of two equivalents of reducing substrate. Peroxidases are widely distributed in higher plants. The number and relative concentration of isoenzymes usually vary among different tissues and developmental stages of a plant organ. Peroxidase activity can be detected in the whole lifespan of various plants: from germination to senescence. In plants, peroxidases are mainly involved in germination, cell wall formation, lignification, suberization, polymer cross-linking, auxin metabolism, cell elongation, stress and pathogen defence reactions, ethylene biosynthesis, plant growth regulation, phenolics and H2O2 catabolism (Scialabba et al., 2002; Roberts and Kolattukudy, 1989; Moerschbacher, 1992; Allison and Schultz, 2004). Peroxidases can create a physical barrier by catalysing cross-linking of cell wall compounds in response to different stimuli such as wounding, pathogen interactions or as a normal cell wall evolution during the growth and senescence
T.G. Douroupi et al. / Gene 362 (2005) 57–69
(Passardi et al., 2005). Cross-linking of phenolic monomers during the formation of suberin and the oxidative coupling of lignin subunits are associated with reduction of cell extensibility and growth. Peroxidases are candidates for lignin unit assembly by oxidative polymerization (Lewis and Yamamoto, 1990). Plants exposed to stress upregulate their peroxidase activity. This response occurs with various abiotic and biotic stresses, such as chemical, biological (pathogens) or physical (wounding) assaults (Lavid et al., 2001; Martinez et al., 1998; Hiraga et al., 2000). Plant peroxidases exhibit considerable sequence similarity in the regions that build up the heme-binding catalytic site. On the other hand, it has been difficult to assign specific functions to the large variety of peroxidases that have been purified, characterized and also localized to certain cell compartments. The specific in vivo substrates and the regulatory mechanisms of gene expression for each peroxidase remain unclear. Extensive search of the nucleotide sequence databases for plant peroxidases belonging to Rosales resulted in only 30 sequences, whereas 2428 peroxidase sequences have already been deposited from plants belonging to Eurosids I. Moreover, only 1 of the 30 sequences corresponds to a full-length cDNA clone of a class III peroxidase (Ficus carica peroxidase mRNA, AF479623), while the rest of them mainly correspond to ascorbic peroxidases or EST fragments. Therefore, cloning and characterization of genes encoding peroxidases from diverse plant species are crucial steps towards understanding the function and regulation of individual members of this multi-gene family. Urtica dioica (stinging nettle) belongs to Eurosids I, Rosales and has been extensively studied for its medicinal applications, as well as for causing contact urticaria and allergic responses (Oliver et al., 1991).
In the present study we describe, for the first time, the isolation and characterization of a full-length cDNA clone, designated Udp1, encoding a stinging nettle cationic peroxidase. The cloning strategy was based on a reverse transcription PCR (RT-PCR) approach, using degenerate primers designed against plant peroxidase conserved motifs and subsequent rapid amplification reactions of cDNA ends (RACE). The nucleotide sequence analysis of the Udp1 cDNA clone revealed that the predicted open reading frame encodes a putative protein of 337 amino acid residues, including an N-terminal signal peptide. A genomic DNA fragment, designated gUdp1, corresponding to the full-length Udp1 cDNA, was also cloned and fully sequenced. Comparative analysis between the obtained sequences of the genomic fragment and the cDNA clone demonstrated a structural organization of three exons and two introns. Subsequent functional studies by Northern blot analysis disclosed the root-specific Udp1 transcriptional activity and its ectopic induction by certain factors, such as mechanical stress and ultraviolet radiation. Multiple sequence alignments among different family members, in combination with the molecular modelling approach, indicated the presence of unique and evolutionarily conserved structural elements, which are likely associated with distinct functions of the Udp1 peroxidase.
2. Materials and methods
2.1. Plant material and exposure of plants to stress
Stinging nettle (Urtica dioica) plants were grown from seeds. Seeds were surface-disinfected and allowed to germinate in the dark, before planting in sterile sand pot cultures. The growth chamber was maintained at 25 °C, with a 16 h photoperiod. Plants were irrigated with Hoagland's solution. Leaves from two-month-old plants were sliced into approximately 10 mm sections and floated on sterile water for 48 h. Two-month-old plants were exposed to ultraviolet radiation at a distance of 30 cm, for 1 h. UV radiation was generated by Philips TL12 fluorescent tubes (λmax 315 nm). Two-month-old Urtica dioica plants were chilled for 48 h, at 4 °C.
2.2. Molecular cloning techniques
Unless stated otherwise, all conventional molecular cloning techniques were performed as previously described by Sambrook et al. (1989).
2.3. RNA isolation
Total RNA was isolated according to the procedure developed by Jacobs-Lorena (1980) and modified by Bouhin et al. (1992). Poly(A)+ RNA was purified directly from crude extracts with DYNAL “Dynabeads mRNA DIRECT™ Kit”, according to the manufacturer's protocol.
2.4. Polymerase chain reaction (PCR) and reverse transcription (RT) PCR
Amplification reactions were carried out with deoxynucleotides, buffers and enzyme concentrations as recommended by the enzyme manufacturer (New England Biolabs Vent DNA polymerase, with proofreading exonuclease activity). Reactions were performed on an MJ Research Minicycler™ thermocycler with an initial denaturation step at 94 °C for 3 min, followed by 30 cycles at 94 °C for 1 min, 58 °C for 1 min and 72 °C for 1 min. A final polymerization step at 72 °C for 15 min was added after the completion of 30 cycles. The amplification reactions with Urtica dioica genomic DNA as a template were carried out under standard PCR conditions, with the exception of a critical modification in the cycling parameters.
The initial denaturation step at 94 °C for 3 min was followed by 35 cycles at 94 °C for 1 min, 53 °C for 1.5 min and 72 °C for 3 min. A final polymerization and extension step at 72 °C for 20 min was added after the end of 35 cycles. The following oligonucleotide primers were used in the present study: (a) primers corresponding to highly conserved regions of plant peroxidases, sense A (5′-CACTTCCACGACTGCTTTG-3′), sense B (5′-GTTTCTTGTGCTGACATGCTCGC-3′) and antisense C (5′-GAGGTTGGTGTAGTAGGCGTT-3′), (b) Udp1-specific primer, antisense D (5′-GTGTGTGATCCAAGGAGAAC-3′), (c) primers, sense H (5′-GCTTGGTTAGTAGTTATTAG-3′) and
antisense I (5′-TTAACAAAATACATTCTCCC-3′), used for the amplification of a 226 bp fragment of the 3′-untranslated region (3′-UTR) of the Udp1 cDNA and (d) Udp1-specific primers used for the isolation of the whole genomic fragment, sense F (5′-GAACCCAATCTGTAATTTCC-3′), which corresponds to the 5′-end of the Udp1 cDNA and antisense G (5′-GTTAACAAAATACATTCTCCC-3′), which has been designed from the nucleotide sequence of the 3′-UTR of the Udp1 cDNA. First-strand cDNA synthesis was carried out with Ambion “RETROscript™ First-strand Synthesis Kit for RT-PCR”, using mRNA preparations isolated from stinging nettle tissues. 5 μl of the first-strand cDNA product were subsequently used, as a template, in a 50 μl PCR reaction.
2.5. Generation of double-stranded cDNA pool and RACE-PCR
2.7. Northern blot analysis
Purified total RNA extracts (15 μg) from Urtica dioica roots, stems, leaves and inflorescence were separated by electrophoresis on a 1.2% agarose-formaldehyde denaturing gel and subsequently transferred onto Nylon-N Hybond™ membrane. 32P-labelled Udp1 or 3′-UTR cDNA fragments were prepared and used as probes to hybridize the blots. The hybridization reaction was performed overnight at 42 °C, in hybridization buffer containing 0.1% SDS, 50% formamide, 5× SSC, 50 mM phosphate buffer pH 6.8, 0.1% sodium pyrophosphate, 5× Denhardt's solution and 50 μg/ml sheared salmon sperm DNA. Blots were washed under high stringency conditions (0.2× SSC and 0.1% SDS, at 56 °C) and exposed to autoradiographic film for visualization of the bound probe.
2.8. DNA sequence analysis and comparisons
Poly(A)+ RNA was purified from twenty-day-old stinging nettle plants and used as a template for the first-strand synthesis reaction. An oligo-dT adaptor (ad3T: 5′-CGAGGCGGCCGACATGdT[20]-3′) primed the reverse transcription reaction, catalysed by the MMLV reverse transcriptase (“SuperscriptII”, Gibco BRL). A 5′-adaptor primer (ad5L: 5′-AAGCAGTGGTATCAACGCAGAGTGGCCATTATGGCCGGG-3′) was also included in the reaction (adopted from “SMART cDNA library construction Kit”, Clontech Laboratories Inc.). When reverse transcriptase completes the synthesis of the 5′-end of the mRNA, it performs a non-template driven dC-tailing. Primer ad5L anneals to the dC-tail and then the MMLV reverse transcriptase switches templates and continues the synthesis to the end of the oligo, resulting in a pool of mainly full-length single-stranded cDNAs. Most of the single-stranded cDNAs contain the complete 5′-end of the mRNAs and adaptor sequences at each end. Double-strand synthesis was performed in a PCR reaction, with 1 μl of the first-strand reaction as a template in a total reaction volume of 50 μl and adaptor primers ad5s (5′-AAGCAGTGGTATCAACGCAGAGT-3′) and ad3T. Reactions were carried out on an MJ Research Minicycler™ thermocycler, with an initial denaturation step at 95 °C for 1 min, followed by 25 cycles at 95 °C for 20 s and 68 °C for 6 min. A final polymerization step at 72 °C for 5 min was added after the completion of the 25 cycles. 1 μl of the double-strand reaction product was used as a template in a 50 μl RACE reaction volume. Reactions were performed under standard PCR conditions, with the exception that the appropriate specific oligonucleotide primer was included in the reaction at 10-fold excess compared to the concentration of the adaptor oligonucleotide primer ad5s or ad3T.
2.6.
Cloning of PCR products
PCR products of the genomic and cDNA amplification reactions were separated by agarose gel electrophoresis and the obtained fragments were excised, purified and ligated into the pGEM-T-Easy cloning vector (Promega, Madison, WI, USA). The ligation products were used to transform Escherichia coli DH5α host cells and the bacterial colonies carrying the inserts were identified by blue/white selection.
The nucleotide sequence analysis of the cloned cDNA and genomic fragment was performed with the dideoxy chain termination Sanger sequencing method (Sanger et al., 1977). The putative amino acid sequence of the Udp1 peroxidase was deduced from the cDNA nucleotide sequence. Nucleotide and amino acid sequence alignments were performed with the Clustal algorithm (which is available at http://www.ebi.ac.uk/clustalw/).
2.9. Molecular modelling of peroxidases
Peroxidases were modelled by means of the Swiss-Model and Swiss-PdbViewer molecular graphics modelling packages (which are available at www.expasy.ch/spdbv/), according to the similarities of the modelled sequences to the known structures, available in the Protein Data Bank (PDB). Urtica dioica peroxidase, Udp1, was modelled from VPRI to structures 1QGJ (Arabidopsis thaliana peroxidase N), 1SCH (peanut peroxidase) and 1QO4 (Arabidopsis thaliana peroxidase A2, at room temperature). Arabidopsis thaliana peroxidase P71 was modelled from GTRI to structures 1QO4, 1QGJ and 1FHF (soybean peroxidase). Cotton peroxidase was modelled from GTRV to structures 1QGJ and 1FHF. Arabidopsis thaliana peroxidase P62 was modelled from GTRI to structures 1QO4, 1QGJ and 1PA2 (Arabidopsis thaliana peroxidase A2). Arabidopsis thaliana peroxidase P69 was modelled from RPHV to structure 1FHF. Pepper peroxidase was modelled from GTRV to structures 1QO4, 1SCH and 1FHF. Arabidopsis thaliana peroxidase P25 was modelled from QLLK to structures 1QO4, 1FHF and 1SCH. The molecular surface charges were computed using simple Coulomb interactions. The protein is considered to be at pH 7.0, with a default protonation state for all residues. By default, only charged residues (Arg, Lys, Glu and Asp) are taken into account and the charges are located at the corresponding (non-H) atom positions. Net surface charges in the images range from −4.0 C (red) to +4.0 C (blue).
Surface electrostatic potentials were mapped, using the Coulomb computation method and assuming a dielectric constant (solvent) of 80.000.
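The Coulomb mapping described in Section 2.9 can be illustrated with a minimal sketch. It is not the Swiss-PdbViewer implementation: the function name, the unit-conversion constant (332.0637 kcal·Å·mol⁻¹·e⁻², a commonly used value for charges in units of e and distances in Å) and the example coordinates are assumptions made for illustration. Charges follow the rule stated in the text: +1 for Arg and Lys, −1 for Glu and Asp, and all other residues are ignored.

```python
import math

# Assumed conversion constant: kcal * angstrom / (mol * e^2).
COULOMB_CONSTANT = 332.0637

# Formal charges at pH 7.0 for the four residue types named in the text.
RESIDUE_CHARGE = {"ARG": 1.0, "LYS": 1.0, "GLU": -1.0, "ASP": -1.0}


def coulomb_potential(point, charged_sites, dielectric=80.0):
    """Coulomb potential at `point` from a list of (residue_name, (x, y, z)).

    Residues other than Arg, Lys, Glu and Asp contribute nothing.
    The result is in kcal/(mol*e) under the assumed constant above.
    """
    total = 0.0
    for name, position in charged_sites:
        charge = RESIDUE_CHARGE.get(name.upper(), 0.0)
        if charge == 0.0:
            continue
        distance = math.dist(point, position)
        if distance == 0.0:
            raise ValueError("probe point coincides with a charge site")
        total += charge / distance
    return COULOMB_CONSTANT * total / dielectric


# Hypothetical sites: one Lys and one Asp placed symmetrically around the origin,
# plus an uncharged Ala that is skipped.
sites = [("LYS", (2.0, 0.0, 0.0)), ("ASP", (-2.0, 0.0, 0.0)), ("ALA", (0.0, 1.0, 0.0))]
print(coulomb_potential((0.0, 0.0, 0.0), sites))
```

Because the two opposite charges are equidistant from the probe point, their contributions cancel; moving the probe towards the Lys site makes the potential positive.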
3. Results
3.1. Isolation of Urtica dioica peroxidase 1 (Udp1) full-length cDNA clone
RT-PCR reactions were performed on mRNA purified from root, leaf and stem tissue, as well as from wounded two-month-old Urtica dioica plants. Three primers for PCR reactions were designed according to highly conserved regions of plant peroxidases: two sense primers, A and B, and one antisense primer, C. The PCR products of the expected size (approximately 550 bp) were resolved by 1.5% agarose gel electrophoresis (data not shown). The purified fragments were cloned in the pGEM-T-Easy vector and sequenced from both strands. A 580 bp product A/C was isolated (with RT-PCR reactions) from root and stem, while a 540 bp product B/C was isolated from wounded two-month-old plants. Both putative peptides A/C and B/C exhibited significant similarities to plant peroxidase sequences (Fig. 1). A 3′-RACE-PCR reaction on the double-stranded full-length template, with primers A and ad3T, produced two fragments of approximately 900 and 1000 bp. The 1000 bp fragment was sequenced and also found to exhibit significant similarities to plant peroxidase sequences (data not shown). Based on the triple alignment of the three clones (A/C, B/C and A/ad3T), a more specific internal primer was designed (antisense primer D), corresponding to the peptide sequence VLLGSH. This primer was used in a 5′-RACE-PCR reaction on the double-stranded cDNA pool from nettle seedlings as a template. The PCR product, ad5L/D, was cloned and sequenced from both strands. Alignment of the nucleotide sequences of the ad5L/D and A/ad3T PCR fragments showed 100% identity in their overlapping regions. An additional sense primer F was finally designed and used in a 3′-RACE-PCR reaction on the double-stranded cDNA pool. The approximately 1300 bp PCR amplified product, F/ad3T, hereafter referred to as Udp1, was cloned in the pGEM-T-Easy vector.
Three independent clones were isolated and fully sequenced from both strands and were all found to be identical to each other. Sequence comparison to known plant peroxidase family members revealed strong homologies all over the Udp1 protein (Fig. 1).
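The peptide encoded by each primer can be checked by translating it, after reverse-complementing in the case of antisense primers. The sketch below assumes the standard genetic code and reading frame 1; the helper names are illustrative, and the primer sequences are those listed in Section 2.4.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate frame 1 of a DNA sequence, ignoring a trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


SENSE_A = "CACTTCCACGACTGCTTTG"
SENSE_B = "GTTTCTTGTGCTGACATGCTCGC"
ANTISENSE_C = "GAGGTTGGTGTAGTAGGCGTT"
ANTISENSE_D = "GTGTGTGATCCAAGGAGAAC"

print(translate(SENSE_A))                          # HFHDCF
print(translate(SENSE_B))                          # VSCADML
print(translate(reverse_complement(ANTISENSE_C)))  # NAYYTNL
print(translate(reverse_complement(ANTISENSE_D)))  # VLLGSH
```

The last line reproduces the VLLGSH peptide that the text gives for antisense primer D.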
3.2. Nucleotide and amino acid sequence analysis of Udp1
The full-length cDNA clone, designated Udp1, consists of 1315 nucleotides, excluding the poly(A)+ tail, with an open reading frame of 1014 nucleotides (GenBank accession number AY660964). The 94 nucleotides of the 5′-untranslated region (5′-UTR) contain 44% adenine (A), a rare feature also observed in mRNAs predominantly encoding stress-induced proteins. This observation could be associated with a translational regulation of the gene activity (Ostergaard et al., 1998). The sequence flanking the initiation codon AUG (5′-AAAAUGG-3′) is in absolute accordance with the conserved [A/G]XXAUGG Kozak consensus (Kozak, 1981), since it has purines in positions −3 (A) and +4 (G). The 5′-UTR of the Udp1 cDNA sequence also contains in frame a proximal stop
codon (UAA), upstream of the initiation codon, thus confirming the 5′-border of the translation starting site (Fig. 2). In yeast and plants, the 3′-end processing sequences consist of an upstream efficiency element (UE), an A-rich positioning element (PE) and multiple U-rich regions situated upstream of the cleavage site (Graber et al., 1999). In the Udp1 3′-UTR, a variant (AAUUAAA) of the canonical AAUAAA PE is found 29 bases upstream of the cleavage site and is tandemly repeated in position −33. An alternative PE (AAUUAA) is also observed 70 bases upstream of the cleavage site (Fig. 2). In Udp1, upstream of the A-rich PE, the sequence displays some evidence of UE, with the hexamer UUGUUU at positions −44 and −57 (Fig. 2). Application of von Heijne rules predicted a cleavable N-terminal signal sequence of 27 amino acid residues, responsible for entry into the rough endoplasmic reticulum, and a mature protein of 310 amino acid residues with a molecular weight (MW) of 33.803 kDa and an estimated pI of 8.95. A signal peptide usually contains many hydrophobic residues in the middle region (von Heijne, 1990). Ala and Leu amino acids constitute 41% of the total residues in the Udp1 signal peptide. The putative cleavage site is located between positions 27 and 28: VHG-KVP (Fig. 2). The alignment with other plant peroxidases suggests that Udp1 has a unique six-residue carboxyl-terminal extension, SLVSSY, from amino acid 331 to amino acid 337 (Figs. 1 and 2). C-terminal tails are believed to direct plant proteins to the vacuoles. The Udp1 protein also contains nine cysteine residues. Eight of them are identically positioned to those of HRP C, likely participating in the formation of four disulfide bridges (Fig. 1). Five putative N-glycosylation sites can be identified: N(139)VSE, N(180)FTN, N(183)ATE, N(210)GSV and N(219)RSG (Figs. 1 and 5).
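Two of the sequence features described here are easy to verify in a few lines. The sketch below uses only values quoted in the paper (the AUG context 5′-AAAAUGG-3′ and the 27-residue signal peptide given in the Fig. 2 legend). The regular expression is one possible encoding of the [A/G]XXAUGG consensus, and the variable names are illustrative.

```python
import re

# [A/G]XXAUGG: purine at -3, any two bases, AUG, then G at +4.
KOZAK = re.compile(r"[AG]..AUGG")

context = "AAAAUGG"  # sequence flanking the Udp1 initiation codon
matches_kozak = KOZAK.fullmatch(context) is not None

signal_peptide = "MANHPCFSKTILGMALLLLLAAASVHG"
ala_leu = sum(signal_peptide.count(aa) for aa in "AL")
ala_leu_percent = 100 * ala_leu / len(signal_peptide)

print(matches_kozak, len(signal_peptide), ala_leu, round(ala_leu_percent))
# True 27 11 41
```

Eleven of the 27 residues are Ala or Leu, which rounds to the 41% stated in the text.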
Screening the available databases (GenBank/EMBL/DDBJ) with the different Blast algorithms demonstrates that the Udp1 protein shares the highest degree of amino acid sequence identity with: (a) a bacterial-induced peroxidase from Gossypium hirsutum (cotton) (55% identity), (b) P71 and P62 peroxidases from Arabidopsis thaliana (57% and 52% identity, respectively) and (c) a Capsicum annuum (pepper) peroxidase (56% identity). The amino acid sequence alignment is presented in Fig. 1. Protein alignment of the Udp1 overlapping regions with the two putative peptides A/C and B/C reveals that Udp1 peroxidase has the highest degree of identity to the peptide B/C, which was amplified by a reverse transcription reaction on mRNA template isolated from wounded nettle leaves (Fig. 1). The phylogenetic relationships of the Udp1 protein and the putative peptides B/C and A/C with the Arabidopsis thaliana peroxidases, based on sequence alignment of the mature proteins, are illustrated in Fig. 6.
3.3. Root-specific expression of the Udp1 gene
Northern blot analysis has revealed that Udp1 mRNA expression is exclusively restricted to the roots (Fig. 3). Total RNA preparations isolated from roots, stems, leaves (10 days old and 2 months old) and also from inflorescence with immature seeds (2 months old) were examined for the presence of Udp1 mRNA. The Udp1 cDNA and a PCR-generated fragment corresponding to the 3′-UTR of the Udp1 mRNA were labelled with 32P-dCTP and separately used as probes in
HRP C   ------------QLTPTFYDNSCPNVSNIVRDTIVNELRSDPRIAASILRLHFHDCFVNG
A/C     ---------------------------------------------------HFHDCFVNG
B/C     ------------------------------------------------------------
Cotton  -----------QGTRVGFYARTCPRAESIVRSTVQSHFRSNPNIAPGLLRMHFHDCFVQG
Pepper  -----------QGTRVGFYSSTCPRAESIVQSTVRSHFQSDPTVAPGLLTMHFHDCFVQG
P62     -----------QGTRIGFYSTTCPNAETIVRTTVASHFGSDPKVAPGLLRMHNHDCFVQG
P71     --QATARPGPVSGTRIGFYLTTCPRAETIVRNAVNAGFSSDPRIAPGILRMHFHDCFVQG
P69     QGNRGSNSGGGRRPHVGFYGNRCRNVESIVRSVVQSHVRSIPANAPGILRMHFHDCFVHG
P25     -----------QLLKNGYYSTSCPKAESIVRSTVESHFDSDPTISPGLLRLHFHDCFVQG
Udp1    -----------KVPRIGFYDETCPKAESIVTKAVKKGLKENPRIAPGILRIAFHDCFVRG

HRP C   CDASILLDNTTSFRTEKDAFGNANSARGFPVIDRMKAAVESACPRTVSCADLLTIAAQQS
A/C     CDASVLLENTTSTNGEKFAAPNINSLRGFEVIDEIKYELEKLCPQKVSCADILAVAARDS
B/C     ----------------------------------------------VSCADML--AARDA
Cotton  CDASILIDGPN---TEKTAPPNR-LLRGYEVIDDAKTQLEATCPGVVSCADILTLAARDS
Pepper  CDASILISGSG---TERTAPPNS-LLRGYEVIDDAKQQIEAICPGVVSCADILALAARDS
P62     CDGSVLLSGPN---SERTAGANV-NLHGFEVIDDAKRQLEAACPGVVSCADILALAARDS
P71     CDGSILISGAN---TERTAGPNL-NLQGFEVIDNAKTQLEAACPGVVSCADILALAARDT
P69     CDGSVLLAGNT---SERTAVPNR-SLRGFEVIEEAKARLEKACPRTVSCADILTLAARDA
P25     CDGSVLIKGKS---AEQAALPNL-GLRGLEVIDDAKARLEAVCPGVVSCADILALAARDS
Udp1    CDASVLIEGPG---TEKTSGANR-NIQGYNVIDDAKTELERVCPGVVSCADILTLAARDA

HRP C   VTLAGGP---SWRVPLGRRDSLQAFLDLANANLPAPFFTLPQLKDSFR-NVGLNRSSDLV
A/C     VSALARPSLISWSVKYGRKDSLTASMDAANKNLPPFFLDLDVLIAFFK-TKGFT-VEDTV
B/C     TVLTGGA---SWEVPTGRRDGLISLVNETTS-LPGPTETITDQIQKFETVMGLD-TQDLV
Cotton  VFLTRGI---NWAVPTGRRDGRVSLASDTTI-LPGFRESIDSQKQKFA-AFGLN-TQDLV
Pepper  VLVTKGL---TWSVPTGRRDGLVSRASDTSD-LPGFTESVDSQKQKFS-AKGLN-TQDLV
P62     VSLTNGQ---SWQVPTGRRDGRVSLASNVNN-LPSPSDSLAIQQRKFS-AFRLN-TRDLV
P71     VILTQGT---GWQVPTGRRDGRVSLASNANN-LPGPRDSVAVQQQKFS-ALGLN-TRDLV
P69     VVLTGGQ---RWEVPLGRLDGRISQASDVN--LPGPSDSVAKQKQDFA-AKTLN-TLDLV
P25     VDLSDGP---SWRVPTGRKDGRISLATEASN-LPSPLDSVAVQKQKFQ-DKGLD-THDLV
Udp1    TVLTGGA---SWKVPTGRKDGLVSLVAEAGP-LPGPRENVSEQIRKLD-EIGLN-TQDLV

HRP C   ALSGG-HTFGKNQCRFIMDRLYNFSNT--GLPDPTLNTTYLQTLRGLCP-LNGNLSALVD
A/C     ALSGG-HTIGKAHCPTYRNRIHNKT--------AIIDEQFASALQTDCPKFSG--GETVA
B/C     VLLGS-HTIGTTSCPLFQFRLYNFTNATESGADPSIDPEFLPTLRALCP-ENEVSSVRVD
Cotton  ALVGG-HTIGTSACQLFSYRLYNFTN---GGPDPTINPAFVPQLQALCP-QNGDGSRLID
Pepper  TLVGG-HTIGTSACQFFSYRLYNFNST--GGPDPSIDASFLPTLRGLCP-QNGDGSKRVA
P62     TLVGGGHTIGTAACGFITNRIFNSSG---NTADPTMDQTFVPQLQRLCP-QNGDGSARVD
P71     VLVGG-HTIGTAGCGVFRNRLFNTTG---QTADPTIDPTFLAQLQTQCP-QNGDGSVRVD
P69     TLVGG-HTIGTAGCGLVRGRFVNFNGT--GQPDPSIDPSFVPLILAQCP-QN--GGTRVE
P25     TLLGA-HTIGQTDCLFFRYRLYNFTVT--GNSDPTISPSFLTQLKTLCP-PNGDGSKRVA
Udp1    VLLGS-HTLGTTSCALFRFRLYNFTNATESGADPSIDPKFLPTLRKLCP-DGGNGSVRVH

HRP C   FDLRTPTIFDNKYYVNLEEQKGLIQSDQELFSSPNATDTIPLVRSF--ANSTQTFFNAFV
A/C     ADLDDPVDNNDAYYTNL-------------------------------------------
B/C     LDNGSGENFDSSFYINLSNGRGILQSDQVLWTDPRTQPFVRRLLN---PYDAYYTNL--
Cotton  LDTGSGNRFDTSFFANLRNVRGILESDQKLWTDPSTRTFVQRFLGE-RGSRPLNFNVEFA
Pepper  LDTGSVNNFDTSYFSNLRNGRGILESDQKLWTDDSTKVFIQRYLGL-RGFLGLRFGVEFG
P62     LDTGSGNTFDTSYFINLSRNRGILQSDHVLWTSPATRSIVQEFMAP-RG----NFNVQFA
P71     LDTGSGSTWDTSYYNNLSRGRGVLQSDQVLWTDPATRPIVQQLMAP-RS----TFNVEFA
P69     LDEGSVDKFDTSFLRKVTSSRVVLQSDLVLWKDPETRAIIERLLGL-RR-PSLRFGTEFG
P25     LDIGSPSKFDESFFKNLRDGNAILESDQRLWSDAETNAVVKKYASRLRGLLGFRFDYEFG
Udp1    LDNRSGEKFDTTFYKNLKRGRGVLQSDQVLWTDLRTQPFVRRLLDS-EAYDALNFKVEFG

HRP C   EAMDRMGNITPLTGTQ-GQIRLNCRVVNSNSLLHDMVEVVDFVSSM
A/C     ----------------------------------------------
B/C     ----------------------------------------------
Cotton  RSMVKMSNIGVKTGTN-GEIRRICSAIN------------------
Pepper  RSMVKMSNIEVKTGTN-GEIRKVCSAIN------------------
P62     RSMVKMSNIGVKTGTN-GEIRRVCSAVN------------------
P71     RSMVRMSNIGVVTGAN-GEIRRVCSAVN------------------
P69     KSMVKMSLIEVKTGSD-GEIRRVCSAIN------------------
P25     KAMIKMSSIDVKTDVD-GEVRKVCSKVN------------------
Udp1    KAMVKMSLIGVKTNPKESEIRKVCTAVN--SL----------VSSY
Fig. 1. Alignment of the cDNA deduced mature amino acid sequence of the putative nettle peroxidase Udp1 with the mature amino acid sequences of horseradish peroxidase HRP C, cotton peroxidase, pepper peroxidase and Arabidopsis thaliana peroxidases P25, P62, P69 and P71. Putative N-terminal signals for endoplasmic reticulum targeting have been removed. C-terminal extension tails are shown in italics. Active site residues are shown on a grey background. Cysteine residues involved in the formation of disulfide bridges are indicated by arrowheads. Side chain ligands to the distal and proximal Ca2+ ions are marked with asterisks. Putative N-glycosylated amino acid triplets are underlined. Intron positions in the corresponding genes are indicated by residues in white print on a black background (phase 0 introns between two marked residues, phase 1 or 2 introns within a single residue). The protein accession numbers for the retrieval of these peroxidase sequences from the Swiss-Prot database are: HRP C: P00433, Cotton: AAL73112, Pepper: AAL35364, P62: Q9FKA4, P71: Q43387, P69: Q96511 and P25: O80822.
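Percent identity between two rows of an alignment such as Fig. 1 is often computed over the columns in which both sequences carry a residue. The sketch below follows that convention, which is an assumption: the paper does not state which denominator its Blast-derived identity values use. The function name is illustrative. The two example rows are a single 60-column block of the Udp1 and cotton sequences from Fig. 1, so the printed value describes that fragment only and is not comparable with the whole-protein figures quoted in the text.

```python
def percent_identity(row_a, row_b, gap="-"):
    """Percent identity over columns where neither aligned row has a gap."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    aligned = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not aligned:
        return 0.0
    matches = sum(1 for a, b in aligned if a == b)
    return 100.0 * matches / len(aligned)


# One alignment block copied from Fig. 1 (not the full-length proteins).
UDP1_BLOCK = "CDASVLIEGPG---TEKTSGANR-NIQGYNVIDDAKTELERVCPGVVSCADILTLAARDA"
COTTON_BLOCK = "CDASILIDGPN---TEKTAPPNR-LLRGYEVIDDAKTQLEATCPGVVSCADILTLAARDS"

print(f"{percent_identity(UDP1_BLOCK, COTTON_BLOCK):.1f}% identity in this block")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which gives lower values when gaps are present.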
Gene structure (sizes in bp): E1, 498; I1, 268; E2, 163; I2, 925; E3, 662.
60 5'- aacccaatctgtaatttcctaaattaagctaaaaaccaaaaaccccatttctcactagta aaacactactttactatactatactaacacaaaATGGCAAACCACCCTTGTTTTAGCAAA 120 M A N H P C F S K ACCATCCTTGGGATGGCTTTGCTTCTTCTCCTTGCCGCAGCCTCGGTTCACGGAAAAGTC 180 T I L G M A L L L L L A A A S V H G K V CCACGCATCGGGTTCTATGATGAAACGTGCCCTAAGGCCGAGAGTATCGTCACTAAGGCG 240 P R I G F Y D E T C P K A E S I V T K A GTCAAGAAGGGCCTAAAAGAAAACCCTAGAATAGCCCCGGGGATTCTAAGGATTGCCTTC 300 V K K G L K E N P R I A P G I L R I A F E1 CACGACTGCTTTGTCCGAGGCTGCGACGCCTCGGTCCTCATAGAAGGCCCCGGAACCGAG 360 H D C F V R G C D A S V L I E G P G T E AAGACTTCAGGGGCGAATCGCAACATACAAGGCTACAACGTCATCGACGATGCCAAGACT 420 K T S G A N R N I Q G Y N V I D D A K T GAGCTCGAAAGAGTCTGCCCTGGCGTCGTTTCTTGCGCCGATATCCTCACGCTCGCCGCA 480 E L E R V C P G V V S C A D I L T L A A CGTGACGCCACTGTCTTGgtatgatttacatacatatatacatataattgcactaattaa 540 R D A T V L gccattgtatattagccaatttttcatcttgtttcaaaaatagtccgttgtgatgacatc 600 aaaaattataatatgctacggtatttcgcaaaaatagacgaaaaattggctaaagtgcac 660 tttataatagaaatttaacgctcaataaagcgcgtaaataagtgtgttcttgaaaaattt 720 ataatatactatatatttcaataatagccatatgggtttgttttagACCGGAGGAGCTAG 780 T G G A S TTGGAAGGTGCCAACCGGAAGGAAAGACGGTTTGGTTTCATTAGTAGCGGAGGCAGGGCC 840 W K V P T G R K D G L V S L V A E A G P E2 TTTGCCGGGTCCGAGAGAGAATGTTAGTGAACAGATCAGAAAGCTCGATGAGATTGGTCT 900 L P G P R E N V S E Q I R K L D E I G L CAACACTCAAGACCTTGTTGTTCTTCTTGgtatataccgctcatgattactcatttattt 960 N T Q D L V V L L ttgaatagcactagtttctaagttatttattactcacttatttttctggtcgcatttgaa 1020 attatatctaatattttattttagaaaaggaaaaaaaaaacatatcccccacaatctaaa 1080 taccaaacccgacgactaagttcgtaataatttgttaaatcaaacaaatatttaaatctt 1140 ggctttaatatttccaaaaaatacgtagaaatttataatctaaattttatgttttctatt 1200 ttttgcatcgtcactttttattttgtcattatatcgattctctatagatccgcacataaa 1260 agttgatgaaaaaagtaagaggtattagcatgaattagtctattttagtactgactttta 1320 aattatagtcaaagtttgtataattttaaaaattagctcaaacaggagaagtaaataaaa 1380 taaactgggaagtaaatacaatattaactggccggtaaataatgataattggctattttt 1440 
ttaatgaatcaaaatttgggctataattttaaaaaagaaaatcatttgggctattttact 1500 taacctcccaaaaaataattaatgtattataacaactttattactaactaagccagcaaa 1560 aatataaaatttgacaaaatagtcaacgtttatgaacgttttaacatctcgcatttttac 1620 tcaatttttggtcgttattagatctctaatactaaaaataaacagtcatcactaacaaat 1680 ttattttgattaattttacttaatgattaatttgtaatattaactaaaacttggttattt 1740 agctataaaatcttaatttacaagaggtatataaagaaagagaaagcaacataaaaagac 1800 atttattaatcctagttcaaaattatatggacaagagaggctaactaagtgtagGCTCAC 1860 G S ACACCCTCGGAACAACTTCGTGCGCGTTGTTCCGATTCAGGCTGTACAACTTCACGAACG 1920 H T L G T T S C A L F R F R L Y N F T N CCACCGAGTCTGGGGCAGACCCTTCCATCGATCCGAAGTTCCTCCCGACACTCAGGAAGC 1980 A T E S G A D P S I D P K F L P T L R K TATGCCCCGACGGAGGAAACGGTTCGGTGCGTGTCCATCTCGATAATCGCAGCGGCGAAA 2040 L C P D G G N G S V R V H L D N R S G E AATTCGACACCACTTTCTACAAAAATCTCAAGAGGGGTCGCGGAGTCCTCCAGTCGGACC 2100 K F D T T F Y K N L K R G R G V L Q S D AGGTTCTATGGACCGACCTGAGGACTCAGCCTTTTGTCCGTCGACTCCTCGACTCTGAAG E3 2160 Q V L W T D L R T Q P F V R R L L D S E CCTATGACGCTCTTAACTTCAAAGTCGAGTTCGGAAAAGCGATGGTCAAGATGAGCCTCA 2220 A Y D A L N F K V E F G K A M V K M S L TCGGTGTCAAAACTAACCCTAAAGAGAGTGAAATTAGGAAGGTTTGTACGGCTGTTAATA 2280 I G V K T N P K E S E I R K V C T A V N GCTTGGTTAGTAGTTATtagatatatcatatatctacgttgaaatttccaagtgttgttt 2340 S L V S S Y * ttcgttattgggtggggttggaataattttgtcatcgatcggttggtggaagtgacgtgt 2400 ttgggatggtgacgtgtgggaggcgtgtttcgtttactaattaagaaaacttgtttattg 2460 cttttgttttgtgtaattaattaaatgggagaatgtattttgttaacaaaaaaaaa -3'
Fig. 2. Nucleotide sequence and structural organization of gUdp1, the nettle peroxidase Udp1 genomic locus. The graphic on the top of the figure represents the structure and the respective sizes in base pairs (bp) of the three exons and two introns that constitute the Udp1 genomic area. The exon sequences are indicated by the grey background. Coding sequence is uppercase, whereas non-coding sequence is lowercase. The deduced amino acid sequence is uppercase below the coding nucleotide sequence. The N-terminal endoplasmic reticulum targeting signal (MANHPCFSKTILGMALLLLLAAASVHG) is shown in bold and the C-terminal putative vacuolar targeting signal (SLVSSY) is double-underlined. The in-frame stop codon UAA (5′) is dotted-underlined and the 3′-end stop codon UAG is denoted by an asterisk. The putative N-glycosylation sites are single-underlined. The putative 3′-end processing positioning elements are boxed and the putative upstream efficiency elements are shown in italics. The 5′-GT and the 3′-AG conserved splice junctions are indicated by brackets.
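The segment sizes given at the top of Fig. 2 determine the coordinates of each exon and intron within gUdp1. The short sketch below derives them as 1-based inclusive positions; the variable names are illustrative.

```python
from itertools import accumulate

# Segment sizes in bp, in 5' to 3' order, as given in Fig. 2.
SEGMENTS = [
    ("exon 1", 498),
    ("intron 1", 268),
    ("exon 2", 163),
    ("intron 2", 925),
    ("exon 3", 662),
]

ends = list(accumulate(size for _, size in SEGMENTS))
starts = [1] + [end + 1 for end in ends[:-1]]
coordinates = {name: (start, end) for (name, _), start, end in zip(SEGMENTS, starts, ends)}

total_length = ends[-1]
exon_length = sum(size for name, size in SEGMENTS if name.startswith("exon"))

for name, (start, end) in coordinates.items():
    print(f"{name}: {start}-{end}")
print("total:", total_length, "bp; exons:", exon_length, "bp")
```

The total of 2516 bp agrees with the approximately 2500 bp genomic PCR product, and the exon 3 start at position 1855 matches the numbering of the sequence in Fig. 2.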
hybridization reactions performed at high stringency conditions. Even though peroxidases generally exhibit a high degree of sequence similarity, the 3′-UTR region is considered to be significantly variable and absolutely specific for each mRNA
and thus was selected as a Udp1-specific probe. Even after long exposure of the autoradiograms, the signal of the 1300 bp Udp1 mRNA was detected exclusively in the roots, with either of the two radiolabelled probes (Fig. 3).
Fig. 3 panel labels: (A) probe Udp1 cDNA; (B) probe 3′-UTR; lanes: roots, stems, leaves, inflorescence; bands: Udp1 mRNA (1.3 kbp), 28S rRNA, 18S rRNA, Chl-rRNA.
Fig. 3. The expression of the Udp1 mRNA is restricted to the roots. Total RNA preparations from 2 months old Urtica dioica roots, stems, leaves and inflorescence were separated by denaturing agarose gel electrophoresis and subsequently transferred onto nylon membrane. Hybridization reaction was performed at high stringency conditions. 32P-dCTP labelled Udp1 cDNA (A) or 3′-UTR (B) fragments were prepared and used as probes to hybridize the RNA blots. The upper part of the figure reveals the hybridization signals on the autoradiographic films, whereas the lower part shows the ethidium bromide stained gels before the transfer, with the 28S and 18S characteristic rRNA bands. The low molecular weight zones in the lanes of the leaves correspond to the chloroplastic rRNA (Chl-rRNA).
Additionally, the strong Udp1 transcriptional activity and mRNA accumulation in the roots did not differ detectably between young seedlings (10 days old, data not shown) and mature plants (2 months old).
3.4. Induction of Udp1 transcriptional activity by stress
Wounding (mechanical stress) and UV radiation treatment induced ectopic transcriptional activity of the Udp1 gene in leaves, as demonstrated by Northern blot analysis in Fig. 4A. On the other hand, chilling at 4 °C (temperature stress) had no detectable effect on the expression of the Udp1 gene in either roots or leaves, except for a small reduction in the chilled roots, as illustrated in Fig. 4B. The Udp1-specific probe used was a purified and 32P-dCTP labelled 226 bp 3′-UTR PCR fragment.
3.5. Molecular cloning and structural organization of the Udp1 genomic locus
A PCR amplification reaction, with Urtica dioica genomic DNA as a template and flanking primers F and G, generated a product of approximately 2500 bp, designated gUdp1 (GenBank accession number AY660965). Three independent cloned fragments were fully sequenced on both strands. Alignment with the Udp1 cDNA sequence revealed 100% identity in three distinct regions, indicating that the gUdp1 locus contains three exons and two introns: exon 1 (498 bp), intron 1 (268 bp), exon 2 (163 bp), intron 2 (925 bp) and exon 3 (662 bp). The DNA sequence and structural organization of the
gUdp1 genomic locus is presented in Fig. 2. The two introns, I1 and I2, contain the consensus dinucleotide sequences GT and AG at the 5′ and 3′ corresponding termini (splice junctions), regulating the accuracy and efficiency of the splicing process. Interestingly, the gUdp1 locus is missing an intronic sequence, frequently present in many of the plant peroxidase genomic loci published so far, that is an intron between the codons Asn 47 and Gly 48 close to the distal His 42 of HRP C (Fig. 1). Precisely at this position of the protein an intron is found in horseradish prxC2 and prxC3 (Fujiyama et al., 1990), in tomato TAP1 and TAP2 (Roberts and Kolattukudy, 1989), in wheat pseudogene POX1 (Hertig et al., 1991) and in barley seed BP 2A (Theilade and Rasmussen, 1992). 3.6. Molecular modelling of the Udp1 protein Theoretical comparative models of the Udp1, cotton, pepper, P25, P62, P69 and P71 peroxidases were constructed from the primary structure of the corresponding mature proteins (Fig. 5), using the automated protein modelling server Swiss-model (Guex and Peitsch, 1997). This modelling approach dictates that the predicted Udp1 mature protein (the residue position 1 is set at the valine residue of VPRI) contains the important and characteristic residues of the plant peroxidase superfamily (Fig. 1). At the closed proximal site, a His residue (His 164) is likely ligated to the iron atom of the heme group and is hydrogen bonded to an aspartate residue (Asp 243). At the accessible distal site, a His residue (His 42) and an Arg residue (Arg 38) are probably involved in the reduction of H2O2. Fig. 5A illustrates the conserved helices (A–J) of the predicted three-
Fig. 4 panel labels: (A) leaves of 2 months old U. dioica plants; treatments: control, UV, wounding; (B) leaves and roots; treatments: control, 4 °C; probe: 3′-UTR; bands: Udp1 mRNA (1.3 kbp), 28S rRNA, 18S rRNA, Chl-rRNA.
Fig. 4. The ectopic expression of Udp1 mRNA in the leaves is induced by UV radiation and mechanical wounding, but not by chilling. (A) 2 months old plants were exposed to UV radiation or wounding. Total RNA was isolated from the leaves and subsequently separated by denaturing agarose gel electrophoresis. (B) 2 months old Urtica dioica plants were chilled at 4 °C and total RNA was isolated from leaves and roots and separated by denaturing agarose gel electrophoresis. In both cases (A and B), hybridization reactions were performed at high stringency conditions, using the 32P-dCTP labelled 3′-UTR specific probe. The upper part of the figure demonstrates the hybridization signals on the autoradiographic films, whereas the lower part shows the ethidium bromide stained gels, before the transfer, with the 28S and 18S rRNA characteristic bands. The low molecular weight zones in the lanes of the leaves correspond to the chloroplastic rRNA (Chl-rRNA).
dimensional structure of the enzyme. The central heme group must be sandwiched between the two protein domains, at a position indicated in Fig. 5A by an arrow. The Udp1 peroxidase model strongly suggests that three distinct arginine residues (Arg 68, Arg 137 and Arg 175) “guard” the entrance of the heme pocket (Fig. 5B). Interestingly, the cotton and Arabidopsis P71 peroxidase models reveal the presence of Arg 68 and Arg 137 (for cotton) and Arg 137 and Arg 175 (for P71) closely located at the entrance of the heme pocket. The relative positioning of these arginine residues could
play an essential role in substrate recognition and may also indicate similar mechanisms for substrate binding specificity among the three peroxidases. Mapping of the electrostatic potential demonstrates a characteristic uneven distribution of charges on the surface of the Udp1 protein. Positively charged surface residues appear to be clustered on a remote area, away from the putative substrate entry channel. This characteristic structural feature is shared with cotton, P62 and P69 peroxidases and – to a lesser extent – with P71 and pepper peroxidases (Fig. 5C and D).
Fig. 5. Comparative molecular modelling of Udp1 with representative peroxidase family members. Peroxidases were modelled by means of the Swiss-Model and Swiss-PdbViewer molecular graphics modelling packages, using the coordinates of known structures available in the Protein Data Bank (PDB), according to the similarity of the modelled sequence to the available resolved structure. Positively and negatively charged areas are shown in blue and red, respectively. (A) Theoretical three-dimensional molecular model of the Udp1 protein structure. A ribbon representation of the Udp1 peroxidase, with the predicted helices A–J, marked with white letters, is illustrated on the left. The arrow indicates the location of the central heme group sandwiched between the two protein domains. The three-dimensional surface structure of the Udp1 protein is shown on the right. The arrow indicates the entry into the heme cavity. The red arrowheads denote the predicted sites of N-glycosylation (Asparagine 219 is not visible in the surface view). All putative N-glycosylation sites appear to cluster at the lower part of the molecule, as viewed in this figure. (B) Theoretical three-dimensional models of the surface of the Udp1, Arabidopsis P71 and cotton peroxidases. Molecules are aligned similarly to each other, with the arrows indicating the entry into the substrate channel. The red arrowheads denote arginine residues (Udp1: Arg 68, 137 and 175) “guarding” the entry into the heme pocket. Located at similar positions, compared to those of Udp1, are Arg 68 and 137 of cotton, as well as Arg 137 and 175 of P71. (C) Surface structure and mapping of the electrostatic potential of Udp1, cotton, pepper and Arabidopsis thaliana peroxidases P62, P69 and P71, based on the homology modelling approach. Protein molecules are aligned similarly to each other, with the arrows indicating the entry into the substrate channel. 
A characteristic uneven distribution of surface charges is evident for Udp1, cotton, P62 and P69 and less pronounced in the cases of pepper and P71 peroxidases. (D) Theoretical three dimensional models of the surface of Udp1, Arabidopsis thaliana P25 and cotton peroxidases. The red arrowheads show distinct lysine and arginine residues, which appear to form clusters of aligned positive charges on the surface of the molecules. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
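The putative N-glycosylation sites named in the figures (Asn 180, 183, 210 and 219) can be located in the deduced exon 3 sequence of Fig. 2 with a simple motif scan. The sketch below assumes the commonly used sequon rule N-X-S/T (X being any residue except proline); the section does not state how the authors predicted the sites, so this is an illustration and not their method. The offset of 161 between fragment and mature-protein numbering is also an assumption, inferred here from the residue numbers quoted in the text (for example His 164 and Arg 175) and not given explicitly in the source.

```python
import re

# Deduced amino acid sequence encoded by exon 3, copied from Fig. 2
# (the leading Gly codon is split by intron 2).
EXON3_PROTEIN = (
    "GSHTLGTTSCALFRFRLYNFTNATESGADPSIDPKFLPTLRK"
    "LCPDGGNGSVRVHLDNRSGEKFDTTFYKNLKRGRGVLQSD"
    "QVLWTDLRTQPFVRRLLDSEAYDALNFKVEFGKAMVKMSL"
    "IGVKTNPKESEIRKVCTAVNSLVSSY"
)

# Assumption (inferred, not stated in the source): residue 1 of this
# fragment corresponds to residue 162 of the mature protein.
OFFSET = 161

# Lookahead so that overlapping sequons would also be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein, offset=0):
    """Return 1-based positions of Asn residues in N-X-S/T sequons (X != Pro)."""
    return [match.start() + 1 + offset for match in SEQUON.finditer(protein)]


if __name__ == "__main__":
    print(find_sequons(EXON3_PROTEIN, OFFSET))
```

Under these assumptions the scan reports four sites in exon 3; the fifth site named in the figure labels, Asn 139, lies upstream of this fragment.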
Fig. 5 panel labels: (A) N-terminal, C-terminal, helices A–J (including D′, F′ and F″), Asn139, Asn180, Asn183, Asn210, Asn219; (B) Udp1 (Arg68, Arg137, Arg175), P71 (Arg137, Arg175), cotton (Arg68, Arg137); (C) Udp1, P62, P69, pepper, cotton, P71; (D) Udp1, P25 and cotton, with the labelled lysine and arginine residues.
4. Discussion
Seventy-three peroxidase genes have been identified throughout the Arabidopsis thaliana genome. From the
cladogram illustrated in Fig. 6, it appears that the Udp1 protein and the B/C peptide cluster together within a group of peroxidases designated group 2 by Tognolli et al. (2002), whereas the A/C peptide clusters with
Fig. 6 cladogram labels, in the order printed: P34, P33, HRP C, P32, P37, P38, P22, P23, P54, P53, P58, P59, P10, P72, P36, P49, P14, P15, P9, P20, P67, P68, P52, P4, P5, A/C nettle, P11, P40, P17, P64, P66, P47, P12, P48, P29, P42, P21, P69, P70, P71, P62, Udp1, B/C nettle, P25, P13, P43, P57, P28, P8, P44, P60, P26, P61, P30, P3, P39, P1/2, P27, P56, P24, P18, P46, P50, P51, P73, P35, P45, P16, P55, P63, P31, P65, P41, P6, P19. Annotations: "C term" markers and group labels (groups 1–5).
Fig. 6. Molecular evolution of the Udp1 peroxidase. The phylogenetic relationships of Urtica dioica, HRP C and Arabidopsis thaliana peroxidases were based on amino acid sequence alignments of the mature proteins. The evolutionarily most conserved and closest relatives of the Udp1 peroxidase, P62, P69, P70, P71 and P25 (group 2), are shown by the grey background. The Udp1 peroxidase, the nettle peroxidase peptide fragments A/C and B/C and the HRP C peroxidase are boxed. The presence of the C-terminal extension tails is also indicated. The group numbering (groups 1–5) at the right side of the figure refers to the classification proposed by Tognolli et al. (2002). The protein accession numbers for the retrieval of these sequences from the Swiss-Prot database are: HRP C P00433, P34 Q9SMU8, P69 Q96511, P20 Q9SLH7, P51 Q9SZE7, P15 Q9SI16, P32 Q9LHB9, P71 Q43387, P17 Q9SJZ2, P73 Q43873, P9 Q96512, P37 Q9LDN9, P62 Q9FKA4, P10 Q9FX85, P35 Q96510, P24 Q9ZV04, P38 Q9LDA4, P25 O80822, P11 Q96519, P45 Q96522, P50 Q43731, P22 P24102, P13 O49293, P40 O23474, AtP22 CAA70034, P65 Q9FJR1, P23 O80912, P43 Q9SZH2, P12 Q96520, P55 Q96509, P31 Q9LHA7, P54 Q9FG34, P57 Q43729, P64 Q43872, P63 Y11791, P6 O48677, P53 Q42578, P28 Q9SS67, P66 Q9LT91, P33 P24101, P56 Q9LXG3, P58 P59120, P8 Q9LNL0, P47 Q9SZB9, P16 Q96518, P14 Q9SI17, P59 Q39034, P44 Q93V93, P48 O81755, P41 O23609, P27 Q43735, P67 Q9LVL2, P60 Q9FMR0, P70 Q9FMI7, P49 O23237, P19 O22959, P68 Q9LVL1, P26 O22862, P18 Q9SK52, P29 Q9LSP0, P39 Q9SUT2, P52 Q9FLC0, P61 Q9FLV5, P46 O81772, P1/2 Q96506, P4 Q9LE15, P30 Q9LSY7, P42 Q9SB81, P36 Q9SD46, P5 Q9M9Q9, P3 O23044, P21 Q42580, P72 Q9FJZ9.
representative members of group 4. These data strongly suggest the existence of at least two distinct groups of nettle peroxidases. Comparison of the amino acid sequence of Udp1 to Arabidopsis thaliana peroxidases reveals that the Udp1 protein shares the highest degree of identity with peroxidases P71 (57%), P62 (52%) and P69 (50%). Interestingly, the exon/intron structural organization of the Udp1 genomic locus appears to be identical to the structure of these peroxidase genes. In the majority of peroxidase genes the coding sequence is interrupted by three introns at highly conserved positions, suggesting the presence of a common ancestor with four exons and three introns. Variations of this basic pattern are observed in approximately one third of the Arabidopsis thaliana peroxidase gene family members (Tognolli et al., 2002; Welinder et al., 2002). The four genes encoding the P62, P69, P70 and P71 peroxidases appear to have lost the first intron. This group of Arabidopsis thaliana peroxidases seems to be phylogenetically close to Udp1, as shown in Fig. 6. These observations strongly indicate that the loss of the first intron occurred before the divergence of Eurosids. Urtica dioica belongs to Eurosids I, Rosales, while Arabidopsis belongs to Eurosids II, Brassicales. However, the Udp1 peroxidase contains a short C-terminal extension tail, which is absent from group 2 Arabidopsis peroxidases and present only in group 4 protein members (Tognolli et al., 2002). Eight closely related Arabidopsis peroxidase family members from group 4, all characterized by C-terminal extensions, have preserved the classical four exons/three introns structural pattern. On the other hand, two more distantly related genes, P12 and P17, that contain a C-terminal extension, appear to have lost the third intron. The Udp1 peroxidase is characterized by a unique C-terminal extension tail of six amino acids, which could potentially function as a vacuolar targeting signal. 
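The similarity between the short Udp1 tail and the C-terminal ends of the HRP C and French bean FBP1 propeptides, discussed in this section, can be made concrete by counting identical residues at the extreme C-terminus. The sketch below uses an ungapped, end-anchored comparison chosen purely for illustration; it is an assumption and not the alignment procedure used by the authors. The three tail sequences are those quoted in the Discussion.

```python
# C-terminal tails as quoted in the Discussion.
TAILS = {
    "Udp1": "SLVSSY",
    "HRP C": "LLHDMVEVVDFVSSM",
    "FBP1": "AGLATLATKESSEDGLVSSI",
}


def terminal_identity(query, other):
    """Count identical residues with both sequences right-aligned, no gaps.

    Returns (matches, compared_length), where compared_length is the
    length of the shorter sequence.
    """
    n = min(len(query), len(other))
    query_end, other_end = query[-n:], other[-n:]
    matches = sum(a == b for a, b in zip(query_end, other_end))
    return matches, n


if __name__ == "__main__":
    for name in ("HRP C", "FBP1"):
        matches, n = terminal_identity(TAILS["Udp1"], TAILS[name])
        print(f"Udp1 vs {name}: {matches}/{n} identical C-terminal residues")
```

With this simple measure, the Udp1 tail shares the V-S-S core with HRP C (3 of 6 positions) and the L-V-S-S core with FBP1 (4 of 6 positions).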
Although no C-terminal consensus sequence regulating vacuolar targeting has been identified so far, short hydrophobic regions followed by an acidic amino acid residue, such as the LVAE sequence detected in the short (15 amino acids) C-terminal tail of barley lectin, can each be sufficient for vacuolar targeting (Bednarek and Raikhel, 1992). At least one peroxidase derived from a gene encoding a carboxy-terminal propeptide has been localized to the vacuoles (Theilade et al., 1993). By comparing the putative amino acid sequence deduced from the cDNA with the one obtained by amino acid sequence analysis of the mature protein, it has been shown that barley seed BP1 (Rasmussen et al., 1991) and HRP C (Fujiyama et al., 1988) peroxidases are both processed at their C-termini. Even though the proposed C-terminal extension tail of Udp1 peroxidase, SLVSSY, is unusually short compared to the propeptides found in other plant peroxidase sequences, it shows significant similarities to the last residues of the C-terminal tails of HRP C, LLHDMVEVVDFVSSM, and French bean peroxidase FBP1, AGLATLATKESSEDGLVSSI (Blee et al., 2001). Differences in the glycosylation patterns and surface charges among the peroxidase protein family members could be essentially involved in the determination of subcellular
localization and substrate specificity. In Udp1, the five putative glycosylation sites appear to be positioned in protruding surface loops or turns and are therefore considered likely to be amenable to glycosylation in vivo. Since glycans are large, the one located close to the substrate binding proline (Pro 137 of the putative Udp1 mature polypeptide) is likely to affect substrate access and reaction dynamics due to a dampening of backbone motion (Nielsen et al., 2001). In the case of the cationic peanut peroxidase, site-directed removal of each of the three N-linked complex glycans revealed that the N-60 and N-144 glycans influence the peroxidase catalytic activity, whereas the N-185 glycan is important for the thermostability of the enzyme (Lige et al., 2001). Interestingly, 46 out of the 73 Arabidopsis thaliana peroxidases contain a putative N-glycosylation site at a position corresponding to the site N-186 of HRP C. The uneven distribution of charges on the surfaces of Udp1, P62, P69 and cotton peroxidases, characterized by clusters of positive surface charges at distinct remote areas away from the substrate entry channel, might play an essential role in binding of the enzymes to the sites of their biological activity through electrostatic interactions. In the case of APRX, an anionic peroxidase from Cucurbita pepo (zucchini), the homology modelling approach and generation of site-directed glycoprotein mutants resulted in the identification of a motif of four clustered arginines (Arg 117, 262, 268 and 271) as being responsible for the specific binding of the enzyme to the negatively charged Ca2+-pectate (Carpin et al., 2001). This group of arginines is located at the level of helix J in a remote area away from the entry of the heme cavity. Interestingly, at the level of helix J, the Udp1 peroxidase contains a distinct cluster of four lysine residues (Lys 114, 271, 276 and 287), as illustrated in Fig. 5D. 
In addition, a cluster of aligned lysine residues (Lys 13, 20, 23, 24 and 27) is also observed at the level of helix A. A similar motif of aligned positively charged surface residues at the level of helix J is observed in the homology model of the Arabidopsis thaliana peroxidase P25. P25 belongs to group 2 and is classified close to Udp1 in the cladogram presenting the phylogenetic relationships of the Udp1 peroxidase to the Arabidopsis thaliana peroxidase family members (Fig. 6). The homology modelling approach also indicates that the cotton bacterial-induced peroxidase contains three evenly distributed arginines (Arg 9, 13 and 27) at the level of helix A and one more motif of three cationic residues (Lys 284, Arg 263 and Arg 273) located at the level of helix J (Fig. 5D). It should be noted that in the case of the APRX peroxidase the glycosylation does not appear to affect the contribution of the charged surface amino acids to the specificity of the binding (Carpin et al., 2001). Although we cannot exclude a potential effect of glycosylation on the overall charge of the Udp1 enzyme, we suggest that the charged surface amino acids of Udp1 may play a similar role in substrate recognition and binding (i.e., to cell wall components). As demonstrated in Fig. 3, the Udp1 peroxidase gene is specifically expressed in the roots. High expression
levels of peroxidases in the roots are observed in Arabidopsis thaliana (Welinder et al., 2002) and rice as well (Hiraga et al., 2000). It is very likely that different peroxidase isoenzymes play an essential role in the normal development and physiology of the plant roots. It is reasonable to speculate that the root-specific transcriptional regulation of the Udp1 gene could be mediated by distinct transcription factors selectively activated in the plant roots. The activity of peroxidase genes has been previously shown to be induced by a variety of environmental stimuli, such as wounding (Mohan et al., 1993), ethylene (Mogens et al., 1990), pathogen infection (Rebmann et al., 1991; Reimers et al., 1992) and UV radiation (Ito et al., 2000). Plant peroxidases can function in cell walls, while they are also presumed to be involved in extensin and proline-rich protein cross-linking (Bradley et al., 1992), lignification, suberization, disease resistance and wound-healing. However, the physiological role of each individual isoenzyme is only partially understood and is highly complicated by the presence of multiple peroxidase isoenzymes. A number of different peroxidases are known to be produced upon wounding via de novo protein synthesis. The expression of a tomato peroxidase gene (TPX1), which is constitutively expressed only in roots, is induced in stems and leaves as a result of mechanical wounding (Botella et al., 1994). The Udp1 gene also shows a similar ectopic expression in leaves after wounding or UV radiation treatment, as it is illustrated in Fig. 4. The molecular mechanisms involved in these regulatory transcriptional responses have not been deciphered yet. UV radiation is a minor component of the solar spectrum, yet it has the potential to affect metabolic processes in humans, animals, plants and microorganisms. 
In plants, UV radiation can potentially interfere with growth, development, photosynthesis, flowering, pollination and transpiration (Jansen et al., 1998). UV radiation provokes alterations in peroxidase activity in a large number of plant species. Physiological evidence indicates a role for phenol-oxidizing class III peroxidases in UV protection, through a change in leaf phenolic content and/or composition (Jansen et al., 2001). We believe that the Udp1 peroxidase activity is likely involved in a protection mechanism of nettle against mechanical and UV-mediated stress. In conclusion, the study of the gene structure and regulation of stress (wounding and UV radiation)-induced peroxidases is essential for understanding the biological functions involved in plant defence responses. The Udp1 nettle peroxidase is constitutively and selectively expressed only in the roots, probably playing an important role in normal plant physiology and development. Moreover, the ectopic expression of the Udp1 gene in plant leaves after wounding or UV radiation stress could be directly involved in a variety of protective, or adaptive, biological mechanisms controlling distinct cellular functions in different environments. The homology modelling approach indicates the presence of unique structural elements, likely regulating the substrate binding specificity and recognition, as well as the intracellular localization and distribution of the Udp1
peroxidase. More specifically, the clusters of conserved lysine residues are likely to determine essential functional properties of the Udp1 peroxidase, while the intracellular sorting of the Udp1 protein to the vacuoles is probably regulated by its short carboxyl-terminal extension tail. It is worth mentioning that the C-terminal extension tail, SLVSSY, constitutes a novel and unique putative structural element of the Udp1 peroxidase. The structural features that the Udp1 protein shares with other plant peroxidases could reasonably reflect their similar physiological roles throughout plant life. It is interesting to note that the comparison of the Udp1 protein to the peroxidase family members of the model plant Arabidopsis thaliana has revealed significant amino acid sequence similarities with a small group of four peroxidases (P62, P69, P70 and P71), all characterized by the same exon/intron structure as the Udp1 gene. Additionally, like the Udp1 peroxidase gene, P62 and P71 (the evolutionarily closest family members to Udp1) are also known to be induced by mechanical wounding. Moreover, homology modelling analysis of the Udp1, P62, P69 and P71 peroxidases has shown a similar uneven, clustered distribution of surface charges (Fig. 5). Finally, all the above observations further support the notion of a common ancestor peroxidase gene, which after multiple events of divergence throughout evolution resulted in the generation of all the known peroxidase family members, including Udp1.
Acknowledgements
This work was supported by a TMR grant (No. ERBFMRXCT 980200) and the Special Account for the Research Grants of Athens University to Professor L. H. Margaritis.
References
Allison, S.D., Schultz, J.C., 2004. Differential activity of peroxidase isozymes in response to wounding, gypsy moth and plant hormones in northern red oak (Quercus rubra L.). J. Chem. Ecol. 30, 1363–1379. Bednarek, S.Y., Raikhel, N.V., 1992. 
The barley lectin carboxyl-terminal propeptide is a vacuolar sorting determinant in plants. Plant Cell 3, 1195–1206. Blee, K.A., Jupe, S.C., Richard, G.S., Zimmerlin, A., Davies, D.R., Bolwell, G.P., 2001. Molecular identification and expression of the peroxidase responsible for the oxidative burst in French bean (Phaseolus vulgaris L.) and related members of the gene family. Plant Mol. Biol. 47, 607–620. Botella, M.A., Quesada, M.A., Medina, M.I., Pliego, F., Valpuesta, V., 1994. Induction of a tomato peroxidase gene in vascular tissue. FEBS Lett. 347, 195–198. Bouhin, H., Charles, J.P., Quennedey, B., Delachambre, J., 1992. Developmental profiles of epidermal mRNAs during the pupal-adult molt of Tenebrio molitor and isolation of a cDNA clone encoding an adult cuticular protein: effects of a juvenile hormone analogue. Dev. Biol. 149, 112–122. Bradley, D.J., Kjellbom, P.P., Lamb, C.J., 1992. Elicitor- and wound-induced oxidative cross-linking of a proline-rich plant cell wall protein: a novel, rapid defence response. Cell 70, 21–30. Carpin, S., Crèvecoeur, M., Meyer, M., Simon, P., Greppin, H., Penel, C., 2001. Identification of a Ca2+-pectate binding site on an apoplastic peroxidase. Plant Cell 13, 511–520.
Fujiyama, K., et al., 1988. Structure of the horseradish peroxidase isozyme C genes. Eur. J. Biochem. 173, 681–687. Fujiyama, K., Takemura, H., Shinmyo, A., Okada, H., Takano, M., 1990. Genomic DNA structure of two new horseradish peroxidase encoding genes. Gene 89, 163–169. Graber, J.H., Cantor, C.R., Mohr, S.C., Smith, T.F., 1999. In silico detection of control signals: mRNA 3′-end processing sequences in diverse species. Proc. Natl. Acad. Sci. U. S. A. 96, 14055–14060. Guex, N., Peitsch, M.C., 1997. Swiss-Model and the Swiss-PdbViewer: an environment for comparative protein modelling. Electrophoresis 18, 2714–2723. Hertig, C., Rebmann, G., Bull, J., Mauch, F., Dudler, R., 1991. Sequence and tissue-specific expression of a putative peroxidase gene from wheat (Triticum aestivum L.). Plant Mol. Biol. 16, 171–174. Hiraga, S., et al., 2000. Diverse expression profiles of 21 rice peroxidase genes. FEBS Lett. 471, 245–250. Ito, H., et al., 2000. Xylem-specific expression of wound-inducible rice peroxidase genes in transgenic plants. Plant Sci. 155, 85–100. Jacobs-Lorena, M., 1980. Dosage of 5S and ribosomal genes during oogenesis of Drosophila melanogaster. Dev. Biol. 80, 134–145. Jansen, M.A.K., Gaba, V., Greenberg, B.M., 1998. Higher plants and UV-B radiation: balancing damage, repair and acclimation. Trends Plant Sci. 3, 131–135. Jansen, M.A.K., van den Noort, R.E., Adilla, Tan M.Y., Prinsen, E., Lagrimini, L.M., Thorneley, R.N.F., 2001. Phenol-oxidizing peroxidases contribute to the protection of plants from ultraviolet radiation stress. Plant Physiol. 126, 1012–1023. Kozak, M., 1981. Possible role of flanking nucleotides in recognition of the AUG initiator codon by eukaryotic ribosomes. Nucleic Acids Res. 9, 5233–5262. Lavid, N., Schwartz, A., Yarden, O., Tel-Or, E., 2001. The involvement of polyphenols and peroxidase activities in heavy-metal accumulation by epidermal glands of the waterlily (Nymphaeaceae). 
Planta 212, 323–331. Lewis, N.G., Yamamoto, E., 1990. Lignin: occurrence, biogenesis and biodegradation. Annu. Rev. Plant Physiol. Plant Mol. Biol. 41, 455–496. Lige, B., Ma, S., van Huystee, R.B., 2001. The effects of the site-directed removal of N-glycosylation from cationic peanut peroxidase on its function. Arch. Biochem. Biophys. 386, 17–24. Martinez, C., et al., 1998. Apoplastic peroxidase generates superoxide anions in cells of cotton cotyledons undergoing the hypersensitive reaction to Xanthomonas campestris pv. malvacearum race 18. Mol. Plant-Microb. Interact. 11, 1038–1047. Moerschbacher, B.M., 1992. In: Penel, C., Gaspar, T., Greppin, H. (Eds.), Plant Peroxidases 1980–1990: Topics and Detailed Literature on Molecular, Biochemical, and Physiological Aspects. University of Geneva, Geneva, pp. 91–99. Mogens, P.H., Callahan, A.M., Dunn, L.J., Abeles, F.B., 1990. Isolation and sequencing of cDNA clones encoding ethylene-induced putative peroxidases from cucumber cotyledons. Plant Mol. Biol. 14, 715–725.
Mohan, R., Bajar, A.M., Kolattukudy, P.E., 1993. Induction of a tomato anionic peroxidase gene (tap1) by wounding in transgenic tobacco and activation of tap1:GUS and tap2:GUS chimeric gene fusions in transgenic tobacco by wounding and pathogen attack. Plant Mol. Biol. 21, 341–354. Nielsen, K.L., et al., 2001. Differential activity and structure of highly similar peroxidases. Spectroscopic, crystallographic and enzymatic analyses of lignifying Arabidopsis thaliana peroxidase A2 and horseradish peroxidase A2. Biochemistry 40, 11013–11021. Oliver, F., et al., 1991. Contact urticaria due to the common stinging nettle (Urtica dioica)-histological, ultrastructural and pharmacological studies. Clin. Exp. Dermatol. 16, 1–7. Ostergaard, L., Pedersen, A.G., Jespersen, B.M., Brunak, S., Welinder, K.G., 1998. Computational analyses and annotations of the Arabidopsis peroxidase gene family. FEBS Lett. 433, 98–102. Passardi, F., Cosio, C., Penel, C., Dunand, C., 2005. Peroxidases have more functions than a Swiss army knife. Plant Cell Rep. 24, 255–265. Rasmussen, S.K., Welinder, K.G., Hejgaard, J., 1991. cDNA cloning, characterization and expression of an endosperm-specific barley peroxidase. Plant Mol. Biol. 16, 317–327. Rebmann, G., Hertig, C., Bull, J., Mauch, F., Dudler, R., 1991. Cloning and sequencing of cDNAs encoding a pathogen-induced putative peroxidase of wheat (Triticum aestivum L.). Plant Mol. Biol. 16, 329–331. Reimers, P.J., Guo, A., Leach, J.E., 1992. Increased activity of a cationic peroxidase associated with an incompatible interaction between Xanthomonas oryzae pv. oryzae and rice (Oryza sativa). Plant Physiol. 99, 1044–1050. Roberts, E., Kolattukudy, P.E., 1989. Molecular cloning, nucleotide sequence and abscisic acid induction of a suberization-associated highly anionic peroxidase. Mol. Gen. Genet. 217, 223–232. Sambrook, J., Fritsch, E.F., Maniatis, T., 1989. Molecular Cloning: A Laboratory Manual, 2nd ed. 
Cold Spring Harbor Laboratory, Cold Spring Harbor, NY. Sanger, F., Nicklen, S., Coulson, A.R., 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. U. S. A. 74, 5463–5467. Scialabba, A., Bellani, L.M., Dell'Aquila, A., 2002. Effects of ageing on peroxidase activity and localization in radish (Raphanus sativus L.) seeds. Eur. J. Histochem. 46, 351–358. Theilade, B., Rasmussen, S.K., 1992. Structure and chromosomal localization of the gene encoding barley seed peroxidase BP 2A. Gene 118, 261–266. Theilade, B., et al., 1993. Subcellular localization of barley grain peroxidase BP 2 by immunoelectron microscopy. In: Welinder, K.G., Rasmussen, S.K., Penel, C., Greppin, H. (Eds.), Plant Peroxidases: Biochemistry and Physiology (3rd International Symposium Proceedings). Université de Genève, Genève, pp. 321–324. Tognolli, M., Penel, C., Greppin, H., Simon, P., 2002. Analysis and expression of the class III peroxidase large gene family in Arabidopsis thaliana. Gene 288, 129–138. von Heijne, G., 1990. The signal peptide. J. Membr. Biol. 115, 195–201. Welinder, K.G., et al., 2002. Structural diversity and transcription of class III peroxidases from Arabidopsis thaliana. Eur. J. Biochem. 269, 6063–6081.