Plant proteins containing the RNA-recognition motif


Post-transcriptional regulation of gene expression is mediated by the interaction of protein factors with specific RNA sequences. In recent years, an increasing number of plant proteins that contain the principal RNA-binding domain, the RNA-recognition motif (RRM), have been identified. Many of these proteins can be classified into functional groups involved in different aspects of RNA metabolism. Each protein family has a characteristic domain structure, with one or more copies of the RRM and a variety of auxiliary domains. The most variable regions of the RRM of plant RNA-binding proteins probably contain determinants of target specificity, as has been shown for equivalent non-plant proteins. Thus, characterization of the RRM sequence of different plant RNA-binding proteins is likely to provide information about functional and/or evolutionary relationships.

Post-transcriptional metabolism of RNA involves both housekeeping and regulatory mechanisms. These processes require the interaction of RNA-binding proteins with specific RNA sequences. Nuclear post-transcriptional events include pre-mRNA capping, splicing and polyadenylation. Once mRNAs have been processed, they are transported to the cytoplasm, where the translation machinery is located. In this compartment, the translation rate of particular mRNAs may be controlled by specific protein factors in a tissue- or development-specific manner. In plants, chloroplast gene expression is also known to be subject to post-transcriptional regulation involving pre-RNA processing and stability.

Recently, much progress has been made in the identification of RNA-binding proteins and protein-RNA complexes or ribonucleoproteins (RNPs). This has led to the discovery of several conserved protein RNA-binding motifs with a predictive value, such as the RNA-recognition motif (RRM), the arginine-rich motif, the RGG box, the hnRNP K homology motif, the Zn-finger motif and the double-stranded-RNA-binding motif. The RRM [also known as the consensus-sequence-type RNA-binding domain (CS-RBD) or RNP motif] is the most widely found and best characterized RNA-binding motif. It comprises 80-90 amino acids and is found in one or more copies in proteins involved in post-transcriptional processes, which bind pre-mRNA (or heterogeneous nuclear RNA), mRNA, pre-rRNA, small nuclear RNAs or chloroplast RNAs. It appears to be an ancient structure, and is present in animals, fungi, plants and cyanobacteria. The most conserved regions are two short sequences, RNP1 (eight amino acids) and RNP2 (six amino acids), as well as other amino acids interspersed throughout the motif.

Although many proteins with the RRM motif have been characterized in vertebrates and yeast, much less is known about plant RNA-binding proteins, and in only a few cases has their function been studied. As a consequence, understanding of the molecular mechanisms that underlie plant post-transcriptional regulation has lagged behind that of other systems.

Fig. 1. Schematic representation of plant RNA-binding proteins that contain one or more copies of the RNA-recognition motif (RRM). Types of auxiliary domains are indicated at the bottom of the figure, unless specifically labelled. The CTC motif (in PABP2) is a common region of the poly(A)-binding protein (PABP) family. Zn (in RZ1) indicates a zinc-finger motif consensus sequence. RPS19mit (in ARPS19) indicates a region of Arabidopsis RPS19 with homology to other plant mitochondrial RPS19 proteins. Note that the intermediate domain is unique to the SR1 and RSp31 group of proteins.

Copyright © 1998 Elsevier Science Ltd. All rights reserved. 1360-1385/98/$19.00 PII: S1360-1385(97)01151-5. January 1998, Vol. 3, No. 1.

Plant RNA-binding proteins

RNA-binding proteins have a modular structure similar to that of transcription factors. They typically contain one or more RRMs and a variety of auxiliary motifs, such as glycine/arginine-rich, acidic or SR-repeats. The particular arrangement of these domains serves to define different protein families. For example, chloroplast RNA-binding proteins comprise an acidic region at the N-terminus and two repeats of the RRM. In contrast to the RRM, functional roles
for the auxiliary domains have only been established in some cases, and may include binding to nucleic acids and protein factors (Table 1). Different strategies have been followed to identify plant proteins involved in RNA metabolism, including PCR using partially degenerate oligonucleotides specific for conserved regions of the RRM, and biochemical approaches, such as single-stranded DNA affinity columns.

Table 1
Plant RNA-binding protein or protein family | Possible functions | Evidence in support of function | Refs
Poly(A)-binding proteins | Control of mRNA turnover and translation efficiency (binding to the poly(A) tail of mRNAs). | Sequence homology (a); PAB5 has a strong affinity for poly(A) in vitro. | 9, 10
Splicing factors | Splicing of pre-mRNAs; spliceosome assembly. | snRNP U2B" and U1A proteins bind to U2 snRNA and U1 snRNA, respectively; sequence homology (a). | 16, 17
Splicing factors | 5'-splice-site selection; alternative splicing. | atRSp31 complements a splicing-deficient HeLa cell S100 extract. SR1 influences 5'-splice-site selection in an in vitro mammalian splicing assay; sequence homology (a). | 18, 19
Chloroplast proteins | Chloroplast 3'-end processing. | Spinach 28RNP is required for correct chloroplast mRNA 3'-end formation and stabilization. | 35, 41
Glycine-rich proteins | Unknown. | Nucleolar localization of maize MA16 and GRP1 suggests a role in rRNA metabolism. | 13, 14
Nucleolin | rDNA transcription, pre-rRNA processing and ribosomal assembly. | Sequence homology (a). | 11
FCA | Control of flowering time. | Mutants have a late-flowering phenotype. | 12
RPS19 | Mitochondrial ribosomal protein; interaction with mitochondrial rRNA. | Mitochondrial ribosomal protein with an RNA-recognition motif (RRM) extension. | 15
RZ-1 | Unknown. | Exists as a 60S RNP complex in the nucleoplasm of tobacco cells, suggesting a role in pre-mRNA processing and/or mRNA nucleocytoplasmic transport. | 32
(a) Refers to high homology to known vertebrate or yeast proteins.

A representation of the structure of different types of plant RNA-binding proteins is shown in Fig. 1. Several of these proteins are highly homologous to well-characterized mammalian and/or yeast proteins, and thus their function is very likely to be related. These include splicing factors U1A, U2B", SR1 and RSp31; poly(A)-binding proteins (PABPs); and plant nucleolin. The PABPs are known to bind the poly(A) tail of eukaryotic mRNAs, and this association plays a role in the control of mRNA turnover and translation efficiency. In Arabidopsis, a PABP-encoding gene family has been identified, comprising PAB1, PAB3, PAB5 and PAB2 (Refs 9, 10). The genes are expressed in an organ-specific manner; for example, PAB3 and PAB5 are predominantly expressed in immature flowers, and therefore it has been suggested that plant PABPs may be involved in organ- and/or stage-specific, post-transcriptional processes.

A protein with homology to nucleolin has been described in alfalfa (Ref. 11). This protein contains two RRM motifs, as in the related yeast NSR1 protein, but in contrast to vertebrate nucleolins, which have four copies of the domain. In animals, nucleolin is considered to play a key role in the regulation of rDNA transcription, pre-rRNA processing and ribosomal assembly. In alfalfa, the protein is also predominantly found in the nucleolus and, as expected for a protein required for rRNA metabolism, its expression is highly induced upon proliferation.

Recently, an Arabidopsis protein, FCA, containing two RRMs, has been shown to function in the control of flowering time (Ref. 12). The protein was identified as the wild-type gene product of fca-1, which results in a late-flowering phenotype. FCA also contains a WW protein interaction domain (Fig. 1), which has been found in a diverse group of proteins involved in cell signalling or regulation. The RRMs have the strongest similarity to RRMs present in a subfamily of proteins encoded by ELAV-like genes, which include vertebrate neural-specific and Drosophila developmental genes.

Proteins that contain a single RRM and a glycine-rich region at the C-terminus (glycine-rich RNA-binding proteins) were first described in plants, although recently this protein family has been extended to include representatives from mammals and cyanobacteria. Their function is still unknown, although the finding that maize and tobacco proteins accumulate in the nucleolus suggests a role in pre-rRNA processing (Refs 13, 14). Proteins GRP2 and RZ-1 from tobacco are structurally related to the glycine-rich protein family, but contain extra auxiliary domains (Fig. 1).

As already mentioned, a characteristic type of nuclear-encoded, RNA-binding protein is present in the chloroplasts of higher plants, with a unique structure formed by an acidic region at the N-terminus and two repeats of the RRM motif (Fig. 1). These types of protein appear to be involved in post-transcriptional regulation of chloroplast gene expression.

In Arabidopsis, the mitochondrial ribosome protein RPS19 is nuclear-encoded, and has an N-terminal extension that includes an RRM (Ref. 15). The RPS19 protein is thought to originate from the fusion of a genomic RRM-encoding sequence and a nucleus-transferred mitochondrial rps19 gene (Fig. 1). Because the RPS13 protein, which normally associates with RPS19 to form an RNA-binding heterodimer, is not present in mitochondria in Arabidopsis, it has been proposed that during evolution a nuclear-encoded RRM functionally replaced RPS13, in what is an interesting example of the potential flexibility and versatility of the RRM.

Splicing factors

Splicing of pre-mRNAs, which removes introns, is mediated by various protein factors and small nuclear (sn) RNPs. Understanding of the splicing mechanism in plants is still limited, because there is no in vitro assay for splicing. It is, however, generally assumed that the mechanism is similar to that operating in yeast and mammals, because plants contain all five major spliceosomal snRNAs, U1, U2, U4/U6 and U5, and several eukaryotic-conserved splicing
factors. However, significant differences have been reported between the cis-acting elements needed for efficient intron recognition in plants and those needed in mammalian systems. For example, efficient pre-mRNA splicing in plants requires intron sequences with an elevated AU content, and plant introns lack the characteristic pyrimidine tract found near 3'-splice sites in mammals. Moreover, in general, plants cannot properly recognize introns of animal origin (Ref. 2). These differences in intron recognition are indicative of the presence of plant-specific splicing factors.

The sequence of only a very small number of plant splicing factors has thus far been determined. These include the snRNP components U1A, isolated from potato and Arabidopsis, and U2B", found in potato. The structure of these proteins is very similar and consists of two RRMs separated by a lysine-rich region (Fig. 1). The U2B" protein from potato has been demonstrated to be a functional counterpart of human U2B": it is able to bind specifically to U2 snRNA stem-loop IV (from either vertebrates or plants), and this binding is significantly enhanced by the presence of human U2A' (Ref. 17). The specific recognition of U1 snRNA by plant U1A protein also shows functional relatedness to the vertebrate protein. Human U1A autoregulates its expression by binding to the 3'-untranslated region of its own pre-mRNA, which has an inhibitory effect on pre-mRNA polyadenylation. In contrast, plant U1A proteins do not bind to their own mRNA, and the possibility that they regulate their expression by the same mechanism has thus been ruled out.

Members of the RS protein superfamily constitute a different type of splicing factor that contains arginine/serine-rich regions. This group of proteins (mainly characterized in mammals and Drosophila) is implicated in accurate splice-site recognition and alternative splicing. In Arabidopsis, four different RS proteins have recently been identified. Three of them, atRSp31, atRSp35 and atRSp40, are highly homologous (Ref. 18). Analysis of their structure revealed interesting features: an N-terminal RRM highly homologous to that of SR proteins; a divergent intermediate region; and an unusual repeat in the RS-rich domain (Fig. 1), suggesting that this class of RS-rich splicing factors may be plant specific. The involvement of atRSp31 in splicing was demonstrated by its capacity to complement a splicing-deficient HeLa cell S100 extract.

Protein SR1 from Arabidopsis is also a member of the SR-protein family. This protein is highly homologous to the extensively characterized mammalian splicing factor SF2/ASF, except for the presence in the C-terminus of a Pro-Ser-Lys-rich domain, reminiscent of sequences found in histones (Fig. 1). Similar to SF2/ASF, SR1 is able to influence 5'-splice-site selection in an in vitro splicing assay for mammalian sequences.

The recent discovery that plant cells contain SR proteins may have important implications for the understanding of the mechanisms of splice-site recognition. It is generally believed that the splice-site selection patterns in plants are defined primarily by sequences within the intron (intron definition; Ref. 2), in contrast to the exon-definition model generally accepted for mammals. The SR proteins are thought to play a fundamental role in exon definition, by mediating the interaction of complexes assembled at a 3'-splice site and a downstream 5'-splice site. Thus, if plant SR proteins turn out to function in the same way as mammalian SR proteins, exon definition may soon be regarded as a more important mechanism for splice-site selection in plants.
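The AU-richness criterion for plant introns described in this section can be illustrated with a short calculation. The following Python sketch is only an illustration: the function name `au_fraction` and both sequences are hypothetical and are not taken from the work reviewed here, and no particular threshold for "elevated" AU content is implied.

```python
def au_fraction(seq: str) -> float:
    """Return the fraction of A and U residues in a nucleotide sequence.

    DNA input is accepted: T is treated as U.
    """
    seq = seq.upper().replace("T", "U")
    if not seq:
        raise ValueError("empty sequence")
    return sum(base in "AU" for base in seq) / len(seq)


# Hypothetical toy sequences (not real plant genes), for illustration only.
toy_exon = "GGCGCCAUGGCGACC"
toy_intron = "GUAAGUUUUAUAUUUAAUUAG"

if __name__ == "__main__":
    print(f"exon   AU fraction: {au_fraction(toy_exon):.2f}")
    print(f"intron AU fraction: {au_fraction(toy_intron):.2f}")
```

Comparing the value for a candidate intron with that of its flanking exons gives a rough indication of whether the AU-content difference associated with intron definition is present.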

Glycine-rich, RNA-binding proteins

Plant glycine-rich, RNA-binding proteins are small, approximately 16-17 kDa, and consist of two very diverse regions. The RRM occupies the N-terminal half of the protein and shows a high degree of homology among all proteins (60-80% identity). The C-terminal region is extremely rich in glycine residues (about 70%) and has a significant arginine content (about 10-15%). The glycine-rich region contains repeats of the RGG-box, which has been defined as an RNA-binding motif in other proteins. Since the first member of this family was identified in maize, cDNAs encoding homologous proteins have been found in various other plant species, including tobacco, Arabidopsis and barley. In animals, two highly homologous proteins have been identified, RBM3 in humans and CIRP in mice and humans; the function of these proteins is also unknown. This situation is exceptional, because the characterization of animal (and/or yeast) structurally conserved, RNA-binding protein families has normally preceded their identification in plants. This may be a consequence of the abundance of this type of protein in plant cells. Interestingly, glycine-rich, RNA-binding proteins have also been described in cyanobacteria, and this is the first bacterial RRM reported. Thus, this type of RNA-binding protein probably represents a very ancient structure, which appeared before the divergence of eukaryotes and prokaryotes.

The RNA-binding activity of plant glycine-rich proteins has been analysed by ribohomopolymer-binding assays. Proteins from maize, tobacco and barley show a high affinity for poly(G) and poly(U), indicating that cellular RNA ligands are likely to be enriched in G- and U-residues. In fact, the MA16 protein shows a high affinity for its own mRNA, which is very rich in G-residues.

In a number of studies, an increase in gene expression under stress conditions has been reported, such as during drought stress, wounding or cold. As a consequence, a role for glycine-rich proteins in the plant responses to changing environmental conditions has been hypothesized. However, at present the significance of these results is speculative, because no in vivo functional data yet exist. In addition, most of the studies were restricted to the quantification of mRNA concentrations, not protein concentrations, and there is no evidence that the concentrations of the protein increase under stress conditions. However, it is remarkable that genes expressing mouse CIRP protein and cyanobacterial glycine-rich proteins are also cold-inducible. This conserved pattern of expression supports the notion that glycine-rich proteins may represent a class of RNA-binding proteins involved in general molecular responses to low temperatures, and perhaps to other environmental stress conditions.

Localization studies also provide some clues to their possible role in RNA post-transcriptional metabolism. Both MA16 from maize and GRP1 from tobacco accumulate in the nucleolus, which suggests that they may participate in pre-ribosomal RNA processing events. In addition, in situ hybridization experiments have revealed a higher concentration of MA16 mRNA in different expanding tissues of maize seedlings. Taken together, these results seem to indicate that glycine-rich proteins could be involved in rRNA metabolism and growth, and that they could regulate or affect these processes during environmental stress.

Several proteins that share some of the characteristics of the previously described type of glycine-rich proteins have been identified in tobacco: GRP2 (Ref. 31), GRP3 (Ref. 14) and RZ-1 (Ref. 32). The common features are a highly similar RRM sequence (about 30-60% identity) and the presence of a glycine-rich region. The GRP2 and RZ-1 proteins contain additional domains (Fig. 1), and GRP3 has a much shorter glycine-rich region. It has been proposed that RZ-1, which exists as a large complex of approximately 60S in the nucleoplasm, could be involved in pre-mRNA processing and/or nucleo-cytoplasmic transport. The GRP3 protein is also predominantly found in the nucleoplasm, and therefore could also be involved in pre-mRNA metabolism.

Chloroplast RNA-binding proteins

A class of nuclear-encoded, RNA-binding proteins has been described in the chloroplasts of higher plants. In addition to the
transit peptide, which allows import into the chloroplast, these polypeptides are composed of two different regions, an acidic N-terminal domain and two RRMs (Fig. 1). Based on amino acid similarities in the RRM they have been classified into three groups. Chloroplast RNA-binding proteins have been identified in various plant species, including tobacco (Refs 7, 8), spinach, Arabidopsis and the halophyte Mesembryanthemum crystallinum. Ribohomopolymer-binding assays performed with tobacco RNA-binding proteins have shown that, like glycine-rich proteins, they preferentially bind to poly(G) and poly(U). Using deletion mutants, a higher binding affinity has been observed for proteins lacking the acidic domain, and for proteins that retain both RRMs instead of only one. However, studies with spinach 28RNP protein, while confirming that two RRMs bind better than one, show a positive effect of the acidic domain on RNA binding. A possible explanation for these discrepancies is the different target used: ribohomopolymers in the first case, and a chloroplast transcript, psbA, for the spinach protein. The phosphorylation state of chloroplast RNA-binding proteins may also be important for the modulation of their cellular activities, as phosphorylation of 28RNP has been demonstrated to change its affinity for RNA in vitro.

Fig. 2. Multiple alignment of plant RNA-recognition motifs (RRMs). The alignment was obtained using the PILEUP program (Wisconsin Genetics Computer Group), with final manual adjustments. Secondary structures (β-sheets and α-helices) were deduced from previously identified regions in a diversity of RNA-binding proteins. Black boxes indicate conserved residues in most of the sequences; shaded boxes indicate the conservation of similar residues in most of the sequences. The consensus sequences include the most conserved positions and RNP-1 and RNP-2 sequences. The numbers after the protein names refer to the position of multiple RRMs with respect to the N terminus. The amino acid sequences were obtained from the following references: pabp2, Ref. 10; pabp5, Ref. 9; acp31, Ref. 36; ncp31, Ref. 34; s28rnp, Ref. 33; at1 and at2, Ref. 22; ma16, Ref. 20; nuc, Ref. 11; au1a and pu1a, Ref. 16; pu2b, Ref. 17; sr1, Ref. 19; and rsp31, Ref. 18.

The stability of chloroplast mRNAs is greatly influenced by developmental and environmental conditions. During the development of the chloroplast from proplastids, there is a rapid accumulation of several mRNAs, caused by an increase in their half-lives. Similarly, an increase in steady-state levels of a number of transcripts is observed under drought-stress conditions in Mesembryanthemum crystallinum, and these cannot be solely accounted for by a higher transcription rate. As in animals and yeast, chloroplast mRNA stability depends on correct mRNA 3'-end formation and processing. A general characteristic of plant plastid mRNA is the presence of an inverted-repeat sequence, which is located in the 3'-untranslated region and can fold into a stem-loop structure. This sequence is capable of stabilizing upstream RNA regions in vitro and in vivo, and thus is important in preventing transcript degradation; however, this is insufficient to explain the developmental regulation of transcript stability.

Chloroplast RNA-binding proteins are thought to function in the post-transcriptional regulation of the expression of chloroplast genes, including mRNA stabilization. The main body of evidence is provided by work on the spinach 28RNP protein. This protein interacts with the 3'-untranslated region of plastid mRNAs, containing the inverted-repeat sequence, and its expression correlates with the accumulation of plastid mRNAs. The involvement of the protein in the control of transcript stability is demonstrated by depletion experiments. In the absence of 28RNP in chloroplast extracts, there is incorrect mRNA 3'-end formation, and this results in mRNA degradation. Recently, the machinery responsible for chloroplast mRNA 3'-end processing has been dissected. A high molecular weight enzyme complex, comprising a PNPase-like exoribonuclease, is required for proper cleavage and 3'-end formation, but it is insufficient for the formation of a stable RNA product. The interaction of RNA-binding proteins 28RNP and 24RNP (homologous to 28RNP) appears to be a prerequisite for avoiding RNA degradation. This interaction moderates the nucleolytic activity of the complex, regulating the accumulation of chloroplast mRNA.
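The inverted-repeat sequences in plastid 3'-untranslated regions mentioned above can, in principle, be located computationally. The Python sketch below is a minimal illustration under strong assumptions: it looks only for perfect Watson-Crick stems of a fixed length, ignores G-U wobble pairs and folding energetics, and the function names, parameter defaults and test sequence are all hypothetical rather than taken from the studies reviewed here.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    return rna.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(rna: str, stem: int = 6, min_loop: int = 3, max_loop: int = 10):
    """Find candidate hairpins: a stem-length segment followed, after a loop,
    by its exact reverse complement.

    Returns a list of (start_index, stem_length, loop_length) tuples.
    """
    rna = rna.upper().replace("T", "U")
    hits = []
    for i in range(len(rna) - 2 * stem - min_loop + 1):
        target = reverse_complement(rna[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop  # start of the putative 3' arm of the stem
            if j + stem > len(rna):
                break
            if rna[j:j + stem] == target:
                hits.append((i, stem, loop))
    return hits


# Hypothetical toy 3'-UTR fragment: a 6-nt stem (GGCUCC ... GGAGCC) around a UUCG loop.
toy_utr = "AA" + "GGCUCC" + "UUCG" + "GGAGCC" + "AA"

if __name__ == "__main__":
    print(find_inverted_repeats(toy_utr))
```

A realistic search would allow mismatches and wobble pairs and would rank candidates by predicted stability; the exact-match version is kept here only because it makes the idea of an inverted repeat explicit.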

Fig. 3. Unrooted tree constructed from the alignment displayed in Fig. 2. The branch lengths represent the evolutionary distances between sequences. The programs DISTANCES and GROWTREE from the Wisconsin Genetics Computer Group were used. Similarity between sequences was calculated as a ratio between matches and total positions. The tree was generated by the neighbour-joining method, which clusters the sequences in a pairwise fashion. Clusters labelled in the figure: poly(A)-binding proteins; chloroplast and glycine-rich proteins; splicing factors.
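The similarity measure named in the legend of Fig. 3 (matches divided by total positions) can be written out explicitly. The Python sketch below is an illustration only, not a reimplementation of the DISTANCES or GROWTREE programs: the conversion of similarity to distance as 1 minus identity and the treatment of gap positions as mismatches are assumptions made here, and the three aligned sequences are hypothetical.

```python
from itertools import combinations


def identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences carry the same residue.

    Both sequences must come from the same alignment (equal length).
    Assumption: positions containing a gap ('-') never count as a match.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return matches / len(a)


def distance_matrix(alignment: dict) -> dict:
    """Pairwise distances (assumed here to be 1 - identity), keyed by name pairs."""
    return {
        (n1, n2): 1.0 - identity(alignment[n1], alignment[n2])
        for n1, n2 in combinations(sorted(alignment), 2)
    }


# Hypothetical aligned fragments, for illustration only.
toy_alignment = {
    "rrm_a": "LFVGNLKGFG",
    "rrm_b": "LFVGNLRGYG",
    "rrm_c": "IYVG-LKGFA",
}

if __name__ == "__main__":
    for pair, dist in distance_matrix(toy_alignment).items():
        print(pair, round(dist, 2))
```

A matrix of this kind is the input that a neighbour-joining procedure would use to join the closest pair of sequences at each step.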

Structural comparison of plant RNA-recognition motifs

Although the RRM is poorly conserved at the primary sequence level, X-ray crystallographic and/or NMR studies on vertebrate snRNP U1A and hnRNP C proteins have shown that it has a characteristic fold, with the RNP1 and RNP2 consensus sequences located in the two central strands of a four-stranded antiparallel β-sheet packed against two α-helices. Contacts with the RNA would be established through the β-sheet, the loops and the domain termini. Highly conserved amino acids of RNP1 and RNP2, although crucial for RNA binding, probably do not distinguish between different RNA sequences. Major determinants of RNA-binding specificity would reside in the most variable regions and/or in amino acids outside the domain. The RRM could also serve as a site for protein-protein interactions, as has been demonstrated in a few cases. Many RNA-binding proteins contain several copies of the RRM, and these different copies normally show some degree of specialization: the RNA affinity and specificity of each of the RRM motifs in a protein is often highly variable. A comparison of 27 plant RRM motifs, from 14 different proteins, is shown in Fig. 2. The selected sequences are representative of different classes of plant RNA-binding proteins. The alignment shows α- and β-structures, based on previous studies. In addition to the RNP1 and RNP2 consensus sequences, there are other conserved residues, mostly hydrophobic, scattered along the sequence.
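Because RNP1 and RNP2 are short and well conserved, candidate RRMs are often located in a protein sequence by simple pattern matching. The Python sketch below illustrates the idea. The two patterns are an assumption: they follow the RNP1 and RNP2 consensus as commonly quoted in the general literature, not the consensus line of Fig. 2, and real RRMs frequently deviate from them. The test protein is hypothetical.

```python
import re

# Assumed consensus patterns (commonly quoted forms, not derived from Fig. 2).
RNP1 = re.compile(r"[KR]G[FY][GA]FV.[FY]")    # eight residues
RNP2 = re.compile(r"[LI][FY][VI][GK][NG][LM]")  # six residues


def find_rnp_motifs(protein: str) -> dict:
    """Return the start positions (0-based) and matched text of RNP2 and RNP1 hits."""
    protein = protein.upper()
    return {
        "RNP2": [(m.start(), m.group()) for m in RNP2.finditer(protein)],
        "RNP1": [(m.start(), m.group()) for m in RNP1.finditer(protein)],
    }


# Hypothetical toy protein: RNP2 near the N-terminus, RNP1 about 40 residues downstream.
toy_protein = (
    "MSEK"
    + "LFVGNL"
    + "SWDTTEEDLKELFSKFGEVVDARVVTDRETGRS"
    + "KGFGFVTF"
    + "EDA"
)

if __name__ == "__main__":
    print(find_rnp_motifs(toy_protein))
```

Exact patterns of this kind are deliberately strict; profile-based searches are more tolerant of the variation seen between protein families.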

The similarity between RRMs suggests that they have evolved from a common ancestor. Previous studies have shown that functional relatedness between RNA-binding proteins is reflected by sequence similarity in their RNA-binding domains. In a dendrogram of plant RRMs, the proteins fall into separate clusters, which can be identified as separate protein classes: poly(A)-binding proteins; splicing factors; nucleolin; and glycine-rich/chloroplast proteins (Fig. 3). The similarity between different copies of the RRM within the protein families indicates that they were formed by a series of duplication events. For example, the different domains of PABPs appear more related to each other than to any other RRM in the tree. This also applies to conserved proteins in phylogenetically distant organisms, such as humans and yeast, and so it seems that the duplications occurred early in evolution, prior to the differentiation of RNA-binding proteins. The cluster formed by splicing factors is in fact composed of two distinct subgroups, snRNP proteins and SR proteins. One of the differentiating features of the RRM of the SR proteins (SR1 and RSp31) is the presence, in the second α-helix of the RRM, of the peptide motif F[E/D]DxRDAEDA (Fig. 2). This has been defined as a diagnostic feature of this protein family. The similarity between splicing-factor RRMs denotes an evolutionary and/or functional relatedness of all these proteins. A distinct cluster is formed by the two RRM domains of nucleolin, which appear distantly related to other plant RRMs in the tree. A striking homology is observed between the RNA-binding domains of chloroplast and glycine-rich proteins, most remarkably in the case of the C-terminal RRM (RRM2) of chloroplast proteins (Fig. 3).
The high sequence homology between these proteins has led to the suggestion that they could be related. Sequence comparison of the C-terminal RRM domain of chloroplast proteins and the RRM of glycine-rich proteins reveals a conserved sequence in loop 3 (including the last amino acid of β-sheet 2), with the consensus DRETGRS; high homology extends through the contiguous RNP1 (Fig. 2). Loop 3 has previously drawn attention, because although it is generally highly variable, it is remarkably well conserved in some protein families. Its location in the RRM-RNA complex is close to the RNA-interacting RNP1 and RNP2 consensus sequences, and so it is likely to influence RNA binding, either through direct interaction or by affecting the orientation of the domain for RNA recognition in the RNP1 and RNP2 regions. In fact, the sequence of loop 3 has been implicated in RNA-sequence selectivity; the replacement of residues in loop 3 of human U1A protein with the analogous loop of U2B" confers U2B" RNA-binding specificity on the hybrid protein. In addition to sequence homology, other similarities between chloroplast and glycine-rich proteins are observed at the level of gene organization. An intron is found in the same position in the RRM region of glycine-rich proteins as in the N-terminal RRM of chloroplast proteins. Also, a subgroup of chloroplast proteins contains a glycine-rich region between the two RRMs, which, though shorter, resembles that in glycine-rich proteins. Relatedness of these proteins may also be at a functional level: they all show a preference for G/U-rich sequences in in vitro binding assays. Moreover, the observation that transcripts encoding both types of proteins accumulate during drought stress may be indicative of similar expression-regulatory mechanisms. A group of cyanobacterial RNA-binding proteins with the same domain organization as plant glycine-rich proteins has been identified.
Their RRM is also very similar to that in the chloroplast proteins, and they show the same ribohomopolymer-binding preferences; it has been hypothesized that these proteins have a common origin. According to the endosymbiotic theory, chloroplasts are thought to originate from organisms like cyanobacteria. Thus, an RRM protein may have been transferred from a
cyanobacterium-like endosymbiont to the nucleus. Later, this nuclear gene may have duplicated and become fused to other genes, giving rise to nuclear-encoded chloroplast and glycine-rich proteins. However, cDNAs encoding mammalian proteins with sequence similarity to the glycine-rich plant proteins have also recently been identified, and their RRM is also significantly homologous to those in chloroplast and cyanobacterial proteins. This finding extends the list of known members of this RRM family, and opens up new possibilities for their evolutionary relationships, which may date back to an earlier time.

Future prospects

Unravelling the molecular mechanisms that underlie cellular post-transcriptional processes is crucial for understanding the regulation of gene expression in response to changing environmental conditions or at different developmental stages. In spite of the growing list of plant proteins known to contain the RRM, the biological processes in which they participate, with a few exceptions, are still poorly understood. For example, glycine-rich, RNA-binding proteins are abundant plant nucleolar proteins that appear to be involved in the molecular responses to stress conditions, yet their precise role, perhaps in rRNA post-transcriptional processes, is still a mystery. Various plant splicing factors have been identified on the basis of their homology to well-known vertebrate proteins, but again, not much is known about their precise function other than by inference from vertebrate systems. In addition, there are some important groups of vertebrate RNA-binding proteins for which no clear plant homologues have yet been identified, such as hnRNP proteins, known to affect pre-mRNA processing and transport. Two proteins from tobacco, RZ-1 and GRP3, localize in the nucleoplasm, and their RRMs are similar to the RRM of the hnRNP G protein: could they be plant hnRNP proteins?

Lack of functional information on plant RNA-binding proteins is partly a consequence of the novelty of the field. It may also be caused by the experimental difficulties involved in the study of RNA-binding proteins, and a major problem is the identification of the RNA targets. Ribohomopolymer-binding assays are widely used, and they serve to classify a protein as an RNA-binding protein, but their biological significance is unclear. A relatively new approach is the use of a randomized pool of short sequences to perform various rounds of in vitro selection, followed by the cloning and sequencing of the selected sequences to determine a consensus binding site, which may then provide information on the actual cellular ligands. This technique has proven very useful in some cases, but it has the disadvantage of being rather indirect: the situation in vivo may be much more complex. For instance, it overlooks the natural folding of RNA molecules, and, in the cell, the activity of the protein may be affected by interaction with other proteins. Another possible approach is the amplification, by random RT-PCR, of RNAs co-immunopurified with the RNA-binding protein of interest in a cellular extract. This procedure could theoretically lead to the cloning of interacting RNAs in a relatively direct manner.

How much can be learned from sequence comparison studies? It has been shown that the analysis of similarities in the RRM sequences serves to differentiate diverse functional classes of RNA-binding proteins. Residues that are specifically conserved within a protein family are expected to be important for function, and this can be experimentally assessed by mutagenesis studies. Sequencing projects are in progress for different plant species, and many new RRM sequences will be discovered in the future. This will provide a more complete picture of the diversity of plant RNA-binding proteins, and allow us to assign sequences to different functional groups.
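The last step of the in vitro selection approach described above, deriving a consensus binding site from the sequenced clones, can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the selected sequences are taken to be already aligned and of equal length, the 60% threshold is arbitrary, and the function name and the example sequences are hypothetical.

```python
from collections import Counter


def consensus(selected: list, min_fraction: float = 0.6) -> str:
    """Per-position consensus of equal-length, pre-aligned sequences.

    A position is reported as 'N' when no single base reaches min_fraction.
    """
    if not selected:
        raise ValueError("no sequences given")
    length = len(selected[0])
    if any(len(seq) != length for seq in selected):
        raise ValueError("sequences must have equal length")
    result = []
    for column in zip(*selected):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(selected) >= min_fraction else "N")
    return "".join(result)


# Hypothetical sequences recovered after selection, for illustration only.
toy_round = ["AUUGCAC", "AUUGCAC", "AUUGUAC", "CUUGCAC", "AUUGCAG"]
toy_mixed = ["AUUGCAC", "AUUGCAU", "CUUGUAG", "GUUGAAA"]

if __name__ == "__main__":
    print(consensus(toy_round))
    print(consensus(toy_mixed))
```

In practice the selected sequences differ in length and register, so an alignment or motif-finding step precedes the column counts; positions that remain 'N' mark bases that the protein does not appear to discriminate.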

January 1998, Vol. 3, No. 1

Acknowledgements

We would like to thank Ramon Rota and Giovanna Vinti for assistance with the illustrations. This work was funded by grants BIOTECH BIO4-CT960062 from the European Community and BIO97-1211 from the Plan Nacional de Investigación Científica y Desarrollo Tecnológico.

References

1 Dreyfuss, G., Hentze, M. and Lamond, A.I. (1996) From transcript to protein, Cell 85, 963-972
2 Simpson, G.G. and Filipowicz, W. (1996) Splicing of precursors to mRNA in higher plants: mechanism, regulation and sub-nuclear organization of the spliceosomal machinery, Plant Mol. Biol. 32, 1-41
3 Sugita, M. and Sugiura, M. (1996) Regulation of gene expression in chloroplasts of higher plants, Plant Mol. Biol. 32, 315-326
4 Burd, C.G. and Dreyfuss, G. (1994) Conserved structures and diversity of functions of RNA-binding proteins, Science 265, 615-621
5 Kenan, D.J., Query, C.C. and Keene, J.D. (1991) RNA recognition: towards identifying determinants of specificity, Trends Biochem. Sci. 16, 214-220
6 Fukami-Kobayashi, K., Tomoda, S. and Go, M. (1993) Evolutionary clustering and functional similarity of RNA-binding proteins, FEBS Lett. 335, 289-293
7 Mieszczak, M. et al. (1992) Multiple plant RNA-binding proteins identified by PCR: expression of cDNAs encoding RNA-binding proteins targeted to chloroplast in Nicotiana plumbaginifolia, Mol. Gen. Genet. 234, 390-400
8 Li, Y. and Sugiura, M. (1990) Three distinct ribonucleoproteins from tobacco chloroplasts: each contains a unique amino terminal acidic domain and two ribonucleoprotein consensus motifs, EMBO J. 9, 3059-3066
9 Belostotsky, D.A. and Meagher, R.B. (1993) Differential organ-specific expression of three poly(A)-binding-protein genes from Arabidopsis thaliana, Proc. Natl. Acad. Sci. U. S. A. 90, 6686-6690
10 Hilson, P., Carroll, K.L. and Masson, P.H. (1993) Molecular characterization of PAB2, a member of the multigene family coding for poly(A)-binding proteins in Arabidopsis thaliana, Plant Physiol. 103, 525-533
11 Bögre, L. et al. (1996) Developmental and cell cycle regulation of alfalfa nucMs1, a plant homolog of the yeast Nsr1 and mammalian nucleolin, Plant Cell 8, 417-428
12 Macknight, R. et al. (1997) FCA, a gene controlling flowering time in Arabidopsis, encodes a protein containing RNA-binding domains, Cell 89, 737-745
13 Albà, M.M. et al. (1994) The maize RNA-binding protein MA16 is a nucleolar protein located in the dense fibrillar component, Plant J. 6, 825-834
14 Moriguchi, K., Sugita, M. and Sugiura, M. (1997) Structure and subcellular localization of a small RNA-binding protein from tobacco, Plant J. 12, 215-221
15 Sanchez, H. et al. (1996) Transfer of rps19 to the nucleus involves the gain of an RNP-binding motif which may functionally replace RPS13 in Arabidopsis thaliana, EMBO J. 15, 2138-2149
16 Simpson, G.G. et al. (1995) Molecular characterization of the spliceosomal proteins U1A and U2B” from higher plants, EMBO J. 14, 4540-4550
17 Simpson, G.G. et al. (1991) Evolutionary conservation of the spliceosomal protein, U2B”, Nucleic Acids Res. 19, 5213-5217
18 Lopato, S., Waigmann, E. and Barta, A. (1996) Characterization of a novel arginine/serine-rich splicing factor in Arabidopsis, Plant Cell 8, 2255-2264
19 Lazar, G. et al. (1995) Identification of a plant serine-arginine-rich protein similar to the mammalian splicing factor SF2/ASF, Proc. Natl. Acad. Sci. U. S. A. 92, 7672-7676
20 Gómez, J. et al. (1988) A gene induced by the plant hormone abscisic acid in response to water stress encodes a glycine-rich protein, Nature 334, 262-264
21 Hirose, T., Sugita, M. and Sugiura, M. (1993) cDNA structure, expression and nucleic-acid binding properties of three RNA-binding proteins in tobacco: occurrence of tissue-specific alternative splicing, Nucleic Acids Res. 21, 3981-3987
22 van Nocker, S. and Vierstra, R.D. (1993) Two cDNAs from Arabidopsis thaliana encode putative RNA binding proteins containing glycine-rich domains, Plant Mol. Biol. 21, 695-699
23 Carpenter, C.D., Kreps, J.A. and Simon, A.E. (1994) Genes encoding glycine-rich Arabidopsis thaliana proteins with RNA-binding motifs are influenced by cold treatment and an endogenous circadian rhythm, Plant Physiol. 104, 1015-1025
24 Dunn, M.A. et al. (1996) A low-temperature-responsive gene from barley encodes a protein with single-stranded nucleic acid-binding activity which is phosphorylated in vitro, Plant Mol. Biol. 30, 947-959
25 Derry, J.M.J., Kerns, J.A. and Francke, U. (1995) RBM3, a novel human gene in Xp11.23 with a putative RNA-binding domain, Hum. Mol. Genet. 4, 2307-2311
26 Nishiyama, H. et al. (1997) A glycine-rich RNA-binding protein mediating cold-inducible suppression of mammalian cell growth, J. Cell Biol. 137, 899-908
27 Sato, N. (1994) A cold-regulated cyanobacterial gene cluster encodes a RNA-binding protein and ribosomal protein S21, Plant Mol. Biol. 24, 819-823
28 Sugita, M. and Sugiura, M. (1994) The existence of eukaryotic ribonucleoprotein consensus sequence-type RNA-binding proteins in a prokaryote, Synechococcus 6301, Nucleic Acids Res. 22, 25-31
29 Ludevid, M.D. et al. (1992) RNA binding characteristics of a 16 kD glycine-rich protein from maize, Plant J. 2, 999-1003
30 Freire, M.A. and Pagès, M. (1995) Functional characteristics of the maize RNA-binding protein MA16, Plant Mol. Biol. 29, 797-807
31 Hirose, T., Sugita, M. and Sugiura, M. (1994) Characterization of a cDNA encoding a novel type of RNA-binding properties, Mol. Gen. Genet. 244, 360-366
32 Hanano, S., Sugita, M. and Sugiura, M. (1996) Isolation of a novel RNA-binding protein and its association with a large ribonucleoprotein particle present in the nucleoplasm of tobacco cells, Plant Mol. Biol. 31, 57-68
33 Schuster, G. and Gruissem, W. (1991) Chloroplast mRNA 3’ end processing requires a nuclear-encoded RNA-binding protein, EMBO J. 8, 4163-4170
34 Li, Y. and Sugiura, M. (1991) Nucleic acid-binding specificities of tobacco chloroplast ribonucleoproteins, Nucleic Acids Res. 19, 2893-2896
35 Ye, L. et al. (1991) Diversity of a ribonucleoprotein family in tobacco chloroplasts: two new chloroplast ribonucleoproteins and a phylogenetic tree of ten chloroplast RNA-binding domains, Nucleic Acids Res. 19, 6485-6490
36 Ohta, M., Sugita, M. and Sugiura, M. (1995) Three types of nuclear genes encoding chloroplast RNA-binding proteins (cp29, cp31 and cp33) are present in Arabidopsis thaliana: presence of cp31 in chloroplasts and its homologue in nuclei/cytoplasm, Plant Mol. Biol. 27, 529-539
37 Breiteneder, H., Michalowski, C.B. and Bohnert, H.J. (1994) Environmental stress-mediated differential 3’ end formation of chloroplast RNA-binding protein transcripts, Plant Mol. Biol. 26, 833-849
38 Ye, L. and Sugiura, M. (1992) Domains required for nucleic acid binding activities in chloroplast ribonucleoproteins, Nucleic Acids Res. 20, 6275-6279
39 Lisitsky, I., Liveanu, V. and Schuster, G. (1995) RNA-binding characteristics of a ribonucleoprotein from spinach chloroplast, Plant Physiol. 107, 933-941
40 Lisitsky, I. and Schuster, G. (1995) Phosphorylation of a chloroplast RNA-binding protein changes its affinity to RNA, Nucleic Acids Res. 23, 2506-2511
41 Hayes, R. et al. (1996) Chloroplast mRNA 3’-end processing by a high molecular weight protein complex is regulated by nuclear encoded RNA binding proteins, EMBO J. 15, 1132-1141
42 Nagai, K. et al. (1995) The RNP domain: a sequence specific RNA-binding domain involved in processing and transport of RNA, Trends Biochem. Sci. 20, 235-240
43 Kim, Y-J. and Baker, B.S. (1993) Isolation of RRM-type RNA-binding protein genes and analysis of their relatedness by using a numerical approach, Mol. Cell. Biol. 13, 174-183
44 Margulis, L. (1981) Symbiosis in Cell Evolution, W.H. Freeman
45 Tuerk, C. and Gold, L. (1990) Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase, Science 249, 505-510

M. Mar Albà and Montserrat Pagès* are at the Departament de Genètica Molecular, Centre d’Investigació i Desenvolupament (C.S.I.C.), Jordi Girona 18-26, 08034 Barcelona, Spain. *Author for correspondence (tel +34 3 400 6131; fax +34 3 204 5904; e-mail [email protected]).

The role of NADP in the mitochondrial matrix

Many diverse metabolic processes are coupled to the turnover of the coenzyme NADP in the matrix of plant mitochondria. NADPH can be produced via the NADP-specific isocitrate dehydrogenase as well as via enzymes like NAD-malic enzyme, NAD-malate dehydrogenase and Δ1-pyrroline-5-carboxylate dehydrogenase. Although not NADP-specific, the latter enzymes can all catalyse the reduction of NADP+ at appreciable rates. The NADPH produced can be used in folate metabolism, by glutathione reductase for protection against oxidative damage, and by thioredoxin reductase in the (putative) regulation of metabolic pathways via thiol-group reduction. It can also be oxidized by the respiratory chain via a Ca2+-dependent NADPH dehydrogenase; this is a potential way of regulating the NADP reduction level in the matrix and thus, indirectly, the other processes. It is now possible to present an integrated picture of NADP turnover inside the mitochondrion.

Nicotinamide adenine dinucleotide (NAD) and nicotinamide adenine dinucleotide phosphate (NADP) are major carriers of metabolic redox energy in the cell, alternating between their reduced [NAD(P)H] and oxidized [NAD(P)+] forms when transferring reducing equivalents. Metabolism has historically been divided into catabolic reactions using NAD as a coenzyme and anabolic reactions using NADP. For example, a textbook view is that ‘NADH acts as a diffusible carrier, transporting the electrons derived from catabolic reactions to their point of entry into the respiratory chain, the NADH dehydrogenase complex...NADPH is a diffusible carrier that supplies electrons to anabolic reactions.’1 Salisbury and Ross further note that ‘none of the dehydrogenase enzymes of the [Krebs] cycle uses NADP+ as an electron acceptor...NADP+ is usually nearly undetectable in plant mitochondria.’2 Recent research suggests these statements are no longer tenable.

Copyright © 1998 Elsevier Science Ltd. All rights reserved. 1360-1385/98/$19.00

PII: S1360-1385(97)01156-4
