Plant MAP kinase kinase kinases structure, classification and evolution


Gene 233 (1999) 1–11

www.elsevier.com/locate/gene

Mini-review

S. Jouannic, A. Hamal 1, A.-S. Leprince, J.W. Tregear 2, M. Kreis, Y. Henry *

Institut de Biotechnologie des Plantes (IBP), Laboratoire de Biologie du Développement des Plantes, Bâtiment 630, UMR 6818, Université de Paris-Sud, F-91405 Orsay Cedex, France

Received 20 December 1998; accepted 14 April 1999; Received by W. Martin

Abstract

The increasing number of reports describing plant MAP kinase signalling components reflects the cardinal role that MAP kinase pathways are likely to play during plant growth and development. Relationship and structural analyses of plant MAP kinase kinase kinase (MAP3K) related cDNAs and genes established, on the one hand, the PMEKKs, which can be divided into the a, b, c and f (alpha, beta, gamma and zeta) groups, and, on the other hand, the PRAFs, which consist of the d, g and h (delta, eta and theta) groups. Plant MAP3Ks differ in primary structure between groups, but the structure is conserved within a single group. A relationship analysis, which included animal, fungal and plant MAP3Ks, revealed a high degree of diversity among this biochemically established set of proteins, thus suggesting a range of biological functions. Four major families emerged, namely the MEKK/STE11 family, including the PMEKKs, the RAF family, including the PRAFs, and the MLK and CDC7 families. These four families showed phylum-dependent distributions. Signature sequences characterizing the RAF family and the RAF subfamilies have been identified. However, no equivalent sequence motifs were identified for the MEKK/STE11 family, which is highly heterogeneous. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Arabidopsis thaliana; Brassica napus; Evolution; Relationship analysis; Signature sequences

* Corresponding author. Tel.: +33-1-69-33-63-93; fax: +33-1-69-33-64-25. E-mail address: [email protected] (Y. Henry)
1 Present address: Laboratoire de Génétique et Biotechnologie Végétale, Département de Biologie, Faculté des Sciences, BP 524, 60000 Oujda, Morocco.
2 Present address: ORSTOM, L.R.G.A.P.T., 911 Avenue Agropolis, BP 5045, F-34032 Montpellier, France.

0378-1119/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0378-1119(99)00152-3

1. Introduction

The pivotal role played by protein phosphorylation in eukaryotic signal transduction is well illustrated by the wide range of phosphorylation cascades that involve MAP kinases (mitogen-activated protein kinases or MAPKs). In vertebrates, MAPKs are typically activated in response to various mitogenic agents such as growth factors and hormones and have been shown to play an important role in the regulation of cell division and differentiation (Gustin et al., 1998). The activation of the serine/threonine MAPKs occurs via phosphorylation of conserved threonine and tyrosine residues in the catalytic subdomain VIII and is effected by dual-specificity MAPK kinases (MAP2Ks). The latter kinases are activated by serine/threonine MAP2K kinases (MAP3Ks). The activation of MAP3Ks occurs via phosphorylation by serine/threonine protein kinases called MAP3K kinases or MAP4Ks (Sells and Chernoff, 1997), via G-proteins (Fanger et al., 1997; Sugden and Clerk, 1997) or via direct interaction with the receiver domain of a two-component histidine kinase receptor (Posas and Saito, 1998). Cell signals are thus transmitted via a chain of phosphorylation and dephosphorylation events, the latter mediated by protein phosphatases (Meskiene et al., 1998).

MAP kinases may phosphorylate a number of different substrates in vivo, such as transcription factors and cytoskeletal proteins, and a wide range of cellular signals have been found to feed into the pathways in which they operate. MAP kinase pathways thus perform diverse roles, not only amongst the different organisms studied but also within a given organism or a single cell. A good example of the latter phenomenon is budding yeast, where five distinct MAP kinase pathways have been identified to date (Gustin et al., 1998).

MAP kinase homologues, including MAPKs, MAP2Ks, MAP3Ks and MAP4Ks, have been identified in different plant species. A number of potential roles for MAP kinase signalling pathways have been identified in higher plants, including responses to ethylene (Kieber et al., 1993), abscisic acid (Knetsch et al., 1996), gibberellin (Huttly and Phillips, 1995), hydration (Wilson et al., 1997), pathogens (Suzuki and Shinshi, 1995; Zhang and Klessig, 1997), wounding (Seo et al., 1995; Bögre et al., 1997; Morris et al., 1997) and stress (Jonak et al., 1996; Mizoguchi et al., 1996), as well as cell-cycle regulation (Jonak et al., 1993; Wilson et al., 1998) and suppression of the auxin response (Kovtun et al., 1998).

In this review, we focus on the complexity of the MAP3Ks and on the relationships between their different members. An analysis of the similarities shared between the various previously described plant MEKK/RAF protein kinases and the deduced coding sequences of related EST clones suggests that the MEKK/RAF genes from higher plants belong to at least seven distinct groups, namely MAP3Ka, MAP3Kb, MAP3Kc, MAP3Kd, MAP3Kf, MAP3Kg and MAP3Kh, which are organized into two major subfamilies called PMEKK and PRAF. As the MAP3Ks from yeast and animals are now represented by an increasingly large number of DNA and protein sequences, we propose a classification of this biochemically established set of protein kinases based on a relationship analysis. We also discuss the significance of the observed diversity of MAP3Ks within the different phyla with respect to their biological function.

2. Experimental

2.1. Multiple sequence alignments and relationship analysis

The NCBI BLAST facility (Altschul et al., 1990) was used to identify MAP3K protein sequences from animals, yeast and plants present in the databases on or before 20/09/98; the plant MAP3K-related sequences reported in Jouannic et al. (1999) were also included. Alignments of the MAP3K protein kinase catalytic domains identified were obtained using the CLUSTAL W 1.5 program (Higgins et al., 1996) and optimized manually. The Blosum weight matrix (Henikoff and Henikoff, 1992) was used, and all parameters were kept at default values during all sequence alignments. The alignments obtained were used to generate relationship trees using the neighbor-joining distance method (Saitou and Nei, 1987) combined with a bootstrap analysis (Felsenstein, 1985) on 1000 replicates performed by the CLUSTAL W 1.5 program according to Kültz (1998). To generate the relationship tree of the plant MAP3K proteins, the estimated protein distance values were not corrected for multiple amino acid substitutions at a single site.

2.2. Gene structure

The structures of the AtANP1 and AtMAP3Kb3 genes were determined by comparing them to the cognate cDNA sequences (Nishihama et al., 1997; Jouannic et al., 1999). In the case of the AtMAP3Kc gene, the 5′ structure of the gene was predicted using the NetPlantGene program (Hebsgaard et al., 1996) and the 3′ structure by comparison with the corresponding cDNA sequence (Jouannic et al., 1999). The structures of the AtMAP3Kb4, AtMAP3Kd3, AtMAP3Kd4, AtMAP3Kg2 and AtMAP3Kh1 genes were determined on the basis of the related gene and cDNA sequences supported by a NetPlantGene analysis.
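For readers unfamiliar with the neighbor-joining method cited in Section 2.1, the following is a minimal textbook sketch of the algorithm of Saitou and Nei (1987) in Python. It is not the CLUSTAL W implementation used in this study: the function name `neighbor_joining`, the internal node labels and the five-taxon distance matrix are illustrative assumptions and are not derived from the MAP3K alignments.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Textbook neighbor-joining on a pairwise distance matrix.

    names: list of taxon labels.
    dist:  dict mapping frozenset({a, b}) -> distance between a and b.
    Returns the edges (node, neighbour, branch_length) of the unrooted tree.
    """
    nodes = list(names)
    d = dict(dist)
    edges = []
    counter = 0
    while len(nodes) > 2:
        n = len(nodes)
        # r[a]: summed distance from node a to every other active node
        r = {a: sum(d[frozenset((a, b))] for b in nodes if b != a) for a in nodes}

        def q(pair):
            # Q criterion: the pair with the smallest value is joined next
            a, b = pair
            return (n - 2) * d[frozenset(pair)] - r[a] - r[b]

        a, b = min(combinations(nodes, 2), key=q)
        d_ab = d[frozenset((a, b))]
        # branch lengths from a and b to the new internal node
        len_a = 0.5 * d_ab + (r[a] - r[b]) / (2 * (n - 2))
        len_b = d_ab - len_a
        counter += 1
        u = f"node{counter}"  # hypothetical label for the internal node
        for c in nodes:
            if c not in (a, b):
                d[frozenset((u, c))] = 0.5 * (
                    d[frozenset((a, c))] + d[frozenset((b, c))] - d_ab
                )
        nodes = [c for c in nodes if c not in (a, b)] + [u]
        edges.append((a, u, len_a))
        edges.append((b, u, len_b))
    # the two remaining nodes are connected by a single edge
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges


# Invented toy distances for five taxa (not MAP3K data)
toy_names = ["a", "b", "c", "d", "e"]
toy_matrix = [
    [0, 5, 9, 9, 8],
    [5, 0, 10, 10, 9],
    [9, 10, 0, 8, 7],
    [9, 10, 8, 0, 3],
    [8, 9, 7, 3, 0],
]
toy_dist = {
    frozenset((toy_names[i], toy_names[j])): toy_matrix[i][j]
    for i, j in combinations(range(len(toy_names)), 2)
}
toy_edges = neighbor_joining(toy_names, toy_dist)
```

A bootstrap analysis in the sense of Felsenstein (1985) would repeat this tree construction on alignments whose columns have been resampled with replacement and count how often each grouping recurs; that step is omitted here for brevity.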

3. Results

3.1. Relationship analysis of the proteins encoded by the plant MAP3K genes

Twenty-seven plant cDNAs or genes from Arabidopsis thaliana, Brassica napus, Lycopersicum esculentum, Oryza sativa, Ricinus communis and Citrus sinensis encoding putative MAP kinase kinase kinases (Table 1) have been analyzed. Plant MAP3Ks share identities ranging from 24 to 90% over the catalytic domain. A relationship analysis covering the three most carboxy-terminal catalytic subdomains, based upon the neighbor-joining method (Saitou and Nei, 1987), was carried out (Fig. 1). The results revealed the same topology as an analysis using the 12 catalytic subdomain sequences (data not shown), thus enabling the inclusion of any truncated catalytic domain sequences. Our analysis showed that the characterized plant MAP3Ks can be classified into at least seven distinct groups, namely the a, b, c, f, d, g and h groups (Fig. 1). Members of the a, b, c and f groups share high similarities over their catalytic domain (about 50%), whereas reduced similarities (about 25%) were observed when members of these groups were compared to members of the d, g and h groups. These results are illustrated by the topology of the distance dendrogram (Fig. 1). The cutoff between the different groups, supported by the catalytic domain sequence relationship analysis, was complemented by a comparison of gene and protein structures (see Sections 3.2 and 3.3) or of the lengths of the detected transcripts (Jouannic et al., 1999).
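The identity values quoted in Section 3.1, and the idea of restricting the comparison to the most carboxy-terminal subdomains so that truncated sequences can be included, can be illustrated with a short Python sketch. This is an assumption-laden illustration rather than the procedure used for Fig. 1: the function `percent_identity` and the three toy sequences are hypothetical, and gaps are simply skipped.

```python
def percent_identity(seq_a, seq_b, columns=None):
    """Percent identity between two aligned sequences of equal length.

    columns: optional iterable of alignment column indices to compare
             (e.g. only the columns covering the last subdomains).
    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    cols = range(len(seq_a)) if columns is None else columns
    compared = 0
    matches = 0
    for i in cols:
        x, y = seq_a[i], seq_b[i]
        if x == "-" or y == "-":
            continue  # skip gapped or missing positions
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return float("nan")
    return 100.0 * matches / compared


# Invented aligned fragments; the third is truncated at its amino-terminal end
full_1 = "GSFGTVYKAW"
full_2 = "GSYGEVYRAW"
truncated = "-----VYKAW"

identity_full = percent_identity(full_1, full_2)
identity_tail = percent_identity(full_1, truncated, columns=range(5, 10))
identity_mixed = percent_identity(full_2, truncated)
```

Restricting `columns` to a region present in every sequence lets truncated entries, such as EST-derived sequences, be compared on the same footing as full-length catalytic domains.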

Table 1. Characterized plant MAP3Ks a

cDNA/gene  Sp.  cDNA acc. Gene acc. Annotation     Length  MW (kDa) pI    Reference
AtMAP3Ka   At   AJ010090  –         –              582     66.5     8.3   Jouannic et al. (1999)
BnMAP3Ka1  Bn   AJ010091  –         –              591     64.5     8.2   Jouannic et al. (1999)
AtARAKIN   At   L43125    –         –              –       –        –     Covic and Lew (1996)
AtMEKK1    At   D50468    AF076275  T15F16.5       608     65.9     7.8   Mizoguchi et al. (1996)
AtMAP3Kb3  At   AJ010092  AF076275  T15F16.2       560b    62.4b    7.8b  Jouannic et al. (1999)
BnMAP3Kb1  Bn   AJ010093  –         –              575     62.6     7.8   Jouannic et al. (1999)
AtMAP3Kc   At   Y14316    AB010700  Not annotated  533b    58.9b    8.1b  Jouannic et al. (1999)
AtANP1L    At   AB000796  AC000106  F7G19.13       661     72.8     7.4   Nishihama et al. (1997)
AtANP1S    At   AB000797  AC000106  F7G19.13       376     41.4     7.6   Nishihama et al. (1997)
AtANP2     At   AB000798  –         –              642     70.8     7.4   Nishihama et al. (1997)
AtANP3     At   AB000799  –         –              651     71.7     7.9   Nishihama et al. (1997)
NtNPK1     Nt   D26601    –         –              690     76.2     7.9   Banno et al. (1993)
AtCTR1     At   –         L08789    –              821     90.3     7.7   Kieber et al. (1993)
LeCTR1     Le   Y13273    –         –              829     92.0     7.8   Wang and Li (1997)
LeCTR2     Le   AJ005077  –         –              982     107.3    7.7   Unpublished
AtMAP3Kd1  At   Y14199    –         –              –       –        –     Jouannic et al. (1999)
CsMAP3Kf1  Cs   C21846    –         –              –       –        –     Hisada et al. (1996)
AtMAP3Kd2  At   R29949    –         –              –       –        –     Newman et al. (1994)
OsMAP3Kd1  Os   D47273    –         –              –       –        –     Unpublished
RcMAP3Kh1  Rc   T14949    –         –              –       –        –     VandeLoo et al. (1995)
OsMAP3Kh1  Os   D41138    –         –              –       –        –     Unpublished
AtMAP3Kg1  At   H36649    –         –              –       –        –     Newman et al. (1994)
AtMAP3Kb4  At   –         AF076275  T15F16.3       773b    84.9b    7.6b  Jouannic et al. (1999)
AtMAP3Kd3  At   –         AC003981  F22013.21      945b    104.9b   7.7b  Jouannic et al. (1999)
AtMAP3Kd4  At   –         AL031018  F7H19.240      736b    82.3b    7.1b  This study
AtMAP3Kg2  At   –         AC003981  F22O13.21      –       –        –     This study
AtMAP3Kh1  At   –         AC004669  F7F1.22        –       –        –     Jouannic et al. (1999)

a Part A of the table shows the cDNAs isolated from various plants. Part B shows characterized EST clones from A. thaliana, R. communis, O. sativa and C. sinensis. Part C shows A. thaliana genes characterized using the nucleotide sequences of various BAC clones deposited by the Genome Sequencing Project for which no corresponding EST clones or cDNAs have been identified. "cDNA acc." and "Gene acc." are the accession numbers of the cDNA and gene sequences; "Annotation" is the systematic sequence annotation. Protein features are given as number of amino acids (Length), size in kilodaltons (MW) and predicted isoelectric point (pI). Molecular weights and pI values were calculated using the Mac Molly package. The species are indicated as: Arabidopsis thaliana (At), Brassica napus (Bn), Nicotiana tabacum (Nt), Lycopersicum esculentum (Le), Oryza sativa (Os), Ricinus communis (Rc) and Citrus sinensis (Cs).
b Deduced from the predicted structure of the corresponding ORF.
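As a rough consistency check on the protein features in Table 1, the molecular weight of each protein can be divided by its length. The sketch below is only an illustration, not the Mac Molly calculation: the (length, molecular weight) pairs are paired according to the row order of Table 1, and the names `TABLE1_PROTEINS` and `daltons_per_residue` are hypothetical.

```python
# (name, length in amino acids, molecular weight in kDa), paired by Table 1 row order
TABLE1_PROTEINS = [
    ("AtMAP3Ka", 582, 66.5),
    ("BnMAP3Ka1", 591, 64.5),
    ("AtMEKK1", 608, 65.9),
    ("AtMAP3Kb3", 560, 62.4),
    ("BnMAP3Kb1", 575, 62.6),
    ("AtMAP3Kc", 533, 58.9),
    ("AtANP1L", 661, 72.8),
    ("AtANP1S", 376, 41.4),
    ("AtANP2", 642, 70.8),
    ("AtANP3", 651, 71.7),
    ("NtNPK1", 690, 76.2),
    ("AtCTR1", 821, 90.3),
    ("LeCTR1", 829, 92.0),
    ("LeCTR2", 982, 107.3),
    ("AtMAP3Kb4", 773, 84.9),
    ("AtMAP3Kd3", 945, 104.9),
    ("AtMAP3Kd4", 736, 82.3),
]


def daltons_per_residue(length, mw_kda):
    """Average mass per amino acid residue, in daltons."""
    return mw_kda * 1000.0 / length


ratios = {name: daltons_per_residue(length, mw) for name, length, mw in TABLE1_PROTEINS}
mean_ratio = sum(ratios.values()) / len(ratios)

for name, value in ratios.items():
    print(f"{name:10s} {value:6.1f} Da/residue")
print(f"mean       {mean_ratio:6.1f} Da/residue")
```

All listed proteins fall close to the familiar rule of thumb of roughly 110 Da per residue, which supports the pairing of lengths and molecular weights used above.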

Fig. 1. Relationship analysis of the plant MAP3K deduced catalytic domain sequences. A relationship analysis of the plant MAP3Ks characterized was performed over the subdomains IX, X and XI (77 positions) using the neighbor-joining method (Saitou and Nei, 1987). Bootstrap values (1000 replicates) are represented as percentages. A scale bar of the deduced distance is shown at the bottom.

3.2. Relationship analysis of the MAP3K protein kinases

A relationship tree constructed using animal, fungal and plant MAP3K catalytic domain sequences showed that these protein kinases can be divided into at least four distinct families, i.e. the MEKK/STE11, CDC7, RAF and MLK families. A fifth family can be defined by the c-MOS protein kinases, but, owing to their highly divergent nature, these proteins were not considered in this study. As shown in Fig. 2A, a relationship analysis revealed that the plant MAP3K proteins are related either to the MEKK/STE11 family or to the RAF family. Thus, they can be classified into the PMEKK (a, b, c and f groups) and PRAF (d, g and h groups) subfamilies. Within the MEKK/STE11 family, the protein kinases from fungi and animals can also be classified, according to their relationships, into the ASK1, MEKK2, STE11 and SSK2 subfamilies. Within the STE11 subfamily, two groups were identified, the first including STE11, BYR2 and NRC-1 and the second BCK1, PMK1 and KlBCK1. The SSK2 subfamily contains proteins from vertebrates and yeast, whereas the ASK1, MEKK2 and STE11 subfamilies incorporate, respectively, kinases from animals, vertebrates and yeasts. Three other sequences from vertebrates, MEKK1, TPL-2/COT1 and NIK1, are not closely related to any of the subfamilies defined in the MEKK/STE11 family. The NIK1 sequence was included in this study on the basis of its similarity to MEKK proteins, although no evidence of its involvement in the activation of a MAP2K has been reported.

In our analysis, the MLK family is resolved into the MLK2 and MUK subfamilies, composed of only animal sequences. As for NIK1, the MLK1 sequence was considered on the basis of its similarity to the established MAP3Ks MLK2 and SPRK. The vertebrate TAK1 protein is considered an "orphan" sequence, but could represent an emerging new family of MAP3Ks related to MLK.

The yeast MAP3Ks are classified into three distinct subfamilies, two falling within the MEKK/STE11 family (STE11 and SSK2 subfamilies) and one defining the CDC7 family. An analysis of the S. cerevisiae genome sequence showed that it does not contain gene sequences coding for protein kinases related to the RAF and MLK families (Hunter and Plowman, 1997). In contrast, animals possess MAP3Ks belonging to the MEKK/STE11, MLK and RAF families. None of the plant MAP3K-related sequences characterized to date seems to be closely related to the MLK family of MAP3Ks.

A multiple alignment of catalytic domain sequences of the 54 MAP3Ks from animals, yeast and plants allowed the identification of signature sequences. These signatures were evaluated according to Kültz (1998). The signature sequence in subdomain I, [RK][IV]GXG[SF][FY]G[TE]VX[KRH][GA]X[WF][HFN]G, is common to all the RAF family members and discriminates them from the other MAP3Ks. Moreover, it is possible to identify subfamily-specific signature sequences, namely AI[VI]TQWCEGSSLYXHXH[VI] in subdomain IV for the non-plant RAF subfamily (ARAF), and [LIM]X[SD]X[ST]X[AK]GTP[EQ]W, located upstream of subdomain VIII, for the plant RAF subfamily (PRAF). Other family- and subfamily-specific signature sequences were identified, but are not reported here because of the limited number of sequences available or because group members are from the same species.

Fig. 2B shows that the MEKK/STE11 protein kinases are not characterized by a conserved organization of the protein structure. The catalytic domain is located either at the amino- or carboxy-terminal extremity or in the central part of the protein. Moreover, the MEKK/STE11 non-catalytic regions differ in sequence and length between the different subfamilies. The structural organization and the non-catalytic sequences of a MAP3K belonging to the MEKK/STE11 family are conserved only between closely related proteins, especially within groups of plant proteins. In contrast, the animal and plant proteins of the RAF family share the same structural organization, characterized by a carboxy-terminal position of the catalytic domain and a long amino-terminal extension. Nevertheless, the non-catalytic sequences have significantly diverged between animal and plant proteins. The proteins of the g and h groups are also characterized by a carboxy-terminal catalytic domain, but due to the truncated sequences, it was not possible to predict their overall structural organization with certainty (Jouannic et al., 1999).

Fig. 2. Relationship analysis and primary structures of the MAP3K families of protein kinases. (A) Neighbor-joining relationship analysis within deduced amino acid catalytic domain region (from subdomain II to subdomain XI) of 53 plant, animal and fungal MAP3Ks. Bootstrap values (1000 replicates) are indicated as percentages, when greater than 50%, on the left of nodes. MAP3K families are indicated by square brackets. Closely related sequences are grouped together by thin vertical lines at the right of assigned subfamilies, with the exception of the plant MAP3Ks, which are indicated by grey boxes. The sequences marked with an asterisk do not belong to any of the defined clades. Plant MAP3K groups are indicated by bold vertical lines within the grey boxes. A scale bar of the deduced distance is shown top left. The species origins of the non-plant sequences are indicated in brackets: Sc, Saccharomyces cerevisiae; Sp, Schizosaccharomyces pombe; Nc, Neurospora crassa; Kl, Kluyveromyces lactis; Rn, Rattus norvegicus; Mm, Mus musculus; Hs, Homo sapiens; Ce, Caenorhabditis elegans; Dd, Dictyostelium discoideum; Dm, Drosophila melanogaster. The Accession Nos of non-plant sequences are: NIK1, Y10256; TPL2, M94454; MEKK1, L13103; Pk92b, JC4673; MAPKKK5, U67156; CEESN53F, U41994; PMK1, Q10407; BCK1, Q01389; KlBCK1, AJ005079; NRC-1, AF034090; BYR2, P28829; STE11, P23561; MKKalpha, AF093689; MEKK2, Q61083; MEKK3, Q61084; SSK2, P53599; WIS4, Z98763; MKK4a, U85607; CDC7, P41892; CDC15, P27636; TAK1, D76446; MLK1, P80192; MLK2, S68178; SPRK, A53800; MUK, D49785; LZK, AB001872; CEF33E2.2, Z84574; LIN-45, Q07292; D-RAF, P11346; C-RAF, P11345; B-RAF, P15056; A-RAF, P10398. See Table 1 for the Accession Nos of the plant sequences. (B) A primary structure comparison of MAP3Ks from different subfamilies and groups. Catalytic domain regions are indicated as hatched boxes.

Fig. 3. Primary structure of the characterized A. thaliana MAP3K genes. Exons are shown as boxes: grey for the non-catalytic domain encoding region, black for the catalytic domain encoding region and white for the 3′ untranslated region. The first exon starts at the first ATG codon. The end of the grey or black exon, when it is the last exon of a gene, corresponds to the STOP codon. Slashes at the beginning of the AtMAP3Kc, AtMAP3Kg2 and AtMAP3Kh1 genes indicate that the start positions have not been determined.

3.3. Structure of A. thaliana MAP3K genes

Only one MAP3K genomic sequence has been published to date, namely that of AtCTR1 (Kieber et al., 1993). Fortunately, several other MAP3K genomic sequences are available in the databases of the A. thaliana sequencing projects. Indeed, data from Table 1 show that nine new gene sequences were identified and characterized, namely AtMEKK1, AtMAP3Kb3, AtMAP3Kc, AtANP1 and the newly characterized genes AtMAP3Kb4, AtMAP3Kd3, AtMAP3Kd4, AtMAP3Kg2 and AtMAP3Kh1. The position of the predicted translation initiation codons for AtMAP3Kc, AtMAP3Kg2 and AtMAP3Kh1 is not yet certain, because the exon–intron structure of the corresponding genomic region is only predicted and because no upstream stop codon is present in the identified reading frame. As shown in Fig. 3, all the characterized A. thaliana MAP3K genes are interrupted by introns, notably in the regions encoding the catalytic domains. The exon–intron structure is not strictly conserved between genes belonging to the different groups. Surprisingly, the alternative splicing event that generates the two isoforms of the protein AtANP1 (Nishihama et al., 1997) occurs within an exon. AtCTR1, AtMAP3Kd3 and AtMAP3Kd4 are characterized by an overall conservation of the exon structure within the sequences encoding the catalytic domain, except for the exons encoding subdomains VIII, IX and X; however, the length of the corresponding introns varies.
In contrast, the sequence and the organization of the amino-terminal encoding regions have significantly diverged between the latter three genes. The positions of the introns in the region specifying the catalytic domain of the three proteins from the b group are strictly conserved. Moreover, the size and the sequences of the introns are globally conserved between the three b genes. An interesting feature of AtMEKK1 and AtMAP3Kb4 is the presence of an intron in the 3′ untranslated region. According to the sequence of the AtARAKIN cDNA (Covic and Lew, 1996), this intron seems to be alternatively spliced. These observations strongly suggest a modulation of the size of the 3′ untranslated region, which might play an important role in the stability of the mRNA and/or in post-transcriptional regulation (Aubourg et al., 1997). The amino-terminal region of the AtMAP3Kb3 protein is encoded by three exons, whereas in AtMEKK1 and AtMAP3Kb4, the corresponding regions are encoded by a single large exon. The repeated amino acid sequences identified in the AtMAP3Kb3 protein (Jouannic et al., 1999) are encoded by two exons that are flanked by conserved introns at their 5′ and 3′ extremities. A similar observation was made for the corresponding intron of each MAP3Kb gene. The origin of the tandem repeat structure of the AtMAP3Kb3 gene can be explained by a partial duplication and a rearrangement of an ancestral exon encoding the amino-terminal non-catalytic region. The origin of the different genes encoding MAP3Kd must be ancient since both AtMAP3Kd3 and AtCTR1 appear to be more closely related to specific rice and/or tomato genes than to the other A. thaliana genes (Fig. 1). In contrast, the similarities observed in the structure and sequences of the members of the b group and their close linkage on chromosome IV (Jouannic et al., 1999) suggest a recent triplication.
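The signature sequences reported in Section 3.2 are written in a bracketed pattern notation in which X stands for any residue. As a minimal sketch, assuming that reading of the notation, they can be turned into regular expressions and used to scan protein sequences. The helper `signature_to_regex`, the dictionary keys and the test peptides below are hypothetical; the peptides were constructed to satisfy or violate the patterns and are not taken from the alignment used in this study.

```python
import re

# Signature sequences as given in Section 3.2 (X = any residue)
SIGNATURES = {
    "RAF": "[RK][IV]GXG[SF][FY]G[TE]VX[KRH][GA]X[WF][HFN]G",
    "ARAF": "AI[VI]TQWCEGSSLYXHXH[VI]",
    "PRAF": "[LIM]X[SD]X[ST]X[AK]GTP[EQ]W",
}


def signature_to_regex(signature):
    """Convert a bracketed signature into a compiled regular expression.

    Whitespace is removed and every X outside square brackets becomes '.',
    i.e. a wildcard for a single residue.
    """
    compact = "".join(signature.split())
    parts = []
    in_class = False
    for ch in compact:
        if ch == "[":
            in_class = True
        elif ch == "]":
            in_class = False
        if ch == "X" and not in_class:
            parts.append(".")
        else:
            parts.append(ch)
    return re.compile("".join(parts))


def matching_signatures(sequence):
    """Return the names of all signatures found anywhere in the sequence."""
    return [
        name
        for name, sig in SIGNATURES.items()
        if signature_to_regex(sig).search(sequence)
    ]


# Constructed peptides (illustrative only)
raf_like = "RIGSGSFGTVYKGKWHG"
araf_like = "AIVTQWCEGSSLYKHLHV"
praf_like = "LKSKTMAGTPEW"
unrelated = "KLGEGTYGVVYKARDKVT"
```

Calling `matching_signatures` on a full-length protein sequence reports which of the three signatures it contains; a sequence carrying the subdomain I signature together with the PRAF-specific motif would be a candidate member of the plant RAF subfamily under the criteria described in Section 3.2.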

4. Discussion

4.1. MAP3K classification and evolution

MAP kinase kinase kinases (MAP3Ks) have been defined on the basis of their ability to phosphorylate their specific substrate, MAP kinase kinases (MAP2Ks). A relationship analysis of animal and fungal MAP3Ks illustrates the extensive heterogeneity and diversity among these proteins when compared to the MAP kinases (Kültz, 1998) and MAP kinase kinases (Hamal et al., 1999). Nevertheless, functional complementation of yeast MAP3K mutants by unrelated animal and plant MAP3K proteins has been demonstrated (Banno et al., 1993; Blumer et al., 1994; Yamaguchi et al., 1995; Covic and Lew, 1996). On the basis of their sequence similarities, all biochemically established MAP3Ks have been found to belong to four significantly different families, namely MEKK/STE11, RAF, MLK and MOS. It is likely that the CDC7 and CDC15 proteins, from S. pombe and S. cerevisiae, respectively (Schweitzer and Philippsen, 1991; Fankhauser and Simanis, 1994), form a fifth family of MAP3Ks. This hypothesis is based both on their relationship with other MAP3Ks and on the fact that they interact with regulatory elements similar to those found upstream of yeast and animal MAP kinase modules (Shirayama et al., 1996; Schmidt et al., 1997). Moreover, the existence of a genomic sequence from A. thaliana (Accession No.: U26935) containing a CDC7-related coding sequence strongly suggests that higher plants also contain MAP3Ks belonging to the CDC7 family, which might constitute yet another group of plant MAP3Ks.

The plant MAP3Ks studied in this paper, namely the members of the PMEKK and PRAF subfamilies, belong either to the MEKK/STE11 or to the RAF family. These two plant subfamilies are separated from the non-plant members. This particular feature has already been observed for plant MAPKs and plant MAP2Ks, which belong to single subfamilies of evolutionarily related proteins called PERK and MAP2K-1, respectively (Kültz, 1998; Hamal et al., 1999). The diversity of plant MAPKs and MAP2Ks, and the diversity within the two families of plant MAP3Ks, reflect divergences of unique ancestral genes that occurred after the emergence of the plant phylum. To our knowledge, no sequences sharing high similarities to the MLK family have been reported in plants. Nevertheless, the GmPK6, ATN1 and AtMRK1 plant protein kinases share weak similarities with both RAF and MLK family members (Feng et al., 1993; Tregear et al., 1996; Ichimura et al., 1997). Hence, these proteins may define a putative new plant MAP3K subfamily, as discussed in Ichimura et al. (1997).

Alignments of kinase catalytic domain sequences showed conserved signatures for the RAF family and for the distinct subfamilies ARAF and PRAF. Moreover, the organization of the animal and plant RAF proteins is conserved, with reduced similarities over the non-catalytic regions. In contrast, no signature sequences were identified either for the MAP3Ks as a whole, for the MEKK/STE11 family or for shared subfamilies. This fact underlines the high heterogeneity of these proteins, even those of the MEKK/STE11 family. The considerable variability observed both in protein organization and in non-catalytic sequences (e.g. this study; Fanger et al., 1997) suggests that there is a wide variety of substrate specificities and regulatory mechanisms.

A relationship analysis between the characterized MAP3K sequences reflects various processes of diversification and evolution inside each phylum.
MAP3Ks from the MEKK/STE11 and CDC7 families are probably present in all phyla. In contrast, MAP3Ks of the MLK and RAF families were only observed in higher eukaryotes. Moreover, a highly distant protein kinase specific to animals, called c-MOS (Sagata, 1997), is characterized by a MAP kinase kinase kinase activity. The proteins of the RAF and MLK families, which possess a protein serine/threonine kinase specificity, revealed several structural features of the animal-specific tyrosine kinases (Hanks and Hunter, 1995). Based on a relationship analysis (see Fig. 2A), it can be postulated that the common ancestor of the three main eukaryotic clades (i.e. animals, plants and fungi) possessed at least two basal MAP3K proteins, namely CDC7 and MEKK/STE11. Furthermore, a third ancestral MAP3K gene at the origin of the RAF and MLK types may also have been present and, if so, was presumably lost in the fungal lineage after radiation of the different eukaryotic lineages. However, the characteristics of the RAF, MLK and MOS proteins, and their weak similarities to the members of the other MAP3K families, suggest that they might not originate from a common ancestral MAP3K. We postulate, therefore, that they rather reflect a convergence from distinct protein kinases towards a shared biochemical function, the ability to phosphorylate a MAP kinase kinase protein. MAP3K-related sequences from protists and algae will help to clarify this point.

4.2. MAP3K classification and cellular function

An emerging view in yeast and animals is the presence of two major types of MAP kinase modules involving either ERK-related or SAPK/JNK-related MAP kinases. The ERK pathways seem to be specifically involved in mitogen responses and cell-fate determination, whereas the SAPK/JNK pathways participate in apoptosis, stress responses or morphogenesis (Lewis et al., 1998). In animals, a functional specification of the MAP3K families can be observed. Indeed, the RAF and MOS proteins participate in ERK pathways, whereas the MEKKs, MLKs and TAK1 are involved in SAPK pathways (Lewis et al., 1998). In yeast, the MAP3K functional specificities are not so clear: proteins from the SSK2 subfamily are involved in stress responses, whereas proteins from the STE11 subfamily participate in different processes, namely mating, cell-cycle regulation, pseudo-invasive growth and stress response (Gustin et al., 1998). The situation is complicated, however, by the fact that some MAP3Ks may participate in more than one MAP kinase pathway (Fanger et al., 1997; Gustin et al., 1998), suggesting a complex network that does not conform to a linear view of a MAP kinase module. This is particularly clear for animal MAP3Ks from the MEKK/STE11 and the MLK families, which are able to phosphorylate the same MAP2K (Fanger et al., 1997).

The characterization of an A. thaliana ethylene response mutant allowed the isolation of the AtCTR1 gene, which belongs to the RAF family and has been shown to act as a negative regulator of ethylene response (Kieber et al., 1993). The NtNPK1 gene from N. tabacum, which displays a cell proliferation-dependent expression, could participate in a process regulating cell division (Banno et al., 1993; Nakashima et al., 1998). Surprisingly, this protein seems to be involved in a MAPK pathway that negatively regulates the auxin response (Kovtun et al., 1998). Moreover, the expression of the AtMEKK1 gene from A. thaliana is upregulated under stress conditions, suggesting that the corresponding protein participates in stress responses (Mizoguchi et al., 1996). It will clearly be of great interest to establish the cellular function of the other plant MAP3K proteins in order to reveal a putative correlation between families and function, as observed for animal MAP3Ks. By comparison, the plant MAPKs, which belong to a single clade of closely related proteins, seem to be involved in both mitogenesis and stress responses (Hirt, 1997; Kültz, 1998).

The observation that plant MAP3Ks constitute distinct subfamilies, i.e. PMEKKs and PRAFs, within the wider RAF and MEKK/STE11 families is accompanied by the observation that plant MAP3K genes have undergone many gene duplications during evolution. Such duplications are likely to generate, in the course of evolution, related proteins that may be involved in processes different from that of the original. This is well illustrated in A. thaliana by the genes that encode proteins from the d and f groups and in particular the b group (e.g. this study; Jouannic et al., 1999). Alternative splicing has also been observed to increase protein variety, as for the AtANP1 gene (Nishihama et al., 1997). Such processes may thus play an important role in the evolution of functional diversity, by altering substrate specificity, interactions with upstream effectors or activation processes.

MAP3Ks are characterized by the diversity of their regulation, which in turn is reflected by the variety of upstream regulators with which they interact (Daum et al., 1994; Teramoto et al., 1996; Sells and Chernoff, 1997; Sugden and Clerk, 1997; Schonwasser et al., 1998). The yeast SSK2/SSK22 MAP3Ks are directly activated by the upstream response regulator protein SSK1p, which is part of a receptor protein of the two-component response regulator family (Posas and Saito, 1998). A similar mechanism is thought to occur for the AtCTR1 protein from the PRAF subfamily, which interacts physically with the ETR1 and ERS1 receptor histidine protein kinases (Clark et al., 1998), suggesting a direct activation of the plant MAP3K by a two-component response regulator.
Other signalling elements similar to upstream regulators of yeast and animal MAP3Ks have been characterized in plants, including monomeric and heterotrimeric G-proteins (Ma, 1994) and MAP3K kinases (Leprince et al.,1999), thus underlining the complexity of MAP3K activation mechanisms and illustrating the complexity of the evolutionary processes that have given rise to higher plant MAP kinase signalling mechanisms.

Acknowledgements

Part of this work was supported via Research training fellowship No. ERBBI02CT925124 awarded under the EU Biotechnology Programme to J.W.T. M.K. acknowledges “GREG” for their support (decision n°33/94). We also acknowledge the financial support of the MERS and CNRS to UMR 6818.

References

Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Aubourg, S., Takvorian, A., Cheron, A., Kreis, M., Lecharny, A., 1997. Structure organisation and putative function of the genes identified within a 23.9 kb fragment from Arabidopsis thaliana chromosome IV. Gene 199, 241–253.
Banno, H., Hirano, K., Nakamura, T., Irie, K., Nomoto, S., Matsumoto, K., Machida, Y., 1993. NPK1, a tobacco gene that encodes a protein with a domain homologous to yeast BCK1, STE11 and Byr2 protein kinases. Mol. Cell. Biol. 13, 4745–4752.
Blumer, K.J., Johnson, G.L., Lange-Carter, C.A., 1994. Mammalian mitogen-activated protein kinase kinase kinase (MEKK) can function in a yeast mitogen-activated protein kinase pathway downstream of protein kinase C. Proc. Natl. Acad. Sci. USA 91, 4925–4929.
Bögre, L., Ligterink, W., Meskiene, I., Barker, P.J., Heberle-Bors, E., Huskisson, N.S., Hirt, H., 1997. Wounding induces the rapid and transient activation of a specific MAP kinase pathway. Plant Cell 9, 75–83.
Clark, K.L., Larsen, P.B., Wang, X., Chang, C., 1998. Association of the Arabidopsis CTR1 Raf-like kinase with the ETR1 and ERS ethylene receptors. Proc. Natl. Acad. Sci. USA 95, 5401–5406.
Covic, L., Lew, R.R., 1996. Arabidopsis thaliana cDNA isolated by functional complementation shows homology to serine threonine protein kinases. Biochim. Biophys. Acta Gene. Struct. Express. 1305, 125–129.
Daum, G., Eisenmann-Tappe, I., Fries, H.W., Troppmair, J., Rapp, U.R., 1994. The ins and outs of Raf kinases. Trends Biochem. Sci. 19, 474–479.
Fanger, G.R., Gerwins, P., Widmann, C., Jarpe, M.B., Johnson, G.L., 1997. MEKKs GCKs MLKs PAKs TAKs and TPLs: upstream regulators of the c-Jun N-terminal kinases? Curr. Opin. Genet. Dev. 7, 67–74.
Fankhauser, C., Simanis, V., 1994. The cdc7 protein kinase is a dosage dependent regulator of septum formation in fission yeast. EMBO J. 13, 3011–3019.
Felsenstein, J., 1985. Confidence limits on phylogenies: an approach using bootstrap. Evolution 39, 783–791.
Feng, X.H., Zhao, Y., Bottino, P.J., Kung, S.D., 1993. Cloning and characterisation of a novel member of protein kinase family from soybean. Biochim. Biophys. Acta 1172, 200–204.
Gustin, M.C., Albertyn, J., Alexander, M., Davenport, K., 1998. MAP kinase pathways in the yeast Saccharomyces cerevisiae. Microbiol. Mol. Biol. Rev. 62, 1264–1300.
Hamal, A., Jouannic, S., Leprince, A.S., Kreis, M., Henry, Y., 1999. Molecular characterisation and expression of an Arabidopsis thaliana L. MAP kinase kinase cDNA AtMAP2Ka. Plant Sci. 140, 41–52.
Hanks, S.K., Hunter, T., 1995. The eukaryotic protein kinase superfamily: kinase (catalytic) domain structure and classification. FASEB J. 9, 576–595.
Hebsgaard, S.M., Korning, P.G., Tolstrup, N., Engelbrecht, J., Rouze, P., Brunak, S., 1996. Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information. Nucleic Acids Res. 24, 3439–3452.
Henikoff, S., Henikoff, J.G., 1992. Amino acid substitution matrices from protein blocks. Proc. Natl. Acad. Sci. USA 89, 10915–10919.
Higgins, D.G., Thompson, J.D., Gibson, T.J., 1996. Using CLUSTAL for multiple sequence alignments. Meth. Enzymol. 266, 383–402.
Hirt, H., 1997. Multiple roles of MAP kinases in plant signal transduction. Trends Plant Sci. 2, 11–15.
Hisada, S., Moriguchi, T., Hidaka, T., Koltunow, A., Akihama, T., Omura, M., 1996. Random sequencing of sweet orange (Citrus sinensis Osbeck) cDNA library derived from young seed. J. Jpn. Soc. Hort. Sci. 65, 487–495.
Hunter, T., Plowman, G.D., 1997. The protein kinases of budding yeast: six scores and more. Trends Biochem. Sci. 22, 18–21.
Huttly, A.K., Phillips, A.L., 1995. Gibberellin-regulated expression in oat aleurone cells of two kinases that show homology to MAP kinase and a ribosomal protein kinase. Plant Mol. Biol. 27, 1043–1052.
Ichimura, K., Mizoguchi, T., Shinozaki, K., 1997. ATMRK1, an Arabidopsis protein kinase related to mammal mixed-lineage kinases and Raf protein kinase. Plant Sci. 130, 171–179.
Jonak, C., Pay, A., Bögre, L., Hirt, H., Heberle-Bors, E., 1993. The plant homologue of MAP kinase is expressed in a cell cycle-dependent and organ-specific manner. Plant J. 3, 611–617.
Jonak, C., Kiegerl, S., Ligterink, W., Barker, P.J., Huskisson, N.S., Hirt, H., 1996. Stress signalling in plants: a mitogen-activated protein kinase pathway is activated by cold and drought. Proc. Natl. Acad. Sci. USA 93, 11274–11279.
Jouannic, S., Hamal, A., Leprince, A.S., Tregear, J.W., Kreis, M., Henry, Y., 1999. Characterisation of novel plant genes encoding MEKK/STE11 and RAF-related protein kinases. Gene 229, 171–181.
Kieber, J., Rothenberg, M., Roman, G., Feldmann, K.A., Ecker, J.R., 1993. CTR1 a negative regulator of the ethylene response pathway in Arabidopsis encodes a member of the Raf family of protein kinases. Cell 72, 427–441.
Knetsch, M.L.W., Wang, M., Snaar-Jagalska, B.E., Heimovaara-Dijkstra, S., 1996. Abscisic acid induces mitogen-activated protein kinase activation in barley aleurone protoplasts. Plant Cell 8, 1061–1067.
Kovtun, Y., Chiu, W.L., Zeng, W., Sheen, J., 1998. Suppression of auxin signal transduction by a MAPK cascade in higher plants. Nature 395, 716–720.
Kültz, D., 1998. Phylogenetic and functional classification of mitogen- and stress-activated protein kinases. J. Mol. Evol. 46, 571–588.
Leprince, A.S., Jouannic, S., Hamal, A., Kreis, M., Henry, Y., 1999. Molecular characterisation of plant cDNAs, BnMAP4Ka1 and a2 belonging to the GCK/SPS1 subfamily of MAP kinase kinase kinase kinase. Biochim. Biophys. Acta 1444, 1–13.
Lewis, T.S., Shapiro, P.S., Ahn, N.G., 1998. Signal transduction through MAP kinase cascades. Adv. Cancer Res. 74, 49–139.
Ma, H., 1994. GTP-binding proteins in plants: new members of an old family. Plant Mol. Biol. 26, 1611–1636.
Meskiene, I., Bögre, L., Glaser, W., Balog, J., Brandstotter, M., Zwerger, K., Ammerer, G., Hirt, H., 1998. MP2C, a plant protein phosphatase 2C functions as a negative regulator of mitogen-activated protein kinase pathways in yeast and plants. Proc. Natl. Acad. Sci. USA 95, 1938–1943.
Mizoguchi, T., Irie, K., Hirayama, T., Hayashida, N., Yamaguchi-Shinozaki, K., Matsumoto, K., Shinozaki, K., 1996. A gene encoding a mitogen-activated protein kinase kinase kinase is induced simultaneously with genes for a mitogen-activated protein kinase and an S6 ribosomal protein kinase by touch cold and water stress in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 93, 763–769.
Morris, P.C., Guerrier, D., Leung, J., Giraudat, J., 1997. Cloning and characterisation of MEK1 an Arabidopsis gene encoding a homologue of MAP kinase kinase. Plant Mol. Biol. 35, 1057–1064.
Nakashima, M., Hirano, K., Nakashima, S., Banno, H., Nishihama, R., Machida, Y., 1998. The expression pattern of the gene for NPK1 protein kinase related to mitogen-activated protein kinase kinase kinase (MAPKKK) in a tobacco plant: correlation with cell proliferation. Plant Cell Physiol. 39, 690–700.
Newman, T., de Bruijn, F.J., Green, P., Keegstra, K., Kende, H., McIntosh, L., Ohlrogge, J., Raikhel, N., Somerville, S., Thomashow, M., Retzel, E., Somerville, C., 1994. Genes galore: a summary of methods for accessing results from large-scale partial sequencing of anonymous Arabidopsis cDNA clones. Plant Physiol. 106, 1241–1255.
Nishihama, R., Banno, H., Kawahara, E., Irie, K., Machida, Y., 1997. Possible involvement of differential splicing in regulation of the activity of Arabidopsis ANP1 that is related to mitogen activated protein kinase kinase kinases (MAPKKKs). Plant J. 12, 39–48.
Posas, F., Saito, H., 1998. Activation of the yeast SSK2 MAP kinase kinase kinase by the SSK1 two-component response regulator. EMBO J. 17, 1385–1394.
Sagata, N., 1997. What does Mos do in oocytes and somatic cells? Bioessays 19, 13–21.
Saitou, N., Nei, M., 1987. The neighbour-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.
Schmidt, S., Sohrmann, M., Hofmann, K., Woollard, A., Simanis, V., 1997. The Spg1p GTPase is an essential, dosage-dependent inducer of septum formation in Schizosaccharomyces pombe. Genes Dev. 11, 1519–1534.
Schonwasser, D.C., Marais, R.M., Marshall, C.J., Parker, P.J., 1998. Activation of the mitogen-activated protein kinase/extracellular signal-regulated kinase pathway by conventional, novel and atypical protein kinase C isotypes. Mol. Cell. Biol. 18, 790–798.
Schweitzer, B., Philippsen, P., 1991. CDC15, an essential cell cycle gene in Saccharomyces cerevisiae encodes a protein kinase domain. Yeast 7, 265–273.
Sells, M.A., Chernoff, J., 1997. Emerging from the Pak: the p21-activated protein kinase family. Trends Cell Biol. 7, 162–167.
Seo, S., Okamoto, M., Seto, H., Ishizuka, K., Sano, H., Ohashi, Y., 1995. Tobacco MAP kinase: a possible mediator in wound signal transduction pathways. Science 270, 1988–1992.
Shirayama, M., Matsui, Y., Toh-e, A., 1996. Dominant mutant alleles of yeast protein kinase gene CDC15 suppress the lte1 defect in termination of M phase and genetically interact with CDC14. Mol. Gen. Genet. 251, 176–185.
Sugden, P.H., Clerk, A., 1997. Regulation of the ERK subgroup of MAP kinase cascades through G protein-coupled receptors. Cell Signal. 9, 337–351.
Suzuki, K., Shinshi, H., 1995. Transient activation and tyrosine phosphorylation of a protein kinase in tobacco cells treated with a fungal elicitor. Plant Cell 7, 639–647.
Teramoto, H., Coso, O.A., Miyata, H., Igishi, T., Miki, T., Gutkind, J.S., 1996. Signaling from the small GTP-binding proteins Rac1 and Cdc42 to the c-Jun N-terminal kinase/stress-activated protein kinase pathway. A role for the mixed lineage kinase 3/protein-tyrosine kinase 1 a novel member of the mixed lineage kinase family. J. Biol. Chem. 271, 27225–27228.
Tregear, J.W., Jouannic, S., Schwebel-Dugué, N., Kreis, M., 1996. An unusual protein kinase displaying characteristics of both the serine/threonine and tyrosine families is encoded by the Arabidopsis thaliana gene ATN1. Plant Sci. 117, 107–119.
VandeLoo, F.J., Turner, S., Somerville, C., 1995. Expressed sequence tags from developing castor seeds. Plant Physiol. 108, 1141–1150.
Wang, Y., Lin, N., 1997. A cDNA sequence isolated from the ripening tomato fruit encodes a putative protein kinase. Plant Physiol. 114, 1135.
Wilson, C., Voronin, V., Touraev, A., Vicente, O., Heberle-Bors, E., 1997. A developmentally regulated MAP kinase activated by hydration in tobacco pollen. Plant Cell 9, 2093–2100.
Wilson, C., Pfosser, M., Jonak, C., Hirt, H., Heberle-Bors, E., Vicente, O., 1998. Evidence for the activation of a MAP kinase upon phosphate-induced cell cycle re-entry in tobacco cells. Physiol. Plant. 102, 532–538.
Yamaguchi, K., Shirakabe, K., Shibuya, H., Irie, K., Oishi, I., Ueno, N., Taniguchi, T., Nishida, E., Matsumoto, K., 1995. Identification of a member of the MAPKKK family as a potential mediator of the TGF-beta signal transduction. Science 270, 2008–2011.
Zhang, S., Klessig, D.F., 1997. Salicylic acid activates a 48-kD MAP kinase in tobacco. Plant Cell 9, 809–824.