Accepted Manuscript Evolutionary history of Mo25 gene in plants, a component of RAM/MOR signaling network
Fernanda M. Bizotto, Renan S. Ceratti, Antonio S.K. Braz, Hana Paula Masuda
PII: S0925-4773(18)30070-4
DOI: doi:10.1016/j.mod.2018.09.001
Reference: MOD 3538
To appear in: Mechanisms of Development
Received date: 31 March 2018
Revised date: 5 July 2018
Accepted date: 5 September 2018
Please cite this article as: Fernanda M. Bizotto, Renan S. Ceratti, Antonio S.K. Braz, Hana Paula Masuda, Evolutionary history of Mo25 gene in plants, a component of RAM/MOR signaling network. Mod (2018), doi:10.1016/j.mod.2018.09.001
This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Evolutionary history of Mo25 gene in plants, a component of RAM/MOR signaling network
Fernanda M. Bizotto1,2*, Renan S. Ceratti2#, Antonio S.K. Braz1+, Hana Paula Masuda2×
1 Computational, Structural and Molecular Biology Laboratory, Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, 5001 Avenida dos Estados, Santo André, Brazil. 09210-580.
2 Evolution and Diversity II Laboratory, Centro de Ciências Naturais e Humanas, Universidade Federal do ABC, Alameda da Universidade, s/nº, São Bernardo do Campo, Brazil. 09606-045.
* [email protected]
# [email protected]
+ [email protected]
× [email protected]
Corresponding Author: [email protected]
Highlights:
Mo25 was duplicated in land plants;
Arabidopsis thaliana Mo25 (AtMo25) homologs are structurally similar to each other and to the human Mo25 homolog;
The AtMo25-1 gene was duplicated by retroposition and its expression profile is similar to that observed for retrogenes;
We present an evolutionary framework of Mo25 genes in plants and use A. thaliana paralogs as an example to understand some consequences of this gene duplication.
Abstract
Change in cell morphogenesis is an important feature for proper development of eukaryotes. It is necessary for cell polarity and asymmetry and is essential for asymmetric cell division. RAM/MOR is a conserved signaling network that coordinates cell polarity determinants important for asymmetric cell division and cell polarity establishment. Mo25 is a scaffold protein that acts as a master regulator of the germinal center kinase (GCK), which triggers the downstream signaling of this network. Little is known about the RAM/MOR network or Mo25 protein homologs in plants. Here, we provide a glimpse of the evolutionary history of the Mo25 gene in green plants. Our data showed that a duplication of Mo25 occurred at the base of land plants (Embryophyta), forming the groups Mo25A and Mo25B. Further duplication events occurred in other plant lineages and one subgroup of sequences seemed to be rapidly diverging. This subgroup contained an A. thaliana paralog (AtMo25-1) which lacks introns and is expressed in a fashion similar to retrogenes (i.e. low expression levels and narrow expression breadth), suggesting that this paralog was duplicated by retroposition. We also showed that all AtMo25 proteins are structurally similar to each other and to the human homolog, although differences in residues in the interface between human Mo25 and MST3 are observed in the A. thaliana homologs. The expression profiles of AtMo25 homologs suggest that they are required in different developmental contexts, possibly interacting with different partners. Finally, we discuss whether Mo25 duplication in Embryophyta could be an evolutionary novelty important for the conquest of the terrestrial environment and whether the duplicated paralogs are undergoing neo- or subfunctionalization.
Keywords: RAM/MOR network, Mo25, plant gene duplication, retrogene, Arabidopsis thaliana
1. Introduction
Change in cell morphogenesis is a key feature of eukaryote development since it plays an important role in cell polarity, cell migration, asymmetric cell division and cell fate determination. Determination of cell morphology involves signaling pathways that orchestrate cytoskeleton organization, membrane trafficking and gene expression. The RAM/MOR (Regulation of ACE2p activity and cellular Morphogenesis/Morphogenesis-related NDR kinase) network is one of the signaling pathways that coordinate cell polarity, morphogenesis and cell separation in Saccharomyces cerevisiae (Bidlingmaier et al., 2001; Jansen et al., 2006; Maerz and Seiler, 2010; Nelson et al., 2003; Racki et al., 2000; Saputo et al., 2012; Weiss et al., 2002). RAM/MOR signaling is essential for the correct distribution of cell polarity determinants between mother and daughter cells during asymmetric division and for appropriate mating projection during sexual reproduction in budding yeasts, thus providing cues for cell fate asymmetry (Bidlingmaier et al., 2001; Colman-Lerner et al., 2001; Nelson et al., 2003; Weiss et al., 2002).
The RAM/MOR network comprises two protein kinases, Kic1 and Cbk1, and their associated proteins Mob, Hym1/Mo25, Tao3 and Sog2 (Bidlingmaier et al., 2001; Colman-Lerner et al., 2001; Nelson et al., 2003; Racki et al., 2000; Weiss, 2012; Weiss et al., 2002). Cells lacking any of these genes have cell separation defects and loss of multiple components does not exacerbate the phenotype (Bidlingmaier et al., 2001; Nelson et al., 2003). Cbk1 is an NDR/LATS kinase and the terminal kinase of the pathway. It is phosphorylated by Kic1, a germinal center kinase (GCK), which binds to its activating protein Hym1/Mo25 (Mouse protein-25 or Mo25 in animals) and Sog2. For Cbk1 activation, the Cbk1-Mob complex must be associated with Tao3, which functions as a bridge to Kic1-Hym1/Mo25-Sog2, allowing Kic1 to phosphorylate Cbk1 (Saputo et al., 2012; Weiss, 2012). Once activated, Cbk1 phosphorylates downstream targets, such as the transcription factor Ace2, responsible for mother/daughter cell separation and the control of polarized morphogenesis (Colman-Lerner et al., 2001; Weiss et al., 2002). Several studies have shown that Hym1/Mo25 is essential for Kic1 phosphorylation of Cbk1 and consequently for mother/daughter cell separation, polarized mRNA localization, secretion and polarized (thus asymmetric) morphogenesis in fungi (Bidlingmaier et al., 2001; Dettmann et al., 2012; Kanai et al., 2005; Mendoza et al., 2005; Nelson et al., 2003). Other studies in animals showed that Mo25 is essential for cell asymmetry determination in neuroblasts and intestinal epithelial cells (Chien et al., 2013; ten Klooster et al., 2009; Yamamoto et al., 2008). It was also shown that human Mo25 is part of a ternary complex composed of Mo25, the GCK family pseudokinase STRAD and the tumor suppressor LKB1, which acts as the final kinase of the pathway that regulates several cellular processes such as cell metabolism, cell proliferation, cell polarity and migration (Kullmann and Krahn, 2018).
Structural studies have already dissected Mo25-mediated kinase activation and showed that MO25 binds to several GCKs such as MST3, MST4, STK25, YSK1 and SPAK/OSR1 in mammals and stimulates their kinase activity several-fold (Filippi et al., 2011; Hao et al., 2014; Shi et al., 2013). Thus, it was suggested that MO25 might be a master activator of Ste20 kinases, which include GCKs. Although it is hard to draw a direct parallel of the RAM/MOR network involving all its components in other eukaryotes, it has been suggested that RAM/MOR shares similarities with Hippo signaling (Hsu and Weiss, 2013; Weiss, 2012).
In plants, RAM/MOR signaling core components were identified in Arabidopsis and four other species by sequence similarity searches (Zermiani et al., 2015). Homologs of all core components were identified, except for Sog2, and their expression/co-expression data suggested that RAM/MOR signaling might control polarized growth of the pollen tube and take part in fine-tuning stem cell maintenance, differentiation and organ polarity. It was also shown that AtSIK1 and AtMob1A/B, the Arabidopsis thaliana homologs of Kic1 and Mob respectively, physically interact and that these proteins regulate cell proliferation and expansion (Xiong et al., 2016). Little is known about Mo25 and the RAM/MOR or Hippo signaling in plants. To shed light on Mo25 homologs in plants and their possible role in RAM/MOR signaling or in other protein complexes, we analyzed Mo25 genes and proteins from several plant species and present a picture of the evolutionary history of this gene in plants. We also studied the Arabidopsis thaliana Mo25 (AtMo25) paralogs as an example of gene duplication in land plants and focused on the paralog AtMo25-1, the gene whose divergence rate differs most from that of its paralogs.
2. Results
Phylogeny of Mo25 protein family
Mo25 homologs were searched based on amino acid sequence similarity and only sequences from eukaryotic groups (Rhizaria, Hexamitidae, Amoebozoa, Alveolata, Kinetoplastida, Stramenopiles, Viridiplantae, Metazoa, Fungi, Choanoflagellida and Ichthyosporea) were retrieved, confirming its origin in Eukarya (Supplementary Information). Because we are interested in the evolutionary history of Mo25 in plants, we focused our attention on green plants (Viridiplantae). To better understand the phylogenetic relationships between groups within Viridiplantae and the duplication/loss events, a species tree was built for comparison with the Mo25 phylogeny obtained in this work (Figure 1A). The phylogeny of Mo25 proteins was estimated using maximum likelihood implemented in PhyML.
In plants, the Mo25 family is present since red algae (Rhodophyta), the basal group among Archaeplastida. Interestingly, Mo25 is not a monophyletic group in green plants, as most Chlorophyta (green algae) sequences are a sister group of red algae and a few formed a separate group in Eukarya (Suppl. Fig. 1). Even though some green algae sequences formed a group distant from red algae and land plants, no duplication event was observed in algae, as the organisms harboring the sequences in the separate group are different from those closer to red algae and land plants. Learning the domains present in a protein is a way to assess the function and possibly the evolution (e.g. whether it gained/lost a domain) of protein families. Therefore, protein domains were searched in all plant Mo25 sequences analyzed and the only domain identified in these sequences was described as Mo25-like (data not shown).
Interestingly, a duplication event was observed in land plants (Embryophyta) and formed two groups, here named Mo25 group A (Mo25A) and Mo25 group B (Mo25B) (Figure 1B). This is supported by the presence of Mo25 sequences from basal Embryophyta, such as the mosses Physcomitrella patens and Sphagnum fallax, Marchantia polymorpha (liverwort) and Selaginella moellendorffii (lycophyte), in both groups. Sequences of naked seed plants (Gymnospermae), such as Cycas, Ginkgo and the conifers (Araucaria, Taxus, Picea, Pseudotsuga, Pinus and Wollemia), were found only in Mo25A, suggesting a possible loss of Mo25B homologs in this particular group. Other duplication/loss events might have occurred in other plant species or groups. Nevertheless, this is hard to assess because the genomes of many plant species are not completely sequenced or assembled. Curiously, almost all eudicot species have only one representative homolog in Mo25B, while duplication events were more common in Mo25A (Figure 2 and Suppl. Fig. 2). One example is the duplication within Mo25A in the Brassicaceae family (Figure 2). Taking the A. thaliana homologs as an example, one belongs to the Mo25B group (AtMo25-3, UniProt code Q8L9L9) while the other three belong to Mo25A (AtMo25-1, AtMo25-2 and AtMo25-4; UniProt codes Q9ZQ77, Q9M0M4 and B3LFB8, respectively). Interestingly, the cluster harboring AtMo25-1 and other Brassicaceae orthologs showed an accelerated evolution rate (Figure 2A, long branch indicated in purple). Because A. thaliana is extensively studied and has a plethora of tools and information publicly available, we used this species as a model to understand the possible causes and consequences of a duplicated gene with a distinct evolution rate when compared to its paralogs.
Genomic neighborhood and gene structure analysis of AtMo25 paralogs
It has already been suggested that asymmetric sequence divergence might arise from gene relocation or retroposition as a consequence of gene duplication (Cusack and Wolfe, 2007). Thus, we analyzed the gene structure and genomic neighborhood of AtMo25 paralogs. We observed that the genomic neighborhood of the AtMo25-3 gene, the only paralog in Mo25B, is different from that of the other paralogs, with a region of 12 kb upstream without any ORF (Figure 3A). In A. thaliana Mo25A paralogs, an auxin responsive family gene is adjacent to, and in the opposite orientation from, both AtMo25-2 and AtMo25-4, suggesting that the cassette AtMo25/auxin responsive protein, rather than AtMo25 alone, was duplicated. Although AtMo25-1 also belongs to Mo25A, the gene that codes for the auxin responsive protein was missing from its neighborhood. Analyzing the gene structure of AtMo25 paralogs, we observed that only AtMo25-1 lacks introns and has a poly A sequence, suggesting that this gene was duplicated by retroposition (Figure 3B). This was confirmed by the amplification of genomic DNA using primers that include the start and stop codons of the CDS. gDNA PCR resulted in the amplification of a 1 kb band that corresponds to the CDS, confirming the absence of introns in this copy (Suppl. Fig. 3A). All other Mo25-1 orthologs from the accelerated branch also lack introns, suggesting that the retroposition occurred before the diversification of genera in the Brassicaceae family (Suppl. Fig. 3B).
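The logic of the gDNA PCR test can be illustrated with a minimal Python sketch. It is not part of the original analysis: the function names and the exon coordinates are hypothetical placeholders, not the TAIR10 annotation, and exons are assumed to be non-overlapping. For an intronless gene the genomic span equals the spliced length, so gDNA and cDNA templates are expected to give amplicons of the same size.

```python
def count_introns(exons):
    """Count gaps between consecutive exons.

    exons: list of (start, end) tuples, 1-based inclusive,
    assumed non-overlapping and on the same strand.
    """
    ordered = sorted(exons)
    introns = 0
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        if next_start > prev_end + 1:  # a gap between exons is an intron
            introns += 1
    return introns


def spliced_length(exons):
    """Total exon length, i.e. the expected cDNA amplicon size."""
    return sum(end - start + 1 for start, end in exons)


def genomic_span(exons):
    """Distance from first to last exon base, i.e. the gDNA amplicon size."""
    return max(end for _, end in exons) - min(start for start, _ in exons) + 1


# Hypothetical gene models (coordinates invented for illustration only).
intronless_model = [(1, 1000)]
multi_exon_model = [(1, 200), (301, 500), (651, 900)]

for name, model in [("intronless", intronless_model), ("multi-exon", multi_exon_model)]:
    same_size = genomic_span(model) == spliced_length(model)
    print(name, count_introns(model), "introns; gDNA == cDNA size:", same_size)
```

A gene model with introns would give a gDNA amplicon longer than the cDNA amplicon, which is the difference the PCR comparison is designed to reveal.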
Model of AtMo25 homolog proteins structure
To analyze how the sequence divergence affected the protein structure, we modelled the structure of the four A. thaliana Mo25 homologs (Figure 4). Because AtMo25-1 was the most divergent among the three sequences, we focused our study on this paralog and constructed 100 models to better elucidate the AtMo25-1 3D structure. The AtMo25-1 structural model showed an RMSD value of 0.6 Å, obtained by comparison with the crystallographic structure of the human Mo25 protein (PDB: 1UPK). It comprises two curved layers of α-helices arranged in a right-handed superhelix, forming the Armadillo-type fold present in the human Mo25 homolog (Figure 4A, Suppl. Fig. 4). Despite amino acid sequence differences, all AtMo25 homologs are structurally similar to each other and to the human homolog (Figure 4A; Suppl. Fig. 4). An alignment between the four A. thaliana homologs and the human Mo25 showed that some residues on the interface between human Mo25 and its binding partner MST3 are conserved in the four A. thaliana homologs. They include Tyr223 and the pair Arg227+Met260 of the human Mo25 (Y231 and R235+M268 in the alignment), essential for MST3 kinase activation (Figure 4B) (Filippi et al., 2011; Mehellou et al., 2013). Other residues changed, in general, to amino acids with the same biochemical characteristics. One exception is the negatively charged Glu273 (E281 in the alignment), present in the human Mo25 and three of the AtMo25 homologs but changed to serine, a polar uncharged amino acid, in the AtMo25-3 homolog. Another difference is the presence of Phe230 (F232 in the alignment), an aromatic residue, in AtMo25-1 instead of a valine, a smaller nonpolar aliphatic residue, present in human Mo25. Thus, although structurally similar to each other and to the human Mo25, AtMo25 homologs have differences in residues that might be important for the interaction with other partners.
Gene expression profile of AtMo25 paralogs
It has been suggested that relocated retrogenes are expressed in a more limited range of tissues and at a lower level than their static paralogs (Marques et al., 2005). Therefore, we analyzed the gene expression profile of AtMo25 paralogs using publicly available transcriptome data and RT-qPCR (Figure 5). To investigate possible developmental processes in which AtMo25 paralogs could participate, their expression pattern was searched in silico using the Genevestigator database (Hruz et al., 2008). Our results showed that the four AtMo25 paralogs have different gene expression profiles. The expression of AtMo25-2, AtMo25-3 and AtMo25-4 was evenly distributed among several organs, tissues and cell types (Figure 5A). AtMo25-1, on the other hand, showed a more restricted expression distribution, being highly expressed in pollen grains. Accordingly, qPCR results showed that transcript levels of AtMo25-1 are higher in reproductive structures (e.g. flower buds and flowers) than in vegetative structures (leaves and roots) (Figure 5C). Also, temporal expression of AtMo25 paralogs during development was moderate to high, except for AtMo25-1, which showed low transcript levels (Fig. 5B). Altogether, our results showed that AtMo25 paralogs have diverse expression profiles and that AtMo25-1 has a lower expression level and a more restricted distribution, features common to retrogenes.
3. Discussion
In fungi and animals, Mo25/Hym1 regulates the activity of GCKs in pathways that include the RAM/MOR network and LKB1-Mo25-STRAD and is an important regulator of a kinase cascade that regulates several aspects of eukaryote development, including polarized cell morphogenesis and asymmetric cell division (Kullmann and Krahn, 2018; Weiss, 2012). Although components of the RAM/MOR network were already identified in plants, little is known about the function and evolutionary history of each of its components (Zermiani et al., 2015). In this work, we showed that Mo25 is present only in eukaryotes, including members of Trypanosoma, Alveolata, Fungi, Metazoa and Viridiplantae (green plants). All plants have at least one copy of Mo25, including red and green algae (only one copy was found in these organisms). Interestingly, Mo25 was duplicated in land plants (Embryophyta), forming two groups: Mo25A and Mo25B. In each group, sequences from mosses, liverworts, lycophytes (sister group of ferns) and higher vascular plants were found. The only exception is gymnosperms (naked seed plants), which seem to have lost the group B homolog. It is possible that, in this group, the Mo25A protein also performs the function of Mo25B and that the different pathways in which it is involved are regulated by the expression/presence of its interacting partners in a given developmental context, rather than by the presence of Mo25 per se. Nevertheless, how the absence of Mo25B affects cell morphogenesis and polarity in these plants is still unknown. On the other hand, the evolutionary history of Mo25 showed that several duplication events occurred in angiosperms. In the Brassicaceae family, polyploidization events (α- and β-polyploidizations) took place and could explain the duplication of Mo25A paralogs, especially the subgroups formed by the Brassicaceae sequences clustered with AtMo25-2 and AtMo25-4 (Barker et al., 2009). The subgroup of Mo25A that harbors the AtMo25-1 paralog showed a divergent evolution rate, which is common to distant duplicates including retrogenes (Cusack and Wolfe, 2007). All sequences from this subgroup lack introns, suggesting a retrotransposition origin for this paralog in the Brassicaceae family. Also, the expression profile of AtMo25-1 coincides with the behavior expected for a retrogene (i.e. low overall expression levels and narrow expression breadth, usually primarily in the male germ line, such as pollen grains) (Casola and Betrán, 2017).
It is intriguing that few duplications are observed in Mo25B. It is possible that this is the ancient copy and is thus maintaining the ‘original’ function of this gene. It has been proposed that AtMo25-3 (from the Mo25B group) belongs to a co-expression module composed of other plant RAM/MOR network components (i.e. AtSIK1, AtFRY and AtNDR2/3/4/5/6) and that this module controls the fine-tuning of stem cell maintenance, differentiation and organ polarity within the shoot apical meristem and inflorescence meristem (Zermiani et al., 2015). According to that study, AtMo25-1 and AtMo25-4 would also be part of this module, but they would participate in specific developmental contexts, such as male germline development. Although high-throughput expression data point to similarities in the expression profiles of AtMo25-1 and AtMo25-4, our RT-qPCR data suggest that AtMo25-4 transcript levels are more similar to those of AtMo25-3. Either way, both datasets support a co-expression module containing RAM/MOR network components and the AtMo25-1, AtMo25-3 and AtMo25-4 genes. As observed in this study and elsewhere, the AtMo25-2 transcript profile is the most divergent among all AtMo25 paralogs and would be part of a different co-expression module possibly representing the SIN/MEN pathway (Zermiani et al., 2015).
The Mo25 phylogeny in plants supports a duplication at the base of the Embryophyta group. This is intriguing since green algae somehow transitioned to land, conquering the terrestrial environment, and formed the group known as Embryophyta (Rensing, 2018). One of the evolutionary novelties in Embryophyta is the presence of the embryo, a structure with a 3D body plan characterized by polarity and symmetry breaking, which is probably necessary for the formation of complex tissues and organs (Rensing, 2016). Thus, novel regulatory network configurations that control cell polarity, morphogenesis and asymmetry might have arisen to yield such a complex structure as the embryo.
Interestingly, Mo25 was originally identified as a conserved protein expressed in the mouse early cleavage stage during embryogenesis (Karos and Fischer, 1999; Miyamoto et al., 1993; Nozaki et al., 1996). Later, it was shown that downregulation of one of the Mo25 homologs resulted in altered neuroblast asymmetric cell division in Caenorhabditis elegans (Chien et al., 2013). Asymmetric localization of cell fate determinants in neuroblasts was also altered in Drosophila mutants of the Mo25 and GCK family kinase Fray genes (Yamamoto et al., 2008). In mammals, Mo25 was required for STRAD-LKB1-induced brush border formation in intestinal epithelial cells, which are highly asymmetric (ten Klooster et al., 2009). Mutants of Mo25/Hym1 in fungi showed cell polarity and cell separation defects (Bidlingmaier et al., 2001; Dettmann et al., 2012; Kanai et al., 2005; Mendoza et al., 2005; Nelson et al., 2003). In plants, data on the function of Mo25 homologs are still missing. However, loss of the GCK kinase AtSIK1 in A. thaliana, a putative target for Mo25 regulation, resulted in plants with shorter root hairs, which are epidermal cells that experience extremely polarized cell growth, and reduced overall plant growth (Xiong et al., 2016). In all cases, Mo25 is directly involved to some degree in the establishment or maintenance of cell asymmetry, which is crucial especially at early stages of embryogenesis.
Previous studies have elucidated kinase activation mediated by Mo25 and showed its binding to several other GCK kinases, suggesting that Mo25 might be a master activator of GCKs (Filippi et al., 2011; Shi et al., 2013; Shi et al., 2013). Curiously, only one GCK kinase similar to the budding yeast Kic1 was found in plants (SIK1) (Zermiani et al., 2015). On the other hand, Mo25 was duplicated not only once in Embryophyta; later duplication events also occurred within other plant lineages. Why there are so many Mo25 homologs and apparently just one GCK in these organisms is still not known. Genes preserved after duplication (either by WGD or retroposition) often seem to contribute to evolutionary innovations (Casola and Betrán, 2017; Schranz et al., 2012). During land plant evolution, the sporophyte became perennial and gradually increased in size and complexity at molecular and cellular levels, leading to an increase in cell types and cell specialization. So, it is tempting to speculate that Mo25 paralogs might participate in more than one protein context. Although structural data show that the four AtMo25 homologs are similar, changes were observed in residues that, at least in humans, are on the interface between Mo25 and MST3 (Filippi et al., 2011; Mehellou et al., 2013). Thus, it is possible that differences in the residues on the AtMo25 surface could determine which protein each homolog interacts with. Because residues important for the activation of the kinase activity of the Mo25 binding partner, such as the tyrosine, arginine and methionine equivalent to Tyr223 and the pair Arg227+Met260 in human Mo25, are conserved in all four AtMo25 homologs, it is tempting to speculate that all A. thaliana homologs are potential kinase activators and might have a ‘preferential partner’ and thus participate in different signaling pathways and in different developmental contexts (Filippi et al., 2011; Mehellou et al., 2013). It is possible that one or more Mo25 paralogs participate in a more canonical RAM/MOR network, while other(s), e.g. AtMo25-2, might be regulating kinase activity of other signaling pathways (Zermiani et al., 2015). More studies on the function of plant Mo25 genes in plant development and cell signaling are necessary to understand the evolutionary significance of Mo25 duplication in land plants and the function of divergent Mo25 homologs in some plant lineages.
In short, we presented an evolutionary framework of Mo25 genes in plants and used A. thaliana paralogs as an example to understand some consequences of this gene duplication. Understanding the evolutionary history of a regulatory component of an ancient and important cell morphogenesis network such as RAM/MOR might help to unravel how this cell signaling evolved in plants.
4. Experimental Procedures
4.1 Amino acid sequences acquisition and phylogenetic analysis
All 1671 available protein sequences (UniProt Q9ZQ77 was used as bait) were collected from the UniProt databank with the BLAST tool (Basic Local Alignment Search Tool, available at http://www.uniprot.org/blast/uniprot) in July 2016. The search criteria were an e-value threshold (e-120), coverage of at least 80% of the bait and a minimum identity of 40%, using the BLOSUM45 matrix. To avoid redundancy due to highly similar sequences, and thus ensure the use of only a representative data set, sequences were clustered with the CD-HIT software at 99% identity (Li and Godzik, 2006), keeping sequences that covered at least 70% of the query. A total of 767 sequences were used for the phylogenetic analysis. UniProt codes of the sequences used to build the phylogenetic tree are available in Supplementary Information. Plant sequences from other organisms that were not available at UniProt were obtained elsewhere. Sequences from Arabidopsis halleri, Brassica oleracea, Brassica rapa, Marchantia polymorpha, Sphagnum fallax, Boechera stricta, Capsella grandifolia, Capsella rubella and Eutrema salsugineum were retrieved from the Phytozome portal (https://phytozome.jgi.doe.gov/pz/portal.html) and sequences from gymnosperms (Pseudotsuga menziesii, Picea glauca, Picea abies, Pinus pinaster, Pinus taeda, Pinus sylvestris, Taxus baccata, Ginkgo biloba, Cycas micholitzii and Gnetum montanum) were obtained from the PLAZA portal (https://bioinformatics.psb.ugent.be/plaza/versions/gymno-plaza/) (Goodstein et al., 2012; Van Bel et al., 2018). The Mo25/Hym1 homolog from B. oleracea (Bol014887) was misannotated and the sequence used in this study is available in Supplementary Information. The accession code Brara.A01822.1 from B. rapa was removed from our analyses due to low sequence quality.
For phylogenetic analyses, a maximum likelihood (ML) tree was calculated using the JC+G substitution model (Abascal et al., 2005) as suggested by ProtTest3 (Darriba et al., 2011), with SH-like aLRT statistical support (Anisimova and Gascuel, 2006), using PhyML (Gouy et al., 2010). The tree was visualized with FigTree v1.4 (Drummond and Rambaut, 2007). The species tree was built using PhyloT (http://phylot.biobyte.de/) based on the NCBI taxonomy (taxID, available at http://www.ncbi.nlm.nih.gov/taxonomy) (Letunic and Bork, 2007).
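The hit-selection criteria of section 4.1 (e-value threshold, at least 80% coverage of the bait, at least 40% identity) can be sketched in Python as a simple filter. This is an illustration under stated assumptions, not the procedure actually run on the UniProt server: the `Hit` record, the hit names, the bait length of 350 residues and the reading of "e-120" as 1e-120 are all hypothetical or assumed.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str       # hypothetical identifier, not a real UniProt code
    evalue: float
    identity_pct: float  # percent identity over the aligned region
    query_start: int     # 1-based, inclusive position on the bait
    query_end: int       # 1-based, inclusive position on the bait


def query_coverage(hit: Hit, query_length: int) -> float:
    """Percentage of the bait sequence covered by the alignment."""
    aligned = hit.query_end - hit.query_start + 1
    return 100.0 * aligned / query_length


def passes_filters(hit: Hit, query_length: int,
                   max_evalue: float = 1e-120,   # assumed meaning of "e-120"
                   min_coverage: float = 80.0,
                   min_identity: float = 40.0) -> bool:
    """Apply the three thresholds described in the text to one hit."""
    return (hit.evalue <= max_evalue
            and query_coverage(hit, query_length) >= min_coverage
            and hit.identity_pct >= min_identity)


QUERY_LENGTH = 350  # placeholder bait length, for illustration only

hits = [
    Hit("hit_pass", 1e-150, 62.0, 10, 340),
    Hit("hit_low_coverage", 1e-130, 70.0, 100, 300),
    Hit("hit_weak_evalue", 1e-50, 45.0, 1, 350),
]

selected = [h.accession for h in hits if passes_filters(h, QUERY_LENGTH)]
print(selected)
```

Redundancy reduction with CD-HIT would follow as a separate step on the selected sequences and is not reproduced here.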
4.2. Model of protein structure
The amino acid sequences were aligned with the T-Coffee 3D algorithm (http://tcoffee.crg.cat/) (Notredame et al., 2000) and structural information was added using the crystal structure of human Mo25 in complex with a C-terminal peptide of STRAD (PDB 1UPK). PDB code 1UPK was selected as template for the initial theoretical structural model of the AtMo25-1 protein. The sequence with UniProt code Q9ZQ77 was used to perform homology modelling. The initial theoretical structural models were generated with MODELLER 9 v.18 (Sali and Blundell, 1993) using the selected template, and 100 models were generated for each protein. The quality of the structures was evaluated according to the energy of each model. The 10 models with the best energy were analyzed based on geometric criteria with MolProbity (Chen et al., 2010). Protein structure models were generated in PyMOL Molecular Graphics System v1.8.6.0 (DeLano, 2002). The RMSD values were calculated with MUSTANG (Konagurthu et al., 2006) by comparison with the crystallographic structure of human MO25 (PDB: 1UPK).
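For readers unfamiliar with the RMSD measure reported for the models, the following Python sketch shows the quantity itself. It is not MUSTANG's algorithm: it assumes the two structures have already been superposed and that equivalent atoms (for example Cα atoms) are already paired, both of which a structural aligner handles internally. The coordinates are invented for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired coordinate lists.

    coords_a, coords_b: lists of (x, y, z) tuples in Å, same length and
    same order, assumed to be already superposed.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equally long")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Invented coordinates for three paired atoms (Å).
model = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.5, 0.0)]
template = [(0.1, 0.0, 0.0), (1.4, 0.2, 0.0), (3.0, 0.5, 0.3)]
print(round(rmsd(model, template), 3))
```

A low value, such as the 0.6 Å reported for AtMo25-1 against 1UPK, indicates that the paired atoms of model and template lie close to each other after superposition.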
4.2. Gene structure and genetic neighborhood analyses Sequence data for genetic neighborhood analysis in A. thaliana were obtained from Phytozome, species: Arabidopsis thaliana Col-0, TAIR10 (Goodstein et al., 2012; Lamesch et al., 2012). Sequences were retrieved using the Arabidopsis Genome Initiative (AGI) codes for each A. thaliana paralogs (At2g03410, At4g17270, At5g18940 and
PT
At5g47540 for AtMo25-1, AtMo25-2, AtMo25-3 and AtMo25-4 respectively) and the two up- and downstream-most ORFs of the AtMo25 paralog were identified. AtMo25
RI
gene structure were obtained from AGI sequence database (https://www.arabidopsis.org).
SC
To confirm the loss of introns in AtMo25-1, a PCR using a pair primers (AT2G03410_GW: GGGGACAAGTTTGTA CAAAAAAGCAGGCTTCATGAAAG
NU
GTCTCTTCAAGAACAA;AT2G03410_GW_R: GGGGACCACTTTGTACAAGAAA GCTGGGTTTTTATCGATCTGCGGTTTTGAT) that anneals in the start and end of the CDS (codons are underlined in the primer sequences) was performed using gDNA and
MA
cDNA from flower bud as templates. Amplicons were sequenced, aligned and compared to A. thaliana reference genome (TAIR10; www.arabidopsis.org). Gene structure of other
D
Brassicaceae AtMo25-1 homologs was built based on the genomic sequence retrieved from Phytozome, except for A. lyrata AtMo25-1 homolog (accession number
PT E
AL5G13160.t1). Phytozome prediction of this gene structure/CDS is misannotated and the genomic sequence used here is available under the gene symbol LOC9311272 at
CE
NCBI.
4.3. Plant material and gene expression analysis
AC
A. thaliana (Col-0) plants were grown on agar plates or in soil under long-day conditions (16 h of light, 8 h of darkness) at 22 °C in a growth chamber. Flower buds, open flowers and fertilized flowers/siliques were collected from plants grown in soil:vermiculite (3:1) at stages 3, 13 and 15, according to the developmental stages described elsewhere (Smyth et al., 1990). Rosette leaves and roots were collected from seedlings grown on ½ strength MS agar medium at 6 and 15 days after germination, respectively. Total RNA of wild-type plants was extracted as described elsewhere (Logemann et al., 1987). DNase-treated RNAs (2.5 µg RNA; Turbo DNA-free, Ambion) were used for first-strand cDNA synthesis with oligo(dT) from the SuperScript III First Strand Synthesis kit (ThermoScientific). Expression analyses by qRT-PCR were performed with SYBR Green PCR Master Mix (Applied Biosystems) in a 7300 Applied Biosystems PCR System thermocycler (Applied Biosystems). Primer sequences for qRT-PCR are available in Supplementary Information. Data were analyzed using the 2^(−ΔΔCT) method (Livak and Schmittgen, 2001). Ubiquitin 14 (AtUBI14) and actin 2 were used as reference genes. In silico gene expression analyses were carried out using Genevestigator (Hruz et al., 2008) with publicly available A. thaliana microarray data. Expression data were retrieved for AtMo25-1 (At2g03410), AtMo25-2 (At4g17270), AtMo25-3 (At5g18940) and AtMo25-4 (At5g47540).
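For readers unfamiliar with the 2^(−ΔΔCT) calculation (Livak and Schmittgen, 2001), the arithmetic can be sketched as follows. This is an illustrative example, not the authors' analysis script: the function name, the CT values and the choice of calibrator sample are hypothetical, and averaging the CT values of several reference genes is an assumption made here for simplicity (a single reference gene can be passed as a one-element list).

```python
from statistics import mean
from typing import Sequence


def relative_expression(
    ct_target_sample: float,
    ct_refs_sample: Sequence[float],
    ct_target_calibrator: float,
    ct_refs_calibrator: Sequence[float],
) -> float:
    """Relative expression by the 2^(-ddCT) method.

    dCT  = CT(target) - mean CT(reference genes), computed per sample
    ddCT = dCT(sample) - dCT(calibrator)
    """
    dct_sample = ct_target_sample - mean(ct_refs_sample)
    dct_calibrator = ct_target_calibrator - mean(ct_refs_calibrator)
    ddct = dct_sample - dct_calibrator
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Made-up CT values: the target amplifies two cycles earlier in the sample
    # than in the calibrator, while the reference genes are unchanged.
    fold = relative_expression(24.0, [18.0, 20.0], 26.0, [18.0, 20.0])
    print(fold)  # 4.0
```

The method assumes that amplification efficiency is close to 100% for the target and the reference genes, so that each cycle of difference corresponds to a two-fold difference in template.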
Acknowledgements
We are grateful to Pedro Túlio Resende Lara and Pedro Henrique Camargo Penna for valuable discussions and comments on the bioinformatics data, and to Tatiana Correa for technical assistance.
Funding
The research was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq). FMB received CAPES and UFABC fellowships and HPM received a fellowship from CNPq.
References
Abascal, F., Zardoya, R. and Posada, D. (2005). ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21, 2104–2105.
Anisimova, M. and Gascuel, O. (2006). Approximate likelihood-ratio test for branches: A fast, accurate, and powerful alternative. Syst. Biol. 55, 539–552.
Barker, M. S., Vogel, H. and Schranz, M. E. (2009). Paleopolyploidy in the Brassicales: analyses of the Cleome transcriptome elucidate the history of genome duplications in Arabidopsis and other Brassicales. Genome Biol Evol 1, 391–399.
Bidlingmaier, S., Weiss, E. L., Seidel, C., Drubin, D. G. and Snyder, M. (2001). The Cbk1p pathway is important for polarized cell growth and cell separation in Saccharomyces cerevisiae. Mol Cell Biol 21, 2449–2462.
Casola, C. and Betrán, E. (2017). The Genomic Impact of Gene Retrocopies: What Have We Learned from Comparative Genomics, Population Genomics, and Transcriptomic Analyses? Genome Biology and Evolution 9, 1351–1373.
Chen, V. B., Arendall, W. B., Headd, J. J., Keedy, D. A., Immormino, R. M., Kapral, G. J., Murray, L. W., Richardson, J. S. and Richardson, D. C. (2010). MolProbity: all-atom structure validation for macromolecular crystallography. Acta Crystallogr. D Biol. Crystallogr. 66, 12–21.
Chien, S. C., Brinkmann, E. M., Teuliere, J. and Garriga, G. (2013). Caenorhabditis elegans PIG-1/MELK acts in a conserved PAR-4/LKB1 polarity pathway to promote asymmetric neuroblast divisions. Genetics 193, 897–909.
Colman-Lerner, A., Chin, T. E. and Brent, R. (2001). Yeast Cbk1 and Mob2 Activate Daughter-Specific Genetic Programs to Induce Asymmetric Cell Fates. Cell 107, 739–750.
Cusack, B. P. and Wolfe, K. H. (2007). Not Born Equal: Increased Rate Asymmetry in Relocated and Retrotransposed Rodent Gene Duplicates. Mol Biol Evol 24, 679–686.
Darriba, D., Taboada, G. L., Doallo, R. and Posada, D. (2011). ProtTest 3: fast selection of best-fit models of protein evolution. Bioinformatics 27, 1164–1165.
Dettmann, A., Illgen, J., Marz, S., Schurg, T., Fleissner, A. and Seiler, S. (2012). The NDR kinase scaffold HYM1/MO25 is essential for MAK2 map kinase signaling in Neurospora crassa. PLoS Genet 8, e1002950–e1002950.
Drummond, A. J. and Rambaut, A. (2007). BEAST: Bayesian evolutionary analysis by sampling trees. BMC Evol. Biol. 7, 214.
Filippi, B. M., de los Heros, P., Mehellou, Y., Navratilova, I., Gourlay, R., Deak, M., Plater, L., Toth, R., Zeqiraj, E. and Alessi, D. R. (2011). MO25 is a master regulator of SPAK/OSR1 and MST3/MST4/YSK1 protein kinases. The EMBO journal 30, 1730–1741.
Goodstein, D. M., Shu, S., Howson, R., Neupane, R., Hayes, R. D., Fazo, J., Mitros, T., Dirks, W., Hellsten, U., Putnam, N., et al. (2012). Phytozome: a comparative platform for green plant genomics. Nucleic Acids Res 40, D1178–D1186.
Hao, Q., Feng, M., Shi, Z., Li, C., Chen, M., Wang, W., Zhang, M., Jiao, S. and Zhou, Z. (2014). Structural insights into regulatory mechanisms of MO25-mediated kinase activation. Journal of Structural Biology 186, 224–233.
Hruz, T., Laule, O., Szabo, G., Wessendorp, F., Bleuler, S., Oertle, L., Widmayer, P., Gruissem, W. and Zimmermann, P. (2008). Genevestigator v3: a reference expression database for the meta-analysis of transcriptomes. Adv Bioinformatics 2008, 420747.
Hsu, J. and Weiss, E. L. (2013). Cell Cycle Regulated Interaction of a Yeast Hippo Kinase and Its Activator MO25/Hym1. PLOS ONE 8, e78334.
Jansen, J. M., Barry, M. F., Yoo, C. K. and Weiss, E. L. (2006). Phosphoregulation of Cbk1 is critical for RAM network control of transcription and morphogenesis. J. Cell Biol. 175, 755–766.
Kanai, M., Kume, K., Miyahara, K., Sakai, K., Nakamura, K., Leonhard, K., Wiley, D. J., Verde, F., Toda, T. and Hirata, D. (2005). Fission yeast MO25 protein is localized at SPB and septum and is essential for cell morphogenesis. EMBO J 24, 3012–3025.
Karos, M. and Fischer, R. (1999). Molecular characterization of HymA, an evolutionarily highly conserved and highly expressed protein of Aspergillus nidulans. Molecular and General Genetics 260, 510–521.
Konagurthu, A. S., Whisstock, J. C., Stuckey, P. J. and Lesk, A. M. (2006). MUSTANG: a multiple structural alignment algorithm. Proteins 64, 559–574.
Kullmann, L. and Krahn, M. P. (2018). Controlling the master—upstream regulation of the tumor suppressor LKB1. Oncogene 1.
Lamesch, P., Berardini, T. Z., Li, D., Swarbreck, D., Wilks, C., Sasidharan, R., Muller, R., Dreher, K., Alexander, D. L., Garcia-Hernandez, M., et al. (2012). The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res 40, D1202–D1210.
Letunic, I. and Bork, P. (2007). Interactive Tree Of Life (iTOL): an online tool for phylogenetic tree display and annotation. Bioinformatics 23, 127–128.
Li, W. and Godzik, A. (2006). Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics 22, 1658–1659.
Livak, K. J. and Schmittgen, T. D. (2001). Analysis of Relative Gene Expression Data Using Real-Time Quantitative PCR and the 2^(−ΔΔCT) Method. Methods 25, 402–408.
Logemann, J., Schell, J. and Willmitzer, L. (1987). Improved method for the isolation of RNA from plant tissues. Anal Biochem 163, 16–20.
Maerz, S. and Seiler, S. (2010). Tales of RAM and MOR: NDR kinase signaling in fungal morphogenesis. Current Opinion in Microbiology 13, 663–671.
Marques, A. C., Dupanloup, I., Vinckenbosch, N., Reymond, A. and Kaessmann, H. (2005). Emergence of Young Human Genes after a Burst of Retroposition in Primates. PLOS Biology 3, e357.
Mehellou, Y., Alessi, D. R., Macartney, T. J., Szklarz, M., Knapp, S. and Elkins, J. M. (2013). Structural insights into the activation of MST3 by MO25. Biochem Biophys Res Commun 431, 604–609.
Mendoza, M., Redemann, S. and Brunner, D. (2005). The fission yeast MO25 protein functions in polar growth and cell separation. Eur J Cell Biol 84, 915–926.
Miyamoto, H., Matsushiro, A. and Nozaki, M. (1993). Molecular cloning of a novel mRNA sequence expressed in cleavage stage mouse embryos. Molecular Reproduction and Development 34, 1–7.
Nelson, B., Kurischko, C., Horecka, J., Mody, M., Nair, P., Pratt, L., Zougman, A., McBroom, L. D., Hughes, T. R., Boone, C., et al. (2003). RAM: a conserved signaling network that regulates Ace2p transcriptional activity and polarized morphogenesis. Mol Biol Cell 14, 3782–3803.
Notredame, C., Higgins, D. G. and Heringa, J. (2000). T-Coffee: A novel method for fast and accurate multiple sequence alignment. J. Mol. Biol. 302, 205–217.
Nozaki, M., Onishi, Y., Togashi, S. and Miyamoto, H. (1996). Molecular Characterization of the Drosophila Mo25 Gene, Which Is Conserved among Drosophila, Mouse, and Yeast. DNA and Cell Biology 15, 505–509.
Racki, W. J., Bécam, A.-M., Nasr, F. and Herbert, C. J. (2000). Cbk1p, a protein similar to the human myotonic dystrophy kinase, is essential for normal morphogenesis in Saccharomyces cerevisiae. The EMBO Journal 19, 4524–4532.
Rensing, S. A. (2016). (Why) Does Evolution Favour Embryogenesis? Trends in Plant Science 21, 562–573.
Rensing, S. A. (2018). Great moments in evolution: the conquest of land by plants. Current Opinion in Plant Biology 42, 49–54.
Sali, A. and Blundell, T. L. (1993). Comparative protein modelling by satisfaction of spatial restraints. J. Mol. Biol. 234, 779–815.
Saputo, S., Chabrier-Rosello, Y., Luca, F. C., Kumar, A. and Krysan, D. J. (2012). The RAM Network in Pathogenic Fungi. Eukaryot Cell 11, 708–717.
Schranz, M. E., Mohammadin, S. and Edger, P. P. (2012). Ancient whole genome duplications, novelty and diversification: the WGD Radiation Lag-Time Model. Current Opinion in Plant Biology 15, 147–153.
Shi, Z., Jiao, S., Zhang, Z., Ma, M., Zhang, Z., Chen, C., Wang, K., Wang, H., Wang, W., Zhang, L., et al. (2013). Structure of the MST4 in complex with MO25 provides insights into its activation mechanism. Structure 21, 449–461.
Smyth, D. R., Bowman, J. L. and Meyerowitz, E. M. (1990). Early flower development in Arabidopsis. Plant Cell 2, 755–767.
ten Klooster, J. P., Jansen, M., Yuan, J., Oorschot, V., Begthel, H., Di Giacomo, V., Colland, F., de Koning, J., Maurice, M. M., Hornbeck, P., et al. (2009). Mst4 and Ezrin Induce Brush Borders Downstream of the Lkb1/Strad/Mo25 Polarization Complex. Developmental Cell 16, 551–562.
Van Bel, M., Diels, T., Vancaester, E., Kreft, L., Botzki, A., Van de Peer, Y., Coppens, F. and Vandepoele, K. (2018). PLAZA 4.0: an integrative resource for functional, evolutionary and comparative plant genomics. Nucleic Acids Res 46, D1190–D1196.
Weiss, E. L. (2012). Mitotic Exit and Separation of Mother and Daughter Cells. Genetics 192, 1165–1202.
Weiss, E. L., Kurischko, C., Zhang, C., Shokat, K., Drubin, D. G. and Luca, F. C. (2002). The Saccharomyces cerevisiae Mob2p–Cbk1p kinase complex promotes polarized growth and acts with the mitotic exit network to facilitate daughter cell–specific localization of Ace2p transcription factor. The Journal of Cell Biology 158, 885–900.
Xiong, J., Cui, X., Yuan, X., Yu, X., Sun, J. and Gong, Q. (2016). The Hippo/STE20 homolog SIK1 interacts with MOB1 to regulate cell proliferation and cell expansion in Arabidopsis. Journal of Experimental Botany 67, 1461–1475.
Yamamoto, Y., Izumi, Y. and Matsuzaki, F. (2008). The GC kinase Fray and Mo25 regulate Drosophila asymmetric divisions. 366, 212–218.
Zermiani, M., Begheldo, M., Nonis, A., Palme, K., Mizzi, L., Morandini, P., Nonis, A. and Ruperti, B. (2015). Identification of the arabidopsis RAM/MOR signalling network: Adding new regulatory players in plant stem cell maintenance and cell polarization. Annals of Botany 116, 69–89.
Legends to figures:
Fig. 1. Mo25 phylogeny in Embryophyta (land plants). (A) Plant lineages/species tree based on NCBI taxonomy. Branches are colored according to their phylogenetic relations. Basal Embryophyta (mosses and liverworts) are represented in green; Lycophytes (sister group of ferns, represented by Selaginella) in orange; gymnosperms in purple and angiosperms (comprising basal angiosperms, monocots and eudicots) in pink. (B) Mo25 phylogeny in land plants. Sequences from land plants form two groups, Mo25A and Mo25B, which are present only in Embryophyta. For better visualization, branches representing monocot and eudicot sequences were collapsed. Branches are colored with the same color code as in Figure 1A. Branches with support values above 75 are shown.
Fig. 2. Mo25A phylogeny in eudicot species. The three A. thaliana paralogs (AtMo25) that belong to the Mo25A group are indicated by red asterisks. In the Brassicaceae family, a group of sequences showed an accelerated evolutionary rate (indicated as a purple branch).
Fig. 3. Genetic neighborhood and gene structure of AtMo25 paralogs. (A) Genetic neighborhood of the four AtMo25 paralogs. AtMo25-2 and AtMo25-4 (Mo25A group) were duplicated as a cassette containing an auxin responsive gene. Chromosome number and genomic position coordinates of the right- and left-most genes depicted in the diagram are indicated for each AtMo25 paralog flanking region. (B) Gene structure of AtMo25 paralogs. AtMo25-1 is the only intronless paralog. Genomic coordinates of the genes are indicated in the figure.
Fig. 4. Structural model and sequence alignment of the four AtMo25 homologs. (A) Structural models of AtMo25-1, AtMo25-2, AtMo25-3 and AtMo25-4 obtained by homology modeling in MODELLER using PDB: 1UPK as template. (B) Sequence alignment of human Mo25 and the four AtMo25 homologs. Residues of human Mo25 that are at the interface with the MST3 kinase are marked with black asterisks above the alignment. Red asterisks indicate residues important for MST3 activation in human Mo25 (Filippi et al., 2011; Mehellou et al., 2013).
Fig. 5. Expression profile of AtMo25 paralogs. (A) Expression profile of AtMo25 paralogs in different plant tissues and organs, shown as a heat map of the average expression values across the microarray experiments available in Genevestigator (https://genevestigator.com) (Hruz et al., 2008). (B) Temporal expression pattern of each AtMo25 paralog. Graphic representation of the average expression values across the microarray experiments publicly available in Genevestigator. Each gene is represented in a different color, as specified in the legend. (C) AtMo25 paralog gene expression in different A. thaliana tissues and organs measured by RT-qPCR. Values were normalized with AtUBI14 as the reference gene. Data shown represent mean values obtained from independent amplification reactions (n = 3) and biological replicates (n = 2). Each biological replicate was performed with material collected from a pool of at least four plants.