The evolutionary analysis reveals domain fusion of proteins with Frizzled-like CRD domain

Gene 533 (2014) 229–239

Contents lists available at ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Jun Yan a,b,c, Haibo Jia a, Zhaowu Ma b, Huashan Ye b, Mi Zhou b, Li Su a, Jianfeng Liu a, An-Yuan Guo a,b,⁎

a Key Laboratory of Molecular Biophysics of the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
b Hubei Bioinformatics & Molecular Imaging Key Laboratory, Department of Biomedical Engineering, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China
c Department of Applied Physics, College of Information Science and Engineering, Shandong Agricultural University, Taian, Shandong, 271018, PR China

Article info

Article history: Accepted 23 September 2013; Available online 14 October 2013

Keywords: Frizzled-like CRD domain; Evolution; Wnt; Domain fusion

Abstract

Frizzleds (FZDs) are transmembrane receptors in the Wnt signaling pathway and they play pivotal roles in development. The Frizzled-like extracellular Cysteine-rich domain (Fz-CRD) has been identified in FZDs and other proteins. The origin and evolution of these proteins with Fz-CRD are the main interest of this study. We found that the Fz-CRD exists in FZD, SFRP, RTK, MFRP, CPZ, CORIN, COL18A1 and other proteins. Our systematic analysis revealed that the Fz-CRD domain might have originated in protists and then fused with the Frizzled-like seven-transmembrane domain (7TM) to form the FZD receptors, which duplicated and diversified into about 11 members in Vertebrates. The SFRPs and RTKs with the Fz-CRD were found in sponge and expanded in Vertebrates. Other proteins with Fz-CRD may have emerged during Vertebrate evolution through domain fusion. Moreover, we found a glycosylation site and several conserved motifs in FZDs, which may be related to Wnt interaction. Based on these results, we proposed a model showing that the domain fusion and expansion of Fz-CRD genes occurred in Metazoa and Vertebrates. Our study may help to pave the way for further research on the conservation and diversification of Wnt signaling functions during evolution. Crown Copyright © 2013 Published by Elsevier B.V. All rights reserved.

Abbreviations: FZD, Frizzled; CRD, Cysteine-rich domain; Fz-CRD, Frizzled-like Cysteine-rich domain; TM, transmembrane domain; 7TM, seven-transmembrane domain; GPCR, G-protein-coupled receptor; SFRP, secreted Frizzled-related protein; Smo, Smoothened; MuSK, muscle skeletal receptor tyrosine kinase; ROR, receptor tyrosine kinase-like orphan receptor; RTK, receptor tyrosine kinase; CPZ, carboxypeptidase Z; MFRP, membrane Frizzled-related protein; WGD, whole genome duplication.

⁎ Corresponding author at: Key Laboratory of Molecular Biophysics of the Ministry of Education, College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, Hubei 430074, PR China. Tel./fax: +86 27 87793177. E-mail address: [email protected] (A.-Y. Guo).

0378-1119/$ – see front matter. Crown Copyright © 2013 Published by Elsevier B.V. All rights reserved. http://dx.doi.org/10.1016/j.gene.2013.09.083

1. Introduction

The Wnt signaling pathways play essential roles in controlling cell proliferation, cell fate determination, and nervous system development (Freese et al., 2010; Ling et al., 2009). Members of the Frizzled (FZD) family on the plasma membrane are receptors of the Wnt signaling proteins. Wnts bind to the FZDs and initiate canonical or noncanonical signaling pathways involved in biological processes (Freese et al., 2010; Ling et al., 2009). A review of the evolution of the Wnt signaling pathway in Cnidarians revealed that components of all Wnt signaling pathways are present in Cnidarians, indicating that these developmentally important proteins were already present in the Cnidarian–Bilaterian ancestor (Lee et al., 2006).

Frizzled proteins are considered to be a subfamily of G-protein-coupled receptors (GPCRs) since they are 7-transmembrane domain (7TM) receptors (Koval et al., 2011). Experimental and bioinformatics analyses suggest that Frizzled proteins are coupled to some members of the Gαi/o, Gαq and Gαs families of G proteins (Liu et al., 2001; Wang et al., 2006). GPCRs were classified into six major families: Rhodopsin-like receptors (Class A, R), Adhesion receptors and Secretin receptors (Class B, A and S), Metabotropic glutamate/pheromone receptors (Class C, G), Fungal mating pheromone receptors (Class D), Cyclic AMP receptors (Class E) and Frizzled/Smoothened GPCRs (Class F, F) (Bockaert and Pin, 1999). However, Classes D and E do not exist in the human genome, in which the GPCRs are classified into five clades (RSAGF) (Fredriksson et al., 2003). It has been reported that the five families (RSAGF) in humans arose before the split of the nematode from the Chordate lineage, and that several subgroups of the Rhodopsin family arose before the split of the lineage leading to Vertebrates (Fredriksson and Schioth, 2005). Most publications have studied the repertoire of GPCRs in Invertebrates and Vertebrates, including mosquito, sea squirt, fish, frog and human (Fredriksson et al., 2003; Hill et al., 2002; Ji et al., 2009; Kamesh et al., 2008; Metpally and Sowdhamini, 2005).

A typical Frizzled protein has three domains: (1) the N-terminal domain, which participates in ligand binding; (2) the seven-transmembrane domain, which spans the lipid bilayer; and (3) the C-terminal tail, which mediates downstream signaling in the cytoplasm (Wang et al., 2006). The N-terminal domain comprises an extracellular Cysteine-rich domain (CRD), which contains ten Cysteines that form disulfide bonds (Dann et al., 2001). The Frizzled-like CRD (Fz-CRD) is present not only in FZDs but also in many other proteins. Previous studies suggest that the Fz-CRD exists in various soluble and transmembrane proteins (Rehn et al., 1998; Saldanha et al., 1998; Xu and Nusse, 1998), including FrzB (SFRP, secreted Frizzled-related protein), Smo (Smoothened), CPZ (carboxypeptidase Z), COL18A1 (collagen, type XVIII, alpha 1), MuSK (muscle, skeletal, receptor tyrosine kinase), and ROR (receptor tyrosine kinase-like orphan receptor). The Fz-CRD in SFRP, CPZ and ROR can physically bind Wnt and modulate Wnt signaling (Bafico et al., 1999; Moeller et al., 2003; Oishi et al., 2003; Wang et al., 2009).

Until now, the origin and evolution of the proteins with Fz-CRD have rarely been studied. In this work, we systematically analyzed proteins with Fz-CRD across diverse genomes, including Dictyostelium discoideum and Metazoa. Phylogenetic analysis and a conserved motif search of Fz-CRD related proteins were performed to investigate their evolution. Moreover, we proposed a model for the origin, domain fusion and evolution of these protein families. We found that two rounds of expansion occurred in the proteins with Fz-CRD, during the emergence of the Metazoa and of the Vertebrates. Our study provides new insight into the origins of Frizzled receptors and other related proteins with the Fz-CRD domain, which may help to further understand the evolutionary conservation and diversification of Wnt signaling functions.

2. Materials and methods

2.1. Data collection

Complete proteomes of D. discoideum and Dictyostelium purpureum were obtained from dictyBase (http://www.dictybase.org). Monosiga brevicollis, Amphimedon queenslandica and Nematostella vectensis proteomes were downloaded from JGI (http://genome.jgi-psf.org/). Strongylocentrotus purpuratus proteomes were downloaded from NCBI (http://www.ncbi.nlm.nih.gov). The proteomes of other species were downloaded from Ensembl (http://www.ensembl.org).

To obtain the complete set of Frizzled GPCRs and other proteins with Fz-CRDs, we screened sequences according to the strategy in Fig. 1A. First, we ran a HMMER (http://hmmer.janelia.org/) search against all the proteomes on our local server using the Pfam profile PF01392 (the Fz-CRD domain in Pfam) with an E-value cutoff of 0.01, and then uploaded the resulting sequences for a further batch search on the Pfam website (http://pfam.sanger.ac.uk/) with the same cutoff. Second, CD-HIT (Huang et al., 2010) (90% identity cutoff, for JGI and NCBI data) and a local Perl script (for Ensembl data) were used to remove redundancy. These steps identified the non-redundant (NR) proteins with the Fz-CRD domain. Third, we classified these sequences into eight kinds by their other characteristic domains (such as the NTR domain in SFRP). For Frizzleds, which are 7TM GPCRs, we applied an additional filter to retain only the sequences with 6–8 transmembrane helices predicted by the HMMTOP (Tusnady and Simon, 2001), TMHMM (Krogh et al., 2001) and SOSUI (Hirokawa et al., 1998) programs with default settings. We also searched for proteins with Fz-CRD by BLAST at NCBI in lineages such as protists, plants, microbes and fungi (http://www.ncbi.nlm.nih.gov/sutils/genom_table.cgi).

Fig. 1. Analysis strategy and domain schema of Fz-CRD proteins. (A) Strategy to identify Fz-CRD proteins. The proteomes of species were searched for the Fz-CRD domain and other domains. (B) The domain architectures of Fz-CRD proteins (using the human sequences as models). The domain descriptions are labeled on them.

2.2. Multiple alignment and phylogenetic analysis

For the phylogenetic analysis of FZDs, we used the truncated amino acid sequences of the TM domain of all Frizzleds. The phylogenetic trees of the other kinds of proteins with Fz-CRD were based on the terminally truncated Fz-CRD. We used the ClustalW program (Larkin et al., 2007) with default parameters to construct multiple alignments, followed by manual editing using BioEdit (Hall, 1999). The phylogenetic trees were produced by three different approaches: Bayesian analysis (MrBayes), the neighbor-joining (NJ) method with the p-distance model, and Maximum Likelihood (ML). Bayesian analysis was performed using MrBayes 3.1.2 (Ronquist and Huelsenbeck, 2003) with the mixed amino acid substitution model and an MCMC chain of 10,000,000 generations. Markov chains were sampled every 100 generations, and the first 25% of the trees were discarded as burn-in. Convergence was assessed by checking that the average standard deviation of split frequencies was below 0.01. The NJ trees were constructed using the MEGA5 (Tamura et al., 2007) software. Bootstrapping with 1000 repetitions was performed to assess the confidence of nodes in the phylogenetic trees. The ML analyses were performed using PhyML3.0 (Guindon and Gascuel, 2003) with 100 bootstrap replicates. The appropriate model for the ML analyses, including model parameters, was selected using the Akaike Information Criterion (AIC) with ProtTest2.4 (Abascal et al., 2005), which gave JTT + G + F as the best model for the FZD-TM data set, LG + I + G for the SFRP-CRD, JTT + I + G for the RTK-CRD and JTT + I + G for the other five classes (CRD).

2.3. Exon–intron structure and motif finding

Exon–intron structures were based on information extracted from NCBI (S. purpuratus), JGI (N. vectensis), dictyBase (D. discoideum), UCSC (S. purpuratus, http://genome.ucsc.edu/) and Ensembl (others). The exon–intron diagrams were drawn using Perl and R scripts based on the extracted information, followed by manual editing. The domain structure diagrams were generated using Perl and R scripts based on the results of the Pfam search. The sequence WebLogos were produced by the WebLogo web server (http://weblogo.berkeley.edu/logo.cgi) based on the alignments produced by ClustalW. The conserved motifs were searched on the MEME website (http://meme.ebi.edu.au/meme/intro.html). N-glycosylation sites were predicted by the NetNGlyc 1.0 Server (http://www.cbs.dtu.dk/services/NetNGlyc/).

3. Results

3.1. Fz-CRD domain exists in several protein families from Protozoa to Vertebrates

In order to study the origin and evolution of the Fz-CRD domain, we performed a comprehensive search in various genomes (including D. discoideum and Metazoa) using the strategy in Fig. 1A (see “Data collection” in the Materials and methods section). As a result, the Fz-CRD domain was found in about eight kinds of proteins, which are FZD (Frizzled), SFRP (secreted Frizzled-related protein), RTK (receptor tyrosine kinase, including MuSK (muscle skeletal receptor tyrosine kinase), ROR1 (receptor tyrosine kinase-like orphan receptor 1) and ROR2), CORIN (corin, serine peptidase), CPZ (carboxypeptidase Z), COL18A1 (collagen type XVIII alpha 1), MFRP (membrane Frizzled-related protein), and a few specific proteins of unknown function in lower species (Figs. 1B and 2). The sequence IDs are summarized in Table S1. To study the evolution, we summarized the detailed gene information of human proteins with Fz-CRD in Table S2. Based on these results, we describe the origin and evolution of proteins with Fz-CRD below.

We first focused on FZD, which emerged first among the eight kinds. The earliest protein containing both the Frizzled-like 7TM and the Fz-CRD was found in the slime mold D. discoideum. We also searched another Dictyostelium species, D. purpureum, and found many sequences with only the Frizzled-like 7TM or only the Fz-CRD, but not both (data not shown). Local HMMER searches found no homologous sequence with Fz-CRD and Frizzled-like 7TM in M. brevicollis, which is consistent with the report that no receptors or ligands of the NHR (nuclear hormone receptor), Wnt and TGF-β signaling pathways were identified in this species (King et al., 2008). One sequence with only the Fz-CRD domain was also found in the Protozoa (gi|237845621| in Toxoplasma gondii ME49). No sequence similar to Fz-CRD was found in plants, fungi and microbes. Taken together, the Fz-CRD occurred in Protozoa and expanded into many kinds of proteins.

We then noticed that Fz-CRD expanded during the emergence of Metazoa. We describe this in detail using three representative species (D. discoideum, sponges and sea anemone), which are much older than Bilateria. In the D. discoideum proteome, two Frizzled receptors (DDB0230120 and DDB0231833), both with complete Frizzled-like 7TM and Fz-CRD domains, were found.
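The screening and classification criteria described in the Data collection section can be sketched in Python as follows. This is only a minimal illustration, not the authors' pipeline (which relied on HMMER, CD-HIT, Perl scripts and three TM predictors). The `Protein` record, the `classify` function, the `Frizzled_7TM` label and the small domain-to-family table are hypothetical names introduced here, and the sketch assumes that the domain hits and predicted TM-helix counts have already been parsed from the tools' outputs. Treating the E-value cutoff as inclusive is also an assumption.

```python
from dataclasses import dataclass, field

E_VALUE_CUTOFF = 0.01    # Fz-CRD (PF01392) cutoff stated in the text
TM_RANGE = range(6, 9)   # 6-8 predicted TM helices required for a Frizzled

# Hypothetical lookup: characteristic partner domain -> family label.
# Only three of the families named in the text are listed, for brevity.
FAMILY_BY_DOMAIN = {
    "NTR": "SFRP",
    "Pkinase_Tyr": "RTK",
    "Peptidase_M14": "CPZ",
}


@dataclass
class Protein:
    name: str
    fz_crd_evalue: float                 # best Fz-CRD hit E-value
    other_domains: set = field(default_factory=set)
    tm_helices: int = 0                  # predicted TM-helix count


def classify(p):
    """Return a family label, or None if the protein is screened out."""
    if p.fz_crd_evalue > E_VALUE_CUTOFF:
        return None                      # no acceptable Fz-CRD hit
    if "Frizzled_7TM" in p.other_domains:
        # A complete FZD needs both domains and a plausible helix count.
        return "FZD" if p.tm_helices in TM_RANGE else None
    for domain, family in FAMILY_BY_DOMAIN.items():
        if domain in p.other_domains:
            return family
    return "Fz-CRD only"
```

Under these assumptions, a sequence with a Fz-CRD hit but no recognizable partner domain falls into the "Fz-CRD only" bin, which mirrors how incomplete sequences are kept apart from complete FZDs in the text.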
Although some sequences with a single Fz-CRD or Frizzled-like 7TM domain (incomplete FZD) were found (data not shown), we did not consider them as FZD and excluded them from the following analysis. In contrast, Prabhu and Eichinger identified 25 Frizzled-like genes in the D. discoideum genome (Prabhu and Eichinger, 2006), including some incomplete FZD sequences. Sixteen Fsl (Frizzled- and Smoothened-like) receptors harbor an N-terminal CRD, while the nine Fsc (Frizzled/Smoothened-like Sans CRD) receptors have no CRD domain. We re-examined these 25 sequences and found that only two (DDB0231833 and DDB0230120) contained both the Frizzled-like 7TM and Fz-CRD domains, the same two that we identified. The remaining 14 of the 16 Fsl proteins were not predicted to contain the Frizzled-like 7TM by the Pfam website, so we excluded them from further analysis.

Fig. 2. The distribution of proteins with the Fz-CRD domain in representative species. The phylogenetic relationship of all the species investigated is on the left. The number of Fz-CRD proteins is on the right. Two rounds of Fz-CRD domain fusion are represented by arrows and bars. Abbreviations: (p): (partial); FZD: Frizzled homolog (Drosophila); SFRP: secreted Frizzled-related protein; RTK: receptor tyrosine kinase, including MuSK (muscle skeletal receptor tyrosine kinase), ROR1 (receptor tyrosine kinase-like orphan receptor 1) and ROR2; CORIN: corin, serine peptidase; CPZ: carboxypeptidase Z; COL18A1: collagen type XVIII alpha 1; MFRP: membrane Frizzled-related protein. The 13 with an asterisk (13*) in sea urchin means 3 CORINs and 10 COMF, which are described in the text.

In Porifera (sponge), there are three kinds of Fz-CRD containing proteins: FZD, SFRP and RTK. We used the A. queenslandica data from JGI and found 2 FZDs, 9 RTKs and 11 sequences with only Fz-CRD. We also searched other sponges at NCBI and found 1 FZD, 1 SFRP and 4 RTKs (data not shown), but did not use them to construct the phylogenetic trees. In sea anemone, 5 FZDs (4 FZDs + 1 Smo), 2 SFRPs and 2 RTKs (ROR) were found. Some special proteins with only the Fz-CRD domain were found in D. discoideum, sponge, sea anemone, sea urchin, Ciona, and even in some Vertebrates such as zebrafish, frog, lizard and chicken. Taken together, there are three kinds of proteins containing Fz-CRD (FZD, SFRP and RTK) in early Metazoa.

Finally, we describe them in Protostomes and Deuterostomes. For Protostomes, two representative species were chosen: Caenorhabditis elegans and Drosophila melanogaster, which belong to Nematoda and Arthropoda respectively. There are 4 FZDs, 1 SFRP and 1 RTK in C. elegans, while 5 FZDs, 1 RTK, 3 CORINs and no SFRP were found in D. melanogaster. The numbers and features of each type in C. elegans differed from those in sea anemone (or sea urchin). As for Deuterostomes, Echinodermata and Chordata are the main lineages. Five FZDs, 2 SFRPs, 2 RTKs and 3 CORINs were found in sea urchin, a representative species of Echinodermata. We noticed 10 sea urchin specific sequences (which we named COMF) that contained both the trypsin domain (the CORIN characteristic domain) and the CUB domain (the MFRP characteristic domain). These specific proteins with Fz-CRD in sea urchin may have emerged through Fz-CRD specific domain fusion after the split between Echinodermata and Chordata. In Ciona, 5 FZDs, 2 SFRPs, 1 RTK (ROR), 1 CORIN and 4 others (only Fz-CRD) were found. Ciona encodes no COL18A1, CPZ or MFRP, while these proteins are present in Vertebrates from zebrafish to human.
There are 4 FZDs, 1 SFRP, 1 CPZ, 1 CORIN and 3 others (only Fz-CRD) in lamprey, which is a Cyclostome Vertebrate and emerged before the two rounds of whole genome duplication (2R WGD) occurred. Generally, 11 FZDs, 5 SFRPs, 3 RTKs (2 RORs + 1 MuSK), 1 CORIN, 1 COL18A1, 1 CPZ, and 1 MFRP can be found in Vertebrate genomes. We describe their evolution as follows.

3.2. Phylogenetic analysis of the Frizzled family

The numbers of FZDs in the investigated genomes were collected and summarized in Table S3. To further infer the evolution and classification of the FZD proteins, we constructed a Bayesian phylogenetic tree based on their TM domains (Fig. 3). Most Vertebrates have 11 FZDs, which were classified into 5 groups in the phylogenetic tree (FZD1/2/7, FZD5/8, FZD4/9/10, FZD3/6 and Smo), while most of the Invertebrates from Cnidaria to Urochordata have 5 members (4 FZDs and Smo). Two FZDs from D. discoideum (DDB0231833 and DDB0230120) were used as the outgroup to root the tree. The expanded phylogenetic subtrees of Vertebrate FZD and Smo are displayed in Figs. S1A–E. To validate the tree, an NJ tree with the p-distance model and an ML tree with the JTT + G + F model were also constructed based on the TM sequences (Figs. S2A–B). These trees had a similar phylogenetic topology. Based on the topologies of these trees, the FZDs were classified into five clades: Smo, FZD1/2/7, FZD5/8, FZD4/9/10 and FZD3/6.

The first group is Smo, which appeared as early as the emergence of N. vectensis. Except for C. elegans, each species encodes one copy of the Smoothened gene. The sea urchin sequence (gi|115941795|) annotated as “similar to Smoothened, partial” has only one Fz-CRD and no TM domain, so we did not use it in the phylogenetic analysis. Smo is present in lamprey (ENSPMAP00000003500). The fruit fly Smo (FBpp0077788) was located in the Smo branch. However, no Smo homolog was found in C. elegans.
The second group is FZD1/2/7, which is supported by a Bayesian posterior probability of 1 and exists only in Vertebrates. This group was divided into three clades (FZD1, FZD2, FZD7) with high Bayesian posterior probabilities (1), indicating that they may have diverged in Vertebrates. Two lamprey FZD1/2/7 sequences were in this clade, and the sequence ENSPMAP00000010964 was at the base of the Vertebrate FZD1/2/7 clade, supporting the view that FZD1/2/7 diverged after 2R WGD.

The situation is similar in the third group, FZD3/6, which has two copies in Vertebrates and one in Ciona, suggesting that it may have originated from the common ancestor of Chordates. This is supported by the report that Fz3/6, which has no ortholog identified yet in the Cnidarian lineage, might have arisen during the emergence of the Chordate line (Croce et al., 2006). Zamanian et al. searched the GPCR repertoire of the human parasite Schistosoma mansoni and the model organism Schmidtea mediterranea, both of which belong to Platyhelminthes, and found a single flatworm receptor (FSMP118970 and FSMD000018) that was grouped in the FZD3/6 cluster, sharing ~38% identity with human FZD6 (Zamanian et al., 2011). We checked these sequences and found that they did not cluster with the Chordate FZD3/6 clade in the NJ tree with p-distance (data not shown), similar to the C. elegans and D. melanogaster FZDs. FZD3/6 may therefore have arisen earlier than the emergence of Chordates, perhaps during the emergence of the flatworm ancestor, and may have diverged considerably after flatworms split from the other lineages.

The fourth group is FZD5/8, which exists from N. vectensis to human. FZD5/8 was divided into two members in Vertebrates. The fly sequence (FBpp0288861) was grouped with this cluster. Only one sequence (ENSPMAP00000010963) in this clade was found in lamprey. In the NJ tree with the p-distance model, the bootstrap values of V-FZD5 and V-FZD8 were both lower than 50, while the bootstrap value of their common branch was relatively high (72) (Fig. S2A).

The fifth group is FZD4/9/10, which is the most complex of the five clades.
The clades of Vertebrate FZD9 and FZD10 grouped together with a high Bayesian posterior probability (1), while the FZD4 clade clustered with a sequence of S. purpuratus (gi|115969057|). The common Bayesian posterior probability of FZD4/9/10 is high (0.98). This clade included the Invertebrate FZD4/9/10 sequences of A. queenslandica, N. vectensis, D. melanogaster, S. purpuratus and Ciona intestinalis (Fig. 3). We noticed that two copies of FZD4/9/10 were present in N. vectensis and S. purpuratus. The sea urchin sequence (gi|115969057|) clustered with the Vertebrate FZD4 clade with a high Bayesian posterior probability (1), while the positions of two N. vectensis (jgi|Nemve1|139208| and jgi|Nemve1|168924|) and one S. purpuratus (gi|115973023|) sequences varied among different phylogenetic trees, suggesting that the common ancestor of FZD4/9/10 may have appeared during the emergence of sea anemone and then diverged during evolution.

3.3. The exon and intron structure of the Frizzled family

The exon–intron structures of the Frizzled genes were drawn using Perl and R scripts, based on the exon–intron structure information extracted from Ensembl and the domain information from the Pfam website (Fig. 4). Our results showed that all Smo and FZD3/6 genes have more introns than the other FZDs. Except for Smo and FZD3/6, most of the other Vertebrate FZDs have only one exon. The exon phases of Vertebrate FZD3/6 were all “02022”, while the exon phase of Ciona FZD3/6 differed from that of the Vertebrates. A similar phenomenon was observed in Vertebrate Smo, with exon phases “1002-0112-211”. Even the exon phase of sea anemone Smo was similar (“002-0112-2”). The intron numbers and exon phases in fly and nematode FZDs were not the same as in N. vectensis and Vertebrates, possibly as a result of Protostome intron loss.
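To illustrate what a phase string such as “02022” encodes, the sketch below derives one from coding-exon lengths. It assumes the conventional definition of phase (the number of coding bases upstream of an exon boundary, modulo 3) and that the string lists the boundaries in 5′ to 3′ order; whether the authors computed their strings this way is not stated in the text. The function name `intron_phases` and the exon lengths in the test are hypothetical.

```python
from itertools import accumulate


def intron_phases(cds_exon_lengths):
    """Phase of each boundary between consecutive coding exons.

    cds_exon_lengths: coding lengths (in bases) of the exons, 5' to 3'.
    A phase of 0 means the boundary falls between two codons; 1 or 2
    means it falls after the first or second base of a codon.
    """
    # Cumulative coding length at the end of every exon except the last.
    boundaries = list(accumulate(cds_exon_lengths))[:-1]
    return "".join(str(b % 3) for b in boundaries)
```

A single-exon coding region yields an empty string, which corresponds to the one-exon FZDs discussed above.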
It has been reported that extensive gene loss, intron loss, and genome rearrangement occurred in the Protostomes, which are one of the three major modern Eumetazoan lineages: Cnidarians (e.g., N. vectensis), Protostomes (e.g., D. melanogaster and C. elegans) and Deuterostomes (sea urchin, Ciona and Vertebrates) (Putnam et al., 2007). The Eumetazoan ancestor more closely resembled modern Vertebrates and sea anemones (Putnam et al., 2007). Compared to the exon phase of the sea anemone Smo, the fly Smo gene (FBpp0077788) lost introns to become “20222”. The coding region of fly FZD5/8 (FBpp0288861) was in one exon, similar to the FZD5/8 of other Invertebrates and of Vertebrates, hinting that the ancestral Eumetazoan FZD5/8 had only one exon in its coding region. The FZDs in D. discoideum and A. queenslandica contained only one or two exons. It is reported that introns in D. discoideum are few and short, and that intergenic regions are small (Eichinger et al., 2005). The Ciona FZD may have undergone lineage-specific changes after Ciona split from the other lineages.

Fig. 3. Phylogenetic tree of FZDs. The Bayesian tree was built from the 7TM domain sequences using MrBayes 3.1.2 with the mixed amino acid substitution model. The Vertebrate Frizzled branches are compressed and their expanded subtrees are shown in Supplementary Fig. S1.

Based on these results, we inferred that in the ancestral Eumetazoan genome all FZD and Smo genes may have had more introns. However, FZD1/2/7, FZD5/8 and FZD4/9/10 in the ancestor of N. vectensis lost introns, and these characters were conserved in Vertebrates. Putnam et al. compared the N. vectensis genes to those of other animals and revealed that the ancestral Eumetazoan genome was intron-rich. In contrast to intron gains, some lineages appear to have experienced extensive intron loss: notably, the fly lost 90% of introns, the nematode 80%, the sea squirt 50%, and human only 15% (Putnam et al., 2007). Our inference is consistent with theirs. In summary, the process of intron loss was complex and species specific. Two hypothesized mechanisms of intron loss, Genomic Deletion and Reverse Transcriptase-Mediated Intron Loss (RTMIL), may explain the FZD intron loss (Roy and Gilbert, 2006). RTMIL might be responsible for the intron loss that led to single-exon genes such as FZD1/2/7.

Fig. 4. The exon–intron structures of representative FZD genes. Red boxes: Fz-CRD domain; green boxes: Fz-7TM domain; white boxes: exons of protein coding regions other than the Fz-CRD and Fz-7TM domains; black boxes: untranslated regions (UTR) in exons, with long UTRs shortened (“//”); lines: introns, with long introns shortened. Abbreviations: Hs, Homo sapiens; Dr, Danio rerio; Xt, Xenopus tropicalis; Pm, Petromyzon marinus; Ci, Ciona intestinalis.

3.4. TMs and KTXXXW of Frizzled family

To analyze the conserved sites in the TM domains of FZDs, we used the WebLogo tool on the multiple alignments of the TMs. Some conserved motifs were found in the TMs, such as RPxxFLxxCY in the second TM domain, MAxxxWWVxL in the third TM domain and GxFxxLYxVP in the sixth TM domain (Fig. S3A). We also used MEME to search for conserved motifs and found three (Fig. S3B). Motif 1 was mainly located in the second TM domain. Motif 2 was located in the third and fourth TM domains and the loop between them. Motif 3 was located in the sixth TM domain and the loop before it. In 2004, Tateyama et al. studied the conformational change in the 7TMs when the class C GPCR mGluR1α was activated by its ligand, and found that the distance between the TM1–IL1–TM2 regions of the dimers changed when the receptor was activated. The distance between the TM3–IL2–TM4 regions changed as well (Tateyama et al., 2004). Interestingly, we found that motif 1 of FZD was also located in TM2, and motif 2 was located in TM3–TM4. Although FZD does not belong to the class C GPCRs, the activation mechanism of its 7TM may be similar to that of mGluR1α. Similar conserved regions in the transmembrane domains have also been reported in other GPCRs, such as the odorant receptors of amphioxus (Churcher and Taylor, 2009).

The KTXXXW motif in the C-terminal tail of FZD is conserved from sponge to human (Fig. S4). However, this motif is absent from all Smo proteins, which are involved in the Hedgehog but not the Wnt signaling pathway (Ingham et al., 2011). Moreover, we did not find this motif in the FZDs of C. elegans (T23D8.1 and Y34D9B.1b), D. melanogaster (FBpp0070977 and FBpp0111841) and D. discoideum (DDB0231833 and DDB0231626). What is the function of this motif? It has been demonstrated that this motif, located two amino acids after the 7TM domain of FZD, is required for the membrane relocalization and phosphorylation of the scaffolding protein Disheveled (Dvl) in Wnt/β-catenin signaling (Umbhauer et al., 2000). A recent review further pointed out that this motif can bind to the PDZ domain of the Dvl protein in both the canonical and PCP Wnt signaling pathways (Koval et al., 2011). Dvl is a major transducer in Wnt signaling (Koval et al., 2011).
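A motif written as KTXXXW (Lys, Thr, any three residues, Trp) maps directly onto a regular expression, so its presence in a C-terminal tail can be checked with a few lines of Python. The sketch below is only an illustration of the pattern itself. The function name `find_ktxxxw` is hypothetical, the tail is assumed to have been extracted beforehand as a one-letter amino acid string, and the sketch does not attempt to reproduce the alignment-based comparison behind Fig. S4.

```python
import re

# K, T, any three residues, then W (one-letter amino acid codes).
KTXXXW = re.compile(r"KT.{3}W")


def find_ktxxxw(tail):
    """Return 0-based start positions of non-overlapping KTxxxW matches."""
    return [m.start() for m in KTXXXW.finditer(tail.upper())]
```

An empty list means the motif is absent from the tail, as reported above for Smo and for the C. elegans, D. melanogaster and D. discoideum FZDs.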

3.5. SFRP emerged in sponge and expanded in Vertebrate

The SFRPs are the largest family of Wnt inhibitors (Bovolenta et al., 2008). We found one SFRP (gi|133917271|) of the sponge Lubomirskia baicalensis at NCBI. Since we chose A. queenslandica as the representative sponge, this sequence was excluded from the construction of the phylogenetic trees. Our phylogenetic analysis showed that SFRPs were divided into 2 groups, SFRP1/2/5 and SFRP3/4 (Figs. 5A and S5A–B). Two N. vectensis SFRPs were located at the base of the SFRP1/2/5 and SFRP3/4 clades respectively. Two S. purpuratus SFRPs had similar positions in the phylogenetic trees. Both Ciona SFRPs were located in the SFRP1/2/5 clade. All Vertebrates except lamprey contain five SFRPs and most Invertebrates have two, implying that SFRPs duplicated and diverged during the emergence of Vertebrates. The two investigated Protostomes are unusual in that C. elegans has only 1 SFRP and D. melanogaster has none. The domain structure of SFRP is simple (Fz-CRD and NTR domains) and has been conserved since its formation in Metazoa.


Fig. 5. Phylogenetic tree of RTK with Fz-CRD and SFRP. The Bayesian trees were built by the Fz-CRD sequences using MrBayes 3.1.2 with the mixed amino acid substitution model. (A) SFRP. (B) RTK, including three types, ROR1, ROR2, and MuSK. (C) MFRP, CPZ, CORIN, COL18A1 and sea urchin COMF.

3.6. The RTKs with Fz-CRD appeared in sponge

RTKs are single-pass transmembrane receptors which are essential components of signal transduction pathways that affect cell proliferation, differentiation, migration and metabolism. RTKs have a TK (tyrosine kinase) domain, which catalyzes the transfer of the γ-phosphate of ATP to the side-chain hydroxyl group of tyrosine residues in protein substrates (Hubbard, 1999). ROR RTKs are a family of orphan receptors related to the MuSK and Trk neurotrophin receptors, among which ROR and MuSK have the Fz-CRD (Forrester, 2002).


To study the evolution of RTKs with Fz-CRD (ROR1/2 and MuSK), we constructed several phylogenetic trees from the sequences of their truncated Fz-CRD domains (Figs. 5B and S5C–D). Of the nine A. queenslandica (sponge) RTKs, we kept the two sequences whose Fz-CRD contained all ten cysteines and excluded the others from the following analysis. The trees showed two large clades, the ROR clade and the MuSK clade. Two N. vectensis RTKs were located at the base of the combined ROR and MuSK clades, suggesting that they resemble the ancestral sequences and had not diverged into ROR or MuSK. The positions of the two S. purpuratus sequences varied among the different trees. The Vertebrate ROR1 and ROR2 were grouped with a high Bayesian posterior probability (1), indicating that they duplicated and diverged when Vertebrates emerged. The earliest MuSK appeared in fly (FBpp0086841). Comparing the domain structures, we found that this fly MuSK contained a Kringle domain, whereas the Vertebrate MuSK lacked it. Since all RORs contained the Kringle domain and emerged earlier than MuSK, we inferred that MuSK may have evolved from ROR by losing its Kringle domain. The N. vectensis RTK (jgi|Nemve1|21450|, Nv-ROR) without an I-set domain may be the ancestor-like sequence of fly MuSK. The sea urchin MuSK (gi|115735424|, labeled MuSK) did not contain a Kringle domain either.

In sponges, nine RTKs with Fz-CRD were found in A. queenslandica. Among these, two (Aqu1.225241 and Aqu1.214357) contained only two domains (Pkinase_Tyr + Fz-CRD), while one (Aqu1.214048) contained an additional Kringle domain (Pkinase_Tyr + Fz-CRD + Kringle). We also found four RTKs with Fz-CRD in another sponge, Ephydatia fluviatilis (Fig. S6A). This suggests that the fusion of Pkinase_Tyr and Fz-CRD occurred during the emergence of sponges or even earlier. The two N. vectensis RORs both contained four domains (Pkinase_Tyr + Fz-CRD + Kringle + I-set), two more than the two-domain sponge RTKs. Most RORs from N. vectensis to human contained one I-set and one Kringle domain, while MuSK contained three I-set domains and no Kringle domain. The domain structures of ROR and MuSK are conserved in Vertebrates, and the domain structures of human ROR1/2 resemble those of the N. vectensis RORs (Fig. S6A).

3.7. CORIN emerged in fruit fly, while MFRP, CPZ and COL18A1 emerged in Vertebrates

Besides FZD, SFRP and the RTKs with Fz-CRD, the other proteins with Fz-CRD are CORIN, MFRP, CPZ, and COL18A1. Since each of these families has only 1–2 members, we analyzed them together. The phylogenetic trees (Figs. 5C and S5E–F) were constructed from the alignment of their Fz-CRD sequences. The earliest CORIN was found in fruit fly, while the other three types were found only in Vertebrates. Sea urchin had 3 CORINs and 10 specific COMF, and these proteins (10 + 3) clustered in one branch supported by a Bayesian posterior probability of 0.91. V-MFRPs clustered with FlyCORIN FBpp0079096 with a Bayesian posterior probability of 0.89. The CORINs in fruit fly and sea urchin contained one Fz-CRD, while the CORINs in Vertebrates contained two Fz-CRDs (Fig. S6B). In the phylogenetic trees, the two Fz-CRDs of V-CORIN fell into two clades, which we named V-CORIN and V-CORIN2. COL18A1 and CPZ grouped together with a Bayesian posterior probability of 0.7, implying that the Fz-CRDs of COL18A1 and CPZ are homologous. The Bayesian posterior probabilities of the four Vertebrate groups (V-COL18A1, V-MFRP, V-CPZ and V-CORIN) were all high (1 or 0.99), implying that their Fz-CRDs were highly conserved during evolution. The presence of CORIN and CPZ in lamprey suggests that they emerged before the 2R WGD, whereas COL18A1 can be found in zebrafish, after the 2R WGD.
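The domain-structure comparisons made for the RTKs above amount to multiset differences between domain lists. The following minimal sketch is illustrative only and is not part of the original analysis: it encodes a few architectures as they are described in the text (the dictionary keys and the function names `gained_domains` and `lost_domains` are our own hypothetical labels) and reports which domains one protein has in excess of another.

```python
from collections import Counter

# Domain architectures as described in the text.
# The keys are illustrative labels, not identifiers used in the study.
ARCHITECTURES = {
    "sponge_RTK_2dom": Counter({"Pkinase_Tyr": 1, "Fz-CRD": 1}),
    "sponge_RTK_3dom": Counter({"Pkinase_Tyr": 1, "Fz-CRD": 1, "Kringle": 1}),
    "Nv_ROR": Counter({"Pkinase_Tyr": 1, "Fz-CRD": 1, "Kringle": 1, "I-set": 1}),
    "vertebrate_MuSK": Counter({"Pkinase_Tyr": 1, "Fz-CRD": 1, "I-set": 3}),
}


def gained_domains(reference, other):
    """Domains (with copy numbers) present in `other` beyond `reference`."""
    # Counter subtraction keeps only positive counts.
    return dict(other - reference)


def lost_domains(reference, other):
    """Domains (with copy numbers) present in `reference` but missing in `other`."""
    return dict(reference - other)


if __name__ == "__main__":
    sponge = ARCHITECTURES["sponge_RTK_2dom"]
    nv_ror = ARCHITECTURES["Nv_ROR"]
    musk = ARCHITECTURES["vertebrate_MuSK"]
    print("Nv-ROR vs sponge RTK, gained:", gained_domains(sponge, nv_ror))
    print("MuSK vs Nv-ROR, lost:", lost_domains(nv_ror, musk))
    print("MuSK vs Nv-ROR, gained:", gained_domains(nv_ror, musk))
```

Treating an architecture as a multiset rather than a set keeps copy numbers visible, which matters for cases such as the three I-set domains of MuSK. The direction of each comparison is only a convention here; it does not by itself establish which sequence is ancestral.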
The fruit fly, sea urchin and Ciona CORINs contained only one Fz-CRD, while Vertebrate CORINs contained two Fz-CRDs and two Ldl_recept_a domains. CPZ was present in Vertebrates, and its domain structure was simple (Fz-CRD and Peptidase_M14) and conserved. COL18A1 was present only in Vertebrates, and the mammalian COL18A1 (M-COL18A1) had acquired a DUF959 domain compared with the frog and zebrafish COL18A1 (Figs. S6B–C).

4. Discussion

4.1. Glycosylation sites in Fz-CRD domain may be important for Wnt-binding

Protein glycosylation plays crucial biological and physiological roles, including roles in protein folding, quality control and biological recognition events (Moremen et al., 2012). The N-X-T/S sequon is the conserved acceptor sequence for N-glycosylation (Schwarz and Aebi, 2011). Watty and Burden identified two N-linked glycosylation sites in MuSK around the Fz-CRD domain (Watty and Burden, 2002). We predicted all the N-linked glycosylation sites in proteins with Fz-CRD and found one N-linked glycosylation site near the second Cysteine in HsFZD1–10, HsSFRP3–4, HsROR1–2, HsCPZ, HsMFRP and others (Fig. 6). All FZDs containing this N-glycosylation site are able to bind Wnt, while all Smo proteins lack this site and cannot bind Wnt. Interestingly, two further proteins with this N-glycosylation site (ROR2 and CPZ) have been demonstrated to interact with Wnt ligands: ROR2 acts as a Wnt receptor (Oishi et al., 2003), and immunoprecipitation experiments suggest that the Fz-CRD of CPZ acts as a binding domain for Wnt (Moeller et al., 2003). We therefore speculated that this N-linked glycosylation site may be very important for Wnt-binding.

The Vertebrate ROR1/2 have this N-glycosylation site near the second Cysteine in the Fz-CRD, but the RORs of N. vectensis, S. purpuratus and C. intestinalis lack it (Fig. 6). The N-glycosylation site is NRS in zebrafish and NRT in human. At the same position in the alignment, Nv-ROR (jgi|Nemve1|84420|) has KKS and Ci-ROR (ENSCINP00000006994) has GKS. Because these sequences were all in the same branch of the phylogenetic tree, we deduced that the ancestral ROR sequence did not have this N-glycosylation site and that the first amino acid mutated into N during Vertebrate evolution. The Vertebrate RORs thus gained this N-glycosylation site.
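The N-X-T/S pattern discussed above can be checked with a simple motif scan. The sketch below is a plain pattern match, not the NetNGlyc predictor used for Fig. 6, and it assumes the conventional restriction that X is any residue except proline, which the text does not state explicitly. The function name `find_sequons` is our own; the four tripeptides are the ones quoted above for the ROR alignment position.

```python
import re

# N-X-S/T sequon. Assumption: X is any residue except proline (the
# conventional definition; the text only writes N-X-T/S).
# The lookahead lets overlapping sequons be reported as well.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(seq):
    """Return (0-based position, tripeptide) for every N-X-S/T match in seq."""
    return [(m.start(), m.group(1)) for m in SEQUON.finditer(seq.upper())]


if __name__ == "__main__":
    # Tripeptides at the aligned ROR position, as quoted in the text.
    ror_site = {
        "human ROR": "NRT",
        "zebrafish ROR": "NRS",
        "Nv-ROR": "KKS",
        "Ci-ROR": "GKS",
    }
    for name, tripeptide in ror_site.items():
        status = "sequon" if find_sequons(tripeptide) else "no sequon"
        print(f"{name}: {tripeptide} -> {status}")
```

A motif match only indicates a potential acceptor site; whether a site is actually glycosylated depends on context that a regular expression cannot capture.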
SFRP3/4 also contained this N-glycosylation site near the second Cysteine in the Fz-CRD, from N. vectensis to human (Fig. 6). The N-glycosylation sites of Nv-SFRP3/4 (jgi|Nemve1|88982|; NMT) and Sp-SFRP3/4 (gi|115975690|; NLT) resemble that of the FZDs (NXT). We therefore inferred that SFRP3/4 may interact with Wnt through the Fz-CRD, while SFRP1/2/5 may interact with Wnt in another way, for example through the NTR domain (Wnt interacts with NTR in Fig. 3D of Bovolenta et al. (2008)), or the Wnt-binding affinity of their Fz-CRD may be lower than that of SFRP3/4. The Bayesian posterior probability of the SFRP1/2/5 branch is 1, and their ancestor may resemble Nv-SFRP1/2/5 (jgi|Nemve1|200285|) (Figs. 5A and S5A–B).

Dann et al. studied the crystal structures of the Fz-CRDs from mouse Frizzled8 and SFRP3 and used residue mutations to identify sites important for direct binding between Wnt and the Fz-CRD (Dann et al., 2001). We marked three important Wnt-binding motifs in the alignment according to the Fz-CRD 3D structure (Dann et al., 2001). The first motif, located in the β1 and β2 sheets, may be important for Wnt-binding, while the second motif, located at the fifth Cysteine, and the third motif, at the C-terminus of the Fz-CRD domain, may be responsible for forming homo- or hetero-CRD dimers (Carron et al., 2003) (Fig. 6). Moreover, the N-glycosylation site we found was located in the first important Wnt-binding region identified in the earlier residue mutation experiments (Dann et al., 2001). We found that Smo and FZD3/6 lacked some amino acids at the end of the Fz-CRD compared with other proteins (Fig. 6). Dann et al. reported that Fz-CRDs exhibit a conserved dimer interface which may be important for Wnt binding (Dann et al., 2001). The block missing in Smo and FZD3/6 is located between the ninth and the tenth Cysteine and might be this dimer interface.
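The observation about the block between the ninth and tenth Cysteine can be expressed as a small helper that extracts the residues lying between two numbered cysteines. This is a sketch under strong assumptions: the input is taken to be a sequence already trimmed to the Fz-CRD, with the conserved cysteines being the only cysteines present, and the toy sequences are invented for illustration rather than taken from Fig. 6. The study itself compared aligned sequences, which is the more robust approach.

```python
def cysteine_positions(seq):
    """0-based positions of every cysteine in seq."""
    return [i for i, residue in enumerate(seq.upper()) if residue == "C"]


def segment_between_cysteines(seq, first=9, second=10):
    """Residues strictly between the `first`-th and `second`-th cysteine.

    Cysteines are counted from 1. Returns None when the sequence has fewer
    than `second` cysteines (for example an incomplete Fz-CRD).
    """
    if not 1 <= first < second:
        raise ValueError("expected 1 <= first < second")
    positions = cysteine_positions(seq)
    if len(positions) < second:
        return None
    return seq[positions[first - 1] + 1 : positions[second - 1]]


if __name__ == "__main__":
    # Hypothetical toy sequences, NOT real Fz-CRD sequences.
    with_block = "CA" * 9 + "LKVGE" + "C"
    without_block = "CA" * 9 + "C"
    for name, seq in [("with block", with_block), ("without block", without_block)]:
        segment = segment_between_cysteines(seq)
        print(f"{name}: {len(segment)} residues between Cys9 and Cys10")
```

Comparing the length of this segment across proteins would flag candidates such as Smo and FZD3/6, but any extra, non-conserved cysteine in a real sequence would shift the numbering, so results should be checked against the alignment.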
We also found three hydrophobic amino acids (indicated by arrows) which may be important for Wnt-binding in the Wnt-binding proteins we investigated (Fig. 6). The evidence is that the corresponding positions are not occupied by hydrophobic amino acids in proteins that do not bind Wnt, such as Smo (Fig. 6). These three hydrophobic amino acids were all located in the Wnt-binding motifs and were shown by mutation experiments to influence Wnt-binding (Dann et al., 2001). In summary, we concluded that the N-linked glycosylation site near the second Cysteine in the Fz-CRD might be very important for Wnt-binding, that the block missing in FZD3/6 and Smo may be involved in dimer formation of the Fz-CRD, and that the three hydrophobic amino acids may also contribute to Wnt-binding (Fig. 6).

Fig. 6. Important sites for Wnt-binding and N-glycosylation sites in Fz-CRD domains. The sequence alignment covers Fz-CRD domains from N. vectensis to H. sapiens. The five disulfide bridges are labeled with arrows and lines. Three important Wnt-binding regions are circled in dark blue. N-glycosylation sites predicted by the NetNGlyc server are circled in red. The third arrow at the bottom indicates the N-glycosylation site that might be important for Wnt-binding. The other three arrows indicate hydrophobic amino acids that might be important for Wnt-binding. Two crimson squares indicate the amino acids missing at the end of the Fz-CRDs of Smoothened and FZD3/6. Abbreviations: Hs, Homo sapiens; Dr, Danio rerio; Pm, Petromyzon marinus; Ci, Ciona intestinalis; Sp, Strongylocentrotus purpuratus; Dm, Drosophila melanogaster; Ce, Caenorhabditis elegans; Nv, Nematostella vectensis.

4.2. The domain fusion and expansions model of Fz-CRD proteins

Based on the above analyses, we propose a model (Fig. 7) for the evolution and domain fusion of the proteins with Fz-CRD. We infer that two rounds of expansion of the Fz-CRD domain occurred, one in Metazoa and one in Vertebrates.

Fig. 7. Evolution and domain fusion model of proteins with Fz-CRD. Two black squares indicate the two rounds of Fz-CRD domain fusion, which may have occurred during the emergence of Metazoa and of Vertebrates.

The first round of expansion occurred during the emergence of Metazoa, when FZD, ROR, SFRP, and CORIN emerged by domain fusion. Many sequences containing only an Fz-CRD exist in D. discoideum, sponges and N. vectensis. As shown above, the fusion of the Frizzled-like 7TM and Fz-CRD domains formed FZD in the ancestor of D. discoideum. The corresponding fusion that formed ROR may have occurred in the ancestor of sponges. Many sequences containing only the Pkinase_Tyr domain are present in sponges and M. brevicollis (Fig. S7A), and an RTK containing only two domains (Fz-CRD and Pkinase_Tyr; Aqu1.225241) was found in the sponge A. queenslandica. We infer that RTKs with Fz-CRD were first formed through fusion of Fz-CRD and Pkinase_Tyr in the sponge ancestor and later acquired the Kringle and I-set domains. The earliest SFRP was found in sponges (gi|133917271|); however, we could not find any sequence containing only an NTR domain in sponges (Fig. S7A). The earliest CORINs were found in fruit fly (FBpp0087983 and FBpp0071118), but their domain structures were incomplete compared with Vertebrate CORINs. Ciona CORIN acquired the second cluster of Ldl receptor domains, and Vertebrate CORINs acquired the second Fz-CRD (Fig. S7B).

The second expansion of Fz-CRD occurred during the emergence of Vertebrates, when three new kinds of proteins with Fz-CRD (MFRP, CPZ, and COL18A1) emerged. Compared with the MFRP in frogs (ENSXETP00000046684), one sequence in zebrafish (ENSDARP00000071206) lacks only the Fz-CRD. This implies that the MFRP fusion occurred during the emergence of amphibians or even earlier (Fig. S7B). The fused CPZ containing two domains (Peptidase_M14 and Fz-CRD) first appeared in Vertebrates, while a sequence containing only the Peptidase_M14 domain (ENSCINP00000021115) was found in Ciona. The fusion that formed COL18A1 (Fz-CRD, Collagen and Endostatin domains) occurred in Vertebrates. Sequences containing the Endostatin domain and clusters of the Collagen domain can be found in Ciona, and even in D. melanogaster and C. elegans. Mammalian COL18A1 acquired a DUF959 domain at the N-terminus (Fig. S7C).

Is the second expansion of Fz-CRD involved in novel functions in Vertebrates? CORIN functions in the heart to maintain normal blood pressure (Chan et al., 2005). MFRP and COL18A1 are involved in vision and eye development (Marneros and Olsen, 2005; Sundin et al., 2008; Won et al., 2008). The Fz-CRD of CPZ has been reported to bind Wnt and modulate Wnt signaling (Moeller et al., 2003; Wang et al., 2009). The Ciona heart can be used to assess the origin of the Vertebrate heart (Satou and Satoh, 2003), and Vertebrate eyes differ from those of insects and other Protostomes. We therefore suggest that CORIN, MFRP and COL18A1 may be involved in the origin of the Vertebrate heart and eyes. We speculate that these new proteins with an Fz-CRD domain (CORIN, MFRP, CPZ, and COL18A1) may have influenced the Wnt signaling pathway as Wnt antagonists after they acquired the Fz-CRD, and thereby regulate embryonic development. It will be interesting to test this hypothesis with more experimental data.

4.3. The origin and evolution of FZD

We traced FZD evolution during animal emergence. There were 4 FZDs and 1 Smo in the cnidarian N. vectensis, which is thought to resemble the last common ancestor it shares with Bilaterians, an organism that lived perhaps 700 million years ago (Putnam et al., 2007). Sponges diverged from other metazoans over 600 million years ago and are more primitive than the Cnidaria (N. vectensis) (Srivastava et al., 2010). In our study, 2 FZDs were found in the sponge A. queenslandica and 1 Frizzled in another sponge, Suberites domuncula (gi|38524368|), although the proteomes retrieved from NCBI are not complete. Many sequences containing only Fz-CRDs and one sequence containing only the Fz-7TM domain (Aqu1.227973) were found in A. queenslandica. This suggests that the Fz-CRD and 7TM domains may have come from two independent genes that were fused before sponges appeared. The earliest FZD was present in D. discoideum, suggesting that the fusion of Fz-CRD and Fz-7TM occurred in the common ancestor of D. discoideum and Metazoa. Krishnan et al. studied the origin of GPCRs and estimated that the Frizzled family evolved from the cAMP receptor family of GPCRs before the split of Unikonts (including Amoebozoa, Fungi and Metazoa) (Krishnan et al., 2012). Our result is consistent with this. We found cAMP receptors with the Dicty_CAR domain (PF05462, Slime mold cAMP receptor) in D. discoideum, but no sequence with both the Dicty_CAR and Fz-CRD (PF01392) domains. This suggests that a cAMP receptor with only the Dicty_CAR domain may have evolved into a sequence with the Fz-7TM (PF01534) domain, which then fused with the Fz-CRD domain to form FZD.

Urochordata (Ciona) and Echinodermata (sea urchin) have 5 FZDs, while most Vertebrates have 11 FZDs, suggesting that gene duplication events happened in the Vertebrate lineage. The presence of 4 FZDs in lamprey also supports this view. This is consistent with reports that the 2R-WGD (two rounds of whole genome duplication) affected the Wnts and GPCRs (Huminiecki and Heldin, 2010). Nakatani et al. reconstructed the Vertebrate ancestral genome before the 2R-WGD (10–13 chromosomes, before the split of Agnatha and Gnathostomata) and mapped chromosome segments between the Vertebrate ancestor and human (Nakatani et al., 2007). Based on their results, we mapped the chromosome location of each human FZD onto Fig. 4 of that paper (Nakatani et al., 2007). We found that FZD1, FZD2, and FZD7 derive from the same ancestral chromosome E of the reconstructed Vertebrate ancestor before the 1R-WGD. Interestingly, in the gnathostome ancestor after the 2R-WGD, FZD1, FZD2 and FZD7 are located on ancestral chromosomes E1, E2 and E0, respectively, which suggests that the FZD1/2/7 ancestor expanded in Vertebrates as part of the 2R genome duplications.
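The reasoning in the mapping above (paralogs on distinct post-2R copies of a single pre-2R chromosome) can be sketched as a small grouping step. The sketch assumes that a post-2R chromosome label is the pre-2R letter followed by a digit, as in E0, E1 and E2; the dictionary holds only the three assignments quoted in the text, and the function names are our own.

```python
from collections import defaultdict

# Post-2R ancestral chromosome of each gene, as quoted in the text.
POST_2R_LOCATION = {"FZD1": "E1", "FZD2": "E2", "FZD7": "E0"}


def pre_2r_chromosome(label):
    """Strip the trailing copy number: 'E1' -> 'E' (assumed label format)."""
    return label.rstrip("0123456789")


def group_by_pre_2r_chromosome(locations):
    """Map each pre-2R chromosome to the sorted list of genes derived from it."""
    groups = defaultdict(list)
    for gene in sorted(locations):
        groups[pre_2r_chromosome(locations[gene])].append(gene)
    return dict(groups)


def consistent_with_2r(locations):
    """True if all genes share one pre-2R chromosome but occupy distinct post-2R copies."""
    labels = list(locations.values())
    same_origin = len({pre_2r_chromosome(label) for label in labels}) == 1
    distinct_copies = len(set(labels)) == len(labels)
    return same_origin and distinct_copies


if __name__ == "__main__":
    print(group_by_pre_2r_chromosome(POST_2R_LOCATION))
    print("consistent with 2R origin:", consistent_with_2r(POST_2R_LOCATION))
```

This check captures only the chromosomal pattern; it is compatible with an origin by whole genome duplication but does not exclude other duplication mechanisms, which is why the phylogenetic evidence is considered alongside it.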
Similar results were obtained for all other FZD groups (FZD3/6 (B), FZD5/8 (E) and FZD9/10 (E)) and for SFRP and ROR (SFRP1/2/5 (C), SFRP3/4 (C) and ROR1/2 (A)).

5. Conclusion

In this study, we systematically identified proteins with the Fz-CRD domain and analyzed their evolution and domain fusion. Proteins with Fz-CRD, including FZD, SFRP, RTK (ROR and MuSK), MFRP, CPZ, COL18A1, CORIN and others, were characterized in representative genomes. We proposed a model of the Fz-CRD fusion events, which may have occurred during the emergence of Metazoa and of Vertebrates. The earliest SFRP was found in sponge, and the family expanded from 2 members in N. vectensis to 5 in Vertebrates; the earliest RTKs with Fz-CRD (ROR) were found in sponge, while the earliest MuSK appeared in fruit fly and was conserved in Vertebrates; the earliest CORIN was found in flies, and its domain structure changed in Vertebrates. MFRP, CPZ and COL18A1 were found only in Vertebrates. We also analyzed some functional sites, such as the KTXXXW motif and the N-glycosylation site NXT.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2013.09.083.

Conflict of interest

None.

Acknowledgments

We would like to thank Dr Siluo Huang (Huazhong University of Science and Technology) and Dr Jianhua Cao (Huazhong Agricultural University) for their advice on this study. This work was supported by the following funds to A.Y.G: National Natural Science Foundation of China (NSFC) (31171271 and 31270885), Young Teachers' Fund for Doctor Stations, Ministry of Education of China (20110142120042), and a fund from the State Key Laboratory of Freshwater Ecology and Biotechnology (2012FB02). This work was also supported by the following funds to H.B.J: NSFC 31171387, 31000640 and 81090113.

References

Abascal, F., Zardoya, R., Posada, D., 2005. ProtTest: selection of best-fit models of protein evolution. Bioinformatics 21, 2104–2105.
Bafico, A., Gazit, A., Pramila, T., Finch, P.W., Yaniv, A., Aaronson, S.A., 1999. Interaction of frizzled related protein (FRP) with Wnt ligands and the frizzled receptor suggests alternative mechanisms for FRP inhibition of Wnt signaling. J. Biol. Chem. 274, 16180–16187.
Bockaert, J., Pin, J.P., 1999. Molecular tinkering of G protein-coupled receptors: an evolutionary success. EMBO J. 18, 1723–1729.
Bovolenta, P., Esteve, P., Ruiz, J.M., Cisneros, E., Lopez-Rios, J., 2008. Beyond Wnt inhibition: new functions of secreted Frizzled-related proteins in development and disease. J. Cell Sci. 121, 737–746.
Carron, C., Pascal, A., Djiane, A., Boucaut, J.C., Shi, D.L., Umbhauer, M., 2003. Frizzled receptor dimerization is sufficient to activate the Wnt/beta-catenin pathway. J. Cell Sci. 116, 2541–2550.
Chan, J.C., Knudson, O., Wu, F., Morser, J., Dole, W.P., Wu, Q., 2005. Hypertension in mice lacking the proatrial natriuretic peptide convertase corin. Proc. Natl. Acad. Sci. U. S. A. 102, 785–790.
Churcher, A.M., Taylor, J.S., 2009. Amphioxus (Branchiostoma floridae) has orthologs of vertebrate odorant receptors. BMC Evol. Biol. 9, 242.
Croce, J.C., et al., 2006. A genome-wide survey of the evolutionarily conserved Wnt pathways in the sea urchin Strongylocentrotus purpuratus. Dev. Biol. 300, 121–131.
Dann, C.E., Hsieh, J.C., Rattner, A., Sharma, D., Nathans, J., Leahy, D.J., 2001. Insights into Wnt binding and signalling from the structures of two Frizzled cysteine-rich domains. Nature 412, 86–90.
Eichinger, L., et al., 2005. The genome of the social amoeba Dictyostelium discoideum. Nature 435, 43–57.
Forrester, W.C., 2002. The ROR receptor tyrosine kinase family. Cell. Mol. Life Sci. 59, 83–96.
Fredriksson, R., Schioth, H.B., 2005. The repertoire of G-protein-coupled receptors in fully sequenced genomes. Mol. Pharmacol. 67, 1414–1425.
Fredriksson, R., Lagerstrom, M.C., Lundin, L.G., Schioth, H.B., 2003. The G-protein-coupled receptors in the human genome form five main families. Phylogenetic analysis, paralogon groups, and fingerprints. Mol. Pharmacol. 63, 1256–1272.
Freese, J.L., Pino, D., Pleasure, S.J., 2010. Wnt signaling in development and disease. Neurobiol. Dis. 38, 148–153.
Guindon, S., Gascuel, O., 2003. A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52, 696–704.
Hall, T.A., 1999. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp. Ser. 95–98.
Hill, C.A., et al., 2002. G protein-coupled receptors in Anopheles gambiae. Science 298, 176–178.
Hirokawa, T., Boon-Chieng, S., Mitaku, S., 1998. SOSUI: classification and secondary structure prediction system for membrane proteins. Bioinformatics 14, 378–379.
Huang, Y., Niu, B., Gao, Y., Fu, L., Li, W., 2010. CD-HIT Suite: a web server for clustering and comparing biological sequences. Bioinformatics 26, 680–682.
Hubbard, S.R., 1999. Structural analysis of receptor tyrosine kinases. Prog. Biophys. Mol. Biol. 71, 343–358.
Huminiecki, L., Heldin, C.H., 2010. 2R and remodeling of vertebrate signal transduction engine. BMC Biol. 8, 146.
Ingham, P.W., Nakano, Y., Seger, C., 2011. Mechanisms and functions of Hedgehog signalling across the metazoa. Nat. Rev. Genet. 12, 393–406.
Ji, Y., Zhang, Z., Hu, Y., 2009. The repertoire of G-protein-coupled receptors in Xenopus tropicalis. BMC Genomics 10, 263.
Kamesh, N., Aradhyam, G.K., Manoj, N., 2008. The repertoire of G protein-coupled receptors in the sea squirt Ciona intestinalis. BMC Evol. Biol. 8, 129.
King, N., et al., 2008. The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature 451, 783–788.
Koval, A., Purvanov, V., Egger-Adam, D., Katanaev, V.L., 2011. Yellow submarine of the Wnt/Frizzled signaling: submerging from the G protein harbor to the targets. Biochem. Pharmacol. 82, 1311–1319.
Krishnan, A., Almen, M.S., Fredriksson, R., Schioth, H.B., 2012. The origin of GPCRs: identification of mammalian like Rhodopsin, Adhesion, Glutamate and Frizzled GPCRs in fungi. PLoS One 7, e29817.
Krogh, A., Larsson, B., von Heijne, G., Sonnhammer, E.L., 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J. Mol. Biol. 305, 567–580.
Larkin, M.A., et al., 2007. Clustal W and Clustal X version 2.0. Bioinformatics 2947–2948.
Lee, P.N., Pang, K., Matus, D.Q., Martindale, M.Q., 2006. A WNT of things to come: evolution of Wnt signaling and polarity in cnidarians. Semin. Cell Dev. Biol. 17, 157–167.
Ling, L., Nurcombe, V., Cool, S.M., 2009. Wnt signaling controls the fate of mesenchymal stem cells. Gene 433, 1–7.
Liu, T., DeCostanzo, A.J., Liu, X., Wang, H.Y., Hallagan, S., Moon, R.T., Malbon, C.C., 2001. G protein signaling from activated rat frizzled-1 to the beta-catenin-Lef-Tcf pathway. Science 292 (5522), 1718–1722.
Marneros, A.G., Olsen, B.R., 2005. Physiological role of collagen XVIII and endostatin. FASEB J. 19, 716–728.
Metpally, R.P., Sowdhamini, R., 2005. Genome wide survey of G protein-coupled receptors in Tetraodon nigroviridis. BMC Evol. Biol. 5, 41.
Moeller, C., Swindell, E.C., Kispert, A., Eichele, G., 2003. Carboxypeptidase Z (CPZ) modulates Wnt signaling and regulates the development of skeletal elements in the chicken. Development 130, 5103–5111.
Moremen, K.W., Tiemeyer, M., Nairn, A.V., 2012. Vertebrate protein glycosylation: diversity, synthesis and function. Nat. Rev. Mol. Cell Biol. 13, 448–462.
Nakatani, Y., Takeda, H., Kohara, Y., Morishita, S., 2007. Reconstruction of the vertebrate ancestral genome reveals dynamic genome reorganization in early vertebrates. Genome Res. 17, 1254–1265.
Oishi, I., et al., 2003. The receptor tyrosine kinase Ror2 is involved in non-canonical Wnt5a/JNK signalling pathway. Genes Cells 8, 645–654.
Prabhu, Y., Eichinger, L., 2006. The Dictyostelium repertoire of seven transmembrane domain receptors. Eur. J. Cell Biol. 85, 937–946.
Putnam, N.H., et al., 2007. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317, 86–94.
Rehn, M., Pihlajaniemi, T., Hofmann, K., Bucher, P., 1998. The frizzled motif: in how many different protein families does it occur? Trends Biochem. Sci. 23, 415–417.
Ronquist, F., Huelsenbeck, J.P., 2003. MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 19, 1572–1574.
Roy, S.W., Gilbert, W., 2006. The evolution of spliceosomal introns: patterns, puzzles and progress. Nat. Rev. Genet. 7, 211–221.
Saldanha, J., Singh, J., Mahadevan, D., 1998. Identification of a Frizzled-like cysteine rich domain in the extracellular region of developmental receptor tyrosine kinases. Protein Sci. 7, 1632–1635.
Satou, Y., Satoh, N., 2003. Draft genome sequence of Ciona intestinalis and its meaning. Tanpakushitsu Kakusan Koso 48, 1282–1286.
Schwarz, F., Aebi, M., 2011. Mechanisms and principles of N-linked protein glycosylation. Curr. Opin. Struct. Biol. 21, 576–582.
Srivastava, M., et al., 2010. The Amphimedon queenslandica genome and the evolution of animal complexity. Nature 466, 720–726.
Sundin, O.H., et al., 2008. Developmental basis of nanophthalmos: MFRP Is required for both prenatal ocular growth and postnatal emmetropization. Ophthalmic Genet. 29, 1–9.
Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: Molecular Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol. Evol. 1596–1599.
Tateyama, M., Abe, H., Nakata, H., Saito, O., Kubo, Y., 2004. Ligand-induced rearrangement of the dimeric metabotropic glutamate receptor 1alpha. Nat. Struct. Mol. Biol. 11, 637–642.
Tusnady, G.E., Simon, I., 2001. The HMMTOP transmembrane topology prediction server. Bioinformatics 17, 849–850.
Umbhauer, M., et al., 2000. The C-terminal cytoplasmic Lys-thr-X-X-X-Trp motif in frizzled receptors mediates Wnt/beta-catenin signalling. EMBO J. 19, 4944–4954.
Wang, H.Y., Liu, T., Malbon, C.C., 2006. Structure-function analysis of Frizzleds. Cell. Signal. 18, 934–941.
Wang, L., Shao, Y.Y., Ballock, R.T., 2009. Carboxypeptidase Z (CPZ) links thyroid hormone and Wnt signaling pathways in growth plate chondrocytes. J. Bone Miner. Res. 24, 265–273.
Watty, A., Burden, S.J., 2002. MuSK glycosylation restrains MuSK activation and acetylcholine receptor clustering. J. Biol. Chem. 277, 50457–50462.
Won, J., et al., 2008. Membrane frizzled-related protein is necessary for the normal development and maintenance of photoreceptor outer segments. Vis. Neurosci. 25, 563–574.
Xu, Y.K., Nusse, R., 1998. The Frizzled CRD domain is conserved in diverse proteins including several receptor tyrosine kinases. Curr. Biol. 8, R405–R406.
Zamanian, M., Kimber, M.J., McVeigh, P., Carlson, S.A., Maule, A.G., Day, T.A., 2011. The repertoire of G protein-coupled receptors in the human parasite Schistosoma mansoni and the model organism Schmidtea mediterranea. BMC Genomics 12, 596.