Gene 495 (2012) 128–133
Gene journal homepage: www.elsevier.com/locate/gene
The soybean aldehyde dehydrogenase (ALDH) protein superfamily
Simeon O. Kotchoni a,b,⁎, Jose C. Jimenez-Lopez c, Adéchola P.P. Kayodé d, Emma W. Gachomo e, Lamine Baba-Moussa f
a Department of Biology, Rutgers University, 315 Penn St., Camden, NJ 08102, USA
b Center for Computational and Integrative Biology (CCIB), Rutgers University, 315 Penn St., Camden, NJ 08102, USA
c Department of Biochemistry, Cell and Molecular Biology of Plants, Estacion Experimental del Zaidin, Consejo Superior de Investigaciones Cientificas, Profesor Albareda 1, 18008, Granada, Spain
d Département de Nutrition et Sciences Alimentaire, Faculté des Sciences Agronomiques, Université d'Abomey-Calavi, 01 BP 526 Cotonou, Benin
e Department of Crop Science, University of Illinois U-C, 1201 W. Gregory, Urbana, IL 61801, USA
f Laboratoire de Biologie et de Typage Moléculaire en Microbiologie, Faculté des Sciences et Techniques, Université d'Abomey-Calavi, 05 BP 1604 Cotonou, Benin
Article info
Article history: Accepted 20 December 2011; available online 29 December 2011
Keywords: Genome; Soybean; ALDH; Gene nomenclature; Plants
Abstract
Aldehyde dehydrogenases (ALDHs) are members of an NAD(P)+-dependent protein superfamily that catalyze the oxidation of a wide range of endogenous and exogenous, highly reactive aliphatic and aromatic aldehydes to their corresponding non-toxic carboxylic acids. Research evidence has shown that ALDHs represent a promising class of genes for improving growth, development, seed storage and environmental stress adaptation in higher plants. The recently completed genome sequences of several plant species have resulted in the identification of a large number of ALDH genes, most of which still need to be functionally characterized. In this paper, we identify members of the ALDH gene superfamily in the soybean genome and provide a unified nomenclature for all soybean ALDH gene families. The soybean genome contains 18 unique ALDH sequences encoding members of five ALDH families involved in a wide range of metabolic and molecular detoxification pathways. In addition, we describe the biochemical requirements and cellular metabolic pathways of selected ALDHs in soybean responses to environmental stress conditions.
Published by Elsevier B.V.
Abbreviations: ALDH, aldehyde dehydrogenase; BADH, betaine aldehyde dehydrogenase; GAPN, non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase; GABA, γ-aminobutyric acid; MMALDH, methylmalonyl semialdehyde dehydrogenase; P5CDH, Δ1-pyrroline-5-carboxylate dehydrogenase; P5CS, Δ1-pyrroline-5-carboxylate synthase; SSALDH, succinic semialdehyde dehydrogenase.
⁎ Corresponding author at: Department of Biology, Rutgers University, 315 Penn St., Camden, NJ 08102, USA. Tel.: +1 856 225 6354; fax: +1 856 225 6165.
E-mail addresses: [email protected], [email protected] (S.O. Kotchoni).
0378-1119/$ – see front matter. Published by Elsevier B.V. doi:10.1016/j.gene.2011.12.035
1. Introduction
To date, few plant genomes have been fully sequenced; examples include Arabidopsis thaliana (Arabidopsis Genome Initiative 2000), Oryza sativa (Goff et al., 2002; International Rice Genome Sequencing Project 2005; Yu et al., 2002), Populus trichocarpa (Tuskan et al., 2006), the unicellular green algae Chlamydomonas reinhardtii (Merchant et al., 2007), Ostreococcus lucimarinus (Palenik et al., 2007) and Ostreococcus tauri (Derelle et al., 2006), Zea mays (Schnable et al., 2009), Medicago truncatula (http://www.medicagohapmap.org/index.php), Sorghum bicolor (http://mips.helmholtz-muenchen.de/plant/sorghum/genomeView/index.jsp), Vitis vinifera (Moroldo et al., 2008), and recently Glycine max (Schmutz et al., 2010). The soybean genome is so far the largest plant genome with 70% more protein-coding genes than A. thaliana (Schmutz et al., 2010). We are interested in understanding the molecular and genetic bases of drought stress and disease resistance in soybean, with a particular emphasis on the aldehyde dehydrogenase (ALDH) protein superfamily (Jimenez-Lopez et al., 2010; Kirch et al., 2005; Kotchoni, 2004; Kotchoni et al., 2010a). ALDHs have been studied in various organisms from bacteria to mammals (Jimenez-Lopez et al., 2010; Kotchoni et al., 2010a; Wood and Duff, 2009; Yoshida et al., 1998). ALDHs comprise a protein superfamily of NAD(P)+-dependent enzymes capable of oxidizing a variety of aromatic and aliphatic aldehydes into their corresponding carboxylic acids (Kirch et al., 2004). Aldehydes are highly reactive molecules ubiquitously produced during different physiological processes in prokaryotes and eukaryotes (Kotchoni et al., 2006; Kotchoni et al., 2010a). They often react with cellular nucleophiles through their carbonyl group, with deleterious effects on metabolism when their in vivo concentration becomes excessive. The selective elimination of excess aldehyde byproducts by ALDHs is therefore essential for cellular function, and ALDHs have been considered efficient detoxifying enzymes that eliminate biogenic and xenobiotic reactive aldehydes (Yoshida et al., 1998). In addition, ALDHs are involved in many other fundamental biochemical processes, including the reduction of lipid peroxidation (Kotchoni et al., 2006; Missihoun et al., 2011; Stiti et al., 2011), the detoxification of aldehydes generated by ethanol fermentation (Vasiliou and Nebert, 2005; Yoshida et al., 1998) and responses to environmental stresses such as salinity, drought and desiccation (Kirch et al., 2001; Kotchoni and Bartels, 2003; Kotchoni et al., 2006; Missihoun et al., 2011; Stiti et al., 2011).
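The generic reaction catalyzed by ALDHs, as described above, can be written schematically as follows. This is the standard textbook stoichiometry rather than an equation taken from this paper; R stands for any aliphatic or aromatic substituent.

```latex
\[
\mathrm{R{-}CHO} \;+\; \mathrm{NAD(P)^{+}} \;+\; \mathrm{H_2O}
\;\longrightarrow\;
\mathrm{R{-}COOH} \;+\; \mathrm{NAD(P)H} \;+\; \mathrm{H^{+}}
\]
```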
S.O. Kotchoni et al. / Gene 495 (2012) 128–133
In our previous report, we provided the first evidence of a crucial role of the maize RF2A/ALDH2B2 tunnel-like cavity in preventing maize infection by Cochliobolus heterostrophus (Jimenez-Lopez et al., 2010). In addition, we have recently annotated and characterized the entire ALDH gene superfamilies of rice and maize (Jimenez-Lopez et al., 2010; Kotchoni et al., 2010a). In this study, we provide a unified annotation and phylogenetic analysis of the soybean ALDH gene superfamily and describe the functional impact of selected members of the ALDH gene families on specific metabolic pathways mediating soybean growth and development. In particular, we aim to highlight potential roles of stress-inducible ALDH genes in adaptation to abiotic stress conditions and in improving the ability of soybean to cope with stress.
2. Materials and methods
2.1. Identification, annotation and characterization of soybean ALDH protein superfamily
To identify the soybean ALDH protein superfamily, previously identified Arabidopsis, rice and maize ALDH sequences retrieved from NCBI (http://www.ncbi.nlm.nih.gov/), the rice genomic database (TIGR Rice Annotation Release 4, http://blast.jcvi.org/eukblast/index.cgi?project=osa1), and the maize genome (release 4a.53, http://www.maizesequence.org; Schnable et al., 2009) were used to search for soybean ALDH and ALDH-like DNA sequences using BLASTX, BLASTN and the BLAST 2.2.24 release (low-complexity filter; BLOSUM62 substitution matrix) (Altschul et al., 1997). Protein motifs of the identified soybean ALDHs were queried using PROSITE release 20.66 (Sigrist et al., 2010), Pfam 23.0 (Finn et al., 2010), and the CDD v2.25 (Conserved Domain Database) or CDART (Conserved Domain Architecture Retrieval Tool) tools (Marchler-Bauer et al., 2009).
The retrieved sequences were then double-checked against Pfam 00171 (ALDH family), PS00070 (ALDH cysteine active site), PS00687 (ALDH glutamic acid active site), KOG2450 (aldehyde dehydrogenase), KOG2451 (aldehyde dehydrogenase), KOG2453 (aldehyde dehydrogenase) and KOG2456 (aldehyde dehydrogenase) in order to identify domains of the soybean ALDH protein superfamily. Putative functions were thereafter assigned to the retrieved proteins based upon significant similarity to functionally characterized proteins, as previously described (Kotchoni et al., 2010a). The deduced soybean ALDH polypeptides were then annotated using the criteria established by the ALDH Gene Nomenclature Committee (AGNC) (Kotchoni et al., 2010a). Based on the AGNC annotation criteria, deduced amino acid sequences that were more than 40% identical to other previously identified ALDH sequences formed a family, and sequences that were more than 60% identical composed a protein subfamily. Deduced amino acid sequences less than 40% identical to previously identified ALDH sequences would describe a new ALDH protein family, as previously reported (Jimenez-Lopez et al., 2010; Kotchoni et al., 2010a).
2.2. Soybean ALDHs: phylogenetic analysis
For the phylogenetic analysis of soybean ALDH proteins, the maize (genome release 4a.53, http://www.maizesequence.org; Schnable et al., 2009), A. thaliana (The Arabidopsis Information Resource, TAIR; http://www.arabidopsis.org/), Physcomitrella patens ssp. patens, and C. reinhardtii (Genome Resources of the US Department of Energy Joint Genome Institute; http://genome.jgi-psf.org/) ALDH superfamilies were retrieved and used together with the soybean ALDH superfamily to generate the phylogenetic trees using ClustalW, as previously described (Kotchoni et al., 2010a). The alignments were created using the Gonnet protein weight matrix, multiple alignment gap opening/extension penalties of 10/0.5 and pairwise gap opening/extension penalties of 10/0.1. We adjusted the alignments using BioEdit v7.0.5.3 and eliminated portions of sequences that could not be reliably aligned. Phylogenetic trees were generated by the neighbor-joining (NJ) method, and the tree branches were tested with 1000 bootstrap replicates.
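The AGNC identity thresholds described in Section 2.1 (more than 40% identity for a family, more than 60% for a subfamily) can be illustrated with a minimal sketch. The function names are hypothetical, and the definition of percent identity used here (matches over aligned columns where neither sequence has a gap) is an assumption; the paper does not state which definition was applied.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue  # ignore gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def agnc_relationship(identity: float) -> str:
    """Apply the AGNC thresholds quoted in the text (strictly greater than)."""
    if identity > 60.0:
        return "same subfamily"
    if identity > 40.0:
        return "same family"
    return "different family"
```

Values of exactly 40% or 60% fall into the lower category here because the text says "more than"; how boundary cases are handled in practice is not specified in the source.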
The soybean and maize tree was visualized with Treeview v.0.5.0, and the more expanded tree composed of G. max, Z. mays, O. sativa, A. thaliana, P. patens and C. reinhardtii ALDHs was visualized with Treedyn 198.3 as described previously (Jimenez-Lopez et al., 2010).
3. Results
3.1. Soybean ALDH protein families: unified nomenclature
Database searches resulted in the identification of 18 soybean ALDH gene sequences encoding members of five ALDH protein families (ALDH2, ALDH3, ALDH7, ALDH10 and ALDH11) that have been previously identified in other plant species (Table 1, Fig. 1). Following the AGNC scheme, each protein name consists of the root symbol (ALDH), followed by the family designation number (1, 2, 3, 4, etc.), a subfamily designator (A, B, C, D, etc.) and, finally, the individual gene number, as illustrated in Table 1. Four of the five soybean ALDH families are represented by more than one gene [family 2 (five members), family 7 (four members), family 10 (six members), and family 11 (two members)], whereas family 3 is represented by a single gene. Interestingly, these ALDH genes, with the exception of GmALDH3H2, are located in duplicated regions of the soybean genome, which might result from direct gene duplications. Several other ALDH gene families identified in the rice, Arabidopsis and maize genomes are missing in soybean, suggesting that these genes might have been lost through gene recombination events (Figs. 1 and 2).
3.2. Soybean ALDHs: phylogenetic analysis
A phylogenetic analysis of soybean ALDH sequences together with other well-characterized plant ALDHs has not previously been performed. To gain insight into the functional relevance of abundantly represented members of soybean ALDH protein families, we determined the phylogenetic relationships between soybean ALDHs and other well-characterized plant ALDHs.
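The AGNC naming scheme described in Section 3.1 (optional species prefix, root symbol ALDH, family number, subfamily letter, gene number) can be sketched as a small parser. The regular expression and helper names are illustrative assumptions, not part of the AGNC specification; the gene names are those listed in Table 1.

```python
import re
from collections import Counter
from typing import NamedTuple


class AldhName(NamedTuple):
    species: str    # e.g. "Gm" for Glycine max; empty if no prefix
    family: int     # family designation number
    subfamily: str  # subfamily designator letter
    gene: int       # individual gene number


# Hypothetical pattern: optional 2-3 letter species prefix, then ALDH<family><subfamily><gene>.
NAME_PATTERN = re.compile(
    r"^(?P<species>[A-Z][a-z]{1,2})?ALDH(?P<family>\d+)(?P<subfamily>[A-Z])(?P<gene>\d+)$"
)


def parse_aldh_name(name: str) -> AldhName:
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a full AGNC-style ALDH name: {name!r}")
    return AldhName(
        species=match.group("species") or "",
        family=int(match.group("family")),
        subfamily=match.group("subfamily"),
        gene=int(match.group("gene")),
    )


# The 18 soybean genes as named in Table 1.
TABLE1_NAMES = [
    "GmALDH2B1", "GmALDH2B2", "GmALDH2B3", "GmALDH2C1", "GmALDH2D1",
    "GmALDH3H2",
    "GmALDH7A1", "GmALDH7B6", "GmALDH7B7", "GmALDH7B8",
    "GmALDH10A4", "GmALDH10A5", "GmALDH10A6", "GmALDH10A7", "GmALDH10A9", "GmALDH10B1",
    "GmALDH11A3", "GmALDH11A4",
]


def genes_per_family(names):
    """Count how many genes fall into each ALDH family."""
    return dict(Counter(parse_aldh_name(n).family for n in names))
```

Counting the Table 1 names this way gives five genes in family 2, one in family 3, four in family 7, six in family 10 and two in family 11, which matches the member counts stated in the text.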
The phylogenetic analysis shows that the plant ALDHs are split into three clades (plant ALDH origins). Soybean ALDHs share the common core of the plant ALDH families (ALDH2, ALDH3, ALDH5, ALDH6, ALDH7, ALDH10, ALDH11 and ALDH12), with the exception of ALDH5, ALDH6 and ALDH12, which are missing from the soybean genome (Figs. 1 and 2, Table 2). It is clearly evident that families 2, 5, 6, 7, 10, 22 and 23, representing the plant ALDH core families, cluster together (clade 1). However, families 3, 12 and 18, which also belong to the plant ALDH core families, cluster in clade 2 of the tree, while families 11 and 21 belong to clade 3 (Fig. 2). In addition, this study reveals some interesting findings. Family 3 ALDHs are represented by a single gene in soybean (Figs. 1 and 2), yet this family is represented by five gene members in other plant species such as maize, moss and rice (Jimenez-Lopez et al., 2010; Kotchoni et al., 2010a). However, the green algae, one of the earliest-evolved lineages in the plant kingdom, lack family 3 ALDH proteins (Table 2, Fig. 2). Family 7 ALDHs are represented by four gene members in soybean and by a single gene in other plants, with the exception of the green algae, which lack ALDH7 proteins (Fig. 2, Table 2). The multiple members of family 7 in soybean can be explained by the fact that GmALDH7 genes are located in duplicated regions of the soybean genome, which points to direct gene duplications. On the other hand, the soybean genome lacks family 5, 6, 12 and 18 ALDHs, which are members of the core plant ALDH proteins (Table 2). These families encode substrate-specific ALDH proteins and are involved in metabolic pathways of the environmental stress response (Kirch et al., 2004; Kotchoni and Bartels, 2003; Wood and Duff, 2009). Understanding the biological implications of the lack of ALDH families 5, 6, 12 and 18, and of the multiple duplications of other ALDH gene families, is of great value for soybean crop improvement.
Indeed, family 5 ALDHs encode succinic semialdehyde dehydrogenases, which are crucial for UV-B and heat stress responses in plants, while family 6
Table 1 Glycine max aldehyde dehydrogenase (ALDH) protein superfamily: unified nomenclature and subcellular localization.
ALDH family | Revised annotation | Accession number | Putative molecular function | Subcellular localization | Coding sequence (nd) | Number of amino acids | Predicted MW (Da)
Family 2 | GmALDH2B1 | AK244993 | Aldehyde dehydrogenase | Mitochondria | 1617 | 538 | 58,453.73
Family 2 | GmALDH2B2 | AK244996 | Aldehyde dehydrogenase | Mitochondria | 1620 | 539 | 58,552.87
Family 2 | GmALDH2B3 | AK285821 | Aldehyde dehydrogenase | Mitochondria | 1623 | 540 | 58,906.48
Family 2 | GmALDH2C1 | AK244669 | Aldehyde dehydrogenase | Cytosol | 1506 | 501 | 54,688.17
Family 2 | GmALDH2D1 | AK286345 | Aldehyde dehydrogenase | Cytosol | 1245 | 414 | 44,764.57
Family 3 | GmALDH3H2 | BT098493 | Aldehyde dehydrogenase | Cytosol | 990 | 329 | 35,948.57
Family 7 | GmALDH7A1 | BT093713 | NAD+-dependent α-aminoadipic semialdehyde dehydrogenase | Chloroplast | 1056 | 351 | 38,185.02
Family 7 | GmALDH7B6 | AY250704 | NAD+-dependent α-aminoadipic semialdehyde dehydrogenase | Chloroplast | 1533 | 510 | 54,668.57
Family 7 | GmALDH7B7 | AK246010 | NAD+-dependent α-aminoadipic semialdehyde dehydrogenase | Chloroplast | 1590 | 529 | 56,718.71
Family 7 | GmALDH7B8 | AK245949 | NAD+-dependent α-aminoadipic semialdehyde dehydrogenase | Secretory pathway/endomembrane compartment (ER) | 1587 | 528 | 56,451.41
Family 10 | GmALDH10A4 | AK245037 | Betaine-aldehyde dehydrogenase | Peroxisome | 1548 | 515 | 56,121.16
Family 10 | GmALDH10A5 | AK243907 | Betaine-aldehyde dehydrogenase | Peroxisome | 1512 | 503 | 54,610.97
Family 10 | GmALDH10A6 | HM063944 | Betaine-aldehyde dehydrogenase | Peroxisome | 1512 | 503 | 54,669.00
Family 10 | GmALDH10A7 | HM063941 | Betaine-aldehyde dehydrogenase | Peroxisome | 1512 | 503 | 54,739.84
Family 10 | GmALDH10A9 | AK244648 | Betaine-aldehyde dehydrogenase | Peroxisome | 1542 | 513 | 55,762.07
Family 10 | GmALDH10B1 | HM063940 | Betaine-aldehyde dehydrogenase | Peroxisome | 936 | 311 | 33,119.41
Family 11 | GmALDH11A3 | AK286865 | NADH-dependent non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase | Cytosol | 1494 | 497 | 53,171.60
Family 11 | GmALDH11A4 | AK286867 | NADH-dependent non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase | Cytosol | 1572 | 523 | 56,164.46
nd = nucleotides; Da = daltons.
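As a consistency check on Table 1, the coding sequence lengths and amino acid counts can be compared with a short sketch. It assumes, as an interpretation not stated in the paper, that each coding sequence length includes the stop codon, so that the residue count equals the length divided by three, minus one. The helper names are hypothetical; the numbers are copied from Table 1.

```python
# (gene name, coding sequence length in nucleotides, number of amino acids) from Table 1
TABLE1 = [
    ("GmALDH2B1", 1617, 538), ("GmALDH2B2", 1620, 539), ("GmALDH2B3", 1623, 540),
    ("GmALDH2C1", 1506, 501), ("GmALDH2D1", 1245, 414), ("GmALDH3H2", 990, 329),
    ("GmALDH7A1", 1056, 351), ("GmALDH7B6", 1533, 510), ("GmALDH7B7", 1590, 529),
    ("GmALDH7B8", 1587, 528), ("GmALDH10A4", 1548, 515), ("GmALDH10A5", 1512, 503),
    ("GmALDH10A6", 1512, 503), ("GmALDH10A7", 1512, 503), ("GmALDH10A9", 1542, 513),
    ("GmALDH10B1", 936, 311), ("GmALDH11A3", 1494, 497), ("GmALDH11A4", 1572, 523),
]


def expected_residues(cds_nt: int) -> int:
    """Residues encoded by a CDS whose length includes one stop codon (assumption)."""
    if cds_nt % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of three")
    return cds_nt // 3 - 1


def inconsistent_rows(rows):
    """Return the gene names whose amino acid count does not match the CDS length."""
    return [name for name, cds_nt, residues in rows if expected_residues(cds_nt) != residues]
```

Under that assumption, `inconsistent_rows(TABLE1)` returns an empty list, so the two columns agree for all 18 entries.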
ALDHs encode methylmalonyl semialdehyde dehydrogenases, responsible for valine degradation into propionyl-CoA (Steele et al., 1992). Moreover, family 12 and 18 ALDH proteins are both involved in proline metabolic pathways that have been shown to be crucial for osmotic stress adjustment in higher plants (Deuschle et al., 2001; Hare and Cress, 1997; Kirch et al., 2004; Kotchoni et al., 2010a). These classes of ALDHs are highly induced by drought stress and have proved to be responsible for
rice stress adaptation through proline metabolism (Gao and Han, 2009; Kotchoni et al., 2010a).
3.3. The core plant ALDH protein families
Genomic data analysis revealed that green plants retained a core ALDH family group composed of ALDH5, ALDH10, ALDH11, ALDH12
Fig. 1. Phylogenetic relationship of soybean and maize ALDHs. Neighbor-joining (NJ) method was used to perform the phylogenetic analysis. The Glycine max (Gm) ALDHs are represented under gray shaded background. Respective families (orange shade) and putative functions of members of represented ALDH families are indicated. Abbreviation: BADH, betaine aldehyde dehydrogenase; GAPN, non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase; MMALDH, methylmalonyl semialdehyde dehydrogenase; P5CDH, Δ1-pyrroline-5-carboxylate dehydrogenase; P5CS, Δ1-pyrroline-5-carboxylate synthase; SSALDH, succinic semialdehyde dehydrogenase.
Fig. 2. Phylogenetic analysis of soybean ALDHs with other well characterized plant ALDHs. Neighbor-joining (NJ) method was used to perform the phylogenetic analysis of G. max (purple), Z. mays (black), O. sativa (red), A. thaliana (blue), P. patens (green), and C. reinhardtii (yellow) deduced ALDH protein sequences. Members of respective ALDH families are depicted in a specific background color. Respective ALDH families are indicated.
and ALDH22 (Wood and Duff, 2009). Members of the ALDH5 gene family are known to encode the GABA shunt enzyme succinic semialdehyde dehydrogenase (SSADH), which is crucial for molecular signaling and maintenance of the carbon–nitrogen balance in plants (Bouché and Fromm, 2004; Wood and Duff, 2009). While all previously sequenced green plant genomes (Arabidopsis, maize, rice, moss, algae) contain class 5 ALDH proteins (Jimenez-Lopez et al., 2010; Kirch et al., 2004; Kotchoni et al., 2010a; Wood and Duff, 2009), the soybean genome lacks class 5 ALDHs encoding SSADHs (Tables 1 and 2). This suggests that GABA metabolism in plants is under flexible genetic control and that, in soybean, this metabolic pathway does not involve ALDH5s but rather other divergent genes that may have functionally replaced family 5 ALDHs over several million years of evolution. On the other hand, the G. max and P. patens genomes contain multiple ALDH11 genes encoding the non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase (GAPN; EC 1.2.1.9), known to catalyze the irreversible oxidation of glyceraldehyde-3-phosphate into 3-phosphoglycerate (Kirch et al., 2004). ALDH11 is one of the core
plant ALDH genes involved in the GAPN “glycolytic shunt”. Unlike the Z. mays, O. sativa, A. thaliana and algal genomes, each of which contains a single ALDH11 gene, the G. max and P. patens genomes contain multiple ALDH11 genes, suggesting that the GAPN “glycolytic shunt” is a robust and active biochemical pathway in soybean and mosses (Table 1, Figs. 1 and 2). ALDH10 genes, which encode betaine aldehyde dehydrogenase (BADH; EC 1.2.1.8), the enzyme that catalyzes the oxidation of betaine aldehyde to the compatible solute glycine betaine (Weretilnyk and Hanson, 1990), are members of the most important ALDH gene family in soybean. Compared to other plant species, the proliferation of ALDH10 genes in soybean (six copies; Table 1) suggests that the betaine aldehyde metabolic pathway is crucial for soybean development and environmental adaptability. However, the soybean genome lacks family 12 and 18 ALDH genes. ALDH12 and ALDH18 genes encode mitochondrial Δ1-pyrroline-5-carboxylate dehydrogenase (P5CDH; EC 1.5.1.12) and Δ1-pyrroline-5-carboxylate synthetase, respectively. These enzymes are crucial in proline metabolism (Deuschle et al., 2001; Igarashi et al., 1997), which mediates salt stress tolerance in higher plants. In addition,
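Table 2 Comparative identification of the ALDH gene families in fully sequenced plant genomes.
ALDH family | G. max | O. sativa | P. patens | A. thaliana | C. reinhardtii | Z. mays
1 | − | − | − | − | − | −
2 | + | + | + | + | + | +
3 | + | + | + | + | − | +
4 | − | − | − | − | − | −
5 | − | + | + | + | + | +
6 | − | + | + | + | + | +
7 | + | + | + | + | − | +
8 | − | − | − | − | − | −
9 | − | − | − | − | − | −
10 | + | + | + | + | + | +
11 | + | + | + | + | + | +
12 | − | + | + | + | + | +
13 | − | − | − | − | − | −
14 | − | − | − | − | − | −
15 | − | − | − | − | − | −
16 | − | − | − | − | − | −
17 | − | − | − | − | − | −
18 | − | + | − | − | − | +
19 | − | − | − | − | − | −
20 | − | − | − | − | − | −
21 | − | − | + | − | − | −
22 | − | + | − | + | − | +
23 | − | − | + | − | − | −
24 | − | − | − | − | + | −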
Presence (+) or absence (−) of ALDH gene family is depicted in each indicated organism.
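The comparison in Table 2 can be reproduced with simple set operations. The sketch below encodes only the families marked present (+) for each organism; the dictionary and function names are hypothetical.

```python
# ALDH families marked present (+) in Table 2, per organism.
PRESENT = {
    "G. max": {2, 3, 7, 10, 11},
    "O. sativa": {2, 3, 5, 6, 7, 10, 11, 12, 18, 22},
    "P. patens": {2, 3, 5, 6, 7, 10, 11, 12, 21, 23},
    "A. thaliana": {2, 3, 5, 6, 7, 10, 11, 12, 22},
    "C. reinhardtii": {2, 5, 6, 10, 11, 12, 24},
    "Z. mays": {2, 3, 5, 6, 7, 10, 11, 12, 18, 22},
}


def shared_by_all(presence):
    """Families present in every organism of the table."""
    return set.intersection(*presence.values())


def missing_only_in(target, presence):
    """Families present in every other organism but absent from the target."""
    others = [fams for name, fams in presence.items() if name != target]
    return set.intersection(*others) - presence[target]


def absent_but_found_elsewhere(target, presence):
    """Families absent from the target but present in at least one other organism."""
    others = [fams for name, fams in presence.items() if name != target]
    return set.union(*others) - presence[target]
```

For soybean, `missing_only_in("G. max", PRESENT)` yields families 5, 6 and 12, the three core families the text reports as missing from the soybean genome, while families 2, 10 and 11 are the only ones shared by all six organisms.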
ALDH22 gene family encoding a novel aldehyde dehydrogenase (Jimenez-Lopez et al., 2010; Kirch et al., 2004; Kotchoni et al., 2010a; Stiti et al., 2011; Wood and Duff, 2009) is missing in the soybean genome. However, the striking proliferation of ALDH2 (five copies) and ALDH10 (six copies) genes in soybean (Table 1) compared to other plant species (Figs. 1 and 2) suggests that these classes of ALDHs might be crucial for physiological maintenance and environmental adaptability of soybean.
4. Discussion
Despite the important role of ALDHs in environmental stress responses, the functional mechanisms of the stress-inducible ALDH gene families are still elusive. Several stress-inducible ALDH genes, including members of the ALDH3, ALDH7, ALDH10, ALDH11, ALDH12 and ALDH18 gene families (Kirch et al., 2004; Kirch et al., 2005; Kotchoni and Bartels, 2003; Kotchoni et al., 2006; Missihoun et al., 2011; Stiti et al., 2011), have been characterized in a wide range of plant species. The soybean genome has a single ALDH3 gene, ALDH3H2, predicted to localize to the cytosol (Table 1), suggesting that reverse genetics will be an efficient approach to reveal the role of the ALDH3 protein in soybean under environmental stress conditions. The first stress-inducible plant ALDH3 gene, CpALDH3, was identified and characterized in the resurrection plant Craterostigma plantagineum (Kirch et al., 2001), and overexpression of CpALDH3 in transgenic Arabidopsis plants improves tolerance to drought and salt stress (Kotchoni, 2004). CpALDH3 is an ABA- and dehydration-inducible gene involved in the detoxification of aldehydes derived from reactive oxygen-mediated lipid peroxidation (Kirch et al., 2001; Kotchoni et al., 2006). Orthologs of CpALDH3 have subsequently been characterized in maize (Jimenez-Lopez et al., 2010), rice (Kotchoni et al., 2010a), Arabidopsis (Kirch et al., 2004), moss and algae (Wood and Duff, 2009).
The expression of Arabidopsis class 3 ALDHs (AthALDH3) is induced by exogenous ABA application, high salinity, dehydration, and exposure to heavy metals, H2O2 and paraquat, suggesting a possible role of AthALDH3 in the response to oxidative stress (Kirch et al., 2004; Kotchoni et al., 2006; Missihoun et al., 2011; Stiti et al., 2011). The stress-inducible family 7, 10 and 11 ALDHs are represented by multiple gene members in soybean. ALDH7 genes are members of the ‘turgor-responsive’ antiquitin proteins involved in stress-adaptive metabolic pathway(s) (Kirch et al., 2004; Kotchoni et al., 2006; Stiti et al., 2011). They are induced by environmental stresses including dehydration, low temperature, heat shock and high concentrations of ABA (Kotchoni et al., 2006). In addition, the soybean genome encodes six members of the dehydration- and salt stress-inducible ALDH10 (betaine aldehyde dehydrogenase, BADH) protein family (Table 1), which catalyze the oxidation of betaine aldehyde into the compatible solute glycine betaine (Weretilnyk and Hanson, 1990). The high proliferation of members of the ALDH10 gene family suggests that the betaine aldehyde metabolic pathway is crucial in soybean physiology and environmental adaptability. It has been demonstrated that the ability to synthesize and/or accumulate glycine betaine is a ubiquitous adaptation to osmotic stress (Rhodes and Hanson, 1993). This metabolic pathway therefore represents a promising way to improve abiotic stress tolerance in soybean. Indeed, it has recently been demonstrated that recombinant ALDH10A9 protein is able to metabolize betaine aldehyde as well as two aminoaldehydes, 4-aminobutanal and 3-aminopropanal, implying that the ALDH10 protein family might be involved in polyamine metabolism (Missihoun et al., 2011).
Evidence of the important role of ALDH10 genes in the abiotic stress response comes from a simultaneous increase in diamine oxidase activity and in the production of γ-aminobutyric acid (GABA) from 4-aminobutanal in soybean (Xing et al., 2007). This resulted in an increase in the GABA content of roots of soybean grown under salt stress. In plants, GABA is thought to act both as a compatible solute and as a signal molecule (Bouché and Fromm, 2004). The GABA content derived from the activity of aminoaldehyde dehydrogenase (AMADH) proteins was estimated to be about 39% of the total GABA pool in soybean, suggesting that the production of GABA from polyamine catabolism-derived aminoaldehydes under adverse conditions is a general feature shared by both monocots and dicots (Xing et al., 2007). However, whether the multiple members of the soybean ALDH10 gene family function in vivo in this pathway remains to be elucidated. Like the ALDH10 gene family, the ALDH11 gene family has multiple members in the soybean genome (two: GmALDH11A3 and GmALDH11A4). They encode a cytosolic non-phosphorylating glyceraldehyde-3-phosphate dehydrogenase involved in one of the classic glycolytic ‘bypass’ reactions unique to plants (Plaxton, 1996), generating NADPH required for biosynthesis from photosynthetic glyceraldehyde-3-phosphate exported from the chloroplast by the phosphate translocator (Kirch et al., 2004). This glyceraldehyde-3-phosphate dehydrogenase has been established as the main source of NADPH for mannitol biosynthesis in celery (Gao and Loescher, 2000). The main role of the soybean ALDH11 gene family is yet to be fully elucidated. The soybean genome contains five genes (GmALDH2B1, GmALDH2B2, GmALDH2B3, GmALDH2C1 and GmALDH2D1) encoding predicted cytosolic and mitochondrial homotetrameric ALDH proteins (Table 1). Orthologs of family 2 ALDHs have been extensively studied in humans and yeast (Navarro-Aviño et al., 1999; Yoshida et al., 1998). Recently, we identified members of the ALDH2 gene family in rice and maize and used protein modeling and structural characterization to demonstrate for the first time the role of maize ALDH2B2 in the molecular process of male fertility restoration (Jimenez-Lopez et al., 2010; Kotchoni et al., 2010b). Orthologs of the ALDH2 gene family have been characterized in Arabidopsis, rice, mosses and algae, with the exception of O. tauri (Kotchoni et al., 2010a; Skibbe et al., 2002; Wood and Duff, 2009).
In rice, OsALDH2B2 has been found to be responsible for efficient detoxification of acetaldehyde after submergence, suggesting that class 2 ALDHs play a major role in plant ethanol fermentation (Altschul et al., 1990). However, soybean does not require submerged conditions for growth and development. The multiple members of family 2 ALDH proteins in soybean might instead reflect a need to metabolize several other aldehydes in a wide range of cellular compartments. This idea is supported by recent findings showing that members of family 2 ALDHs act on non-identical substrates and do not accumulate in the same tissue at the same time (Hare and Cress, 1997). Soybean and legumes in general are a major part of world agriculture, as they fix atmospheric nitrogen. The complete sequence of the soybean genome has greatly assisted researchers in improving soybean as a food source and provides molecular knowledge for developing soybean into a potentially viable source of biodiesel fuel. In this paper, we have retrieved for the first time the entire set of ALDH gene families in soybean. The soybean genome contains 18 genes belonging to the ALDH gene superfamily, encoding members of five distinct protein families, several of which (families 3, 7 and 10) are associated with stress conditions. The phylogenetic analysis suggests the existence of common regulatory mechanisms of the ALDH gene superfamily across several plant species. It is therefore now possible to use comparative functional genomics between soybean, maize, rice, Arabidopsis, moss and algae, as well as site-directed mutagenesis, to identify and characterize the biochemical pathways associated with each ALDH protein and/or protein family in soybean. Elucidation of ALDH functions will represent an important step towards understanding basic aspects of osmotic stress responses in soybean.
References Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J., 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25, 3389–3402. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J., 1990. Basic local alignment search tool. J. Mol. Biol. 215, 403–410.
Bouché, N., Fromm, H., 2004. GABA in plants: just a metabolite? Trends Plant Sci. 9, 110–115. Derelle, E., Ferraz, C., Rombauts, S., Rouzé, P., Worden, A.Z., Robbens, S., Partensky, F., Degroeve, S., Echeynié, S., Cooke, R., et al., 2006. Genome analysis of the smallest free living eukaryote Ostreococcus tauri unveils many unique features. Proc. Natl. Acad. Sci. U.S.A. 103, 11647–11652. Deuschle, K., Funck, D., Hellmann, H., Däschner, K., Binder, S., Frommer, W.B., 2001. A nuclear gene encoding mitochondrial Δ1-pyrroline-5-carboxylate dehydrogenase and its potential role in protection from proline toxicity. Plant J. 27, 345–355. Finn, R.D., Mistry, J., Tate, J., Coggill, P., Heger, A., Pollington, J.E., Gavin, O.L., Gunasekaran, P., Ceric, G., Forslund, K., et al., 2010. The Pfam protein families database. Nucleic Acids Res. 38, D211–D222. Gao, C., Han, B., 2009. Evolutionary and expression study of the aldehyde dehydrogenase (ALDH) gene superfamily in rice (Oryza sativa). Gene 431, 86–94. Gao, Z., Loescher, W.H., 2000. NADPH supply and mannitol biosynthesis. Characterization, cloning, and regulation of the nonreversible glyceraldehyde-3-phosphate dehydrogenase in celery leaves. Plant Physiol. 124, 321–330. Goff, S.A., Ricke, D., Lan, T.-H., Presting, G., Wang, R., Dunn, M., Glazebrook, J., Sessions, A., Oeller, P., Varma, H., et al., 2002. A draft sequence of the rice genome (Oryza sativa L. ssp. japonica). Science 296, 92–100. Hare, P.D., Cress, W.A., 1997. Metabolic implications of stress induced proline accumulation in plants. Plant Growth Regul. 21, 79–102. Igarashi, Y., Yoshiba, Y., Sanada, Y., Wada, K., Yamaguchi-Shinozaki, K., Shinozaki, K., 1997. Characterization of the gene for Δ1-pyrroline-5-carboxylate synthetase and correlation between the expression of the gene and salt tolerance in Oryza sativa L. Plant Mol. Biol. 33, 857–865. Jimenez-Lopez, J.C., Gachomo, E.W., Seufferheld, M.J., Kotchoni, S.O., 2010.
The maize ALDH protein superfamily: linking structural features to functional specificities. BMC Struct. Biol. 10, 43. doi:10.1186/1472-6807-10-43. Kirch, H.-H., Nair, A., Bartels, D., 2001. Novel ABA- and dehydration-inducible aldehyde dehydrogenase genes isolated from the resurrection plant Craterostigma plantagineum and Arabidopsis thaliana. Plant J. 28, 555–567. Kirch, H.-H., Bartels, D., Wei, Y., Schnable, P.S., Wood, A.J., 2004. The ALDH gene superfamily of Arabidopsis. Trends Plant Sci. 9, 371–377. Kirch, H.-H., Schlingensiepen, S., Kotchoni, S., Ramanjulu, S., Bartels, D., 2005. Detailed expression analysis of selected genes of the aldehyde dehydrogenase (ALDH) gene superfamily in Arabidopsis thaliana. Plant Mol. Biol. 57, 315–332. Kotchoni, S.O. 2004. Molecular and physiological characterization of transgenic Arabidopsis plants expressing different aldehyde dehydrogenase (ALDH) genes. Ph.D. thesis dissertation. University of Bonn. Kotchoni, S.O., Jimenez-Lopez, J.C., Gao, D., Edwards, V., Gachomo, E.W., Margam, V.M., Seufferheld, M.J., 2010a. Modeling-dependent protein characterization of the rice aldehyde dehydrogenase (ALDH) superfamily reveals distinct functional and structural features. PLoS One 5, e11516. doi:10.1371/journal.pone.0011516. Kotchoni, S.O., Jimenez-Lopez, J.C., Gachomo, E.W., Seufferheld, M.J., 2010b. A new and unified nomenclature for male fertility restorer (RF) proteins in higher plants. PLoS One 5, e15906. doi:10.1371/journal.pone.0015906. Kotchoni, S.O., Bartels, D., 2003. Water stress induces the up-regulation of a specific set of genes in plants: aldehyde dehydrogenase as an example. Bulg. J. Plant Physiol. Special Issue 2003, 37–51. Kotchoni, S.O., Kuhns, C., Kirch, H.-H., Bartels, D., 2006. Overexpression of different aldehyde dehydrogenase genes in Arabidopsis thaliana confers tolerance to abiotic stress and protects plants against lipid peroxidation and oxidative stress. Plant Cell Environ. 29, 1033–1048. 
Marchler-Bauer, A., Anderson, J.B., Chitsaz, F., Derbyshire, M.K., DeWeese-Scott, C., Fong, J.H., Geer, L.Y., Geer, R.C., Gonzales, N.R., Gwadz, M., et al., 2009. CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res. 37, D205–D210.
Merchant, S.S., Prochnik, S.E., Vallon, O., Harris, E.H., Karpowicz, S.J., Witman, G.B., Terry, A., Salamov, A., Fritz-Laylin, L.K., Maréchal-Drouard, L., Chlamydomonas Annotation Team 2007, JGI Annotation Team 2007, et al., 2007. The Chlamydomonas genome reveals the evolution of key animal and plant functions. Science 318, 245–251.
Missihoun, T.D., Schmitz, J., Klug, R., Kirch, H.-H., Bartels, D., 2011. Betaine aldehyde dehydrogenase genes from Arabidopsis with different sub-cellular localization affect stress responses. Planta 233, 369–382.
Moroldo, M., Paillard, S., Marconi, R., Fabrice, L., Canaguier, A., Cruaud, C., De Berardinis, V., Guichard, C., Brunaud, V., Le Clainche, I., et al., 2008. A physical map of the heterozygous grapevine ‘Cabernet Sauvignon’ allows mapping candidate genes for disease resistance. BMC Plant Biol. 8, 66. doi:10.1186/1471-2229-8-66.
Navarro-Aviño, J.P., Prasad, R., Miralles, V.J., Benito, R.M., Serrano, R., 1999. A proposal for nomenclature of aldehyde dehydrogenases in Saccharomyces cerevisiae and characterization of the stress-inducible ALD2 and ALD3 genes. Yeast 15, 829–842.
Palenik, B., Grimwood, J., Aerts, A., Rouzé, P., Salamov, A., Putnam, N., Dupont, C., Jorgensen, R., Derelle, E., Rombauts, S., et al., 2007. The tiny eukaryote Ostreococcus provides genomic insights into the paradox of plankton speciation. Proc. Natl. Acad. Sci. U.S.A. 104, 7705–7710.
Plaxton, W., 1996. The organization and regulation of plant glycolysis. Annu. Rev. Plant Physiol. Plant Mol. Biol. 47, 185–214.
Rhodes, D., Hanson, A.D., 1993. Quaternary ammonium and tertiary sulphonium compounds in higher plants. Annu. Rev. Plant Physiol. Plant Mol. Biol. 44, 357–384.
Schmutz, J., Cannon, S.B., Schlueter, J., Ma, J., Mitros, T., Nelson, W., Hyten, D.L., Song, Q., Thelen, J.J., Cheng, J., et al., 2010. Genome sequence of the palaeopolyploid soybean. Nature 463, 178–183.
Schnable, P.S., Ware, D., Fulton, R.S., Stein, J.C., Wei, F., Pasternak, S., Liang, C., Zhang, J., Fulton, L., Graves, T.A., et al., 2009. The B73 maize genome: complexity, diversity, and dynamics. Science 326, 1112–1115.
Sigrist, C.J.A., Cerutti, L., de Castro, E., Langendijk-Genevaux, P.S., Bulliard, V., Bairoch, A., Hulo, N., 2010. PROSITE, a protein domain database for functional characterization and annotation. Nucleic Acids Res. 38, D161–D166.
Skibbe, D.S., Liu, F., Wen, T.J., Yandeau, M.D., Cui, X., Cao, J., Simmons, C.R., Schnable, P.S., 2002. Characterization of the aldehyde dehydrogenase gene families of Zea mays and Arabidopsis. Plant Mol. Biol. 48, 751–764.
Steele, M.I., Lorenz, D., Hatter, K., Park, A., Sokatch, J.R., 1992. Characterization of the mmsAB operon of Pseudomonas aeruginosa PAO encoding methylmalonate-semialdehyde dehydrogenase and 3-hydroxyisobutyrate dehydrogenase. J. Biol. Chem. 267, 13585–13592.
Stiti, N., Missihoun, T.D., Kotchoni, S.O., Kirch, H.-H., Bartels, D., 2011. Aldehyde dehydrogenases in Arabidopsis thaliana: biochemical requirements, metabolic pathways, and functional analysis. Frontiers Plant Sci. 2, 65. doi:10.3389/fpls.2011.00065.
Tuskan, G.A., DiFazio, S., Jansson, S., Bohlmann, J., Grigoriev, I., Hellsten, U., Putnam, N., Ralph, S., Rombauts, S., Salamov, A., et al., 2006. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science 313, 1596–1604.
Vasiliou, V., Nebert, D.W., 2005. Analysis and update of the human aldehyde dehydrogenase (ALDH) gene family. Hum. Genomics 2, 138–143.
Weretilnyk, E.A., Hanson, A.D., 1990. Molecular cloning of a plant betaine aldehyde dehydrogenase, an enzyme implicated in adaptation to salinity and drought. Proc. Natl. Acad. Sci. U.S.A. 87, 2745–2749.
Wood, A., Duff, R.J., 2009. The aldehyde dehydrogenase (ALDH) gene superfamily of the moss Physcomitrella patens and the algae Chlamydomonas reinhardtii and Ostreococcus tauri. The Bryologist 112, 1–11.
Xing, S.G., Jun, Y.B., Hau, Z.W., Liang, L.Y., 2007. Higher accumulation of γ-aminobutyric acid induced by salt stress through stimulating the activity of diamine oxidases in Glycine max (L.) Merr. roots. Plant Physiol. Biochem. 45, 560–566.
Yoshida, A., Rzhetsky, A., Hsu, L.C., Chang, C., 1998. Human aldehyde dehydrogenase gene family. Eur. J. Biochem. 251, 549–557.
Yu, J., Hu, S., Wang, J., Wong, G.K.-S., Li, S., Liu, B., Deng, Y., Dai, L., Zhou, Y., Zhang, X., et al., 2002. A draft sequence of the rice genome (Oryza sativa L. ssp. indica). Science 296, 79–92.