Genome-wide analysis of the MYB gene family in physic nut (Jatropha curcas L.)




GENE-40644; No. of pages: 9; 4C: 7 Gene xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Gene journal homepage: www.elsevier.com/locate/gene

Research paper

Genome-wide analysis of the MYB gene family in physic nut (Jatropha curcas L.)

Changpin Zhou a,b,c, Yanbo Chen a,c, Zhenying Wu a,c, Wenjia Lu a,c, Jinli Han a,c, Pingzhi Wu a, Yaping Chen a, Meiru Li a, Huawu Jiang a, Guojiang Wu a,⁎

a Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou 510650, PR China
b Research Institute of Tropical Forestry, Chinese Academy of Forestry, Guangzhou 510520, PR China
c University of Chinese Academy of Sciences, Beijing 100049, PR China

Article info

Article history:
Received 12 January 2015
Received in revised form 2 April 2015
Accepted 29 June 2015
Available online xxxx

Keywords: MYB gene family; Gene evolution; Gene expression; Abiotic stress; Physic nut (Jatropha curcas L.)

Abstract

The MYB proteins comprise one of the largest transcription factor families in plants and play key roles in regulatory networks controlling development, metabolism, and stress responses. A total of 125 MYB genes (JcMYB) were identified in the physic nut (Jatropha curcas L.) genome, including 120 2R-type MYB, 4 3R-MYB, and 1 4R-MYB genes. Based on the exon–intron arrangement of MYB genes from both lower (Physcomitrella patens) and higher (physic nut, Arabidopsis, and rice) plants, plant MYB genes can be classified into ten groups (MI–X), of which MIX is absent from the higher plants examined. MVIII genes, which comprise both R2R3- and 3R-MYB genes, may be one of the most ancient MYB types. Most MYB genes (76.8% in physic nut) belong to the MI group, which can be divided into 34 subgroups. The JcMYB genes were nonrandomly distributed across the 11 linkage groups (LGs) of physic nut. MYB genes expanded in several subgroups as a result of the genome triplication of ancient dicotyledons and of both ancient and recent tandem duplication events in the physic nut genome. Several MYB duplicates in physic nut showed different expression patterns in four tissues (root, stem, leaf, and seed), and 34 MYB genes responded to at least one abiotic stressor (drought, salinity, phosphate starvation, and nitrogen starvation) in leaves and/or roots, based on analysis of digital gene expression tag data. Overexpression of the JcMYB001 gene in Arabidopsis increased its sensitivity to drought and salinity stresses. © 2015 Elsevier B.V. All rights reserved.

Abbreviations: DBD, DNA-binding domain; EST, expressed sequence tag; LGs, linkage groups; MS, Murashige and Skoog medium; NJ, neighbor-joining; PCR, polymerase chain reaction; RT-PCR, reverse transcriptase polymerase chain reaction; TPM, transcripts per million.
⁎ Corresponding author at: Xingke Road 723, Tianhe District, Guangzhou 510650, PR China. E-mail address: [email protected] (G. Wu).

1. Introduction

The MYB family of transcription factors is widespread in vertebrates, plants, and fungi, with a variety of functions (Dubos et al., 2010). It contains a conserved DNA-binding domain (DBD), the MYB domain, at the N-terminus. This domain generally consists of 1–4 imperfect amino acid sequence repeats (R) of about 52 amino acids, each forming three α-helices (Lipsick, 1996; Dubos et al., 2010). The second and third helices form a helix-turn-helix (HTH) structure when bound to specific promoter sequences (Lipsick, 1996; Stracke et al., 2001; Dubos et al., 2010). The first identified MYB gene was the v-MYB (P01104.2) gene derived from avian myeloblastosis virus (AMV; Rushlow et al., 1982). In plants, the C1 (NP_001106010.1) gene, which controls the biosynthesis of anthocyanin in maize, was the first confirmed MYB gene (Paz-Ares et al., 1987). The MYB gene family is now considered one of the largest transcription factor families in plants. MYB genes have been identified in a number of monocots and dicots, most of which contain more than 100 genes. Plant MYB genes are grouped according to the number of adjacent repeats in the MYB domain of the resulting proteins: 1R-MYB, 2R-type MYB (R2R3-, CCA1-like, and R-R-type MYB), R1R2R3-type MYB (3R-MYB), and 4R-MYB, containing one, two, three, and four imperfect repeats of 51 to 53 residues, respectively (Lipsick, 1996; Chen et al., 2006; Dubos et al., 2010). According to previous studies, MYB superfamily genes can be divided into the MYB family (R2R3-, 3R-, and 4R-MYB) and the MYB-related family (R-R-type, CCA1-like, and 1R-MYB) (Martin and Paz-Ares, 1997; Jiang et al., 2004; Chen et al., 2006; Dubos et al., 2010; Katiyar et al., 2012). The 4R-MYB group is the smallest and contains four R1/R2-like repeats; little is known of its functions in plants (Dubos et al., 2010). The 3R-MYB group is also small, and its constituents function in cell cycle control (Ito, 2005; Haga et al., 2007). The 2R-type MYB genes constitute the largest plant subfamily and are thought to have evolved from the same ancestor as 3R-MYB genes. More than 100 genes from plants such

http://dx.doi.org/10.1016/j.gene.2015.06.072 0378-1119/© 2015 Elsevier B.V. All rights reserved.

Please cite this article as: Zhou, C., et al., Genome-wide analysis of the MYB gene family in physic nut (Jatropha curcas L.), Gene (2015), http:// dx.doi.org/10.1016/j.gene.2015.06.072


C. Zhou et al. / Gene xxx (2015) xxx–xxx

as Arabidopsis (126; Dubos et al., 2010), maize (157; Du et al., 2012a), rice (109; Chen et al., 2006), soybean (244; Du et al., 2012b), eucalyptus (141; Soler et al., 2014) and poplar (192; Wilkins et al., 2009) are allocated to this group. The 2R-type MYB genes play multiple roles in plant growth, including (1) primary and secondary metabolism, (2) cell fate and identity, (3) developmental processes, and (4) responses to biotic and abiotic stresses (Dubos et al., 2010). The 1R-MYB group is heterogeneous, with each protein containing a single and/or a partial MYB repeat. Collectively, 1R-MYB proteins are dubbed “MYB-related” (Dubos et al., 2010). MYB-related proteins from potato, for example, have only one repeat unit, and the REB1 (CAA84992.1) gene from yeast produces proteins with one and a half repeats (Ju et al., 1990; Baranowskij et al., 1994).

Physic nut (Jatropha curcas L.) is a small perennial shrub belonging to the Euphorbiaceae family that is native to the tropical Americas and is now grown commercially in tropical and subtropical areas of Africa and Asia. Owing to its ability to endure drought and adapt easily to barren soil, and to the high oil content of its seeds, physic nut has emerged as a tree-borne source of biofuel (Dhillon et al., 2009; Parthiban et al., 2009, 2011). Following the recent sequencing of its genome and the development of expressed sequence tag (EST) libraries by our group and others (Parani and Natarajan, 2011; Sato et al., 2011; Wu et al., 2015), physic nut is now a useful model for studying the members of different families of transcription factor genes and their evolution.

In this study, we searched the genome sequences of physic nut to identify MYB genes (JcMYB). Subsequently, we characterized the DBD and exon–intron structure of these genes.
The distribution of these genes in the linkage groups (LGs) and a phylogenetic tree combining Arabidopsis and rice MYB proteins were also analyzed to examine evolutionary relationships and the putative functions of physic nut MYB proteins. We analyzed the expression of the JcMYB genes under normal growth conditions and various abiotic stressors. We also analyzed the functions of several abiotic stress response genes by overexpressing them in Arabidopsis, and found that JcMYB001 plays a negative role in tolerance to drought and salinity stress.

2. Materials and methods

2.1. Sequence database searches

Sequences of MYB domain proteins from Arabidopsis and rice were downloaded from the Arabidopsis genome, TAIR 9.0 release (http://www.Arabidopsis.org/) and the plant transcription factor database website (http://plntfdb.bio.uni-potsdam.de/v3.0/), respectively. We searched for MYB genes in the physic nut genome database of the Kazusa DNA Research Institute (http://www.kazusa.or.jp/jatropha/) (Sato et al., 2011) and our own genome database (DDBJ/EMBL/GenBank under the accession number AFEW00000000; Wu et al., 2015). We used Arabidopsis MYB proteins as query sequences for Blastp and tBlastn searches against the physic nut genome sequences and against predicted protein sequences. Sequences for which the E value was less than 1e−10 were selected for further analysis. Next, we corrected errors in the annotation of MYB coding domain sequences on the basis of the physic nut EST database available from GenBank (http://www.ncbi.nlm.nih.gov/), our own physic nut EST dataset (deposited in the NCBI Sequence Read Archive under accession numbers SRX750579–SRX750581), and our Jatropha integerrima EST datasets (accession numbers SRX750578, SRX757230, SRX757232 and SRX757234). The proteins were analyzed with the PROSITE program (http://prosite.expasy.org/) to confirm that the putative genes contain the MYB domain (De Castro et al., 2006).
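The E-value selection step described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes the BLAST hits are available in the standard 12-column tabular layout (E value in the eleventh column), reads the cutoff as 1e-10, and uses invented query and subject identifiers.

```python
# Hypothetical BLAST tabular hits (12 tab-separated columns; E value is column 11).
# All identifiers and numbers below are invented for illustration.
SAMPLE_HITS = (
    "AtMYB_query1\tJc_scaffold_A\t78.5\t104\t20\t1\t12\t115\t5\t108\t3e-45\t160\n"
    "AtMYB_query1\tJc_scaffold_B\t35.0\t60\t35\t2\t20\t79\t10\t69\t2e-04\t38.1\n"
    "AtMYB_query2\tJc_scaffold_A\t81.0\t100\t19\t0\t14\t113\t9\t108\t1e-50\t175\n"
    "AtMYB_query2\tJc_scaffold_C\t55.2\t96\t40\t1\t15\t110\t3\t95\t4e-12\t66.2\n"
)

E_VALUE_CUTOFF = 1e-10  # assumed reading of the threshold stated in the text


def candidate_subjects(tabular_text, cutoff=E_VALUE_CUTOFF):
    """Return the sorted, unique subject IDs that have at least one hit below the cutoff."""
    keep = set()
    for line in tabular_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        subject_id = fields[1]
        e_value = float(fields[10])
        if e_value < cutoff:
            keep.add(subject_id)
    return sorted(keep)


print(candidate_subjects(SAMPLE_HITS))
```

Candidates retained this way would still need the manual annotation correction and PROSITE domain confirmation described in this section.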
The exon–intron structures of JcMYB genes were determined by comparing the coding sequences and their corresponding genomic sequences using the Gene Structure Display Server (GSDS, http://gsds.cbi.pku.edu.cn/). Chromosomal positions of the MYB genes were mapped according to the physic nut linkage map (Wu et al., 2015).

2.2. Phylogenetic tree construction

The conserved MYB domains of JcMYB proteins were obtained using the PROSITE program (http://prosite.expasy.org/), and R2R3-MYB domain clusters were aligned using the multiple sequence alignment software ClustalX (1.83; Thompson et al., 1997). The sequences of all 134 Arabidopsis MYB proteins (1 4R-MYB, 5 3R-MYBs, 1 R2R3-like MYB, 1 3R-like MYB, and 126 R2R3-MYBs) were obtained from Stracke et al. (2001); the corresponding protein sequences were downloaded from The Arabidopsis Information Resource (TAIR) (http://www.Arabidopsis.org/), and the 113 rice sequences (Oryza sativa subsp. Japonica) were obtained from the plant transcription factor database website (http://plntfdb.bio.uni-potsdam.de/v3.0/). The castor bean (Supplementary Table S1), Physcomitrella patens, Eucalyptus grandis, Vitis vinifera, and green algae MYB gene sequences were downloaded from Phytozome (http://www.phytozome.net; Chan et al., 2010; Soler et al., 2014). The tree was constructed from putative full-length MYB amino acid sequences using the neighbor-joining (NJ) method with 1000 bootstrap replicates in ClustalX (1.83), and the results were displayed with Mega software version 4 (Tamura et al., 2007).

2.3. JcMYB001 transformation and gene expression in transgenic plants

Total RNA was extracted from physic nut leaves and the first-strand cDNA was synthesized according to Xiong et al. (2013). The full-length coding sequence of JcMYB001 was amplified with the primer pair JcMYB001aF (5′-GGGGTACCTAAGGACCGTCTCTCTATCTAA-3′) and JcMYB001aR (5′-GCGTCGACTCACCGACTCATTCTCACTT-3′), which incorporate KpnI (GGTACC) and SalI (GTCGAC) restriction sites, respectively. The PCR product was cloned into the pMD 18-T vector (TaKaRa, Otsu, Japan) and subjected to DNA sequencing.
The resulting fragment was digested with KpnI/SalI and inserted into the corresponding restriction sites of the pCAMBIA1301 vector under the control of the CaMV35S promoter. The resulting constructs were transformed into Arabidopsis plants (Col-0 ecotype) by the floral-dipping method (Clough and Bent, 1998). Homozygous lines carrying a single T-DNA insertion were chosen for further analysis. Expression levels of the transgenic JcMYB001 were examined by semi-quantitative RT-PCR, using the gene-specific primer pair JcMYB001bF (5′-TCGGTGAGTGCGGGTTCTGG-3′) and JcMYB001bR (5′-CGAGGTGGGTGGGTCGTTGT-3′), and a cDNA fragment of β-tubulin as a control (β-tubulin-F: GAGCCTTACAACGCTACTCTGTCTGTC, β-tubulin-R: ACACCAGACATAGTAGCAGAAATCAAG).

2.4. Drought and salinity treatment in transgenic Arabidopsis

Transgenic and wild-type seeds were surface-sterilized and incubated in the dark at 4 °C for 2 days. For drought tolerance experiments, the plants were grown in pots containing a 1:1 mixture of vermiculite and peat moss of similar density under a long-day photoperiod (16-h light/8-h dark) at 22 ± 2 °C for 3 weeks; water was then withheld for 13 days. The plants were subsequently watered for 7 days and survival was recorded. For salinity stress experiments, surface-sterilized seeds were planted on Murashige and Skoog (MS) agar plates containing 1% (w/v) sucrose and 1% (w/v) agar. The plates were incubated in the dark at 4 °C for 2 days and then placed under a long-day photoperiod (16-h light/8-h dark) at 22 ± 2 °C. Four days after germination, seedlings were transferred to new vertical MS agar plates supplemented with different concentrations of NaCl. Root length was measured after 10 days of growth. The experiment was repeated 6 times.
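As a quick sanity check on the cloning primers given in Section 2.3, the sketch below looks for the KpnI (GGTACC) and SalI (GTCGAC) recognition sequences in each primer. The primer strings are copied from the text with spacing removed; the helper name `find_sites` is hypothetical and introduced only for this illustration.

```python
# Cloning primers from Section 2.3, written 5'->3' without spaces.
PRIMERS = {
    "JcMYB001aF": "GGGGTACCTAAGGACCGTCTCTCTATCTAA",
    "JcMYB001aR": "GCGTCGACTCACCGACTCATTCTCACTT",
}

# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"KpnI": "GGTACC", "SalI": "GTCGAC"}


def find_sites(primer):
    """Map each enzyme whose site occurs in the primer to its 0-based start index."""
    return {
        enzyme: primer.find(site)
        for enzyme, site in SITES.items()
        if site in primer
    }


for name, sequence in PRIMERS.items():
    print(name, find_sites(sequence))
```

Both recognition sequences are palindromic, so scanning the primer strand alone is sufficient here; each site sits two bases in from the 5′ end of its primer.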


3. Results

3.1. Identification of MYB family genes in physic nut

To identify MYB family genes in physic nut, BLAST searches were carried out against physic nut genomes, including the public database (Sato et al., 2011) and our own protein and genome sequences (AFEW00000000; Wu et al., 2015), using the MYB domains of Arabidopsis and rice as query sequences. The existence of the complete MYB motif in each protein was confirmed using the PROSITE program (http://prosite.expasy.org/; De Castro et al., 2006). A total of 125 MYB genes (JcMYB) were identified from the searches, including 120 2R-type, 4 3R-, and 1 4R-MYB genes. These genes were named JcMYB001 to JcMYB125, and were deposited in the GenBank database (Supplementary Table S2). Predicted JcMYB proteins ranged from 179 (JcMYB008) to 1,125 (JcMYB013) amino acid residues in length (Supplementary Table S2). We observed that 114 of the 120 R2R3-MYB genes have introns in the coding domain sequences: 18 have one intron, 88 have two, 6 have three, 1 has four, and 1 has eleven. Three of the 3R-MYB genes have 6 introns, while one has 10. The 4R-MYB gene has 9 introns (Supplementary Fig. S1).

The exon–intron structures of R2R3 repeats are conserved in rice, grape, and Arabidopsis R2R3 MYB genes, and the intron-containing genes have been classified into 4–6 models (Jiang et al., 2004; Matus et al., 2008; Katiyar et al., 2012). After analysis of the overall exon–intron arrangements, we observed that MYB genes from physic nut could be grouped into nine models (including an intronless group) based on the coding sequences for the structures of the two types of MYB repeats (R2 and R3; Fig. 1). All intron-containing 2R- and 3R-type MYB genes from rice and Arabidopsis can also be classified into the same eight intron-containing groups as those of physic nut after analysis of their gene structures (Fig. 1). Most intron-containing 2R-type MYB genes in physic nut (96 of 114), as well as in rice and Arabidopsis, possess the model I structure.
In model I, exon 1 codes for H1 and H2 of the R2 MYB repeat, exon 2 codes for H3 of the R2 and H1 of the R3 MYB repeat, and exon 3 codes for H2 and H3 of the R3 MYB repeat (model Ia). Model I holds even though the second or the first intron (in models Ib and Ic, respectively) is lost in several genes. The first and second MYB domains are encoded


by two exons in model III, IV, V, and VII genes. The R2 and R3 MYB repeats of JcMYB015 and the 3R genes belong to model VIII, which contains the most introns; the 3R genes have lost the second and fourth introns (Fig. 1). Except for Pp1s206_27V6.1 (MIX), the 2R-type MYB genes in P. patens, a moss species, fell into the nine structure models (Fig. 1). MYB domain structures for more than half of the 2R-type MYB genes in green algae (such as Chlamydomonas reinhardtii) were similar to model VIII at one to three intron positions (Supplementary Fig. S2). The 3R-MYB gene type found in higher plants is absent from C. reinhardtii, but is found in other algae such as Coccomyxa subellipsoidea C-169 (http://www.phytozome.net; Transcript name = 7682), XP_005647035.1 (NCBI) and Micromonas pusilla CCMP1545-3R (http://www.phytozome.net/; Transcript name = 192942). The gene structure in these algae is similar to that of 3R-MYB genes (MVIII) in higher plants. These results suggest that the present MYB genes in both higher and lower plants evolved from several common, ancient MYB genes, and that the MVIII group is likely one of the most ancient MYB types.

The basic MYB domain (three α-helices, H1, H2, and H3) is about 50 amino acid residues long with three regularly spaced tryptophan (Trp) residues (Fig. 1 and Supplementary Fig. S3; Du et al., 2012b). The three regularly spaced Trp residues in the first repeat are conserved in all 2R-type MYB proteins, while the first Trp in the second repeat is only conserved in MIII, IV, VIII, and IX proteins (Fig. 1; Supplementary Fig. S3A). The third Trp in the second repeat of JcMYB010 (MVI), JcMYB014 (MIV, CDC5 gene family), and JcMYB015 (MVIII) was replaced by other amino acids (Supplementary Fig. S3A). The third regularly spaced Trp in the first repeat of the 4R-MYB protein was replaced by phenylalanine (Phe; Supplementary Fig. S3B).

3.2. Phylogenetic analysis of physic nut MYB proteins

To examine the evolutionary relationships of physic nut MYB genes, a phylogenetic tree was constructed with JcMYB proteins and the publicly available MYB proteins from Arabidopsis and rice by using neighbor-joining within the Mega4 program (Supplementary Fig. S4). Another phylogenetic tree was also constructed with physic nut R2R3-MYB proteins and the R2R3-MYB proteins from Arabidopsis, grape and

Fig. 1. Alignment of the amino acids for R2 and R3 repeats of representative 2R-, 3R- and 4R-type MYB genes, indicating the exon–intron structure models. Intron positions, relative to the amino acid residues, are indicated by arrows. Arrows between the coding sequences of two amino acids indicate that splicing occurs just before the second amino acid. Arrows pointed to amino acids indicate that the splicing occurred within the amino acid coding sequence. JcMYB013a and JcMYB013b represent the R1 and R2, and R2 and R3 repeats of the JcMYB013 protein (3R-MYB) respectively. The putative, regularly spaced tryptophan (Trp, W) residues are indicated by black triangles (▲) below the amino acids, while the conserved helices are shown above the sequences. The gene numbers for 4 plant species from each model are listed to the right of the structure models. Model X represents the intronless (IL) genes. Jc = Jatropha curcas. At = Arabidopsis thaliana. Os = Oryza sativa. Pp = Physcomitrella patens.


Fig. 2. A phylogenetic tree of the MYB proteins. A total of 134 proteins from Arabidopsis (At), 125 from physic nut (Jc), and 113 from rice (Os) were used. The full-length amino acid sequences of MYB proteins were aligned using CLUSTAL_X and the phylogenetic tree was constructed using the neighbor-joining method. The bracketed subgroups were previously classified by Soler et al. (2014). Subgroup names are included next to each clade together with a short name to simplify nomenclature. The number of genes of each species for each subgroup is also included. The exon–intron model types of the grouped genes were indicated on lines of branches. The uncompressed tree is available in Supplementary Fig. S4.

eucalyptus (Supplementary Fig. S5). The topologies of the phylogenetic trees are similar to those of previously reported phylogenetic analyses, given the comparisons with genes from rice and Arabidopsis on each clade. According to our results, and based on previous analyses of the grape, eucalyptus, Arabidopsis and rice MYB gene families (Stracke et al., 2001; Dubos et al., 2010; Soler et al., 2014), these MYB proteins could be classified into 45 clades, C1 to C45 (Supplementary Fig. S4, Fig. 2). Proteins of Arabidopsis and rice in 39 of the 42 R2R3 MYB clades


(shown in brackets) were consistent with the previous report (Soler et al., 2014). Eucalyptus lacks clades C12, C18, C22 and C24 (Supplementary Fig. S5; Table S3). AtMYB104 (AT2G26950), which belongs to subgroup 18 (Dubos et al., 2010) and shares the same intron–exon structure (MII) with other subgroup 18 genes, was subgrouped into C35. The 4R-MYB proteins fell into C44 (4R-MYB). The 3R-MYB proteins were grouped in C42 (3R-MYB). JcMYB014, which has the MIV structure, was classified with AtMYBCDC5 (AT1G09770) and LOC_Os04g28090 into C45 (R2R3-like). The intronless genes fell into two subgroups: C36 (JcMYB016) and C39 (JcMYB001-005). Genes from models III (JcMYB017-020), V (JcMYB125), VI (JcMYB008-012), VII (JcMYB006 and 007), and VIII (JcMYB015) fell into clades C41, C37, C40, C38, and C43, respectively. The MI genes were divided into 34 clades, C1 to C34. Of these 34 clades, 23 include genes from all three species, 6 are specific to physic nut, and 1 (C12) does not contain physic nut genes (Fig. 2). The two MYB domains of MIV proteins (CDC5) are of the R1/R2 type (Jiang et al., 2004). The MVIII genes AtMYB088 (AT2G02820) and AtMYB124/FLP (AT1G14350) were previously classified as atypical R2R3 MYB genes, based on their evolutionary origin (Dias et al., 2003). The first regularly spaced Trp residue in the second repeat (R3) was only present in MIII, MIV, and MVIII proteins. The phylogenetic analysis indicated that genes with different exon–intron structure models were classified into distinct groups, and the intronless genes from physic nut and Arabidopsis were placed into two groups.

Next, to test whether the second MYB domain of these proteins is of the R3 type, and to explore the evolutionary relationship of MYB genes, we selected one gene in each group from four species (physic nut, rice, Arabidopsis, and P. patens) and, along with MYB genes from green algae, constructed a phylogenetic tree using the individual MYB repeats.
The R3 (except for the MIV proteins) and R1/R2 domains of these proteins were divided into two groups on the phylogenetic tree (Supplementary Fig. S6). The


R3 domains of MIV proteins were close to the R1 domains of 4R proteins, as previously reported (Jiang et al., 2004). The second MYB domain of MIII and MVIII group proteins was classified into the R3 group. Cre03g197350 and Cre02g103450 were classified into the MIV and MVIII groups, respectively. The MIX and Group II intronless (ILII) proteins were classified into two distinct subgroups, and only existed in P. patens and higher plants, respectively.

3.3. Chromosomal distribution, tandem repeats and duplication

A total of 121 JcMYB genes could be mapped onto the 11 linkage groups (LGs) of physic nut (Wu et al., 2015). These JcMYB genes were nonrandomly distributed on the LGs. LG 5 carried the most JcMYB genes (N = 22, 18.2%), while LG 10 carried the fewest (N = 4, 3.3%; Fig. 3). The existence of segmental and tandem duplication events for MYB genes in Arabidopsis and rice has been reported previously (Cannon et al., 2004; Katiyar et al., 2012). In physic nut, several gene pairs were produced from the genome triplication of ancient dicotyledons (Fig. 3). They are JcMYB043/084 and JcMYB044/085 (A1) on LGs 1 and 8; JcMYB031/096 and JcMYB032/095 (A2) on LGs 2 and 6; JcMYB035/045 and JcMYB033/034/119 (A3) on LGs 2 and 5; JcMYB061/063 and JcMYB062/064/067/068 (A4) on LGs 2 and 11; JcMYB017/018/071/118 and JcMYB019/072/113 (A5) on LGs 3 and 5; and JcMYB002/004/093/106 and JcMYB001/092/105 (A6) on LGs 4 and 7 (Fig. 3). Tandem duplicates, defined as tandem repeats which are located within 50 kb of each other or are separated by <4 non-homologous spacer genes (Cannon et al., 2004), were also observed for the MYB genes in the physic nut genome. About 21% (N = 25) of MYB genes in the physic nut genome are present as tandem repeats at 8 loci on 5 LGs. The largest tandem array (N = 8) was observed on LG 5, and its genes belong to clade C7 (subgroup 6). Each tandem duplicate pair exhibited relatively high

Fig. 3. The distribution of physic nut MYB genes on linkage groups (LGs). The linkage group number is indicated above each LG. The distribution is represented on a centimorgan (cM) scale. The phylogenetic category of each gene is indicated by the subgroup number. Black bars on the LGs indicate the 6 predicted ancient duplication regions (A1–A6). The vertical black lines indicate groups of gene clusters with paralogous and syntenic genes on the LGs (T1–T8).


sequence similarities and was grouped into a clade on the phylogenetic trees. To test whether these tandem duplicates arose from recent duplication events in physic nut, we constructed another unrooted tree using MYB proteins from physic nut and its closely related species, the castor bean (Supplementary Fig. S7). Based on this phylogenetic tree, paralogs of JcMYB097/098 (T1), JcMYB088/089 (T3), and JcMYB064/067/068 (T8) were also observed as tandem repeats in the castor bean genome. Paralogs of JcMYB074/075/076/077/078/121/123/124 (T4) were observed as tandem repeats in the genomes of E. grandis and V. vinifera, while tandem repeats of JcMYB065/066 (T7) paralogs were only present in the E. grandis genome (Supplementary Fig. S5). In contrast, tandem repeats of JcMYB017/018 (T2), JcMYB116/117 (T5), and JcMYB057/058/059/060 (T6) were only observed in the physic nut genome among the tested species. These results indicate that the MYB tandem repeats in physic nut come from both ancient and recent gene duplication events.
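The tandem-duplicate criterion used in this section (family members within 50 kb of each other, or separated by fewer than 4 non-homologous spacer genes) can be sketched as a simple scan along each linkage group. The following is an illustrative sketch only: the gene names and base-pair coordinates are invented, distances are measured between gene start positions, and every non-MYB gene is treated as a non-homologous spacer.

```python
MAX_GAP_BP = 50_000  # "within 50 kb of each other"
MAX_SPACERS = 4      # "separated by <4 non-homologous spacer genes"


def tandem_clusters(genes):
    """Group MYB genes into tandem clusters.

    genes: iterable of (name, linkage_group, start_bp, is_myb) tuples in any order.
    Returns a list of clusters (lists of gene names), each with at least two MYB genes.
    """
    by_lg = {}
    for gene in genes:
        by_lg.setdefault(gene[1], []).append(gene)

    clusters = []
    for lg in sorted(by_lg):
        current, last_pos, spacers = [], None, 0
        for name, _, start, is_myb in sorted(by_lg[lg], key=lambda g: g[2]):
            if not is_myb:
                spacers += 1  # count spacer genes since the previous MYB gene
                continue
            linked = current and (
                start - last_pos <= MAX_GAP_BP or spacers < MAX_SPACERS
            )
            if linked:
                current.append(name)
            else:
                if len(current) > 1:
                    clusters.append(current)
                current = [name]
            last_pos, spacers = start, 0
        if len(current) > 1:
            clusters.append(current)
    return clusters


# Hypothetical example data (names and positions are not from the paper).
EXAMPLE = [
    ("mybA", "LG5", 100_000, True),
    ("mybB", "LG5", 130_000, True),   # 30 kb from mybA: linked by distance
    ("s1", "LG5", 200_000, False),
    ("s2", "LG5", 300_000, False),
    ("s3", "LG5", 400_000, False),
    ("s4", "LG5", 500_000, False),
    ("s5", "LG5", 600_000, False),
    ("mybC", "LG5", 900_000, True),   # far away and 5 spacers: not linked
    ("mybD", "LG2", 10_000, True),
    ("s6", "LG2", 50_000, False),
    ("s7", "LG2", 90_000, False),
    ("mybE", "LG2", 120_000, True),   # 110 kb away but only 2 spacers: linked
]

print(tandem_clusters(EXAMPLE))
```

Because the two conditions are joined by "or", the result depends on having the intervening non-family genes in the input; with MYB genes alone, every adjacent pair would count as linked.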

3.4. Expression profile analysis of the JcMYB genes

We assessed the expression profiles of physic nut MYB genes in the roots, stems, leaves and seeds using Digital Gene Expression (DGE) tag profiling, a next-generation sequencing-based method that allows for spatial transcript analysis. Seeds were collected at the early development (S1) and the filling and maturation (S2) stages (Jiang et al., 2012). Expressed sequence tags (ESTs) for 86 JcMYB genes were detected in the EST database of physic nut. Another 25 JcMYB genes were detected in the EST database of J. integerrima. The expression of a further 6 JcMYB genes (JcMYB020/082/088/089/091/098) without ESTs was observed in the DGE database at low levels (Supplementary Table S3). These results imply that 93.6% (117) of the 125 MYB genes were expressed, according to the present database. Many JcMYB genes were expressed at varying abundance in the tested tissues (Fig. 4). Four genes (JcMYB001/002/014/101) were highly expressed in all tissues tested. Forty-one genes were highly expressed in roots, 19 in stems, 19 in leaves, 24 in S1, and 12 in S2. The expression levels of 13 genes (JcMYB002/004/012/015/023/037/052/056/069/075/083/108/109) in S1 were over 5 times higher than in S2, whereas only JcMYB011 was expressed at a substantially higher level in S2 than in S1 (Fig. 4; Supplementary Table S3).

Several duplicates from genome triplication events in ancient dicotyledons were differently expressed among the tissues tested. For genes on the A2 locus, ESTs and high expression of JcMYB031 were observed in roots and leaves, whereas JcMYB032 was not detected in the tissues tested. JcMYB095 was expressed in roots and stems, while JcMYB096 was highly expressed in roots and seeds. The JcMYB033 gene on locus A3 was highly expressed in roots, while JcMYB035 was highly expressed in seeds.
On locus A4, JcMYB063 was highly expressed in leaves, while JcMYB064/067 showed the highest expression levels in roots. On locus A5, JcMYB113, but not JcMYB118, was expressed in seeds. On locus A6, JcMYB001 was more highly expressed than JcMYB002/004 in S2.

To detect the potential role of JcMYBs in abiotic stress, we evaluated the expression of JcMYB genes in the roots and leaves under drought, nitrogen-deficiency, phosphorus-deficiency, and salinity stresses using our next-generation sequencing-based DGE tag database, and reported the results as fold changes with respect to the controls (Supplementary Table S3). The expression of 34 JcMYB genes was altered, showing at least a 2-fold increase or decrease. Of these, 18 genes responded to a single treatment, while the others responded to more than one treatment (Fig. 4; Supplementary Table S3). The numbers of genes that responded to drought, salinity, nitrogen starvation and phosphate starvation were 20, 16, 14, and 9, respectively.
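The twofold-change screen described above can be illustrated with a short sketch. The TPM values and gene labels below are hypothetical, and the handling of the exact twofold boundary and of zero counts is an assumption rather than the authors' stated procedure.

```python
FOLD_THRESHOLD = 2.0  # genes are called responsive at a twofold change or more (assumed inclusive)


def response(control_tpm, treated_tpm, fold=FOLD_THRESHOLD):
    """Classify one gene under one treatment as 'up', 'down', 'unchanged' or 'undefined'."""
    if control_tpm <= 0 or treated_tpm <= 0:
        return "undefined"  # a ratio is not meaningful without a pseudocount
    ratio = treated_tpm / control_tpm
    if ratio >= fold:
        return "up"
    if ratio <= 1 / fold:
        return "down"
    return "unchanged"


def responsive_treatments(profile):
    """Return the treatments under which a gene changes at least twofold."""
    control = profile["control"]
    return [
        treatment
        for treatment, tpm in profile.items()
        if treatment != "control" and response(control, tpm) in ("up", "down")
    ]


# Hypothetical TPM values for three genes (not taken from Supplementary Table S3).
TPM = {
    "geneA": {"control": 10.0, "drought": 25.0, "salinity": 12.0},
    "geneB": {"control": 40.0, "drought": 15.0, "salinity": 100.0},
    "geneC": {"control": 8.0, "drought": 9.0, "salinity": 7.0},
}

hits = {gene: responsive_treatments(profile) for gene, profile in TPM.items()}
single = [gene for gene, treatments in hits.items() if len(treatments) == 1]
multiple = [gene for gene, treatments in hits.items() if len(treatments) > 1]
print(hits, single, multiple)
```

Splitting the responsive genes into single-treatment and multi-treatment sets mirrors the way the counts are reported in this section.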

Fig. 4. Expression levels of the JcMYB genes in physic nut plants under normal (TPM) ** and abiotic stress conditions. Changes in expression levels that were greater than twofold under drought (D), salinity (S), phosphate starvation (−P) and nitrogen starvation (−N) are indicated. Only genes with expression levels greater than 5 transcripts per million are included. **TPM (number of transcripts per million tags).


3.5. Over-expression of JcMYB001 in Arabidopsis decreases its tolerance to drought and salinity stresses

To investigate the function of physic nut MYB genes, several of the genes whose expression changed in response to drought and salinity stress were over-expressed in Arabidopsis under the control of a CaMV 35S promoter. We observed that the over-expression of JcMYB001 (OeJcMYB001) in Arabidopsis resulted in increased sensitivity to drought and salinity stress. Three OeJcMYB001 Arabidopsis lines (O1, O2, and O3) were selected for abiotic stress treatments. Semi-quantitative RT-PCR was used to measure the expression of JcMYB001 in transgenic Arabidopsis lines (Fig. 5). Under normal growth conditions, no differences were detected between wild-type and transgenic plants in plant size, morphology, or developmental stage. The transgenic plants wilted after 13 days without watering. No transgenic plants survived after post-drought watering, whereas all wild-type plants survived (Fig. 5). The roots of the transgenic seedlings were significantly shorter than those of the wild-type seedlings when cultivated with 100–150 mM NaCl. The transgenic seedlings showed more severe leaf chlorosis than wild-type seedlings when cultivated with 150 mM NaCl (Fig. 6).

4. Discussion

In the present study, we identified a putative full set of MYB genes in the physic nut genome, comprising a total of 120 R2R3- and R2R3-like, 4 3R-, and one 4R-MYB encoding genes. Introns are generally neutral to selection and undergo rapid change during evolution; therefore, high sequence similarity between orthologous introns indicates a functional constraint during evolution (Rogozin et al., 2005). The exon–intron structures of R2R3 repeats are conserved in the R2R3 MYB genes of higher plants, and the intron-containing genes have been classified into 4–6 groups (Jiang et al., 2004; Matus et al., 2008; Katiyar et al., 2012). In our study, we observed that intron-containing 2R-type MYB

Fig. 5. Drought stress tolerance tests of transgenic Arabidopsis lines. (A) The relative transcript levels of JcMYB001 in different transgenic lines (O1, O2 and O3), determined by semi-quantitative RT-PCR. (B) Drought stress tolerance tests. Water was withheld from three-week-old wild-type and transgenic plants for 13 days (upper), followed by recovery for 7 days (lower).

Fig. 6. Salinity stress tolerance tests of transgenic Arabidopsis lines. (A) Salinity stress tolerance tests. Four-day-old seedlings of wild-type and transgenic lines (O1, O2 and O3) were transferred to Murashige and Skoog (MS) medium supplemented with different concentrations of NaCl for 10 days. (B) Root length of the seedling. The data shown are means ± SD from six biological experiments (n > 30). Statistically significant differences were assessed using Student's t-tests (**P < 0.01).
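The kind of comparison summarized in Fig. 6B (means ± SD compared with Student's t-test) can be illustrated with a minimal sketch. The root-length values below are invented for illustration and are not measurements from this study, and the function name `student_t` is hypothetical. The sketch assumes equal variances and reports only the t statistic and degrees of freedom; a P value would additionally require the t distribution (for example through SciPy's `scipy.stats.ttest_ind`, which is not used here).

```python
import math
import statistics


def student_t(sample_a, sample_b):
    """Two-sample Student's t-test with pooled variance.

    Returns (t statistic, degrees of freedom). Equal variances are assumed.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    # statistics.variance() is the sample variance (n - 1 denominator).
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    t = (mean_a - mean_b) / math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return t, df


# Hypothetical root lengths in cm; NOT data from the paper.
wild_type = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]
transgenic = [2.9, 3.1, 2.7, 3.0, 2.8, 3.2]

t_stat, dof = student_t(wild_type, transgenic)
print(f"wild type:  {statistics.mean(wild_type):.2f} ± {statistics.stdev(wild_type):.2f}")
print(f"transgenic: {statistics.mean(transgenic):.2f} ± {statistics.stdev(transgenic):.2f}")
print(f"t = {t_stat:.2f}, df = {dof}")
```

A large positive t with these made-up numbers simply reflects longer wild-type roots; the significance threshold would be read from the t distribution with the reported degrees of freedom.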

genes in land plants could be classified into nine groups (MI to MIX) based on the models of intron–amino acid positions of MYB domains (Fig. 1). The higher plants that were analyzed lost MIX genes as they evolved. Genes in each group clustered together as a single large group on the phylogenetic tree. Intronless genes from physic nut and Arabidopsis were classified into two subgroups, one of which was absent from the P. patens genome (Fig. 2; Supplementary Figs. S4, S6). These results indicate that most MYB subfamilies were conserved across both lower and higher plants. The larger number of MYB genes in higher plants was mainly due to the expansion of MI group genes during their evolution. There are several reasons for classifying MVIII genes as one of the most ancient R2R3 gene types, although they were previously classified as atypical R2R3 genes (Dias et al., 2003). First, based on phylogenetic analysis, the two MYB repeats in MVIII genes belong to the R2 and R3 types, respectively (Supplementary Fig. S4). Second, the R2R3 domain of 3R-MYB proteins in green algae and land plants belongs to this exon–intron structure model (Fig. 1). Third, more than half of the R2R3 genes in green algae share one or more intron positions with this model (Supplementary Fig. S2). If the supposition that 3R MYB genes were derived from R2R3 MYB genes by gaining the R1 repeat through an ancient intragenic duplication (Jiang et al., 2004) is true, the present 3R MYB genes in plants should derive from an ancient MVIII gene. We observed that the Cre03g197350 and Cre02g103450 genes from green algae were close to groups MIV and MVIII, respectively (Supplementary Fig. S6), but their exon–intron structures differed from those of MIV and MVIII in plants (Supplementary Fig. S2). This difference may result from variations in deletion and/or insertion of introns in these genes

Please cite this article as: Zhou, C., et al., Genome-wide analysis of the MYB gene family in physic nut (Jatropha curcas L.), Gene (2015), http://dx.doi.org/10.1016/j.gene.2015.06.072


during their evolution in different groups of organisms.

Six R2R3-MYB clades are specific to the physic nut according to the phylogenetic analysis of MYB proteins from physic nut, Arabidopsis, and rice (Fig. 2; Supplementary Fig. S4). They are classed into the five woody-preferential R2R3-MYB subgroups (Soler et al., 2014) based on the phylogenetic analysis of the R2R3-MYB proteins from physic nut, Arabidopsis, grape, and eucalyptus (Supplementary Fig. S5). These subgroups are totally absent from the Bryophyte, Lycophyte, Monocot, and Brassicaceae lineages, but some are present in Gymnosperms and Eudicots. These results suggest that they are derived from a common ancestor and were lost in some lineages during evolution (Soler et al., 2014).

Most plants have experienced one or more rounds of ancient polyploidy. A whole-genome triplication event may have been experienced by all core eudicots (Jaillon et al., 2007). Gene duplication has occurred throughout plant evolution, thereby contributing to the establishment of new gene functions and underlying the origins of evolutionary novelty (Cannon et al., 2004; Schmutz et al., 2010). Arabidopsis has undergone two recent whole-genome duplications (WGDs; α and β) within the Brassicaceae lineage (Franzke et al., 2011). Bowers et al. (2003) classified Arabidopsis chromosomal duplications into three types, α, β, and γ, based on the relative time of duplication. Of 40 duplicated R2R3-MYB pairs, 32 resulted from α (nine pairs) and β (23 pairs) duplication events, and only two resulted from γ duplication events (Chen et al., 2006). According to genome-wide analyses of the MYB genes in rice, Populus, maize, and soybean, multiple segmental and tandem duplication events played an important role in the elaboration of the MYB gene family (Toledo-Ortiz et al., 2003; Li et al., 2006; Wilkins et al., 2009; Du et al., 2012a, 2012b).
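Tandem duplicates of a gene family are commonly flagged as family members lying close together on the same chromosome or linkage group. The sketch below assumes one simple rule (same linkage group, at most `max_gap` intervening genes between neighbouring members). This rule, the function name `find_tandem_arrays`, the gene names, and the positions are all illustrative assumptions, not the criteria or data used in this study.

```python
from collections import defaultdict


def find_tandem_arrays(genes, max_gap=1):
    """Group family members into putative tandem arrays.

    genes:   iterable of (gene_id, linkage_group, gene_index), where
             gene_index is the ordinal position of the gene along its
             linkage group (counting all genes, not only family members).
    max_gap: maximum number of intervening genes allowed between two
             neighbouring family members (an assumed threshold).
    Returns a list of arrays; each array is a list of >= 2 gene ids.
    """
    by_lg = defaultdict(list)
    for gene_id, lg, index in genes:
        by_lg[lg].append((index, gene_id))

    arrays = []
    for lg in sorted(by_lg):
        members = sorted(by_lg[lg])  # order family members along the linkage group
        current = [members[0]]
        for item in members[1:]:
            intervening = item[0] - current[-1][0] - 1
            if intervening <= max_gap:
                current.append(item)
            else:
                if len(current) > 1:
                    arrays.append([gene for _, gene in current])
                current = [item]
        if len(current) > 1:
            arrays.append([gene for _, gene in current])
    return arrays


# Hypothetical gene positions; NOT taken from the physic nut genome.
family_positions = [
    ("MYB_a", "LG1", 10), ("MYB_b", "LG1", 11), ("MYB_c", "LG1", 13),
    ("MYB_d", "LG1", 40), ("MYB_e", "LG2", 5), ("MYB_f", "LG2", 9),
]
tandem = find_tandem_arrays(family_positions)
print(tandem)
```

Relaxing `max_gap` merges more distant neighbours into arrays, so the number of arrays reported depends directly on this assumed threshold.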
No recent WGDs were observed in the physic nut or castor bean genomes (Chan et al., 2010; Sato et al., 2011). Six potential chromosomal/segmental duplications containing MYB genes were detected in the physic nut genome, indicating an ancient duplication event (Fig. 3). Eight tandem arrays were present, five of which (T1, T3, T4, T7, and T8) were also detected in the genomes of castor bean and/or other plants among the tested species. The results suggest that the expansion of the MYB gene family in physic nut included both ancient and recent tandem duplication (Fig. 3; Supplementary Fig. S7). Additionally, several duplicates from the ancient duplication events (A2, A3, A4, A5, and A6) show divergent expression patterns (Fig. 4; Supplementary Table S3), suggesting the occurrence of subfunctionalization during the evolutionary process. In conclusion, the expansion of MYB genes in the physic nut genome resulted from ancient chromosomal/segmental duplication events as well as ancient and recent tandem duplication events, as observed in other plants.

The subgroups C1 and C7 have more genes in physic nut than in Arabidopsis (Fig. 2), probably due to the tandem duplication of these genes (T7 and T8 for C1 genes, T4 for C7 genes; Fig. 3) in the physic nut genome.

Proteins with high sequence similarity generally have similar functions across different species. After analyzing their expression patterns, we observed that many MYB orthologs in physic nut (Fig. 4; Supplementary Table S3) and Arabidopsis had similar expression patterns in terms of both tissue specificity and abiotic stress responses. For example, JcMYB017, JcMYB019, and JcMYB020 were specifically expressed in seed. These genes were assigned to the MIII subgroup (C41) along with AtMYB115 (AT5G40360) and AtMYB118/PGA37 (AT3G27785). AtMYB115 is thought to play roles in embryogenesis (Wang et al., 2009), and AtMYB118 represses endosperm maturation in Arabidopsis (Barthole et al., 2014).
Several members of C34 (subgroup 20) in Arabidopsis responded to the abiotic stressors salt, dehydration, and phosphate starvation (Feller et al., 2011). In physic nut, JcMYB043, JcMYB044, and JcMYB046 in C34 also responded to these abiotic stresses (Fig. 4; Supplementary Table S3). Thus, the expression data for MYB genes in physic nut presented in this study provide a solid foundation for future functional studies of these genes. Our comprehensive analyses will help with the design of experiments investigating the functional conservation of MYB genes or validating their precise roles in plant development and stress responses.
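How a gene might be called responsive to a stressor from digital gene expression tag counts can be sketched as follows. The normalized values (tags per million), the pseudocount, the twofold threshold, the function names, and the gene labels are all assumptions made for illustration; they are not the criteria or data used in this study.

```python
import math


def log2_fold_change(treated_tpm, control_tpm, pseudocount=1.0):
    """log2 ratio of treated to control expression.

    The pseudocount (an assumed value) avoids division by zero for
    genes with no tags in one of the two samples.
    """
    return math.log2((treated_tpm + pseudocount) / (control_tpm + pseudocount))


def responsive_genes(expression, threshold=1.0):
    """Return genes that respond to at least one stressor.

    expression: {gene: {stress: (control_tpm, treated_tpm)}}
    threshold:  minimum |log2 fold change| (1.0 = twofold; assumed cutoff).
    Returns {gene: [stresses passing the threshold]}.
    """
    result = {}
    for gene, by_stress in expression.items():
        hits = [
            stress
            for stress, (control, treated) in by_stress.items()
            if abs(log2_fold_change(treated, control)) >= threshold
        ]
        if hits:
            result[gene] = hits
    return result


# Hypothetical tags-per-million values; NOT data from the paper.
tag_counts = {
    "gene_A": {"drought": (10.0, 45.0), "salinity": (10.0, 12.0)},
    "gene_B": {"drought": (30.0, 28.0), "salinity": (30.0, 33.0)},
    "gene_C": {"drought": (20.0, 4.0), "salinity": (20.0, 6.0)},
}
responders = responsive_genes(tag_counts)
print(responders)
```

In practice a statistical test on the tag counts would normally accompany a fold-change cutoff; the sketch shows only the "responds to at least one stressor" bookkeeping.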

The R2R3-MYB genes JcMYB001 and JcMYB002, and AtMYB044/AtMYBR1 (AT5G67300), AtMYB070 (AT2G23290), AtMYB073 (AT4G37260), and AtMYB077/AtMYBR2 (AT3G50060) are intronless (MX, IL1; Fig. 1) and were grouped into C39 (subgroup 22; Supplementary Fig. S4). The four Arabidopsis genes are associated with stress responses (Jung, 2008). Transgenic Arabidopsis over-expressing AtMYB044 is more sensitive to abscisic acid (ABA) and has a more rapid ABA-induced stomatal closure response than wild-type and atmyb044 knockout plants. These transgenic plants exhibited a reduced rate of water loss, as measured by the fresh-weight loss of detached shoots, and remarkably enhanced tolerance to drought and salt stress compared to wild-type plants (Jung, 2008). Transgenic soybeans containing AtMYB044 exhibited significantly enhanced drought/salt stress tolerance, as observed in Arabidopsis (Seo et al., 2012). The over-expression of JcMYB002 (named JcMYB1 in that study) in tobacco enhanced the drought and salt stress tolerance of the transgenic tobacco lines (Li et al., 2014). Furthermore, under salt stress, atmyb073 knockout plants exhibited higher survival rates compared to wild-type (Col-0) plants (Kim et al., 2013). In our study, we observed that Arabidopsis plants over-expressing JcMYB001 were more sensitive to drought and salinity stresses than wild-type plants (Figs. 5, 6). These results suggest that these MYB orthologs of physic nut and Arabidopsis play similar roles in plants. How the two proteins diverge in amino acid sequences to produce the opposite effects on plant tolerance for abiotic stressors, especially salt stress, remains to be studied.

5. Conclusions

In conclusion, a total of 125 JcMYB genes were identified in the physic nut genome. The MYB genes were classified into nine groups based on exon–intron structure models of the MYB domain coding sequences, and the MVIII group was one of the most ancient MYB types.
Recent tandem duplication of MYB genes has occurred in the physic nut genome. The transcript abundance of JcMYB genes was analyzed in different tissues under normal growth conditions. Thirty-four JcMYB genes responded to abiotic stressors. Over-expression of the JcMYB001 gene in Arabidopsis increased its sensitivity to drought and salinity stresses. Our study provides a useful reference data set that may serve as the basis for cloning and functional analysis of physic nut MYB genes.

Supplementary data to this article can be found online at http://dx.doi.org/10.1016/j.gene.2015.06.072.

Conflicts of interest

The authors declare that they have no conflicts of interest.

Acknowledgments

This work was supported by grants from the National Basic Research Program of China (2010CB126600), the National Natural Science Foundation of China (31270705 and 31200513), and the Knowledge Innovation Program of the Chinese Academy of Sciences (KSCX2-EW-J-28).

References

Baranowskij, N., Frohberg, C., Prat, S., Willmitzer, L., 1994. A novel DNA binding protein with homology to Myb oncoproteins containing only one repeat can function as a transcriptional activator. EMBO J. 13, 5383–5392.
Barthole, G., et al., 2014. MYB118 represses endosperm maturation in seeds of Arabidopsis. Plant Cell 26, 3519–3537.
Bowers, J.E., Chapman, B.A., Rong, J.K., Paterson, A.H., 2003. Unravelling angiosperm genome evolution by phylogenetic analysis of chromosomal duplication events. Nature 422, 433–438.
Cannon, S.B., Mitra, A., Baumgarten, A., Young, N.D., May, G., 2004. The roles of segmental and tandem gene duplication in the evolution of large gene families in Arabidopsis thaliana. BMC Plant Biol. 4, 10.
Chan, A.P., et al., 2010. Draft genome sequence of the oilseed species Ricinus communis. Nat. Biotechnol. 28, 951–956.


Chen, Y.H., et al., 2006. The MYB transcription factor superfamily of Arabidopsis: expression analysis and phylogenetic comparison with the rice MYB family. Plant Mol. Biol. 60, 107–124.
Clough, S.J., Bent, A.F., 1998. Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J. 16, 735–743.
De Castro, E., et al., 2006. ScanProsite: detection of PROSITE signature matches and ProRule-associated functional and structural residues in proteins. Nucleic Acids Res. 34, W362–W365.
Dhillon, R.S., Hooda, M.S., Jattan, M., Chawla, V., Bhardwaj, M., Goyal, S.C., 2009. Development and molecular characterization of interspecific hybrids of Jatropha curcas × J. integerrima. Indian J. Biotechnol. 8, 384–390.
Dias, A.P., Braun, E.L., McMullen, M.D., Grotewold, E., 2003. Recently duplicated maize R2R3 Myb genes provide evidence for distinct mechanisms of evolutionary divergence after duplication. Plant Physiol. 131, 610–620.
Du, H., Feng, B.R., Yang, S.S., Huang, Y.B., Tang, Y.X., 2012a. The R2R3-MYB transcription factor gene family in maize. PLoS One 7, e37463.
Du, H., et al., 2012b. Genome-wide analysis of the MYB transcription factor superfamily in soybean. BMC Plant Biol. 12, 106.
Dubos, C., Stracke, R., Grotewold, E., Weisshaar, B., Martin, C., Lepiniec, L., 2010. MYB transcription factors in Arabidopsis. Trends Plant Sci. 15, 573–581.
Feller, A., Machemer, K., Braun, E.L., Grotewold, E., 2011. Evolutionary and comparative analysis of MYB and bHLH plant transcription factors. Plant J. 66, 94–116.
Franzke, A., Lysak, M.A., Al-Shehbaz, I.A., Koch, M.A., Mummenhoff, K., 2011. Cabbage family affairs: the evolutionary history of Brassicaceae. Trends Plant Sci. 16, 108–116.
Haga, N., et al., 2007. R1R2R3-Myb proteins positively regulate cytokinesis through activation of KNOLLE transcription in Arabidopsis thaliana. Development 134, 1101–1110.
Ito, M., 2005. Conservation and diversification of three-repeat Myb transcription factors in plants. J. Plant Res. 118, 61–69.
Jaillon, O., et al., 2007. The grapevine genome sequence suggests ancestral hexaploidization in major angiosperm phyla. Nature 449, 463–467.
Jiang, C.Z., Gu, X., Peterson, T., 2004. Identification of conserved gene structures and carboxy-terminal motifs in the Myb gene family of Arabidopsis and Oryza sativa L. ssp indica. Genome Biol. 5, R46.
Jiang, H.W., et al., 2012. Global analysis of gene expression profiles in developing physic nut (Jatropha curcas L.) seeds. PLoS One 7, e36522.
Ju, Q.D., Morrow, B.E., Warner, J.R., 1990. Reb1, a yeast DNA-binding protein with many targets, is essential for cell-growth and bears some resemblance to the oncogene myb. Mol. Cell. Biol. 10, 5226–5234.
Jung, C., 2008. Overexpression of AtMYB44 enhances stomatal closure to confer abiotic stress tolerance in transgenic Arabidopsis. Plant Physiol. 146, 623–635.
Katiyar, A., et al., 2012. Genome-wide classification and expression analysis of MYB transcription factor families in rice and Arabidopsis. BMC Genomics 13, 544.
Kim, J.H., Nguyen, N.H., Jeong, C.Y., Nguyen, N.T., Hong, S.W., Lee, H., 2013. Loss of the R2R3 MYB, AtMyb73, causes hyper-induction of the SOS1 and SOS3 genes in response to high salinity in Arabidopsis. J. Plant Physiol. 170, 1461–1465.
Li, X., et al., 2006. Genome-wide analysis of basic/helix–loop–helix transcription factor family in rice and Arabidopsis. Plant Physiol. 141, 1167–1184.
Li, H.L., Guo, D., Peng, S.Q., 2014. Molecular and functional characterization of the JcMYB1, encoding a putative R2R3-MYB transcription factor in Jatropha curcas. Plant Growth Regul. 1–9.


Lipsick, J.S., 1996. One billion years of Myb. Oncogene 13, 223–235.
Martin, C., Paz-Ares, J., 1997. MYB transcription factors in plants. Trends Genet. 13, 67–73.
Matus, J.T., Aquea, F., Arce-Johnson, P., 2008. Analysis of the grape MYB R2R3 subfamily reveals expanded wine quality-related clades and conserved gene structure organization across Vitis and Arabidopsis genomes. BMC Plant Biol. 8, 83.
Parani, M., Natarajan, P., 2011. De novo assembly and transcriptome analysis of five major tissues of Jatropha curcas L. using GS FLX titanium platform of 454 pyrosequencing. BMC Genomics 12, 191.
Parthiban, K.T., Kumar, R.S., Thiyagarajan, P., Subbulakshmi, V., Vennila, S., Rao, M.G., 2009. Hybrid progenies in Jatropha – a new development. Curr. Sci. India 96, 815–823.
Parthiban, K.T., et al., 2011. Genetic association studies among growth attributes of Jatropha hybrid genetic. Int. J. Plant Breed. Genet. 5, 159–167.
Paz-Ares, J., Ghosal, D., Wienand, U., Peterson, P.A., Saedler, H., 1987. The regulatory c1 locus of Zea mays encodes a protein with homology to myb proto-oncogene products and with structural similarities to transcriptional activators. EMBO J. 6, 3553–3558.
Rogozin, I.B., Sverdlov, A.V., Babenko, V.N., Koonin, E.V., 2005. Analysis of evolution of exon–intron structure of eukaryotic genes. Brief. Bioinform. 6, 118–134.
Rushlow, K.E., et al., 1982. Nucleotide sequence of the transforming gene of avian myeloblastosis virus. Science 216, 1421–1423.
Sato, S., et al., 2011. Sequence analysis of the genome of an oil-bearing tree, Jatropha curcas L. DNA Res. 18, 65–76.
Schmutz, J., et al., 2010. Genome sequence of the palaeopolyploid soybean. Nature 463, 178–183.
Seo, J.S., et al., 2012. Expression of the Arabidopsis AtMYB44 gene confers drought/salt-stress tolerance in transgenic soybean. Mol. Breed. 29, 601–608.
Soler, M., et al., 2014. The Eucalyptus grandis R2R3-MYB transcription factor family: evidence for woody growth-related evolution and function. New Phytol. http://dx.doi.org/10.1111/nph.13039.
Stracke, R., Werber, M., Weisshaar, B., 2001. The R2R3-MYB gene family in Arabidopsis thaliana. Curr. Opin. Plant Biol. 4, 447–456.
Tamura, K., Dudley, J., Nei, M., Kumar, S., 2007. MEGA4: molecular evolutionary genetics analysis (MEGA) software version 4.0. Mol. Biol. Evol. 24, 1596–1599.
Thompson, J.D., Gibson, T.J., Plewniak, F., Jeanmougin, F., Higgins, D.G., 1997. The CLUSTAL_X windows interface: flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 25, 4876–4882.
Toledo-Ortiz, G., Huq, E., Quail, P.H., 2003. The Arabidopsis basic/helix–loop–helix transcription factor family. Plant Cell 15, 1749–1770.
Wang, X.C., et al., 2009. Overexpression of PGA37/MYB118 and MYB115 promotes vegetative-to-embryonic transition in Arabidopsis. Cell Res. 19, 224–235.
Wilkins, O., Nahal, H., Foong, J., Provart, N.J., Campbell, M.M., 2009. Expansion and diversification of the Populus R2R3-MYB family of transcription factors. Plant Physiol. 149, 981–993.
Wu, P., et al., 2015. Integrated genome sequence and linkage map of physic nut (Jatropha curcas L.), a biodiesel plant. Plant J. 81, 810–821.
Xiong, W., et al., 2013. Genome-wide analysis of the WRKY gene family in physic nut (Jatropha curcas L.). Gene 524, 124–132.
