Basic helix-loop-helix gene family: Genome wide identification, phylogeny, and expression in Moso bamboo


Accepted Manuscript

Xinran Cheng, Rui Xiong, Huanlong Liu, Min Wu, Feng Chen, Hanwei Yan, Yan Xiang

PII: S0981-9428(18)30393-0
DOI: 10.1016/j.plaphy.2018.08.036
Reference: PLAPHY 5397
To appear in: Plant Physiology and Biochemistry
Received Date: 9 July 2018
Revised Date: 28 August 2018
Accepted Date: 28 August 2018

Please cite this article as: X. Cheng, R. Xiong, H. Liu, M. Wu, F. Chen, H. Yan, Y. Xiang, Basic helix-loop-helix gene family: Genome wide identification, phylogeny, and expression in Moso bamboo, Plant Physiology and Biochemistry (2018), doi: 10.1016/j.plaphy.2018.08.036. This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

Basic Helix-Loop-Helix Gene Family: Genome Wide Identification, Phylogeny, and Expression in Moso Bamboo

Xinran Cheng1, Rui Xiong1, Huanlong Liu2, Min Wu2, Feng Chen1, Hanwei Yan1,2, Yan Xiang1,2*

1 Laboratory of Modern Biotechnology, School of Forestry and Landscape Architecture, Anhui Agricultural University, Hefei, 230036, China.
2 National Engineering Laboratory of Crop Stress Resistance Breeding, Anhui Agricultural University, Hefei 230036, China.

* Correspondence: Yan Xiang [email protected] Fax number: +86-0551-65786021

Funding

This study was supported by the National Science and Technology Support Program (Grant No. 2015BAD04B0302) and National Natural Science Foundation of China (Grant No. 31670672).

Competing interests

The authors declare that they have no competing interests.

Acknowledgements

We thank the members of the Laboratory of Modern Biotechnology, the National Engineering Laboratory of Crop Stress Resistance Breeding, and the Key Laboratory of Crop Biology of Anhui Province for their assistance in this study. We thank Margaret Biswas, PhD, from Liwen Bianji, Edanz Group China (www.liwenbianji.cn/ac), for editing the English text of a draft of this manuscript.

Author contributions

Xinran Cheng conceived the study, carried out the main bioinformatics analyses, and drafted the manuscript. Rui Xiong carried out the software analyses and helped to construct the figures and tables. Min Wu took part in the experiments and drafting of the manuscript. Huanlong Liu processed the experimental data and helped to draft the manuscript. Feng Chen helped with the software and drafting of the manuscript. Hanwei Yan reviewed the project and helped to revise the manuscript. Yan Xiang conceived and guided the experiments, and helped in coordinating the project and drafting the manuscript. All authors read and approved the final manuscript.

Availability of data and materials

The genome sequences of Moso bamboo, Arabidopsis, and rice were downloaded from the Bamboo Genome Database (http://www.bamboogdb.org/), the Arabidopsis Information Resource (http://www.arabidopsis.org), and the Rice Genome Annotation Project database (http://rice.plantbiology.msu.edu/analyses_search_locus.shtml).

ABSTRACT

Studies have shown that basic helix-loop-helix (bHLH) transcription factors play important roles in plant growth and survival, and in responses to various biotic/abiotic stresses. We identified a total of 448 bHLH genes in Moso bamboo, Arabidopsis, and rice. These genes were classified into 21 bHLH subfamilies, and most genes in a given subfamily had similar gene structures and conserved motifs. We identified 176 homologous pairs in the three species. We calculated Ka, Ks, and Ka/Ks to analyze the duplication relationships among the three species. Multiple sequence analysis revealed that the PebHLH genes had the characteristic bHLH structure. The gene ontology annotation analysis showed that the PebHLH genes had many molecular functions. Promoter cis-element analysis revealed that most of the PebHLH genes contained cis-elements that can respond to various biotic/abiotic stress-related events. The tissue expression patterns of the PebHLH genes indicated that most members were expressed in leaves, roots, and stems. Quantitative real-time PCR analysis showed that 21 selected PebHLH genes were differentially regulated after abscisic acid, drought, and methyl jasmonate treatments. This study has laid the basis for studying the functions of AtbHLH, OsbHLH, and PebHLH genes, and will contribute to future studies of the functions of bHLH genes in other plant species.

Keywords: Phylogenetic analysis, Gene ontology annotation, Promoter analysis, Expression profiles analysis

Abbreviations: bHLH, basic helix-loop-helix; PEG, polyethylene glycol; ABA, abscisic acid; MeJA, methyl jasmonate; qRT-PCR, quantitative real-time PCR; Ks, number of synonymous substitutions per synonymous site; Ka, number of non-synonymous substitutions per non-synonymous site; NJ, neighbor-joining; CDS, coding sequence; bp, base pair; aa, amino acids; MW, molecular weight; pI, isoelectric point; Da, Dalton.

INTRODUCTION

Transcription factors (TFs), also known as trans-acting factors, are a group of protein molecules that bind specifically to particular sequences upstream of the 5′ end of genes to ensure that target genes are expressed at a specific strength, time, and place. TFs usually contain four functional regions, namely a DNA binding domain, transcription regulation domain, nuclear localization signal, and oligomerization site (Yanagisawa 1998; Riechmann et al. 2000; Liu et al. 2006; Amoutzias et al. 2009; Yamasaki et al. 2013; Guo and Wang 2017). More than 60 TF families have been found in plants. According to the numbers of arginine and lysine residues in the DNA binding domain, the TFs can be divided into four categories: zinc-finger, helix-turn-helix, basic leucine zipper (bZIP), and basic helix-loop-helix (bHLH). To date, the most common TF families found in higher plants are the WD40, MYB, WRKY, bZIP, and bHLH families (Kosugi and Ohashi 2002).

The bHLH TFs are widespread in eukaryotes, and are the second largest family after the MYB family in plants (Riechmann et al. 2000; Ledent and Vervoort 2001). The DNA binding domain of bHLH TFs contains about 60 amino acids comprising a basic region of 10–15 amino acids and a helix-loop-helix region of about 40 amino acids at the N-terminus of TF sequences, and is primarily responsible for the binding of TFs to specific DNA sequences (Atchley et al. 1999; Ledent and Vervoort 2001; Heim et al. 2003; Jones 2004; Li et al. 2006). Studies have shown that interactions between the two α-helices of the same bHLH TF or α-helices of different bHLH TFs form homodimers or heterodimers that can target different parts of gene promoters to regulate a target gene (Murre et al. 1989; Nair and Burley 2000; Toledoortiz et al. 2003; Baudry et al. 2006). Unlike the highly conserved bHLH domain, the rest of the TF sequences are usually very different. It has been shown that bHLH TFs mainly bind to E-box sequences (5′-CANNTG-3′) in the promoters of target genes, the most common form being the palindromic G-box (5′-CACGTG-3′). Several conserved amino acids in the basic region of the DNA binding domain determine the specificity of the TF for the core consensus sites of different E-boxes (Massari and Murre 2000; Robinson et al. 2000).

Atchley and Fitch (1997) classified animal bHLH TFs into six groups (A–F) according to their evolutionary relationship and sequence similarity (Atchley and Fitch 1997; David et al. 2007). Heim et al. (2003) showed that the majority of plant bHLH TFs belong to group B. With the rapid development of molecular biology, the diversity of the bHLH TF family has been revealed and plant bHLH TF genes have been identified in species such as Arabidopsis, rice, tomato, Chinese cabbage, Brachypodium distachyon, peanut, and Salvia miltiorrhiza (Toledoortiz et al. 2003; Li et al. 2006; Wang et al. 2015; Zhang et al. 2015; Wu et al. 2016; Chao et al. 2017; Niu et al. 2017).

bHLH TFs not only play important roles in plant growth and secondary metabolism, but also participate in a variety of plant stress responses. The growth and morphogenesis of higher plants is a very complicated process, which is accomplished through the interaction of DNA and proteins. It has been found that the development of the female reproductive tract in flowering plants is coordinated by the SPT and HEC genes (Gremski and Ditta 2007). In Arabidopsis, the regulation of photomorphogenesis was found to be realized mainly by phytochrome-interacting bHLH TFs (Ni et al. 1998). The regulation of higher plant stress responses at the transcriptional level is usually accomplished by a combination of TFs and related genes. Abe et al. (2003) found that, in Arabidopsis, RD22 was induced mainly by drought stress and abscisic acid (ABA). The promoter region of RD22 contains MYC (bHLH) and MYB TF recognition sites, and the corresponding TFs specifically bind to these sites to regulate gene expression (Abe et al. 1997; Abe et al. 2003).

TFs also participate in plant secondary metabolism. The Lc protein encoded by the R gene in maize was the first reported bHLH TF in plants, and it was found to regulate at least two structural genes in the maize anthocyanin metabolic pathway (Ludwig et al. 1989). Goodrich et al. (1992) cloned the Delila gene from snapdragon, which encodes a protein with a helix-loop-helix domain similar to that of the protein encoded by the maize R gene family, and subsequently confirmed that the expression of Delila was closely related to the accumulation of anthocyanins (Goodrich et al. 1992). The bHLH TFs involved in the regulation of anthocyanin synthesis in Arabidopsis belong to subgroup III, which includes the TT8, EGL3, GL3, and MYC1 TFs (Bailey and Weisshaar 2003; Heim et al. 2003). Zimmermann et al. (2004) found that they interact with subgroup VI R2R3-MYB TFs, which include PAP1 (AtMYB75) and PAP2 (AtMYB90) (Zimmermann et al. 2004).

Moso bamboo (Phyllostachys edulis; family Gramineae) is a rare species of evergreen tree-like bamboo that grows rapidly (Peng et al. 2013). It has gradually become widely used, and is now an economically important species in China (Wu et al. 2016a). Moso bamboo is affected by various environmental stresses, including the rapid spread of pests and drought, which can lead to significant economic losses. Increasing the resistance of Moso bamboo to these stresses will help to improve both the quality and quantity of bamboo produced, and this is one of the major aims of researchers and breeders working in this area (Liu et al. 2017; Wang et al. 2017).

In this study, we identified 137 PebHLH genes and conducted bioinformatics analyses, including phylogenetic, gene structure, conserved motif, gene ontology annotation, and promoter cis-regulatory element analyses. In addition, the expression levels of 21 PebHLH genes were measured by qRT-PCR to study their responses to methyl jasmonate (MeJA), abscisic acid (ABA), and drought stress. We found that all 21 genes were stress responsive. The purpose of this study was to determine the responses of members of the PebHLH gene family to various stresses and plant hormone treatments.
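The E-box consensus described in this introduction (5′-CANNTG-3′, with the palindromic G-box 5′-CACGTG-3′ as a special case) can be illustrated with a minimal Python sketch. The function name and the example sequence are hypothetical and are not taken from the study; the sketch scans one strand only.

```python
import re

# E-box consensus CANNTG (N = any base); a lookahead is used so that
# overlapping matches are also reported.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")
G_BOX = "CACGTG"  # palindromic G-box, a special case of the E-box


def find_e_boxes(seq):
    """Return (position, motif, is_g_box) for every E-box on the given strand."""
    seq = seq.upper()
    return [(m.start(), m.group(1), m.group(1) == G_BOX)
            for m in E_BOX.finditer(seq)]


# Hypothetical promoter fragment containing one G-box and one other E-box.
hits = find_e_boxes("ttCACGTGaaCATATGcc")
for position, motif, is_g_box in hits:
    print(position, motif, "G-box" if is_g_box else "E-box")
```

Positions are 0-based offsets into the input string. A real promoter scan would also need to consider the reverse strand for non-palindromic E-boxes.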

MATERIALS AND METHODS

Identification of bHLH family genes in Moso bamboo

We downloaded PebHLH sequences from the Bamboo Genome Database (BambooGDB; http://www.bamboogdb.org/) (Gao et al. 2017; Liu et al. 2017; Wang et al. 2017). Local BLAST searches (E-value 1e-5) were performed using the Hidden Markov Model (HMM) profile in the Pfam database (http://pfam.janelia.org/search/sequence) to screen all candidate PebHLH gene sequences. Candidate genes were retained if they contained known conserved domains and passed checks against the Pfam, SMART (http://smart.embl-heidelberg.de/), and NCBI Conserved Domains (http://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi) (Chen et al. 2015) databases for the presence of the bHLH domain (PF00010). The Arabidopsis and rice bHLH gene sequences were downloaded from the Arabidopsis Information Resource (TAIR) and the Rice Genome Annotation Project websites, respectively. Bioinformatics analyses were performed on the PebHLH, AtbHLH, and OsbHLH sequences, and physical and chemical parameters (e.g., ORF, MW, pI) were calculated using ExPASy (http://www.expasy.ch/tools/pi_tool.html) (He et al. 2012).
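The screening step described above can be sketched as a simple filter over domain-search hits. This is only an illustration under stated assumptions: the hit tuples, gene names, and function name are hypothetical, and the cutoff of 1e-5 is an assumed reading of the E-value setting given in the text. It is not the authors' actual pipeline.

```python
BHLH_PFAM = "PF00010"  # bHLH domain accession given in the text
E_CUTOFF = 1e-5        # assumption: reading of the E-value setting in the text


def retain_candidates(hits, accession=BHLH_PFAM, cutoff=E_CUTOFF):
    """Return sorted gene IDs with at least one hit to `accession`
    whose E-value is at or below `cutoff`."""
    kept = {gene for gene, acc, evalue in hits
            if acc == accession and evalue <= cutoff}
    return sorted(kept)


# Hypothetical (gene, Pfam accession, E-value) hits for illustration only.
example_hits = [
    ("geneA", "PF00010", 3e-12),  # strong bHLH hit: kept
    ("geneB", "PF00010", 2e-3),   # bHLH hit above the cutoff: dropped
    ("geneC", "PF00249", 1e-20),  # strong hit to a different domain: dropped
    ("geneA", "PF00010", 1e-8),   # duplicate gene: reported once
]
candidates = retain_candidates(example_hits)
print(candidates)
```

In practice each retained candidate would still be checked against the SMART and NCBI Conserved Domains databases, as described in the text.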

Phylogenetic and multiple alignment analyses

Phylogenetic trees were built using the neighbor-joining (NJ) method in MEGA 6.0 with 1,000 bootstrap replicates. An individual phylogenetic tree for the PebHLH genes and a comprehensive phylogenetic tree for the bHLH genes from the three species (AtbHLH, OsbHLH, and PebHLH) were constructed using MEGA 6.0 (Tamura et al. 2013; Chu et al. 2016). We also performed multiple sequence alignments of the full-length translated protein sequences of the PebHLH genes using ClustalX 2.11 with the default parameters (Gao et al. 2017; Liu et al. 2017; Wang et al. 2017).

Gene structure, conserved motif and promoter cis-acting regulatory element analysis

We used the Gene Structure Display Server (GSDS) (http://gsds.cbi.pku.edu.cn) (Wu et al. 2016c; Gao et al. 2017; Liu et al. 2017; Wang et al. 2017) to analyze the coding region and genomic DNA sequences of the candidate AtbHLH, OsbHLH, and PebHLH genes to determine their exon/intron structures.

We used the MEME system (http://meme.sdsc.edu/meme/intro.html) to identify conserved motifs, with parameters set as follows: number of repetitions, any; maximum number of motifs, 20; optimal width of the motif, between 6 and 200 residues (Wu et al. 2016c; Liu et al. 2017; Wang et al. 2017).

We downloaded the 2-kb sequences upstream and downstream of the PebHLH genes from BambooGDB and analyzed their promoter regions to detect cis-acting regulatory elements using PlantCARE (http://bioinformatics.psb.ugent.be/webtools/plantcare/html/) (Liu et al. 2009; Wang et al. 2017).

Calculation of synonymous (Ks) and non-synonymous (Ka) substitutions

To identify homologous pairs of genes, the transcript sequences of AtbHLH, OsbHLH, and PebHLH were investigated by BLASTN searches (Altschul et al. 1997). Paralogous pairs within the genome of the same species were defined as follows: the aligned sequences were longer than 300 bp and shared identities ≥40%. Orthologous pairs within the genomes of two or more species were defined as follows: the aligned sequences were longer than 300 bp (Blanc and Wolfe 2004).

Ks and Ka substitutions were calculated for each site in the homologous pair alignments (Wang et al. 2015b). As reported previously, Ks and Ka were calculated as follows. First, the orthologous or paralogous protein sequences were aligned using the ClustalX 2.11 software. Second, the translated protein and original cDNA sequences were compared using MEGA 6.0 (Wang et al. 2017). Third, Ka/Ks was calculated using the DnaSP5 software. Fourth, a sliding window analysis was performed on the Ka/Ks ratio with the window set as 150 bp and the step length set as 9 bp.

Gene ontology (GO) annotation and subcellular localization prediction

The translated PebHLH protein sequences were annotated using the Blast2GO program to assign GO terms (http://amigo.geneontology.org/amigo/term/) (Conesa 2008). The e-value used for the GO analysis was 1.0E-6.

GO terms are provided under three main categories: biological process, cellular component, and molecular function.

The subcellular localization of the PebHLH proteins was predicted using WoLF PSORT (http://www.genscript.com/psort.html) (Horton et al. 2007).

Plant materials, growth conditions, and stress treatments

Moso bamboo seeds were gathered from the Tian Mu Mountain National Nature Reserve in Zhejiang Province, China, and cultivated in a greenhouse (25±2°C, 16-h/8-h light/dark cycle). After three months, the plant material was used in all the following experiments (Gao et al. 2017; Liu et al. 2017; Wang et al. 2017).

Drought, ABA, and MeJA treatments were simulated with 20% PEG-6000 solution (Wu et al. 2015a), 100 µM ABA solution (Wu et al. 2015a), and 100 µM MeJA solution (Zhu et al. 2017), respectively. These solutions were poured into the pots of cultivated Moso bamboo, and the leaves were harvested at 1, 3, 6, 12, and 24 hours. The harvested leaves were placed in RNA protection solution, immediately frozen in liquid nitrogen, and stored at −80°C for RNA isolation. Untreated plants were used as the control groups.

RNA isolation and quantitative real-time PCR (qRT-PCR) analysis

Total RNA was extracted as described previously. cDNA was synthesized using PrimeScript RT Master Mix (Takara, Tokyo, Japan) according to the manufacturer's instructions, and specific primers for the 21 selected PebHLH genes were designed using Primer Premier 5.0. Tonoplast intrinsic protein 41 (TIP41) was used as the reference gene (Fan et al. 2013; Wu et al. 2015b). The qRT-PCR reaction volume was 20 µl, including 10 µl 2× TransStart Tip Green qPCR SuperMix, 0.4 µl Passive Reference Dye, 0.4 µl forward primer, 0.4 µl reverse primer, and 8.8 µl ddH2O. The qRT-PCR reaction procedure was as follows: first step, 94°C for 30 s; second step, 40–45 cycles of 94°C for 5 s, 50–60°C for 15 s, and 72°C for 10 s. Three biological replicates were used for each sample. GraphPad 5 software was used to process the data.

Statistical analysis

Data analyses were performed using Excel and SPSS v10.0, and Student's t-test was used to determine the significance of differences.

RESULTS

bHLH family genes in Moso bamboo, Arabidopsis and rice

The conserved domain containing the bHLH motif (PF00010) has been used previously to identify bHLH TFs in plants such as Arabidopsis (144) and rice (167) (Toledoortiz et al. 2003; Li et al. 2006). Therefore, we used this domain to identify bHLH genes in BambooGDB and then used BLASTP searches to detect bHLH genes in the Moso bamboo genome. After validation, we obtained 137 candidate genes encoding the bHLH domain, which were named PebHLH1 to PebHLH137 according to their physical location on the scaffolds (top to bottom) (Table S1). These genes encode proteins of 85 (PebHLH137) to 1,401 (PebHLH78) amino acids, with an average length of 359 amino acids.

Information about the bHLH genes in the three angiosperm species, including their genetic identifier, location, gene length, and coding region length, is shown in Table S1.

Phylogenetic analysis and calculation of Ka and Ks of the bHLH genes

A phylogenetic tree of the 144, 167, and 137 bHLH genes of Arabidopsis, rice, and Moso bamboo, respectively, was built to better understand the evolutionary relationships among these three species. Based on previous studies of the classification of bHLH genes of rice and Arabidopsis (Toledoortiz et al. 2003), the phylogenetic tree was divided into 21 subfamilies (1–21) (Figure 1). As can be seen from Figure 2A, the bHLH genes of Moso bamboo were distributed in 16 of the subfamilies (all except 1, 7, 13, 14, and 20); subfamily 18 was the largest and subfamily 4 was small. In Figure 2B, most subfamilies include at least two of the species (except subfamilies 1 and 20); subfamily 18 was clearly the largest subfamily among all three species and subfamily 20 was the smallest.

Homologous bHLH genes were identified among the three species and are listed in Table 1. We identified 51, 19, and 45 paralogous pairs in Moso bamboo, rice, and Arabidopsis, respectively, and 46 and 15 orthologous pairs between rice and Moso bamboo, and between rice and Arabidopsis, respectively. No orthologous bHLH genes were detected between Moso bamboo and Arabidopsis. Therefore, the bHLH genes of Moso bamboo and rice were considered to be more closely related than the bHLH genes of Moso bamboo and Arabidopsis.

To better understand the Darwinian evolutionary selection of the bHLH gene family, we calculated Ka/Ks (Table 2) and performed sliding window analyses of the bHLH genes in rice, Arabidopsis, and Moso bamboo (Figures S8–S12). Only 17 of 176 pairs of homologous bHLH genes had Ka/Ks values >1, which indicated that only a few genes had undergone positive selection and adaptive evolution producing new genes. All other genes had undergone negative selection, suggesting that most of the bHLH genes had evolved slowly. These results suggest that the bHLH genes of the three species were shaped mainly by purifying selection during evolution.
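The interpretation of Ka/Ks used above, together with the sliding window settings given in the Methods (150-bp window, 9-bp step), can be sketched in Python as follows. This is a minimal illustration, not the DnaSP5 procedure used in the study: the function names and the Ka and Ks example values are hypothetical, and only the pair counts are taken from the text.

```python
def classify_selection(ka, ks):
    """Interpret a Ka/Ks ratio: >1 positive, <1 purifying (negative), =1 neutral."""
    if ks == 0:
        return "undefined"  # the ratio cannot be computed without synonymous changes
    ratio = ka / ks
    if ratio > 1:
        return "positive"
    if ratio < 1:
        return "purifying"
    return "neutral"


def sliding_windows(length, window=150, step=9):
    """Return (start, end) coordinates of the full windows along an alignment."""
    return [(start, start + window)
            for start in range(0, length - window + 1, step)]


# Homologous pair counts as reported in the text.
pair_counts = {
    "Moso bamboo paralogs": 51,
    "rice paralogs": 19,
    "Arabidopsis paralogs": 45,
    "rice/Moso bamboo orthologs": 46,
    "rice/Arabidopsis orthologs": 15,
}
total_pairs = sum(pair_counts.values())

# Hypothetical Ka and Ks values for two gene pairs.
print(classify_selection(0.12, 0.60))  # ratio below 1
print(classify_selection(0.90, 0.60))  # ratio above 1
print(total_pairs, len(sliding_windows(300)))
```

The pair counts sum to the 176 homologous pairs stated in the text. For a hypothetical 300-bp alignment, the window settings yield 17 full windows, the last one covering positions 144 to 294.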

Gene structure and conserved motifs analysis of bHLH family

As can be seen from Figure S6, the same subfamilies of these three species had similar gene structures. For example, most subfamily 17 members had seven exons, and PebHLH78 had the most exons (14). Most of the bHLH genes contained 1–12 introns and 11.4% of them had no introns (Figure S6). We also analyzed 178 homologous pairs and found that 52.2% of them contained the same number of exons and most of them had a similar genetic structure. In addition, following Long Li's article, we performed an alternative splicing (AS) analysis related to the growth of bamboo shoots. A total of 75 genes in the PebHLH gene family had undergone alternative splicing (Table S7). Statistical analysis showed that AS occurred in members of most subfamilies. However, this event did not occur in only two members of the subfamily.

The MEME analysis of the translated bHLH protein sequences detected 20 conserved motifs each in rice, Arabidopsis, and Moso bamboo (Table S2). As can be seen from Figure S7, the PebHLH proteins contained motifs 1 and 2 (except PebHLH4 and PebHLH137), and motifs 1 and 2 were common in all the AtbHLH proteins (Figure S8). In the OsbHLH proteins, some motifs were found in only one subfamily; for example, motif 10 was found only in subfamily 3 (Figure S9). As expected, most closely related members had a common motif composition, which indicates functional similarity between the bHLH proteins in paralogous pairs or in the same subgroup; for example, subfamily 3 in PebHLH, subfamily 8 in AtbHLH, and subfamily 10 in OsbHLH.

Multiple sequence alignment, gene ontology (GO) annotation, and subcellular localization of the PebHLH proteins

To further study the structural characteristics of the bHLH motifs, we carried out a multiple sequence alignment. We found that the 137 PebHLH proteins contained one basic region, one loop region, and two helix regions (Figure S10), which is consistent with the results of previous studies (including Arabidopsis, rice, Chinese cabbage, and Brachypodium distachyon) (Toledoortiz et al. 2003; Li et al. 2006; Wu et al. 2016; Niu et al. 2017).

The 137 PebHLH genes were assigned a total of 43 GO terms (Figure 3 and Table 3). Among them, 182, 128, and 60 proteins were assigned terms under molecular function, cellular component, and biological process, respectively. Under biological process, only four genes (PebHLH2, -49, -55, and -105) were predicted to be involved in the stress response; all the others were assigned terms associated with regulating plant growth. Under cellular component, most of the genes were assigned to cell part, and only six genes (PebHLH30, -53, -71, -92, -94, and -122) were related to organelles. Under molecular function, most terms were related to binding, and only two terms (GO:0046983 and GO:0001228) were transformative (Table 3).

In addition, we analyzed the GO annotations for each subfamily. Some subfamilies contained genes that shared the same annotations (e.g., subfamily 2), whereas others contained genes with different annotations (e.g., subfamily 18). These results suggested that different genes of the same subfamily may have taken on different roles during evolution (Table 3).

The subcellular localization analysis of the 137 PebHLH proteins predicted they were mostly located in the nucleus (129, 94.2%); only one was located in mitochondria (0.7%), two were located in cytoplasm (1.5%), and five were located in chloroplasts (3.6%) (Figure S11 and Table S3).

Promoter and expression profiles analysis of bHLH genes in Moso bamboo

The promoter region of the PebHLH genes contained cis-elements that were divided into three main categories (Table S4). Category one contained a ubiquitous class of plant light-responsive elements among which G-box, G-Box, and Sp1 were common in the PebHLH promoters. Category two contained plant growth and development response elements; Skn-1_motif, circadian, and ARE were the main elements in the PebHLH promoters. Category three, which was the most important, contained elements that affect the growth, development, and survival of plants, including response elements to biotic stresses (CGTCA-motif, ABRE, and TGACG-motif) and abiotic stresses (MBS and LTR). In the PebHLH promoters, the most common response elements were MeJA (CGTCA-motif and TGACG-motif), ABA (CE1, CE3, motif IIb, and ABRE), and drought (MBS), which numbered 446 (28.63%), 269 (17.27%), and 237 (15.21%), respectively (Figure 4).
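The percentages reported for the three most common classes of response elements can be reproduced with a short Python sketch. The total number of cis-elements is not stated in the text; the value of 1,558 used here is an assumption, back-calculated from the reported counts and percentages.

```python
# Counts reported in the text for the three most common response element classes.
element_counts = {"MeJA": 446, "ABA": 269, "drought": 237}

# Assumption: total number of cis-elements, back-calculated from the
# reported counts and percentages (it is not stated in the text).
TOTAL_ELEMENTS = 1558


def percentages(counts, total):
    """Return each count as a percentage of `total`, rounded to two decimals."""
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


pct = percentages(element_counts, TOTAL_ELEMENTS)
for name, value in pct.items():
    print(f"{name}: {element_counts[name]} ({value}%)")
```

Under this assumed total, the three computed values match the reported 28.63%, 17.27%, and 15.21%.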

We carefully selected 21 PebHLH genes for qRT-PCR assays in six tissues, namely root, stem, leaf, shoot, young leaf, and rhizome (Table S6), and the specific primers used in this analysis are listed in Table S5. The results showed that the expression levels of these genes varied greatly in the six tissues, with their highest expression levels in leaf (Figure 5). Genes in paralogous pairs (e.g., PebHLH1/-101), to a certain extent, had similar expression profiles. However, the expression profiles of two paralogous gene pairs (PebHLH67/-134 and PebHLH91/-110) were fundamentally different in different tissues, suggesting that these genes may have differentiated into different roles.

Examination of PebHLH gene expression after stress treatment by qRT-PCR

The expression patterns of genes in different conditions are often related to their functions. Therefore, on the basis of the tissue expression analysis, we selected 21 representative PebHLH genes and used qRT-PCR with the specific primers listed in Table S5 to measure their expression in RNA extracted from seedling leaves under different biotic/abiotic treatments. The initial qRT-PCR results are shown in Figure 6, and the relative expression levels of the PebHLH genes were calculated.

The expression levels of 13 PebHLH genes were up-regulated under drought (PEG) treatment, and no significant changes in expression were detected for PebHLH67 and PebHLH125 compared with the untreated seedlings. The expression levels of PebHLH45 and PebHLH57 were up-regulated at 0–3 hours, reached a maximum at 3 hours after the PEG treatment, then decreased gradually. The expression levels of 10 genes were up-regulated by >15-fold under prolonged stress, and the expression of the other three genes peaked at >10-fold under short-term stress (Figure 7). Of the six paralogous pairs examined, three (PebHLH28/-29, PebHLH57/-98, and PebHLH1/-101) showed the same expression pattern, whereas the other three pairs showed different patterns. Even within pairs that shared the same pattern, however, the expression of the two members still differed. For the PebHLH57/-98 pair, PebHLH57 was up-regulated at 0–3 hours then its expression decreased at later time points, whereas PebHLH98 was slightly up-regulated at 0–12 hours and peaked at 24 hours (Figure 7).

The expression levels of 16 of the 21 PebHLH genes were significantly up-regulated at different time points after ABA treatment, and the expression levels of the remaining five genes were significantly down-regulated, as can be seen from Figure 7. For example, the expression levels of PebHLH134 were less than half the levels in the untreated controls at all time points after treatment. The expression levels of PebHLH1, PebHLH29, PebHLH71, PebHLH90, PebHLH93, and PebHLH98 were similar to the controls at some time points after ABA treatment, and were significantly up-regulated (>10-fold) at other time points.

The expression levels of 12 of the 21 PebHLH genes were up-regulated at some time points after MeJA treatment (Figure 7), and the expression levels of the remaining nine genes were similar to those of the untreated controls. For seven of the up-regulated genes, the expression levels were highest at 0–6 hours after treatment (PebHLH11, PebHLH13, PebHLH45, PebHLH67, PebHLH93, PebHLH98 and PebHLH130), whereas for the other five up-regulated genes the expression levels were highest at 12–24 hours after treatment (PebHLH29, PebHLH26, PebHLH28, PebHLH91 and PebHLH110). One of these genes (PebHLH11) was strongly up-regulated (>60-fold).
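The fold changes reported above are relative expression levels. The text does not name the method used to calculate them; the Python sketch below assumes the widely used 2^(−ΔΔCt) method, with TIP41 as the reference gene as stated in the Methods. The Ct values and the function name are hypothetical.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^(-ddCt) method (an assumed method, not stated in the text).

    Each target Ct is first normalised to the reference gene (here TIP41),
    then the treated sample is compared with the untreated control.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2 ** -(delta_treated - delta_control)


# Hypothetical Ct values: the target gene reaches threshold four cycles
# earlier in treated leaves, while the reference gene is unchanged.
fold = relative_expression(22.0, 18.0, 26.0, 18.0)
print(fold)
```

With these hypothetical values the result is a 16-fold up-regulation. A fold change below 0.5 would correspond to expression at less than half the control level, as described for PebHLH134 under ABA treatment.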

DISCUSSION

The bHLH family of TFs has been identified in plants such as rice, Brachypodium distachyon, and Arabidopsis (Toledo-Ortiz et al. 2003; Li et al. 2006; Niu et al. 2017), and the relationship between plant growth and biotic and abiotic stresses has also been reported. However, until now, this TF family has not been reported in Moso bamboo. Moreover, although bioinformatics analyses of these TFs have been reported in rice and Arabidopsis, there is still more to be discovered. Therefore, in this study, we conducted evolutionary analysis and qRT-PCR assays of bHLH genes in Moso bamboo, and included Arabidopsis and rice bHLH genes in the bioinformatics analyses.

The evolutionary analysis identified 137 bHLH genes in Moso bamboo (Table S1), which were divided into 16 subfamilies (Figure 2A) based on the reported evolutionary relationships between rice and Arabidopsis, and 51 pairs of paralogous genes were also identified (Table 1). Only three of these pairs had a Ka/Ks value >1, implying that most of the PebHLH genes may have undergone negative selection (Table 2). We also constructed a phylogenetic tree using the 448 bHLH genes of rice, Arabidopsis and Moso bamboo (Table S1) and found that the genes were grouped into 21 subfamilies (Figure 1). We identified 45 and 19 paralogous gene pairs in Arabidopsis and rice, respectively (Table 1), and 46 and 15 orthologous pairs between Moso bamboo and rice, and between rice and Arabidopsis, respectively (Table 1). Our results showed that Moso bamboo was more closely related to rice than to Arabidopsis. Only 14 of the 176 pairs of homologues had Ka/Ks values >1 (Table 2), implying that most of the bHLH genes had undergone purifying selection among these three species. Therefore, theoretically, most PebHLH genes had eliminated deleterious mutations in the population through purifying selection.

Gene structure variation plays an important part in gene evolution. Our results showed that subfamily 13 was special because the bHLH genes in this subfamily had no introns, suggesting these genes may have undergone a number of evolutionary events that led to the reduction in the number of introns. Therefore, the genes in this subfamily may function differently from other bHLH genes. In subfamilies 8 and 19, most of the genes had also lost introns, suggesting that members of these subfamilies may be functionally differentiated. These structural variations coexisted in the bHLH genes of all three species, indicating that these three subfamilies contained genes that tended to lose introns. However, the structure of the majority of the genes was similar and the overall evolution was relatively conservative (Figure S6). The conserved motif analysis showed that most of the conserved motifs in the same subfamily were similar, indicating the function of the encoded proteins in each subfamily was stable. However, some proteins in the same subfamily had structures that were different from those of the majority of proteins, such as PebHLH37/-53/-59/-41 in which motif 18 was missing. The different structure of these proteins suggests they may have different functions from the other proteins in the same subfamily (Figures 3–5).

How cis-elements in the promoters of the bHLH genes respond to the environment will affect their roles in stimulating and regulating gene expression. In this study, we identified many promoter cis-elements that may respond to abiotic and biotic stresses, such as CE1, CE3, CGTCA-motif, TGACG-motif, MBS, and GARE-motif (Figure 4). Cis-elements for two phytohormones (ABA and MeJA) were identified in the PebHLH genes, namely CGTCA-motif and TGACG-motif for MeJA (43.39%), and CE1, CE3, ABRE, and motif-IIb for ABA (26.17%) (Figure 4). These results indicate that these phytohormones may play important roles in the regulation of Moso bamboo growth and development. We also identified a large number of cis-acting elements in the PebHLH genes that may respond to drought (e.g., MBS, 44.72%) (Figure 4), which is consistent with the qRT-PCR results (Figures 10–12). In addition, most PebHLH genes were assigned GO terms under the biological process and molecular function categories (Figure 4). In conclusion, the expression levels of most of the PebHLH genes were induced by the drought, MeJA, and ABA treatments, implying that these genes may play roles in Moso bamboo growth, disease resistance, and response to environmental conditions.

The analysis of tissue expression data related to the development of Moso bamboo provided preliminary organ-specific expression data for the PebHLH gene family, which will lay the foundation for further clarifying the function of these genes in the evolutionary development of Moso bamboo. Five genes were specifically expressed, two (PebHLH125/-134) in shoot and three (PebHLH28/-91/-93) in leaf and, as shown in the phylogenetic tree, these three genes were in subfamily 8, indicating they may have the same function (Figure 5). Therefore, we speculated that PebHLH28/-91/-93 had not yet differentiated in the evolutionary process, but this needs further study to confirm. According to Fairchild et al., Ni et al., and Huq and Quail, the bHLH TFs HFR1, PIF3, and PIF4 play regulatory roles in phytochrome signaling (Ni et al. 1998, 1999; Huq and Quail 2002). The promoter analysis and tissue expression patterns of the PebHLH genes that were highly expressed in leaf suggest that the PebHLH genes may have similar roles.

The expression profiles of the PebHLH genes in different tissues (Figure 5) showed that the highest expression levels were in leaf; therefore, the leaves were used for the qRT-PCR gene expression analysis. Five genes (PebHLH11, PebHLH45, PebHLH71, PebHLH82, and PebHLH98) showed low expression (Figure 5), but were up-regulated after undergoing various stress treatments (MeJA, ABA, and drought). For example, 1 hour after MeJA treatment, the expression of PebHLH11 increased rapidly (>120-fold) (Figures 10 and 11), and PebHLH98 was up-regulated (at least 3-fold) under all three treatments. Although most of the genes were up-regulated in the leaves under the three stress treatments (Figure 5), the expression levels of some of the PebHLH genes were down-regulated. For example, PebHLH1, PebHLH101, and PebHLH122 expression levels were down-regulated by drought treatment; PebHLH1 and PebHLH90 expression levels were down-regulated by MeJA treatment; and PebHLH57 and PebHLH67 expression levels were down-regulated by ABA treatment (Figures 10 and 11). These results indicate that the PebHLH genes may play roles in various stress responses in Moso bamboo.

The properties and functions of proteins are influenced to some extent by the gene structure, and the diversity of the bHLH gene structure in Moso bamboo indicates the functional diversity of the PebHLH TFs. Many studies have shown that bHLH genes are involved in responses to various biotic/abiotic stresses. Kiribuchi et al. and Seo et al. showed that RERJ1 (OsbHLH006) responded to drought stress. We found that PebHLH28, a homologue of RERJ1, was up-regulated more than 10-fold (at 6 hours) under drought stress, which is in agreement with the previous study. PebHLH28 was also up-regulated by the ABA and MeJA treatments. Seo et al. showed that OsbHLH148 responded to MeJA stress, and we found that PebHLH130 (a homologue of OsbHLH148) was also responsive to MeJA stress. Interestingly, PebHLH130 was significantly up-regulated (up to 55-fold) under drought stress. Our qRT-PCR analysis showed that most of the PebHLH genes were induced under drought, ABA, and MeJA treatments, which is consistent with previous studies in rice and other plant species. It has been reported that other bHLH genes such as AtAIB and OsbHLH148 were also induced by ABA and drought stress (Li et al. 2007; Seo et al. 2011), and VaICE1 and MdCIbHLH1 were also associated with freezing tolerance (Feng et al. 2012; Xu et al. 2014).

In summary, we studied the phylogeny, gene structure, homology, and Ka/Ks values of the AtbHLH, OsbHLH, and PebHLH TF genes. We also analyzed the multiple sequence alignments, tissue expression patterns, and promoters, and performed GO annotation analysis, subcellular localization prediction, and qRT-PCR analysis of PebHLH genes and the encoded proteins. According to the promoter analysis and GO annotations, we selected 21 genes for analysis of tissue expression patterns by qRT-PCR, and then extracted RNA from leaves to study the influences of ABA, MeJA, and PEG on the expression of the same 21 genes. The results indicated that the expression profiles of most of the PebHLH genes were affected by these three treatments. The results of this study will provide the basis for further research on the function of the PebHLH gene family in plants.
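The Ka/Ks screening discussed above can be illustrated with a short sketch. It recomputes the ratio for a few paralogous pairs from the Ka and Ks values listed in Table 2 and labels each pair with the conventional reading (Ka/Ks < 1 suggests purifying selection, Ka/Ks > 1 suggests positive selection). The function names and the treatment of pairs with Ks = 0 as undefined are assumptions made for illustration only, not a description of the authors' pipeline.

```python
# (Ka, Ks) values copied from Table 2 for a handful of pairs
PAIRS = {
    "PebHLH1/PebHLH101": (0.06016, 0.10529),
    "PebHLH14/PebHLH121": (0.03227, 0.03221),
    "PebHLH51/PebHLH129": (0.53184, 0.42762),
    "PebHLH67/PebHLH134": (0.04533, 0.18398),
    "OsbHLH81/OsbHLH82": (0.0, 0.0),
}


def ka_ks(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined (assumed handling)."""
    if ks == 0:
        return None
    return ka / ks


def selection_type(ratio):
    """Conventional interpretation of a Ka/Ks ratio."""
    if ratio is None:
        return "undefined"
    if ratio > 1:
        return "positive selection"
    if ratio < 1:
        return "purifying (negative) selection"
    return "neutral"


for pair, (ka, ks) in PAIRS.items():
    ratio = ka_ks(ka, ks)
    shown = "n/a" if ratio is None else f"{ratio:.4f}"
    print(f"{pair}: Ka/Ks = {shown} ({selection_type(ratio)})")
```

Rounded to four decimals, the recomputed ratios agree with the Ka/Ks column of Table 2 for these pairs.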

REFERENCES

Abe H, Urao T, Ito T, Seki M, Shinozaki K, Yamaguchi-Shinozaki K (2003) Arabidopsis AtMYC2 (bHLH) and AtMYB2 (MYB) Function as Transcriptional Activators in Abscisic Acid Signaling. PLANT CELL 15, 63.

Abe H, Yamaguchi-Shinozaki K, Urao T, Iwasaki T, Hosokawa D, Shinozaki K (1997) Role of Arabidopsis MYC and MYB Homologs in Drought- and Abscisic Acid-Regulated Gene Expression. PLANT CELL 9, 1859.

Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. NUCLEIC ACIDS RES 25, 3389.

Amoutzias GD, Veron AS, Robinson-Rechavi M, Bornberg-Bauer E, Oliver SG, Robertson DL (2009) One billion years of bZIP transcription factor evolution: conservation and change in dimerization and DNA-binding site specificity. MOL BIOL EVOL 24, 827-835.

Atchley WR, Fitch WM (1997) A natural classification of the basic helix–loop–helix class of transcription factors. P NATL ACAD SCI USA 94, 5172.

Bailey PC, Weisshaar B (2003) Update on the basic helix-loop-helix transcription factor gene family in Arabidopsis thaliana. PLANT CELL 15, 2497-2501.

Baudry A, Caboche M, Lepiniec L (2006) TT8 controls its own expression in a feedback regulation involving TTG1 and homologous MYB and bHLH factors, allowing a strong and cell-specific accumulation of flavonoids in Arabidopsis thaliana. PLANT J 46, 768-779.

Blanc G, Wolfe KH (2004) Widespread paleopolyploidy in model plant species inferred from age distributions of duplicate genes. PLANT CELL 16, 1667-1678.

Chao G, Sun J, Wang C, Dong Y, Xiao S, Wang X, Jiao Z (2017) Genome-wide analysis of basic/helix-loop-helix gene family in peanut and assessment of its roles in pod development. PLOS ONE 12, e0181843.

Chen Z, Chen X, Yan H, Li W, Li Y, Cai R, Xiang Y (2015) The Lipoxygenase Gene Family in Poplar: Identification, Classification, and Expression in Response to MeJA Treatment. PLOS ONE 10, e0125526.

Chu W, Liu B, Wang Y, Pan F, Chen Z, Yan H, Xiang Y (2016) Genome-wide analysis of poplar VQ gene family and expression profiling under PEG, NaCl, and SA treatments. TREE GENET GENOMES 12, 124.

Conesa A (2008) Blast2GO: A Comprehensive Suite for Functional Analysis in Plant Genomics. INT J PLANT PROD 2008, 619832.

David C, Pierre K, Morgane TC, Gemma R, Valérie L, Elena S, Degnan BM, Michel V (2007) Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC EVOL BIOL 7, 33.

Fan C, Ma J, Guo Q, Li X, Hui W, Lu M (2013) Selection of Reference Genes for Quantitative Real-Time PCR in Bamboo (Phyllostachys edulis). PLOS ONE 8, e56573.

Feng XM, Qiang Z, Zhao LL, Yu Q, Xie XB, Li HF, Yao YX, You CX, Hao YJ (2012) The cold-induced basic helix-loop-helix transcription factor gene MdCIbHLH1 encodes an ICE-like protein in apple. BMC PLANT BIOL 12, 22.

Gao Y, Liu H, Wang Y, Li F, Xiang Y (2017) Genome-wide identification of PHD-finger genes and expression pattern analysis under various treatments in moso bamboo (Phyllostachys edulis). PLANT PHYSIOL BIOCH 123, 378.

Goodrich J, Carpenter R, Coen ES (1992) A common gene regulates pigmentation pattern in diverse plant species. CELL 68, 955.

Gremski K, Ditta G, Yanofsky MF (2007) The HECATE genes regulate female reproductive tract development in Arabidopsis thaliana. DEVELOPMENT 134, 3593-3601.

Guo XJ, Wang JR (2017) Global identification, structural analysis and expression characterization of bHLH transcription factors in wheat. BMC PLANT BIOL 17, 90.

He H, Dong Q, Shao Y, Jiang H, Zhu S, Cheng B, Xiang Y (2012) Genome-wide survey and characterization of the WRKY gene family in Populus trichocarpa. PLANT CELL REP 31, 1199-1217.

Heim MA, Jakoby M, Werber M, Martin C, Weisshaar B, Bailey PC (2003) The basic helix-loop-helix transcription factor family in plants: a genome-wide study of protein structure and functional diversity. MOL BIOL EVOL 20, 735-747.

Horton P, Park KJ, Obayashi T, Fujita N, Harada H, Adams-Collier CJ, Nakai K (2007) WoLF PSORT: protein localization predictor. NUCLEIC ACIDS RES 35, 585-587.

Huq E, Quail PH (2002) PIF4, a phytochrome-interacting bHLH factor, functions as a negative regulator of phytochrome B signaling in Arabidopsis. EMBO J 21, 2441-2450.

Jones S (2004) An overview of the basic helix-loop-helix proteins. GENOME BIOL 5, 226.

Kosugi S, Ohashi Y (2002) DNA binding and dimerization specificity and potential targets for the TCP protein family. PLANT J 30, 337–348.

Ledent V, Vervoort M (2001) The basic helix-loop-helix protein family: comparative genomics and phylogenetic analysis. GENOME RES 11, 754-770.

Li H, Sun J, Xu Y, Jiang H, Wu X, Li C (2007) The bHLH-type transcription factor AtAIB positively regulates ABA response in Arabidopsis. PLANT MOL BIOL 65, 655-665.

Li X, Duan X, Jiang H, Sun Y, Tang Y, Yuan Z, Guo J, Liang W, Chen L, Yin J (2006) Genome-Wide Analysis of Basic/Helix-Loop-Helix Transcription Factor Family in Rice and Arabidopsis. PLANT PHYSIOL 141, 1167.

Liu H, Min W, Zhu D, Feng P, Wang Y, Yue W, Yan X (2017) Genome-Wide analysis of the AAAP gene family in moso bamboo (Phyllostachys edulis). BMC PLANT BIOL 17, 29.

Liu Q, Wang H, Zhang Z, Wu J, Ying F, Zhu Z (2009) Divergence in function and expression of the NOD26-like intrinsic proteins in plants. BMC GENOMICS 10, 313.

Liu W, Wang Y, Guo H, Zhao T, Zhou H, Guo A (2006) A crucial amino acid of regulation of binding activity of DREBP to DRE cis-element in Nicotiana benthamiana. CHINESE JBMP 4.

Ludwig SR, Habera LF, Dellaporta SL, Wessler SR (1989) Lc, a member of the maize R gene family responsible for tissue-specific anthocyanin production, encodes a protein similar to transcriptional activators and contains the myc-homology region. P NATL ACAD SCI USA 86, 7092-7096.

Massari ME, Murre C (2000) Helix-loop-helix proteins: regulators of transcription in eucaryotic organisms. MOL CELL BIOL 20, 429.

Murre C, Mccaw PS, Baltimore D (1989) A new DNA binding and dimerization motif in immunoglobulin enhancer binding, daughterless, MyoD, and myc proteins. CELL 56, 777-783.

Nair SK, Burley SK (2000) Recognizing DNA in the library. NATURE 404, 717-718.

Ni M, Tepperman JM, Quail PH (1998) PIF3, a phytochrome-interacting factor necessary for normal photoinduced signal transduction, is a novel basic helix-loop-helix protein. CELL 95, 657-667.

Ni M, Tepperman JM, Quail PH (1999) Binding of phytochrome B to its nuclear signalling partner PIF3 is reversibly induced by light. NATURE 400, 781-784.

Niu X, Guan Y, Chen S, Li H (2017) Genome-wide analysis of basic helix-loop-helix (bHLH) transcription factors in Brachypodium distachyon. BMC GENOMICS 18, 619.

Peng Z, Lu Y, Li L, Zhao Q, Feng Q, Gao Z, Lu H, Hu T, Yao N, Liu K (2013) The draft genome of the fast-growing non-timber forest species moso bamboo (Phyllostachys heterocycla). NAT GENET 45, 456-461.

Riechmann JL, Heard J, Martin G, Reuber L, Jiang CZ, Keddie J, Adam L, Pineda O, Ratcliffe OJ, Samaha RR (2000) Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes. SCIENCE 290, 2105.

Robinson KA, Koepke JI, Kharodawala M, Lopes JM (2000) A network of yeast basic helix–loop–helix interactions. NUCLEIC ACIDS RES 28, 4460-4466.

Seo JS, Joo J, Kim MJ, Kim YK, Nahm BH, Song SI, Cheong JJ, Lee JS, Kim JK, Choi YD (2011) OsbHLH148, a basic helix-loop-helix protein, interacts with OsJAZ proteins in a jasmonate signaling pathway leading to drought tolerance in rice. PLANT J CELL & MOL BIOL 65, 907.

Tamura K, Stecher G, Peterson D, Filipski A, Kumar S (2013) MEGA6: Molecular Evolutionary Genetics Analysis version 6.0. MOL BIOL EVOL 30, 2725.

Toledo-Ortiz G, Huq E, Quail PH (2003) The Arabidopsis Basic/Helix-Loop-Helix Transcription Factor Family. PLANT CELL 15, 1749-1770.

Wang J, Hu Z, Zhao T, Yang Y, Chen T, Yang M, Yu W, Zhang B (2015a) Genome-wide analysis of bHLH transcription factor and involvement in the infection by yellow leaf curl virus in tomato (Solanum lycopersicum). BMC GENOMICS 16, 39.

Wang Y, Feng L, Zhu Y, Li Y, Yan H, Xiang Y (2015b) Comparative genomic analysis of the WRKY III gene family in populus, grape, arabidopsis and rice. BIOL DIRECT 10, 48.

Wang Y, Liu H, Zhu D, Gao Y, Yan H, Yan X (2017) Genome-wide analysis of VQ motif-containing proteins in Moso bamboo (Phyllostachys edulis). PLANTA 246, 165.

Wu H, Lv H, Li L, Liu J, Mu S, Li X, Gao J (2015a) Genome-Wide Analysis of the AP2/ERF Transcription Factors Family and the Expression Patterns of DREB Genes in Moso Bamboo (Phyllostachys edulis). PLOS ONE 10, e0126657.

Wu M, Li Y, Chen D, Liu H, Zhu D, Xiang Y (2016a) Genome-wide identification and expression analysis of the IQD gene family in moso bamboo (Phyllostachys edulis). SCI REP 6, 24520.

Wu M, Wu S, Chen Z, Dong Q, Yan H, Xiang Y (2015b) Genome-wide survey and expression analysis of the amino acid transporter gene family in poplar. TREE GENET GENOMES 11, 83.

Wu P, Song XM, Wang Z, Duan WK, Hu R, Wang WL, Li Y, Hou X (2016b) Genome-wide analysis of the BES1 transcription factor family in Chinese cabbage (Brassica rapa ssp. pekinensis). PLANT GROWTH REGUL 103, 1-11.

Wu S, Wu M, Dong Q, Jiang H, Cai R, Xiang Y (2016c) Genome-wide identification, classification and expression analysis of the PHD-finger protein family in Populus trichocarpa. GENE 575, 75.

Xu W, Jiao Y, Li R, Zhang N, Xiao D, Ding X, Wang Z (2014) Chinese wild-growing Vitis amurensis ICE1 and ICE2 encode MYC-type bHLH transcription activators that regulate cold tolerance in Arabidopsis. PLOS ONE 9, e102303.

Yamasaki K, Kigawa T, Seki M, Shinozaki K, Yokoyama S (2013) DNA-binding domains of plant-specific transcription factors: structure, function, and evolution. TRENDS PLANT SCI 18, 267-276.

Yanagisawa S (1998) Transcription factors in plants: Physiological functions and regulation of expression. J PLANT RES 111, 363-371.

Zhang X, Luo H, Xu Z, Zhu Y, Ji A, Song J, Chen S (2015) Genome-wide characterisation and analysis of bHLH transcription factors related to tanshinone biosynthesis in Salvia miltiorrhiza. SCI REP 5, 11244.

Zhu D, Chu W, Wang Y, Yan H, Chen Z, Xiang Y (2017) Genome-wide identification, classification and expression analysis of the serine carboxypeptidase-like protein family in poplar. PHYSIOL PLANTARUM.

Zimmermann IM, Heim MA, Weisshaar B, Uhrig JF (2004) Comprehensive identification of Arabidopsis thaliana MYB transcription factors interacting with R/B-like BHLH proteins. PLANT J 40, 22-34.

FIGURE LEGENDS:

Figure 1 Phylogeny of bHLHs from Moso bamboo, rice and Arabidopsis.
The 137 PebHLH genes, 167 OsbHLH genes and 144 AtbHLH genes are clustered into 21 subfamilies. Details of the bHLH genes from Moso bamboo, Arabidopsis and rice are listed in Table S1. The tree was generated with the Clustal X 2.0 software using the neighbor-joining (N-J) method.

Figure 2 Phylogeny of bHLHs from Moso bamboo, rice and Arabidopsis.
A: The 137 PebHLH genes are clustered into 16 subfamilies.
B: Comparison of bHLH family members from Moso bamboo, rice and Arabidopsis. Different colors represent the different species, and the percentage of each subfamily in each species is shown. Black: AtbHLH genes; Red: OsbHLH genes; Blue: PebHLH genes.

Figure 3 Gene Ontology (GO) results in PebHLH proteins.
Different colors represent different functions.

Figure 4 Cis-acting element analysis of the promoter regions of PebHLH genes.
Based on the functional annotation, the cis-acting elements were classified into two major classes: phytohormone responsive and abiotic and biotic stress-related cis-acting elements. A: Percentage of total cis-acting elements in the promoter region of the PebHLH gene. B and C: The number of each cis-acting element as a percentage of its classification.

Figure 5 Expression profiles of PebHLH genes across different tissues and developmental stages as revealed by qRT-PCR.
Heatmap shows the hierarchical clustering of the 21 PebHLH genes among the different tissues. Three biological replicates per tissue.

Figure 6 Expression patterns of all selected genes in Moso bamboo under ABA, Drought and MeJA treatment, as revealed by qRT-PCR.
Heatmap shows the hierarchical clustering of the 21 PebHLH genes among the different tissues. Three biological replicates per tissue. The abbreviation represents specific stresses: ABA (A); Drought (D); MeJA (M).

Figure 7 Expression patterns of all selected genes in Moso bamboo under ABA, Drought and MeJA treatment, as revealed by qRT-PCR.
Y-axis: relative expression levels; X-axis: the time course of stress treatments; Error bars, 6±SE.

SUPPORTING INFORMATION

Figure S1 Sliding window plots of the bHLH genes in Moso bamboo.

Figure S2 Sliding window plots of the bHLH genes in Arabidopsis.

Figure S3 Sliding window plots of the bHLH genes in Rice.

Figure S4 Sliding window plots of the bHLH genes in Moso bamboo and Rice.

Figure S5 Sliding window plots of the bHLH genes in Arabidopsis and Rice.

Figure S6 Phylogenetic relationships and gene structures of bHLH genes in three species.
Left: The phylogenetic tree of bHLHs was constructed with the NJ method based on the results of sequence alignment. Each subfamily was framed together.
Right: Exons, introns and untranslated regions (UTRs) are indicated by yellow rectangles, gray lines and blue rectangles, respectively.

Figure S7 Schematic representation of 20 conserved motifs in the PebHLH genes.
The conserved motifs in the bHLH genes were identified using the online MEME program. Different colored boxes represent different specific motifs. The length of each box in the figure does not represent the actual motif size.

Figure S8 Schematic representation of 20 conserved motifs in the AtbHLH genes.
The conserved motifs in the bHLH genes were identified using the online MEME program. Different colored boxes represent different specific motifs. The length of each box in the figure does not represent the actual motif size.

Figure S9 Schematic representation of 20 conserved motifs in the OsbHLH genes.
The conserved motifs in the bHLH genes were identified using the online MEME program. Different colored boxes represent different specific motifs. The length of each box in the figure does not represent the actual motif size.

Figure S10 Multiple sequence alignment of bHLH proteins in Moso bamboo.
Sequences were aligned using DNAMAN software. The Basic-Helix-Loop-Helix motif is clearly highly conserved.

Figure S11 Subcellular localization of the bHLH genes in Moso bamboo.
Different colors represent different positions.

Table S1 Detailed information about the bHLHs in the Moso bamboo, Arabidopsis and rice genome

Table S2 Conserved motifs detected by MEME in the bHLH proteins of Moso bamboo, Arabidopsis, and rice

Table S3 Subcellular localization of bHLH proteins in Moso bamboo predicted by WOLF PSORT

Table S4 Promoter analysis of bHLH proteins in Moso bamboo

Table S5 Primers used for the qRT-PCR analysis of PebHLH gene expression

Table S6 Expression data of 21 selected PebHLH genes in six tissues of Moso bamboo

Table 1 Paralogous (Pe-Pe, Os-Os and At-At) and orthologous (Pe-Os and At-Os) gene pairs

Pe-Pe: PebHLH1/PebHLH101, PebHLH2/PebHLH49, PebHLH3/PebHLH74, PebHLH4/PebHLH60, PebHLH5/PebHLH58, PebHLH6/PebHLH123, PebHLH7/PebHLH75, PebHLH9/PebHLH31, PebHLH10/PebHLH81, PebHLH12/PebHLH135, PebHLH13/PebHLH26, PebHLH14/PebHLH121, PebHLH15/PebHLH50, PebHLH18/PebHLH38, PebHLH19/PebHLH21, PebHLH24/PebHLH132, PebHLH27/PebHLH118, PebHLH28/PebHLH29, PebHLH33/PebHLH34, PebHLH35/PebHLH131, PebHLH36/PebHLH95, PebHLH37/PebHLH41, PebHLH39/PebHLH133, PebHLH40/PebHLH78, PebHLH43/PebHLH65, PebHLH44/PebHLH102, PebHLH46/PebHLH73, PebHLH47/PebHLH52, PebHLH48/PebHLH69, PebHLH51/PebHLH129, PebHLH54/PebHLH99, PebHLH55/PebHLH105, PebHLH56/PebHLH103, PebHLH57/PebHLH98, PebHLH62/PebHLH124, PebHLH63/PebHLH100, PebHLH67/PebHLH134, PebHLH70/PebHLH109, PebHLH71/PebHLH122, PebHLH76/PebHLH87, PebHLH77/PebHLH88, PebHLH79/PebHLH136, PebHLH84/PebHLH112, PebHLH91/PebHLH110, PebHLH92/PebHLH94, PebHLH96/PebHLH113, PebHLH97/PebHLH120, PebHLH106/PebHLH108, PebHLH107/PebHLH115, PebHLH114/PebHLH119, PebHLH117/PebHLH137

Os-Os: OsbHLH12/OsbHLH165, OsbHLH13/OsbHLH14, OsbHLH15/OsbHLH16, OsbHLH19/OsbHLH23, OsbHLH25/OsbHLH26, OsbHLH62/OsbHLH63, OsbHLH67/OsbHLH69, OsbHLH68/OsbHLH70, OsbHLH73/OsbHLH74, OsbHLH81/OsbHLH82, OsbHLH117/OsbHLH136, OsbHLH130/OsbHLH131, OsbHLH132/OsbHLH133, OsbHLH139/OsbHLH140, OsbHLH144/OsbHLH145, OsbHLH146/OsbHLH147, OsbHLH150/OsbHLH151, OsbHLH160/OsbHLH161, OsbHLH163/OsbHLH166

At-At: AtbHLH1/AtbHLH2, AtbHLH3/AtbHLH4, AtbHLH5/AtbHLH6, AtbHLH8/AtbHLH9, AtbHLH11/AtbHLH13, AtbHLH15/AtbHLH16, AtbHLH23/AtbHLH24, AtbHLH26/AtbHLH27, AtbHLH28/AtbHLH29, AtbHLH30/AtbHLH31, AtbHLH35/AtbHLH39, AtbHLH36/AtbHLH37, AtbHLH38/AtbHLH40, AtbHLH41/AtbHLH42, AtbHLH44/AtbHLH45, AtbHLH46/AtbHLH47, AtbHLH48/AtbHLH49, AtbHLH53/AtbHLH54, AtbHLH55/AtbHLH56, AtbHLH62/AtbHLH65, AtbHLH63/AtbHLH64, AtbHLH71/AtbHLH72, AtbHLH73/AtbHLH74, AtbHLH75/AtbHLH76, AtbHLH78/AtbHLH79, AtbHLH80/AtbHLH87, AtbHLH81/AtbHLH82, AtbHLH84/AtbHLH85, AtbHLH90/AtbHLH96, AtbHLH91/AtbHLH92, AtbHLH93/AtbHLH94, AtbHLH97/AtbHLH98, AtbHLH101/AtbHLH102, AtbHLH103/AtbHLH104, AtbHLH107/AtbHLH108, AtbHLH109/AtbHLH110, AtbHLH111/AtbHLH113, AtbHLH114/AtbHLH115, AtbHLH120/AtbHLH121, AtbHLH122/AtbHLH124, AtbHLH125/AtbHLH126, AtbHLH130/AtbHLH131, AtbHLH132/AtbHLH133, AtbHLH137/AtbHLH138, AtbHLH139/AtbHLH140, AtbHLH157/AtbHLH158

Pe-Os: PebHLH6/OsbHLH50, PebHLH8/OsbHLH109, PebHLH9/OsbHLH44, PebHLH11/OsbHLH28, PebHLH16/OsbHLH110, PebHLH17/OsbHLH35, PebHLH20/OsbHLH52, PebHLH22/OsbHLH80, PebHLH24/OsbHLH112, PebHLH25/OsbHLH119, PebHLH28/OsbHLH6, PebHLH29/OsbHLH7, PebHLH30/OsbHLH88, PebHLH32/OsbHLH141, PebHLH37/OsbHLH101, PebHLH40/OsbHLH104, PebHLH42/OsbHLH58, PebHLH45/OsbHLH3, PebHLH51/OsbHLH55, PebHLH53/OsbHLH106, PebHLH59/OsbHLH108, PebHLH60/OsbHLH59, PebHLH61/OsbHLH10, PebHLH62/OsbHLH40, PebHLH64/OsbHLH32, PebHLH66/OsbHLH153, PebHLH68/OsbHLH90, PebHLH70/OsbHLH99, PebHLH72/OsbHLH42, PebHLH82/OsbHLH24, PebHLH83/OsbHLH37, PebHLH85/OsbHLH167, PebHLH89/OsbHLH34, PebHLH93/OsbHLH5, PebHLH104/OsbHLH86, PebHLH116/OsbHLH152, PebHLH117/OsbHLH128, PebHLH118/OsbHLH95, PebHLH123/OsbHLH49, PebHLH124/OsbHLH38, PebHLH126/OsbHLH20, PebHLH127/OsbHLH121, PebHLH128/OsbHLH47, PebHLH129/OsbHLH54, PebHLH130/OsbHLH148, PebHLH137/OsbHLH125

At-Os: AtbHLH14/OsbHLH51, AtbHLH20/OsbHLH55, AtbHLH22/OsbHLH148, AtbHLH32/OsbHLH17, AtbHLH43/OsbHLH156, AtbHLH59/OsbHLH71, AtbHLH60/OsbHLH65, AtbHLH61/OsbHLH66, AtbHLH67/OsbHLH154, AtbHLH89/OsbHLH84, AtbHLH105/OsbHLH113, AtbHLH116/OsbHLH124, AtbHLH119/OsbHLH123, AtbHLH135/OsbHLH64, AtbHLH144/OsbHLH33

ACCEPTED MANUSCRIPT Table 2 Ka, Ks, Ka/Ks values for the bHLH genes in Moso bamboo, rice, and Arabidosis

PebHLH genes Ks

Ka/Ks

0.06016 0.03985 0.04014 0.1852 0.07223 0.15392 0.03946 0.21748 0.07345 0.08046 0.07315 0.03227 0.06202 0.24122 0.10937 0.23445 0.13402 0.28868 0.27872 0.09963 0.06175 0.12569 0.17404 0.26431 0.04335 0.15803 0.16193 0.07857 0.11784 0.53184 0.11222 0.02725 0.03137 0.07495 0.40796 0.07225 0.04533

0.10529 0.13191 0.09013 0.26946 0.13403 0.28708 0.13322 0.4461 0.15291 0.13983 0.14865 0.03221 0.23989 0.26037 0.17358 0.25478 0.1918 0.325 0.44378 0.19217 0.15188 0.18848 0.22781 0.45104 0.10526 0.19947 0.16719 0.12749 0.14989 0.42762 0.22141 0.11321 0.14948 0.12879 0.4062 0.13172 0.18398

RI PT

Ka

AC C

EP

TE D

0.5714 0.3021 0.4454 0.6873 0.5389 0.5362 0.2962 0.4875 0.4803 0.5754 0.4921 1.0019 0.2585 0.9265 0.6301 0.9202 0.6987 0.8882 0.6281 0.5184 0.4066 0.6669 0.7640 0.5860 0.4118 0.7922 0.9685 0.6163 0.7862 1.2437 0.5068 0.2407 0.2099 0.5820 1.0043 0.5485 0.2464

SC

M AN U

Duplicated TTF gene pairs PebHLH1/PebHLH101 PebHLH2/PebHLH49 PebHLH3/PebHLH74 PebHLH4/PebHLH60 PebHLH5/PebHLH58 PebHLH6/PebHLH123 PebHLH7/PebHLH75 PebHLH9/PebHLH31 PebHLH10/PebHLH81 PebHLH12/PebHLH135 PebHLH13/PebHLH26 PebHLH14/PebHLH121 PebHLH15/PebHLH50 PebHLH18/PebHLH38 PebHLH19/PebHLH21 PebHLH24/PebHLH132 PebHLH27/PebHLH118 PebHLH28/PebHLH29 PebHLH33/PebHLH34 PebHLH35/PebHLH131 PebHLH36/PebHLH95 PebHLH37/PebHLH41 PebHLH39/PebHLH133 PebHLH40/PebHLH78 PebHLH43/PebHLH65 PebHLH44/PebHLH102 PebHLH46/PebHLH73 PebHLH47/PebHLH52 PebHLH48/PebHLH69 PebHLH51/PebHLH129 PebHLH54/PebHLH99 PebHLH55/PebHLH105 PebHLH56/PebHLH103 PebHLH57/PebHLH98 PebHLH62/PebHLH124 PebHLH63/PebHLH100 PebHLH67/PebHLH134

Duplicated TTF gene pairs   Ka        Ks        Ka/Ks
PebHLH70/PebHLH109          0.16295   0.30489   0.5345
PebHLH71/PebHLH122          0.06328   0.16454   0.3846
PebHLH76/PebHLH87           0.06877   0.13172   0.5221
PebHLH77/PebHLH88           0.02843   0.11976   0.2374
PebHLH79/PebHLH136          0.11639   0.14776   0.7877
PebHLH84/PebHLH112          0.08936   0.12653   0.7062
PebHLH91/PebHLH110          0.03577   0.16584   0.2157
PebHLH92/PebHLH94           0.06908   0.15652   0.4413
PebHLH96/PebHLH113          0.04105   0.10945   0.3751
PebHLH97/PebHLH120          0.05175   0.11771   0.4396
PebHLH106/PebHLH108         0.11805   0.20717   0.5698
PebHLH107/PebHLH115         0.06231   0.11996   0.5194
PebHLH114/PebHLH119         0.03119   0.11043   0.2824
PebHLH117/PebHLH137         0.411     0.58601   0.7014

OsbHLH genes

Duplicated TTF gene pairs   Ka        Ks        Ka/Ks
OsbHLH12/OsbHLH165          0.04886   0.05658   0.8636
OsbHLH13/OsbHLH14           0.08448   0.11706   0.7217
OsbHLH15/OsbHLH16           0.20219   0.28605   0.7068
OsbHLH19/OsbHLH23           0.4633    0.57772   0.8019
OsbHLH25/OsbHLH26           0.07322   0.11887   0.6160
OsbHLH62/OsbHLH63           0.28326   0.6391    0.4432
OsbHLH67/OsbHLH69           0.46766   0.47731   0.9798
OsbHLH68/OsbHLH70           0.35299   0.40538   0.8708
OsbHLH73/OsbHLH74           0.41868   0.51437   0.8140
OsbHLH81/OsbHLH82           0         0
OsbHLH117/OsbHLH136         0.6192    0.48922   1.2657
OsbHLH130/OsbHLH131         0.29801   0.53154   0.5607
OsbHLH132/OsbHLH133         0.38735   0.63528   0.6097
OsbHLH139/OsbHLH140         0.22923   0.34646   0.6616
OsbHLH144/OsbHLH145         0.55096   0.42667   1.2913
OsbHLH146/OsbHLH147         0.52595   0.4178    1.2589
OsbHLH150/OsbHLH151         0.1901    0.50104   0.3794
OsbHLH160/OsbHLH161         0.03326   0.09771   0.3404
OsbHLH163/OsbHLH166         0.13658   0.55329   0.2469

AtbHLH genes

Duplicated TTF gene pairs   Ka        Ks        Ka/Ks
AtbHLH1/AtbHLH2             0.30898   0.55103   0.5607
AtbHLH3/AtbHLH4             0.20422   0.49367   0.4137
AtbHLH5/AtbHLH6             0.20801   0.46415   0.4482
AtbHLH8/AtbHLH9             0.10857   0.32641   0.3326
AtbHLH11/AtbHLH13           0.26312   0.4883    0.5388
AtbHLH15/AtbHLH16           0.59428   0.54127   1.0979
AtbHLH23/AtbHLH24           0.22691   0.55389   0.4097
AtbHLH26/AtbHLH27           0.39172   0.72034   0.5438
AtbHLH28/AtbHLH29           0.25423   0.54139   0.4696
AtbHLH30/AtbHLH31           0.13302   0.50988   0.2609
AtbHLH35/AtbHLH39           0.36663   0.63294   0.5792
AtbHLH36/AtbHLH37           0.21235   0.545     0.3896
AtbHLH38/AtbHLH40           0.3298    0.57655   0.5720
AtbHLH41/AtbHLH42           0.3158    0.63092   0.5005
AtbHLH44/AtbHLH45           0.30772   0.51981   0.5920
AtbHLH46/AtbHLH47           0.20552   0.52029   0.3950
AtbHLH48/AtbHLH49           0.54987   0.72371   0.7598
AtbHLH53/AtbHLH54           0.38607   0.58524   0.6597
AtbHLH55/AtbHLH56           0.18802   0.58434   0.3218
AtbHLH62/AtbHLH65           0.37494   0.52563   0.7133
AtbHLH63/AtbHLH64           0.53486   0.66207   0.8079
AtbHLH71/AtbHLH72           0.1471    0.46219   0.3183
AtbHLH73/AtbHLH74           0.49585   0.63408   0.7820
AtbHLH75/AtbHLH76           0.25939   0.39531   0.6562
AtbHLH78/AtbHLH79           0.21568   0.43134   0.5000
AtbHLH80/AtbHLH87           0.45075   0.67157   0.6712
AtbHLH81/AtbHLH82           0.27477   0.46167   0.5952
AtbHLH84/AtbHLH85           0.24549   0.4535    0.5413
AtbHLH90/AtbHLH96           0.13468   0.44116   0.3053
AtbHLH91/AtbHLH92           0.14623   0.44875   0.3259
AtbHLH93/AtbHLH94           0.24937   0.4359    0.5721
AtbHLH97/AtbHLH98           0.44375   0.58068   0.7642
AtbHLH101/AtbHLH102         0.29359   0.53887   0.5448
AtbHLH103/AtbHLH104         0.25034   0.34977   0.7157
AtbHLH107/AtbHLH108         0.41463   0.52131   0.7954
AtbHLH109/AtbHLH110         0.15039   0.453     0.3320
AtbHLH111/AtbHLH113         0.2013    0.44723   0.4501
AtbHLH114/AtbHLH115         0.2721    0.61656   0.4413
AtbHLH120/AtbHLH121         0.23062   0.42484   0.5428
AtbHLH122/AtbHLH124         0.54559   0.61214   0.8913
AtbHLH125/AtbHLH126         0.26394   0.4927    0.5357
AtbHLH130/AtbHLH131         0.25893   0.59319   0.4365
AtbHLH132/AtbHLH133         0.27976   0.49727   0.5626
AtbHLH137/AtbHLH138         0.61935   0.59608   1.0390
AtbHLH139/AtbHLH140         0.3357    0.56819   0.5908
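The Ka/Ks column in these tables is the nonsynonymous rate divided by the synonymous rate, reported to four decimals. The following is a minimal sketch of that arithmetic, using three OsbHLH rows from the table as inputs. The function names are hypothetical, and the selection labels follow the conventional reading of Ka/Ks (below 1 for purifying selection, above 1 for positive selection), which is an assumption not stated in this section. A pair with Ks = 0 has no defined ratio, which may explain the empty Ka/Ks cell for OsbHLH81/OsbHLH82.

```python
from typing import Optional


def ka_ks_ratio(ka: float, ks: float) -> Optional[float]:
    """Return Ka/Ks rounded to four decimals, or None when Ks is zero."""
    if ks == 0:
        return None  # the ratio is undefined without synonymous substitutions
    return round(ka / ks, 4)


def selection_label(ratio: Optional[float]) -> str:
    """Conventional interpretation of Ka/Ks (assumed, not taken from the tables)."""
    if ratio is None:
        return "undefined"
    if ratio < 1:
        return "purifying"
    if ratio > 1:
        return "positive"
    return "neutral"


# (Ka, Ks) values copied from the OsbHLH rows of the table.
pairs = {
    "OsbHLH12/OsbHLH165": (0.04886, 0.05658),
    "OsbHLH81/OsbHLH82": (0.0, 0.0),
    "OsbHLH117/OsbHLH136": (0.6192, 0.48922),
}

for name, (ka, ks) in pairs.items():
    ratio = ka_ks_ratio(ka, ks)
    print(name, ratio, selection_label(ratio))
```

Recomputing the ratio this way is also a quick consistency check when the Ka, Ks, and Ka/Ks columns have to be realigned by hand.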

Pe/OsbHLH genes

Duplicated TTF gene pairs   Ka        Ks        Ka/Ks
PebHLH6/OsbHLH50            0.28739   0.27447   1.0471
PebHLH8/OsbHLH109           0.18376   0.32119   0.5721
PebHLH9/OsbHLH44            0.10861   0.27312   0.3977
PebHLH11/OsbHLH28           0.33682   0.38497   0.8749
PebHLH16/OsbHLH110          0.19026   0.30975   0.6142
PebHLH17/OsbHLH35           0.066     0.35437   0.1862
PebHLH20/OsbHLH52           0.11211   0.25697   0.4363
PebHLH22/OsbHLH80           0.08838   0.22134   0.3993
PebHLH24/OsbHLH112          0.15004   0.29286   0.5123
PebHLH25/OsbHLH119          0.18415   0.27507   0.6695
PebHLH28/OsbHLH6            0.13839   0.20255   0.6832
PebHLH29/OsbHLH7            0.37374   0.4176    0.8950
PebHLH30/OsbHLH88           0.19698   0.43061   0.4574
PebHLH32/OsbHLH141          0.71182   0.77604   0.9172
PebHLH37/OsbHLH101          0.06233   0.30122   0.2069
PebHLH40/OsbHLH104          0.71368   0.6555    1.0888
PebHLH42/OsbHLH58           0.08043   0.31652   0.2541
PebHLH45/OsbHLH3            0.16052   0.31761   0.5054
PebHLH51/OsbHLH55           0.19729   0.27127   0.7273
PebHLH53/OsbHLH106          0.28575   0.431     0.6630
PebHLH59/OsbHLH108          0.13128   0.29455   0.4457
PebHLH60/OsbHLH59           0.07383   0.30124   0.2451
PebHLH61/OsbHLH10           0.08265   0.19991   0.4134
PebHLH62/OsbHLH40           0.05535   0.20225   0.2737
PebHLH64/OsbHLH32           0.08968   0.38223   0.2346
PebHLH66/OsbHLH153          0.71512   0.51857   1.3790
PebHLH68/OsbHLH90           0.07616   0.29041   0.2622
PebHLH70/OsbHLH99           0.0316    0.42584   0.0742
PebHLH72/OsbHLH42           0.13224   0.26472   0.4995
PebHLH82/OsbHLH24           0.26062   0.35933   0.7253
PebHLH83/OsbHLH37           0.20667   0.30479   0.6781
PebHLH85/OsbHLH167          0.10749   0.32702   0.3287
PebHLH89/OsbHLH34           0.18753   0.36072   0.5199
PebHLH93/OsbHLH5            0.40894   0.42064   0.9722
PebHLH104/OsbHLH86          0.14634   0.32755   0.4468
PebHLH116/OsbHLH152         0.13787   0.3294    0.4185
PebHLH117/OsbHLH128         0.16109   0.38292   0.4207
PebHLH118/OsbHLH95          0.07009   0.23626   0.2967
PebHLH123/OsbHLH49          0.13179   0.21953   0.6003
PebHLH124/OsbHLH38          0.20716   0.25123   0.8246
PebHLH126/OsbHLH20          0.33912   0.36397   0.9317
PebHLH127/OsbHLH121         0.15363   0.23149   0.6637
PebHLH128/OsbHLH47          0.17836   0.27491   0.6488
PebHLH129/OsbHLH54          0.66019   0.55619   1.1870
PebHLH130/OsbHLH148         0.13382   0.36275   0.3689
PebHLH137/OsbHLH125         0.16615   0.32819   0.5063

At/OsbHLH genes

Duplicated TTF gene pairs   Ka        Ks        Ka/Ks
AtbHLH14/OsbHLH51           0.42287   0.67168   0.6296
AtbHLH20/OsbHLH55           0.40614   0.78551   0.5170
AtbHLH22/OsbHLH148          0.64161   0.74847   0.8572
AtbHLH32/OsbHLH17           0.51577   0.68653   0.7513
AtbHLH43/OsbHLH156          0.60663   0.71654   0.8466
AtbHLH59/OsbHLH71           0.53755   0.71205   0.7549
AtbHLH60/OsbHLH65           0.67098   0.491     1.3666
AtbHLH61/OsbHLH66           0.57008   0.68692   0.8299
AtbHLH67/OsbHLH154          0.31585   0.86282   0.3661
AtbHLH89/OsbHLH84           0.44903   0.65048   0.6903
AtbHLH105/OsbHLH113         0.48824   0.71403   0.6838
AtbHLH116/OsbHLH124         0.41375   0.6963    0.5942
AtbHLH119/OsbHLH123         0.51374   0.67717   0.7587
AtbHLH144/OsbHLH33          0.45658   0.73768   0.6189
AtbHLH135/OsbHLH64          0.70144   0.67309   1.0421

Table 3 Gene ontology (GO) annotation
Analysis of each subfamily of Moso bamboo bHLH protein

Classification / GOs / Annotation / Num

Molecular function (181)
GO:0046983   protein dimerization activity   108
GO:0001046   core promoter sequence-specific DNA binding   12
GO:0001228   transcriptional activator activity, RNA polymerase II transcription regulatory region sequence-specific DNA binding   12
GO:0005488   binding   13
GO:0003677   DNA binding   25
GO:0003700   DNA binding transcription factor activity   6
GO:0000977   RNA polymerase II regulatory region sequence-specific DNA binding   2
GO:0043565   sequence-specific DNA binding   2
GO:0016597   amino acid binding   1

Cellular component (128)
GO:0005634   nucleus   92
GO:0009536   plastid   22
GO:0016020   membrane   2
GO:0016021   integral component of membrane   2
GO:0005739   mitochondrion   5
GO:0090575   RNA polymerase II transcription factor complex   2
GO:0009506   plasmodesma   2
GO:0043231   intracellular membrane-bounded organelle   1

Biological process (60)
GO:0006366   transcription from RNA polymerase II promoter   12
GO:0045944   positive regulation of transcription from RNA polymerase II promoter   5
GO:0006351   transcription, DNA-templated   4
GO:0006355   regulation of transcription, DNA-templated   8
GO:0009742   brassinosteroid mediated signaling pathway   2
GO:0080113   regulation of seed growth   2
GO:0006357   regulation of transcription from RNA polymerase II promoter   2
GO:0009567   double fertilization forming a zygote and endosperm   2
GO:0007165   signal transduction   2
GO:0008152   metabolic process   2
GO:0031347   regulation of defense response   2
GO:0009737   response to abscisic acid   2
GO:0048656   anther wall tapetum formation   2
GO:0048657   anther wall tapetum cell differentiation   1
GO:0040008   regulation of growth   1
GO:0010052   guard cell differentiation   1
GO:0060255   regulation of macromolecule metabolic process   1
GO:0048522   positive regulation of cellular process   1
GO:0048519   negative regulation of biological process   1
GO:0080090   regulation of primary metabolic process   1
GO:0031323   regulation of cellular metabolic process   1
GO:0045597   positive regulation of cell differentiation   1
GO:0010377   guard cell fate commitment   1
GO:0051782   negative regulation of cell division   1
GO:0045893   positive regulation of transcription, DNA-templated   1
GO:0009911   positive regulation of flower development   1
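The parenthesized totals in Table 3 (181, 128, 60) can be cross-checked against the per-term Num values. The sketch below assumes that the Num values are listed in the same order as the GO identifiers and that each category total is the plain sum of its term counts. The variable and function names are illustrative only.

```python
# Per-term counts copied from the Num column of Table 3, grouped by category.
go_counts = {
    "Molecular function": [108, 12, 12, 13, 25, 6, 2, 2, 1],
    "Cellular component": [92, 22, 2, 2, 5, 2, 2, 1],
    "Biological process": [12, 5, 4, 8] + [2] * 9 + [1] * 13,
}

# Totals as printed in parentheses next to each category name.
reported_totals = {
    "Molecular function": 181,
    "Cellular component": 128,
    "Biological process": 60,
}


def category_totals(counts):
    """Sum the per-term annotation counts within each GO category."""
    return {name: sum(values) for name, values in counts.items()}


totals = category_totals(go_counts)
for name, total in totals.items():
    status = "matches" if total == reported_totals[name] else "differs from"
    print(f"{name}: {total} ({status} reported {reported_totals[name]})")
```

Because a gene can carry several GO terms, these totals count annotations rather than genes, which is why they can exceed the 137 family members.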


Analysis of each subfamily of Moso bamboo bHLH protein

5

6

8


4


3


2

Name molecular_function cellular_component biological_process PebHLH33 √ √ √ PebHLH34 √ √ √ PebHLH43 √ PebHLH65 √ PebHLH31 √ √ PebHLH128 √ PebHLH123 √ √ PebHLH6 √ √ PebHLH9 √ PebHLH2 √ √ √ PebHLH49 √ PebHLH20 √ √ PebHLH51 √ PebHLH129 √ √ PebHLH22 √ PebHLH114 √ √ PebHLH119 √ √ PebHLH126 √ PebHLH46 √ √ √ PebHLH73 √ √ √ PebHLH136 √ √ PebHLH113 √ √ √ PebHLH105 √ √ √


subfamily


10


√ √ √ √

√ √ √ √ √ √ √ √ √




√ √ √


√ √ √ √ √ √ √ √ √ √ √ √ √ √


9

PebHLH96 PebHLH85 PebHLH80 PebHLH79 PebHLH61 PebHLH55 PebHLH67 PebHLH134 PebHLH101 PebHLH1 PebHLH45 PebHLH57 PebHLH98 PebHLH125 PebHLH13 PebHLH28 PebHLH29 PebHLH26 PebHLH91 PebHLH93 PebHLH110 PebHLH62 PebHLH48 PebHLH17 PebHLH69





√ √ √

√ √ √ √ √ √ √

√ √ √


15


√ √ √ √ √ √


√ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √


√ √


12




11

PebHLH124 PebHLH83 PebHLH72 PebHLH133 PebHLH89 PebHLH64 PebHLH39 PebHLH42 PebHLH56 PebHLH60 PebHLH35 PebHLH4 PebHLH103 PebHLH131 PebHLH15 PebHLH50 PebHLH60 PebHLH10 PebHLH19 PebHLH21 PebHLH37 PebHLH40 PebHLH41 PebHLH53 PebHLH59



√ √ √ √ √ √ √

√ √ √ √ √ √ √

√ √


17

18




√ √





√ √ √ √ √ √ √

√ √


√ √ √ √ √


√ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √


16

PebHLH78 PebHLH81 PebHLH92 PebHLH94 PebHLH116 PebHLH24 PebHLH132 PebHLH135 PebHLH8 PebHLH12 PebHLH16 PebHLH95 PebHLH109 PebHLH87 PebHLH70 PebHLH76 PebHLH74 PebHLH58 PebHLH52 PebHLH5 PebHLH47 PebHLH36 PebHLH3 PebHLH7 PebHLH14

√ √ √

√ √ √

√ √


√ √




√ √ √ √ √ √ √ √ √ √


√ √ √ √ √


√ √ √ √ √ √ √ √

√ √ √ √


√ √


PebHLH18 PebHLH23 PebHLH22 PebHLH27 PebHLH30 PebHLH38 PebHLH44 PebHLH54 PebHLH68 PebHLH75 PebHLH84 PebHLH86 PebHLH97 PebHLH99 PebHLH102 PebHLH104 PebHLH106 PebHLH107 PebHLH108 PebHLH111 PebHLH112 PebHLH115 PebHLH118 PebHLH120 PebHLH121

√ √ √ √ √ √ √ √









√ √ √ √ √ √

√ √ √ √ √ √


√ √


21

√ √ √ √ √ √ √ √ √


19

PebHLH135 PebHLH127 PebHLH117 PebHLH100 PebHLH88 PebHLH77 PebHLH63 PebHLH25 PebHLH11 PebHLH71 PebHLH82 PebHLH90 PebHLH122 PebHLH130




Highlights
1. We identified 137 candidate genes in the PebHLH family (PebHLH1-PebHLH137); members of the same subfamily shared similar gene structures and motif compositions, implying similar functions.
2. Promoter analysis showed that 21 resistance-related genes contained many cis-acting elements, suggesting that they may play an important role during the growth stage.
3. qRT-PCR analysis revealed tissue-specific expression profiles of 21 PebHLHs in six tissues.
4. Stress induction experiments showed the expression levels of 16 21PebHLHs under ABA, drought, and MeJA treatments.

Author contributions
Xinran Cheng conceived the study, carried out the main bioinformatics analyses, and drafted the manuscript. Rui Xiong carried out the software analyses and helped to construct the figures and tables. Min Wu took part in the experiments and in drafting the manuscript. Huanlong Liu processed the experimental data and helped to draft the manuscript. Feng Chen helped with the software and with drafting the manuscript. Hanwei Yan reviewed the project and helped to revise the manuscript. Yan Xiang conceived and guided the experiments, and helped to coordinate the project and draft the manuscript. All authors read and approved the final manuscript.