Identification of key genes involved in catechin metabolism in tea seedlings based on transcriptomic and HPLC analysis




Plant Physiology and Biochemistry 133 (2018) 107–115

Contents lists available at ScienceDirect

Plant Physiology and Biochemistry journal homepage: www.elsevier.com/locate/plaphy

Research article



Yazhen Zhang, Kang Wei∗, Hailin Li, Liyuan Wang, Li Ruan, Dandan Pang, Hao Cheng∗∗

Key Laboratory of Tea Plant Biology and Resources Utilization, Ministry of Agriculture, National Center for Tea Improvement, Tea Research Institute, Chinese Academy of Agricultural Sciences (TRICAAS), 9 Meiling South Road, Hangzhou, Zhejiang, 310008, China

ARTICLE INFO

Keywords: Camellia sinensis; Correlation analysis; Transcriptome analysis; Catechin biosynthesis; Transcription factor

ABSTRACT

Tea is a non-alcoholic beverage with many benefits to human health and is therefore widely consumed around the world. It contains abundant secondary metabolites, of which catechins are the characteristic compounds. To further elucidate the biosynthetic and regulatory mechanisms of catechins in tea, high performance liquid chromatography (HPLC) and transcriptome analysis were performed on tea seedlings at different growth stages, followed by a combined differential expression and correlation analysis. Total catechin (TC) contents followed the order leaves > stems > roots, irrespective of growth stage. For transcriptome analysis, a total of 355.81 million clean reads were generated and mapped to the reference tea genome. Real-time PCR analysis of 18 selected genes confirmed the RNA-Seq results. A total of 7 structural genes and 35 transcription factors (TFs) were significantly correlated with TC changes. Among them, three TFs homologous to ANL2, WRKY44 and AtMYB113 might play key roles in catechin regulation. The de novo transcriptome data of different organs in tea seedlings provide new insights into the biosynthetic and metabolic pathways of catechins.

1. Introduction

Catechins, namely flavan-3-ols, are the major polyphenols in tea plants (Wei et al., 2011). They account for about 10% of the dry weight of tea leaves and mainly include (−)-epigallocatechin gallate (EGCG), (−)-epicatechin gallate (ECG), (−)-epigallocatechin (EGC), (−)-epicatechin (EC), (−)-gallocatechin (GC) and (+)-catechin (C). Tea catechins are not only responsible for the final sensory quality, but are also closely associated with many health benefits, such as anti-cancer (Thomasset et al., 2007), anti-inflammatory (Cavet et al., 2011) and anti-bacterial (Cho et al., 2011) effects. Therefore, the mechanism underlying the high accumulation of catechins in tea is of great interest to many scientists.

It is widely accepted that catechins are synthesized through the phenylpropanoid and flavonoid biosynthetic pathways (Winkel-Shirley, 2001). The formation of chalcone is the entry point of catechin biosynthesis, and three enzymes, phenylalanine ammonia-lyase (PAL), cinnamate 4-hydroxylase (C4H) and chalcone synthase (CHS), are engaged in the process. The other key enzymes involved in the subsequent steps are as follows: chalcone isomerase (CHI), flavonoid 3′-hydroxylase (F3′H), flavonoid 3′,5′-hydroxylase (F3′5′H), flavanone 3-hydroxylase (F3H), flavonol synthase (FLS), dihydroflavonol 4-reductase (DFR), leucoanthocyanidin reductase (LAR), anthocyanidin reductase (ANR) and anthocyanidin synthase (ANS) (Lepiniec et al., 2006; Zhou et al., 2016).

Numerous studies have been carried out to identify key genes involved in catechin accumulation under different conditions or in special tea resources (Rani et al., 2012; Liu et al., 2015; Wei et al., 2015; Chen et al., 2017). Moreover, as catechins mainly accumulate in young tea leaves, the correlation between catechin contents and the expression patterns of biosynthetic genes at different stages of leaf development has also been a hot topic (Rani et al., 2012; Zhang et al., 2016; Guo et al., 2017). However, progress has been limited either by the lack of a global view or by catechin differences among the experimental samples that were too small. For example, Guo et al. (2017) found that total catechin contents ranged from 116.08 to 193.88 mg/g in leaves of different stages, a less than 2-fold difference, which might miss some key genes acting as bottlenecks in catechin biosynthesis, particularly transcription factors (TFs). Thus, even with the development of RNA-sequencing technology, the difficulty of identifying tea samples with large catechin differences has limited related studies. Ashihara et al. (2010) studied the distribution of phenolic compounds in 8-week-old tea seedlings and it was interesting to find that

∗ Corresponding author.
∗∗ Corresponding author.
E-mail addresses: [email protected] (K. Wei), [email protected] (H. Cheng).

https://doi.org/10.1016/j.plaphy.2018.10.029
Received 14 August 2018; Received in revised form 25 October 2018; Accepted 26 October 2018; Available online 29 October 2018
0981-9428/© 2018 Elsevier Masson SAS. All rights reserved.


Abbreviations

ANR, anthocyanidin reductase; ANS, anthocyanidin synthase; C, catechin; C4H, cinnamate 4-hydroxylase; CHI, chalcone isomerase; CHS, chalcone synthase; DEG, differentially expressed gene; DFR, dihydroflavonol 4-reductase; EC, epicatechin; ECG, epicatechin gallate; EGC, epigallocatechin; EGCG, epigallocatechin gallate; F3H, flavanone 3-hydroxylase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′,5′-hydroxylase; FDR, false discovery rate; FLS, flavonol synthase; FPKM, fragments per kb per million; GC, gallocatechin; HPLC, high performance liquid chromatography; GO, Gene Ontology; KEGG, Kyoto Encyclopedia of Genes and Genomes; KOG, clusters of orthologous groups for eukaryotic complete genomes; LAR, leucoanthocyanidin reductase; MBW, MYB-bHLH-WD40; NR, NCBI non-redundant protein database; PAL, phenylalanine ammonia-lyase; PCA, principal component analysis; qRT-PCR, quantitative real-time PCR; SPSS, Statistical Package for the Social Sciences; TC, total catechin; TFs, transcription factors.

the total catechins ranged from 1.1 μmol/g in roots to 120 μmol/g in young leaves. This roughly 100-fold difference between tissues would be useful for exploring the key genes involved in the activation of catechin biosynthesis. Moreover, our previous study showed that correlation analysis of RNA-Seq data and target products, combined with expression level screening, was feasible for identifying key genes (Wei et al., 2015). Furthermore, the recently published genome sequences of Camellia sinensis var. assamica and var. sinensis are helpful for RNA-Seq analysis in related fields (Xia et al., 2017; Wei et al., 2018). Therefore, a new strategy combining differential expression analysis, expression level screening and correlation analysis should help identify key genes and deepen our understanding of their functions.

In this study, samples from different tissues and developmental stages of cv. Longjing 43 seedlings were used to determine catechin distribution and for RNA-Seq analysis. Expression levels of genes related to catechin biosynthesis and their relationship with catechin accumulation were also investigated. The purpose of this study was to explore the key genes involved in catechin biosynthesis and to provide more insight into the network of catechin regulation.

2. Materials and methods

2.1. Plant materials

Fresh seeds of the tea cultivar Longjing 43 were harvested in November 2016 from the tea garden of the Tea Research Institute, Chinese Academy of Agricultural Sciences, in Hangzhou, China. Seeds were stored in a refrigerator (4 °C) for one month and then cultivated in quartz sand in a greenhouse. When the seedlings reached the new germination stage (stage 1), the bud and leaf stage (stage 2) and the resting bud stage (stage 3), leaf, stem and root samples were collected and stored at −80 °C (Fig. 1). The stems and roots of the newly germinated seedlings were abbreviated to 1-S and 1-R. The leaves, stems and roots of the second- and third-stage seedlings were abbreviated to 2-L, 2-S, 2-R, 3-L, 3-S and 3-R, respectively. Part of the material was dried in a freeze dryer (Labconco Stoppering Tray Dryer, Labconco Inc., Kansas City, MO, USA) at −5 °C for 72 h and then subjected to catechin analysis by HPLC. The remaining samples were subjected to RNA-Seq analysis.

Fig. 1. The phenotypes of tea seedlings at 3 different growth stages. Sampling sites are boxed: R, root; S, stem; L, leaf.

2.2. HPLC analysis

The determination of catechins by HPLC followed the method described previously by Wei et al. (2015). Materials (0.2 g, dry weight) were extracted with 5 ml of 70% (v/v) methanol in a water bath at 70 °C for 10 min, with shaking at 5-min intervals. The extract was centrifuged at 3500×g for 10 min at 4 °C and then transferred into a 10 ml volumetric flask. These steps were repeated once more, giving a final volume of 10 ml. The extracts were then filtered through 0.45 μm Millipore filters. Catechins were measured on a Waters HPLC system with a reversed-phase column (Phenomenex C12, 4.6 mm × 250 mm, 5 μm) at 40 °C. The mobile phase consisted of water with 1% (v/v) formic acid (A) and acetonitrile (B), with a linear gradient elution: 0–42 min, 4%–18.7% B; 42–43 min, 18.7%–4% B. Samples were eluted at a flow rate of 1 ml min−1 and monitored at 280 nm. Three independent extractions were performed for each sample. Error bars represent the standard deviation of triplicate total catechin measurements. Significant differences were determined by one-way ANOVA with Duncan's multiple range test (P < 0.05).

2.3. RNA sequencing analysis

Total RNA of the tea samples was extracted with an RNAprep pure Plant Kit (Tiangen, Beijing, China). RNA quality was confirmed on a 1% agarose gel and the concentration was measured with a NanoDrop 2000 (Thermo Scientific, DE, USA). RNAs from the 8 samples were sequenced on an Illumina HiSeq 2500 (Illumina, San Diego, CA, USA) according to the method described by Wei et al. (2015). Sequence data of the 8 samples have been deposited in the NCBI SRA database (SRS3194069, SRS3194070, SRS3194071, SRS3194072, SRS3194073, SRS3194074, SRS3194075, SRS3194076). The sequence reads were then aligned to the tea reference genome using Hierarchical Indexing for Spliced Alignment of Transcripts (HISAT2, v. 2.0.4) (Kim et al., 2015; Wei et al., 2018).

2.4. Identification of differential gene expressions

Expression levels of genes in the tea samples were normalized and measured as FPKM (fragments per kilobase of transcript per million reads) values. Based on the principal component analysis (PCA), three pairs of sample groups (Group R vs S, Group R vs L, Group S vs L) were compared. To identify differentially expressed genes (DEGs), Cuffdiff was used with the blind dispersion method, and p-values were adjusted by false discovery rate (FDR ≤ 0.01) (Trapnell et al., 2013). Candidate DEGs were obtained with p-value ≤ 0.05 and at least a two-fold change in expression between groups. In addition, Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the DEGs was also annotated by blastx.

2.5. Real-time PCR validation

To verify the RNA-Seq results, eighteen genes were selected for qRT-PCR analysis. Annotation information and primers for these genes are given in Additional file 1. Total RNA of the eight samples was extracted using an RNAprep pure Plant Kit (Tiangen, Beijing, China). A PrimeScript® 1st Strand cDNA Synthesis Kit (Takara, Dalian, China) was then used to synthesize cDNA. The PrimeScript® RT reagent qPCR Kit (Takara, Dalian, China) was used to perform qRT-PCR on the ABI 7500 Real-Time PCR System. Values of target genes were normalized to GAPDH, a common reference gene in tea plants (Wei et al., 2016). The qRT-PCR program was as follows: 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s and 62 °C for 34 s. Relative expression levels were calculated using the 2^−ΔΔCt method (Livak and Schmittgen, 2001). Data are expressed as means ± SD from four independent biological replicates.
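The 2^−ΔΔCt calculation named in Section 2.5 can be illustrated with a minimal sketch. The Ct of the target gene is first normalized to the reference gene (GAPDH in this study), and the result is then compared with a calibrator sample. The function name, the choice of calibrator and all Ct values below are illustrative assumptions, not values from the paper.

```python
def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Relative expression of a target gene by the 2^-ddCt method.

    ct_target / ct_reference: Ct values in the sample of interest.
    ct_target_cal / ct_reference_cal: Ct values in the calibrator sample.
    """
    delta_ct_sample = ct_target - ct_reference            # normalize to reference gene
    delta_ct_calibrator = ct_target_cal - ct_reference_cal
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target amplifies two cycles earlier in the
# sample than in the calibrator, with the reference gene unchanged.
fold_change = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold_change)  # 4.0
```

A two-cycle shift corresponds to a four-fold difference because each PCR cycle is assumed to double the product; this efficiency assumption is built into the method.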

Fig. 2. Chromatograms of typical tea samples from different tissues. a. root sample; b. stem sample; c. leaf sample; 1. (−)-gallocatechin (GC); 2. (−)-epigallocatechin (EGC); 3. (+)-catechin (C); 4. Caffeine; 5. (−)-epicatechin (EC); 6. (−)-epigallocatechin gallate (EGCG); 7. (−)-epicatechin gallate (ECG).
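The linear gradient given in Section 2.2 (0–42 min: 4%–18.7% B; 42–43 min: 18.7%–4% B) can be written as a piecewise-linear function of time. The sketch below only restates those two segments; the function name and the error handling for times outside the program are assumptions.

```python
def percent_b(t_min):
    """Percentage of mobile phase B (acetonitrile) at time t_min (minutes)."""
    # (start time, end time, %B at start, %B at end) for each linear segment
    segments = [(0.0, 42.0, 4.0, 18.7), (42.0, 43.0, 18.7, 4.0)]
    for t0, t1, b0, b1 in segments:
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("time outside the 0-43 min gradient program")


for t in (0, 21, 42, 43):
    print(t, round(percent_b(t), 2))
```

At the midpoint of the first segment (21 min) the function gives about 11.35% B, halfway between 4% and 18.7%.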


2.6. Correlation analysis of gene expression and catechin content

The FPKM values of relevant genes and the concentrations of catechins were subjected to bivariate correlation analysis using Statistical Package for the Social Sciences (SPSS) Statistics 17.0. Correlations are reported as Pearson correlation coefficients. P-values < 0.05 were considered significant, and p-values < 0.01 were considered highly significant.

3. Results

3.1. Accumulation of catechins in tea seedlings

The phenotypes of tea seedlings at different growth stages are shown in Fig. 1 and the catechin compositions determined by HPLC are presented in Figs. 2 and 3. In total, six typical catechins were detected, namely GC, EGC, C, EC, EGCG and ECG. The total catechin (TC) contents varied greatly among tissues rather than among growth stages, in the order leaves > stems > roots. TC contents in roots were significantly lower than in the other samples, with the highest value being 2.6 mg/g in 3-R, whereas TC contents in leaf samples (2-L and 3-L) were about 100 mg/g, the highest of all tissues. These results are consistent with previous reports (Ashihara et al., 2010; Xiong et al., 2013). In terms of individual catechins, EC accounted for about 94% of the TC content in roots irrespective of growth stage, followed by ECG. Both are dihydroxylated catechins, indicating that the biosynthesis of dihydroxylated catechins plays a central role in roots. In contrast, EGCG and EGC (trihydroxylated catechins) were the dominant catechin components in stems (about 60% in total) and leaves (81%). These results indicate a clear shift from the biosynthesis of dihydroxylated catechins in roots to trihydroxylated catechins in stems and leaves.

Fig. 3. Catechin contents in tea seedlings of different developmental stages by HPLC (Mean ± standard deviation, n = 3). Means showing significant difference (P < 0.05) are labeled with different letters based on one-way ANOVA with Duncan's multiple range test.

3.2. Illumina sequencing and aligning to the reference genome

To uncover the molecular mechanism and find candidate genes of catechin biosynthesis, eight cDNA libraries were constructed separately from the tea samples. In total, 355.81 million clean reads with a Q30 over 92% were generated after removing low-quality reads (Additional file 2). The average genome mapping rate of these clean reads was 89.92%, with 84.16% mapped to unique and 5.76% mapped to multiple genomic locations. Gene coverage, the ratio of the number of bases in a gene covered by uniquely mapped reads to the total number of bases of that gene, is an important indicator of RNA sequencing quality. In this study, 74.3%–75.9% of the sequences had a similarity greater than 90% to the reference genes in each library (Additional file 3), indicating that gene coverage across the eight libraries was highly reproducible and quite uniform.

3.3. Quantitative real-time PCR verification of transcriptome data

The expression levels of tea genes were measured as FPKM. To confirm the RNA-Seq results, 18 genes, including potentially important genes involved in flavonoid biosynthesis, transport and regulation, were selected for qRT-PCR validation. The qRT-PCR and RNA-Seq results for these genes are shown in Fig. 4. The expression patterns determined by qRT-PCR were highly consistent with the transcriptome data. Further correlation analysis showed that the RNA-Seq data and qRT-PCR results were highly and significantly correlated (r = 0.639**), indicating that the RNA-Seq method provided reliable data for the identification of key genes involved in catechin metabolism in C. sinensis.

Fig. 4. Comparison of the expression profiles of 18 selected genes as determined by RNA-Seq and real-time PCR. Data of real-time PCR analysis are the means ± SD (n = 3).

3.4. Differentially expressed genes (DEGs) in C. sinensis of different tissues

To further investigate the gene expression patterns in the eight samples, PCA was performed to explore their relationships (Fig. 5A). Interestingly, samples of the same tissue clustered in the same group, similar to our findings for catechin distribution (Fig. 3), suggesting that tissue differences play a dominant role. Moreover, root samples were far from the stem and leaf groups, revealing large differences between them. Thus, the eight samples were divided into three groups, namely group R (1-R, 2-R, 3-R), S (1-S, 2-S, 3-S) and L (2-L, 3-L). Gene expression patterns were then compared between group R vs S, group R vs L and group S vs L. A total of 4142 DEGs were identified between group R and S, with 1841 genes up-regulated and 2301 down-regulated. Similarly, 7661 and 3845 DEGs were identified in the comparisons of group R vs L and S vs L (Fig. 5B). The numbers of DEGs identified in each comparison corresponded closely to the distances between groups in the PCA (Fig. 5A and B). To determine the putative functions of the DEGs, KEGG pathway enrichment analysis was conducted and a total of 1058, 1842 and 993 DEGs were mapped to reference canonical pathways in the comparisons of group R vs S, R vs L and S vs L, respectively (Additional file 4). Both phenylpropanoid biosynthesis (ko00940) and flavonoid biosynthesis (ko00941) were among the representative pathways in all three pairs of comparisons, which confirmed the tissue effects on the expression of catechin biosynthetic genes. In total, 19 functional genes associated with catechin biosynthesis and 41 TFs were identified among the DEGs with the expression pattern leaves or stems > roots. Moreover, as key functional genes generally have high expression levels when the contents of their products are high, candidate genes with a maximum FPKM < 50 across the eight samples were removed, leaving 16 potentially important functional genes. They included PAL (1 gene), 4CL (2), CHS (4), CHI (1), F3H (1), F3′H (1), F3′5′H (1), FLS (1), LAR (2), ANS (1) and ANR (1).
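The expression-level screening described in Section 3.4 can be sketched as a simple filter over per-sample FPKM values. This is an illustration under stated assumptions: the pattern "leaves or stems > roots" is approximated here by comparing group means, whereas the study relied on DEG calls from Cuffdiff; the gene names and FPKM values are hypothetical.

```python
from statistics import mean

SAMPLES = ["1-R", "2-R", "3-R", "1-S", "2-S", "3-S", "2-L", "3-L"]
GROUPS = {"R": SAMPLES[0:3], "S": SAMPLES[3:6], "L": SAMPLES[6:8]}


def group_mean(fpkm, group):
    """Mean FPKM of one gene over the samples of a tissue group."""
    return mean(fpkm[s] for s in GROUPS[group])


def passes_screen(fpkm, min_max_fpkm=50.0):
    """Keep a gene if leaves or stems exceed roots and its maximum FPKM is at least 50."""
    higher_than_roots = max(group_mean(fpkm, "L"), group_mean(fpkm, "S")) > group_mean(fpkm, "R")
    abundant = max(fpkm.values()) >= min_max_fpkm
    return higher_than_roots and abundant


# Hypothetical FPKM profiles, in the sample order listed in SAMPLES.
candidates = {
    "gene_A": dict(zip(SAMPLES, [2, 3, 1, 150, 200, 180, 400, 350])),   # kept
    "gene_B": dict(zip(SAMPLES, [1, 2, 1, 10, 12, 9, 30, 25])),         # max FPKM below 50
    "gene_C": dict(zip(SAMPLES, [300, 280, 310, 20, 25, 22, 5, 8])),    # highest in roots
}
kept = [gene for gene, fpkm in candidates.items() if passes_screen(fpkm)]
print(kept)  # ['gene_A']
```

The threshold of 50 is taken from the text; the group-mean comparison is only a stand-in for the statistical DEG test.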

Fig. 5. Principal component analysis of RNA-Seq data of eight tea samples (A) and differentially expressed genes among different tissues (B).

3.5. Correlation analysis of the candidate genes and catechins

Further correlation analysis was performed to check the importance of these genes. Significant correlations were observed between the TC contents and 7 functional genes [PAL (1), 4CL (1), CHS (1), CHI (1), F3′H (1), F3′5′H (1) and LAR (1)], indicating their importance in the activation of catechin biosynthesis (Fig. 6, Additional file 5). Most of these genes are involved in the early stage of the catechin biosynthetic pathway, such as PAL, 4CL, CHS and CHI. Moreover, two genes (CHI and F3′5′H) had extremely low expression levels (FPKM < 5) in roots and high levels (FPKM > 300) in leaves. Their expression patterns were highly consistent with the changes of catechins across tissues, and they might thus act as a bottleneck for catechin accumulation. Furthermore, the F3′5′H shared 99% identity with CsF3′5′Ha (GenBank: KY615696), which plays a key role in trihydroxylated catechin biosynthesis (Wang et al., 2014; Wei et al., 2015; Wang et al., 2018). Interestingly, no trihydroxylated catechins were biosynthesized in roots, where the expression of F3′5′H was low (FPKM < 3), whereas high ratios of trihydroxylated catechins to TC were found in stems (about 60%) and leaves (81%), where the average FPKMs of F3′5′H reached 549 and 945, respectively. These results further confirm the importance of CsF3′5′Ha in the formation of trihydroxylated catechins.

For TFs, a total of 35 DEGs showed significant correlations with TC contents, which included MYB (8), zinc finger (9), leucine zipper (4), RING-H2 finger (3), bHLH (1), WRKY (1), WD repeat-containing protein (1), ethylene-responsive TF (1) and F-box (1) genes (Fig. 7, Additional file 5). TFs play key roles in regulating gene expression. Interestingly, three candidate TFs were proposed to play important roles in regulating catechin biosynthesis. Among them, a homeobox-leucine zipper protein (TEA022050.1) was found to be highly similar to ANTHOCYANINLESS 2 (ANL2), a key regulator of anthocyanin distribution (Kubo et al., 1999). Moreover, a WRKY (TEA000149.1) highly similar to TTG2/WRKY44 and a MYB gene (CsAN1) similar to AtMYB113 were also identified among these differentially expressed TFs. Further investigation of these TFs would deepen our understanding of the regulatory system of catechin biosynthesis.

4. Discussion

Catechins are the main secondary metabolites in C. sinensis and have significant impacts on tea quality and taste. In this study, the concentrations of six catechin components, namely GC, EGC, C, EC, EGCG and ECG, were determined in tea seedlings at different growth stages. A clear distribution pattern, with high TC contents in leaves but extremely low TC contents in roots, was observed, consistent with a previous finding (Ashihara et al., 2010). This distribution pattern greatly benefits our understanding of the mechanism of high catechin accumulation. Previous studies have shown that many catechin biosynthetic genes are closely associated with the final catechin contents and compositions (Ashihara et al., 2010; Rani et al., 2012; Wei et al., 2015; Guo et al., 2017). However, which of them play key roles in catechin accumulation and should be emphasized in future breeding has not yet been identified. Recently, Wang et al. (2018) reported 36 catechin biosynthetic genes based on transcriptome sequence data, 26 of which could be mapped to the tea genome, offering a useful map for subsequent studies. In this study, 7 functional genes (PAL, 4CL, CHS, CHI, F3′H, F3′5′H and LAR) were positively and significantly correlated with the TC contents and showed high abundance in leaves. BLAST analysis showed that these genes were highly homologous to CsPALa (KY615669), CsCHSc (KY615682), CsCHIc (KY615685), CsF3′5′Ha (KY615696) and CsLARa (KY615698) described by Wang et al. (2018), indicating their importance in catechin biosynthesis. The F3′H and 4CL genes shared 100% and 86% identity with CsF3′H1 (KP335091) and Cs4CLb (KY615678), respectively; their functions still need more research. The expression patterns of CsPALa, CsCHSc, CsCHIc, CsF3′5′Ha and CsLARa in different tissues were consistent in both studies. Most of these genes are early biosynthetic genes (PAL, 4CL, CHS, CHI), suggesting that high accumulation of catechins depends on the activation of the whole phenylpropanoid and flavonoid biosynthetic pathways. This differs from the high accumulation of anthocyanins in purple tea, which was mainly affected by the late biosynthetic genes (Wei et al., 2016).

PAL and 4CL are key enzymes in the phenylpropanoid pathway and are also regarded as a starting point for the subsequent flavonoid pathway (Dixon et al., 2010). Synergistic changes between catechin contents and the expression of PAL and 4CL were also identified in tea plants under drought stress or gibberellic acid, abscisic acid and wounding treatments (Eungwanichayapant and Popluechai, 2009; Rani et al., 2012; Singh et al., 2009). Guo et al. (2017) reported, by correlation analysis of the transcriptome and catechin metabolites during tea leaf development, that 10 PAL genes might play important roles. Wu et al. (2017) also reported six PAL genes in tea plants, which displayed subtle differences in kinetics and enzymatic properties. However, ignoring expression levels in key tissues might overstate the importance of some less important genes. Our results showed that the PALa gene (TEA023243.1) was not only closely correlated with catechin contents, but also highly expressed in leaves and stems, revealing its importance in catechin biosynthesis.

Moreover, two functional genes (CsCHIc and CsF3′5′Ha) were highly expressed in tea leaves (FPKM > 300) but expressed at extremely low levels in roots (FPKM < 5), and might act as a bottleneck for catechin accumulation. These results also suggest that CsCHIc and CsF3′5′Ha are tissue-specific genes. Similar results were also found by Xia et al. (2017), with CsCHIc (3.79-fold) and CsF3′5′Ha (16.86-fold)


Fig. 6. Putative catechin biosynthetic pathway in Camellia sinensis. (a) Proposed pathway of catechin biosynthesis. Enzyme abbreviations: PAL, phenylalanine ammonia-lyase; C4H, cinnamate 4-hydroxylase; 4CL, 4-coumarate CoA ligase; CHS, chalcone synthase; CHI, chalcone isomerase; F3′H, flavonoid 3′-hydroxylase; F3′5′H, flavonoid 3′,5′-hydroxylase; F3H, flavanone 3-hydroxylase; FLS, flavonol synthase; DFR, dihydroflavonol 4-reductase; LAR, leucoanthocyanidin reductase; ANS, anthocyanidin synthase; ANR, anthocyanidin reductase. (b) Expression levels of unigenes involved in catechin biosynthesis. Green and red colors represent low-to-high expression levels, and color scales correspond to the mean-centered log2-transformed FPKM values. (For interpretation of the references to color in this figure legend, the reader is referred to the Web version of this article.)
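The color scale in Fig. 6(b) is described as mean-centered log2-transformed FPKM. A minimal sketch of that transformation for one gene is given below. The pseudocount of 1 added before taking the logarithm is an assumption made here to avoid log2(0); the source does not state how zero values were handled. The input values are hypothetical.

```python
import math


def mean_centered_log2(fpkm_values, pseudocount=1.0):
    """Log2-transform one gene's FPKM values and subtract the gene's mean log value."""
    logs = [math.log2(v + pseudocount) for v in fpkm_values]
    center = sum(logs) / len(logs)
    return [x - center for x in logs]


# Hypothetical FPKM values for one gene in three samples.
print(mean_centered_log2([1.0, 3.0, 7.0]))  # [-1.0, 0.0, 1.0]
```

Centering each gene on its own mean makes the heat map show relative differences between samples rather than absolute abundance.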

Fig. 7. Expression levels of potential transcription factors involved in catechin regulation.
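The correlation step of Section 2.6 can be sketched in a few lines. The study used SPSS; the code below is a stand-in that computes the Pearson coefficient between one gene's FPKM values and the TC contents over the eight samples, and tests significance with t = r·sqrt((n − 2)/(1 − r²)) against the two-tailed critical value for 6 degrees of freedom at P = 0.05 (2.447). All data values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for testing r against zero with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


T_CRIT_DF6_P05 = 2.447  # two-tailed critical t value, P = 0.05, df = 6 (n = 8)

# Hypothetical values in the order 1-R, 2-R, 3-R, 1-S, 2-S, 3-S, 2-L, 3-L.
tc = [1.5, 2.0, 2.6, 40.0, 45.0, 50.0, 100.0, 105.0]        # TC, mg/g
fpkm = [2.0, 3.0, 4.0, 500.0, 550.0, 600.0, 900.0, 950.0]   # one candidate gene

r = pearson_r(fpkm, tc)
t = t_statistic(r, len(tc))
print(round(r, 3), abs(t) > T_CRIT_DF6_P05)
```

With only eight samples, a correlation must be large in absolute value to pass this threshold, which is one reason the large tissue contrast in catechin content is useful for this kind of screening.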


more highly expressed in section Thea species than in non-Thea sections. CHI (EC 5.5.1.6) is an essential enzyme catalyzing the stereospecific isomerization of chalcones into the corresponding (2S)-flavanones. Suppression of CHI by RNA interference (RNAi) in tobacco resulted in reduced pigmentation and changes in flavonoid components (Nishihara et al., 2005). Therefore, it can be deduced that high expression of CsCHIc in tea leaves is essential for high catechin accumulation. As for CsF3′5′Ha, its expression levels were closely correlated with trihydroxylated catechin biosynthesis in different tissues of tea seedlings, consistent with previous findings (Wei et al., 2015; Jin et al., 2016). CsF3′5′Ha is a member of the CYP75A subfamily. Over-expression of CsF3′5′Ha was reported to increase the accumulation of both cyanidin and delphinin, although more cyanidin derivatives were observed (Wang et al., 2014). Thus, the expression of CsF3′5′Ha may not only activate trihydroxylated catechin biosynthesis, but also increase the TC contents.

TFs are considered key regulators of functional gene expression. Guo et al. (2017) identified 30 TFs potentially involved in catechin regulation in tea plants. Wei et al. (2018) also found a number of TFs strongly associated with the expression of catechin biosynthetic genes, including MADS box, R2R3-MYB and bHLH TFs. Here, 35 TFs were found to be closely correlated with TC accumulation. Among them, three TFs were proposed to play important roles in catechin regulation. A homeobox-leucine zipper protein (TEA022050.1) was found to have high identity to ANL2, a key regulator of anthocyanin accumulation (Kubo et al., 1999). As anthocyanins and catechins share most of their biosynthetic pathway, this homeobox protein is possibly also involved in the regulation of catechins.

Furthermore, a MYB gene (CsAN1) was found to be a key regulator of anthocyanin accumulation in purple tea (Sun et al., 2016; Wei et al., 2016; He et al., 2018). AN1 is also highly similar to the Arabidopsis PAP1 or AtMYB113, which are key members of a ternary MBW complex involved in the regulation of flavonoid biosynthesis, such as flavonols (Czemmel et al., 2012), anthocyanidins (Gonzalez et al., 2008; Jaakola, 2013) and proanthocyanidins (Nesi et al., 2001). Our previous study showed that CsAN1 might mainly activate the expression of late flavonoid biosynthetic genes (Wei et al., 2016). A significant and positive correlation between the expression of CsAN1 and total anthocyanin content was also found in tea seedlings (Additional file 6), suggesting that the MBW complex might be a key regulator of anthocyanin accumulation in tea. In this study, although a significant correlation was also identified between CsAN1 and TC content, many early flavonoid biosynthetic genes were activated in tissues with abundant catechins, which differs from the regulatory mechanism of MBW. Therefore, the regulatory role of the MYB gene (CsAN1) in the accumulation of catechins still needs further research.

On the other hand, a WRKY (TEA000149.1) highly similar to TTG2/WRKY44 was found to be closely correlated with catechin distribution. Gonzalez et al. (2016) found that TTG2 mutants in Arabidopsis lack the proanthocyanidins (PAs or condensed tannins) found in wild-type seeds, but produce other flavonoid compounds, such as anthocyanins in the shoot, suggesting that TTG2 regulates genes in the PA biosynthetic branch of the flavonoid pathway. A similar close correlation between the expression pattern of the WRKY gene and catechin biosynthetic genes was also identified by Wei et al. (2018). As catechins are the monomeric units of PAs, the WRKY gene (TEA000149.1) seems to be more important for catechin regulation.

5. Conclusions

In summary, the de novo transcriptome data of different organs in tea seedlings at three developmental stages were analyzed. A total of 355.81 million clean reads were generated and mapped to the tea genome. Analysis of gene expression patterns and catechin contents, together with correlation analysis, identified 42 candidate key genes participating in catechin biosynthesis. Among them, seven structural genes (namely PAL, 4CL, CHS, CHI, F3′H, F3′5′H and LAR) were significantly and positively correlated with TC concentrations, suggesting their essential roles in catechin biosynthesis. Furthermore, 35 TFs, especially three genes homologous to ANL2, WRKY44 and AtMYB113, were also identified and might be involved in the regulation of catechin biosynthesis. This study provides valuable information for studying the biosynthetic and metabolic pathways of secondary metabolites, which will facilitate gene functional studies in tea plants.

Author contribution

Kang Wei, Liyuan Wang and Hao Cheng conceived and designed the experiments. Yazhen Zhang, Kang Wei, Hailin Li and Dandan Pang performed the experiments. Yazhen Zhang and Kang Wei analyzed the data. Kang Wei, Liyuan Wang, Li Ruan and Hao Cheng contributed reagents, materials and analysis tools. Yazhen Zhang and Kang Wei wrote the paper.

Acknowledgments

This work was supported by the National Natural Science Foundation of China (31470396), the Major Project of Agricultural Science and Technology in Breeding of Tea Plant Variety in Zhejiang Province (2016C02053-3), the Central Public-interest Scientific Institution Basal Research Fund (1610212017002 and 1610212018004) and the Earmarked Fund for China Agriculture Research System (CARS-19).

Appendix A. Supplementary data

Supplementary data to this article can be found online at https://doi.org/10.1016/j.plaphy.2018.10.029.

References

Ashihara, H., Deng, W.W., Mullen, W., Crozier, A., 2010. Distribution and biosynthesis of flavan-3-ols in Camellia sinensis seedlings and expression of genes encoding biosynthetic enzymes. Phytochemistry 71, 559–566.
Cavet, M.E., Harrington, K.L., Vollmer, T.R., Ward, K.W., Zhang, J.Z., 2011. Anti-inflammatory and anti-oxidative effects of the green tea polyphenol epigallocatechin gallate in human corneal epithelial cells. Mol. Vis. 17, 533–542.
Chen, C.S., Wei, K., Wang, L.Y., Ruan, L., Li, H.L., Zhou, X.G., Lin, Z.H., Shan, R.Y., Cheng, H., 2017. Expression of key structural genes of the phenylpropanoid pathway associated with catechin epimerization in tea cultivars. Front. Plant Sci. 8, 702.
Cho, Y.S., Oh, J.J., Oh, K.H., 2011. Synergistic anti-bacterial and proteomic effects of epigallocatechin gallate on clinical isolates of imipenem-resistant Klebsiella pneumoniae. Phytomedicine 18, 941–946.
Czemmel, S., Heppel, S.C., Bogs, J., 2012. R2R3 MYB transcription factors: key regulators of the flavonoid biosynthetic pathway in grapevine. Protoplasma 249 (Suppl. 2), S109–118.
Dixon, R.A., Achnine, L., Kota, P., Liu, C.J., Reddy, M.S.S., Wang, L., 2010. The phenylpropanoid pathway and plant defence-a genomics perspective. Mol. Plant Pathol. 3, 371–390.
Eungwanichayapant, P.D., Popluechai, S., 2009. Accumulation of catechins in tea in relation to accumulation of mRNA from genes involved in catechin biosynthesis. Plant Physiol. Biochem. 47, 94–97.
Gonzalez, A., Zhao, M., Leavitt, J.M., Lloyd, A.M., 2008. Regulation of the anthocyanin biosynthetic pathway by the TTG1/bHLH/Myb transcriptional complex in Arabidopsis seedlings. Plant J. 53, 814–827.
Gonzalez, A., Brown, M., Hatlestad, G., Akhavan, N., Smith, T., Hembd, A., Moore, J., Montes, D., Mosley, T., Resendez, J., Nguyen, H., Wilson, L., Campbell, A., Sudarshan, D., Lloyd, A., 2016. TTG2 controls the developmental regulation of seed coat tannins in Arabidopsis by regulating vacuolar transport steps in the proanthocyanidin pathway. Dev. Biol. 419, 54–63.
Guo, F., Guo, Y.F., Wang, P., Wang, Y., Ni, D.J., 2017. Transcriptional profiling of catechins biosynthesis genes during tea plant leaf development. Planta 246, 1139–1152.
He, X.J., Zhao, X.C., Gao, L.P., Shi, X.X., Dai, X.L., Liu, Y.J., Xia, T., Wang, Y.S., 2018. Isolation and characterization of key genes that promote flavonoid accumulation in purple-leaf tea (Camellia sinensis L.). Sci. Rep. 8, 130.
Jaakola, L., 2013. New insights into the regulation of anthocyanin biosynthesis in fruits. Trends Plant Sci. 18, 477.
Jin, J.Q., Ma, J.Q., Yao, M.Z., Ma, C.L., Chen, L., 2016. Functional natural allelic variants of flavonoid 3′,5′-hydroxylase gene governing catechin traits in tea plant and its relatives. Planta 245, 1–16.
Kim, D., Langmead, B., Salzberg, S.L., 2015. HISAT: a fast spliced aligner with low

Plant Physiology and Biochemistry 133 (2018) 107–115

Y. Zhang et al.

sinensis): critical role in the accumulation of catechins. BMC Plant Biol. 14, 347. Wang, W.Z., Zhou, Y.H., Wu, Y.L., Dai, X.L., Liu, Y.J., Qian, Y.M., Li, M.Z., Jiang, X.L., Wang, Y.S., Gao, L.P., Xia, T., 2018. Insight into catechins metabolic pathways of Camellia sinenss based on genome and transcriptome analysis. J. Agric. Food Chem. 66, 4281–4293. Wei, C.L., Yang, H., Wang, S.B., Zhao, J., Liu, C., Gao, L.P., Xia, E.H., Lu, Y., Tai, Y.L., She, G.B., et al., 2018. Draft genome sequence of Camellia sinensis var. sinensis provides insights into the evolution of the tea genome and tea quality. Proc. Natl. Acad. Sci. U.S.A. 115, E4151–E4158. Wei, K., Wang, L.Y., Zhou, J., He, W., Zeng, J.M., Jiang, Y.W., Cheng, H., 2011. Catechin contents in tea (Camellia sinensis) as affected by cultivar and environment and their relation to chlorophyll contents. Food Chem. 125, 44–48. Wei, K., Wang, L.Y., Zhang, C.C., Wu, L.Y., Li, H.L., Zhang, F., Cheng, H., 2015. Transcriptome analysis reveals key flavonoid 3′-Hydroxylase and flavonoid 3′,5′hydroxylase genes in affecting the ratio of dihydroxylated to trihydroxylated catechins in Camellia sinensis. PloS One 10, e0137925. Wei, K., Zhang, Y.Z., Wu, L.Y., Li, H.L., Ruan, L., Bai, P.X., Zhang, C.C., Zhang, F., Xu, L.Y., Wang, L.Y., Cheng, H., 2016. Gene expression analysis of bud and leaf color in tea. Plant Physiol. Biochem. 107, 310–318. Winkel-Shirley, B., 2001. Flavonoid biosynthesis: a colorful model for genetics, biochemistry, cell biology, and biotechnology. Plant Physiol. (Sofia) 126, 485–493. Wu, Y.L., Wang, W.Z., Li, Y.Z., Dai, X.L., Ma, G.L., Xing, D.W., Zhu, M., Gao, L., Xia, T., 2017. Six phenylalanine ammonia-lyases from Camellia sinensis: evolution, expression, and kinetics. Plant Physiol. Biochem. 118, 413–421. Xia, E.H., Zhang, H.B., Sheng, J., Li, K., Zhang, Q.J., Kim, C., Zhang, Y., Liu, Y., Zhu, T., Li, W., et al., 2017. 
The tea tree genome provides insights into tea flavor and independent evolution of caffeine biosynthesis. Mol. Plant 10, 866–877. Xiong, L.G., Li, J., Li, Y.H., Yuan, L., Liu, S.Q., Huang, J.A., Liu, Z.H., 2013. Dynamic changes in catechin levels and catechin biosynthesis-related gene expression in albino tea plants (Camellia sinensis L.). Plant Physiol. Biochem. 71, 132–143. Zhang, L.Q., Wei, K., Cheng, H., Wang, L.Y., Zhang, C.C., 2016. Accumulation of catechins and expression of catechin synthetic genes in Camellia sinensis at different developmental stages. Botanical Stud 57, 31. Zhou, T.S., Zhou, R., Yu, Y.B., Xiao, Y., Li, D.H., Xiao, B., Yu, O., Yang, Y.J., 2016. Cloning and characterization of a flavonoid 3’-hydroxylase gene from tea plant (Camellia sinensis). Int. J. Mol. Sci. 17, 261.

memory requirements. Nat. Methods 12, 357–360. Kubo, H., Peeters, A.J., Aarts, M.G., Pereira, A., Koornneef, M., 1999. ANTHOCYANINLESS2, a homeobox gene affecting anthocyanin distribution and root development in Arabidopsis. Plant Cell 11, 1217–1226. Lepiniec, L., Debeaujon, I., Routaboul, J.M., Baudry, A., Pourcel, L., Nesi, N., Caboche, M., 2006. Genetics and biochemistry of seed flavonoids. Annu. Rev. Plant Biol. 57, 405–430. Liu, M., Tian, H., Wu, J.H., Cang, R.R., Wang, R.X., Qi, X.H., Xu, Q., Chen, X.H., 2015. Relationship between gene expression and the accumulation of catechin during spring and autumn in tea plants (Camellia sinensis L.). Hortic. Res. 2, 15011. Livaka, K.J., Schmittgen, T.D., 2001. Analysis of relative gene expression data using realtime quantitative PCR and the 2-ΔΔCT method. Methods 25, 402–408. Nesi, N., Jond, C., Debeaujon, I., Caboche, M., Lepiniec, L., 2001. The arabidopsis TT2 gene encodes an R2R3 MYB domain protein that acts as a key determinant for proanthocyanidin accumulation in developing seed. Plant Cell 13, 2099–2114. Nishihara, M., Nakatsuka, T., Yamamura, S., 2005. Flavonoid components and flower color change in transgenic tobacco plants by suppression of chalcone isomerase gene. FEBS Lett. 579, 6074–6078. Rani, A., Singh, K., Ahuja, P.S., Kumar, S., 2012. Molecular regulation of catechins biosynthesis in tea [Camellia sinensis (L.) O. Kuntze]. Gene 495, 205–210. Singh, K., Kumar, S., Rani, A., Gulati, A., Ahuja, P.S., 2009. Phenylalanine ammonialyase (PAL) and cinnamate 4-hydroxylase (C4H) and catechins (flavan-3-ols) accumulation in tea. Funct. Integr. Genom. 9, 125–134. Sun, B.M., Zhu, Z.S., Gao, P.R., Chen, H., Chen, C.M., Zhou, X., Mao, Y.H., Lei, J.J., Jiang, Y.P., Meng, W., Wang, Y.X., Liu, S.Q., 2016. Purple foliage coloration in tea (Camellia sinensis L.) arises from activation of the R2R3-MYB transcription factor CsAN1. 
Sci Rep 6:32534Thomasset SC, Berry DP, Garcea G, Marczylo T, Steward WP, Gescher AJ (2007) Dietary polyphenolic phytochemicals promising cancer chemopreventive agents in humans? A review of their clinical properties. Int. J. Canc. 120, 451–458. Thomasset, S.C., Berry, D.P., Garcea, G., Marczylo, T., Steward, W.P., Gescher, A.J., 2007. Dietary polyphenolic phytochemicals-promising cancer chemopreventive agents in humans? A review of their clinical properties. Int. J. Cancer 120 (3), 451–458. Trapnell, C., Hendrickson, D.G., Sauvageau, M., Goff, L., Rinn, J.L., Pachter, L., 2013. Differential analysis of gene regulation at transcript resolution with RNA-seq. Nat. Biotechnol. 31, 46e53. Wang, Y.S., Xu, Y.J., Gao, L.P., Yu, O., Wang, X.Z., He, X.J., Jiang, X.L., Liu, Y.J., Xia, T., 2014. Functional analysis of flavonoid 3',5'-hydroxylase from tea plant (Camellia

115