Journal Pre-proofs
Research paper
Identification of drought response genes by digital gene expression (DGE) analysis in Caragana korshinskii Kom
Yan Long, Fengping Liang, Jingwen Zhang, Mande Xue, Tianbao Zhang, Xinwu Pei
PII: S0378-1119(19)30829-7
DOI: https://doi.org/10.1016/j.gene.2019.144170
Reference: GENE 144170
To appear in: Gene
Received Date: 18 December 2018
Revised Date: 12 October 2019
Accepted Date: 15 October 2019
Please cite this article as: Y. Long, F. Liang, J. Zhang, M. Xue, T. Zhang, X. Pei, Identification of drought response genes by digital gene expression (DGE) analysis in Caragana korshinskii Kom, Gene (2019), doi: https://doi.org/10.1016/j.gene.2019.144170
This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
© 2019 Published by Elsevier B.V.
Identification of drought response genes by digital gene expression (DGE) analysis in Caragana korshinskii Kom.
Yan Long1, Fengping Liang1,2, Jingwen Zhang1,2, Mande Xue1, Tianbao Zhang1, Xinwu Pei1*
1: Biotechnology Research Institute, Chinese Academy of Agricultural Sciences, Beijing 100081, China
2: Ministry of Education Key Laboratory for Ecology of Tropical Islands, College of Life Sciences, Hainan Normal University, Haikou 571158, China
*: Corresponding author: Prof. Dr. XinWu Pei, E-mail: [email protected], Tel: +86-10-82106119, Fax: +86-10-82106118
Abstract
Caragana korshinskii Kom. is a legume shrub that is widely distributed across desert habitats with gravelly, sandy, and saline soils in Asia and Africa. C. korshinskii has highly developed roots and a strong tolerance to abiotic stress. At present, there are few genetic studies of C. korshinskii because of the limited availability of genomic resources. To better understand the mechanisms associated with drought tolerance, we used RNA-seq to survey the differentially expressed genes (DEGs) between drought-treated and control plants. After analysing the sequencing results, we found 440 genes that were differentially expressed between drought-treated and control plants. Among the DEGs, 39 unigenes showed up-regulated expression after drought treatment, while 401 unigenes were down-regulated. We used the KEGG database to annotate these drought-responsive genes; 126 unigenes were identified by KEGG pathway annotation, and approximately 28% of the unigenes with known function fell into categories related to fatty acid metabolism, starch and sucrose metabolism, and nitrogen metabolism, suggesting that these pathways or processes may be involved in the drought response. Finally, we confirmed that one gene has a potential function in drought tolerance. Our study is the first to provide transcriptomic resources for Caragana korshinskii and to determine its digital gene expression profile under conditions of drought stress using the assembled transcriptomic data for reference. These data provide a valuable resource for genetic and genomic studies of desert plants under abiotic stress conditions.
Keywords: Caragana korshinskii Kom., drought stress, digital gene expression
1. Introduction
Drought is one of the most common and frequent abiotic environmental stresses, and it can have a profound impact on plant development and productivity, resulting in serious yield losses to crops around the world (Golldack et al., 2011). Drought also enhances the damage caused by other stresses, such as salinity and high and low temperatures (Farooq et al., 2009). When plants encounter drought conditions, they undergo changes in the expression of endogenous genes and in the contents of certain metabolites and specific proteins that protect them from stress. A number of drought-responsive gene families, such as NAC and WRKY, have been previously identified (Jeong et al., 2010; Ren et al., 2013). Based on these important drought response genes, regulatory networks have been constructed (Golldack et al., 2014; Shanker et al., 2014). Two groups of drought stress-inducible genes are included in the known drought regulatory network: the first group directly protects plants against environmental stresses, while the second group responds to environmental changes by regulating gene expression networks and signaling pathways. Recently, progress has been made in analyzing the functions of stress-inducible genes, not only to understand the mechanisms of drought stress but also to improve drought tolerance in plants by gene transfer.
Caragana korshinskii, commonly known as peashrub, is a small shrub in the botanical family Fabaceae that is widely distributed in sandy grasslands of Northwestern China and Mongolia (Wang and Gao, 2008), and is extremely drought and salt tolerant. C. korshinskii can tolerate severe drought stress with soil water content as low as 7.38% (Yan-Jin et al., 2008) and high concentrations of NaCl (Hui et al., 2012). Presently, peashrub is grown both for pasture and as an ornamental plant [10]. However, despite its high economic and ecological value, genetic and genomic studies of C. korshinskii are relatively scarce. Much of the research conducted on this species to date has focused on the physiological mechanisms responsible for its resistance to abiotic stress factors (Wang et al., 2011; Yang et al., 2012). Relatively few drought-related genes have been identified in C. korshinskii; for example, Yang et al. (2013) cloned CkLEA1 and found that the expression of this gene is induced by drought, ABA, cold, heat, and salt stress treatments (Yang, 2013). In addition, CkWRKY1 (Yang and Yin, 2013) and CKNCED1 (Ren et al., 2013) have been cloned from C. korshinskii using PCR-based methods. Although the above genes have been identified, isolating and characterizing individual genes is time consuming and labor intensive. Genome-wide sequencing provides opportunities for large-scale gene discovery.
With the recent development of next-generation high-throughput DNA sequencing technologies, large-scale transcriptome data can be readily generated for both model and non-model species. Hegedus et al. (2009) were the first to use the Solexa/Illumina Digital Gene Expression (DGE) system to discover differentially expressed genes in the zebrafish transcriptome (Hegedus et al., 2009). Since then, RNA-Seq and DGE technology have also been widely used to identify plant genes, including those expressed in response to stress conditions related to important agronomic traits (Xu et al., 2012; Li et al., 2013; Tian et al., 2013). For example, Su et al. (2013) characterized a total of 4,153 differentially expressed genes in unopened flower buds between drought-treated and control samples in the model plant Arabidopsis (Su et al., 2013). In wild barley, Bedada et al. (2014) selected one desert type (B1K2) and one Mediterranean type (B1K30) and performed de novo transcriptome sequencing and DGE analysis to discover genes involved in the drought response (Bedada et al., 2014).
In our previous study, we used RNA-Seq and de novo assembly to produce a transcriptome library in Caragana korshinskii Kom. In total, the sequencing reads were assembled into 86,265 unigenes (mean length = 709 bp), and similarity searches indicated that 33,955 and 21,978 unigenes showed significant similarities to known proteins from the NCBI non-redundant and Swissprot protein databases, respectively (Long et al., 2015). The genetic mechanisms by which this desert shrub responds to drought stress are unknown at present. Therefore, in the current study, we constructed two DGE libraries from different tissues of drought-treated and control Caragana korshinskii plants. After sequencing, the raw Illumina reads were mapped to the reference transcriptome, DEG profiling was performed, and potential drought tolerance genes were identified by gene transformation. The differentially expressed genes identified in this study provide genetic resources for understanding the basis of drought tolerance in C. korshinskii and other drought tolerant crops.
2. Materials and Methods
2.1. Plant material and sample collection
The Caragana korshinskii seeds used in this study were provided by the Gansu Desert Control Institute. The seeds were first sown on damp filter paper and incubated at 4°C for four days before being placed at 23°C under long-day (16 h light/8 h dark) conditions. After the seedlings had reached the two-leaf stage, they were transferred to pots containing soil. The seedlings were then grown in four pots (20 seedlings/pot) representing the two treatments (drought and control). After the seedlings had grown for one month, one set of seedlings was subjected to drought stress, and the second set was used as the control and was grown under normal irrigation and light conditions. The treatment pots were not watered until the symptoms of wilting appeared. Wilting appeared after the plants had been without water for one week, so we chose this one-week time point to identify genes that respond to drought stress. Several tissues, including leaves, roots, and stems, were harvested from the drought-treated and control plants for RNA isolation.
2.2. Relative water content (RWC) measurement
Relative water content (RWC) is one of the most appropriate measurements of plant water status in terms of the physiological consequence of cellular water deficit. In the current study, we measured RWC using a method based on that described by Barrs and Weatherley (1962). Leaf samples (100 mg) were collected from drought-treated plants and the fresh weight was determined. The leaves were floated on water for 4 h to achieve full turgidity and were weighed again to obtain the turgid (saturated fresh) weight. The leaves were then oven-dried at 105°C for 15 min and 65°C for 48 h and the dry weight was determined. RWC was calculated and expressed as a percentage according to the following equation:
RWC (%) = (Fresh weight - Dry weight) / (Saturated fresh weight - Dry weight) × 100%.
Three replicates of drought-treated leaves were collected for RWC determination.
2.3. RNA isolation, DGE library preparation, and RNA sequencing
The drought-treated and control plants were used for RNA isolation. Total RNA was extracted from seedling tissues, including leaves, stems, and roots, for DGE library construction with TRIzol Reagent (Invitrogen, 15596-026) according to the manufacturer’s instructions. Each DGE library had two biological replicates. We used 3 μg of total RNA per sample as input material for RNA library preparation. The DGE sequencing libraries were generated using NEBNext Ultra RNA Library Prep Kits for Illumina (NEB, USA). Following the instructions provided by Illumina, mRNA was purified from the pooled total RNA using oligo dT magnetic beads (Novogene, China). Fragmentation buffer was added to reduce the mRNA molecules to short fragments. Reverse transcriptase and random primers were used to synthesize the first-strand cDNA from the mRNA fragments. Second-strand cDNA was synthesised using buffer, dNTPs, RNase H, and DNA polymerase I. The double-stranded cDNA was purified using QIAquick PCR extraction kits (QIAGEN, Hilden, Germany) and washed with EB buffer for end repair and addition of a terminal A (adenine) nucleotide. Illumina sequencing adapters were then ligated onto the cDNA fragments. The cDNA fragments containing terminal adapters were purified with AMPure XP beads and enriched by PCR to construct the library for RNA sequencing.
2.4. DGE library sequencing and mapping
After the raw data were generated and the data-processing steps were completed, the clean reads were mapped to the assembled transcriptome reference sequences (Long et al., 2015) using RSEM software (Li and Dewey, 2011). Mismatches of no more than 2 bases were allowed in the alignments. The read count for each gene was obtained from the mapping results. The raw data for all four libraries were submitted to the NCBI GEO database under accession number GSE132673.
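Per-gene read counts such as those obtained from the mapping step are usually converted into length- and depth-normalised expression values before samples are compared. As a rough illustration only, the FPKM (fragments per kilobase of transcript per million mapped fragments) normalisation can be sketched as below. The function name and the example numbers are hypothetical; they are not taken from the study's data or from the RSEM output.

```python
def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Return FPKM for one transcript.

    FPKM = fragments * 1e9 / (transcript length in bp * total mapped fragments),
    i.e. fragments per kilobase of transcript per million mapped fragments.
    """
    if transcript_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("length and library size must be positive")
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


# Hypothetical unigene: 500 fragments mapped to a 1,000 bp transcript
# in a library with 10 million mapped fragments.
example_value = fpkm(500, 1000, 10_000_000)
print(example_value)  # 50.0
```

Dividing by both transcript length and library size is what allows expression levels to be compared between unigenes of different lengths and between libraries of different sequencing depths.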
2.5. Differentially-expressed unigene identification
Gene expression levels were calculated based on the numbers of reads mapped to the reference sequence, using the FPKM (Fragments Per Kilobase of transcript per Million mapped reads) method (Trapnell et al., 2010). After calculating gene expression levels, the differentially expressed genes (DEGs) were identified by comparing gene expression levels between the two conditions. Following the method described by Anders and Huber (2010), differential expression analysis of the two conditions was performed using the DESeq R package (1.10.1). DESeq provides statistical routines for determining differential expression in digital gene expression data using a model based on the negative binomial distribution. The resulting P values were adjusted using Benjamini and Hochberg’s approach for controlling the false discovery rate. In this study, unigenes with an adjusted P<0.05 identified by DESeq were considered to be differentially expressed.
2.6. Functional annotation of DEGs
For functional annotation, the assembled unigenes that could possibly encode proteins were used as search queries against the nr (http://www.ncbi.nlm.nih.gov/), SWISS-PROT (http://www.expasy.ch/sprot/), KEGG (http://www.genome.jp/kegg/), and COG (http://www.ncbi.nlm.nih.gov/cog/) databases using the BLASTX algorithm. A typical cut-off value of E-value<1e-5 was used. Based on the nr annotations, the Blast2GO program (Conesa et al., 2005) was then used to assign GO annotations to the unigenes according to the three main ontologies “molecular function”, “biological process” and “cellular component”. After obtaining GO annotations for all unigenes, WEGO software (Ye et al., 2006) was used to assign GO functional classifications to all the unigenes and to understand the distribution of gene functions for the species at the macro level.
2.7. Quantitative real-time PCR validation for DEGs
To confirm the DGE results, quantitative real-time reverse transcription PCR (qRT-PCR) analysis was performed on 14 drought-induced unigenes that were randomly chosen from the two libraries. The primers employed in the qRT-PCR experiments are listed in Supplementary File 1. qRT-PCR was performed using the SYBR premix Ex Taq kit (Abm, China) on an ABI 7500 Real-Time System (Applied Biosystems), with first-strand cDNA as the template. The Actin unigene (comp59129_c2) from the reference assembly was used as the internal control. The relative quantitative (2^-ΔΔCt) method was used to calculate the fold-change in the expression levels of target genes (Quail et al., 2008). All reactions were performed in three technical replicates using three biological samples.
2.8. Vector construction and gene transformation
To further identify functional drought tolerance genes in Caragana korshinskii, the gene structures of the 39 up-regulated unigenes were screened, and 16 unigenes were found to have full-length coding regions. The sequence information and potential biological functions of these unigenes are listed in Supplementary File 3. The full-length CDS of each of these 16 unigenes was then cloned into the pBinGlyRed3 vector. The plasmids were double digested with the restriction endonucleases EcoRI and XmaI and then ligated with the specific transcript fragments so that the expression of each unigene was under the control of the CaMV 35S promoter. The constructs were transformed into Agrobacterium strain EHA105 using the freeze–thaw method. Arabidopsis Col-0 plants were then transformed using the floral dip method (Clough and Bent, 1998), with untransformed Arabidopsis plants used as wild-type (WT) controls. Transgenic plants were selected on MS medium supplemented with kanamycin.
2.9. Phenotypic screening of transgenic plants
For phenotypic screening, approximately 200 seeds from WT and two T2-generation homozygous transgenic lines were sown in soil and grown under normal management, and water was then withheld once the plants had reached the four-leaf stage. Because malondialdehyde and soluble sugar contents are two important indices for evaluating the drought response of plants, both were measured after two weeks of drought treatment. The malondialdehyde content was measured using the MDA Kit (Solarbio, China). The soluble sugar content was measured using the Micro Plant Soluble Sugar Content Assay Kit (Solarbio, China). The survival rate was also calculated, and three replicates were performed for each line. All plants were maintained in a greenhouse under standard conditions (24°C day/18°C night; 16 h light/8 h dark). Leaves were collected from 21-day-old plants (before drought treatment), during drought treatment, and after re-watering for relative gene expression analysis. Relative gene expression values were determined using the 2^-ΔΔCt method (Quail et al., 2008) with Actin1 (AT2G37620) as an internal control. All reactions were performed in three technical replicates using three biological samples.
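The 2^-ΔΔCt calculation used for the qRT-PCR data in Sections 2.7 and 2.9 can be illustrated with a minimal sketch. The function name and the Ct values below are hypothetical and serve only to show the arithmetic; they are not measurements from this study.

```python
def relative_expression(ct_target_treated, ct_ref_treated,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt method.

    dCt  = Ct(target) - Ct(internal control), computed per sample
    ddCt = dCt(treated) - dCt(control)
    fold change = 2 ** (-ddCt)
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses the threshold three cycles
# earlier (relative to the internal control) in the treated sample.
fold_change = relative_expression(
    ct_target_treated=22, ct_ref_treated=18,
    ct_target_control=25, ct_ref_control=18,
)
print(fold_change)  # 8.0
```

A value above 1 indicates higher expression in the treated sample than in the control, and a value below 1 indicates lower expression. The calculation assumes roughly equal amplification efficiency for the target and the internal control.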
3. Results
3.1. Drought stress phenotypes in C. korshinskii
After one week of drought treatment, the drought-treated plants began to display wilting, and the leaves at the bottom of the plants began to turn yellow. No wilting was observed in the control plants. In addition to the effect of drought on the leaves, there were many more roots in the drought-treated plants than in the control plants (Fig. 1). We then measured the leaf relative water content in three replicates; the values were 78.4%, 74.6%, and 80.4%, which showed that C. korshinskii has a strong ability to retain water when experiencing drought stress.
3.2. Statistics of DGE sequence abundance in drought-treated and control plants
To discover the molecular events that occur during exposure to drought, two digital gene expression (DGE) libraries were constructed by sequencing RNA extracted from two pools of control and drought-treated plants using Illumina technology. Following the removal of adapters and low-quality reads, 12,053,819 and 14,601,238 reads were obtained for the two control replicates, and 13,625,328 and 14,738,241 reads were obtained for the two replicates from the drought-treated plants. We then mapped the clean reads to the transcriptome reference data; a total of 59,491 and 63,307 unigene sequences were identified for the control replicates, and 63,050 and 66,925 unigene sequences were identified for the drought treatment replicates (Table 1).
3.3. Identification and functional annotation of differentially-expressed genes (DEGs)
Gene expression levels were obtained using the FPKM method. Based on the normalized gene expression levels, genes that were significantly differentially expressed between drought-treated and control plants were identified, and the differentially expressed unigenes were classified as either up- or down-regulated. A total of 440 DEGs (Supplementary file 2) were detected between the drought-treated and control RNA-Seq libraries. Among all the differentially expressed unigenes, 49 were induced by drought treatment, and 391 showed down-regulated expression after one week of drought treatment.
In order to classify the functions of the potential DEGs in C. korshinskii, we performed gene ontology (GO) classification and Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway analysis. We were unable to identify homologous genes in the NCBI database for 76 of the 440 differentially expressed unigenes. In the GO analysis, the unigenes were assigned to the three main categories: biological process (166, 37.64%), molecular function (42, 9.52%), and cellular component (16, 3.63%). In the biological process category, the terms oxidation-reduction process (46, 10.2%) and carbohydrate metabolic process (38, 8.39%) were prominently represented, indicating that important metabolic activities occurred in C. korshinskii in response to drought (Fig. 2). In the molecular function category, oxidoreductase activity (61, 36.1%) and iron ion binding (16, 9.5%) were the first and second most represented terms, respectively. In the cellular component category, extracellular region (27, 36%) and chromatin (14, 18.7%) were the two major terms. The remaining GO terms (cell wall, nucleosome, protein-DNA complex, and external encapsulating structure) accounted for 45.3% of the DEGs.
3.4. Functional classifications using KEGG pathway analysis of the DEGs
The Kyoto Encyclopaedia of Genes and Genomes (KEGG) pathway database is a knowledge base for the systematic analysis of gene functions in terms of networks of genes and molecules in cells and their variants specific to particular organisms. To further analyze the DEGs identified between the drought-treated and control plants of C. korshinskii, all the DEGs were used as search queries against the KEGG pathway database. Of the 440 DEGs, 126 (28.64%) that gave significant matches in the database were assigned to 57 KEGG pathways. Among the top 20 pathways, the glycolysis/gluconeogenesis pathway had the greatest number of DEGs, followed by pentose and glucuronate interconversions and the taurine and hypotaurine metabolism pathway (Fig. 3). These results indicate that active metabolic processes occur during drought treatment.
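The DEG set analysed above was defined by an adjusted P < 0.05 after Benjamini-Hochberg correction (Section 2.5). As a hedged illustration of how that adjustment works, the sketch below implements the standard Benjamini-Hochberg step-up procedure in plain Python. It is not the DESeq implementation, and the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each P value is scaled by n / rank (rank 1 = smallest P), and a running
    minimum taken from the largest rank downwards keeps the adjusted values
    monotone and capped at 1.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * n
    running_min = 1.0
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four unigenes.
raw_p = [0.01, 0.04, 0.03, 0.5]
adj_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adj_p]
print(significant)  # [True, False, False, False]
```

In this toy example three unigenes have raw P values below 0.05, but only one remains below the threshold after adjustment, which is why the number of DEGs reported depends on the adjusted rather than the raw P values.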
3.5. Expression of selected genes that are differentially regulated between the two DGE libraries
To confirm the gene expression data, 14 unigenes in which expression was up-regulated in the drought-treated plants were randomly chosen from the two libraries for qRT-PCR analysis. The results showed that expression of all of the unigenes was significantly up-regulated in the drought-treated plants. As shown in Fig. 4, the unigene expression trends were similar for the DGE RNA-Seq and qRT-PCR data; the two methods differed only in the fold-change values. After examining the gene annotation information available for these unigenes, we found that information on gene function was available for nine unigenes; however, there were no matches in any database for the remaining five unigenes. These results suggest that some drought-tolerance pathway genes may work together to help the plant adapt to drought stress.
3.6. Phenotypic analysis of transgenic Arabidopsis plants
We carefully examined the phenotypes of transgenic plants derived from transformation with the 16 gene constructs, and found that plants expressing the unigene com58276 displayed a drought resistant phenotype. Gene structure analysis showed that the ORF of this gene is 255 bp and the predicted protein is only 84 aa in length (Fig. 5). We then used BLAST to search the NCBI database with the com58276 sequence in order to determine which gene family it belongs to, but no significant similarity to any known sequence was found. This suggests that com58276 is a novel, species-specific gene involved in the drought tolerance pathway in C. korshinskii. After two weeks of drought treatment, the two transgenic lines grew normally while the control plants wilted. The transgenic plants turned green again after re-watering while the control plants did not (Fig. 6A). The survival rates for the two transgenic lines were 80.2% and 84.5%, while that of the control plants was 18.4% (Fig. 6B). qRT-PCR was then used to analyze the expression level of com58276 in WT and transgenic lines before and under drought stress. The results showed that the expression levels in the transgenic plants were higher than in WT at all time points (Fig. 6C). Further phenotypes were then analyzed. The water loss rate increased with treatment time in both the transgenic and WT plants, but it increased much more slowly in the two transgenic lines than in WT plants (Fig. 6D). The soluble sugar and malondialdehyde contents were higher in the transgenic lines than in WT (Fig. 6E-F).
These results strongly suggest that unigene com58276 is a potential drought resistance gene.
4. Discussion
As an important species in the genus Caragana Fabr., C. korshinskii possesses tolerance to both drought and salinity. Previous anatomical analysis revealed that C. korshinskii leaves have a structure that is typical of drought-tolerant plants, with features such as a thick epidermis, stomatal depression, and the palisade tissue/spongy tissue ratio. At present, few studies have focused on the genetics of abiotic stress in this species. The objective of our study was to discover drought-related genes in C. korshinskii on a genome-wide basis and to provide useful genetic and genomic resources for the genetic improvement of different crops. Phenotypic observations and RWC measurements in drought-treated C. korshinskii plants in the current study were consistent with the results of previous studies. For example, Fang et al. (2011) investigated drought resistance in three different species in the genus Caragana and in chickpea (Cicer arietinum), and found that the RWC could approach 100% during exposure to drought (Fang et al., 2011). In addition to checking the phenotype, we further analyzed the unigene annotation results of our previously assembled reference transcriptome (Long et al., 2015), and found that ~52% of the unigenes could not be matched to any database. This suggests that species-specific pathways may regulate the expression of drought resistance genes during drought stress.
In order to identify drought resistance genes in C. korshinskii on a large scale, two DGE libraries, one drought-treated and one control, were sequenced and mapped to our previously assembled reference transcriptome. After comparing the gene expression levels, we found 440 DEGs between the two libraries, which included 49 up-regulated and 391 down-regulated genes. It is well known that extensive changes in gene expression occur when plants are exposed to drought stress. For example, >16% of the genome exhibited altered expression levels in response to drought stress in cotton; among these DEGs, 5,344 genes were up-regulated and 4,630 genes were down-regulated after two days of drought treatment (Shanker et al., 2014). Generally, both up-regulation and down-regulation of gene expression occur in the transcriptome under drought stress conditions. In plants such as Arabidopsis and cotton, it was previously found that more genes are up-regulated than down-regulated under drought. In the current study, we identified 440 DEGs, which is relatively few compared with other plant species. One of the reasons could be the limited number of time points for sample collection in our study. We extracted the RNA samples after one week of drought treatment from seedlings that were only one month of age. In order to discover more drought tolerance genes, we could add more time points for sample collection at different developmental stages, such as adult plants. We could also collect samples from plants growing in their natural geographical habitats.
Previous studies have shown that three types of DEGs are related to the drought stress response: functional proteins, regulatory proteins, and proteins of unknown function. For example, these three classes of DEGs were identified in flax after analyzing the differentially expressed genes following drought treatment. The first group of genes included seven sub-groups: late embryogenesis proteins, carbohydrate metabolism, amino acid metabolism, lipid transfer proteins, photosynthesis chloroplastic proteins, heat shock proteins, and developmental proteins. For the second group, several transcription factors such as WRKY, APK1A, and APK1B were identified. In addition to these two classes of genes, another nine genes with unknown function were also identified (Dash et al., 2014). In the current study, these types of DEGs were also identified in C. korshinskii. For the first group of functional proteins, we found proteins involved in fatty acid metabolism and cysteine and methionine metabolism. In addition to functional proteins, several regulatory proteins were identified, including APK1A (comp40456), WRKY (comp57777) and several bHLH transcription factors (comp65116 and comp73048). We also found that 17.3% of the DEGs could not be matched to any reported proteins in the databases we searched. It is possible that these predicted proteins are involved in as-yet undefined functions associated with the drought stress response.
Based on the bioinformatics analysis of the DEGs, 14 unigenes were randomly selected for qRT-PCR analysis, and the experimental results showed that the gene expression patterns for all 14 unigenes were consistent with the RNA-seq results. This implies that the constructed DGE libraries were suitable for high-throughput sequencing and that the sequencing results are reliable. We then transformed 16 up-regulated unigenes into Arabidopsis thaliana, and one unigene, com58276, was preliminarily identified as being involved in drought tolerance. The BLAST result showed that com58276 is a gene of unknown function, which suggests that species-specific pathways may control drought tolerance in C. korshinskii. This gene resource could potentially be used to improve drought tolerance in a variety of crops.
5. Conclusions
The current work used RNA-seq technology as a tool to discover genes involved in drought tolerance in the desert shrub Caragana korshinskii Kom. We identified 440 DEGs, including 49 up-regulated and 391 down-regulated unigenes. Based on gene annotation analysis, constructs for 16 unigenes were built and transformed into the model plant Arabidopsis thaliana Col-0. After drought treatment, one unigene, com58276, was found to be involved in drought tolerance. This drought tolerance gene resource could potentially help to improve drought tolerance in a number of important crops.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (No. 31570330).
Figure legends
Fig. 1. Phenotype of drought-treated and control plants of Caragana korshinskii Kom.
Fig. 2. Gene ontology (GO) analysis of the 440 DEGs between the drought-treated and control plants. BP: biological process; CC: cellular component; MF: molecular function.
Fig. 3. KEGG pathway enrichment of the DEGs.
Fig. 4. qRT-PCR analysis of 14 randomly selected DEGs that were up-regulated in response to drought treatment. The line chart in the graph represents the read count values from drought-treated and control plants in the DGE sequencing data.
Fig. 5. Basic information on the cDNA and predicted protein sequences of the com58276 gene in C. korshinskii. A: cDNA sequence of com58276; the underlined characters show the CDS region. B: cDNA structure of com58276. C: Predicted protein sequence of com58276. D: Predicted protein structure with low-complexity domains; the red box marks the low-complexity region (SHARPRMARRSPSPH).
Fig. 6. Phenotypic variation in com58276 transgenic and WT Arabidopsis plants during drought treatment. WT: wild type; line 3 and line 7 are the two transgenic lines. A: Phenotypes of transgenic and WT plants before, during, and after drought treatment. B: Soluble sugar content comparison between transgenic and WT plants. C: Relative expression values of com58276 in WT and the two transgenic lines before, during, and after drought treatment. D: Water loss rate comparison between transgenic and WT plants. E: Soluble sugar content comparison between transgenic and WT plants. F: Malondialdehyde content comparison between transgenic and WT plants.
Tables
Table 1. Summary of the DGE sequencing results from the four transcriptome libraries constructed with RNA extracted from C. korshinskii seedlings

Sample name    Total reads    Total mapped         Unigene
CKM_C1*        12053819       11386475 (94.46%)    59491
CKM_C2         14601238       13691868 (93.77%)    63307
CKM_D1         13625328       12797136 (93.92%)    63050
CKM_D2         14738241       13847271 (93.95%)    66925

*: CKM_C1 and CKM_C2 are the two control replicates; CKM_D1 and CKM_D2 are the two replicates for drought-treated plants.
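The mapping percentages in Table 1 follow directly from the read counts (mapped reads divided by total reads). As a minimal sketch, not part of the original analysis pipeline, the short Python snippet below recomputes them from the values in the table; the variable and function names (`table1`, `mapping_rate`) are illustrative assumptions.

```python
# Total and mapped read counts copied from Table 1.
# The names used here are illustrative, not taken from the authors' pipeline.
table1 = {
    "CKM_C1": (12053819, 11386475),
    "CKM_C2": (14601238, 13691868),
    "CKM_D1": (13625328, 12797136),
    "CKM_D2": (14738241, 13847271),
}


def mapping_rate(total_reads, mapped_reads):
    """Return the percentage of reads that mapped, rounded to two decimals."""
    return round(100.0 * mapped_reads / total_reads, 2)


# Recompute the mapping rate for each library.
rates = {
    sample: mapping_rate(total, mapped)
    for sample, (total, mapped) in table1.items()
}

for sample, rate in rates.items():
    print(f"{sample}: {rate:.2f}%")
```

All four libraries map at close to 94%, so the control and drought-treated samples are comparable in mapping efficiency.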
Supplementary files
Supplementary file 1. Primer sequences used for qRT-PCR analysis of selected DEGs
Supplementary file 2. The 440 DEGs identified between drought-treated and control plants
Supplementary file 3. Unigene sequence information and predicted potential function
References
Anders, S. and Huber, W., 2010. Differential expression analysis for sequence count data. Genome Biol 11, R106.
Barrs, H.D. and Weatherley, P.E., 1962. A Re-Examination of the Relative Turgidity Technique for Estimating Water Deficits in Leaves. Australian Journal of Biological Sciences 15, 413-428.
Bedada, G., Westerbergh, A., Muller, T., Galkin, E., Bdolach, E., Moshelion, M., Fridman, E. and Schmid, K.J., 2014. Transcriptome sequencing of two wild barley (Hordeum spontaneum L.) ecotypes differentially adapted to drought stress reveals ecotype-specific transcripts. BMC Genomics 15, 995.
Clough, S.J. and Bent, A.F., 1998. Floral dip: a simplified method for Agrobacterium-mediated transformation of Arabidopsis thaliana. Plant J 16, 735-43.
Conesa, A., Gotz, S., Garcia-Gomez, J.M., Terol, J., Talon, M. and Robles, M., 2005. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics 21, 3674-6.
Dash, P.K., Cao, Y., Jailani, A.K., Gupta, P., Venglat, P., Xiang, D., Rai, R., Sharma, R., Thirunavukkarasu, N. and Abdin, M.Z., 2014. Genome-wide analysis of drought induced gene expression changes in flax (Linum usitatissimum). GM Crops 5, 106-119.
Fang, X., Li, F., Zhang, H. and Jiang, Z., 2011. The comparation of drought resistance between Caragana species (Caragana arborescens, C. korshinskii, C. microphylla) and two chickpea (Cicer arietinum L.) cultivars. Acta Ecologica Sinica 31, 2437-2443.
Farooq, M., Wahid, A., Kobayashi, N., Fujita, D. and Basra, S.M.A., 2009. Plant drought stress: effects, mechanisms and management. Agronomy for Sustainable Development 29, 185-212.
Golldack, D., Li, C., Mohan, H. and Probst, N., 2014. Tolerance to drought and salt stress in plants: Unraveling the signaling networks. Front Plant Sci 5, 151.
Golldack, D., Luking, I. and Yang, O., 2011. Plant tolerance to drought and salinity: stress regulating transcription factors and their functional significance in the cellular transcriptional network. Plant Cell Reports 30, 1383-1391.
Hegedus, Z., Zakrzewska, A., Agoston, V.C., Ordas, A., Racz, P., Mink, M., Spaink, H.P. and Meijer, A.H., 2009. Deep sequencing of the zebrafish transcriptome response to mycobacterium infection. Mol Immunol 46, 2918-30.
Hui, Y., Hu, X. and Li, F., 2012. Leaf photosynthesis, chlorophyll fluorescence, ion content and free amino acids in Caragana korshinskii Kom exposed to NaCl stress. Acta Physiologiae Plantarum 34, 2285-2295.
Jeong, J.S., Kim, Y.S., Baek, K.H., Jung, H., Ha, S.-H., Do Choi, Y., Kim, M., Reuzeau, C. and Kim, J.-K., 2010. Root-Specific Expression of OsNAC10 Improves Drought Tolerance and Grain Yield in Rice under Field Drought Conditions. Plant Physiology 153, 185-197.
Li, B. and Dewey, C.N., 2011. RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. BMC Bioinformatics 12, 323.
Li, C., Wang, Y., Huang, X., Li, J., Wang, H. and Li, J., 2013. De novo assembly and characterization of fruit transcriptome in Litchi chinensis Sonn and analysis of differentially regulated genes in fruit in response to shading. BMC Genomics 14, 552.
Long, Y., Wang, Y.Y., Wu, S.S., Wang, J., Tian, X.J. and Pei, X.W., 2015. De Novo Assembly of Transcriptome Sequencing in Caragana korshinskii Kom. and Characterization of EST-SSR Markers. PLoS One 10.
Quail, M.A., Kozarewa, I., Smith, F., Scally, A., Stephens, P.J., Durbin, R., Swerdlow, H. and Turner, D.J., 2008. A large genome center's improvements to the Illumina sequencing system. Nat Methods 5, 1005-10.
Ren, A.Q., Jin, Y.I., Gao, H.W., Jun, L.I. and Wang, X.M., 2013. Cloning and expression analysis of the promoter of Caragana korshinskii gene. Acta Prataculturae Sinica.
Shanker, A.K., Maheswari, M., Yadav, S.K., Desai, S., Bhanu, D., Attal, N.B. and Venkateswarlu, B., 2014. Drought stress responses in crops. Funct Integr Genomics 14, 11-22.
Su, Z., Ma, X., Guo, H., Sukiran, N.L., Guo, B., Assmann, S.M. and Ma, H., 2013. Flower development under drought stress: morphological and transcriptomic analyses reveal acute responses and long-term acclimation in Arabidopsis. Plant Cell 25, 3785-807.
Tian, D.Q., Pan, X.Y., Yu, Y.M., Wang, W.Y., Zhang, F., Ge, Y.Y., Shen, X.L., Shen, F.Q. and Liu, X.J., 2013. De novo characterization of the Anthurium transcriptome and analysis of its digital gene expression under cold stress. BMC Genomics 14, 1-14.
Trapnell, C., Williams, B.A., Pertea, G., Mortazavi, A., Kwan, G., van Baren, M.J., Salzberg, S.L., Wold, B.J. and Pachter, L., 2010. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol 28, 511-5.
Wang, X., Chen, X., Liu, Y., Gao, H., Wang, Z. and Sun, G., 2011. CkDREB gene in Caragana korshinskii is involved in the regulation of stress response to multiple abiotic stresses as an AP2/EREBP transcription factor. Molecular Biology Reports 38, 2801-2811.
Wang, Z. and Gao, H.W., 2008. Progress on Genetic Diversity of Genus Caragana Germplasm Resources. Journal of Plant Genetic Resources 9, 397-400.
Xu, D.L., Long, H., Liang, J.J., Zhang, J., Chen, X., Li, J.L., Pan, Z.F., Deng, G.B. and Yu, M.Q., 2012. De novo assembly and characterization of the root transcriptome of Aegilops variabilis during an interaction with the cereal cyst nematode. BMC Genomics 13, 133.
Yan-Jin, L.I., Zhao, Z., Sun, D.X. and Han, G., 2008. Hydrological Physiological Characteristics of Caragana korshinskii under Water stress. Journal of Northwest Forestry University 23, 1-4.
Yang, D.H., Song, L.Y., Hu, J., Yin, W.B., Li, Z.G., Chen, Y.H., Su, X.H., Wang, R.C. and Hu, Z.M., 2012. Enhanced tolerance to NaCl and LiCl stresses by over-expressing Caragana korshinskii sodium/proton exchanger 1 (CkNHX1). Biochemical & Biophysical Research Communications 417, 732-737.
Yang, Q., 2013. Cloning and Expression Analysis of CkLEA1 Gene in Caragana korshinskii Kom. China Biotechnology 33, 93-99.
Yang, Q. and Yin, 2013. Construction of a suppression subtractive hybridization library of Caragana korshinskii under drought stress and cloning of CkWRKY1 gene. Scientia Silvae Sinicae 49, 62-68.
Ye, J., Fang, L., Zheng, H., Zhang, Y., Chen, J., Zhang, Z., Wang, J., Li, S., Li, R., Bolund, L. and Wang, J., 2006. WEGO: a web tool for plotting GO annotations. Nucleic Acids Res 34, W293-7.
Abbreviation list:
C. korshinskii: Caragana korshinskii Kom.
DGE: digital gene expression
DEG: differentially expressed gene
KEGG: Kyoto Encyclopedia of Genes and Genomes
RWC: relative water content
FPKM: fragments per kilobase of transcript per million mapped reads
qRT-PCR: quantitative real-time reverse transcription PCR
Declaration of interests
☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Competing interests
The authors declare that they have no competing interests.
Highlights:
1. DGE analysis was combined with transgenic work to discover drought response genes in Caragana korshinskii Kom.
2. Hundreds of differentially expressed genes were screened.
3. One potential drought response gene was identified.