Selection and validation of reference genes for quantitative real-time PCR studies during Saccharomyces cerevisiae alcoholic fermentation in the presence of sulfite


Chiara Nadai, Stefano Campanaro, Alessio Giacomini, Viviana Corich

PII: S0168-1605(15)30099-4
DOI: 10.1016/j.ijfoodmicro.2015.08.012
Reference: FOOD 7017

To appear in: International Journal of Food Microbiology

Received date: 5 July 2015
Accepted date: 15 August 2015

Please cite this article as: Nadai, Chiara, Campanaro, Stefano, Giacomini, Alessio, Corich, Viviana, Selection and validation of reference genes for quantitative real-time PCR studies during Saccharomyces cerevisiae alcoholic fermentation in presence of sulfite, International Journal of Food Microbiology (2015), doi: 10.1016/j.ijfoodmicro.2015.08.012

This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


Chiara Nadai (a), Stefano Campanaro (b), Alessio Giacomini (a, c, *), Viviana Corich (a, c)

(a) Department of Agronomy Food Natural resources Animals and Environment (DAFNAE), University of Padova, Viale dell’Università 16, Legnaro, PD 35020, Italy

(b) Department of Biology, University of Padova, Via Ugo Bassi 58/b, Padova 35121, Italy

(c) Interdepartmental Centre for Research in Viticulture and Enology (CIRVE), University of Padova, Via XXVIII Aprile 14, Conegliano, TV 31015, Italy

* Corresponding author e-mail: [email protected]

Abstract

Sulfur dioxide is extensively used during industrial fermentations and, together with low pH, high sugar content and increasing ethanol concentration, contributes to the harsh conditions of winemaking. Therefore, the presence of sulfite has to be considered in yeast gene expression studies to properly understand yeast behavior in technological environments such as winemaking. A reliable expression pattern can be obtained only by using an appropriate reference gene set that is constitutively expressed regardless of perturbations linked to the experimental conditions.

In this work we tested 15 candidate reference genes for the analysis of gene expression during must fermentation in the presence of sulfite. New reference genes were selected from a genome-wide expression experiment, obtained by RNA sequencing of four Saccharomyces cerevisiae wine strains grown in enological conditions. Their performance was compared to that of the genes most commonly used in previous studies. The most popular software tools, based on different statistical approaches (geNorm, NormFinder and BestKeeper), were chosen to evaluate the expression stability of the candidate reference genes. Validation was obtained using other wine strains by comparing normalized gene expression data with transcriptome quantification both in the presence and absence of sulfite. Among the 15 reference genes tested, ALG9, FBA1, UBC6 and PFK1 appeared to be the most reliable, while ENO1, PMA1, DED1 and FAS2 were the worst. The most popular reference gene, ACT1, widely used for S. cerevisiae gene expression studies, showed a stability level markedly lower than those of our selected reference genes. Finally, as the expression of the new reference gene set remained constant over the entire fermentation process, irrespective of the perturbation due to sulfite addition, our results can also be applied when no sulfite is added to the must.

Keywords: gene expression, wine yeasts, RNA-seq, Saccharomyces cerevisiae, fermentation, SO2

1. Introduction

Real-time PCR is the standard method for quantifying the mRNA transcription levels of a limited number of target genes. The method is rapid, accurate, sensitive and reliable, and it is appropriate for studies of a selected number of target genes or pathway constituents in an experimental setup.

One of the major difficulties in obtaining reliable expression patterns is separating the experimentally induced, non-biological variation from the true biological variation. This can be done through normalization, by controlling as many of the confounding variables as possible (Vandesompele et al., 2009).

In the normalization strategy, internal controls are subjected to the same conditions as the target genes and their expression is measured by quantitative RT-PCR (qRT-PCR) (Spinsanti et al., 2006). The reference genes are expressed in the cells, and their mRNAs are present during sampling, nucleic acid extraction, storage, and any enzymatic processes such as DNase treatment and reverse transcription (Vandesompele et al., 2009). The success of this procedure is highly dependent on the choice of appropriate reference genes (Spinsanti et al., 2006).

Although many studies using qRT-PCR relied upon only one endogenous control (Radonic et al., 2004; Suzuki et al., 2000), the use of a single reference gene is now considered insufficient and normalization by multiple reference genes is required (Pfaffl et al., 2004; Vandesompele et al., 2002). A suitable reference gene should be constitutively expressed in the cells regardless of the experimental environment.

However, growing evidence suggests that there is no single universal reference gene whose expression is independent of experimental conditions. In recent years, calculation of a normalization factor based on the geometric average of multiple validated reference genes was suggested to eliminate possible outliers and differences in the abundance of different gene copies (Cankorur-Cetinkaya et al., 2012; Vandesompele et al., 2002). Several works (Lee et al., 2005; Selvey et al., 2001; Stahlberg et al., 2008; Teste et al., 2009) showed that some of the most commonly used reference genes cannot always be considered reliable controls because they behave differently in various experimental conditions. This emphasizes the importance of preliminary evaluation studies aimed at identifying the most stable reference genes for each single experiment (Spinsanti et al., 2006). The usual approach to normalization with multiple genes is to select the reference genes among candidates that have commonly been used for normalization, although these may not form a suitable reference gene set for the specific experimental condition. Therefore, a successful approach could be to select new reference genes from a genome-wide expression experiment obtained with microarrays or, more recently, RNA sequencing. Another important point is the determination of the most “stable” genes among the candidates under the selected conditions by means of specific software tools. In validation experiments the most popular software tools are geNorm (Vandesompele et al., 2002), NormFinder (Andersen et al., 2004) and BestKeeper (Pfaffl et al., 2004).

In Saccharomyces cerevisiae, studies have focused on the validation of reference genes under particular physiological conditions, such as glucose stimulation or dehydration (Stahlberg et al., 2008; Vaudano et al., 2009). Teste et al. (2009) validated a set of reference genes suitable for S. cerevisiae growing in a synthetic minimal medium with 2% (w/v) glucose or galactose at pH 5.0. Vaudano et al. (2011) identified, among the genes previously reported in the literature, a set of reference genes suitable for normalization of qRT-PCR expression data in S. cerevisiae during alcoholic fermentation in must. However, there is no established set of reference genes suitable for normalizing expression data of S. cerevisiae during fermentation in the presence of sulfite. Sulfite is normally added to grape must for its antimicrobial and antioxidant properties when wine is produced industrially, thus contributing to the harsh conditions of winemaking together with low pH (3.5), high sugar (150–200 g/l) and increasing ethanol concentration.

In this study, four wine strains whose genomes were completely sequenced (Treu et al., 2014a) were used to evaluate the suitability of a selected pool of genes as reference genes in gene expression studies during alcoholic fermentation of must in the presence of sulfite.

Some of these genes were chosen from those commonly used in previous studies. Moreover, taking advantage of the transcriptome data set obtained for each strain in fermentation conditions (Treu et al., 2014b), we were able to select other candidate genes.

Their suitability was evaluated in fermentation conditions in the presence of sulfite. Finally, validation was obtained using other wine strains by comparing the normalized gene expression data with transcriptome analysis of the same strains both in the presence and absence of sulfite.

2. Materials and methods

2.1 Yeast strains

Wine strains P283, P301, R008 and R103 were obtained from a yeast selection program that isolated approximately 600 strains from the vineyards of the Conegliano Valdobbiadene Prosecco Superiore DOCG and Raboso Piave DOC regions in North East Italy. These four yeasts are the wild strains that originated the derivative lines whose genomes were previously sequenced (Treu et al., 2014a) and deposited in the Leibniz Institute DSMZ-German Collection of Microorganisms and Cell Cultures (Braunschweig, Germany); the collection numbers assigned were DSM 28395, DSM 28396, DSM 28397 and DSM 28398, respectively. The industrial strains were EC1118 (Lallemand Inc.), AWRI796 (Maurivin, Australia) and VL3 (Laffort, Australia).

2.2 Fermentation in bioreactors and cell sampling

Fermentations were performed in synthetic wine must MS300 (Bely et al., 1990) in 1-l Multifors bioreactors (Infors HT, Basel, Switzerland). These instruments are equipped with sensors to monitor temperature and pH, and with a red-y flow meter (mod. GSM-A95A-BN00, Infors HT; range 1–20 ml/min) to determine CO2 outflow. The temperature was maintained at 25°C and the initial pH was 3.2. CO2 outflow was recorded by the flow meter every 5 minutes to determine the rate of CO2 production. Strict anaerobiosis was not imposed, but fermentation conditions were largely anaerobic due to the design of the bioreactors and the effect of CO2 production. Three independent biological replicates were carried out for each strain. For each replicate, at each sampling, 50 ml of cell culture were collected for cDNA retrotranscription or RNA sequencing analysis, briefly centrifuged to remove the growth medium and immediately frozen at −80°C.

For primer validation and candidate reference gene analysis, strains P283, P301, R008 and R103 were grown in the above conditions, supplementing the synthetic must before inoculation with either 0 or 40 mg/l of SO2. Cells were collected at four different times: 30 minutes and 2 hours after cell inoculation (both corresponding to lag phase), at the beginning of fermentation (early log phase) and at 45 g/l of CO2 produced (early stationary phase). A total of 96 samples were collected.

For reference gene set validation, strains R008, EC1118, VL3 and AWRI796 were grown in the above conditions, supplementing the synthetic must before inoculation with either 0 or 25 mg/l of SO2. Cells were collected at 6 g/l of CO2 produced. A total of 24 samples were collected.

2.3 RNA extraction and reverse transcription

Total RNA was extracted using the TRIzol® Plus RNA Purification Kit (Ambion). Concentration, purity and integrity of the RNA samples were determined by spectrophotometric analysis, considering the absorbance ratios at 260/280 nm and at 230/260 nm. The quality and integrity of the RNAs were confirmed by electrophoresis on 1.5% agarose gels under denaturing conditions (2% formaldehyde, v/v, 20 mM MOPS, 5 mM sodium acetate, 1 mM EDTA, pH 7.0). RNA (1 µg) was treated with DNase I (Fermentas) according to the manufacturer’s instructions. cDNA was synthesized using RevertAid Reverse Transcriptase (Fermentas) according to the manufacturer’s instructions, using both polyT(16) primers (MWG-Biotech; 0.5 µg/µl) and random hexamers (Promega; 0.5 µg/µl).

For RNA sequencing, total RNA extraction, quantification, rRNA removal and library preparation were performed as described by Treu et al. (2014b). For each strain, equal RNA amounts of the three independent replicates at each collection time were pooled together.

2.4 Primer design

PCR primers for real-time assays were designed using Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primerblast/). This software uses Primer3 to generate candidate primer pairs for a given template sequence and then submits them to a BLAST search against a user-selected database. The yeast database was used to check primer cross-reactivity with sequences of other yeast species. Special attention was given to primer length (15–25 bp), annealing temperature (58°C–62°C), base composition, 3'-end stability and amplicon size (80–200 bp). All primers were synthesized by MWG-Biotech (HPSF purified) and are listed in Table 1. The ACT1, ALG9, PDA1, TAF10, TFC1 and UBC6 primers were designed by Teste et al. (2009).

2.5 Real-time PCR

Real-time PCR was carried out on a CFX96 Cycler – RealTime PCR Detection System (Bio-Rad Laboratories, Inc., Hercules, CA, USA), in white-walled 96-well PCR plates. A ready-to-use master mix containing a fast proof-reading polymerase, dNTPs, stabilizers, MgCl2 and EvaGreen dye was used according to the manufacturer’s instructions (Bio-Rad). Reactions were prepared in a total volume of 15 μl containing 400 nM of each primer (MWG), 1X SsoFast EvaGreen Supermix (supplied as a 2X stock; Bio-Rad) and 5 μl of cDNA.

The cycling conditions were set as follows: initial template denaturation at 98°C for 30 seconds, followed by 40 cycles of denaturation at 98°C for 2 seconds and combined primer annealing/elongation at 60°C for 10 seconds. The fluorescence of each sample, given by the incorporation of EvaGreen into dsDNA, was measured at the end of each cycle and analyzed with CFX-Manager Software v2.0 (Bio-Rad Laboratories, Inc.).

To calculate the efficiency (E) of real-time PCR (RT-PCR) and the correlation coefficient (R2) of each primer pair, PCR amplification was run on serial 1:5 dilutions of template cDNA obtained by pooling the 32 samples, on the CFX96 cycler. Melting curves of PCR amplicons were obtained using temperatures ranging from 65°C to 95°C. Data acquisition was performed for every 0.5°C temperature increase with a 1-second step. Efficiency was calculated from the slope of the standard curve using the formulas $E = 10^{-1/\mathrm{slope}}$ and $\%E = (E - 1) \times 100\%$.

To obtain the RT-PCR data set to be analyzed by the three software tools, one 96-well plate was used for each reference gene. Each cDNA sample was loaded twice to obtain a repetition of the analysis, and no-template controls for each primer pair were included in all plates. Melting curves of PCR amplicons were obtained with temperatures ranging from 65°C to 95°C. Data acquisition was performed for every 0.2°C temperature increase, with a 2-second step.

Baseline and threshold values were automatically determined for all plates using the CFX-Manager Software v2.0. As suggested by Spinsanti et al. (2006), in order to ensure comparability among data obtained from different experimental plates, the threshold was subsequently set manually to the arithmetic mean of the automatically determined thresholds noted previously, and all data were reanalyzed. Data were analyzed using geNorm (Vandesompele et al., 2002) as implemented in qBasePlus version 2.3 (Biogazelle, Ghent, Belgium), a program for qPCR data management and analysis (Hellemans et al., 2007), and the NormFinder version 0.953 (Andersen et al., 2004) and BestKeeper version 1 (Pfaffl et al., 2004) VBA applets. The software SPELL (Serial Pattern of Expression Levels Locator) – S. cerevisiae (version 2.0.3r71) (Hibbs et al., 2007) was used to verify co-regulation of genes.
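The efficiency calculation from the standard-curve slope described in this section can be illustrated with a minimal Python sketch. The dilution series and Cq values below are hypothetical (an idealized assay that doubles at every cycle), and the function names are illustrative only; the CFX-Manager software performs its own fit, which may differ in detail.

```python
import math

def standard_curve(log10_amount, cq):
    """Least-squares fit of Cq against log10(template amount).

    Returns the slope and the squared correlation coefficient (R^2).
    """
    n = len(cq)
    mean_x = sum(log10_amount) / n
    mean_y = sum(cq) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amount)
    syy = sum((y - mean_y) ** 2 for y in cq)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amount, cq))
    return sxy / sxx, sxy ** 2 / (sxx * syy)

def efficiency(slope):
    """E = 10^(-1/slope) and %E = (E - 1) x 100."""
    e = 10 ** (-1.0 / slope)
    return e, (e - 1.0) * 100.0

# Hypothetical serial 1:5 dilutions of a pooled cDNA template.
log10_amount = [math.log10(5.0 ** -i) for i in range(5)]
# Idealized Cq values: each 5-fold dilution delays detection by log2(5) cycles.
cq_values = [18.0 + i * math.log2(5) for i in range(5)]

slope, r2 = standard_curve(log10_amount, cq_values)
E, percent_E = efficiency(slope)
print(f"slope={slope:.3f}  R2={r2:.4f}  E={E:.3f}  %E={percent_E:.1f}")
```

With real data, a slope near −3.32 corresponds to E close to 2, that is, an efficiency close to 100%.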

For each target gene, each sample was analyzed in triplicate and no-template controls for each primer pair were included in all plates. Gene expression analysis was performed using the CFX-Manager Software v2.0. Correlation analysis was performed by the Pearson test.

3. Results and Discussion

Sulfur dioxide is a preservative indispensable in winemaking for its antiseptic, antioxidant and antioxidasic properties. Therefore, the presence of sulfite has to be considered when yeast gene expression is studied to understand yeast behavior in technological environments such as winemaking. Moreover, sulfite is an important yeast metabolite, produced as an intermediate in the sulfate assimilation pathway for sulfur amino acid biosynthesis (Divol et al., 2012; Thomas and Surdin-Kerjan, 1997). Finally, sulfite addition modulates yeast gene expression to induce an appropriate cellular stress response. Under these conditions, gene expression quantification using real-time PCR requires a suitable set of reference genes to allow normalization of the raw data.

3.1 Strategy to select reference gene candidates

We selected fifteen reference gene candidates (Table 1). Eight of the candidate genes are commonly used as internal controls in yeast gene expression studies (Cankorur-Cetinkaya et al., 2012; Vaudano et al., 2009; Teste et al., 2009). Moreover, we took advantage of an already available transcriptome data set (Sardu et al., 2014; Treu et al., 2014b) to identify new potential reference genes. In this previous work, four wine strains (P283, P301, R008 and R103) were fermented in synthetic medium without sulfite addition, and RNA sequencing was performed on cells collected at mid-log phase and early stationary phase. Gene expression in these fermentation steps is strongly variable, therefore comparing their transcriptomes makes it possible to identify the most stable genes. For each gene, expression levels were determined as RPKM (reads per kilobase per million). On the whole data set we evaluated the coefficient of variation of RPKM, which is the ratio of the standard deviation to the mean, calculated on the values of the four strains at both conditions, and selected as new potential reference genes those with the smallest coefficient.
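The selection criterion described above (smallest coefficient of variation across the four strains at both growth phases) can be sketched as follows. The gene names and expression values are invented for illustration, and the use of the sample standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Ratio of the standard deviation to the mean (sample SD assumed)."""
    return stdev(values) / mean(values)

def rank_by_cv(expression_by_gene):
    """Return (gene, CV) pairs sorted from most to least stable."""
    cvs = {gene: coefficient_of_variation(values)
           for gene, values in expression_by_gene.items()}
    return sorted(cvs.items(), key=lambda item: item[1])

# Hypothetical reads-per-kilobase-per-million values:
# 4 strains x 2 growth phases = 8 values per gene.
expression_by_gene = {
    "GENE_A": [100, 102, 98, 101, 99, 100, 103, 97],   # nearly constant
    "GENE_B": [50, 150, 80, 120, 60, 140, 90, 110],    # strain dependent
    "GENE_C": [10, 10, 10, 10, 40, 40, 40, 40],        # phase dependent
}

ranking = rank_by_cv(expression_by_gene)
for gene, cv in ranking:
    print(f"{gene}: CV = {cv:.3f}")
```

Genes at the top of such a ranking would be the candidates to carry forward to qPCR validation; the cut-off for how many to keep is a separate choice that the sketch does not make.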


3.2 Fermentation trials

To test the selected genes, fermentations in synthetic wine must (Bely et al., 1990) were performed in bioreactors for each strain (P283, P301, R008 and R103). This medium is similar to wine must and allows a comparison of strains in conditions mimicking real wine production. The fermentation performance of the four strains was determined both in the presence and absence of sulfite. The strains showed different fermentation performance and resistance to sulfite (Table 2). This strain-specific behavior involves different gene expression profiles, which have to be considered when reference genes are tested. In particular, in strain P283 the start of fermentation after inoculation was significantly delayed when SO2 was added, suggesting a longer lag phase than in the control condition and a low resistance to sulfite. Strain R103 showed no delay in fermentation start in the presence of sulfite, although its fermentation time was longer than in the control. P301 and R008 showed the best performance (high fermentation rate and short fermentation time) in both conditions; in the presence of sulfite, P301 even reached a higher maximum fermentation rate (g CO2/l/h) than in the control condition (Table 2). Both therefore appear to be the most resistant strains. For each fermentation, four samplings were done: two during the lag phase and two during the growth phase.


3.3 Primers validation

For each reference gene a primer pair was designed (Table 1) and validated. Primers were designed with the Primer-BLAST program, which aligns the primer sequences against a selected database (chromosomes from all organisms). No cross-reactivity was found with sequences of organisms commonly present in the wine environment. For each primer pair, the efficiency (E) of RT-PCR and the correlation coefficient (R2) were determined by amplifying serial 1:5 dilutions of template cDNA obtained by pooling the cDNA from all the samples. Efficiency was considered adequate when ranging from 90% to 110%, and R2 was considered acceptable when greater than 0.98-0.99. The correlation coefficient and PCR efficiency of each standard curve are reported in Table 1. For the commonly used FBA1 and PMA1 genes, the amplification efficiency of the primers reported in the literature (Cankorur-Cetinkaya et al., 2012; Vaudano et al., 2009) was not adequate (87% and 82%, respectively). Therefore the gene sequences were further analyzed and new primers were designed. With the new primer pairs, the amplification efficiency rose to 102.4% for FBA1 and to 104.9% for PMA1.


3.4 Data analysis

We used RT-PCR quantification to test the fifteen potential reference genes during fermentation of the four strains in synthetic wine must, both in the presence and absence of sulfite. Data obtained for each of the cDNA samples and each of the 15 potential reference genes were analyzed using three different applets: geNorm, NormFinder and BestKeeper.

geNorm was the first method developed to select the most stably expressed reference genes. The underlying principle is that the expression ratio of two proper reference genes should remain constant across samples. By means of a normalization factor, this software gives for each experiment the minimal number of reference genes required for accurate normalization (Vandesompele et al., 2009).

NormFinder is an Excel applet based on an algorithm for identifying the optimal normalization genes among a set of candidates. It ranks the candidate genes according to their mRNA expression stability values, calculated by combining the intra-group and inter-group expression variation in a given sample set and a given experimental design (Andersen et al., 2004; Spinsanti et al., 2006). Unlike geNorm, NormFinder examines the stability of each candidate gene independently and not in relation to the other genes (Andersen et al., 2004). This feature is important if little is known about gene co-regulation.

BestKeeper is another Excel-based tool that determines the optimal reference genes by a pair-wise correlation analysis (Pearson correlation coefficient) of all pairs of candidate genes (up to 10 reference genes) and by calculating the geometric mean of the best suited ones from the raw Cq values of each gene. Unlike geNorm, it uses Cq values (instead of relative quantities) as input and employs a different measure of expression stability.

In a preliminary analysis with geNorm and NormFinder, TAF10 was ranked eleventh by NormFinder and second by geNorm (data not shown). This discrepancy is probably due to co-regulation with the gene YRB1 (first in the geNorm ranking and sixth in the NormFinder ranking); in fact, both are repressed by Sfp1p. We verified this correlation with SPELL (Serial Pattern of Expression Levels Locator) (Hibbs et al., 2007). This query-driven search engine can be used to analyze large gene expression microarray databases and to identify the genes whose expression profiles are most similar to that of the query gene. SPELL assigns to TAF10 an Adjusted Correlation Score (a measure of weighted correlation for the gene with the query set across all datasets) of 2.4 in relation to YRB1, confirming that these two genes are co-regulated. Including two co-regulated genes in the list of candidates may lead to false positive results because of the similarity of their expression profiles. To avoid this, TAF10 was excluded and a new analysis was performed: the fourteen remaining candidate reference genes were reanalyzed with the three selected programs.

The results from geNorm and NormFinder can be easily compared because both use raw data (relative quantities) as input.

geNorm classifies genes according to the control gene stability measure (M value), which represents the average pair-wise variation of a gene with all other control genes (Vandesompele et al., 2002). The selected reference genes were ranked according to their M value, from the most stable (lowest M value) to the least stable (highest M value): ALG9, YRB1, FBA1, UBC6, LYS14, PFK1, TFC1, PDA1, ITR1, ACT1, PMA1, DED1, FAS2, ENO1 (Table 3; Figure 1b). All the genes reached a high expression stability, with M values below the default limit of 1.5 (Vandesompele et al., 2002). Interestingly, the expression of ACT1, widely used as a reference gene in many studies, appears to be less stable than that of other genes in the conditions tested. Additionally, the assessment of the normalization factor allowed the identification of the optimal number of control genes. The geNorm software suggested that an accurate normalization factor for qRT-PCR data can be calculated by using at least the four most stably expressed genes. As shown in Figure 1a, the addition of further reference genes does not significantly affect the reliability of the normalization factor, yielding a V4/5 value (pair-wise variation between two sequential normalization factors) of 0.130, the first value lower than the default cut-off of 0.15 (Vandesompele et al., 2002). According to the geNorm stability ranking of the reference genes studied, the four genes to include in the calculation of a reliable normalization factor were ALG9, YRB1, FBA1 and UBC6 (Table 3; Figure 1b).
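To make the two geNorm quantities used here more concrete, the sketch below computes an M value for each gene (the mean, over all other genes, of the standard deviation of the log2 expression ratios across samples) and the pair-wise variation V between normalization factors built from n and n+1 genes, following the definitions in Vandesompele et al. (2002). It is a simplified illustration, not the qBasePlus implementation: it omits the stepwise exclusion of the least stable gene, it assumes the sample standard deviation, and the gene names and relative quantities are invented.

```python
import math
from statistics import mean, stdev

def m_values(rq):
    """M value per gene: average SD of log2 ratios against every other gene."""
    result = {}
    for gene_j in rq:
        sds = []
        for gene_k in rq:
            if gene_k == gene_j:
                continue
            log_ratios = [math.log2(a / b) for a, b in zip(rq[gene_j], rq[gene_k])]
            sds.append(stdev(log_ratios))
        result[gene_j] = mean(sds)
    return result

def normalization_factor(rq, genes):
    """Per-sample geometric mean of the relative quantities of `genes`."""
    n_samples = len(rq[genes[0]])
    return [math.exp(mean([math.log(rq[g][i]) for g in genes]))
            for i in range(n_samples)]

def pairwise_variation(rq, ranked_genes, n):
    """V(n/n+1): SD of log2(NF_n / NF_n+1) across samples."""
    nf_n = normalization_factor(rq, ranked_genes[:n])
    nf_next = normalization_factor(rq, ranked_genes[:n + 1])
    return stdev([math.log2(a / b) for a, b in zip(nf_n, nf_next)])

# Hypothetical relative quantities for four genes in four samples.
rq = {
    "REF_A": [1.0, 2.0, 4.0, 8.0],
    "REF_B": [2.0, 4.0, 8.0, 16.0],   # constant ratio to REF_A
    "REF_C": [1.0, 2.0, 4.0, 9.0],
    "REF_D": [1.0, 4.0, 2.0, 16.0],   # least consistent ratios
}

m = m_values(rq)
ranked = sorted(m, key=m.get)          # most stable first
v_2_3 = pairwise_variation(rq, ranked, 2)
v_3_4 = pairwise_variation(rq, ranked, 3)
print(ranked, round(v_2_3, 3), round(v_3_4, 3))
```

In this toy data set, adding the unstable fourth gene changes the normalization factor much more than adding the third, which is the kind of signal the V cut-off of 0.15 is meant to capture.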

NormFinder ranks the set of candidate genes according to their expression stability value in a given sample set and a given experimental design (Andersen et al., 2004). The results of the NormFinder analysis are shown in Table 4. In this ranking, the most stable gene is TFC1. Both programs identified the same three reference genes (Tables 3 and 4) as the most stably expressed: FBA1, ALG9 and UBC6. The fourth gene can be chosen among PFK1 (fifth in NormFinder and sixth in geNorm), YRB1 (second in geNorm and eighth in NormFinder) or TFC1 (first in NormFinder and seventh in geNorm).

BestKeeper selects genes considering the coefficient of variation and the standard deviation of the quantification cycle (Cq). BestKeeper, using Cq values as input data, gave a slightly different output from geNorm and NormFinder. With this software it is possible to analyze no more than ten reference genes together, so the ten top-ranking genes in geNorm and NormFinder were selected for the analysis. For this reason, the BestKeeper analysis should not be considered exhaustive (Table 5). As recommended by the authors (Pfaffl et al., 2004), samples showing higher variations in the expression stability of the reference genes were removed from the BestKeeper index calculation.

Table 5 shows the Cq values determined for each gene. The most highly expressed gene was FBA1, with a mean Cq value of 18.00; the least expressed was TFC1, with a mean Cq value of 28.24. Considering the CV values, the most stable gene was UBC6. The standard deviation (SD) of the Cq of ACT1, PDA1 and YRB1 was greater than 1, the upper limit above which a gene must be considered inconsistent (Pfaffl et al., 2004). For this reason, these three genes were removed from the calculation of the BestKeeper index, which finally exhibited an SD of 0.85. The correlation between each candidate reference gene and the BestKeeper index was calculated, describing the relation between the index and the contributing reference genes by the correlation coefficient (r) and the p-value (Table 5). The best correlation between the reference genes and the BestKeeper index, with the highest significance level (p<0.001), was obtained for FBA1, followed by ALG9, LYS14 and PFK1.
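The BestKeeper-style workflow described above (descriptive statistics on raw Cq values, exclusion of genes with SD greater than 1, an index built as the per-sample geometric mean of the remaining genes, and the Pearson correlation of each gene with that index) can be sketched as follows. This is only an approximation under stated assumptions: the sample standard deviation is used, whereas the BestKeeper applet may define its variation measure differently, p-values are not computed, and the gene names and Cq values are invented.

```python
import math
from statistics import mean, stdev

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cq_summary(cq):
    """Mean Cq, SD of Cq and CV (%) for one gene."""
    m, sd = mean(cq), stdev(cq)
    return {"mean": m, "sd": sd, "cv_percent": 100.0 * sd / m}

def index_per_sample(cq_by_gene, genes):
    """Geometric mean of the Cq values of `genes` in each sample."""
    n_samples = len(cq_by_gene[genes[0]])
    return [math.exp(mean([math.log(cq_by_gene[g][i]) for g in genes]))
            for i in range(n_samples)]

# Hypothetical raw Cq values for three candidate genes in four samples.
cq_by_gene = {
    "GENE_X": [20.0, 20.5, 21.0, 20.2],
    "GENE_Y": [25.0, 25.6, 26.1, 25.1],
    "GENE_Z": [22.0, 25.0, 19.0, 24.0],   # erratic, SD > 1
}

summaries = {g: cq_summary(v) for g, v in cq_by_gene.items()}
# Genes whose Cq SD exceeds 1 are treated as inconsistent and dropped.
kept = [g for g, s in summaries.items() if s["sd"] <= 1.0]
index = index_per_sample(cq_by_gene, kept)
r_with_index = {g: pearson_r(cq_by_gene[g], index) for g in kept}
print(kept, {g: round(r, 3) for g, r in r_with_index.items()})
```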

Comparing the rankings obtained with the three applets, the best reference genes were ALG9, FBA1, UBC6 and PFK1 (the latter ranked fifth in NormFinder and sixth in geNorm).

3.5 Validation of reference gene set

We set up a validation test comparing gene expression results obtained with two independent methods: quantitative RT-PCR and RNA sequencing. For this purpose we considered strain R008 and three commercial strains (EC1118, VL3 and AWRI796) whose transcriptome profiles were determined by RNA sequencing (data submitted to the GEO database with accession numbers GSE57282, GSM1378406–GSM1378411). Each strain was grown in synthetic must with the addition of either 0 or 25 mg/l SO2. Yeast cells were collected at the same time point used for the fermentation curves, i.e. at mid-exponential growth phase, when the CO2 produced was 6 g/l. Three biological replicates were performed for each strain, and the replicate samples were pooled before RNA sequencing. The same RNA samples used for RNA sequencing quantification were retro-transcribed to quantify gene expression by RT-PCR, using the reference gene set (ALG9, FBA1, UBC6 and PFK1) for the normalization of six target genes. For the validation we chose three genes involved in sulfate metabolism, SSU1, MET10 and MET17, which modify their expression pattern in the presence of sulfite and according to the strain-specific SO2 resistance level (Divol et al., 2012). Moreover, three other genes, SOD1, OLE1 and EEB1, involved in the ethanol stress response linked to fermentation kinetics and in flavor production, were included in the analysis (Mason and Dufour, 2000; Rosenfeld et al., 2003) (Table 6). All data were normalized to the reference genes simultaneously, using the CFX Manager software. Each normalized gene expression value obtained with RT-PCR was compared with that obtained by RNA sequencing, expressed as RPKM.
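Normalizing a target gene to several reference genes at once is commonly done by dividing the target's efficiency-corrected relative quantity by the geometric mean of the reference genes' relative quantities. The sketch below illustrates that general scheme; it is an assumption that the CFX Manager software follows the same model, and the gene names, efficiencies and Cq values are hypothetical.

```python
import math

def relative_quantity(cq, efficiency, cq_calibrator):
    """Efficiency-corrected quantity relative to a calibrator sample."""
    return efficiency ** (cq_calibrator - cq)

def normalized_expression(cq, eff, target, refs, sample, calibrator):
    """Target relative quantity divided by the geometric mean of the references."""
    rq_target = relative_quantity(cq[target][sample], eff[target],
                                  cq[target][calibrator])
    rq_refs = [relative_quantity(cq[r][sample], eff[r], cq[r][calibrator])
               for r in refs]
    norm_factor = math.prod(rq_refs) ** (1.0 / len(rq_refs))
    return rq_target / norm_factor

# Hypothetical Cq values per gene and sample; E = 2.0 means 100% efficiency.
cq = {
    "TARGET": {"control": 24.0, "sulfite": 22.0},
    "REF_1":  {"control": 20.0, "sulfite": 21.0},
    "REF_2":  {"control": 22.0, "sulfite": 23.0},
}
eff = {"TARGET": 2.0, "REF_1": 2.0, "REF_2": 2.0}
refs = ["REF_1", "REF_2"]

fold_change = normalized_expression(cq, eff, "TARGET", refs, "sulfite", "control")
baseline = normalized_expression(cq, eff, "TARGET", refs, "control", "control")
print(fold_change, baseline)
```

In this toy case both reference genes appear one cycle later in the sulfite sample, so the apparent 4-fold increase of the target is corrected to an 8-fold increase once the lower input is accounted for.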

We performed a correlation analysis by Pearson test on the RT-PCR data and the RPKM values (Table 6). The normalized expression values of the six target genes and the respective RPKM values from the transcriptome data set were positively correlated with a high significance level, confirming the robustness of the reference gene set.

In conclusion, this work selected and validated a reference gene set to be used for S. cerevisiae gene expression studies by means of quantitative RT-PCR during alcoholic fermentation in the presence of sulfite. Moreover, as the expression of this group of genes remained constant throughout the fermentation process despite the perturbation linked to sulfite addition, the set can also be proposed for use when no sulfite is added to the must.

The three software tools tested (geNorm, NormFinder and BestKeeper), based on different algorithms and analytical procedures, selected ALG9, FBA1, UBC6 and PFK1 as the most reliable reference genes of this set. On the other hand, ENO1, PMA1, DED1 and FAS2, commonly used as reference genes, showed unstable expression patterns and were classified as the least reliable control genes of this group. Finally, the most popular reference gene, ACT1, widely used in S. cerevisiae gene expression studies, ranked just above the latter four genes, indicating a stability level notably lower than that of the proposed reference genes.

Acknowledgements


This study was funded in part by POR “Competitività regionale e occupazione” - parte FESR 2007/2013 Azione 1.1.1. Progetto “RISIB” SMUPR n. 4145 “Potenziamento della rete di infrastrutture a supporto dell’innovazione biotecnologica” and by MIUR (ex-60% grant).

We thank Dr. Marco Giorgio Bianchi (Bio-Rad System Specialists – Gene eXpr. Division) for excellent technical assistance.

References

Andersen, C.L., Jensen, J.L., Orntoft, T.F., 2004. Normalization of real-time quantitative reverse transcription-PCR data: a model-based variance estimation approach to identify genes suited for normalization, applied to bladder and colon cancer data sets. Cancer Research 64, 5245-5250.

Bely, L., Sablayrolles, J., Barre, P., 1990. Description of alcoholic fermentation kinetics: its variability and significance. American Journal of Enology and Viticulture 40, 319-324.


Cankorur-Cetinkaya, A., Dereli, E., Eraslan, S., Karabekmez, E., Dikicioglu, D., Kirdar, B., 2012. A novel strategy for selection and validation of reference genes in dynamic multidimensional experimental design in yeast. PLoS ONE 7, e38351. doi:10.1371/journal.pone.0038351.

Divol, B., du Toit, M., Duckitt, E., 2012. Surviving in the presence of sulphur dioxide: strategies developed by wine yeasts. Applied Microbiology and Biotechnology 95, 601–613.

Hellemans, J., Mortier, G., De Paepe, A., Speleman, F., Vandesompele, J., 2007. qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biology 8, R19.

Hibbs, M.A., Hess, D.C., Myers, C.L., Huttenhower, C., Li, K., Troyanskaya, O.G., 2007. Exploring the functional landscape of gene expression: directed search of large microarray compendia. Bioinformatics 23, 2692-2699.

Lee, J.H., Fitzgerald, J.B., Dimicco, M.A., Grodzinsky, A.J., 2005. Mechanical injury of cartilage explants causes specific time-dependent changes in chondrocyte gene expression. Arthritis & Rheumatology 52, 2386-2395.

Mason, A.B., Dufour, J.P., 2000. Alcohol acetyltransferases and the significance of ester synthesis in yeast. Yeast 16, 1287–1298.

Pfaffl, M.W., Tichopad, A., Prgomet, C., Neuvians, T.P., 2004. Determination of stable housekeeping genes, differentially regulated target genes and sample integrity: BestKeeper--Excel-based tool using pair-wise correlations. Biotechnology Letters 26, 509-515.

Radonic, A., Thulke, S., Mackay, I.M., Landt, O., Siegert, W., Nitsche, A., 2004. Guideline to reference gene selection for quantitative realtime PCR. Biochemical and Biophysical Research Communications 313, 856-862.

Rosenfeld, E., Beauvoit, B., Blondin, B., Salmon, J.M., 2003. Oxygen consumption by anaerobic Saccharomyces cerevisiae under enological conditions: effect on fermentation kinetics. Applied and Environmental Microbiology 69, 113-121.

Sardu, A., Treu, L., Campanaro, S., 2014. Transcriptome structure variability in Saccharomyces cerevisiae strains determined with a newly developed assembly software. BMC Genomics 15, 1045.

Selvey, S., Thompson, E.W., Matthaei, K., Lea, R.A., Irving, M.G., Griffiths, L.R., 2001. Beta-actin--an unsuitable internal control for RT-PCR. Molecular and Cellular Probes 15, 307-311.

Spinsanti, G., Panti, C., Lazzeri, E., Marsili, L., Casini, S., Frati, F., Fossi, C.M., 2006. Selection of reference genes for quantitative RT-PCR studies in striped dolphin (Stenella coeruleoalba) skin biopsies. BMC Molecular Biology 7, 32-42.

Stahlberg, A., Elbing, K., Andrade-Garda, J.M., Sjogreen, B., Forootan, A., Kubista, M., 2008. Multiway real-time PCR gene expression profiling in yeast Saccharomyces cerevisiae reveals altered transcriptional response of ADH genes to glucose stimuli. BMC Genomics 9, 170.


Suzuki, T., Higgins, P.J., Crawford, D.R., 2000. Control selection for RNA quantitation. BioTechniques 29, 332-337.

Teste, M.A., Duquenne, M., Francois, J.M., Parrou, J.L., 2009. Validation of reference genes for quantitative expression analysis by real-time RT-PCR in Saccharomyces cerevisiae. BMC Molecular Biology 10, 99-113.

Thomas, D., Surdin-Kerjan, Y., 1997. Metabolism of sulfur amino acids in Saccharomyces cerevisiae. Microbiology and Molecular Biology Reviews 61, 503–532.

Treu, L., Toniolo, C., Nadai, C., Sardu, A., Giacomini, A., Corich, V., Campanaro, S., 2014 a. The impact of genomic variability on gene expression in environmental Saccharomyces cerevisiae strains. Environmental Microbiology 16, 1378–1397.

Treu, L., Campanaro, S., Nadai, C., Toniolo, C., Nardi, T., Giacomini, A., Valle, G., Blondin, B., Corich, V., 2014 b. Oxidative stress response and nitrogen utilization are strongly variable in Saccharomyces cerevisiae wine strains with different fermentation performances. Applied Microbiology and Biotechnology 98, 4119-4135.

Vandesompele, J., De Preter, K., Pattyn, F., Poppe, B., Van Roy, N., De Paepe, A., Speleman, F., 2002. Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes. Genome Biology 3, research0034.1–research0034.11.

Vandesompele, J., Kubista, M., Pfaffl, M., 2009. Reference gene validation software for improved normalization. In Logan, J., Edwards, K., Saunders, N. (Eds), Real-Time PCR: Current Technology and Applications. Caister Academic Press, pp. 47-64.

Vaudano, E., Costantini, A., Cersosimo, M., Del Prete, V., Garcia-Moruno, E., 2009. Application of real-time RT-PCR to study gene expression in active dry yeast (ADY) during the rehydration phase. International Journal of Food Microbiology 129, 30-36.

Vaudano, E., Noti, O., Costantini, A., Garcia-Moruno, E., 2011. Identification of reference genes suitable for normalization of RT-qPCR expression data in Saccharomyces cerevisiae during alcoholic fermentation. Biotechnology Letters 33, 1593-1599.


structural constituent of


(SGD curated)

ALG9

mannosyltransferase activity

R2

285 bp

96.8

0.999

156 bp

95.7

0.996

285 bp

100.2

0.999

141 bp

105.5

0.994

223 bp

101.8

0.982

272 bp

99.0

0.985

125 bp

102.4

0.998

139 bp

104.9

0.999

R: 5'-AAGAAGCTGCCACCGCCACG-3'

169 bp

102.4

0.999

F: 5'-TGCACGCTGTTAAGAACGTCAACGA-3'

183 bp

102.0

0.999

Primer Sequence [5'-->3']

Teste et al., 2009

F: 5'-ATTATATGTTTAGAGGTTGCTGCTTTGG-3'

cytoskeleton

R: 5'-CAATTCGTTGTAGAAGGTATGATGCC-3' Teste et al., 2009

F: 5'-CACGGATAGTGGCTTTGGTGAACAATTAC-3'

Teste et al., 2009

(acetyl-transferring) activity TAF10

RNA pol II transcription factor

R: 5'-TATGCTGAATCTCGTCTCTAGTTCTGTAGG-3' Teste et al., 2009

activity TFC1

RNA pol III transcription

Teste et al., 2009

R: 5'-GAACCTGCTGTCAATACCGCCTGGAG-3'

Teste et al., 2009

activity fructose-bisphosphate aldolase activity PMA1

hydrogen-exporting ATPase

F: 5'-GATACTTGGAATCCTGGCTGGTCTGTCTC-3'

R: 5'-AAAGGGTCTTCTGTTTCATCACCTGTATTTGC-3'

Cankorur-Cetinkaya et

F: 5'-GGTTTGTACGCTGGTGACATCGC-3'

al., 2012

R: 5'-CCGGAACCACCGTGGAAGACCA-3'

Vaudano et al., 2009

F: 5'-GCCTGCTAAGACTTACGATGACGC-3'


FBA1

F: 5'-GCTGGCACTCATATCTTATCGTTTCACAATGG-3'


Ubiquitin-protein ligase

F: 5'-ATATTCCAGGATCAGGTCTTCCGTAGC-3' R: 5'-GTAGTCTTCTCATTCTGTTGATGTTGTTGTTG-3'

factor activity UBC6

F: 5'-ATTTGCCCGTCGTGTTTTGCTGTG-3'


pyruvate dehydrogenase


R: 5'-TATGATTATCTGGCAGCAGGAAAGAACTTGGG-3' PDA1

activity, phosphorylative

Amplicon

Efficiency %

Reference


ACT1

Molecular function


Gene


Table 1. List of candidate reference genes and details of primers and amplicons for each gene.

R: 5'-TTCACCGGCGGCAACTGGAC-3'

Length

mechanism DED1

RNA strand annealing activity;

Present work

ATP-dependent RNA helicase

F: 5'-TGGCTGAACTGAGCGAACAAGTGC-3'

activity ENO1

phosphopyruvate hydratase

Present work

activity FAS2

R: 5'-CAGCGGCAGCTCTGGAAGCA-3'

Fatty Acid Synthetase activity

Present work

F: 5'-AGGGTGCTGCTGGTGCATGG-3'

myo-inositol transmembrane

Present work

F: 5'-CGCAATCAAATGTTGGTGATGCCG-3'


ITR1


R: 5'-ACACGGCTCTGACACCGTCG-3'

LYS14

R: 5'-CGCTAGCGGGAGCCCTCTGTA-3'

RNA pol II core promoter


transporter activity Present work

F: 5'-GCTAGAGCGGGATCTTTAGGTGGC-3'

PFK1

6-phosphofructokinase activity

R: 5'-GCTCTGAAGTAGTGGGATGACCTGC-3'


transcription factor activity Present work

F: 5'-GAGGTTGATGCTTCTGGGTTCCGT-3'

YRB1

Ran GTPase binding


R: 5'-TGTGGCGGTTTCGTTGGTGTCG-3' Present work

F: 5'-ATTCGATGCCGATGCCAAGGAATG-3' R: 5'-AGTGAAGGCTTCTGCTTCACCTTCT-3'

165 bp

101.0

0.998

129 bp

98.0

0.996

148 bp

105.4

0.995

138 bp

97.7

0.998

235 bp

95.2

0.999

Genes in bold are from the literature; the remaining reference genes were chosen by analyzing the transcriptome data set (Treu et al., 2014b). F and R indicate forward and reverse primer, respectively.
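The efficiency and R2 columns of Table 1 are the kind of values obtained from a standard curve of Cq against the logarithm of template dilution. The sketch below assumes the usual relation, efficiency % = (10^(-1/slope) - 1) x 100, with the slope taken from an ordinary least-squares fit; the function name and the dilution series are hypothetical and are not the data used for Table 1.

```python
def standard_curve(log10_dilutions, cq_values):
    """Least-squares fit of Cq versus log10(dilution).

    Returns slope, intercept, R2 and amplification efficiency in percent.
    """
    n = len(log10_dilutions)
    mean_x = sum(log10_dilutions) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_dilutions)
    syy = sum((y - mean_y) ** 2 for y in cq_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_dilutions, cq_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r2 = sxy ** 2 / (sxx * syy)
    efficiency_pct = (10 ** (-1.0 / slope) - 1.0) * 100.0
    return slope, intercept, r2, efficiency_pct

# Hypothetical ten-fold dilution series with a slope of -3.32 cycles per decade.
log10_dilutions = [0, -1, -2, -3, -4]
cq_values = [18.0, 21.32, 24.64, 27.96, 31.28]

slope, intercept, r2, efficiency_pct = standard_curve(log10_dilutions, cq_values)
print(round(slope, 2), round(efficiency_pct, 1), round(r2, 3))
```

A slope near -3.32 corresponds to an efficiency near 100 %, that is, a doubling of product at each cycle.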

Table 2. Strains fermentation performances in synthetic must (MS300).

          Fermentation start (hours)    Fermentation time (hours)     Maximum fermentation rate (CO2 g/l/h)
Strain    0 mg/l SO2    40 mg/l SO2     0 mg/l SO2    40 mg/l SO2     0 mg/l SO2    40 mg/l SO2
P283      6:05*         8:05*           175*          180*            1.28          1.31
P301      4:15          3:35            140           142             1.66*         2.12*
R008      4:35          4:15            187           189             1.66          1.99
R103      5:45          5:15            150*          171*            1.40          1.49

Significant differences are indicated with * (p<0.05).

Table 3. Candidate reference genes for normalization of qRT-PCR ranked according to their expression stability (calculated as the average M value after stepwise exclusion of worse scoring genes) by the geNorm applet.

Gene name    M value    CV
ALG9         0.527      0.469
YRB1         0.565      0.679
FBA1         0.589      0.475
UBC6         0.657      0.527
LYS14        0.705      0.855
PFK1         0.77       0.631
TFC1         0.833      0.288
PDA1         0.888      0.532
ITR1         0.952      0.43
ACT1         1.014      0.535
PMA1         1.075      0.664
DED1         1.127      0.629
FAS2         1.186      0.768
ENO1         1.373      1.227
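The stability measure M reported in Table 3 can be sketched as follows, assuming the definition given by Vandesompele et al. (2002): for each gene, M is the average, over all other candidate genes, of the standard deviation across samples of the log2-transformed expression ratio. The sketch performs a single pass only, without the stepwise exclusion of the worst gene; the function name and the input quantities are hypothetical.

```python
from math import log2
from statistics import mean, stdev

def genorm_m_values(quantities):
    """quantities: dict mapping gene name -> list of relative quantities,
    one per sample, in the same sample order for every gene."""
    m_values = {}
    for gene_j, q_j in quantities.items():
        pairwise_sd = []
        for gene_k, q_k in quantities.items():
            if gene_k == gene_j:
                continue
            # Variation of the expression ratio of the two genes across samples.
            log_ratios = [log2(a / b) for a, b in zip(q_j, q_k)]
            pairwise_sd.append(stdev(log_ratios))
        m_values[gene_j] = mean(pairwise_sd)
    return m_values

# Hypothetical relative quantities for three genes in three samples.
quantities = {
    "geneA": [1.0, 2.0, 4.0],
    "geneB": [2.0, 4.0, 8.0],   # constant ratio to geneA
    "geneC": [1.0, 1.0, 16.0],  # ratio to the others changes between samples
}

m_values = genorm_m_values(quantities)
for gene in sorted(m_values, key=m_values.get):
    print(gene, round(m_values[gene], 3))
```

Genes whose ratio to the other candidates stays constant receive a low M; in the stepwise procedure the gene with the highest M is removed and the values are recomputed on the remaining genes.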

Table 4. Candidate reference genes for normalization of qRT-PCR listed according to their expression stability calculated by the NormFinder applet.

Stability value

TFC1

0.280

FBA1 UBC6 PFK1 ITR1 PDA1 YRB1

0.363 0.416 0.458 0.489


ALG9


Gene name

0.534 0.608 0.610

LYS14

0.644

ACT1

0.672

DED1

0.769

FAS2

0.804

PMA1

0.896

ENO1

1.648

Table 5. Results from BestKeeper descriptive statistical analysis and correlation analysis (measures of the correlation coefficients between each reference gene and the BestKeeper index).

LYS14 ALG9 ITR1 FBA1 UBC6 TFC1 PDA1 YRB1 PFK1

n

25

25

25

25

25

25

25

25

GM [Cq]

21.62

26.36

25.24

24.74

17.96

25.08

28.23

23.34

AM [Cq]

21.67

26.38

25.27

24.76

18.00

25.09

28.24

Min [Cq]

19.22

24.75

23.98

23.05

15.98

23.03

26.41

Max [Cq]

25.02

28.75

27.83

27.26

20.40

26.55

30.00

SD [± Cq]

1.19

0.85

0.88

0.88

0.87

0.70

0.83

CV [% Cq]

5.51

3.22

3.47

3.54

4.85

2.80

coeff. of corr. [r]

0.87

0.94

0.75

0.95

0.82

p-value

0.001

0.001

0.001

0.001

0.001


ACT1

25

22.49

23.47

23.37

22.53

23.49

21.10

21.21

21.85

25.12

25.03

26.40

1.10

1.12

0.97

4.72

4.97

4.14

2.93

25

0.86

0.001

0.001


0.84

Abbreviations: n: number of samples; GM [Cq]: geometric mean of Cq (quantification cycle); AM [Cq]: arithmetic mean of Cq; Min [Cq] and Max [Cq]: extreme values of Cq; SD [± Cq]: standard deviation of the Cq; CV [% Cq]: coefficient of variation expressed as a percentage of the Cq level; coeff. of corr. [r]: coefficient of correlation.
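The descriptive statistics defined above can be computed for any series of Cq values as in the following sketch. It uses the sample standard deviation and expresses CV as 100 x SD / AM; this is an assumption made for illustration, since the BestKeeper spreadsheet may define its variability measure differently. The function name and the Cq values are hypothetical.

```python
from statistics import geometric_mean, mean, stdev

def cq_descriptive_stats(cq_values):
    """Descriptive statistics of the Cq values of one gene across samples."""
    am = mean(cq_values)
    sd = stdev(cq_values)  # sample standard deviation (assumption)
    return {
        "n": len(cq_values),
        "GM": geometric_mean(cq_values),
        "AM": am,
        "Min": min(cq_values),
        "Max": max(cq_values),
        "SD": sd,
        "CV%": 100.0 * sd / am,
    }

# Hypothetical Cq values of one candidate gene in three samples.
stats = cq_descriptive_stats([20.0, 22.0, 24.0])
for name, value in stats.items():
    print(name, round(value, 2))
```

Because Cq is a logarithmic measure of template amount, a small SD of Cq already corresponds to a noticeable fold change in starting material, which is why genes with low SD and CV are preferred.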


Table 6. Target genes and primer pair sequences; Pearson correlation coefficients (r) calculated for normalized RT-PCR expression data and RPKM values.

Name

SSU1

Molecular function

Description

Primer Sequence [5'-->3']

coefficient [r]

(SGD curated)

p-value

Plasma membrane sulfite pump involved in

sulfite transmembrane

F: 5'-TTTGCGTTTGTTGGTCAATTCTATGCCTTTTA -3'

sulfite metabolism; required for efficient sulfite

transporter activity

R: 5'-TCCACGCTTTCAATGCTGTTATACGGAGAA -3'

Subunit alpha of assimilatory sulfite reductase;

contributes to sulfite reductase

F: 5'-GTACACCCGTAACTGCCATTTCATCTGTGC -3'

0.852

complex converts sulfite into sulfide

(NADPH) activity

R: 5'-AATGGCTTCCCACGTGATTCGTTACCA -3'

0.05

cysteine synthase activity

F: 5'-GCCAAGAGAACCCTGGTGACAATGCTC -3'

0.988

efflux

MET10 MET17

correlation

O-acetyl

homoserine-O-acetyl

serine

0.990 0.01

sulfhydrylase; required for methionine and

R: 5'-GGAAACGGGAATAGACGTAACCTGGAACTTCT -3'

0.01

F: 5'-GGTAACGTAAAGACGGACGAAAATGGT -3'

0.846

cysteine biosynthesis detoxifies superoxide required

for

electron carrier activity

F: 5'-ATCATTTCTGGTATTGTTCACGACGTATCTGG -3'

stearoyl-CoA 9-desaturase

normal distribution of mitochondria

activity

Acyl-coenzymeA:ethanol

fatty acid ethyl ester biosynthesis during

alcohol O-octanoyltransferase activity short-chain carboxylesterase

fermentation; possesses short-chain esterase

activity

O-acyltransferase;

R: 5'-CAAGACATTTTGAGCGGCATTTGAGTGAC -3'

activity

0.884 0.01

R: 5'-CGGCAGCTTGCTTTGTTAACCAGGAAT -3'

0.756 0.05


F and R indicate forward and reverse primer respectively.

0.05

F: 5'-GCAACGGATGATCCAGTTACAGGTGAAAAC -3'


monounsaturated fatty acid synthesis and for

responsible for the major part of medium-chain

EEB1


desaturase;


acid

R: 5'-TTCAAAGATTCTTCAGTGTCACCCTTACCT -3'


Fatty

OLE1

superoxide dismutase activity


Cytosolic copper-zinc superoxide dismutase;

SOD1

Figure 1. geNorm output charts. (a) Determination of the optimal number of control genes for normalization, calculated on the basis of the pair-wise variation (V) analysis; V values under the 0.15 threshold line indicate no need to include further HKG for calculation of a reliable normalization factor; (b) average expression stability measure (M) of control genes during stepwise exclusion of the least stable control genes.
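The pair-wise variation V shown in panel (a) can be sketched as follows, assuming the definition of Vandesompele et al. (2002): the normalization factor NF_n is the geometric mean of the n most stable genes in each sample, and V(n/n+1) is the standard deviation across samples of log2(NF_n / NF_n+1). The function name and the input quantities are hypothetical; only the 0.15 threshold is taken from the figure legend.

```python
from math import log2
from statistics import geometric_mean, stdev

def pairwise_variation(ranked_quantities, n):
    """V(n/n+1) for genes ordered from most to least stable.

    ranked_quantities: list of per-gene lists of relative quantities,
    one value per sample, same sample order for every gene.
    """
    n_samples = len(ranked_quantities[0])
    nf_n = [geometric_mean([gene[s] for gene in ranked_quantities[:n]])
            for s in range(n_samples)]
    nf_n1 = [geometric_mean([gene[s] for gene in ranked_quantities[:n + 1]])
             for s in range(n_samples)]
    return stdev([log2(a / b) for a, b in zip(nf_n, nf_n1)])

# Hypothetical quantities: the third gene varies independently of the first two.
ranked = [
    [1.0, 2.0, 4.0],
    [2.0, 4.0, 8.0],
    [1.0, 1.0, 16.0],
]

v_2_3 = pairwise_variation(ranked, 2)
print(round(v_2_3, 3), "below threshold" if v_2_3 < 0.15 else "above threshold")
```

When adding the next gene changes the normalization factor only marginally, V falls below the threshold and the additional gene is not needed.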

Highlights

1. RNA-seq data set during fermentation process was used to select new reference genes
2. Reference genes were tested during fermentation in presence and absence of sulfite
3. Validation test was performed using different strains grown with and without sulfite
4. In the validation test RT-PCR gene expression was compared with RNA-seq data
5. The most reliable reference gene set was composed of UBC6, FBA1, ALG9 and PFK1