Preference mapping by PO-PLS: Separating common and unique information in several data blocks

Food Quality and Preference 24 (2012) 8–16

Ingrid Måge (a,*), Elena Menichelli (a,b), Tormod Næs (a,c)

a Nofima AS, Osloveien 1, NO-1430 Ås, Norway
b University of Life Sciences, NO-1430 Ås, Norway
c University of Copenhagen, Faculty of Life Sciences, Dept. of Food Science, Rolighedsvej 30, 1958 Frederiksberg, Copenhagen, Denmark

* Corresponding author. Tel.: +47 64970100; fax: +47 64970333. E-mail address: ingrid.mage@nofima.no (I. Måge).

Article history: Received 7 January 2011; Received in revised form 16 August 2011; Accepted 16 August 2011; Available online 25 August 2011.

Keywords: PO-PLS; PLS regression; Multi-block analysis; Preference mapping

Abstract

In food development, preference mapping is an important tool for relating product sensory attributes to consumer preferences. The sensory attributes are often divided into several categories, such as visual appearance, smell, taste and texture. This forms a so-called multi-block data set, where each block is a collection of related attributes. The current paper presents a new method for analysing such multi-block data: Parallel Orthogonalised Partial Least Squares regression (PO-PLS). The main objective of PO-PLS is to find common and unique components among several data blocks, and thereby improve the interpretation of models. In addition, PO-PLS overcomes some challenges of standard multi-block PLS regression related to the scaling and dimensionality of the blocks. The method is illustrated by two case studies. One of them is based on a collection of flavoured waters that are characterised by both odour and flavour attributes, forming two blocks of sensory descriptors. A consumer test has also been performed, and PO-PLS is used to create a preference map relating the sensory blocks to consumer liking. The new method is also compared to a preference map created by standard PLS regression. The same is done for the other data set, where instrumental data are used together with sensory data when predicting consumer liking. Here the sensory variables are divided into two blocks: one related to appearance and mouth feel attributes and the other describing odour and taste properties. In both cases the results clearly illustrate that PO-PLS and PLS regression are equivalent in terms of model fit, but PO-PLS offers some interpretative advantages.

© 2011 Elsevier Ltd. All rights reserved.

1. Introduction

In modern food development, actual prototypes are often assessed by several measurement principles, such as chemical analysis, descriptive sensory analysis and various types of consumer liking or choice tests. Typically, one will relate these different measurements to each other in order to obtain improved information about what the main "drivers of liking" are and how the values of these "drivers" can be optimised (Helgesen, Solheim, & Næs, 1997; Moskowitz & Silcher, 2006; Næs, Lengard, Johansen, & Hersleth, 2010). The focus of the present paper is the relation between sensory attributes and instrumental measurements on one side and consumer liking of products on the other.

Descriptive sensory analysis data often consist of different groups or types of attributes. For food products the most important groups are attributes related to visual appearance, smell, taste and texture. In many cases all these attributes are considered together (Helgesen & Næs, 1995; McEwan, 1996; Wold, Veberg, & Nilsen,

2006), while in other cases one will also be interested in how the different groups of attributes, here called data blocks, relate to each other (Martens, Tenenhaus, Vinzi, & Martens, 2007). Likewise, within consumer testing one may be interested in the relation among different measurements taken, for instance among expectation, blind and informed liking or between consumer attributes such as attitudes and habits. The challenge is then not only to find the relation between the main categories of data as listed in the first paragraph, but also relations within each of the categories. Multivariate data analysis tools such as Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression (Martens & Næs, 1989), are essential for interpreting relationships between many variables. When several data blocks are present, a straightforward solution is to put all variables together into one large data matrix and analyse it with conventional PLS and PCA, depending on whether a predictive direction is present or not. This is often referred to as multi-block PCA and PLS, respectively (Westerhuis, Kourti, & MacGregor, 1998). The drawback of this approach is that variables from different blocks are mixed together, which might obscure interpretation (Jørgensen, Segtnan, Thyholt, & Næs, 2004). The solution will also depend heavily on how the different


variable blocks are scaled relative to each other, and there may be problems for situations with different dimensionality within each of the blocks. A variant of this approach which proposes a special type of weighting is Multiple Factor Analysis (MFA, Escofier & Pagès, 1994), based on PCA of a concatenated matrix after weighting of each block separately. An approach which solves the problem of different scale is Canonical Correlation Analysis (CCA), introduced by Hotelling (1936). In CCA, linear combinations of two blocks of variables are obtained in such a way that the squared correlation between the linear combinations is maximised. A generalisation of the method, called GCA (Carroll, 1968), allows for more than two data blocks. Even though GCA is invariant to scale, it has other problems related to over-fitting and instability when the number of variables is large. Other related approaches can be found in Kettenring (1971); Hanafi and Kiers (2006); Dahl and Næs (2006) and Kohler, Bertrand, Møretrø, and Qannari (2009).

In the area of chemometrics, a couple of methods have recently been developed for solving these problems, based on sequential use of PLS regression on matrices that are orthogonalised with respect to each other. These methods are invariant with respect to the relative scale of the data blocks, they allow for different dimensionality of the blocks, allow for high collinearity within and between blocks, and enhance interpretation (Jørgensen, Mevik, & Næs, 2007; Jørgensen et al., 2004; Måge, Mevik, & Næs, 2008). So far the methods have mainly been tested for predictive modelling of production processes, with a recent exception of Næs, Tomic, Mevik, and Martens (2010) where one version is used within the context of path modelling. Two variants of this type of modelling exist, namely the sequential procedure (SO-PLS, Jørgensen et al., 2004, 2007; Næs et al., 2010) and the so-called parallel method (PO-PLS, Måge, Mevik, & Næs, 2008). The two variants are useful for different purposes, and the difference lies in the way the data blocks are incorporated and which type of information is extracted. In SO-PLS, the focus is on incorporating blocks of data one at a time and on assessing and interpreting the incremental or additional contribution of each block added. For the PO-PLS method the focus is on first identifying the information that is common between the blocks and then identifying the information in each block that is unique.

The present paper is a study of the use of PO-PLS in the area of preference mapping. The method is a combination of PLS regression and GCA. In the situation considered here one is interested in the relation between sensory attributes and consumer liking, with a special focus on how different blocks of sensory data relate to each other and to the consumer preference data. In one of the examples used for illustration, instrumental data will be applied together with sensory data when predicting consumer liking. In this way the paper is also an illustration of how one can incorporate instrumental data together with sensory data in preference mapping using one single analysis. It will be shown that this type of modelling can be used for obtaining more information than standard preference mapping, which will also be tested on the same data. The examples chosen are particularly useful for showing what the new method does in comparison with standard approaches.
An additional aim of the paper is to make the methodology known to the sensory and consumer science community. The method will be illustrated by analysing two data sets.

2. Materials and methods

2.1. Data sets

2.1.1. Flavoured waters

The main objective of this study was to develop a new type of flavoured water. Sensory and consumer trials were performed in

order to optimise the recipe and gain knowledge about which sensory attributes the consumers respond positively (or negatively) to. The data set was collected in such a way that it is suitable for investigating how different groups of attributes are affected by the recipe, and how consumers relate to these groups.

Eighteen water samples were prepared according to a full factorial design with three design factors: flavour type (A or B), flavour dose (0.2%, 0.6% or 0.8%) and sugar content (L (low), M (medium) or H (high)). A trained sensory panel consisting of 11 assessors evaluated the samples first by smelling (9 descriptors) and then by tasting (14 descriptors). The test was done according to a standard descriptive analysis protocol using a scale between 1 and 9 for each of the attributes. Two data blocks were then obtained for the odour and taste attributes separately by averaging both data sets over the assessors. The sensory descriptors are listed in Table 1.

In addition, 180 consumers tested 10 of the waters each and rated their overall liking on a scale from 1 ("Dislikes very much") to 9 ("Likes very much"). The ten waters per consumer were selected according to an incomplete block design and were presented in two sessions with five waters in each session. The consumers were selected according to relevant market figures: 50% males/females, aged between 20 and 49 years. The missing observations, due to the incomplete design structure, were estimated by PCA (The Unscrambler X, version 10.0.1, CAMO Software AS, Oslo, Norway). The NIPALS algorithm was used for estimation, and the consumers were mean centred but not scaled. The percentage of missing values is high (44%), but the estimates are regarded as adequate since the number of consumers is relatively high and the structure of the data is strong (Hedderley & Wakeling, 1995).

The data thus consist of four data blocks: the design matrix, two sensory data sets and one consumer liking data set (see Fig. 1). In this case, the design matrix is only used for interpretation purposes. Further information about the data set can be found in a series of application notes from CAMO Software AS (Måge, 2008a, 2008b, 2008c).

Fig. 1. Data blocks in the flavoured waters case study.

Table 1
Sensory descriptors in the flavoured waters data set. To distinguish the two groups "odour" and "flavour", odour descriptors are always given in upper-case letters and flavour descriptors in lower-case letters.

Odour:   RIPE, TROPICAL, CANDY, SYNTHETIC, LACTONIC, SULFURIC, SKIN, GREEN, FLORAL
Flavour: Ripe, Tropical, Candy, Synthetic, Lactonic, Sulfurous, Skin, Green, Floral, Sweet, Sour, Bitter, Dry, Sticky
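The missing liking scores described above were imputed with the NIPALS PCA implementation in The Unscrambler. As a rough illustration of the same idea, the sketch below uses an iterative SVD-based PCA imputation (an EM-style alternative to NIPALS); the array name `liking` (products x consumers, NaN for untasted samples) and the choice of two components are assumptions for illustration only, not part of the original analysis.

```python
import numpy as np

def impute_pca(liking, n_comp=2, n_iter=200, tol=1e-8):
    """Iterative PCA imputation: fill missing cells, fit a rank-n_comp PCA
    on column-centred data, replace the missing cells with the model
    reconstruction, and repeat until the imputed values stabilise."""
    X = liking.astype(float).copy()
    missing = np.isnan(X)
    # start by filling missing cells with the column (consumer) means
    col_means = np.nanmean(liking, axis=0)
    X[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        centre = X.mean(axis=0)
        U, s, Vt = np.linalg.svd(X - centre, full_matrices=False)
        recon = U[:, :n_comp] * s[:n_comp] @ Vt[:n_comp] + centre
        change = np.max(np.abs(X[missing] - recon[missing]))
        X[missing] = recon[missing]
        if change < tol:
            break
    return X

# hypothetical usage: 18 products x 180 consumers, NaN for untasted waters
# liking_full = impute_pca(liking, n_comp=2)
```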

2.1.2. Jams

This data set stems from the Norwegian food research institute (now called Nofima). It consists of 12 raspberry jams selected according to a factorial design based on four production places (C1–C4) and three harvesting times (H1–H3). It is used as a tutorial data set in The Unscrambler (CAMO Software AS, Oslo, Norway), and is also thoroughly described and analysed by Esbensen (2002). The jams were evaluated by a trained sensory panel, rating 12 attributes on a 9-point scale, and overall liking was scored by 114 representative consumers. For illustration purposes we will here confine ourselves to the average of these consumer preferences. It will also be shown below that the prediction ability for the average liking is very high, showing that the individual differences component is of less interest here. In addition, six relevant instrumental variables were measured in the laboratory. An overview of the sensory and instrumental variables is given in Table 2. The sensory variables are divided into two conceptually meaningful blocks, one consisting of appearance and mouth feel attributes and the other describing odour and taste attributes.

For the statistical analysis we therefore have three predictor blocks (instrumental, appearance/mouthfeel and odour/taste) and one response variable (average consumer preference), in addition to the design, which will here only be used for interpretation. The data blocks are illustrated in Fig. 2. It should be mentioned that other ways of splitting the data into blocks could also be envisioned, if other aspects than those discussed here are of interest.

Table 2
Instrumental and sensory variables for the jam data set. The sensory variables are separated into two blocks, one describing the appearance and mouthfeel of the jams and one describing the odour and taste.

Block: Instrumental
  L         Spectrophotometric colour measurement (lightness)
  a         Spectrophotometric colour measurement (red/green)
  b         Spectrophotometric colour measurement (yellow/blue)
  Absorb    Absorbency
  Soluble   Soluble solids (%)
  Acidity   Titratable acidity (%)

Block: Sensory appearance/mouth feel
  Red       Redness
  Colour    Colour intensity
  Shiny     Shininess
  Juicy     Juiciness
  Viscos    Viscosity/thickness
  Chewing   Chewing resistance

Block: Sensory odour/taste
  Smell     Raspberry odour
  Flavour   Raspberry flavour
  Sweet     Sweetness
  Sour      Sourness
  Bitter    Bitterness
  Off-flav  Off-flavour

2.2. Preference mapping by PLS regression

In external preference mapping (Næs et al., 2010), a reduced-rank map of the sensory data, obtained by PCA, is constructed, and the consumers are then related one by one to the information in the sensory space, i.e. the principal components of the sensory profile. Generally, both Principal Component Regression (PCR) and PLS regression can be used for external preference mapping, but for situations where the different consumers test different products, PCR may be the most suitable since it builds on the same low-dimensional space for all consumers. In the flavoured waters case presented here, however, the missing values in the consumer liking

matrix are estimated (imputed), and therefore both approaches can be used. In general, from a practical point of view, the two approaches usually give quite similar results (Næs, Brockhoff, & Tomic, 2010). Here we have chosen PLS regression, since PLS is the regression method used in the PO-PLS method.

2.3. Preference mapping by PO-PLS regression

The main aim of PO-PLS is to identify common (i.e. overlapping, redundant) and unique components across multiple predictor data blocks (Måge, Mevik, & Næs, 2008). The PO-PLS algorithm uses a combination of PLS regression and GCA to identify these components. The idea is to first identify a subspace that is common for the input blocks, and then orthogonalise the individual blocks with respect to this space in order to identify the unique information in the blocks. For combinations of, for instance, chemical and sensory data, this method is suitable for identifying which parts (which dimensions) of the two data sets are common and which parts are unique and non-overlapping with the other. This adds to the general understanding of the data and may also give information about what to measure in future applications of the same type.

The first step is to do separate PLS regressions for each input block and use only the relevant scores for the rest of the estimation. The purpose of this is to remove as much of the noise as possible and to reduce dimensionality, and in this way stabilise the estimation of both the common and the unique contributions. This step is particularly useful for the GCA method, which is very sensitive to noise. A number of selected score vectors thus define the relevant subspaces for the different blocks. Then, GCA is used to explore the correlation structure between these subspaces. GCA is a method which finds linear combinations of the different blocks that correlate as strongly as possible with each other. Directions or dimensions with a sufficiently high canonical correlation are defined as common. When the common information is identified, the original block subspaces are orthogonalised with respect to the common components, and the information left is regarded as unique for each block. The unique part of the data blocks is then restructured into a set of unique components/dimensions by a new PLS regression with the orthogonalised subspace as predictors.

The PO-PLS algorithm will here be described for B blocks. For one of the examples B will be equal to 2, and for the second one it will be equal to 3. Before starting the algorithm, the variables in all matrices are centred and possibly scaled. Note that the method is invariant to the relative scaling between the blocks, but not within each block. The different steps of the method are discussed below.

3. PO-PLS algorithm

1. Perform a standard PLS regression of Y versus each of the input blocks X_b (b = 1, ..., B) and keep a relevant number of components, T_b, from each model.


Fig. 2. Data blocks in the jam case study.

2. For a subset k of the blocks, GCA is used to identify common components (or directions in space) as those linear combinations of the blocks that have a high (i.e. close to 1) correlation. The scores of these components are denoted T_Ck.

3. Orthogonalise the T_b with respect to T_Ck, obtaining T_b,O.

4. Repeat steps 2 and 3 for all relevant subsets of blocks, now using the orthogonalised T_b,O as input to GCA.

5. For each block, perform a standard PLS regression of Y versus the orthogonalised scores of that block, i.e. versus T_b,O, obtaining unique scores T_Ub.

6. Build the final prediction model by fitting Y directly to the concatenated score matrix [T_C1, ..., T_CK, T_U1, ..., T_UB] by ordinary least squares regression, obtaining regression coefficients R.

When the model is fitted, loadings for each block are calculated as P_b = X_b^T T_b for the predictor blocks and Q_b = Y^T T_b for the response.

3.1. Comments to the algorithm

As can be seen, the method produces a set of blocks of common components (T_Ck, where k denotes a subset of the blocks), depending on which combinations of blocks are considered in step 2. The "leftovers" from this step, i.e. the parts of each individual block that are orthogonalised with respect to the common components (the unique parts, T_b,O), are likewise decomposed into a reduced number of PLS components (T_Ub). The underlying model for PO-PLS can thus be formulated as:

$$X_b = T_{Cb} P_{Cb}^{T} + T_{Ub} P_{Ub}^{T} + E_b, \qquad b = 1, \ldots, B$$

$$Y = T_{tot} R + F$$

where T_Cb comprises the common scores for all subsets where block b is present, and T_tot is the concatenation of all common and unique scores for all blocks, [T_C1, ..., T_CK, T_U1, ..., T_UB]. The matrices E_b and F are error terms not accounted for by the systematic common and unique components. Note that the scores (both common and unique) within each block are always orthogonal, while scores from different blocks are not necessarily so. For instance, a common component between blocks 1 and 2 is not orthogonal to unique components from block 3. The correlation between scores from different blocks will, however, never be close to one, because they would then have been identified as common components. In the example below it will be shown that this correlation can be very small, which will typically be the case in practice.

Contrary to PCA and PLS regression, it is the scores, and not the loadings, that are scaled to unit variance in PO-PLS. The reason for this is that the common scores represent several predictor blocks, but explain a different amount of variance in each block. The variance is therefore retained in the individual block loadings instead of the scores.
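To make the steps above concrete, the following is a minimal two-block sketch in Python (numpy + scikit-learn). It is not the authors' implementation: the canonical correlations are obtained from an SVD of orthonormalised score bases, the common scores are formed by simply averaging the two canonical variates, and the component counts and the 0.95 correlation limit are placeholder assumptions that would in practice be chosen by the validation procedure described below.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def pls_scores(X, Y, n_comp):
    """Steps 1 and 5: PLS regression of Y on one block; keep the X-scores."""
    pls = PLSRegression(n_components=n_comp, scale=False).fit(X, Y)
    return pls.x_scores_                                   # (n_samples, n_comp)

def canonical_pair(T1, T2):
    """Step 2 (two blocks): canonical correlations between two score
    subspaces via an SVD of the product of their orthonormal bases."""
    Q1, _ = np.linalg.qr(T1)
    Q2, _ = np.linalg.qr(T2)
    U, s, Vt = np.linalg.svd(Q1.T @ Q2)
    return Q1 @ U, Q2 @ Vt.T, s                            # variates for each block, correlations

def orthogonalise(T, Tc):
    """Step 3: project the common subspace Tc out of the block scores T."""
    if Tc.shape[1] == 0:
        return T
    return T - Tc @ np.linalg.lstsq(Tc, T, rcond=None)[0]

def correlation_loadings(A, T):
    """Correlation between each variable in A and each score vector in T."""
    Az = (A - A.mean(0)) / A.std(0, ddof=1)
    Tz = (T - T.mean(0)) / T.std(0, ddof=1)
    return Az.T @ Tz / (len(A) - 1)

def po_pls_two_blocks(X1, X2, Y, n1=3, n2=3, nu1=1, nu2=1, corr_limit=0.95):
    X1, X2, Y = (np.asarray(A, dtype=float) for A in (X1, X2, Y))
    X1, X2, Y = (A - A.mean(axis=0) for A in (X1, X2, Y))  # centre all blocks
    T1 = pls_scores(X1, Y, n1)                             # step 1
    T2 = pls_scores(X2, Y, n2)
    V1, V2, corr = canonical_pair(T1, T2)                  # step 2
    n_common = int(np.sum(corr > corr_limit))
    Tc = (V1[:, :n_common] + V2[:, :n_common]) / 2.0       # one simple choice of common scores
    T1o = orthogonalise(T1, Tc)                            # step 3
    T2o = orthogonalise(T2, Tc)
    Tu1 = pls_scores(T1o, Y, nu1)                          # step 5: unique components
    Tu2 = pls_scores(T2o, Y, nu2)
    Ttot = np.hstack([Tc, Tu1, Tu2])
    Ttot = Ttot / Ttot.std(axis=0, ddof=1)                 # scores scaled to unit variance
    R = np.linalg.lstsq(Ttot, Y, rcond=None)[0]            # step 6: ordinary least squares
    P1, P2, Q = X1.T @ Ttot, X2.T @ Ttot, Y.T @ Ttot       # loadings; variance is kept here
    return {"scores": Ttot, "coef": R, "loadings": (P1, P2, Q),
            "canonical_corr": corr,
            "corr_loadings_X1": correlation_loadings(X1, Ttot),
            "corr_loadings_X2": correlation_loadings(X2, Ttot)}
```

For the flavoured waters case, X1 and X2 would be the averaged odour and flavour profiles (18 x 9 and 18 x 14) and Y the imputed consumer liking matrix; the correlation loadings returned at the end are the quantities plotted in the preference maps discussed below.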

Scores and loadings can be plotted and interpreted in the same way as for PCA and PLS regression. The common components will have loadings from several blocks, and it is natural to plot these together in order to identify variables from different blocks that are correlated. Since the loadings are not normalised, it is more informative to base such combined loading plots on correlation loadings, which are simply the correlations between each of the score vectors and each X- or Y-variable. Correlation loadings will therefore be used to interpret the results here.

The first step in the algorithm is necessary in order to enhance stability. Since only the PLS components are used, the noise is filtered out and the dimensionality is reduced. Alternatively, one could have used PCA at this point. In this step it is generally better to keep too many than too few components, in order to guarantee that the main part of the information is carried forward to the rest of the process. If many noise components are kept, however, the likelihood of obtaining "false" common components (i.e. components with high correlation but low explained variances) increases.

In the second step, the focus is on identifying the common space. The problem here is to define the size of the correlation that sets the limit between common and not-common dimensions. The limit depends on the noise level in the blocks: for noisy data one might accept a lower correlation than for very precise measurements. In our experience, the correlation should in any case be above 0.90 or 0.95. In addition to the correlation coefficient, the explained variances in X_b and Y can be used to decide whether a component should be identified as common or not (as will be demonstrated below). For instance, if a common component has a high canonical correlation but only explains a small percentage of a certain data block, it is likely that it explains only noise and should not be associated with that block despite the high correlation.

In the third step, orthogonalisation is used to remove the common space dimensions from the predictor blocks. The orthogonalised T_b,O can also be seen as the residuals obtained by fitting or predicting the block from the joint "block" represented by T_Ck.

In the fifth step, the unique parts of the predictor blocks are identified as the remaining structure in T_b,O after all common components are projected away. The PLS regression in this step is needed to obtain full-rank and orthogonal score matrices for each predictor block. Similarly to step 1, PCA could have been used instead of PLS at this point.

3.2. Validation and selection of components

In PLS regression, the number of components is often selected based on cross-validation (CV). This is also the case for PO-PLS, but instead of one global CV, a sequence of cross-validations is performed. A local CV is run in each step of the algorithm, and the numbers of components are selected sequentially as the algorithm


proceeds. This means that the number of common components is selected first, and the unique components will depend on the previous decisions. With this approach, the focus is on detecting common structures rather than on constructing the optimal prediction model. However, the results show that the prediction power is comparable to that of the ordinary PLS model.

Cross-validation is not always suitable for small factorial designs, since all samples are unique and are needed to span the variation adequately. In some cases, the design itself can be used as external validation of the solution, by acknowledging that all systematic variation in the data set should be caused by the design factors, and that only components reflecting these factors should be regarded as valid. This is a common way of validating solutions in areas with few samples, and it will be applied for the flavoured waters data set.

4. Results

4.1. Flavoured waters data set

4.1.1. PLS regression based on all input variables

All variables were used here without any standardisation, only centring. The PLS results are presented in Fig. 3 and in Table 3. The first two PLS components are clearly related to two of the design variables: the samples are divided into two vertical groups according to flavour type and into three horizontal segments related to the sugar content (see the right plot in Fig. 3). These two components describe 89% (58% + 31%) of the information in X (the odour and flavour sensory blocks treated together). The regression model for the preference data versus the two components from the sensory analysis explains about 23% (12% + 11%) of the variance. Note that all these percentages are based on fitting, so validation needs to be based on the design as discussed above.

The loadings plot for the two components is presented in Fig. 3. The density of consumers is calculated by counting the number of consumers in each 0.1 × 0.1 square of the correlation loadings plot. The densities reveal that consumers show a higher preference for samples with high values of the synthetic, lactonic and floral attributes, positioned at the right-hand side of the loadings plot. In particular, most of the consumers lie in the upper right quadrant, described mainly by odour variables (capital letters). The score plot in Fig. 3 suggests which samples correspond to the different sensory characteristics. The most preferred samples are characterised by flavour type A and a medium–high sugar content, since they lie to the right in the scores plot. The least preferred ones are those with flavour type B and low sugar content. The design variable flavour dose is not revealed in this space; its information is to some extent present in the third component, but that component explains only 3% of the variation in the sensory data and 6% of the consumer data. For visualisation of the method we focus on the two most important components only.

5. PO-PLS regression

The PO-PLS regression model suggests one common component from both blocks, and one unique component from the flavour block. The two components explain 23% of the variation in the consumer liking block, which is the same percentage as for the PLS model. The selection of the number of components was done sequentially, using the design variables to validate the components:


1. Selecting relevant subspaces for each block: Three PLS components were selected from each block. The three components explained 96% (90% + 4% + 2%) and 92% (72% + 17% + 3%) of the X-variance in the odour and flavour blocks, respectively. Three components were selected despite the small amount of explained variance for some of them, since there are three design variables. In this step, it is better to keep too many than too few components.


Fig. 3. Preference map by PLS regression. The scores plot (left) shows the products. Labels represent flavour type and sugar level. Loadings plot (right) shows the flavour descriptors (lower case) and odour descriptors (upper case) and density distribution of consumers, where darker colour represents higher density.

Table 3
Flavoured waters data set: component-wise and total explained X- and Y-variances for the PLS model and the PO-PLS model. The numbers are based on fitting, not cross-validation.

PLS
                            Comp. 1 (%)   Comp. 2 (%)   Total (%)
Explained X-variance        58            31            89
Explained Y-variance        12            11            23

PO-PLS
                            Common comp. (%)   Unique comp. (%)   Total (%)
Explained X-variance:
  Odour                     86                 -                  86
  Flavour                   25                 63                 88
Explained Y-variance        13                 10                 23



Fig. 4. Preference map by PO-PLS regression. The scores plot (left) shows the products. Labels represent flavour type and sugar level. Loadings plot (right) shows the flavour descriptors (lower case) and odour descriptors (upper case) and density distribution of consumers, where darker colour represents higher density.

2. Selecting common components: Canonical correlation analysis resulted in correlation coefficients of 0.99, 0.82 and 0.66. As can be seen, only the first one is close to 1, which is the criterion for common variation. This was also the only component among the three that clearly reflected one of the design variables.

3. Selecting unique components:
   a. The first unique component from the odour block separates the low flavour dose from the medium/high dose to some extent, but the pattern is unclear and the explained X- and Y-variances (odour and consumer liking, respectively) are both below 10%. It is therefore considered of less interest (compared to the rest) here.
   b. The first unique component from the flavour block showed a clear grouping according to sugar content, and this component was therefore regarded as valid. It explains 63% of the information in the flavour block.

Scores and loadings for the common and unique components are given in Fig. 4, and the explained variances are given in Table 3. The two components are uncorrelated since they are orthogonal, and can therefore be treated independently of each other. In the scores plot, the waters are clearly grouped according to flavour type along the common component and according to sugar level along the unique flavour component. The common component is the most important for consumer acceptance, explaining 13% of that variance alone. The common component also represents the most dominant variation in the odour block, explaining 86% of the odour variation. On the other hand, it only explains 25% of the flavour variation. The density of consumers is strongest in the direction of flavour type A, which is described as synthetic, floral, lactonic and green.

The unique flavour component explains 63% of the total flavour variation, and an additional 10% of the consumer liking variation. It is thus the most dominant feature in the flavour block. Consumer densities are essentially centred along this component, with one small peak on each side of the centre. This indicates that there are two segments of consumers with opposing sweetness preferences. As can be noted, the two axes are here presented in the opposite order compared with the regular PLS regression. The reason for this is practical, since it highlights the attribute names better.
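The consumer densities shown in the loading plots are simple two-dimensional counts. A minimal sketch of this calculation, assuming a matrix with one row per consumer holding that consumer's correlation loadings on the two plotted components (the function and variable names are illustrative only):

```python
import numpy as np

def consumer_density(consumer_loadings, bin_width=0.1, limit=1.0):
    """Count consumers in each bin_width x bin_width square of the
    correlation-loading plane spanned by the two plotted components."""
    edges = np.arange(-limit, limit + bin_width, bin_width)
    counts, _, _ = np.histogram2d(consumer_loadings[:, 0],
                                  consumer_loadings[:, 1],
                                  bins=[edges, edges])
    return counts, edges

# hypothetical usage with 180 consumers:
# counts, edges = consumer_density(corr_loadings_consumers)
# darker cells in the plot correspond to larger counts
```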

6. Comparison of the two methods

As can be seen, the two methods revealed much of the same information. PO-PLS, however, offers some additional interpretability advantages. For instance, it emphasises that the differences in flavour type can be distinguished both by smelling and by tasting, while the sweetness variations can only be captured by the flavour attributes (which was to be expected). The method also shows explicitly that the complexity of the two sensory blocks differs for the purpose of predicting consumer liking: the odour block is essentially one-dimensional (only the common component), while the flavour block is two-dimensional (one common and one unique component). This also means that the odour variables in this case add little to the overall prediction of consumer liking. As can also be noted, PO-PLS gives explicit measures of how dominant the components are in each predictor block (see Table 3). While PLS only tells us that component 1 is the overall most dominant in all predictor blocks, PO-PLS shows how the variability is distributed between the blocks.

6.1. Jams data set

6.1.1. PLS regression

Since the predictor data consist of both instrumental measurements (with different units) and sensory attributes, all variables were scaled to unit variance and centred before analysis. Cross-validation suggested a model with two components, explaining 65% (39% + 26%) of the X data and 93% (85% + 8%) of Y (values obtained by regression fitting). The overall consumer preference is thus very well explained by the model, showing the relevance of considering only consumer averages in this illustration. Full cross-validation is consistent with this result, validating 85% (75% + 10%) of the explained Y-variance. The corresponding numbers for the individual predictor blocks are given in Table 4. The scores and loadings are presented in Fig. 5. It can be seen from the loadings plot that the variables correlating most strongly with consumer preference are mainly related to colour along the first component, including both instrumental colour variables ('Absorb', 'L', 'a' and 'b') and sensory variables from the appearance/mouthfeel block ('Colour', 'Red').

Table 4
Number of components and explained variances for the separate PLS regression models of the three blocks in the jam data case.

Block                          # Comp   Explained X-variance (%)   Explained Y-variance, Cal (%)   Explained Y-variance, Val (%)
Instrumental                   2        88                         89                              78
Sensory appearance/mouthfeel   2        74                         85                              72
Sensory odour/taste            3        97                         76                              45



Fig. 5. Preference map for the jam data case by PLS regression: scores (left) and correlation loadings (right). In the scores plot, labels C1–C4 represent the four production sites and H1–H3 the three harvesting times. In the loadings plot, the Y-loading "PREF" is boxed and upper-case.

Jams with an intense red colour are the most preferred ones. The most important variables along the second component are related to the sweetness of the jams, i.e. 'Acidity' (instrumental) and 'Sour' (odour/taste) versus 'Sweet' (odour/taste). Sweeter jams are preferred over sour jams. The first component is the most important for describing consumer liking, meaning that the colour of the jam is much more important than the sweetness. Intensities of raspberry smell and flavour do not seem to be important at all. The scores plot shows a grouping of samples according to harvesting time along the first component; the early-harvested berries give lower colour intensity and thereby less preferred jams. Along the second component, it is clear that the berries from production site C1 are sweeter than the others, at least for the first two harvesting times.
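As a rough sketch of the standard PLS preference model used here (autoscaling, a two-component model, and both fitted and fully cross-validated explained Y-variance), assuming a 12 x p matrix X of instrumental and sensory variables and a vector y of average consumer liking; the names and the use of scikit-learn with leave-one-out cross-validation are illustrative assumptions, not the software actually used in the paper:

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

def fit_pls_prefmap(X, y, n_comp=2):
    """Standardise the mixed instrumental/sensory variables, fit a PLS model,
    and report fitted and leave-one-out cross-validated explained Y-variance
    (calibration and validation in the paper's terminology)."""
    model = make_pipeline(StandardScaler(), PLSRegression(n_components=n_comp))
    model.fit(X, y)
    y_fit = model.predict(X).ravel()
    y_cv = cross_val_predict(model, X, y, cv=LeaveOneOut()).ravel()
    ss_tot = np.sum((y - y.mean()) ** 2)
    r2_cal = 1 - np.sum((y - y_fit) ** 2) / ss_tot   # explained Y-variance, calibration
    r2_val = 1 - np.sum((y - y_cv) ** 2) / ss_tot    # explained Y-variance, validation
    return model, r2_cal, r2_val

# hypothetical usage for the 12 jams:
# model, r2_cal, r2_val = fit_pls_prefmap(X, y_pref)
```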

6.1.2. PO-PLS regression

Full cross-validation was used to select the optimal number of components. First, separate PLS regressions from the three blocks were run. The selected numbers of components and corresponding explained variances for these models are given in Table 4. The selection of components in PO-PLS was done according to the following procedure:

1. Selecting common components for all three blocks: Canonical correlation analysis resulted in correlation coefficients of 0.91 and 0.82. The first component is above 0.9 but still not extremely high, so it is important to inspect it thoroughly to determine whether it should be considered a valid common component or not. Looking at the explained variances, we see that the component explains 54%, 42% and 9% of the instrumental, appearance/mouthfeel and odour/taste blocks, respectively. Looking at the loadings plot (not shown here), we also see that none of the individual odour/taste attributes are more than 30% explained. This means that the odour/taste block does not play an active part in this component, which mostly explains variation from the two other blocks. We therefore select no components as common for all three blocks.

2. Selecting common components for pairs of blocks:
   a. Instrumental and appearance/mouthfeel. The correlation coefficients are 0.97 and 0.78, indicating that the first component is a common one. It explains 64% and 55% of the instrumental and appearance/mouthfeel blocks respectively, and 77% of the Y-variance. This component is therefore selected as valid.
   b. Instrumental and odour/taste. The correlation coefficients are 0.98 and 0.69, indicating that the first component is common. It explains 22% of the instrumental block and 30% of the odour/taste block. It also explains an additional 15% of the Y-variance, and is therefore selected as valid.
   c. Appearance/mouthfeel and odour/taste. The correlation coefficients are here 0.87 and 0.57, meaning that there is no strong common structure. No common components were selected.

3. Selecting unique components: No unique components from any of the blocks gave any additional contribution to explaining the consumer preference, and none of them was therefore included in the model.

In conclusion, the sequential selection resulted in a model with one common component from the instrumental and appearance/mouthfeel blocks, and one common component from the instrumental and odour/taste blocks. The two components are not completely orthogonal, but the correlation between them is just 0.01. We can therefore plot scores and loadings for the two components against each other, as is done in Fig. 6. All the relevant explained variances are summarised in Table 5.
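The decisions in steps 1–2 above combine the canonical correlation with the explained variances in the blocks and in Y. A possible formalisation of such a rule is sketched below; the 0.90 correlation limit and the 20%/5% variance limits are illustrative assumptions, not values prescribed by the method:

```python
import numpy as np

def accept_common(canonical_corr, expl_block_var, expl_y_var,
                  corr_limit=0.90, min_block_var=0.20, min_y_var=0.05):
    """Accept a candidate common component only if its canonical correlation
    is close to 1, it explains a non-trivial share of every block it is
    supposed to be common for, and it adds to the explained Y-variance."""
    return (canonical_corr >= corr_limit
            and np.min(expl_block_var) >= min_block_var
            and expl_y_var >= min_y_var)

# e.g. the first jam candidate for all three blocks: correlation 0.91,
# block variances (0.54, 0.42, 0.09) -> rejected because of the odour/taste block
# accept_common(0.91, [0.54, 0.42, 0.09], 0.15)   # returns False
```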

7. Comparison of the two methods

As for the other example, the main interpretations from PO-PLS and the ordinary PLS regression are similar. There is, however, some information in the PO-PLS model which is not explicitly found in the PLS model. The instrumental block is present in both common components: the L, a, b and absorbance values contribute to one of them and acidity to the other. The two dimensions are close to orthogonal to each other. The instrumental measurements also capture the sensory variations that are relevant for consumer liking, since the instrumental block shares a common component with each of the sensory blocks separately. This means that the instrumental measurements taken here could be enough for predicting the consumer liking (see also Table 4). The same can be said about the two sensory blocks. The appearance/mouthfeel and odour/taste blocks are complementary (and in this case almost orthogonal), since there are no common components between them. These conclusions are also indicated in the regular PLS preference map, but PO-PLS focuses on these aspects, highlights them and makes them explicit.


Fig. 6. Preference map for the jam data case by PO-PLS regression: scores (left) and correlation loadings (right). In the scores plot, labels C1–C4 represent the four production sites and H1–H3 the three harvesting times. In the loadings plot, the Y-loading "PREF" is boxed and upper-case.

Table 5
Jam data set: component-wise and total explained X- and Y-variances for the PLS model and the PO-PLS model.

PLS
                               Comp. 1 (%)   Comp. 2 (%)   Total (%)
Explained X-variance           39            26            65
Explained Y-variance (Cal)     85            8             93
Explained Y-variance (Val)     75            10            85

PO-PLS
                               Common comp. 1 (%)   Common comp. 2 (%)   Total (%)
Explained X-variance:
  Instrumental                 63                   22                   85
  Appearance/mouthfeel         55                   -                    55
  Odour/taste                  -                    30                   30
Explained Y-variance (Cal)     77                   15                   92
Explained Y-variance (Val)     75                   9                    84

As can also be noted here, PO-PLS gives explicit measures of how dominant the components are in each predictor block (see Table 5). In this case it does not seem that the invariance aspect of PO-PLS played any significant role.

8. Discussion

In this paper we have discussed the newly developed PO-PLS method (Måge, Mevik, & Næs, 2008) for the purpose of preference mapping. The method is particularly suitable for identifying common and unique parts of different blocks of data. In the flavoured waters case, the odour data are shown to be totally unimportant for the description of the liking when the taste attributes are already used. In the jam data case, both sensory blocks are redundant when the instrumental variables are present. This may later, for certain purposes, be used for reducing or removing complex and time-consuming sensory profiles.

The presented data sets are both well structured, have clear effects and a limited number of variables, which makes them straightforward to interpret and analyse by any method. Many of the results are also clearly in line with what could be expected. Even so, PO-PLS is able to provide some extra interpretative power by defining explicitly what is common and what is unique variation. For more complex data sets (with more latent dimensions or more variables), it will usually not be possible to distinguish common from unique information by interpreting ordinary PLS maps.

The focus here is on PLS regression, but the same methodology can easily be used for explorative analysis (PO-PCA) when no response variables are present. This is done by replacing the PLS scores in steps 1 and 5 of the algorithm (see Section 2.3) by PCA scores, and skipping step 6. PO-PCR is obtained by using PCA in steps 1 and 5.

The method can handle any number of predictor blocks, but the complexity increases as more blocks are added. If more than two blocks are present, as is the case for the jam example, we have common components on several levels: components can be common

for all blocks, or just for a subset of blocks. The natural way to search for these components is to start at the upper level (all blocks) and continue by examining all combinations of smaller subsets. The model will depend somewhat on the order in which these subsets are examined, and it is advisable to check several different sequences to ensure that the model is stable. It is also a good idea to use background knowledge to decide in which subsets to look for common components.

The process of deciding the numbers of components is the main drawback of PO-PLS. With only two blocks, five decisions need to be made. With three blocks, this increases to ten decisions if all subsets of blocks are examined. When many blocks are present, it is therefore necessary to use application knowledge to reduce the number of relevant subsets. Also, every decision will affect some of the subsequent decisions. The robustness of the results with regard to the chosen numbers of components will be an important topic for further work.

PO-PLS is here used for preference mapping, but the methodology itself is not specific to sensory and consumer applications. It can be used whenever there are several data blocks and one is interested in describing how the blocks relate to each other. Some relevant application areas which we are also working on are process modelling, sensor fusion and systems biology (the so-called omics) analysis.

As can be seen, the calculations for PO-PLS are very simple. The only tools used are PLS regression, GCA and orthogonalisation, which are all quite easy to conduct, also within standard statistical program packages. This ensures interpretability and user-friendliness.

9. Conclusion

PO-PLS is a new method for analysing multi-block data, with focus on identifying common and unique variation between several blocks of data. In contrast to the standard methods, PO-PLS is


invariant to block scaling and allows for different dimensionality in the data blocks. The invariance aspect may be important when data with different units are merged in one data analysis. This paper shows that PO-PLS is well suited for preference mapping applications and has some interpretative advantages over standard PCR/PLS regression.

Acknowledgements

We thank isi Sensory Analysis GmbH & Co. (Göttingen-Rosdorf) for providing the flavoured waters data set. The work was financially supported by the 'Data Integration', 'Food Choice' and 'Consumer Check' projects, all financed by The Agricultural Food Research Foundation of Norway.

References

Carroll, J. D. (1968). Generalisation of canonical analysis to three or more sets of variables. Proceedings of the 76th Convention of the American Psychological Association, Vol. 3, 227–228.
Dahl, T., & Næs, T. (2006). A bridge between Tucker-1 and Carroll's generalised canonical analysis. Computational Statistics and Data Analysis, 50(11), 3086–3098.
Esbensen, K. (2002). Multivariate Data Analysis – in Practice. An Introduction to Multivariate Data Analysis and Experimental Design (5th ed.). Oslo, Norway: Camo Software.
Escofier, B., & Pagès, J. (1994). Multiple factor analysis (AFMULT package). Computational Statistics & Data Analysis, 18(1), 121–140.
Hanafi, M., & Kiers, H. (2006). Analysis of K sets of data, with differential emphasis on agreement between and within sets. Computational Statistics and Data Analysis, 51, 1491–1508.
Hedderley, D., & Wakeling, I. (1995). A comparison of imputation techniques for internal preference mapping, using Monte Carlo simulation. Food Quality and Preference, 6, 281–297.
Helgesen, H., & Næs, T. (1995). Selection of dry fermented lamb sausages for consumer testing. Food Quality and Preference, 6, 109–120.
Helgesen, H., Solheim, R., & Næs, T. (1997). Consumer preference mapping of dry fermented lamb sausages. Food Quality and Preference, 8(2), 97–109.
Hotelling, H. (1936). Relations between two sets of variates. Biometrika, 28, 321–327.
Jørgensen, K., Mevik, B. H., & Næs, T. (2007). Combining designed experiments with several blocks of spectroscopic data. Chemometrics and Intelligent Laboratory Systems, 88(2), 154–166.
Jørgensen, K., Segtnan, V., Thyholt, K., & Næs, T. (2004). A comparison of methods for analysing regression models with both spectral and designed variables. Journal of Chemometrics, 18(10), 451–464.
Kettenring, J. R. (1971). Canonical analysis of several sets of variables. Biometrika, 58, 433–451.
Kohler, A., Hanafi, M., Bertrand, D., Oust Janbu, A., Naderstad, T., Naderstad, K., Qannari, M., & Martens, H. (2009). New concepts for investigating several sets of data in modern bioscience. In P. Lasch & J. Kneipp (Eds.), Modern Concepts in Biomedical Vibrational Spectroscopy (chapter 15). USA: Blackwell Publishing.
Måge, I. (2008a). The Multivariate Approach to Product Development (Flavoured Waters): Multivariate modelling of sensory profiles. http://www.camo.com/resources/application-notes.html (Retrieved 2011-08-03).
Måge, I. (2008b). The Multivariate Approach to Product Development (Flavoured Waters): Multivariate preference mapping. http://www.camo.com/resources/application-notes.html (Retrieved 2011-08-03).
Måge, I. (2008c). The Multivariate Approach to Product Development (Flavoured Waters): Quality assessment of sensory data. http://www.camo.com/resources/application-notes.html (Retrieved 2011-08-03).
Måge, I., Mevik, B. H., & Næs, T. (2008). Regression models with process variables and parallel blocks of raw material measurements. Journal of Chemometrics, 22, 443–456.
Martens, H., & Næs, T. (1989). Multivariate Calibration. Chichester: John Wiley & Sons.
Martens, M., Tenenhaus, M., Vinzi, V. E., & Martens, H. (2007). The use of partial least squares methods in new food production development. In H. MacFie (Ed.), Consumer-Led Food Products Development (pp. 492–523). Cambridge, England: Woodhead Publishing Ltd.
McEwan, J. A. (1996). Preference mapping for product optimization. In T. Næs & E. Risvik (Eds.), Multivariate Analysis of Data in Sensory Science, Vol. 16, Data Handling in Science and Technology (pp. 71–102). Amsterdam: Elsevier Science B.V.
Moskowitz, H. R., & Silcher, M. (2006). The applications of conjoint analysis and their possible uses in Sensometrics. Food Quality and Preference, 17(3–4), 145–165.
Næs, T., Brockhoff, P., & Tomic, O. (2010). Statistics for Sensory and Consumer Science. Chichester, UK: John Wiley and Sons.
Næs, T., Tomic, O., Mevik, B. H., & Martens, H. (2010). Path modelling by sequential PLS regression. Journal of Chemometrics, in press.
Næs, T., Lengard, V., Johansen, S. B., & Hersleth, M. (2010). Alternative methods for combining design variables and consumer preference with information about attitudes and demographics in conjoint analysis. Food Quality and Preference, 21(4), 368–378.
Westerhuis, J. A., Kourti, T., & MacGregor, J. F. (1998). Analysis of multiblock and hierarchical PCA and PLS models. Journal of Chemometrics, 12(5), 301–321.
Wold, J. P., Veberg, A., & Nilsen, A. A. (2006). Influence of storage time and color of light upon photooxidation in cheese. A study based on sensory analysis and fluorescence spectroscopy. International Dairy Journal, 16, 1218–1226.