Food Quality and Preference 21 (2010) 1117–1125
Linking sensory characteristics to emotions: An example using dark chocolate
David M.H. Thomson a,b,*, Christopher Crocker a, Christopher G. Marketo b
a MMR Research Worldwide Ltd., 46 High Street, Wallingford, Oxfordshire, OX10 0DB, UK
b MMR Research Worldwide Inc., 303 South Broadway, Suite 233, Tarrytown, NY 10591, USA
Article info
Article history: Received 18 February 2010; received in revised form 1 April 2010; accepted 23 April 2010
Keywords: Emotional profiling; Conceptual profiling; Sensory characteristics; Sensory emotional linkages; Product optimisation
Abstract
The conceptual profile of an unbranded product arises via three sources of influence: (i) category effect – how consumers conceptualise the product category; (ii) sensory effect – how the sensory characteristics of a particular product differentiate it from other products in the category; (iii) liking effect – the disposition of consumers to the category and how much they like a particular product. Assuming that category effects (conceptualisation and disposition) are constant across the set of products, it is anticipated that the conceptual differences apparent across the set of unbranded products would be driven, at least in part, by sensory differences. This study describes the application of best–worst scaling to conceptual profiling of unbranded dark chocolates and outlines novel data modelling procedures used to explore sensory/conceptual relationships. © 2010 Elsevier Ltd. All rights reserved.
1. Introduction
1.1. Conceptualisation
We become aware of all objects via our peripheral senses. The incoming sensory information is processed in the mind and consequently the nature of the object becomes apparent to us. The identity that we assign to this object (e.g. ‘it’s chocolate’) is based largely on learning. With increasing familiarity, we make associations between the identity of a particular object and other conceptual associations held in the mind. For example, we may think that chocolate is ‘comforting’, ‘fattening’, ‘will help me to relax’, ‘is a treat’ and so forth. Some of these conceptual associations are learned from external sources (including marketing, advertising and hearsay) and some are based on internal experiences. The notions of being comforting, fattening, relaxing and a treat are all conceptualisations; i.e. constructions created in the mind that allow us to interpret, understand and otherwise assign meaning to what we experience. Inevitably, the identity of the object (‘it’s chocolate’) and the associated conceptualisations (‘it’s comforting’, ‘it’s fattening’, ‘it’s relaxing’, ‘it’s a treat’) coalesce and become as-one in the mind of the individual. This means that when we experience a product, we don’t just react to the product itself but also to the associated
* Corresponding author at: MMR Research Worldwide Ltd., 46 High Street, Wallingford, Oxfordshire, OX10 0DB, UK. Tel.: +44 1491824999; fax: +44 1491824666. E-mail address: [email protected] (D.M.H. Thomson).
0950-3293/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.foodqual.2010.04.011
conceptualisations. It’s via this route that sensory characteristics, which are intrinsic to the product and therefore part of its identity, become linked with conceptualisations. This is represented in Fig. 1.
Conceptualisations, although infinitely diverse, can be reduced to three broad categories: functional (e.g. ‘will refresh me’, ‘will wash my clothes cleaner’, ‘will kill germs’, etc.), emotional (e.g. ‘will make me happy’, ‘will calm me’, ‘will annoy me’, etc.) and abstract (e.g. ‘is sophisticated’, ‘is trustworthy’, ‘is feminine’, etc.). Some abstract conceptualisations may impact on our emotions. For example, choosing a product that consumers conceptualise as sophisticated could promote feelings of ‘being classy’, ‘being superior’, ‘being successful’, etc. In other words, sophisticated (abstract conceptualisation) has emotional connotations that may, in turn, lead to emotional consequences. Likewise, if a product is conceptualised as trustworthy (for example) this may be based, at least in part, on that product’s reputation for being ‘full of goodness’, implying perhaps that the product might be ‘wholesome’ or otherwise ‘good for you’ (functional conceptualisations). As a consequence, trustworthiness has functional connotations although it has emotional connotations too. This suggests that abstract conceptualisations are analogous to stepping stones that lead eventually to emotional and/or functional conceptualisations and that all conceptualisations may eventually fall into one or other of two categories (Thomson, 2010):
- Conceptualisations that have immediate or eventual emotional connotations (emotionality).
- Conceptualisations that have immediate or eventual functional connotations (functionality).
D.M.H. Thomson et al. / Food Quality and Preference 21 (2010) 1117–1125
Fig. 1. Perception and conceptualisation.
1.2. Emotional consequences vs. emotional conceptualisations
It is generally recognised that the product itself, and not just the branding, the packaging or the manner in which it is presented, can have emotional consequences. Aligning the emotional messages communicated by the product and the pack with branding so that they are consonant, augments and strengthens the brand greatly (Lindstrom, 2005). Measuring the emotional consequences engendered by unbranded products is often futile because they may be subtle, may occur some time later and may not be immediately apparent to the person concerned. As a consequence, most ‘emotional measurement’ tools don’t access emotional consequences but emotional conceptualisations (or emotional associations). This means that when someone tells us that a product makes them feel ‘happy’, ‘passionate’, etc., it’s more likely that they are reflecting what the product is communicating to them (emotional conceptualisations) rather than doing to them (emotional consequences). This distinction is important, especially when developing measurement processes.
1.3. Measuring conceptual associations
Three practical problems are often encountered when attempting to capture and measure conceptual associations:
(i) Some conceptualisations are readily accessible, others less so, whilst some may be completely hidden to us (Greenwald, McGhee, & Schwartz, 1998). Research participants can usually allude quickly to conceptualisations that are readily accessible to them but, when stumped to explain why they chose something, the rational part of the mind automatically takes over and they unwittingly look for logical associations (Ariely, 2008). Whilst these associations may seem plausible and intuitive, sometimes they will have little or no bearing on reality.
(ii) People form an impression very quickly and easily about whether or not they like something and to what extent, without necessarily needing to stop and think about what the object is or what it means to them (Zajonc, 1980). In this context, liking is defined (by the authors) as the immediate enjoyment experienced when consuming and otherwise interacting with the product or object in question. Unfortunately, liking may have a pernicious effect on the ability of researchers to access the deeper and less accessible yet highly influential conceptualisations triggered by an object. This is because the easiest and sometimes the only option open to research participants is to associate positive conceptualisations or images with things that they like and, conversely, negative conceptualisations or images with things that they dislike. This ‘easy way out’ prevents researchers from accessing the true but often hidden conceptualisations associated with the object and it is one of the reasons why ratings of emotion terms and liking are often correlated.
(iii) Some of the most influential conceptualisations may seem counterintuitive. For example, it isn’t obvious that the taste of dark chocolate would engender ‘trustworthiness’, and it would seem counterintuitive to ask consumers about this directly (i.e. ‘How trustworthy does this chocolate taste?’) yet ‘trustworthiness’ is one of the key conceptualisations engendered by the taste of dark chocolate (see below).
The challenge for researchers is to develop methods that probe beyond what is obvious, apparently intuitive and otherwise associated with immediate liking, to access the deeper conceptualisations that genuinely influence choice and to do so without creating distortions or aberrations.
1.4. Accessing conceptualisations using words and best–worst scaling
Words carry both literal and metaphorical (figurative) meaning. For example, the literal meaning of the word trustworthy is ‘worthy of trust’ or ‘something that can be relied upon’ but the word also carries metaphorical meaning that extends well beyond this. It is this mixture of literal and metaphorical meaning that brings such richness to language. Combinations of words bring both subtlety of meaning and precision.
As a consequence, the spoken, sung and written word has evolved into the most widely used medium in everyday life for communicating feelings and experiences. Paradoxically, the use of words in emotion research is often criticised because it is assumed, quite wrongly, that in so doing each word should be associated with some form of measurement scale.
However, there are indeed two issues when measurement scales are associated with words:
1. The mere existence of the scale inevitably encourages the individual to think about the meaning of the word thereby engaging rational, cognitive thought processes. The influence of the object on choice or purchase behaviour, on the other hand, may be due to non-cognitive or apparently irrational influences.
2. Rational, cognitive processing causes the individual to focus on literal meaning whereas it’s the metaphorical meaning that often conveys conceptual richness and depth.
Best–worst scaling (otherwise known as maximum difference scaling; Finn & Louviere, 1992) may offer a way forward with words because it avoids the use of external measurement scales. To apply best–worst scaling to conceptual profiling, the research participant is presented with the object under investigation along with a set of either four or five words, referred to hereafter as quads or quins. The object might be anything from branding, unbranded packaging, unbranded product, merchandising material, an advertisement, a person, a place or venue, a retailer, a picture, a cartoon character or even another word. All the research participant needs to do is decide which of the four or five words s/he feels, for whatever reason, is most closely associated with what s/he is experiencing in response to the object and which word is least closely associated with the experience. Each person would typically be shown 8–20 quads or quins in a statistical design balanced for order and context. The choice of design depends on the extent of the conceptual vocabulary, the complexity of the object under research, and the sample size. A typical experiment would employ 3–10 versions of a partially balanced incomplete block design. We generally use quins when the lexicon comprises more than ca. 20 words and quads with fewer words.
The composition of the conceptual lexicon is object specific and the choice of words and the manner in which they are expressed requires great care. Conceptual words are normally drawn from a core vocabulary, supplemented with object specific terms. These are subject to qualitative review using focus group procedures and ideally pilot-scale evaluation. A conceptual lexicon normally comprises 16–30 terms and should include a mix of words with positive and negative connotations. As mentioned earlier, conceptualisations may have emotional or functional connotations and some may have both. It’s essential to deal with emotionality and functionality separately within a single research exercise because functional conceptualisations are often more readily apparent and otherwise more accessible to people than emotional conceptualisations and therefore the former tend to dominate. This example focuses on the emotional conceptualisations associated with dark chocolate.
Each object should be evaluated as described above using a sample of target consumers that is sufficiently large to capture and reflect differences in the conceptual meaning of the object. We have found that some objects, such as well established brands, show a remarkable degree of consistency of conceptual interpretation with as few as 20–30 target consumers, with no tangible benefit in terms of qualitative conceptual information and precision achieved by increasing the sample size by a factor of ten. Indeed, such is the sensitivity of best–worst scaling when implemented properly, that clear discrimination between the conceptual profiles of two or more objects is often feasible with a relatively small sample of consumers. The consequent choice data are analysed to yield numerical values for the conceptualisations having excellent properties on an interval scale.
Best–worst scaling can be envisaged as an extension of the method of paired comparisons (David, 1988), with a considerable gain in efficiency. A pair of best–worst choices from a set of five words will recover seven of the 10 derived pairs of words. Moreover, best–worst tasks take advantage of individuals’ abilities in identifying and responding consistently to extreme options. Best–worst scaling has been shown to outperform rating scales in both discriminability and predictive validity (Cohen, 2003) although this was not the driving force for choosing best–worst scaling for conceptual profiling. A more extensive discussion of the method, focusing on hedonic measurement, is given by Jaeger, Jørgensen, Aaslyng, and Bredie (2008).
2. Methodology
2.1. Product
Nine sensorially differentiated dark chocolates were selected from the UK market (Cadbury’s Bournville Deeply Dark, Cote d’Or, Divine, Green & Blacks, Lindt, Montezuma, Seeds of Change, Tesco Finest and Waitrose). All of the chocolates were debranded.
2.2. Conceptual lexicon
The lexicon was developed by a small group of reasonably articulate consumers who tasted and discussed the products under the guidance of a suitably qualified moderator. The group referred to a master list of ca. 100 emotional conceptualisations (a mix of emotion terms and abstract conceptualisations with emotional connotations) grouped into 28 emotional territories, but they were permitted to add terms of their own. An initial lexicon of 40 words was reduced to the following 24 by pruning over-represented emotional territories, whilst taking into account the ability of the words to discriminate amongst the products:
Adventurous, Aggressive, Arrogant, Comforting, Confident, Easygoing, Energetic, Feminine, Fun, Luxurious, Masculine, Ordinary, Powerful, Pretentious, Sensual, Serious, Sociable, Sophisticated, Tacky, Traditional, Trustworthy, Uncomplicated, Warm, Youthful.
2.3. Consumers
The consumers were dark chocolate users, familiarised in advance with the chocolates, the lexicon and the process.
2.4. Best–worst scaling protocol
The research was conducted in a central location in the UK using an online interface. Each consumer evaluated all nine unbranded chocolates in three batches of three on separate days using a balanced rotated design. Sixteen sets of five conceptual words (quins) were presented to each consumer for each chocolate in a partially balanced incomplete block (PBIB) design with three versions of the questionnaire. For each quin, consumers identified which of the five words most readily and least readily spring to mind as a consequence of eating that particular chocolate.
2.5. Data analysis
All statistical analysis was performed in SAS.
Table 1. Percentage share of conceptual profile for nine unbranded dark chocolates.
Columns, left to right: Green and Blacks (G&B), Waitrose (Wait), Tesco Finest (Tesco), Lindt, Bournville (Bourn), Divine, Cote d’Or (Cote), Seeds of Change (Seeds), Montezuma (Mont).

Term            G&B   Wait  Tesco  Lindt  Bourn  Divine  Cote  Seeds  Mont
Adventurous     7.5   2.5   2.3    7.2    1.0    3.5     2.5   5.4    8.8
Confident       6.9   9.2   5.8    11.1   2.6    7.6     5.6   8.8    13.2
Energetic       4.9   3.7   5.0    15.1   1.1    5.4     2.7   7.4    7.0
Sociable        8.8   5.8   12.4   3.8    19.9   9.2     5.1   5.6    3.4
Comforting      3.9   6.3   5.5    1.0    9.0    3.7     3.0   3.0    1.4
Warm            5.4   6.8   8.0    1.9    6.9    3.4     4.8   3.8    2.0
Trustworthy     4.0   7.4   5.9    2.3    8.0    8.9     3.7   3.4    2.0
Masculine       4.9   3.4   1.6    4.8    0.6    3.9     3.3   4.7    7.0
Tacky           1.8   1.7   1.2    0.4    0.4    3.0     3.1   1.4    0.3
Youthful        2.1   3.5   3.7    3.3    4.3    4.4     7.3   2.3    1.4
Easygoing       2.9   7.1   5.8    2.6    13.1   4.2     3.4   4.5    2.2
Traditional     9.7   4.5   3.3    2.5    3.9    4.0     6.5   6.7    3.0
Aggressive      1.9   1.1   1.0    3.7    0.1    1.3     1.9   3.4    3.3
Powerful        3.2   3.2   1.5    10.6   0.2    3.0     2.1   4.5    11.8
Sensual         1.5   3.2   3.4    3.5    2.9    3.7     2.2   2.5    3.8
Fun             3.5   4.5   7.7    2.6    5.1    3.8     2.5   3.1    1.9
Feminine        2.4   3.3   4.5    2.6    6.1    3.8     5.3   4.3    2.0
Uncomplicated   4.6   6.5   5.7    2.3    7.5    7.6     7.3   4.0    1.4
Ordinary        3.6   3.4   4.2    0.7    4.1    4.3     6.9   2.0    1.0
Serious         7.1   4.1   1.6    4.2    1.3    4.8     6.1   3.2    6.4
Arrogant        2.7   1.5   1.0    4.1    0.1    1.2     4.1   3.5    4.5
Pretentious     2.4   2.1   1.8    3.6    0.2    1.4     5.3   4.1    2.1
Luxurious       2.1   2.6   3.3    3.3    0.9    2.5     2.0   2.6    5.1
Sophisticated   2.3   2.8   3.8    2.8    0.6    1.5     3.3   5.8    4.9
The most rigorous approach to analysing best–worst data is to model the probability that an individual will choose a particular best–worst pair over all other possible best–worst pairs. To obtain a tractable model, the errors associated with the choices are usually assumed to be independently and identically distributed as Type I Extreme Value (Gumbel) random variables. These assumptions lead to the multinomial logit (MNL) model (McFadden, 1974) which is by far the most commonly used model in discrete choice experiments. However, this model is rather cumbersome when there are many words as each best–worst pair requires a dummy variable in the design matrix. A simpler but less rigorous approach is to model the best and worst choices separately using the marginal MNL model (Flynn, Louviere, Peters, & Coast, 2008). The marginal MNL model has been found to give a satisfactory approximation to the maximum difference model described above (Flynn et al., 2008; Sawtooth Software Inc., 2005) and is commonly used in practice. The parameters of MNL models can be estimated by conditional logistic regression. Note that they are not logistic regression models and cannot be estimated using the logistic regression routines typically offered by statistical software packages. We fit the models using SAS proc phreg (So & Kuhfeld, 1995). Other software for discrete choice modelling can be used provided the programming interface permits the coding of ‘‘worst” choices. The output is a set of ‘scale values’ (utilities) on an interval scale that characterises the conceptual profile of the product.
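As a rough illustration of the marginal MNL idea, the sketch below evaluates the log-likelihood of a set of best and worst choices for a given set of scale values. It is not the authors' SAS code: the function name, the data layout and the convention that a worst choice is modelled as an MNL choice on negated utilities are assumptions made here for illustration, following one common formulation of the marginal model.

```python
import math


def marginal_mnl_loglik(utilities, tasks):
    """Log-likelihood of best and worst choices under a marginal MNL model.

    utilities: dict mapping each conceptual term to its scale value U.
    tasks: iterable of (shown, best, worst), where `shown` is the quad or
           quin presented and `best`/`worst` are the chosen terms.
    Assumption: best ~ MNL on U, worst ~ MNL on -U, treated as independent.
    """
    total = 0.0
    for shown, best, worst in tasks:
        # log P(best) = U_best - log(sum of exp(U) over the words shown)
        log_den_best = math.log(sum(math.exp(utilities[t]) for t in shown))
        total += utilities[best] - log_den_best
        # log P(worst) = -U_worst - log(sum of exp(-U) over the words shown)
        log_den_worst = math.log(sum(math.exp(-utilities[t]) for t in shown))
        total += -utilities[worst] - log_den_worst
    return total


# Hypothetical quin and response, for illustration only.
quin = ("Sociable", "Easygoing", "Arrogant", "Powerful", "Warm")
flat = {term: 0.0 for term in quin}
print(marginal_mnl_loglik(flat, [(quin, "Sociable", "Arrogant")]))
```

Maximising this function over the utilities (subject to an identifying constraint such as a zero mean) would give scale values of the kind described above; in practice a conditional logistic regression routine does this work.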
If the experimental design is balanced, approximate scale values can be obtained more easily by differencing the total best from the total worst counts for each item (Auger, Devinney, & Louviere, 2004). The accuracy of this method depends on each item appearing with every other item an equal number of times in the experiment. This is possible in designs with few items, but when there are 20–28 terms in a conceptual lexicon it is impossible to obtain complete balance without using a large number of different PBIB designs within the sample, even if there are no missing data in the responses. We therefore prefer to analyse BWS data using the marginal MNL model. In any case, computational speed issues associated with fitting the model have long since disappeared. We choose to express the scale values relative to a mean of zero. However, because the true anchor of the interval scale is unknown, we cannot measure in absolute terms the level of association of a conceptualisation with an object. It is useful to rescale the utilities according to the underlying choice model to give a proportional scaling (Cohen & Neira, 2003; Sawtooth Software Inc., 2005). The rescaling is carried out using the formula
$$P_i = 100 \, \frac{\exp(U_i)}{\sum_{j} \exp(U_j)},$$
where U_i is the scale value of the ith conceptual term and the summation is over all the terms in the lexicon. The values of P_i can be interpreted as ‘share of conceptual profile’. This transformation of the scale values can aid interpretation and overcome some of the
Fig. 2. Conceptual profile of Cadbury’s Bournville Deeply Dark.
Fig. 3. Share of conceptual profile for nine unbranded dark chocolates.
Fig. 4. Conceptual biplot (D1 vs. D2). Axes: Dimension 1 (44.9%) and Dimension 2 (31.4%).
difficulties associated with the presentation of interval scaled measurements to laypeople.
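The two simpler calculations described in this section, best-minus-worst counts and the rescaling to ‘share of conceptual profile’, can be sketched as follows. The function names and the response layout are hypothetical, and the utilities in the usage lines are made-up values rather than results from this study.

```python
import math
from collections import Counter


def best_minus_worst(responses, lexicon):
    """Approximate scale values as (times chosen best) - (times chosen worst).

    responses: iterable of (shown, best, worst) tuples.
    As noted in the text, this is only reliable for balanced designs.
    """
    best = Counter(b for _, b, _ in responses)
    worst = Counter(w for _, _, w in responses)
    return {term: best[term] - worst[term] for term in lexicon}


def share_of_profile(utilities):
    """Rescale utilities U_i to P_i = 100 * exp(U_i) / sum_j exp(U_j)."""
    # Subtracting the maximum leaves the shares unchanged and avoids overflow.
    u_max = max(utilities.values())
    expd = {term: math.exp(u - u_max) for term, u in utilities.items()}
    total = sum(expd.values())
    return {term: 100.0 * e / total for term, e in expd.items()}


# Made-up utilities expressed relative to a mean of zero.
example = {"Sociable": 1.2, "Easygoing": 0.8, "Warm": 0.1,
           "Powerful": -0.9, "Arrogant": -1.2}
for term, share in share_of_profile(example).items():
    print(f"{term}: {share:.1f}%")
```

Because the shares depend only on differences between utilities, the arbitrary choice of a zero mean as the origin does not affect them.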
3. Results
3.1. Conceptual profiles
As an example, Fig. 2 shows the basic conceptual profile of Cadbury’s Bournville Deeply Dark. The conceptual terms positioned towards the right have the highest scale values (utilities) indicating that these conceptualisations (sociable and easygoing in particular but also trustworthy, comforting, uncomplicated, warm and feminine) are the most prevalent. Conversely, arrogant, aggressive, pretentious and powerful have the lowest scale values, indicating that these conceptualisations are associated least with this particular unbranded chocolate.
3.2. Share of conceptual profile
In Table 1 and Fig. 3, the utilities for each conceptual term for each chocolate are rescaled as described above and re-expressed as ‘share of conceptual profile’. Differences between conceptual profiles were assessed by means of a likelihood ratio test in which the null hypothesis of no difference in parameters (scale values) between products is tested against an alternative hypothesis that the parameters differ. This is done by fitting a null model with a single set of parameters and an alternative model that includes interactions with a categorical variable representing the product. The null model is nested within the alternative model. The test statistic is computed as
$$-2\{\ln(\text{likelihood of null model}) - \ln(\text{likelihood of alternative model})\}$$
and is asymptotically distributed as chi-squared on the difference in the number of free parameters between the two models (see e.g. http://en.wikipedia.org/wiki/Likelihood-ratio_test). In the absence of any pre-planned comparisons, we usually start with a ‘global’ test for any differences within the product set. If the null hypothesis is rejected, we then perform a paired comparison on each pair of products. The p-values from the chi-squared tests are then adjusted using Sidak’s approximation to control the overall Type I error across all paired comparisons to a maximum of 5%.
There are highly significant differences across these conceptual profiles. This indicates that the unbranded chocolates are conceptually different and that consumers are capable of realising and communicating these differences. The greatest differences in conceptual profiles exist between Cadbury’s Bournville Deeply Dark and Montezuma. Bournville is more sociable (19.9% > 3.4%), easygoing (13.1% > 2.2%), trustworthy (8.0% > 2.0%) and comforting (9.0% > 1.4%). Montezuma is more confident (13.2% > 2.6%), energetic (7.0% > 1.1%), powerful (11.8% > 0.2%) and masculine (7.0% > 0.6%).
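The likelihood ratio test and the Sidak adjustment described in this section can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' SAS procedure: the function names are hypothetical, the log-likelihoods in the usage lines are invented, and the chi-squared tail probability uses the standard closed-form series for integer degrees of freedom.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-squared variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + 2*phi(sqrt(x)) * sum_r x^(r-1/2) / (1*3*...*(2r-1))
    root = math.sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = math.exp(-half) / math.sqrt(2.0 * math.pi)
    return math.erfc(root / math.sqrt(2.0)) + 2.0 * density * total


def likelihood_ratio_test(loglik_null, loglik_alt, df):
    """Return (statistic, p-value) for nested models; df = extra free parameters."""
    stat = -2.0 * (loglik_null - loglik_alt)
    return stat, chi2_sf(stat, df)


def sidak_adjust(p_values):
    """Sidak-adjusted p-values for m simultaneous comparisons."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Invented log-likelihoods for one pair of products with 23 free parameters each.
stat, p = likelihood_ratio_test(-5210.4, -5170.9, df=23)
print(stat, p, sidak_adjust([p] * 36)[0])  # 36 = number of pairs among 9 products
```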
Fig. 5. Sensory biplot. Axes: Dimension 1 (47.4%) and Dimension 2 (26.1%).
3.3. Conceptual biplots
A biplot (Gabriel, 1971) of conceptual terms and products is a useful way of visualising the differences amongst products. The scaling of the data dictates that products are treated as variables and the conceptual terms as observations. Products are standardised to unit variance to allow them to contribute equally to the total explained variance. As we are primarily interested in the correlation between products, we choose the column metric preserving version of the biplot (where columns are products and rows are conceptualisations), also known as a GH biplot (Gabriel & Odoroff, 1986). This scaling equalises the variances of the dimensions. For ease of interpretation we divide the product loadings by $\sqrt{N-1}$, where N is the number of conceptual terms, so that the product coordinates are equal to their correlation loadings in a principal components analysis. This scaling provides a measure of how well the biplot approximates the conceptual profile of each product, which may be enhanced by drawing the usual correlation circles of radii 1 and $1/\sqrt{2}$. The conceptual terms are rescaled by a suitable factor to fit within the outer circle. Apart from the multiplier for scaling the observations, a biplot of this type resembles an MDPREF graph (Carroll, 1972) and can be interpreted similarly.
When the conceptual profiles have a degree of similarity, Dimension 1 can be interpreted as an ‘average’ or, if the sample set is regarded as representative, a ‘category’ conceptual profile (a manifestation of the ‘category effect’ mentioned earlier). The greater the proportion of variation captured by Dimension 1, the more similar are the conceptual profiles of the products. Fig. 4 (D1 vs. D2) shows the fairly extreme conceptual difference between Cadbury’s Bournville Deeply Dark versus Montezuma and Lindt. Dimension 3 (not shown in Fig. 4) distinguishes Cote d’Or. The differences between the products are best visualised in a plot of Dimension 2 vs. Dimension 3, both of which are discriminatory in this study. This plot is shown in Fig. 6 with an overlay of the sensory data (see below).
3.4. Sensory profiling
The chocolates were profiled in triplicate using a trained sensory panel based at the Sensory Science Centre, University of Reading, UK. 14 panellists generated a vocabulary of 42 sensory attributes covering appearance, odour, flavour, mouthfeel and aftertaste. Samples were evaluated in temperature and odour controlled booths and presented monadically using a randomised design balanced for order effects. Scoring was on unstructured continuous line scales. Data were captured using Compusense® Five software. Most of the important variation is captured by the first two dimensions of a biplot of products and attributes (Fig. 5).
3.5. Linking sensory and conceptual/emotional profiles
Linking conceptual data generated using best–worst scaling with sensory data poses a challenge because of differences in the way the two data sets are scaled. As mentioned above, the conceptual terms are measured on an interval scale lacking a defined anchor. Our choice of the mean as the origin of the scale is arbitrary and therefore we cannot assume that zero on one product’s scale is comparable to zero on another product’s scale. The converse applies to sensory data. For any given attribute all products will be scored on the same scale, but the scales of different attributes from QDA profiling are not comparable; effectively they are interval scales. Hence conceptual terms are comparable within but not between products, whereas sensory attributes are comparable between but not within products.
We propose a graphical method in which sensory attributes are overlaid onto the conceptual biplot. Sensory attributes are standardised to zero mean and unit variance and scored on the dimensions of the conceptual biplot. To comply with our scaling of the biplot, the scores are multiplied by $f/(k\sqrt{N-1})$, where f is the scaling factor used for plotting the conceptual terms, k is the eigenvalue of the dimension and N is the number of conceptual terms. This factor ensures that the relationship between the map dimensions is the same for sensory attributes as it is for conceptual terms, but not that the attributes will be well positioned in relation to the axis scales. To achieve a scaling more compatible with the conceptual terms, we apply a further multiplicative factor of $\sqrt{P/V}$, where P is the number of products and V is the sum of the variances of the products across the standardised sensory attributes. This scaling equalises the total variance of the sensory profiles (i.e. the sum of the variances of the dimensions) with the variances of the conceptual profiles (i.e. the sum of the variances of the dimensions) since the eigenvalues sum to P. Usually we find it achieves a satisfactory spread of sensory attributes on the map, as in Fig. 6. The scoring and rescaling operations can be expressed algebraically as
$$C = \frac{f}{k} \sqrt{\frac{P}{(N-1)V}} \; A'B$$
where C is a matrix containing the co-ordinates of the sensory attributes on the map dimensions, A is a matrix of standardised sensory attributes (products × attributes), B is a matrix containing the co-ordinates of the product vectors (products × dimensions), V is the sum of the variances of the rows (products) of A, and f, k and P are scalars defined above.
As the purpose of the overlay is to examine the relationship between conceptual terms and sensory attributes, it is not necessary to use the GH scaling. The row metric preserving version of the biplot, also known as a JK biplot, represents the relationship between the conceptual terms most closely and may be preferred when there is a disparity between the explained variance of the two dimensions. In the JK biplot, the coordinates of the conceptual terms are their principal component scores, and the product loadings are scaled to suit the appearance of the graph. The coordinates of the overlaid sensory attributes are calculated as above except that the scaling factor $f/(k\sqrt{N-1})$ is replaced by 1/g, where g is the scaling factor used to plot the product vectors. To facilitate comparison with Fig. 4 we have retained the GH biplot in this example.
The variation in the sensory profiles explained by a dimension is calculated as the variance of the appropriate row of the matrix A defined above. These values are then expressed as a proportion of the total variance, V. Dimensions of interest are those that explain high proportions of conceptual and sensory variation. Because Dimension 1 represents the similarities rather than the differences amongst the products (i.e. the ‘category effect’ which should be more or less constant across all products), it explains relatively little sensory profile variation (5%). Dimensions 2 and 3 together explain 45% of the variation in the sensory profiles.
Fig. 6. Overlay of sensory data on conceptual biplot. Axes: Dimension 2 (31.4%) and Dimension 3 (10.3%).
Fig. 7 highlights the sensory/conceptual associations. These show that cocoa (sensory characteristic) is associated with powerful and energetic (conceptualisations); bitter with confident, adventurous and masculine; smoky/burnt with arrogant, serious,
[Fig. 7. Overlay of sensory data on conceptual biplot highlighting associations. Horizontal axis: Dimension 2 (31.4%); vertical axis: Dimension 3 (10.3%).]
traditional and pretentious; vanilla, brown and initial bite with sensual, fun and luxurious; and creamy and sweet with fun, comforting and easygoing. Formal modelling processes are also being investigated but as yet, none has proved superior to the graphical method described herein.

4. Conclusions

Accessing the deep conceptualisations experienced by people in reaction to product, rather than branding or packaging, is very challenging because these conceptualisations are not always immediately apparent to the individual concerned. By virtue of the fact that the conceptual profiles of these nine unbranded dark chocolates are highly differentiated, this research has demonstrated the utility of best–worst scaling (maximum difference scaling), used in conjunction with an appropriate lexicon of conceptual descriptors, as a means of accessing such conceptualisations.

The nature of the data generated by best–worst scaling is different from the other forms of descriptive scaling more usually associated with consumer research and sensory profiling. This poses challenges when attempting to use sensory profile data to explain the differences in conceptual profiles amongst unbranded products. By representing the conceptual profiles of the products using a biplot much akin to MDPREF and overlaying the sensory data on the key dimensions of this conceptual biplot, it is possible to make associations between the conceptualisations (most of which have emotional connotations) and the sensory characteristics that may drive them.

One of the few product optimisation options still open to brand owners, in their constant struggle to justify their price premium over retailer and discount brands, is to align the emotional and functional conceptualisations of the product and the packaging with that of the branding. This creates consonance, and by definition removes contradiction, thereby reinforcing the brand message.
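As a rough illustration of the overlay calculation set out in Section 3, the scoring and rescaling of the sensory attributes, C = (f/k)·sqrt(P/((N−1)V))·A′B, might be sketched in Python with NumPy as follows. The function name `overlay_sensory`, its argument names, the use of sample variances (`ddof=1`) and the demonstration data are all assumptions made for illustration and are not taken from the study; the eigenvalue k is applied separately to each map dimension.

```python
import numpy as np


def overlay_sensory(sensory, product_coords, eigenvalues, f, n_terms):
    """Hypothetical helper: co-ordinates of sensory attributes on the biplot.

    sensory        : products x attributes array of raw sensory means
    product_coords : products x dimensions array (matrix B in the text)
    eigenvalues    : one eigenvalue (k) per map dimension
    f              : scaling factor used for plotting the conceptual terms
    n_terms        : number of conceptual terms (N)
    """
    sensory = np.asarray(sensory, dtype=float)
    product_coords = np.asarray(product_coords, dtype=float)
    eigenvalues = np.asarray(eigenvalues, dtype=float)
    n_products = sensory.shape[0]  # P

    # Standardise each attribute to zero mean and unit variance across
    # products (matrix A). Sample variance is an assumption; an attribute
    # that is constant across products would divide by zero here.
    a = (sensory - sensory.mean(axis=0)) / sensory.std(axis=0, ddof=1)

    # V: sum of the variances of the rows (products) of A.
    v = a.var(axis=1, ddof=1).sum()

    # One scale factor per dimension: (f / k) * sqrt(P / ((N - 1) V)).
    scale = (f / eigenvalues) * np.sqrt(n_products / ((n_terms - 1) * v))

    # A'B is attributes x dimensions; the scale broadcasts over its columns.
    return (a.T @ product_coords) * scale


# Made-up demonstration data: 9 products, 5 attributes, 2 map dimensions.
rng = np.random.default_rng(0)
demo_sensory = rng.normal(size=(9, 5))
demo_coords = rng.normal(size=(9, 2))
demo_c = overlay_sensory(demo_sensory, demo_coords, [2.8, 0.9], f=1.0, n_terms=20)
print(demo_c.shape)
```

Each row of the returned array gives the position of one sensory attribute on the chosen map dimensions, so it can be plotted on the same axes as the conceptual terms. For the JK variant described in Section 3, the per-dimension factor f/(k·sqrt(N−1)) would instead be replaced by 1/g.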
The research tools described herein provide a mechanism for understanding how the sensory profile of a product might be optimised in order to align it better with branding.

Acknowledgements

The authors thank Phiala Mehring for conducting sensory profiling and Compusense Inc., Guelph, Canada for its generous support of the Sensory Science Centre at the University of Reading, UK.
References

Ariely, D. (2008). Predictably irrational: The hidden forces that shape our decisions. London: Harper-Collins.
Auger, P., Devinney, T. M., & Louviere, J. J. (2004). Consumer social beliefs: An international investigation using best worst scaling methodology. Melbourne: University of Melbourne, Melbourne Business School.
Carroll, J. D. (1972). Individual differences and multidimensional scaling. In R. N. Shepard & S. B. Nerlove (Eds.), Multidimensional scaling: Theory and applications in the behavioural sciences. New York: Seminar Press.
Cohen, S. (2003). Maximum difference scaling: Improved measures of importance and preference for segmentation. In Sawtooth software conference proceedings, 2003. Sequim, WA.
Cohen, S., & Neira, L. (2003). Measuring preference for product benefits across countries. In ESOMAR Latin American conference, Uruguay. Reprinted in Excellence in International Research, 2004 (pp. 1–22). Amsterdam: ESOMAR.
David, H. A. (1988). The method of paired comparisons (2nd ed.). London and New York: Oxford University Press.
Finn, A., & Louviere, J. J. (1992). Determining the appropriate response to evidence of public concern: The case for food safety. Journal of Public Policy and Marketing, 11(1), 12–25.
Flynn, T. N., Louviere, J. J., Peters, T. J., & Coast, J. (2008). Estimating preferences for a dermatology consultation using best–worst scaling: Comparison of various methods of analysis. BMC Medical Research Methodology, 8, 76.
Gabriel, K. R. (1971). The biplot graphic display of matrices with application to principal component analysis. Biometrika, 58, 453–467.
Gabriel, J. J., & Odoroff, C. L. (1986). Illustrations of model diagnosis by means of three-dimensional biplots. In E. J. Wegman & D. J. DePriest (Eds.), Statistical image processing and graphics (pp. 257–274). New York: Marcel Dekker.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. K. L. (1998). Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology, 74, 1464–1480.
Jaeger, S. R., Jørgensen, A. S., Aaslyng, M. D., & Bredie, W. L. P. (2008). Best–worst scaling: An introduction and initial comparison with monadic rating for preference elicitation with food products. Food Quality and Preference, 19, 579–588.
Lindstrom, M. (2005). Brand sense. New York: Simon and Schuster.
McFadden, D. L. (1974). Conditional logit analysis of qualitative choice behavior. In P. Zarembka (Ed.), Frontiers in econometrics (pp. 105–142). New York: Academic Press.
Sawtooth Software Inc. (2005). The MaxDiff/Web system technical paper.
So, Y., & Kuhfeld, W. F. (1995). Multinomial logit models. In Proceedings of the SUGI 20 conference. Florida, USA.
Thomson, D. M. H. (2010). Reaching out beyond liking to make new products that people want. In H. J. H. MacFie & S. R. Jaeger (Eds.), Consumer driven innovation in food and personal care products. Cambridge: Woodhead Publishing.
Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 35(2), 151–175.