Food Quality and Preference 12 (2001) 359–363 www.elsevier.com/locate/foodqual
Segmentation of a panel of consumers using clustering of variables around latent directions of preference

E. Vigneau a,*, E.M. Qannari a, P.H. Punter b, S. Knoops b

a ENITIAA/INRA, Unité de Statistique Appliquée, La Géraudière, B.P. 82225, 44322 Nantes cedex, France
b OP&P Product Research, Postbus 14167, NL-3508 SG Utrecht, The Netherlands
Abstract

A procedure of clustering of variables is discussed and applied for the purpose of segmenting a panel of consumers. The underlying principle of the method is to find K groups of variables (i.e. the consumers) and K latent components such that the consumers in each group are as highly correlated as possible with the corresponding latent component. The procedure involves running, in a first step, a hierarchical clustering algorithm to determine the appropriate number of clusters and an initial partition of consumers. In a second step, a partitioning algorithm is carried out in order to improve the solution thus obtained. This clustering approach is illustrated using two real data sets. On these data sets, the procedure MD-PREF is also performed and it is shown how it can be complemented by the outcomes of the cluster analysis. In particular, an indication of the number of clusters among consumers is given. © 2001 Elsevier Science Ltd. All rights reserved.

Keywords: Consumer segmentation; Preference mapping; Clustering algorithms
* Corresponding author. Tel.: +33-2-51785454; fax: +33-2-51785455. E-mail address: vigneau@enitiaa-nantes.fr (E. Vigneau).

1. Introduction

Segmentation of a panel of consumers is very useful in preference studies. Suppose that p consumers are asked to rate the acceptability of n products; the data can be presented in the form of a table where the rows refer to products and the columns (variables) refer to consumers. Principal component analysis (PCA) performed on this data table leads to the so-called internal preference mapping, MD-PREF (see, for instance, Greenhoff & MacFie, 1994). This technique provides a graphical display of the preference data in which consumers are depicted by vectors representing increasing acceptance. However, this display may be cumbersome if the number of consumers is very large or if the number of underlying dimensions is greater than three. In these situations, segmenting the consumers into homogeneous clusters may enhance the understanding of the outcomes of the internal preference analysis. The underlying principle of the method of clustering of variables (consumers) discussed herein is as follows: find K groups of variables G1, G2, . . ., GK and K latent components T1, T2, . . ., TK associated, respectively, with
the K groups such that the variables in each group are as highly correlated as possible with the corresponding latent variable. The groups and the associated latent variables are determined by means of a partitioning algorithm. Furthermore, this partitioning algorithm is complemented by a hierarchical clustering technique that helps the practitioner to establish the appropriate number of clusters and provides an initial solution to be used as a starting point for the partitioning algorithm. This clustering approach around latent components is in line with the MD-PREF technique, as it is not only useful for exhibiting groups in the panel but also aims at finding directions of preference. This complementarity between the two techniques will be stressed through two examples. The procedure discussed herein presents several advantages over conventional techniques of clustering of variables, which take distances between variables as the starting point of the algorithm (see, for instance, Qannari, Vigneau, Luscan, & Thedaudin-Lefevre, 1997). It can also be viewed as an alternative to the Varclus (Variables Clustering) procedure implemented in the SAS package, which basically involves a divisive hierarchical algorithm in the course of which PCA followed by quartimax rotations is performed. With the Centroid option, which is more appropriate in the present context, clusters are chosen to maximize the variation accounted
for by the centroid component of each cluster (SAS/STAT, 1990). Among the advantages of our procedure over the methods cited above, the following aspects are worth mentioning:

1. The procedure meets the objectives sought in preference analysis, which consist in finding underlying directions of preference. These directions are given by the latent variables, which may be depicted directly on graphical displays or on a display derived from MD-PREF.

2. Contrary to the Varclus procedure, the approach discussed herein follows a pattern which is nowadays commonplace in cluster analysis, as it involves a hierarchical analysis followed by a partitioning algorithm (see, for instance, Hair, Anderson, Tatham, & Black, 1992). This makes the principle of the method easy to grasp.

3. The technique can be extended in order to relate preference scores to external data such as sensory or instrumental variables. This can be done by performing the same procedure under the constraint that the latent variables that underlie the preference data should be linear combinations of the external variables. This possibility of extension will be discussed in more detail elsewhere.

It is also worth mentioning that segmentation of consumers on the basis of preference ratings can be performed using competing methods based on latent class approaches. De Soete and Winsberg (1993) proposed a latent class (or mixture distribution) model that simultaneously clusters the consumers into a small number of homogeneous groups and carries out an analysis based on the vector model, yielding a geometrical representation in which each cluster is represented by a single vector. Latent class approaches assume that the preference data are independently and normally distributed, whereas the method proposed herein is purely descriptive (distribution-free).
Moreover, the Expectation-Maximisation (EM) algorithms used to fit latent class models are much more complicated than the algorithms involved in our approach, as outlined in the next two sections.
2. Partitioning algorithm

Suppose that p consumers are asked to rate the acceptability of n products. We aim at finding segments of consumers which are homogeneous in the sense that the scores of the consumers in each cluster are as highly correlated as possible with a latent variable associated with this cluster. Therefore, the underlying principle of the method can be expressed formally as follows: find K groups of variables G1, G2, . . ., GK (with K supposed to be fixed) and K latent components T1, T2, . . ., TK (each component being associated with a cluster) such that the quantity S is maximized:

S = \sum_{k=1}^{K} \sum_{j=1}^{p_k} r(x_{kj}, T_k)
where x_{kj} denotes the jth centered variable (consumer) in group G_k, p_k is the total number of variables in this group, and r(x_{kj}, T_k) is the correlation coefficient between x_{kj} and T_k. Criterion S is invariant whether the variables x_{kj} are standardized or not. For convenience, we shall assume that all the variables are standardized so as to have unit variance. The algorithm that aims at maximizing this criterion runs as follows:

Step 1: Start with an initial solution consisting of K groups of variables (consumers). This initial partition may be obtained by random allocation of the variables into K groups or, preferably, from the hierarchical clustering method discussed below.

Step 2: In each cluster G_k (k = 1, 2, . . ., K), the latent variable T_k is set to the centroid of the cluster:

T_k = \frac{1}{p_k} \sum_{j=1}^{p_k} x_{kj}
Step 3: New clusters are formed by moving each variable to a new group if its coefficient of correlation with the centroid of that group is higher than with any other centroid. Steps 2 and 3 are then repeated iteratively until stability is achieved.
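The three steps above can be sketched in Python as follows. This is an illustrative implementation under the assumption that the ratings are held in a numpy matrix with one row per product and one column per consumer; the function name and signature are ours, not from the paper.

```python
import numpy as np

def partition_consumers(X, labels, max_iter=100):
    """Sketch of the partitioning algorithm: X is an n-products by
    p-consumers matrix of centered, standardized ratings; labels is an
    initial assignment of each consumer (column) to one of K clusters."""
    labels = np.asarray(labels).copy()
    n, p = X.shape
    K = labels.max() + 1
    for _ in range(max_iter):
        # Step 2: set each latent component T_k to the centroid of its cluster.
        centroids = np.column_stack(
            [X[:, labels == k].mean(axis=1) for k in range(K)]
        )
        # Step 3: move each consumer to the cluster whose centroid it
        # correlates with most.
        corr = np.array([
            [np.corrcoef(X[:, j], centroids[:, k])[0, 1] for k in range(K)]
            for j in range(p)
        ])
        new_labels = corr.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break  # stability achieved
        labels = new_labels
    return labels, centroids

# Small synthetic check: two groups of consumers with opposite preference
# patterns, started from a deliberately imperfect initial partition.
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 10)
cols = [t + 0.1 * rng.standard_normal(10) for _ in range(4)]
cols += [-t + 0.1 * rng.standard_normal(10) for _ in range(4)]
X = np.column_stack(cols)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # center and standardize
init = np.array([0, 0, 0, 1, 1, 1, 1, 0])  # one consumer misplaced per group
labels, centroids = partition_consumers(X, init)
```

On this toy data, the misplaced consumers are reassigned and each group ends up homogeneous; a degenerate case such as a cluster emptying out is not handled in this sketch.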
3. Hierarchical clustering algorithm

We complement this partitioning algorithm with a hierarchical clustering method. In practice, both methods should be performed in order to benefit from each. The hierarchical technique can be used to establish an appropriate number of clusters. Furthermore, the partitioning algorithm may be carried out using the clusters from the hierarchical results as an initial solution (Step 1). In this way, the hierarchical method is complemented by the non-hierarchical method, which improves the results by allowing variables to switch cluster membership (Hair et al., 1992). The hierarchical clustering technique is based on the criterion S given above. It proceeds by merging, at each step, two clusters in such a way that the quantity S is kept as large as possible. Thus, it is clear that both the
Fig. 1. Juice data: internal preference mapping MD-PREF: (a) representation of the consumers, (b) representation of the products, on the basis of the first two components which, respectively, explain 29.2 and 8.8% of the total variance.
clustering algorithms (hierarchical and partitioning) are motivated by the same rationale. It can be proved that the criterion S can also be written as:

S = \sum_{k=1}^{K} p_k \, \sigma(T_k)

where p_k is the number of variables in group G_k, T_k is the centroid of this group and \sigma(T_k) is the standard deviation of T_k. At the first stage of the hierarchical clustering technique, each variable forms a cluster by itself; at this stage, S = p (the total number of consumers). At stage h, consider two clusters of variables (consumers), A and B, and denote by T_A and T_B their respective centroids. If A and B are merged, this results in a variation of criterion S measured by:

\Delta = S - S_{new} = p_A \sigma(T_A) + p_B \sigma(T_B) - \sigma(p_A T_A + p_B T_B)

where p_A and p_B are, respectively, the numbers of variables in group A and group B. It can be proved that \Delta is non-negative, which implies that the merging of two clusters results in a decrease of the criterion S. Therefore, the strategy of aggregation consists in merging, at each stage, those clusters A and B that result in the smallest decrease \Delta in criterion S. The use of the hierarchical clustering for the purpose of determining the appropriate number of clusters will be discussed through the case-studies outlined in the next section.
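The aggregation strategy can be sketched as follows. This is an illustrative greedy implementation (names are ours, not from the paper); it relies on the identity p_A \sigma(T_A) = \sigma(\sum_{j \in A} x_j), which holds because T_A is the mean of the variables in A, and it assumes centered, standardized columns.

```python
import numpy as np

def hierarchical_merge(X):
    """Sketch of the hierarchical algorithm on criterion S: X holds
    centered, standardized ratings, one column per consumer. Returns
    the value of S for each number of clusters, from p down to 1."""
    n, p = X.shape
    clusters = [[j] for j in range(p)]  # start: one consumer per cluster
    # S = sum over clusters of p_k * sigma(T_k) = sigma(sum of members);
    # with standardized columns this equals p at the first stage.
    s_values = {p: sum(X[:, c].sum(axis=1).std() for c in clusters)}
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sa = X[:, clusters[a]].sum(axis=1)
                sb = X[:, clusters[b]].sum(axis=1)
                # Delta = p_A*sigma(T_A) + p_B*sigma(T_B)
                #         - sigma(p_A*T_A + p_B*T_B)
                delta = sa.std() + sb.std() - (sa + sb).std()
                if best is None or delta < best[0]:
                    best = (delta, a, b)
        _, a, b = best                  # merge the pair with smallest Delta
        clusters[a] += clusters.pop(b)
        s_values[len(clusters)] = sum(X[:, c].sum(axis=1).std()
                                      for c in clusters)
    return s_values

# Synthetic check: two pairs of consumers with opposite preference patterns.
rng = np.random.default_rng(1)
t = np.linspace(-1.0, 1.0, 8)
X = np.column_stack([t + 0.1 * rng.standard_normal(8),
                     t + 0.1 * rng.standard_normal(8),
                     -t + 0.1 * rng.standard_normal(8),
                     -t + 0.1 * rng.standard_normal(8)])
X = (X - X.mean(axis=0)) / X.std(axis=0)
s = hierarchical_merge(X)
```

Plotting `s` against the number of clusters gives the scree-type diagram used in the next section: a sharp drop when passing from two clusters to one signals that 'unnatural' clusters are being merged.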
4. Application

The method of segmentation of a panel of consumers is illustrated on the basis of two case-studies.

As a first example, we consider a case-study in which 88 consumers rated 30 juices (in five sessions) with respect to overall liking. In order to illustrate the preferences of the consumers, the MD-PREF technique was carried out. The outcomes, related to the first two components which respectively explain 29.2 and 8.8% of the total variance, are depicted in Fig. 1. In this figure, each line is associated with a consumer and represents his or her direction of increasing acceptance. In a further stage, hierarchical clustering of the consumers based on criterion S was performed. Fig. 2 depicts the values of criterion S corresponding to the last steps of the hierarchical algorithm (number of clusters from 7 down to 1). This graph should be interpreted as a scree diagram. Criterion S decreases rather evenly as the number of clusters is reduced, which does not seem to evidence the existence of different segments among consumers. This statement is in accordance with the configuration of the consumers shown in Fig. 1(a).

Fig. 2. Values of criterion S for different numbers of clusters (juice data).

The second example refers to a case-study in which 80 consumers rated nine varieties of soup (in two sessions) with respect to overall liking. Fig. 3 shows the results derived from the MD-PREF procedure carried out on the preference data. This figure represents the configuration of the consumers and of the products on the basis of the first two principal components which, respectively, explain 18.2 and 18.0% of the total variance. The configuration of the consumers seems to evidence the existence of segments. Fig. 4 reflects the evolution of criterion S in the course of the hierarchical clustering procedure (number of clusters from 16 down to 1). It can be seen that the aggregation criterion jumps when passing from the solution with two clusters to the solution with one cluster. This should arouse the suspicion that 'unnatural' clusters are being merged; therefore, the solution with two clusters should be retained.

Fig. 3. Soup data: internal preference mapping MD-PREF: (a) representation of the consumers, (b) representation of the products, on the basis of the first two components which, respectively, explain 18.2 and 18.0% of the total variance.

Fig. 4. Values of criterion S for different numbers of clusters (soup data).

The solution with two clusters obtained from the hierarchical clustering was used as a starting point for the partitioning algorithm. As a matter of fact, only a slight improvement was achieved by the partitioning algorithm, as few consumers changed membership. This is usually the case, as the solution obtained by means of the hierarchical clustering algorithm is generally quasi-optimal. Fig. 5 reproduces the MD-PREF graphical display of Fig. 3, with only the endpoints of the vectors associated with the consumers represented. Moreover, each point is labelled according to the cluster to which the associated consumer belongs. This figure also shows the overall position of cluster 1 and cluster 2, i.e. the projection of the latent direction of preference of each cluster onto the first factorial plane of MD-PREF. The two segments of consumers being thus identified by means of the
Fig. 5. Soup data: representation of the consumers obtained with MD-PREF where consumers are labeled according to the cluster to which they belong. The overall latent directions of preference of consumers in cluster 1 and cluster 2 are also shown.
clustering approach, further investigations may be undertaken in order to discover the reasons for this segmentation in terms of demographic variables, behavioral variables, etc. (Greenhoff & MacFie, 1994).
5. Conclusion

The paramount feature of the procedure of clustering of consumers is that it is in line with MD-PREF, which is widely used within the framework of consumer studies. We have stressed, through examples, how the two methods complement each other. This complementarity is of prime interest for the practitioner when the percentage of variance explained by the first MD-PREF components is not large enough to allow a reliable graphical display on the basis of a small number of components. As in MD-PREF, the underlying model of preference under consideration is a vector model. This is reflected by the fact that correlation coefficients are used in the criterion to be maximized. Furthermore, this criterion can be slightly amended in order to take account of the variance of the scores of each consumer. This can be done simply by using non-standardized variables and replacing, in criterion S, the correlation coefficient by the covariance.
In practice, this tends to lump together in the same cluster those consumers who do not perceive marked differences in liking between the products. Another extension of the clustering procedure discussed herein allows the investigation of the relationship between preference data and external data (for instance, sensory or instrumental data). Further details about this extension will be reported elsewhere. The clustering procedure discussed in this paper will be implemented in Senstools (OP&P Product Research).
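The covariance variant of criterion S mentioned above can be sketched as follows (an illustrative snippet, not the authors' code; the function name is ours). The only change from the correlation-based criterion is that the ratings are centered but not standardized, so consumers with little variance in their scores contribute little to the criterion.

```python
import numpy as np

def criterion_s_cov(X, labels):
    """Covariance variant of criterion S: X holds centered but
    NON-standardized ratings, one column per consumer; labels assigns
    each consumer (column) to a cluster."""
    total = 0.0
    for k in np.unique(labels):
        members = X[:, labels == k]
        T = members.mean(axis=1)  # cluster centroid
        # covariance of each (centered) member with the centroid
        total += sum(float(np.mean(members[:, j] * T))
                     for j in range(members.shape[1]))
    return total

# Tiny check: two identical centered consumers in one cluster; each
# covariance with the centroid equals the common variance.
t = np.array([-1.0, 0.0, 1.0])
X = np.column_stack([t, t])
val = criterion_s_cov(X, np.array([0, 0]))
```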
References

De Soete, G., & Winsberg, S. (1993). A latent class vector model for preference ratings. Journal of Classification, 10, 195–218.

Greenhoff, K., & MacFie, H. J. H. (1994). Preference mapping in practice. In H. J. H. MacFie, & D. M. H. Thomson (Eds.), Measurement of food preferences (pp. 137–166). London: Blackie Academic & Professional.

Hair, J. F., Anderson, R. E., Tatham, R. L., & Black, W. C. (1992). Multivariate data analysis with readings. New York: Maxwell Macmillan International.

Qannari, E. M., Vigneau, E., Luscan, P., & Thedaudin-Lefevre, A. C. (1997). Clustering of variables, application in consumer and sensory studies. Food Quality and Preference, 8(5/6), 353–358.

SAS/STAT (1990). User's guide (Version 6, Vol. 2). Cary, NC: SAS Institute Inc.