FOOD QUALITY AND PREFERENCE 3 (1991/92) 23-36
CORRESPONDENCE ANALYSIS IN SENSORY EVALUATION

Jean A. McEwan
Department of Sensory Quality and Food Acceptability, Campden Food and Drink Research Association, Chipping Campden, Gloucestershire GL55 6LD, UK

&

Pascal Schlich
Institut National de la Recherche Agronomique, Laboratoire de Recherches sur les Aromes, 17 Rue Sully, BV 1540, 21034 Dijon Cedex, France

(Received 7 January 1991; accepted 14 June 1991)

ABSTRACT

Correspondence analysis is a technique which has been little used by sensory scientists for the analysis of sensory evaluation data. It is a technique which has similar objectives to principal component analysis and generalized Procrustes analysis in that it reduces the dimensionality of data to a more easily interpretable number of dimensions. In addition, it has been argued by advocates that it is more correct to use correspondence analysis with sensory data due to its often categorical nature. This paper aims to illustrate the use of correspondence analysis in sensory evaluation, and compares the results with those obtained from principal component analysis and generalized Procrustes analysis.

Keywords: Sensory evaluation; correspondence analysis; multiple correspondence analysis; principal component analysis; generalized Procrustes analysis; conventional profile.

0950-3293/92/$05.00 © 1992 Elsevier Science Publishers Ltd

INTRODUCTION

Background

Multivariate analysis methods are routinely used as a means of interpreting conventional profile data. Most commonly used is principal component analysis (PCA), though generalized Procrustes analysis (GPA) has become widely applied in the last five years. Another, less well known, analysis technique is correspondence analysis (CA) (Tomassone & Flanzy, 1977; Danzart, 1983; Van Buuren, 1987; Van der Burg & Dijksterhuis, 1989; Guichard et al., 1990).

In this section, the authors describe the main points which the user must consider when interpreting a correspondence analysis. The main differences between the PCA and CA procedures are then illustrated, both in mathematical and in lay terms. It is assumed throughout the paper that the reader is familiar with the interpretation of both PCA and GPA.
Objectives

There were two main objectives in writing this paper. The first was to introduce the methods of CA and multiple correspondence analysis (MCA), and this forms the basis of the Introduction. The second objective was to illustrate the application of CA and MCA to sensory profile data, how the results should be interpreted and how they compare to those obtained from PCA and GPA.

Correspondence analysis

Correspondence analysis (CA) is a multivariate method (Lebart et al., 1984; Greenacre, 1984) which looks at the correspondence (association) between two nominal (or categorical) variables. In its original application, the row and column variables make up a two-way contingency table. CA can be regarded in the light of performing a PCA on such data.

In its simplest form, sensory profile analysis results in a data matrix where the rows represent the samples being evaluated, and the columns represent the attributes used to describe the samples. The data points in this matrix each represent a rating of perception of a particular attribute for a given sample. Thus it is not immediately clear that such a matrix can be considered as a two-way contingency table. To try and get this idea across, it is first necessary to define the rows and columns in terms of two categorical variables, each with a number of categories. For example, suppose that a profile of six chocolates resulted in fourteen attributes, then the first variable is the rows which have six categories (i.e. 6 chocolates) and the second variable is the columns which have fourteen categories (i.e. 14 attributes).

To understand why a profile matrix might be treated in light of two categorical variables, rather than a matrix of attribute scores used to quantify sensory perception of a set of samples, it is necessary to recall arguments relating to types of data collected in sensory analysis. As most sensory scientists know, there are four types of measurement scale; nominal, ordinal, interval and ratio (O'Mahony, 1986). It is generally recognised that data collected on a category scale (e.g. 7-point scale) or a continuous line scale (e.g. 100 mm line scale) are ordinal data, unless shown to be otherwise. However, in spite of this knowledge it is common to assume for ease of data analysis that these profile data have interval properties. With a trained sensory panel this assumption may not be too far removed from reality, and together with the robustness of many parametric statistical methods, similar results are often obtained by performing both parametric and nonparametric univariate methods. However, comparisons made between these two classes of statistical method at a multivariate level are not so usual. It is therefore one of the aims of this paper to examine the effect of treating profile data as ordinal, instead of interval, by performing comparisons between CA and PCA. Comparisons will also be made with GPA, as this extension of PCA has become more widely used over the past few years.

Returning to the application of CA, it was stated that the row variable was divided into a number of categories; i.e. the number of samples evaluated. This is the simplest case, and in reality the number of categories of the row variable can be calculated in a number of ways:

1. samples × assessors × replicates;
2. samples × assessors (averaging over replicates);
3. samples × replicates (averaging over assessors);
4. samples (averaging over assessors and replicates).

These options also apply to data analysis with PCA. In this paper Option 1 was chosen in order to examine the reproducibility of the assessors. However, it is often more practical to use Options 2, 3 or 4, and it is usually advisable to perform these analyses in ascending order. This enables the user to evaluate the contribution to the variance of replicates, assessors and samples.

The next section illustrates in more mathematical terms the method of CA, but as most users of sensory profile analysis will be familiar with the practical application of PCA for interpretation, it is well worth illustrating what is considered to be the fundamental property of CA. At this point the reader should recall that on performing PCA on sensory profile data, sample scores are derived which represent the position of the samples in a multivariate space. In addition, principal component loadings are obtained which represent the weighting (or importance) attached to each of the original sensory attributes on the principal components (new dimensions).

The first step of CA is to calculate the row profile (R) and column profile (C) of the original matrix of data. This results in two new data matrices, R and C, as illustrated in the Appendix. PCA is then performed on the R and C matrices, and scores and loadings are derived in each case. The fundamental property of CA is that the principal component scores of R are proportional to the loadings of C, and similarly the principal component scores of C are proportional to the loadings of R. In practical terms, this allows samples and attributes to be represented simultaneously on the same plot.

One final practical point is that to perform CA, summing across the rows and columns must make sense. For example, it would not be sensible to sum across columns if the variables (attributes) were measured on different scales; pH, weight in grams, volume in cm3, etc.

Mathematics interpretation of CA

To understand in more detail how CA differs from PCA, the reader must first consider what these two techniques are doing to the data. Typically, in sensory profiling a matrix X is obtained, which comprises elements $x_{ij}$, where i (i = 1, 2, ..., n) is the ith row (sample) of the matrix X, and j (j = 1, 2, ..., p) is the jth column (attribute) of the matrix X. In PCA, the data $x_{ij}$ are column centred according to the formula $x_{ij} - \bar{x}_j$. In other words, the mean of the data in the jth column ($\bar{x}_j$) is calculated and subtracted from each of the elements in that column. This step is repeated for each of the p columns. However, this cannot be the case in CA since often the data $x_{ij}$ are frequency counts (positive integer figures). In cases where the data submitted to CA are not frequency-type, the $x_{ij}$ must be greater than zero. Another way of looking at this is to consider the $x_{ij}$ as 'the amount of something', and hence by definition cannot be negative.

In CA, both the row and column profiles of the matrix X are calculated, resulting in two new matrices; R denotes the matrix of row profiles, while C denotes the matrix of column profiles. In mathematical notation, R is derived by calculating $x_{ij}/x_{i.}$ for all i and j, while C is derived by calculating $x_{ij}/x_{.j}$ for all i and j. The $x_{i.}$ are the sums over the columns for each row i, while $x_{.j}$ are the sums over the rows for each column j.

In sensory profiling, $x_{i.}$ is the global value of one row over all the attributes, while $x_{.j}$ is equivalent to the mean of attribute j. Performing CA means that differences between $x_{.j}$ are lost and overall differences between $x_{i.}$ (rows) are lost. This leads onto another point worth noting, namely that CA does not take account of large differences between the $x_{i.}$, whereas PCA will often separate low and high $x_{i.}$ on the first dimension.

Essentially, CA is equivalent to performing two PCAs; the first is a PCA of the matrix R, and then a PCA of the matrix C. Both these PCAs are performed using the $\chi^2$ metric distance, which is, in the PCA of R, the distance between the two line profiles i and i'. This can be written in the following expression:

$$d^2(i, i') = \sum_{j} \frac{x_{..}}{x_{.j}} \left( \frac{x_{ij}}{x_{i.}} - \frac{x_{i'j}}{x_{i'.}} \right)^2$$

An important point to note when comparing this with the classical PCA is that the weight $x_{..}/x_{.j}$ ($x_{..}$ is the grand mean) allows attributes with small values (low $x_{.j}$) to contribute to the distances between line profiles to the same extent as those attributes which have large values (high $x_{.j}$). In a sense, this is comparable to the normalized PCA (i.e. correlation matrix) in which the weight is the inverse of the variance.
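The profile and distance calculations described in this section can be sketched in a few lines of Python with NumPy. This is an illustrative sketch only, not the software used in the paper: the function names and the small matrix `X` are hypothetical, and the CA step uses the standard singular value decomposition of standardized residuals rather than two separate PCAs. Under that assumption, the squared Euclidean distance between two rows in the full set of CA coordinates equals the $\chi^2$ distance between their row profiles.

```python
import numpy as np

def row_profiles(X):
    """Row profile matrix R: each x_ij divided by its row sum x_i."""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=1, keepdims=True)

def column_profiles(X):
    """Column profile matrix C: each x_ij divided by its column sum x_.j"""
    X = np.asarray(X, dtype=float)
    return X / X.sum(axis=0, keepdims=True)

def chi2_distance_sq(X, i, k):
    """Squared chi-square distance between the row profiles of rows i and k."""
    X = np.asarray(X, dtype=float)
    weights = X.sum() / X.sum(axis=0)      # x_.. / x_.j for each column j
    R = row_profiles(X)
    diff = R[i] - R[k]
    return float(np.sum(weights * diff ** 2))

def correspondence_analysis(X):
    """Simple CA via SVD; returns row coordinates, column coordinates, inertias."""
    X = np.asarray(X, dtype=float)
    P = X / X.sum()                        # correspondence matrix
    r = P.sum(axis=1)                      # row masses
    c = P.sum(axis=0)                      # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))  # standardized residuals
    U, s, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * s) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * s) / np.sqrt(c)[:, None]
    return row_coords, col_coords, s ** 2

# Hypothetical 4 samples x 3 attributes matrix of positive scores.
X = np.array([[10, 20, 30],
              [20, 20, 20],
              [ 5, 15, 40],
              [30, 10, 20]])
F, G, inertia = correspondence_analysis(X)
```

Because rows and columns are scaled by the same singular values, the two sets of coordinates can be drawn on one plot, which is the simultaneous representation referred to in the text.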
One can prove that the principal component scores of R are proportional to the loadings of C. Thus, a number of useful points for interpretation emerge, as summarized in Table 1. Point 1 is self explanatory, but for Point 2 it is worth noting that it would be wrong to conclude that two samples have the same attribute amounts, even if they lie close together. PCA or GPA should be used to determine attribute amounts, which effectively indicates the perceived intensity of an attribute.

Considering Point 2 in more detail it should be noted that distances between samples and between assessors are comparable. This implies that there is more 'between sample' variation than 'between assessor' variation if assessors lie close together and samples far apart. Clearly, it is generally desirable for the assessors to be close together at the origin, for replicates of the same sample to be close together, and for samples to be spread over the space. Point 4 is a useful reminder that points close to the origin are contributing little to the derived space. For example, two attributes lying close together away from the origin can be interpreted with confidence, but if the same two attributes were to lie at the same distance from each other near the origin, then their possible correspondence would be treated with caution. These points should become clearer on working through the interpretation of data later in this paper.

TABLE 1. Four Points for the Interpretation of Correspondence Analysis

Point  Description
1      The origin corresponds to the mean line profile and mean column profile. Thus, the importance of a point (sample, assessor or attribute) increases with its distance from the origin.
2      The closeness of two attributes (or two samples or two assessors) implies that they have similar column (or line) profiles.
3      The closeness of a sample and an attribute should be interpreted in terms of correspondence. This means that this sample received higher relative values than others for this particular attribute.
4      The further away two close points are from the origin, the stronger the implications.

Multiple correspondence analysis

MCA, like CA, is used to analyse data collected on both nominal and ordinal attributes, where the number of different responses for a given attribute is fixed. Further, the total number of responses over attributes is therefore fixed, and equal to the number of columns of the table of data to be analysed. MCA is in effect equivalent to performing PCA on more than two nominal (or category) variables.

MCA is, in fact, a generalization of CA when the number of measured variables is equal to k, with k greater than two. To illustrate this, it is first necessary to appreciate that in CA the data are essentially a two-way contingency table. Thus, MCA is in fact analysing k × (k - 1) two-way contingency tables, based on all possible variable pairs. Using these two-way tables it is then possible to construct a Burt table (Lebart et al., 1984), which is a partitioned symmetric matrix of all pairs of two-way contingency tables. The MCA program of SAS can then be used to analyse this table.
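The structure of a Burt table can be made concrete with a short Python sketch. It assumes the standard construction in which the Burt table is the inner product Z'Z of a 0/1 indicator matrix Z with one column per category. The helper `indicator_matrix` and the five records are hypothetical; only the first record (age group A2, male, social class C2) follows the paper's own example of age, sex and social class.

```python
import numpy as np

def indicator_matrix(records, levels):
    """Build a 0/1 matrix with one column per category of each variable.

    records: list of tuples, one category label per variable.
    levels:  list of lists, the allowed labels of each variable.
    """
    columns = [(v, lab) for v, labs in enumerate(levels) for lab in labs]
    Z = np.zeros((len(records), len(columns)), dtype=int)
    for row, rec in enumerate(records):
        for v, lab in enumerate(rec):
            Z[row, columns.index((v, lab))] = 1
    return Z, [lab for _, lab in columns]

levels = [["A1", "A2", "A3", "A4"],            # age group
          ["M", "F"],                          # sex
          ["A", "B", "C1", "C2", "D", "E"]]    # social class

# Hypothetical individuals; only the first follows the example in the text.
records = [("A2", "M", "C2"),
           ("A1", "F", "B"),
           ("A2", "F", "C1"),
           ("A4", "M", "C2"),
           ("A3", "M", "A")]

Z, labels = indicator_matrix(records, levels)
burt = Z.T @ Z   # symmetric; each off-diagonal block is a two-way contingency table
```

Each row of `Z` sums to the number of variables (three here), the diagonal blocks of `burt` hold the category counts, and each off-diagonal block is the cross-tabulation of one pair of variables.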
However, if the user constructs a partitioned symmetric design matrix (PSDM) of 0s and 1s, then submitting this PSDM to a simple CA program is equivalent to performing MCA on a Burt table. In fact, the Burt table is the inner product of the PSDM.

To obtain the PSDM the user must have access to the raw individual data. To illustrate how this would look consider a simple example where there are three categorical variables; age, sex and social class. Age has four levels (A1, A2, A3 and A4), sex has two levels (M and F) and social class has six levels (A, B, C1, C2, D and E). For five individuals the first five lines of the PSDM will take the form shown in Table 2. For example, the first line indicates that the individual is in age group A2, is male and of social class C2. One important feature to note is that the sum of each row should equal the total number of variables, in this case three.

TABLE 2. Example of Data Format for Performing Multiple Correspondence Analysis with a Correspondence Analysis Program

Individual    Age group: A1 A2 A3 A4    Sex: M F    Social class: A B C1 C2 D E

Another aspect of performing MCA on sensory data is the conversion of data on each variable from its original scale (e.g. 100 mm line scale, 9-point category scale) into a smaller number of representative categorical variables. This is achieved by constructing a histogram for each variable and using the shape of the distribution to aid in the selection of the number of categories, as will be illustrated later in this paper. In practice, between 2 and 5 new categories are selected. Suppose that a 100 mm line scale for sweetness is divided into three categories, where observations from 0-20 were allocated to the first category, observations rated 21-50 were allocated to the second category and 51-100 were allocated to the third category. Then the attribute sweetness is effectively split into three new attributes, which for convenience can be called SW0, SW1 and SW2. After performing MCA the three levels of sweetness are plotted together with levels of the other attributes from the profile being analysed. A directional line can then be drawn to join SW0 to SW1, and SW1 to SW2. Thus, performing MCA on profile data allows the path of each attribute to be followed through the sample space.

MATERIALS AND METHODS

Background to data

The data used were from a conventional profile of eight commercially available strawberry jams, conducted at Campden Food and Drink Research Association. In this profile, twelve trained sensory assessors rated the intensity of eighteen attributes (Table 3) for each of the jam samples in triplicate using an unstructured line scale. The attributes used were described and defined over a number of training sessions.

Principal component analysis

Principal component analysis (PCA) (e.g. Chatfield & Collins, 1980), based on the covariance matrix, was applied to the complete data set. In other words, a matrix of 288 rows (12 assessors × 8 samples × 3 replicates) by 18 columns (attributes) was analysed. Sample and attribute plots were obtained. MINITAB was used to perform PCA.
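A covariance-based PCA of a matrix with this layout can be sketched as follows. This is a minimal illustration in Python with NumPy, not the MINITAB procedure used in the paper. The data are random placeholders rather than the jam ratings, the function name is hypothetical, and the rows are assumed to be ordered assessor by assessor so that scores can be averaged over assessors, as is done for the sample plot in the Results.

```python
import numpy as np

def pca_covariance(X):
    """PCA on the covariance matrix: column-centre, then eigendecompose."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                    # subtract each column mean
    cov = np.cov(Xc, rowvar=False)             # p x p covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                      # principal component scores
    explained = eigvals / eigvals.sum()        # proportion of variance per PC
    return scores, eigvecs, explained

# Placeholder data with the dimensions given in the text (not the jam data).
n_assessors, n_samples, n_reps, n_attr = 12, 8, 3, 18
rng = np.random.default_rng(0)
X = rng.uniform(0, 60, size=(n_assessors * n_samples * n_reps, n_attr))

scores, loadings, explained = pca_covariance(X)

# Average scores over assessors, assuming rows are grouped by assessor.
sample_scores = scores.reshape(n_assessors, n_samples * n_reps, n_attr).mean(axis=0)
```

Working from the covariance rather than the correlation matrix leaves attributes with larger variance more influential, which is the contrast with normalized PCA drawn in the Introduction.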
TABLE 3. Attributes used to Describe Jams, and their Abbreviations. (Figures in parentheses relate to the number of categories used in MCA)

Attribute                  Abbreviation
Sweet                      SWE
Acid (3)                   ACI
Caramelized                CAR
Over ripe (2)              RIP
Strength of strawberry     SOS
Strength of fruit (3)      SOF
Synthetic/perfumy          SYN
Bitter                     BIT
Musty                      MUS
Stewed                     STE
Smooth                     SMO
Jelly-like                 JEL
Thick (4)                  THI
Seedy (3)                  SEE
Gelatinous                 GEL
Mouthcoating               MCO
Bitter aftertaste          BAT
Throat catching            TCA

Generalized Procrustes analysis

Generalized Procrustes analysis (GPA) (Gower, 1975; McEwan & Hallett, 1990) was applied to the complete data set, to derive a consensus sample and attribute plot, as well as the corresponding individual assessor plots. GENSTAT was used to perform the GPA.

Correspondence analysis

Correspondence analysis was performed on data of the same form as that used for PCA and GPA. This allows a consensus sample and attribute plot to be derived, as well as individual sample plots. SAS was used to perform the CA.

Multiple correspondence analysis

Data were plotted as histograms, where the vertical axis represented the 60 mm line scale in increments of 5 units, and the horizontal axis the frequency with which an observation fell into each of the 12 categories. The original data were then transformed to take category values. SAS was used to perform the MCA.

RESULTS AND DISCUSSION

Figure 1 shows the sample and attribute plot from PCA, while Fig. 2 shows the corresponding plot from GPA. The sample plot from PCA was derived by averaging the principal component scores across assessors, whilst the plot from GPA is the consensus derived after adjusting the data for the three sources of variation through translation, scaling and rotation/reflection. The attribute plot for PCA was obtained by measuring the correlation between the PC scores and the original attribute ratings, after averaging. This same procedure was used on the GPA data to obtain an attribute plot for each individual assessor, but for ease of presentation the consensus of these values are presented in Fig. 2. The sample scores were scaled to enable both samples and attributes to fit on the same plot. The first letter of the attribute name is its location in space.

It is evident that the sample spaces from these methods are very similar, though there is a more distinct separation of Samples A, D and F in the GPA plot, possibly due to less variation in the data after adjustment through translation and scaling. The attribute plots are also similar, though there are a few differences worth noting. Firstly, smooth and jelly-like contribute very little to the separation of the samples in PCA since they are both near the origin.
Fig. 1. Sample and attribute plot derived from principal component analysis; sample scores averaged over assessors. Axis: PC 1 (31%).

Fig. 2. Consensus sample and attribute plot derived from generalized Procrustes analysis. Axis: PA 1 (64%).
Fig. 3. Combined sample and attribute plot derived from correspondence analysis, with average assessor positions represented. Axis: Dimension 1 (33%).

Bitter flavour and aftertaste contribute less to the GPA space than that obtained from PCA. Other differences can be seen in the contribution to attributes on individual dimensions. For example, strength of strawberry is important on the first dimension of the GPA space and contributes a little to the second dimension, but only really contributes to the first dimension of PCA. Subsequent dimensions were examined, but as they added little to the interpretation they are not included in this discussion.

The amount of variance accounted for by the first dimension of the PCA (30%) and GPA (64%) spaces differs considerably. This is due to the way in which the data are treated using both methods. In PCA, the analysis is performed on a matrix where the observations (rows/objects) comprise samples, replicates and assessors. Thus, the derivation of the principal components includes variation attributed to each of these three sources. However, in the case of GPA, each assessor's space is matched through the steps of translation, scaling and rotation and reflection, which has the effect of eliminating assessor to assessor variation. To understand this better, suppose analysis of variance is performed on each principal component, specifying 'assessor' and 'sample' as the two factors. Then this analysis of variance would show both between assessor variation and between sample variation. However, if the corresponding analysis was performed on the principal axes obtained after GPA, then the between assessor source of variation would be close to zero. This means that while the consensus plots from the two methods show the same visual structure, the variance structure is different. If PCA were performed using data averaged across assessors, after eliminating the assessor effect, then a similar variance structure to GPA is usually obtained.

Figure 3 shows the combined sample and attribute plot derived from correspondence analysis. The sample positions were obtained by averaging over assessors. These show a similar structure to the PCA and GPA plots, with the percentage variation explained more closely resembling the PCA variance than that from GPA.
This latter observation is due to the similarity of PCA and CA in that they are both subject to between assessor variation.

Using CA the samples and attributes can readily be plotted on the same diagram (see Table 3 for abbreviations). To interpret this plot (Fig. 3) the points raised in Table 1 should be remembered. First consider correspondence between samples and attributes; for example Samples A, D and F have a high correspondence with acid (ACI) flavour, while Sample G shows a strong correspondence with mouthcoating (MCO). Many of the attributes lie away from the samples, but these still indicate the direction of increasing correspondence with these attributes. For example, strength of flavour (SOF) is increasing along Dimension 1 which suggests that Samples A, D and F were perceived to have greater flavour strength than the other five samples. Another example is synthetic (SYN), which increases towards the top left hand corner of Fig. 3, suggesting that Sample C was perceived as more synthetic than the others.

The numbers 1-12 on Fig. 3 are the calculated average points of each assessor. Ideally, all assessors should cluster around the origin indicating agreement. However, Assessors 8, 12 and 11 are slightly further out from the rest suggesting that their data are different from the others with respect to their correspondence with the attributes. This was not reflected by examination of the assessor plots, assessor residuals and percentage variance from the GPA.

Prior to multiple correspondence analysis (MCA), histograms were drawn for each of the sensory attributes. There were three general shapes to these histograms, as illustrated in Fig. 4. Based on these histograms, each attribute was transformed to categorical data. Those attributes with a distribution similar to Fig. 4(a) were converted to four category levels, Fig. 4(b) to three category levels and Fig. 4(c) to data with two category levels. The number of category levels for each attribute is given in Table 3 (figures in parentheses).

Fig. 4. General histogram shapes, panels (a), (b) and (c), used to determine the number of category levels for each attribute. Axis: Attribute Intensity.

Figure 5 shows the sample space derived after MCA. As before the three replicate positions of each sample are joined to form a triangle. On this two-dimensional picture, all of the samples are separated with the exception of A and D. Also, in common with the other methods is the cluster of A, D and F on the positive side of the first dimension, with the other samples on the negative side. However, the pattern of Samples E, C, H, B and G along the second dimension of Fig. 5 is different from the other plots. The reason for this can be found on examining Fig. 6.

Figure 6 shows the path of a number of the attributes through the jam space. The number of points on the path of each attribute corresponds to the number of category levels allocated to it. To illustrate the reason for the different sample positions along Dimension 2, consider the attribute 'thick'. In each of the previous methods, 'thick' has been positioned on the bottom left-hand quadrant of the plot where Sample G has been positioned.
However, in Fig. 5 from MCA, Sample G is in the top left-hand quadrant of the plot. Now by examining Fig. 6 it is evident that the path of 'thick' (THI0-THI1-THI2-THI3) reaches its maximum in the top left-hand quadrant of the space. Thus, MCA is taking a slightly different view of the interpretation of the profile data. This same phenomenon was observed on tracing other attributes, such as 'mouthcoating' (Fig. 6) and 'strength of strawberry' (not shown on Fig. 6). Conversely, the attributes caramel, over-ripe and stewed provide less information in the MCA than they did using PCA or CA.

Fig. 5. Sample plot derived from multiple correspondence analysis, with average assessor positions represented. (Lower-case letters represent centroid position of sample.) Axis: Dimension 1 (16%).

The results of the MCA (including those not represented in Fig. 6) can be summarized in terms of the diagram in Fig. 7, which represents the path of the main attributes on the MCA space. This illustrates the Guttman, or horseshoe effect (Benzecri, 1973; Greenacre, 1984) as represented by the steeply dipped curve. This phenomenon is quite common in correspondence analysis and other multidimensional scaling methods. It reflects that, while the derived dimensions are linearly orthogonal to each other, they can also be related in a nonlinear way (Greenacre, 1984). There is also a shallower horizontal direction across the diagram, representing sweetness and strawberry on the left and fruit and acid on the right. Musty takes a shorter path than the other attributes along the horizontal of the diagram, whilst bitter and synthetic follow a path on the vertical of the diagram.

These results indicate that MCA is taking a different view of the data from the other techniques discussed in this paper. This is not to say that MCA is more 'correct' or 'wrong', merely that it approaches the interpretation from a different and interesting angle. The idea of tracing an attribute path through a perceptual space certainly has its attractions for the interpretation of profile data, and those with access to an MCA program are encouraged to make use of this approach as complementary to the more traditional methods of analysing such data.
Fig. 6. Path of selected attributes derived from multiple correspondence analysis. (Lower-case letters represent centroid position of sample.) Axis: Dimension 1 (16%).

Fig. 7. Summary of main points from multiple correspondence analysis attribute path plot, illustrating the Guttman effect. Labels shown: throat catching, mouthcoating, gelatinous, thick, sweet, strawberry, fruit, acid, musty, bitter, synthetic.

On the question of individuals, MCA like CA is not as flexible as GPA in dealing with this. However, it can indicate potential 'odd' or 'different' assessors by calculating an average position for each person.
Figure 5 shows the twelve assessors in a similar position to that obtained from the CA (Fig. 3). Thus, both CA and MCA highlighted the same assessors as being different in some way from the consensus. It can be concluded that Assessor 6 is sensitive to the attribute bitterness, but Assessor 11 is not. However, it would be wrong to suggest an association between Assessor 8 and jelly-like since they lie within the central (origin) region of the plot.

Returning to the GPA, it is worth considering another point, having completed interpretation using the four methods. In Fig. 2, the attributes smooth, jelly-like and seedy all have greater importance as a result of GPA than they have using the other methods. In the GPA, smooth and jelly-like are strongly related to synthetic which may suggest that a number of the panel find a relationship between these attributes. This was not picked up by the other methods, and hence is worth noting as an advantage of applying GPA.

Having looked at each of the four methods on the same data set, consideration was given to the question of what is the best method to use. Detailed investigation of this is outside the scope of this paper, but Table 4 provides some information for consideration. The Hotelling-Lawley (HL) statistic (Freund et al., 1986) was calculated in association with multivariate analysis of variance (MANOVA), while the Fisher statistic from analysis of variance (ANOVA) was calculated for the first two dimensions derived from each method. These statistics are all measures of sample discrimination, based on 24 samples (8 jams and 3 replicates). For PCA, CA and MCA the average locations of the 24 samples were computed. Table 4 gives a Fisher approximation for the HL statistic. The higher this F value, the better the discrimination between samples is in the first sample plot.

TABLE 4. Hotelling-Lawley and Fisher Statistics for the Different Multivariate Methods, and Percentage Variation Accounted for in the First Two Dimensions

Method    HL(d)    Fisher value(e)      % Variation
                   Dim 1    Dim 2       Dim 1    Dim 2
GPA       232.8    401.4    79.7        64.4     11.6
PCA       173.6    316.3    34.2        30.8     12.4
PCA(a)    203.9    337.1    106.9       75.3     14.2
CA        116.3    205.2    53.8        28.4     13.4
CA(b)     171.2    249.1    113.4       73.1     14.2
MCA(c)    118.1    241.7    13.8        15.7     8.5

(a) PCA on data averaged over assessors (24 rows by 18 columns).
(b) CA on data averaged over assessors (24 rows by 18 columns).
(c) With MCA, the categorization does not allow data to be averaged over assessors.
(d) Fisher approximation with 14 and 18 degrees of freedom. Critical value at level 0.001 is 4.87.
(e) Fisher with 7 and 16 degrees of freedom. Critical value at level 0.001 is 6.46.

All the Hotelling-Lawley and Fisher values are highly significant, assuming that the test assumptions are true. This indicates that some of the samples are different, and that assessors are able to evaluate these differences in a multivariate context. However, the most important use of Table 4 is to compare the discriminant power of each method. These results suggest that GPA is the most discriminating of the methods, most likely due to the rotation/reflection step. It is also evident that it is better to average across assessors before performing either PCA or CA if the aim is to separate the samples as well as possible. Alternatively, the standardization of each assessor separately is also likely to result in good discrimination between samples. PCA is more discriminating than CA or MCA, probably due to the line profile normalization of the CA methods. In terms of balance of discriminant information between Dimension 1 and Dimension 2, as indicated by the Fisher value, the CA on the averaged data is the best in this respect.
CORRESPONDENCE is the best in this respect. is the method dimensional variation lower
which
accounted
The
for using
phenomenon.
remembering
multivariate to cope
method
with
and GPA
that
this is a
However,
it
MCA
data,
are theoretically
ordinal
though
is the
and interval
while
more
is
only is able
CA,
correct
PCA to use
data.
IN SENSORY
EVALUATION
Guichard,
E., Schlich,
Typicality
of apricot aroma: Correlations
sensory
is much
of this sort which
nominal
two-
percentage
MCA
than the other methods,
worth
that CA
leads to the best
discrimination.
well-known
with
This means
ANALYSIS
P. & Issanchou,
35 S. (1990). between
data. /. Food Sci., 55 (3),
and instrumental
735-738. Lebart,
L.,
Morincau,
(1984). Multivariate Correspondence
A.
Descriptive
Analysis
McEwan,
J. A. & Halictt,
the
and
crustes
Techniques _for
and Sons, New
York.
E. M. (1990). A Guide to
Interpretation
Analysis,
K. M.
Statistical Analysis:
and Related
Large Matrices. John Wiley Use
& Warwick,
of Generalized
Statistical
Manual
Pro-
Number
1,
CFDRA. M. (1986). Sensory Evaluation
O’Mahony, Statistical New
ACKNOWLEDGEMENTS
Methods
and Procedures.
Marcel
qf Food: Dekker,
York.
Tomassone,
R. 81 Flanzy,
synthetique
de
don&es
diverses
par un jury
C. (1977). mdthodes
Prdscntation d’analyse
de
Ann. Technol.
de degustateurs.
Agric., 26, 3733418. Van
Buuren,
spondence The authors
would
INRA
financial
for
possible
support
CFDRA in
Council
Foreign
Affairs the
exchange
and the French for providing
authors
to
work
and
making
to analyse the data in depth,
British allow
like to thank
it
and to the
Ministry
travel
funds
together
Chemistry Van
S. (1987). analysis
in
Using
and Industry, July,
dcr
Burg,
Nonlinear
multiple
sensory
quality
447-50.
E. & Dijksterhuis,
canonical
correlation
of
way data. In Multiway
to
& S. Bolasco.
North
corrcresearch.
G. B. (1989).
analysis
of multi-
Data Analysis, ed. 1~. Coppi Holland,
Amsterdam.
and
ideas.
STATISTICAL PACKAGES REFERENCES GENSTAT ithms Benzecri, J. P. (1973). L’Analyre des Donnees. 2: L’Analyse Chatfield,
des Correspondames.
Dunod,
Tome
Paris.
Group
MINITAB crence
C. & Collins, A. J. (1980). Introduction to
SAS
(1988).
Analysis. Chapman and Hall, London.
ditional Institute
Algor-
Limited. (1990).
Manual
Danzart, M. (1983). Evaluation of the performance
Multivariate
(1983). Relcasc 4.04. Numerical Data
Analysis
Software.
Release 7.2. Minitab SAS
SAS/STAT Inc.. North
Technical
Report
Procedures,
Rcf-
Inc. PA, USA. P-179:
Ad-
Rclcasc 6.03. SAS
Carolina.
of panel judges. In Food Research and Data Analysis, cd. H. Martens & H. Russwurm. London,
Applied Science,
pp. 305-319.
Freund, R. J., Littell, R. C. & Spector, P. C. (1986). SAS Systems for Linear Models. SAS Institute, Inc., North Carolina.
APPENDIX
Gower, J. C. (1975). Generalized Procrustes analysis. Psychometrika, Greenacre,
40 (I), 33-51.
M. J. (1984).
Correspondence
Analysis.
Theory and Applications Academic
of
Press, London.
To illustrate the method of calculating the row and column profiles of a matrix of data, consider the following raw data. In this case the raw data is taken to be the mean sample score for each attribute.

Raw data

                                     Attribute
Sample    1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  Total
A        20  30   1   7  15   8  31   6   4  17  28  22  17  13   8  17   9   6    259
B        44  14  12   7  26   2  11  24   7   8   5  40  18  33  29  17  25   8    330
C        44  13   8   6  20   2   9  35   9  26   4  31   9  20  23  35  23  10    327
D        21  28   4  14  13  11  32   6   7  22  24  15  16   8   7  26  12   6    272
E        34  18   5   7  24   4  17  13  15  14  14  32  19  21  20  20  27  16    320
F        16  36   2  12  12  12  34   7   7  19  30  18  17   8   7  23  13   8    281
G        45  16  14   7  34   0   6  10   6   3   4  48  18  40  30  16  25   6    328
H        45  16  15   7  28   1  13  14   9  11   9  28  14  24  26  22  28   9    319
Total   269 171  61  67 172  40 153 115  64 120 118 234 128 167 150 176 162  69

To calculate the value for the line profile matrix of Sample A, Attribute 1, the raw data value of 20 is divided by the row sum for Sample A, which is 259. This result is then multiplied by 100 to give the line profile value of 7.7. This process is repeated for all data values.

To calculate the value for the column profile matrix of Sample A, Attribute 1, the raw data value of 20 is divided by the column sum for Attribute 1, which is 269. This result is then multiplied by 100 to give the column profile value of 7.4. This process is repeated for all data values.
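The profile calculation described in this appendix can be sketched in a few lines of Python. This is a minimal illustration only, not the procedure of any package named in the paper: the function names `line_profiles` and `column_profiles` are hypothetical, and the matrix is the appendix raw data as read from the printed table (its row and column sums agree with the printed totals).

```python
# Raw data from the appendix: mean sample score for each of 18 attributes.
# Keys are samples A-H; list positions 0-17 are Attributes 1-18.
RAW = {
    "A": [20, 30, 1, 7, 15, 8, 31, 6, 4, 17, 28, 22, 17, 13, 8, 17, 9, 6],
    "B": [44, 14, 12, 7, 26, 2, 11, 24, 7, 8, 5, 40, 18, 33, 29, 17, 25, 8],
    "C": [44, 13, 8, 6, 20, 2, 9, 35, 9, 26, 4, 31, 9, 20, 23, 35, 23, 10],
    "D": [21, 28, 4, 14, 13, 11, 32, 6, 7, 22, 24, 15, 16, 8, 7, 26, 12, 6],
    "E": [34, 18, 5, 7, 24, 4, 17, 13, 15, 14, 14, 32, 19, 21, 20, 20, 27, 16],
    "F": [16, 36, 2, 12, 12, 12, 34, 7, 7, 19, 30, 18, 17, 8, 7, 23, 13, 8],
    "G": [45, 16, 14, 7, 34, 0, 6, 10, 6, 3, 4, 48, 18, 40, 30, 16, 25, 6],
    "H": [45, 16, 15, 7, 28, 1, 13, 14, 9, 11, 9, 28, 14, 24, 26, 22, 28, 9],
}


def line_profiles(data):
    """Express each value as a percentage of its row (sample) sum."""
    return {
        name: [100.0 * value / sum(row) for value in row]
        for name, row in data.items()
    }


def column_profiles(data):
    """Express each value as a percentage of its column (attribute) sum."""
    col_sums = [sum(col) for col in zip(*data.values())]
    return {
        name: [100.0 * value / total for value, total in zip(row, col_sums)]
        for name, row in data.items()
    }


if __name__ == "__main__":
    lp = line_profiles(RAW)
    cp = column_profiles(RAW)
    # Sample A, Attribute 1: 20 / 259 * 100 and 20 / 269 * 100
    print(f"Line profile:   {lp['A'][0]:.1f}")   # 7.7
    print(f"Column profile: {cp['A'][0]:.1f}")   # 7.4
```

Each line profile sums to 100 across the attributes of a sample, and each column profile sums to 100 across the samples for an attribute, which is a convenient check on the arithmetic.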