Correspondence analysis in sensory evaluation

Food Quality and Preference 3 (1991/92) 23-36

CORRESPONDENCE ANALYSIS IN SENSORY EVALUATION

Jean A. McEwan
Department of Sensory Quality and Food Acceptability, Campden Food and Drink Research Association, Chipping Campden, Gloucestershire GL55 6LD, UK

& Pascal Schlich
Institut National de la Recherche Agronomique, Laboratoire de Recherches sur les Aromes, 17 Rue Sully, BV 1540, 21034 Dijon Cedex, France

(Received 7 January 1991; accepted 14 June 1991)

ABSTRACT

Correspondence analysis is a technique which has been little used by sensory scientists for sensory evaluation data. It has similar objectives to principal component analysis and generalized Procrustes analysis, in that it reduces the dimensionality of data to a more easily interpretable number of dimensions. In addition, it has been argued by advocates that it is more correct to use correspondence analysis with sensory data due to its often categorical nature. This paper aims to illustrate the use of correspondence analysis in sensory evaluation, and compares the results with those obtained from principal component analysis and generalized Procrustes analysis.

Keywords: Sensory evaluation; correspondence analysis; multiple correspondence analysis; principal component analysis; generalized Procrustes analysis; conventional profile.

0950-3293/92/$05.00 © 1992 Elsevier Science Publishers Ltd

INTRODUCTION

Background

Multivariate analysis methods are routinely used for the analysis of conventional profile data. Most commonly applied is principal component analysis (PCA), though generalized Procrustes analysis (GPA) has become widely used in the last five years. Another, less well known, technique is correspondence analysis (CA) (Tomassone & Flanzy, 1977; Danzart, 1983; Van Buuren, 1987; Van der Burg & Dijksterhuis, 1989; Guichard et al., 1990). In this section, the main differences between PCA and CA are illustrated in both mathematical and lay terms. The authors then describe the five main points which the user must consider when interpreting a correspondence analysis. It is assumed throughout the paper that the reader is familiar with the interpretation of both PCA and GPA.

Objectives

There were two main objectives in writing this paper. The first was to introduce the methods of CA and MCA, and this forms the basis of the Introduction. The second objective was to illustrate the application of CA and MCA to sensory profile data, how the results should be interpreted and how they compare to those obtained from PCA and GPA.

Correspondence analysis

Correspondence analysis (CA) (Lebart et al., 1984; Greenacre, 1984) is a multivariate method which looks at the correspondence (association) between the row and column variables which make up a two-way contingency table. In its original application, CA can be regarded in the light of performing a PCA on two nominal (or categorical) variables.

In its simplest form, sensory profile analysis results in a data matrix where the rows represent the samples being evaluated, and the columns represent the attributes used to describe the samples. The data points in this matrix each represent a rating of perception of a particular attribute for a given sample. Thus it is not immediately clear that such a matrix can be considered as a two-way contingency table. To try and get this idea across, it is first necessary to define the rows and columns of the sensory profile matrix in terms of two variables, each with a number of categories. For example, suppose that a profile of six chocolates resulted in fourteen attributes; then the first variable is the rows, which have six categories (i.e. 6 chocolates), and the second variable is the columns, which have fourteen categories (i.e. 14 attributes).

To understand why a profile matrix might be treated in light of two categorical variables, rather than a matrix of attribute scores used to quantify sensory perception of a set of samples, it is necessary to recall arguments relating to types of data collected in sensory analysis. As most sensory scientists know, there are four types of measurement scale: nominal, ordinal, interval and ratio (O'Mahony, 1986). It is generally recognised that data collected on a category scale (e.g. 7-point scale) or a continuous line scale (e.g. 100 mm line scale) are ordinal data, unless shown to be otherwise. However, in spite of this knowledge it is common to assume for ease of data analysis that these profile data have interval properties. With a trained sensory panel this assumption may not be too far removed from reality, and together with the robustness of many parametric statistical methods, similar results are often obtained from both parametric and nonparametric univariate methods. However, comparisons between these two classes of statistical method at a multivariate level are not so usual. It is therefore one of the aims of this paper to examine the effect of treating profile data as ordinal instead of interval, by performing comparisons between CA and PCA. Comparisons will also be made with GPA, as this extension of PCA has become more widely used over the past few years.

Returning to the application of CA, it was stated that the row variable was divided into a number of categories; i.e. the number of samples evaluated. This is the simplest case, and in reality the number of categories of the row variable can be calculated in a number of ways:

1. samples × assessors × replicates;
2. samples × assessors (averaging over replicates);
3. samples × replicates (averaging over assessors);
4. samples (averaging over assessors and replicates).

These options also apply to data analysis with PCA. In this paper Option 1 was chosen in order to examine the reproducibility of assessors. However, it is often more practical to use Options 2, 3 or 4, and it is usually advisable to perform these analyses in ascending order. This enables the user to evaluate the contribution to the variance of replicates, assessors and samples.

The next section illustrates in more mathematical terms the method of CA, but as most users of sensory profile analysis will be familiar with the practical application of PCA, it is well worth illustrating what is considered to be the fundamental property of CA. At this point the reader should recall that on performing PCA on sensory profile data, one derives sample scores which represent the position of the samples in a multivariate space. In addition, the principal component loadings are obtained, which represent the weighting (or importance) attached to each of the sensory attributes on the principal components (new dimensions).

The first step of CA is to calculate the row profile (R) and column profile (C) of the original matrix of data. This results in two new data matrices, R and C, as illustrated in the Appendix. PCA is then performed on the R and C matrices, and scores and loadings are derived in each case. The fundamental property of CA is that the principal component scores of R are proportional to the loadings of C, and similarly the principal component scores of C are proportional to the loadings of R. In practical terms, this allows samples and attributes to be represented simultaneously on the same plot.

One final practical point is that to perform CA, summing across the rows and columns must make sense. For example, it would not be sensible to sum across columns if the variables (attributes) were measured on different scales; pH, weight in grams, volume in cm3, etc.

Mathematics interpretation of CA

To understand in more detail how CA differs from PCA, the reader must first consider what these two techniques are doing to the data. Typically, in sensory profiling a matrix X is obtained, which comprises elements $x_{ij}$, where i (i = 1, 2, ..., n) is the ith row (sample) of the matrix X, and j (j = 1, 2, ..., p) is the jth column (attribute) of the matrix X. In PCA, the data $x_{ij}$ are column centred according to the formula $x_{ij} - \bar{x}_{.j}$. In other words, the mean of the data in the jth column ($\bar{x}_{.j}$) is calculated and subtracted from each of the elements in that column. This step is repeated for each of the p columns. However, this cannot be the case in CA since often the data $x_{ij}$ are frequency counts (positive integer figures). In cases where the data submitted to CA are not frequency-type, the $x_{ij}$ must be greater than zero. Another way of looking at this is to consider the $x_{ij}$ as 'the amount of something', and hence by definition cannot be negative.

In CA, both the row and column profiles of the matrix X are calculated, resulting in two new matrices; R denotes the matrix of row profiles, while C denotes the matrix of column profiles. In mathematical notation, R is derived by calculating $x_{ij}/x_{i.}$ for all i and j, while C is derived by calculating $x_{ij}/x_{.j}$ for all i and j. The $x_{i.}$ are the sums over the columns for each row i, while $x_{.j}$ are the sums over the rows for each column j. In sensory profiling, $x_{i.}$ is the global value of one row over all the attributes, while $x_{.j}$ is equivalent to the mean of attribute j. Performing CA means that differences between $x_{.j}$ are lost and overall differences between $x_{i.}$ (rows) are lost. This leads onto another point worth noting, namely that CA does not take account of large differences between the $x_{i.}$, whereas PCA will often separate low and high $x_{i.}$ on the first dimension.

Essentially, CA is equivalent to performing two PCAs; the first is a PCA of the matrix R, and then a PCA of the matrix C. Both these PCAs are performed using the $\chi^2$ metric distance, which is, in the PCA of R, the distance between the two line profiles i and i'. This can be written in the following expression:

$$d^2(i, i') = \sum_{j} \frac{x_{..}}{x_{.j}} \left( \frac{x_{ij}}{x_{i.}} - \frac{x_{i'j}}{x_{i'.}} \right)^2 .$$
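The row profiles, column profiles and chi-square distance described in this section can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names and the small ratings matrix are hypothetical, and $x_{..}$ is taken here as the sum of all elements of X (an assumption about the notation).

```python
def row_profiles(x):
    """R: each element divided by its row sum x_i."""
    return [[v / sum(row) for v in row] for row in x]


def column_profiles(x):
    """C: each element divided by its column sum x_.j"""
    col_sums = [sum(col) for col in zip(*x)]
    return [[v / col_sums[j] for j, v in enumerate(row)] for row in x]


def chi2_distance_sq(x, i, k):
    """Squared chi-square distance between the profiles of rows i and k."""
    total = sum(sum(row) for row in x)          # x_.. (assumed: overall sum)
    col_sums = [sum(col) for col in zip(*x)]    # x_.j
    ri, rk = sum(x[i]), sum(x[k])               # x_i. and x_k.
    return sum(
        (total / col_sums[j]) * (x[i][j] / ri - x[k][j] / rk) ** 2
        for j in range(len(col_sums))
    )


# Hypothetical ratings: 3 samples (rows) by 3 attributes (columns).
X = [
    [10.0, 20.0, 30.0],
    [20.0, 40.0, 60.0],  # same profile as row 0, at twice the overall level
    [30.0, 20.0, 10.0],
]
R = row_profiles(X)
C = column_profiles(X)
d01 = chi2_distance_sq(X, 0, 1)
d02 = chi2_distance_sq(X, 0, 2)
```

In this made-up example rows 0 and 1 differ only in overall level, so their chi-square distance is zero, which mirrors the remark that CA does not take account of large differences between the $x_{i.}$.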

An important point to note when comparing this with the classical PCA is that the weight $x_{..}/x_{.j}$ ($x_{..}$ is the grand mean) allows attributes with small values (low $x_{.j}$) to contribute to the distances between line profiles to the same extent as those attributes which have large values (high $x_{.j}$). In a sense, this is comparable to the normalized PCA (i.e. correlation matrix) in which the weight is the inverse of the variance.

One can prove that the principal component scores of R are proportional to the loadings of C. Thus, a number of useful points for interpretation emerge, as summarized in Table 1.

TABLE 1. Four Points for the Interpretation of Correspondence Analysis

Point  Description
1      The origin corresponds to the mean line profile and mean column profile. Thus, the importance of a point (sample, assessor or attribute) increases with its distance from the origin.
2      The closeness of two attributes (or two samples or two assessors) implies that they have similar column (or line) profiles.
3      The closeness of a sample and an attribute should be interpreted in terms of correspondence. This means that this sample received higher values than others for this particular attribute.
4      The further away two close points are from the origin, the stronger the implications.

Point 1 is self explanatory, but for Point 2 it is worth noting that it would be wrong to conclude that two samples have the same attribute amounts, even if they lie close together. PCA or GPA should be used to determine attribute amounts, which effectively indicates the perceived intensity of an attribute. Considering Point 2 in more detail it should be noted that distances between samples and between assessors are comparable. This implies that there is more 'between sample' variation than 'between assessor' variation if assessors lie close together and samples far apart. Clearly, it is generally desirable for the assessors to be close together at the origin, for replicates of the same sample to be close together, and for samples to be spread over the space.

Point 4 is a useful reminder that points close to the origin are contributing little to the derived space. For example, two attributes lying close together away from the origin can be interpreted with confidence, but if the same two attributes were to lie at the same distance from each other near the origin, then their possible correspondence would be treated with caution. These points should become clearer on working through the interpretation of data later in this paper.

Multiple correspondence analysis

MCA, like CA, is used to analyse data collected on both nominal and ordinal attributes, where the number of different responses for a given attribute is fixed. Further, the total number of responses over attributes is therefore fixed, and equal to the number of columns of the table of data to be analysed. MCA is in effect equivalent to performing PCA on more than two nominal variables. Thus, MCA is, in fact, a generalization of CA when the number (or category) of measured variables is equal to k, with k greater than two. To illustrate this, it is first necessary to appreciate that in CA the data are essentially a two-way contingency table. MCA is in fact analysing k × (k - 1) two-way contingency tables, based on all possible variable pairs. Using these two-way tables it is then possible to construct a Burt table (Lebart et al., 1984), which is a partitioned symmetric matrix of all pairs of two-way contingency tables.
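As a rough sketch of how such a Burt table can be assembled, the code below builds a 0/1 indicator matrix Z (one column per category level, one row per individual) and forms the cross-product Z'Z. Building the Burt table as Z'Z is a common construction that is assumed here; the function names and the three records are hypothetical illustrations.

```python
def indicator_matrix(records, levels):
    """One 0/1 column per level of each variable, one row per individual."""
    rows = []
    for rec in records:
        row = []
        for var, var_levels in levels.items():
            row.extend(1 if rec[var] == lev else 0 for lev in var_levels)
        rows.append(row)
    return rows


def burt_table(z):
    """Cross-product Z'Z: every pairwise two-way table in one symmetric matrix."""
    n_cols = len(z[0])
    return [
        [sum(r[a] * r[b] for r in z) for b in range(n_cols)]
        for a in range(n_cols)
    ]


# Category levels for three variables (column order follows this dict).
LEVELS = {
    "age": ["A1", "A2", "A3", "A4"],
    "sex": ["M", "F"],
    "social_class": ["A", "B", "C1", "C2", "D", "E"],
}

# Hypothetical individuals, for illustration only.
records = [
    {"age": "A2", "sex": "M", "social_class": "C2"},
    {"age": "A1", "sex": "F", "social_class": "B"},
    {"age": "A2", "sex": "F", "social_class": "C2"},
]

Z = indicator_matrix(records, LEVELS)
B = burt_table(Z)
```

Each row of Z sums to the number of variables (three here), the diagonal of B holds the count of individuals at each level, and the off-diagonal blocks are the two-way contingency tables between pairs of variables.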

The MCA program of SAS can then be used to analyse this table. However, if the user constructs a partitioned symmetric design matrix (PSDM) of 0s and 1s, then submitting this PSDM to a simple CA program is equivalent to performing MCA on a Burt table. In fact, the Burt table is the inner product of the PSDM. To obtain the PSDM the user must have access to the raw individual data. To illustrate this, consider how a simple example would look where there are three categorical variables; age, sex and social class. Age has four levels (A1, A2, A3 and A4), sex has two levels (M and F) and social class has six levels (A, B, C1, C2, D and E). For five individuals the first five lines of the PSDM will take the form shown in Table 2. For example, the first line indicates that the individual is in age group A2, is male and of social class C2. One important feature to note is that the sum of each row should equal the total number of variables, in this case three.

TABLE 2. Example of Data Format for Performing Multiple Correspondence Analysis with a Correspondence Analysis Program

Columns: Individual | Age group (A1, A2, A3, A4) | Sex (M, F) | Social class (A, B, C1, C2, D, E)

Another aspect of performing MCA on sensory data is the conversion of data on each variable from its original scale (e.g. 100 mm line scale, 9-point category scale) into a smaller number of representative categorical variables. This is achieved by constructing a histogram for each variable and using the shape of the distribution to aid in the selection of the number of categories, as will be illustrated later in this paper. In practice, between 2 and 5 new categories are selected. Suppose that a 100 mm line scale for sweetness is divided into three categories, where observations rated 0-20 were allocated to the first category, 21-50 were allocated to the second category and 51-100 were allocated to the third category. Then the attribute sweetness is effectively split into three new attributes, which for convenience can be called SW0, SW1 and SW2. After performing MCA the three levels of sweetness are plotted together with levels of the other attributes from the profile being analysed. A directional line can then be drawn to join SW0 to SW1, and SW1 to SW2. Thus, performing MCA on profile data allows the path of each attribute to be followed through the sample space.

MATERIALS AND METHODS

Background to data

The data used were from a conventional profile of eight commercially available strawberry jams, conducted at Campden Food and Drink Research Association. In this profile, twelve trained sensory assessors rated the intensity of eighteen attributes (Table 3) for each of the jam samples in triplicate using an unstructured line scale. The attributes used were described and defined over a number of training sessions.

Principal component analysis

Principal component analysis (PCA) (e.g. Chatfield & Collins, 1980), based on the covariance matrix, was applied to the complete data set. In other words, a matrix of 288 rows

28

JEAN

TABLE Jams,

A. McEWAN,

3. Attributes

PASCAL

SCHLICH

used to Describe

and their Abbreviations.

relate to the number

of categories

in parentheses

Abbreviation

(4)

Acid (3) Caramelized

(2)

in increments

CAR

axis the frequency

MUS

of strawberry of fruit (3)

Synthetic/perfumy Bitter

(3)

into each of the 12 categories.

fell

The original

SOS SOF

values. SAS was used to perform

to take category the MCA.

JEL STE

RESULTSAND DISCUSSION

THI SEE

Seedy (3) Gelatinous

GEL

(3)

Mouthcoating

Bitter

which an observation

data were then transformed

(2)

(4)

MC0

(3)

SMO

(3)

Throat

of 5 units, and the horizontal

BIT (2)

Stewed texture

Smooth

where the

the 60 mm line scale

SYN

(2)

(2)

Jelly-like Thick

analysis

as histograms,

SWE

Musty

(2)

Data were plotted

AC1 RIP

Strength

correspondence

vertical axis represented

Over ripe (2) Strength

Multiple

used in MCA)

Attribute Sweet

the Strawberry

(Figures

catching aftertaste

TCA

(3)

BAT

(2)

(12 assessors x 8 samples x 3 replicates) columns

(attributes)

was

analysed.

and attribute plots were obtained. was used to perform

by 18 Sample

Figure 1 shows the sample and attribute

plot

from

cor-

PCA,

while

responding

plot

aging the principal

PCA.

consensus

Procrustes analysis

sources

Generalized Procrustes analysis (GPA) (Gower, & Hallett, 1990) was applied to

the complete sample

and

data set, to derive a consensus attribute

individual

corresponding GENSTAT

plot,

as well

as the

assessor

plots.

was used to perform

of

variation

the GPA.

The attribute measuring

the

GPA.

The

in

GPA

the

is the

for

data

three

through

rotation/reflection.

plot for PCA was obtained

the correlation

averaging.

scores across

adjusting

and

scores and the original

between

by

the PC

attribute

ratings, after

This same procedure

was used on

the GPA data to obtain an attribute

plot for

each

ease

individual

presented

analysis

component after

scaling

presentation

Correspondence

from

the plot from

derived

translation, 1975 ; McEwan

2 shows

sample plot from PCA was derived by aver-

MINITAB

assessors, whilst

Generalized

Fig.

derived

assessor,

but

for

of

the consensus of these values are

in Fig. 2. The sample scores were

scaled to enable both samples and attributes to fit on the same plot. The first letter of the

Correspondence

analysis

was performed

on

data of the same form as that used for PCA

attribute name is its location It is evident

in space.

that the sample spaces from

and GPA. This allows a consensus sample and

these methods are very similar, though there is

attribute

a more distinct separation of Samples A, D and

individual perform

plot

to

sample the CA.

be

derived,

plots.

SAS

as well

as

was used to

F in the

GPA

plot,

possibly

due

variation in the data after adjustment

to

less

through

translation and scaling. The attribute plots are also similar, though there are a few differences worth

noting.

contribute samples

very in

Firstly, little

smooth to the

and jelly-like

separation

of the

PCA since they are both near the

origin. Bitter flavour and aftertaste contribute less to the GPA space than that obtained from PCA. Other differences can be seen in the contribution to attributes on individual dimensions. For example, strength of strawberry is important on the first dimension of the GPA space and contributes a little to the second dimension, but only really contributes to the first dimension of PCA. Subsequent dimensions were examined, but as they added little to the interpretation they are not included in this discussion.

[Fig. 1. Sample and attribute plot derived from principal component analysis; sample scores averaged over assessors. Horizontal axis: PC 1 (31%).]

[Fig. 2. Consensus sample and attribute plot derived from generalized Procrustes analysis. Horizontal axis: PA 1 (64%).]

The amount of variance accounted for by the first dimension of the PCA (30%) and GPA (64%) spaces differs considerably. This is due to the way in which the data are treated using both methods. In PCA, the analysis is performed on a matrix where the observations (rows/objects) comprise samples, replicates and assessors. Thus, the derivation of the principal components includes variation attributed to each of these three sources. However, in the case of GPA, each assessor's space is matched through the steps of translation, scaling and rotation and reflection, which has the effect of eliminating assessor to assessor variation. To understand this better, suppose analysis of variance is performed on each principal component, specifying 'assessor' and 'sample' as the two factors. Then this analysis of variance would show both between assessor variation and between sample variation. However, if the corresponding analysis was performed on the principal axes obtained after GPA, then the between assessor variation would be close to zero. This means that while the consensus plots from the two methods show the same visual structure, the variance structure is different. If PCA were performed using data averaged across assessors, after eliminating the assessor effect, then a similar variance structure to GPA is usually obtained.

[Fig. 3. Combined sample and attribute plot derived from correspondence analysis, with average assessor positions represented. Horizontal axis: Dimension 1 (33%).]

Figure 3 shows the combined sample and attribute plot derived from correspondence analysis. The sample positions were obtained by averaging over assessors. These show a similar structure to the PCA and GPA plots, with the percentage variation explained more closely resembling the PCA variance than that from GPA. This latter observation is due to the similarity of PCA and CA in that they are both subject to between assessor variation.

Using CA the samples and attributes can readily be plotted on the same diagram (see Table 3 for abbreviations). To interpret this plot (Fig. 3) the points raised in Table 1 should be remembered. First consider correspondence between samples and attributes; for example Samples A, D and F have a high correspondence with acid (ACI) flavour, while Sample G shows a strong correspondence with mouthcoating (MCO). Many of the attributes lie away from the samples, but these still indicate the direction of increasing correspondence with these attributes. For example, strength of flavour (SOF) is increasing along Dimension 1, which suggests that Samples A, D and F were perceived to have greater flavour strength than the other five samples. Another example is synthetic (SYN), which increases towards the top left hand corner of Fig. 3, suggesting that Sample C was perceived as more synthetic than the others.

The numbers 1-12 on Fig. 3 are the calculated average points of each assessor. Ideally, all assessors should cluster around the origin, indicating agreement. However, Assessors 8, 12 and 11 are slightly further out from the rest, suggesting that their data are different from the others with respect to their correspondence with the attributes. This was not reflected by examination of the assessor plots, assessor residuals and percentage variance from the GPA.

Prior to multiple correspondence analysis (MCA), histograms were drawn for each of the sensory attributes. There were three general shapes to these histograms, as illustrated in Fig. 4. Based on these histograms, each attribute was transformed to categorical data. Those attributes with a distribution similar to Fig. 4(a) were converted to data with two category levels, Fig. 4(b) to three category levels and Fig. 4(c) to four category levels. The number of category levels for each attribute is given in Table 3 (figures in parentheses).

[Fig. 4. General shapes of histogram used to determine the number of category levels formed for each attribute. Panels (a), (b) and (c); horizontal axis: Attribute Intensity.]

Figure 5 shows the sample space derived after MCA. As before, the three replicate positions of each sample are joined to form a triangle. On this two-dimensional picture, all of the samples are separated with the exception of A and D. Also, in common with the other methods is the cluster of A, D and F on the positive side of the first dimension, with the other samples on the negative side. However, the pattern of Samples E, C, H, B and G along the second dimension of Fig. 5 is different from the other plots. The reason for this can be found on examining Fig. 6.

Figure 6 shows the path of a number of the attributes through the jam space. The number of points on the path of each attribute corresponds to the number of category levels allocated to it.
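The conversion of line-scale ratings into a small number of category levels can be sketched as follows. This is a minimal illustration only: the cut-points are the illustrative 0-20, 21-50 and 51-100 ranges given earlier for a 100 mm sweetness scale, not the cut-points actually derived from the jam histograms, and the function names and ratings are hypothetical.

```python
from bisect import bisect_left


def to_category(score, upper_bounds):
    """Index of the first category whose upper bound is >= score.

    Scores above the last bound fall into the final category.
    """
    return bisect_left(upper_bounds, score)


def categorize(scores, upper_bounds, prefix):
    """Label each score with its category level, e.g. SW0, SW1, SW2."""
    return [f"{prefix}{to_category(s, upper_bounds)}" for s in scores]


# Assumed cut-points: 0-20 -> level 0, 21-50 -> level 1, 51-100 -> level 2.
SWEET_BOUNDS = [20, 50]

# Hypothetical sweetness ratings on a 100 mm line scale.
ratings = [5, 20, 21, 50, 51, 100]
labels = categorize(ratings, SWEET_BOUNDS, "SW")
```

Each labelled level then becomes its own 0/1 column in the data submitted to MCA, so that the levels of one attribute can be joined into a path on the plot.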

To illustrate the reason for the different sample positions on Dimension 2, consider the attribute 'thick'. In each of the previous methods, 'thick' has been positioned on the bottom left-hand quadrant of the plot, where Sample G has been positioned. However, in Fig. 5 from MCA, Sample G is in the top left-hand quadrant of the plot. Now by examining Fig. 6 it is evident that the path of 'thick' (THI0-THI1-THI2-THI3) reaches its maximum in the top left-hand quadrant of the space. Thus, MCA is taking a slightly different view of the interpretation of the profile data. This same phenomenon was observed on tracing other attributes, such as 'mouthcoating' (Fig. 6) and 'strength of strawberry' (not shown on Fig. 6). Conversely, the attributes caramel, over-ripe and stewed provide less information in the MCA than they did using PCA or CA.

[Fig. 5. Sample plot derived from multiple correspondence analysis, with average assessor positions represented. (Lower-case letters represent centroid position of sample.) Horizontal axis: Dimension 1 (16%).]

[Fig. 6. Path of selected attributes derived from multiple correspondence analysis. (Lower-case letters represent centroid position of sample.) Horizontal axis: Dimension 1 (16%).]

The results of the MCA (including those not represented in Fig. 6) can be summarized in terms of the diagram in Fig. 7, which represents the path of the main attributes on the MCA space. This illustrates the Guttman, or horseshoe, effect (Benzecri, 1973; Greenacre, 1984) as represented by the steeply dipped curve. This phenomenon is quite common in correspondence analysis and other multidimensional scaling methods. It reflects that, while the derived dimensions are linearly orthogonal to each other, they can also be related in a nonlinear way (Greenacre, 1984). There is also a shallower horizontal dimension across the diagram, representing sweetness and strawberry on the left and fruit and acid on the right. Musty takes a shorter path than the other attributes along the horizontal direction of the diagram, whilst bitter and synthetic follow a path on the vertical of the diagram.

[Fig. 7. Summary of main points from multiple correspondence analysis attribute path plot, illustrating the Guttman effect. Attributes shown: throat catching, mouthcoating, gelatinous, thick, sweet, strawberry, fruit, acid, musty, bitter, synthetic.]

These results indicate that MCA is taking a different view of the data from the other techniques discussed in this paper. This is not to say that MCA is more 'correct' or 'wrong', merely that it approaches the interpretation from a different and interesting angle. The idea of tracing an attribute path through a perceptual space certainly has its attractions for the interpretation of profile data, and those with access to an MCA program are encouraged to make use of this approach as complementary to the more traditional methods of analysing such data.

On the question of individuals, MCA like CA is not as flexible as GPA in dealing with this. However, it can indicate potential 'odd' or 'different' assessors by calculating an

34

JEAN

TABLE

A. McEWAN,

PASCAL

4. Hotelling-Lawley

Accounted

SCHLICH

and Fisher Statistics

for the Different

Multivariate

Methods,

and Percentage

Variation

for in the First Two Dimensions Fisher value” Method

HLd

Dim

1

% Variation

Dim 2

Dim

1

Dim 2

GPA

232.8

401.4

79.7

64.4

PCA

173.6

316.3

342

30.8

12.4

PCA”

203.9

337.1

106.9

75.3

14.2

CA

116.3

205.2

53.8

28.4

13.4

CA”

171.2

249.1

113.4

73.1

142

MCA”

118.1

241.7

13.8

15.7

8.5

a PCA on data averaged

11.6

over assessors (24 rows by 18 columns).

b CA on data averaged over assessors (24 rows by 18 columns). ’ With

MCA,

the categorization

’ Fisher approximation

does not allow data to be averaged over assessors.

with 14 and 18 d e g rees of freedom.

e Fisher with 7 and 16 degrees of freedom.

Critical

average

Figure

position

for each person.

5

shows the twelve assessors in a similar position to that obtained both

CA

from the CA (Fig. 3). Thus,

and MCA

highlighted

Critical

value at level 0.001 is 487.

value at level 0001

the same

is 6.46.

(ANOVA)

was calculated

based on 24 samples

replicates).

the consensus. It can be concluded that Assessor

average

6 is sensitive

computed.

wrong

to the attribute

11 is not. to

suggest

bitterness,

However, an

it would

association

Assessor 8 and jelly-like

but be

between

since they lie within

the central (origin) region of the plot. Returning

to the GPA,

it is worth

sidering another point, having completed terpretation

in-

using the four methods. In Fig. 2,

the attributes

smooth, jelly-like

have greater

importance

and seedy all

as a result of GPA

than they have using the other methods. In the GPA,

smooth

and jelly-like

related to synthetic the

are

strongly

which may suggest that a

number

of

between

these attributes.

panel

find

For

a relationship

This was not picked

Table

value, the better

are highly

the

and 24

was given

the were

approxi-

the discrimination

between

and Fisher values

assuming that the test

assumptions are true. This indicates that some of the samples are different, and that assessors are able

to evaluate

multivariate important

context.

discriminant results

these

power

suggest

criminating

in a

the

most

4 is to compare

of each method.

that

GPA

is the

the

These

most

dis-

of the methods, most likely due to

the rotation/reflection

aim

differences

However,

use of Table

that it is better

Having looked at each of the four methods

MCA

samples

4 gives a Fisher

significant,

before

on the same data set, consideration

CA

of

and 3

samples is in the first sample plot.

up by the other methods, and, hence is worth GPA.

(8 jams

mation for the HL statistic. The higher this F

noting as an advantage

of applying

PCA,

locations

All the Hotelling-Lawley con-

These

statistics are all measures of sample discrimination,

assessors as being different in some way from

Assessor

for the first two

dimensions derived from each method.

step. It is also evident

to average

performing

across assessors

either PCA

is to separate

the

possible. Alternatively,

or CA if the as well

as

the standardization

samples

of

to the question of what is the best method to

each assessor separately is also likely to result in

use. Detailed investigation

of this is outside the

good

discrimination

scope of this paper, but Table 4 provides some

PCA

is more

information

MCA,

for consideration.

The Hotelling-

between

discriminating

probably

due

Lawley (HL) statistic (Freund et al., 1986) was

normalization

calculated

balance of discriminant

in

association

analysis of variance Fisher

statistic

with

multivariate

(MANOVA),

from

analysis

while the of

variance

Dimension

to

the

the than line

samples. CA

or

profile

of the CA methods. In terms of information

1 and Dimension

between

2, as indicated by

the Fisher value, the CA on the averaged data

is the best in this respect. This means that CA is the method which leads to the best two-dimensional discrimination. The percentage variation accounted for using MCA is much lower than the other methods, though this is a well-known phenomenon. However, it is worth remembering that MCA is the only multivariate method of this sort which is able to cope with nominal data, while CA, PCA and GPA are theoretically more correct to use with ordinal and interval data.

ACKNOWLEDGEMENTS

The authors would like to thank CFDRA and INRA for financial support in making it possible to analyse the data in depth, and the British Council and the French Ministry of Foreign Affairs for providing travel funds to allow the authors to work together and exchange ideas.

REFERENCES

Benzécri, J. P. (1973). L'Analyse des Données. Tome 2: L'Analyse des Correspondances. Dunod, Paris.

Chatfield, C. & Collins, A. J. (1980). Introduction to Multivariate Analysis. Chapman and Hall, London.

Danzart, M. (1983). Evaluation of the performance of panel judges. In Food Research and Data Analysis, ed. H. Martens & H. Russwurm. Applied Science, London, pp. 305-319.

Freund, R. J., Littell, R. C. & Spector, P. C. (1986). SAS Systems for Linear Models. SAS Institute, Inc., North Carolina.

Gower, J. C. (1975). Generalized Procrustes analysis. Psychometrika, 40 (1), 33-51.

Greenacre, M. J. (1984). Theory and Applications of Correspondence Analysis. Academic Press, London.

Guichard, E., Schlich, P. & Issanchou, S. (1990). Typicality of apricot aroma: Correlations between sensory and instrumental data. J. Food Sci., 55 (3), 735-738.

Lebart, L., Morineau, A. & Warwick, K. M. (1984). Multivariate Descriptive Statistical Analysis: Correspondence Analysis and Related Techniques for Large Matrices. John Wiley and Sons, New York.

McEwan, J. A. & Hallett, E. M. (1990). A Guide to the Use and Interpretation of Generalized Procrustes Analysis, Statistical Manual Number 1, CFDRA.

O'Mahony, M. (1986). Sensory Evaluation of Food: Statistical Methods and Procedures. Marcel Dekker, New York.

Tomassone, R. & Flanzy, C. (1977). Présentation synthétique de diverses méthodes d'analyse de données par un jury de dégustateurs. Ann. Technol. Agric., 26, 373-418.

Van Buuren, S. (1987). Using multiple correspondence analysis in sensory quality research. Chemistry and Industry, July, 447-50.

Van der Burg, E. & Dijksterhuis, G. B. (1989). Nonlinear canonical correlation analysis of multiway data. In Multiway Data Analysis, ed. R. Coppi & S. Bolasco. North Holland, Amsterdam.

STATISTICAL PACKAGES

GENSTAT (1983). Release 4.04. Numerical Algorithms Group Limited.

MINITAB (1990). Data Analysis Software. Release 7.2. Reference Manual. Minitab Inc., PA, USA.

SAS (1988). SAS Technical Report P-179: Additional SAS/STAT Procedures, Release 6.03. SAS Institute Inc., North Carolina.

APPENDIX

To illustrate the method of calculating the row and column profiles of a matrix of data, consider the following raw data. In this case the raw data is taken to be the mean sample score for each attribute.

Raw data

                                          Attribute
Sample    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18  Total
A        20   30    1    7    8   15   31    6    4   17   28   22   17   13    8   17    9    6    259
B        44   14   12    7    2   26   11   24    7    8    5   40   18   33   29   17   25    8    330
C        44   13    8    6    2   20    9   35    9   26    4   31    9   20   23   35   23   10    327
D        21   28    4   14   11   13   32    6    7   22   24   15   16    8    7   26   12    6    272
E        34   18    5    7    4   24   17   13   15   14   14   32   19   21   20   20   27   16    320
F        16   36    2   12   12   12   34    7    7   19   30   18   17    8    7   23   13    8    281
G        45   16   14    7    0   34    6   10    6    3    4   48   18   40   30   16   25    6    328
H        45   16   15    7    1   28   13   14    9   11    9   28   14   24   26   22   28    9    319
Total   269  171   61   67   40  172  153  115   64  120  118  234  128  167  150  176  162   69

To calculate the value of Sample A, Attribute 1 for the line profile matrix, the raw data value of 20 is divided by the row sum for Sample A, which is 259. This result is then multiplied by 100 to give the line profile value of 7.7. This process is repeated for all data values.

To calculate the value of Sample A, Attribute 1 for the column profile matrix, the raw data value of 20 is divided by the column sum for Attribute 1, which is 269. This result is then multiplied by 100 to give the column profile value of 7.4. This process is repeated for all data values.
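The line and column profile calculation described in this appendix can be sketched in a few lines of Python. This is only an illustrative sketch, not the software used by the authors: the names `RAW`, `line_profiles` and `column_profiles` are hypothetical, and the scores are transcribed from the appendix raw data table (samples A to H by attributes 1 to 18).

```python
# Mean sample scores transcribed from the appendix raw data table
# (rows: samples A-H; columns: attributes 1-18). Names are illustrative only.
RAW = {
    "A": [20, 30, 1, 7, 8, 15, 31, 6, 4, 17, 28, 22, 17, 13, 8, 17, 9, 6],
    "B": [44, 14, 12, 7, 2, 26, 11, 24, 7, 8, 5, 40, 18, 33, 29, 17, 25, 8],
    "C": [44, 13, 8, 6, 2, 20, 9, 35, 9, 26, 4, 31, 9, 20, 23, 35, 23, 10],
    "D": [21, 28, 4, 14, 11, 13, 32, 6, 7, 22, 24, 15, 16, 8, 7, 26, 12, 6],
    "E": [34, 18, 5, 7, 4, 24, 17, 13, 15, 14, 14, 32, 19, 21, 20, 20, 27, 16],
    "F": [16, 36, 2, 12, 12, 12, 34, 7, 7, 19, 30, 18, 17, 8, 7, 23, 13, 8],
    "G": [45, 16, 14, 7, 0, 34, 6, 10, 6, 3, 4, 48, 18, 40, 30, 16, 25, 6],
    "H": [45, 16, 15, 7, 1, 28, 13, 14, 9, 11, 9, 28, 14, 24, 26, 22, 28, 9],
}


def line_profiles(table):
    """Express each value as a percentage of its row (sample) total."""
    return {
        sample: [100.0 * value / sum(row) for value in row]
        for sample, row in table.items()
    }


def column_profiles(table):
    """Express each value as a percentage of its column (attribute) total."""
    col_sums = [sum(col) for col in zip(*table.values())]
    return {
        sample: [100.0 * value / total for value, total in zip(row, col_sums)]
        for sample, row in table.items()
    }


lines = line_profiles(RAW)
cols = column_profiles(RAW)

# Sample A, Attribute 1: 20 / 259 * 100 and 20 / 269 * 100
print(round(lines["A"][0], 1))  # 7.7
print(round(cols["A"][0], 1))   # 7.4
```

Each line profile sums to 100 across the attributes of a sample, and each column profile sums to 100 across the samples for an attribute, which gives a quick check on the transcription of the table.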