A comparison of imputation techniques for internal preference mapping, using Monte Carlo simulation

A comparison of imputation techniques for internal preference mapping, using Monte Carlo simulation

ELSEVIER 0905-3293(95)00030-S ACOMPARlSONOFlMPUTATlONTECHNlQUESFOR lNTERNALPREFERENCEMAPPING,USlNG MONTECARLO SIMULATION Duncan Hedderley and Ian ...

2MB Sizes 0 Downloads 51 Views

ELSEVIER

0905-3293(95)00030-S

ACOMPARlSONOFlMPUTATlONTECHNlQUESFOR lNTERNALPREFERENCEMAPPING,USlNG MONTECARLO SIMULATION Duncan

Hedderley

and Ian Wakeling

Institute of Food Research, Earley Gate, Whiteknights Road, Reading RG6 6BZ, UK.

(Received 26 October 1994; accepted 12 May 1995)

number

ABSTRACT

of products

columns

as vectors

space,

The usual algorithm for internal preference mapping requires a complete set of observations, meaning the technique cannot be used to analyse trials based on incomplete block designs.

such

of consumers),

and the

points

scalar

the rows and

in multidimensional

products

of

are as close to the original

the

points

data matrix

as

possible. The usual way of calculating is

to

use

requires product

for imputing missing values under various conditions. Sets of simulated preference data with dqfwent character-

jects

istics were constructed. Monte Carlo simulation was used to create missing observations in these sets; the imputation

a

Singular

these vectors

Value

and points

Decomposition,

which

a complete data matrix (e.g. scores for each from each subject). This has meant that sub

with the occasional

score which

by accident

have had to be discarded

it has also

meant

that

internal

has been

missed

from the analysis;

preference

mapping

could not be used to analyse trials which were designed

techniques were applied to the data; and the results of

to be incomplete

preference mapping based on the imputed data compared to those from the complete data set. Convergence problems were found with two techniques.

(for instance,

where

was able to assess all of the available

no one subject

products

because

of fatigue effects). As part

Analysis of variance revealed that effects on performance were dominated by the proportion of data missing, the

potential

level of noise in the data, and the size of the data set.

based

of a MAFF/industrial

LINK

strategies

this

identified:

Daffmences in performance among the three convergent imputation techniques were small; mean substitution is recommended, as it performed as well as more complex iterative techniques.

either

on

Maximisation Analysis

solving values

a technique

Algorithm

(Beale

1982);

the

to the

Kruskal’s

fit

original

of this approach

The study focused

Keywords: Preference mapping; missing values; imputation; simulation; expectation-maximisation algorithm; row-column substitution algorithm; MISTRESS algorithm;

for the number

PRINQUAL procedure; mean substitution.

approaches

were

the

missing

like

the

Expectation-

& Little,

1975;

data Demp-

form of Homogeneity

available

for handling

(similar missing

to data

1964. An example

and Zamir,

on techniques

which just

data

scaling; Kruskal,

is Gabriel

three

or an approach

suggestion

in multidimensional

project, problem

for

or on a modified

(Meulman,

maximised

1979).

for imputing

values

missing data points, because programs for a of imputation techniques were either already

available Gabriel

to

substituting

ster et al. 1977),

The results were broadly confirmed by a similar study on a genuine set of prefmence data.

in SAS or easily written, (modified

and Zamir’s

whereas

Homogeneity

weighted

the other

Analysis

least-squares)

or

were not.

A Monte Carlo simulation study was planned to compare the performance of the different imputation techniques under a range of conditions typical of

INTRODUCTION AND LITERATURE SURVEY

Preference conditions

Internal Preference Mapping is a variation on the biplot (Gabriel, 1971; Krzanowski, 1988a). Working from

adequately

a matrix

complete

preference

that

and vectors

A simulation study was carried out to compare techniques

of data (for instance,

given by a sample

its aim is to find a way of representing

scores

for a 281

Mapping studies, and to find under the techniques gave solutions which close to the results obtained data sets.

what were

from analysing

282

D. Hedderlty, I. Wakeling

MISSING _ The

VALUE

five imputation

METHODOLOGIES

techniques

chosen

were the Expectation-Maximisation & Little,

1975);

made iterative:

Krzanowski’s

Algorithm

198813);

the

values

for the study

Row-Column

(Beale

the

solution

data

until the imputed

Several rithm.

SAS PRINQUAL

(Krzanowski,

algorithm algorithm

1989, 1990); the MISTRESS (SAS Institute, (van Buuren & van Rijckevorsel, 1992); and (Bello,

to complete

repeated

problems

The

matrix.

are arbitrary,

the

is an

to predict

sufficient

statistics

of)

does this by regression; for variable known,

iterative

technique

a multivariate

if subject

data

the algorithm

set. It

S has a missing

x, but s’s values for variables

then

other,

missing values in (or at least, value

xi to X, are

would use all the records

with known values for xi and 3 to X, to form a regression equation

for x, and then use that to predict

x, for subject for each

S, given S’s values of xi to x,. This is done

missing

algorithm),

value

(the

Expectation

and then maximum

the sufficient

statistics

statistics.

The

iteration

until it converges

algorithm

iterative

updated

goes

through

(1975)

Dempster

normal

of Buck’s

eventually point,

have proved

converge

so

long

although

‘the

as rate

slow’. Outlining algorithm,

algorithm

100 iterations;

(Buck,

a matter

vector

log

(1983)

program

and Little

were

often

as an 1960))

(1975)

(cited

algorithm

likelihood

of convergence

enough,

can

in will

or a saddle-

Krzanowski

Row Column (1988b)

to implement

on sub matrices

achieved analytically; however, with multiple missing values,

if the matrix

that one of the singusign, compared

values

be updated

at each

unstable.

appears

to the To over-

comparing

to reference

the signs

left

and

right

to have changed

by -1. These

reference

iteration

the

from an SVD

sign it

matrices

to take

must

account

of the

updating of the imputed values in the data matrix. A second problem is that, if the estimation is based on a fixed number of the

lower

heavily

influenced

estimates.

of dimensions

dimensions

This

can cause

cycle,

preventing

check

on

which

reduces

the

convergence

by the

most

dependence

mates

in some cases, some

may be essentially recent

on the

the imputations convergence. rate

the number

value

recent

esti-

to get trapped

in a

To

most

overcome

of convergence

criterion

random,

missing

this,

a

was introduced,

of dimensions

used if the

is not improving.

PRINQUAL The

one

PRINQUAL

1989,199O)

the

a limit

of

study

problem

which

on qualitative ance

procedure

covers

method

perform data.

The

SAS

Minimum transforms

regression

The SAS/STAT method although

being

Generalised

Variand

the transforma-

the determinant

matrix

of the

manual

gives an example

it is accompanied

tech-

analyses

the variables,

to update

used to estimate

Institute

scaling

components

with the aim of minimising covariance

(SAS

of optimal

principal

iteratively

then uses multiple

in

a number

transformed

of the MGV

values for missing

by warnings

of

variables. data,

that the tech-

nique may get stuck in a degenerate solution where all non-missing values are merged into one category.

MISTRESS

a technique

which

the row or the column containing For a matrix with a single missing

elements

is then multiplied

the

for imputing

a missing value in a multivariate data set by reconstructing the data matrix from the result of Singular Value Decompositions

matrix

matrices;

tions,

Substitution

presents

there is a possibility the imputed

SVD. Since

vectors

this a check was introduced,

of the

be painfully

171 iterations’.

Krzanowski’s

come

is bounded;

suggest but

the algo-

of programming:

from another

will have the wrong

making

niques

they add that ‘in our simulation

10 iterations required

the

for

assumptions. Wu

in Krzanowski’s

but Beale

to a maximum

a computer

Beale

suggested

that the E-M

either

of

1987,

is

used in calcu-

distribution;

and

et al. (1977)

1991)

this process

show that it can also be seen

refinement

for

sufficient

(see Little and Rubin,

which does not make distributional Rubin,

imputations

had already been

data with a multivariate

of

from the filled-in

of the actual technique

This approach

and Little

estimates

data, which in turn give refined

for a description

of the

step). These sufficient statistics

can then be used to produce the missing

part

likelihood

are computed

data set (the Maximisation

lation).

the value of

process to a limit.

the algorithm calculates imputed values using the left singular vector from one SVD on the data matrix, and

lar vectors

Expectation-Maximisation which attempts

This

highlighted

signs within each pair of singular

Expectation-Maximisation

for each

while programming

first was purely

the right singular

1993).

is found

for the other missing

values converge

not

paper were encountered

Substitution

algorithm

Mean Substitution

an analytical

missing value, using the estimates

exclude

either

the missing value. value, this can be

to extend it to matrices the technique must be

The

MISTRESS

evorsel,

1992)

algorithm

(van Buuren

is a technique

for

& van Rijck-

imputing

missing

values in categorical data such that the internal consistency of the resulting data set is maximised. In other words, given an observation with a missing value in a set of categorical data, their technique replaces it with a category which is ‘typical’ data set.

of similar observations

in the

Comjxwison of Imputation Techniques for Internal Preference Mapping Technically, maximise

the

At each iteration, each

category

spects

algorithm

iteratively

squared

correlation

Guttman’s

missing

data

its imputed

correlation

to

coefficient.

scaled values are found

of each variable;

each

changes

optimally

attempts

the algorithm

point,

category

and

then

recovered

in-

Bello mance

the squared

values

the algorithm

treats the data as categorical, compared

to other

data.)

The

make any a priori

MISTRESS

assumptions

algorithm

about

cannot

data.

Using

and

he studied

of means

Bello

concluded

different that

(such as mean substitution) sophisticated

these from the data set itself. This means

that the data

E-M

or Krzanowski’s

high, and the proportion large.

Van

Buuren

MISTRESS

structure;

of missing

adequately

or

data not to be too

and van Rijckevorsel

performs

the

must be moderate suggest

that

with up to 10% miss-

number

better

consistency

centrate

of O-6 or higher;

for adequate Despite

consistency

for

20%

than 0.75

missing

problem,

the MISTRESS

in the study because

applicable,

and because

algo-

it was (at least

SAS code

Mean Substitution missing the

mean

Although structure

score

for

the

which

variable

preference

techniques, other

(e.g.

which

more

and

with 1993).

assumes and

as a ‘control’

sophisticated

niques were expected

Bello,

to the other

are all iterative,

It was included

them

and ignores

mapping

data, it is also quick compared mented.

to replace

this is very simple-minded

the

is in the

sumer

method,

which

time-consuming

tech-

preference

imputation

between

plete’

of data sets typical of a con-

to ‘missing’.

techniques

preference

mapping values.

preference

using a number

performed.

In the field of Multi-Dimensional

Data Sets

proposals under

for

the

different

a number

of

studies to investigate

how

recovery

data

of missing

conditions.

For

instance,

Spence and Domoney (19’74) studied whether the technique performed better when the specific comparisons which were missing systematic

pattern;

occurred

at random

and Whelehan

or followed

et al. (198’7)

studied

a

A preference

and observations run

on the

carried

of measures.

were then

from

compared

with

the complete

This process

build up a distribution

of how the different

features

set to ‘missing’,

of the data set were chosen

of the preference

and the level of noise in the data.

to

techniques

for manipula-

of subjects, of stimuli,

the dimensionality

data set,

was repeated

observations

tion: the number the number

set

obtained

50 times with different

Four

‘incom-

out on each

configurations

mappings

data

at ran-

values were saved, and a

was then The

These

(chosen

Each of the different

was then

data set; the imputed

COMPARING THE PERFORMANCE OF THE METHODS: SIMULATION TESTING

Kruskal’s

the preference

STUDY

the ‘true’ results from analysing

perform

substitu-

would be moderate

could be compared.

taken,

were changed

imputation

of imputed

Scaling,

techniques

like mean

test were constructed.

techniques

sets were then dom)

imputation easily imple-

to outperform.

have used simulation

all

to con-

mapping was done on the complete data sets to get a set of ‘true’ results against which the results from the

these

papers

under

in most situations.

OF THE

For the study, a number which has been used to deal with

values in data has been

iterative

that there

products

ones, although

the decision

techniques

levels of correlation

scores for different

OUTLINE technique

guided

moder-

techniques

superior

complex,

to simpler

were

the iterative

for it was

available.

One other

results

on investigating

to high

as

are not very

the variables

was consistently

tion, on the assumption

this potential

better (such

on data with a small

than the non-iterative

These

as opposed

is needed

results.

rithm was included vaguely)

while

greater

and where

no single technique conditions.

performed

techniques

where the variables

ately or highly intercorrelated, performed

of

non-iterative

However, for data with a higher number

of dimensions,

ing data; for 15% missing data, the algorithm only perif the data have an internal forms adequately data, internal

iterative

technique)

of dimensions

highly correlated.

the

recov-

proportions

simple,

more

internal

between

and the variance-covariance

than

of the variables

of con-

of observa-

correlations

techniques

intercorrelation

missing

how closely the techniques

after he had deleted

data.

imputing a number

numbers

it has to infer

must have a fairly strong/clear

total of the

study of the perfor-

for

instead

data values relate to each other -

of the

on the accuracy

a simulation

dimensionalities,

matrices

how the different

proportion

data sets with different

ered the vectors the

the used

techniques

in multivariate

variables,

the data as ordinal or interval scaled. which probably are safe with preference

or sensory

reports

of various

tions,

algorithms

and

data. (1993)

structed

it is at a disadvantage which treat (Assumptions

of noise

of comparisons

for

coefficient.

Because

effect

necessary

if

to improve

the

number

283

space

284

D. Hedderlq, I. Wakeling

These

were

influential

factors

et al. 1987),

performing

because

control

the

most

observations control

For instance, larger

knowledge.

most

directly

are the numbers

it is possible

the case if the proportion niques

clear.

for

for

experiequal,

to missing

whether

with certain

the

a

values

some

shapes

Row-Column

tech-

of data

Substitution

algorithm may perform better with ‘squarer’ data sets, rather than ones which are either ‘long and narrow’many subjects,

few products;

many products,

few subjects;

and

column

corresponding

when estimating matrices, matrix

discards

but are potentially

be aware of. Previous

that the underlying performance results

studied.

with higher

found

the

One

might

random

expect

more

observations

found

design

resolution structed

(confounding

IV was chosen. by randomly

co-ordinates

consumer

Normal

readjustment

noise.

Three

levels of missing

data set chance’ in

an

chosen

data)

and 65%

incomplete

block

For each

of the 2”’

but

the

being

of dimensions, performed data

the imputation

in

successfully level of

data points

pattern

When

(see,

for

1974).

However,

patterns,

this factor

in the study, and all the missing

constructing

were chosen,

these factors plus their potential be studied systematically, it was be used, data sets

the data sets, two levels of each representing

points

were

data sets and the three sets with

on the

compared

imputed

the complete

submitted mapping

data

to the results

matrix

data points/performance

indicators

3 X 24-I combinations

of technique,

The

design out,

were

under 1000

would have been

was then per-

and

the results mapping

a sample

on

of 50

for each of the 5 X level of missing

some of these data points

performed

SAS

6.07

for

h of computing

would have required

a 24 design

of

imputed

of lack of convergence.

simulations

workstation,

were

to each

of a preference

data set. This produced

are missing because

levels of

values

and the resulting

values were saved. A preference formed

missing

with

a DEC and

time.

5000 took

A full 24

twice the time, which was

although,

as a reviewer

25 data

an alternative

and allowing more effects

on

Unix,

sets per requiring

pointed

combination similar

time

to be estimated.

data

randomly.

decided that constructed data sets should although a confirmatory study using genuine was planned as well. factor

Data

by

be found

at random.

techniques,

or

To ensure that interactions could

design).

data set was then

occur

was not included

on each

Runs

felt to be impractical;

were located

by a final

were still in the range

(a level which might

Each

that recovery

the missing

Domoney,

these

(a high level for ‘missing

to be set to ‘missing’

is whether and

was followed

values were tested

5% (low), 35%

of the techniques Spence

of the

; resealing

1 to 9; and then adding

This

data and data set. In practice,

of the variety of possible

the scalar-product

by

l-9.

approximately

because

calculating

for each product

so that the scores

relative to the other factors under investigation. One factor which might influence the performance

instance,

of con-

and consumer

space;

co-ordinates)

so that they were in the range random

data

product

created.

techniques

some

sets were

choosing

product

manageable,

The

(i.e. taking

and

normal

interaction)

have found

from data with a higher

follow

and the noise

the 4way

‘true’ preference

vector-projection

50 or 20;

a random

in 2- or 4dimensional

each consumer’s

(Box,

= 1 (low) or 2 (high).

50 data

noise, but it is not clear how large this effect is

randomly,

deviation

data,

when the simulated difficulty

was either

2 or 4 dimensions,

variate with standard

is to

trends

10 or 30; the dimen-

(for a 1-9 scale) was set to be either

missing

had more than 2 dimensions. imputing

of subjects was either

factors

techniques,

numbers

that iterative

than mean substitution

the experi-

influential

on the technique

et al. (1987)

difficult

while Bello better

to depend

Whelehan

was more

of

of the data affects

of imputation

objective

space, and the

studies

dimensionality

seem

of products

The Simulation

of the preference

studies. This is a

the

which might

in the data are not within

control,

they might the

proportion

less accurate.

The dimensionality menter’s

the row

observation

matrix,

The number

was either

where

and size of major

et al. 1978).

-

With more rectangular

a greater

than it does with a square

level of noise

it ignores

to a missing

that observation.

this

make the estimation

exact

or ‘wide and shallow’ because

the presence

the number

a 2”

this would still be that

determine

mapping

in trials

To keep the time spent on the simulations

the

of data missing was kept con-

better

instance,

technique

of subjects and stimuli.

robust

strategy

sionality

data set.

It is also possible

will perform

matrix;

study,

have

that, all else being

than a smaller one, although is not

to an

factors

under

typically found in preference common

It is there-

these

appropriate

data set might be more

stant

to be

mapping

on a particular

two factors

menter’s

their

to know what effect

choosing

The

found

1993; Whelehan

a preference

or at least within

imputing

been

(Bello,

they would be under the experimenter’s

fore important when

had

studies

and were also likely to be relevant

experimenter either

which

in previous

high and low values

Performance

Measures

Two different methods of measuring the performance of each imputation technique were used. The first (following Bello, 1993) was the sum of squared differences between the imputed data and the complete data set. Because points

this sum was over a different for each

combination

number

of Number

of data

of Subjects,

Compatison of Imputation Techniques forInternal Prqmence Mapping Number

of Products

and Proportion

of Data Missing,

the sum of squares was divided by the number vations

declared

Number

missing

of Products

create

an average

value,

which

footing.

X Proportion

allow

of factors

Otherwise,

proportion

be larger,

more

terms,

imputing

rather

of Missing difference

the

results

the results

from

data,

simply

values

of Subjects

from

than

are

picture

from

‘true’

values. this only measures

values match

the original

how closely the imputed

data. The real interest

study is in how results from preference on incomplete

data

corresponding

complete

approach

of Beale

compare

al. (198’7), who chose close

their

complete

final

nique reproduced To this end,

to measure

from

difference

the

number

between

the

Similarly,

the

performance

et

by how from

the

data matrix.

be expected

tions obtained

data

set,

and

the

differences

mean divided

of consumers

between

the

based on the imputed

and

sets was calculated.

relative

positions

of the points,

such,

both

the

rather

were

to match

them

to the configurations

subjected

complete

data

the

the main

mappings and

consumer

to Procrustes

mean

is the

than their co-or-

product

configurations

before

Because

preference

derived squared

the study. This algorithm

was performing

was the proportion converged.

practical

an algorithm

&AS/STAT

Obviously, Of course,

converges criteria;

will depend

SAS Institute,

variation:

be

found.

was used

Because

to a non-optimal

(or even non-sensical)

on

that empirical

these

the

pling

for each of the data sets in the study, 50 new

assessors’

responses

were then

the one derived

analysed from

of

(complete)

distribution

mapping,

compared

the

original

the ‘true’

The 95th percentile

recent

work

crustes

Analysis

results.

on the

This

strategy

significance

configurations

Wakeling et al. 1992) Unfortunately, because

the

assessors

runs were not comparable

was not suitable

& Arents,

Configurations

varia-

were

on Pro-

1991; being

from different as they did not

This meant

for deriving

of this

is based

of Generalised (Ring

of the

configura-

was taken as the limit for ‘adequate’

tion from

to

data set. This pro-

of the variation

about

with

sets of data.

Configurations

the original

Configurations

by resam-

matrices),

using preference

Stimuli

an empirical

Stimuli

(rows

from the original

and the resulting duced

concerns,

panel-to-panel

data sets of the same size were constructed,

resampling

converge,

during

in which

this approach

a limit for an ‘adequate’

of the Assessor Configuration.

the

performance

and the most appropriate

of cases is

Reliability

of Convergence

cases,

on the set-

in some cases in the

it is possible

or will converge solution.

of the

RESULTS

algorithm

Testing compared

of

to estimate

consist of the same assessors.

chapter

1989)

will never

techniques,

could

rotations

in some

however,

that a technique

Adequacy

an

the Proc PRINQfiAL

manual,

was a

depend

all the data sets used in the simulation

from the

emerged of runs

in the vast majority

value.

ting of the convergence (see, for example,

on this

which gave a good initial impression

which does not converge whether

study

resampling

recovery

measure

of how each technique

of limited

data, in which case it would be unlikely data to match

the Assessor

were calculated.

results

might

resampled,

differences

solu-

panels of consumers.

features of the data set, such as the number of stimuli, the number of assessors and the amount of noise in the

tion due to panel variation.

difference

than the variation

that the level of variation

distribution

squared

the

are no fur-

and it was also felt that there

was calculated.

mean

is

is that

complete-data

We were not aware of any published

of dimensions)

data

Another

solution between

from different

type of variation;

These of

to the configuration

(sum of squared of points

was performed

performance’

definition and subjects)

ther from the complete-data

replacement,

data. The configuration

complete

when interpreting as

follows

and Whelehan

to the results

compared

interest dinates

based on the

(of products

X number

configurations

Having

configurations

(the

data set) to be

good

possible

two sets of co-ordinates

the

complete

This

mapping

of imputed

squared

based

than simply how well their tech-

was then

by the

were

the original

derived

sets. (1975)

a preference

matrix

the products

data

results

data, rather

on each

to those

and Little

of the

mappings

one

arises:

results

of ‘the truth’?

of ‘sufficiently

but

possibility

However,

each

a reliable

concept

that might

are

the

This

ill-defined,

matrices

the techniques

question

to the ‘true’

considered

runs with a high

further

another

close

to

they are the sum of

because

which

identified,

sufficiently

with the complete

different

data

had been

results from working

on an equal

or bigger

because

Data)

situation

is the result

X

per missing

to be compared

of missing

could

Number

mean squared

would

combinations

(i.e.

of obser-

285

different

one for a given

The

first criteria

niques

used

to compare

was in what proportion

solution to the problem The results are presented

the different

of cases

of imputing the missing in Table 1.

The simplest technique,

Mean Substitution,

in every case,

the proofs

cited

problems

programming

Row-Column

as might

by Rubin

Substitution

be expected,

(1991).

the

After

technique,

technique

a

data.

will always

find a solution, so long as at least one consumer each of the products. Expectation-Maximisation converged

tech-

they found

some

scored also given initial

Krzanowski’s also performed

D. Hedderlq, I. Wakeling

286

TABLE 1. Number of Simulation Runs which Converged (N= 50) Stimuli Assessor

Dimens

Imputation Techniques

Noise Mean

E-M

Row-Co1

MISTRESS

PRINQUAL

10

50

2

Low

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 47 33

49 13 0

10

200

2

High

5% 35% 65%

50 50 50

50 50 50

50 50 49

50 48 7

31 34 0

30

50

2

High

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 41 9

41 1 2

30

200

2

Low

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 6 0

32 13 1

10

50

4

High

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 49 27

33 24 0

10

200

4

Low

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 35 3

38 20 0

30

50

4

Low

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 41 15

32 0 0

30

200

4

High

5% 35% 65%

50 50 50

50 50 50

50 50 50

50 31 0

19 0 0

well, only failing on

one

MISTRESS values,

to converge

simulation

within the iteration

run.

performed

As

been

limit

expected,

well with low levels of missing

but was less successful

QUAL’s

had

performance

at higher

levels.

PRIN-

just

those

with missing

used during

the simulation

circumstances

was poor.

the manual),

these

as intended.

squared

differences

The

usual, and so is higher

data matrices

differences

and the original

of how well the imputation inal data. Because ing observation, data matrices The taking

mean

between

they are a mean the

results

squared

difference

data sets are a measure

technique

with different

duced by the imputation

recovers difference

are comparable number

differences

between

the imputed

the

per miss-

ever,

high

calculated

the

reflect

values

were retained

(if only imperfectly)

by pro-

and the entire origi-

Analysis of variance the relative influence

would

is not impor-

be the same

terms than

for

PRINQUAL result;

because

how-

they do

the fact that the solution

formance three

was performed

of the techniques.

techniques

(Mean

found

the detail of this calculation

the sum of

if the most

rithm,

is

a degenerate use;

and

the

Substitution either

solution while

to judge

on the per-

The analysis was limited Substitution,

and the Row-Column

practical

in order

of the various factors

techniques,

the results

to in

to have

unsatisfactory.

it was felt that PRINQUAL

since

scores

and questionable

nal data matrix, squaring the differences, summing them, and then dividing the result by the number of values declared missing in that matrix. For most of the tant,

this occurred,

Difference

It is a peculiar

entire data matrix

technique,

Squared

do not appear

than usual, skewing some of the

upwards.

of missing values. were

Mean

When

under some referred

involves more non-zero

the origbetween

are parameter

solutions

constraints

worked

squared

There

study; however,

(the degenerate

Mean Squared Differences Between Imputed and Actual Values mean

values.

settings which are supposed to force the procedure to leave the non-missing data unchanged, and these were

too frequently

to

algo-

technique)

did not converge,

MISTRESS

reliably, at least with small numbers

E-M

; or

to be of

converged

fairly

of missing data, initial

obvious alternative (simply comparing the true values for those observations which were declared missing with their imputed equivalents) had been used. However,

inspection performed

when creating an output data set, proc PRINQUAL reassigns the values for all the points in the data set, not

A split-plot design was used because it was felt that the experimental units fell into two strata; the results of

of the results showed that it consistently less well than the other three techniques.

Compatison

applying

the

techniques

to a given

incomplete

of Imputation Techniques for Internal Preference Mapping

of predicted

data

means or confidence

matrix were likely to depend on the pattern of the missing

tive. Instead,

data in the matrix,

model

similar

to each

matrices. being

This

treated

and so they were more

other led

than to the

as the ‘plots’

to results stratum,

as ‘subplots’.

as Wilks’

Pillai’s

Lambda,

Lawley Trace) Sphericity Test the univariate

than

matrices

and the results for and

the

different Apart

Hotelling-

explanation

(Schlich,

the analysis.

are

because

of the range

they cover range

because

from

was a risk that, if untransformed

(means 1.26

there

TABLE 2. Analysis of Variance

for Difference

Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique

Techniques

for the

more

homogenous.

in variance, of factors

however,

the

and technique

in Table

2. Almost

all the

compared

to the

relevant

variation.

inter-replicate

the variation

the significance

those

effects

of effects

hence

the Fdistributions

have different

Imputed

in performance

Comparing

judge

Between

To assess which F ratios in order

to

can be misleading

if

degrees

of freedom

have different

and Actual

fac-

most, the

however,

degrees

(and of free-

in this case it is

Data

Wilks’

Pillai’S

H-L Trace

297.9 842.3 331.9 110281 238.0 63.8 353.2 108.8 27.9 19.3 40.5 0.6 4.9 4.6 11.7

220.1 842.3 331.9 11028.1 238.0 63.8 353.2 108.8 27.0 19.1 39.3 0.6 4.9 4.6 11.6

385.3 842.3 331.9 11028.1 238.0 63.8 353.2 108.8 28.7 19.5 41.7 0.6 4.9 4.6 11.9

561.4 354.8 70.8 104.9 617.5 48.3 34.3 250.5 139.0 33.6 12.7 55.5 0.9 7.1 6.2 16.2

561.4 281.1 70.8

_

Stratum 6, 234? 3,1173 3,1173 3,1173 3,1173 3,1173 3,1173 3,1173 6,234? 6,234? 6,234? 6,234? 6,234? 6,234? 6,234;

Per cent Missing Number of Products Number of Subjects Level of Noise Dimensionality Products X Subjects Products X Noise Products X Dimensions Per cent Missing X Products Per cent Missing X Subjects Per cent Missing X Noise Per cent Missing X Dimensions Per cent Missing X Products X Subjects Per cent Missing X Products X Noise Per cent Missing X Products X Dimensions Between

bonus

the variance

Equivalent FValues DF

Data Matrices

rather

A pragmatic

are significant

Source of Variation

Between

of the mean,

factors and interactions

dom in their denominators);

data were used, some

a of

distributed.

are presented

F ratios were inspected.

for different

to 10.95)

of factors

Normally

tors influenced

yet

made

the differences

The results

the Mean

positive;

constant.

for each combination

was roughly

1994,

of the analysis of

by definition

combinations

implies

for each combination

percentage

an absolute

from

results

were log transformed

This was done

Differences combinations

Differences

is a constant being

limits might be nega-

transformation

the variation

was that the transformation

Multivariate tests (such

Trace,

in which

factors

data

designs.)

Squared factor

data

tests were inappropriate.

The Mean Squared before

other

were considered, because Mauchly’s was highly significant, indicating that

gives a clear and simple split-plot

from

incomplete

individual techniques

likely to be

the logarithmic

287


Stratum

X X X X X X X X X X X X X X

Per cent Missing Products Subjects Noise Dimensions Products X Subjects Products X Noise Products X Dimensions Per cent Missing X Products Per cent Missing X Subjects Per cent Missing X Noise Per cent Missing X Dimensions Per cent Missing X Products X Subjects Per cent Missing X Products X Noise X Per cent Missing X Products X Dimensions

N.B. For some effects, Numerator DF vary between variation does not affect the significance level.

2346

2,1174 4,234? 2,1174 2,1174 2,1174 2,1174 2,1174 2,1174 2,1174 4,234? 4,234? 4,234? 4,234? 4,234? 4,234? 4,234? and 2350

depending

617.5 48.3 34.3 250.5 139.0 32.7 12.7 54.0 0.9 7.1 6.2 16.0

on the test; however,

561.4 70.8 104.9 617.5 48.3 34.3 250.5 139.0 34.4 12.7 57.0 0.9 7.1 6.2 16.3


with such large values,

this

288

D. Hedderky, I. Wakeling TABLE 2a. Noise

TABLE 2f. Number of Dimensions

Noise

Dimensions

Root Mean Square Error

Low (SD = 1) High (SD = 2)

1.39” 2+31b

Root Mean

TABLE 2g. Technique

Low High

Square Error 10

Interactions with Technique

Noise

TABLE 2b. Number of Products

1G39b

X Noise

Mean Subs

Root Mean Square Error Row-Co1

E-M

l-39” 2.14b

1.38” 2.46d

1.40” 2.36”

(Means with different superscripts are significantly different, as judged by a Bonferonni corrected LSD test at 5% significance on the Log transformed data.)

1.70”

30

1.74” l-85’

2 4

(Means with different superscripts are significantly different, as judged by a Bonferonni corrected LSD test at 5% significance on the Log transformed data.)

Products

Root Mean Square Error

TABLE 2c. Products X Noise Noise Level

Products

Low High Low High

10 10 30 30

TABLE 2h. Technique

Root Mean Square Error

Root Mean Square Error E-M Row-Co1

Mean Subs

1.506 2.39” 1.29” 2.24’

1.73”

1.84’

1.82b

TABLE 2i. Technique

TABLE 2d. Number of Subjects

Percentage Missing

Root Mean Squared Error

Subjects

Root Mean Squared Error Mean Subs Row-Co1

5 35 65

1.86b 1.73”

50 200

1.70b 1.72hc 1.75’

TABLE 2j. Technique Number of Products

Root Mean

E-M

1.66” 1.82” 2.06’

TABLE 2e. Percentage Data Missing Percentage

X Missing

Level of Noise

1.67” 1.716 2.1d

X Products X Noise

Root Mean Square Error MeanSubs Row-Co1 E-M

Square Error

clear that most of the effects the F ratios

are used

to judge

effect is (relative to the inter-replicate 1 is an ordered based Trace The

Lambda.

(The

and the Hotelling-Lawley Level

influencing

of Noise variations

leads to better

recovery

is the

the

Figure

Approximations

patterns

for

Pillai’s

Trace

are similar.)

most

important

in the performance of the original

and

how large

variability).

bar chart of the FRatio

on Wilks’

10 30 30

are highly significant,

simply

higher

the proportion

recovery preference

dimensional Among

spaces

factor

next most

similar Mean

are

the worse the

and higher

recovered

1.46d 2.36’ 1.34’ 2.35’

dimensional

less well

than

low

ones.) the between-technique

on low noise Substitution

the main

1.4gd 2.6Y 1.296 2.29h

of data missing,

of the data matrix;

with Noise is most influential

(lower noise

data);

1.543 2.18s 1.25” 2.1of

Low High Low High

10

1.68” 1.75b 1.96”

5 35 65

effect

data,

performs of Technique

factors, the interaction (the three techniques

but with high better

than

is almost

noise

are data,

the others); as important.

important is the Number of Products (more products lead to better recovery, though the benefit is less with high noise data), and the Number of Subjects (more

(On average, Mean Substitution performs better than E-M, which is better than Row-Column Substitution.) The Technique X Missing interaction is the next most

subjects lead to better recovery). The Proportion of Missing Data and the Dimensionality of the preference

important

space

have

a lesser

influence

on

the

results.

(The

(the iterative

Column Substitution the percentage missing

techniques

-

E-M

and Row-

were more heavily affected by than Mean Substitution; with

Compan’son of Imputation Techniques for Internal Preference Mapping

.iir, , ,

289

T’ql’ f: ; r

.ji

;:t

i

;

0

1000

500

1500

0

2

1000

500

FIG.

1. FRatios

imputed

from analysis of variance

and actual

of difference

2

1500

3

Wilks’ F Ratio

Wilks F Ratio

between

FIG.

3.

F Ratios

from

analysis

of variance

of

consumer

configurations.

Data.

Notes

on Figures:

Effects

are ordered

within

strata

by size of

F ratio.

1obo

5io

0

Letters indicate factors: 7’ = Technique (Mean Substitution, Row-Co1 or E-M) ; M = Percentage Missing Data (5%, 35% or 65%); N = Level of Random Noise (Low or High); P = Number of Products (10 or 30); S = Number of Subjects (50 or 200) ; D = Number of Dimensions (2 or 4) ; Interactions are indicated with a * (For instance T*N is the interaction of Technique and Noise).

1500

Wilks’ F Ratio

FIG.

2. FRatios

from analysis of variance

of product

configu-

rations. low

levels of missing

Mean Substitution, they performed

data they performed

whereas worse),

at moderate

followed

Products

X Noise

interaction

iterative

techniques

perform

better

by the Technique

(with better

10 products, than Mean

tution when noise is low, but Mean Substitution with high noise.

With

always performs

better.)

Tables presented

of means

30 products,

mean

X the

Substiis better

Substitution

A similar

to these

effects

are

2a-j.

split-plot

ANOVA

pared

to the inter-replicate

necessary

to chart

It is clear Level

variation,

the probability

followed

are

the

by the Dimensionality of Missing

Data,

between

ences

are small, and the major

As above,

follow Table

configurations were

(for

both

log-transformed

performed

Products

before

on them

as a precaution

predictions, and in an attempt of observations from different Again,

apart

from

for each technique

analysis

were roughly Normal.

Consumers)

of variance against

was

negative

to equalise the variances combinations of factors.

the differing under

and

the

variances,

each combination

the results of factors

between ber

Technique,

of Products.

between-Technique effects

3 (Tables

of means

of Subdiffer-

are interactions

the Level of Noise, Tables

and

factors,

Noise and the

and the Number

Mean Squared Differences Between Configurations of Stimuli and Consumer Points Based on the Complete and Imputed Data Matrices

the

influential

of the data, the Num-

an interaction

By comparison,

between

it was

of Data Missing

most

jects.

Differences

so again

levels of the effects

that the Percentage

of Noise

ber of Products,

Squared

The

(Fig. 2).

Percentage

the Mean

was performed.

results for the Stimulus Configurations are presented in Table 3. Almost all the effects are significant com-

the

corresponding

in Tables

than

or high levels,

and the Num-

for

these

effects

3a-i) .

The higher the proportion

of the data which is missing,

the less accurate the recovery of the product configuration. Similarly,

the higher

less accurate action

shows that these

nations

between

the proportion

noise in the data, the

(The Missing by Noise inter-

differences

of Noise and Percentage

difference when

the random

the recovery.

low and

hold for all combiMissing,

high

of data missing

noise

but that the data

is higher.)

is less High

290

D. Hedderlq,

I. Wakeling TABLE 3. Analysis of Variance

for Product

Configurations

Equivalent FValues

Source of Variation

DF Between

Data Matrices

Per cent Missing X Products X Subjects Per cent Missing X Products X Noise Between

X Products

Techniques

Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique

Technique Technique Technique

X X X X X X X X X X X X X X X

H-L Trace

Pihi’S

Stratum

Per cent Missing Number of Products Number of Subjects Level of Noise Dimensionality Products X Subjects Products X Noise Products X Dimensions Per cent Missing X Products Per cent Missing X Subjects Per cent Missing X Noise Per cent Missing X Dimensions

Per cent Missing

wii’

X Dimensions

7.1

342.2 258.4 94.2 1098.1 345.5 6.8 48.2 27.0 22.4 6.8 100.3 6.0 5.8 8.4 7.0

2702.2 258.4 94.2 1098.1 345.5 6.8 48.2 27.0 22.4 6.9 129.7 6.0 5.8 8.5 7.1

23.0

23.0

7g.i 26.1 80.8 17.4 4.5 55.1 38.0 24.0 3.4 26.5 4.2 7.5 8.2 7.3

7i.i 26.1 80.8 17.4 4.5 55.1 38.0 23.9 3.4 25.9 4.2 7.5 8.1 7.3

23.0 6.0 70.3 26.1 80.8 17.4 4.5 55.1 38.0 24.0 3.4 27.0 4.2 7.6 8.3 7.4

6,234? 3,1173 3,1173 3,1173 3,1173 3,1173 3,1173 3,1173 6,234? 6,234? 6,234? 6,234? 6,234? 6,234? 6,234?

1115.2 258.4 94.2 1098.1 345.5 6.8 48.2 27.0 22.4 6.9 114.8 6.0 5.8 8.4

2,1174 4,234? 2,1174 2,1174 2,1174 2,1174 2,1174 2,1174 2,1174 4,234? 4,234? 4,234? 4,234?

Stratum

Per cent Missing Products Subjects Noise Dimensions Products X Subjects Products X Noise Products X Dimensions Per cent Missing X Products Per cent Missing X Subjects Per cent Missing X Noise

Per cent Missing X Dimensions Per cent Missing X Products X Subjects Per cent Missing Per cent Missing

X Products X Products

TABLE 3a. Percentage

X Noise X Dimensions

4,234? 4,234?

4,234?

TABLE 3d. Number

of Data Missing

Percentage Missing

Root Mean Squared Error

Number of Products

0.27” 0.936 1.49’

(Means with different superscripts are significantly different, as judged by a Bonferonni corrected LSD test at 5% significance on the Log transformed data.) TABLE 3b. Level of Noise Root Mean Squared Error

Low (SD = 1) High (SD = 2)

0.53” 0.99b

Root Mean Squared Error

2 4

0.82’ 0.63”

30

TABLE 3e. Percentage Percentage

Data Missing

X Noise

Noise Level

Root Mean Squared Error

5

Low

5 35 35 65 65

High Low High Low High

0.18” 0.46’ 0.69’ 1.24d 1.28” 1.73’

TABLE 3c. Number of Dimensions Dimensions


of Products

10 5 35 65

Noise


TABLE 3f. Number of Subjects

Root Mean Squared Error 0.60” 0.866

Subjects 50 200

Root Mean Squared Error O-66” 0.78’

Compatison of Imputation Interactions with Technique TABLE Noise

dimensional

3g. Technique

product

X Level of Noise

more

Root Mean Squared Error Mean Subs E-M Row-Co1

Low High

O-51” 1.04”

0.56” 0.96’

the

3h. Technique

Products

X Number

0.86’ 0.62”

TABLE 3i. Technique Products Noise

10 10 30 30

X Number

of Products

0.70’

High Low High

l.Od 0.44” 0.87”

0.58” 1.19” 0.45” 0.90’

techniques

with a large because

X Level of Noise

Between

Techniques X X X X X X X X X X X X X X X

performs

and

better

number

it appears

X Noise

other

techniques.

tion’s

performance

of products,

(especially)

the

E-M

than Mean Substitution,

they all perform X Products

Similarly,

is more

The

is mostly sensitive

of products

Row-Column

at different

but

similarly.

interaction

Substitution

to

than the Substitu-

levels of missing

data

seems to be more sensitive to noise than the other

tech-

The

results in Table

for the Assessor

almost all the factors it was necessary to judge

configurations

4. As with the Stimulus and interactions

their relative importance

for Consumer

can be

configurations, are significant,

to chart their probability

so

levels in order

(Fig. 3).

Configurations Equivalent

F Values

Wilks’

Pillai’s

6,234? 3,1173 3,1173 3,1173 3,1173 3,1173 3,1173 3,1173 6,234? 6,234? 6,234? 6,234? 6,234? 6,234? 6,234?

1716.7 104.7 971.6 1725.6 848~ 1 9.8 113.1 75.8 21.2 8.5 149.8 2.9 6.5 13.2 22.2

376.8 104.7 971.6 1725.6 848.1 9.8 113.1 75.8 21.0 8.5 125.6 2.9 6.4 13.2 21.7

5392.6 104.7 971.6 1725.6 848.1 9.8 113.1 75.8 21.4 8.5 175.1 2.9 6.5 13.3 22.7


2,1174 4,234? 2,1174 2,1174 2,1174 2,1174 2,1174 2,1174 2,1174 4,234? 4,234? 4,234? 4,234? 4,234? 4,234? 4,234?

9.0 11.2 113.3 25.5 179.9 25.1 9.5 90.5 61.8 27.6 9.1 57.0 1.7 8.4 11.5 13.2

9.0 11.1 113.3 25.5 179.9 25.1 9.5 90.5 61.8 27.2 9.0 54.8 1.7 8.4 11.5 13.0

9.0 11.3 113.3 25.5 179.9 25.1 9.5 90.5 61.8 28.0 9.1 59.2 1.7 8.5 11.5 13.3


H-L Trace

Stratum

Stratum

Per cent Missing Products Subjects Noise Dimensions Products X Subjects Products X Noise Products X Dimensions Per cent Missing X Products Per cent Missing X Subjects Per cent Missing X Noise Per cent Missing X Dimensions Per cent Missing X Products X Subjects Per cent Missing X Products X Noise Per cent Missing X Products X Dimensions

less

slightly worse

With a small number

Row-Column

found

0.58” 1.05’ 0.45” 0.92’

Per cent Missing Number of Products Number of Subjects Level of Noise Dimensionality Products X Subjects Products X Noise Products X Dimensions Per cent Missing X Products Per cent Missing X Subjects Per cent Missing X Noise Per cent Missing X Dimnsions Per cent Missing X Products X Subjects Per cent Missing X Products X Noise Per cent Missing X Products X Dimensions Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique Technique

the

positions.

niques.

DF Data Matrices

Curiously, set,

with Technique,

the noise level with small number

0.78’ 0.64”

positions.

in a data

of the

leads to a

with high noise data, and Mean Substitu-

perform

Technique

recovery

products

of the product

Substitution

of Variation

Between

are

Substitution

better.

Row-Column

TABLE 4. Analysis of Variance Source

of their

there

the recovery

than expected

Root Mean Squared Error Mean Subs E-M Row-Co1

LOW

while more

recovery

subjects

tion slightly

of Products

0.83” 0.63h

to less accurate

leads

As for the interactions

Root Mean Squared Error Mean Subs Row-Co1 E-M

10 30

accurate more

that Row-Column

Means with different superscripts are significantly different, as judged by a Bonferonni corrected LSD test at 5% significance on the Log transformed data. TABLE

data

configuration,

accurate

0.51” 0.98’

291

Techniques for Internal Preference Mapping

292

D. Hedderlq,

I. Wakeling

TABLE 4a. Level of Noise

TABLE 4f. Number of Products X Level of Noise Root Mean SquaredError

Noise Low (SD = 1) High (SD = 2)

0.32” 0.50b

(Means with different superscripts are significantly different, as judged by a Bonferonni corrected LSD test at 5% significance on the Log transformed data.)

Products

Noise

Root Mean SquareError

10 10 30 30

Low High Low High

0.32” 0.46’ 0.32” 0.54’

Interactions with Technique TABLE 4b. Percentage of Data Missing Percentage

Noise

0.18” 0.50h 0.72”

Low High

5 35 65 TABLE 4c. Subjects Number of Subjects

TABLE 4g. Technique

Root Mean SquaredError

Root Mean SquaredError

Root Mean SquareError Mean Subs Row-Co1 E-M 0.326 0.47’

TABLE 4h. Technique Number of Products

TABLE 4d. Dimensions Number of Dimensions

Root Mean SquaredError

4

Percentage

Noise Level

Root Mean SquaredError

Low High Low High Low High

0.14” 0.25’ 0.40’ 0.63d 0~64~ 0.82”

5 5 35 35 65 65

Number of Products

Noise

10 10 30 30

Low High Low High

Subjects,

most

significant

the Perecentage

factors

are

between

the Percentage

Noise and the Number

Number

of

of Missing Data, the Level of

Noise in the data, the Dimensionality actions

the

of the data, inter-

Missing

of Products.

Data and the

Again,

differences

between the Techniques are less significant, the primary ones being an interactions with the Level of Noise and the Number

of Products.

Means

for these

0.38’ 0.41d

more the

effects

dimensions

Noise

and Proportion

recovery

the

and more

0.29” 0.49 0.32’ 0.5w

of the data missing The

interaction

of

is the same as with the

the more

difference

and

0.29”” 0*5og 0.32’ 0.54’

Missing

data. More products recovery,

0.37” 0.42’

Number of Products X Noise

less accurate.

configuration;

less the

X

0.33” 0.44’ 0.29b’ 0.5Zh

make

the

Number of Products

Root Mean SquaredError Mean Subs Row-Co1 F,-M

noise,

product The

X

0.38’ 0.40’

TABLE 4i. Technique

TABLE 4e. Percentage Data Missing X Level of Noise

0.29” 0.51d

Root Mean SquaredError Row-Co1 Mean Subs E-M

10 30

0.33” 0+46b

2

0.29” 0.52’

(Means with different superscripts are significantly different, as judged by a Bonferonni corrected LSD test at 5% significance on the Log transformed data.)

0.47b 0.33”

2::

Noise

X

between

data that is missing, high

and

low noise

in the data set lead to less accurate differences

between

noise data are more pronounced

low and

high

with 30 products

than

with 10. The Technique although the three

X Noise interaction techniques perform

low noise data, Mean Substitution

performs

shows that, similarly with better

than

are given in Tables 4a-j. The results for Subjects, Proportion of Data Missing, Dimensionality and Noise are as might be expected;

the others on high noise data. This is most pronounced with 65% missing data (see the Technique X Missing

with more subjects the recovery of the consumer configuration is more accurate, but a higher level of

interaction techniques

X

Noise

interaction). indicates perform

The

Technique

X

Products

that the E-M and Row-Column less well than Mean Substitution

Comparison of Imputation Techniques for Internal Preference Mapping TABLE

5. Confirmatory

Analysis:

Comparing

Imputed

and Actual

Data Matrices

FFtatio

Source

DF

ss

MS

Technique (Means, Row-Co1 or E-M) Per cent Missing (5%, 35% or 65%) %Missing X Technique Residual Total

2 2 4 441 449

0.09 1.22 0.55 3.26 5.11

0.043 0.609 0.137 0.007

293

Significance <0.05
5.79 82.33 18.58

Values quoted are based on Type III Sums of Squares (change in Sum of Squares of model when the relevant term is fitted last: SAS/S?‘A7’Munual, SAS Institute, 1989) from the analysis of the log-transformed variables.

Selected Means TABLE

5a. Percentage

Missing

TABLE Root Mean Squared Error

Per cent Missing

1.55” 1.58” 1.65’

5% 35% 65%

with high

number

X Products other

of products;

X Noise

Row-Column E-M

Per cent Missing

interaction

Substitution

techniques performing

while seems

with 10 products worse

with

30

random

and high noise, products

and

Only the three

and

set supplied

high

Analysis

studied

Switzerland).

The data matrix consisted

X 29 Stimuli, implies

that the preference TABLE

6. Confirmatory

space

Centre,

Analysis:

Comparing

converged

It Stimulus

Configurations

Source

DF

ss

Technique (Means, Row-Co1 or E-M) Per cent Missing (5%, 35% or 65%) % Missing X Technique Residual Total

2 2 4 441 449

13.42 1132.11 6.24 70.54 1222.31

data

and

the

results

the

Unfortunately,

data

set, the factors

to the proportion

different

imputation

are presented

actual

and

in Tables

imputed

data

Per cent Missing 5% 35% 65%

based on Imputed MS

and Complete

5, 6

Significance

FRatio

6.71 566.05 1.56 0.16

Data


41.95 3538.85 9.75

when the relevant

term is fitted

Missing TABLE

Root Mean Squared Error 0.16” 0.59b 1.02’

tech-

matrices

Selected Means 6a. Percentage

reli-

of the data were taken

was only a single

Values quoted are based on Type III Sums of Squares (change in Sum of Squares of model SAS/STAT Manual, SAS Institute, 1989) from the analysis of the log-transformed variables.

TABLE

E-M

(Table 5), all three effects (the imputation technique, the proprotion of missing data, and their interaction)

matrix

is 2-dimensional.

Substitution,

on them.

in the ANOVA were limited

Comparing

of 166 Assessors

and the analysis of the complete

of the level of

which

logarithms ANOVA

niques used. The and 7.

on a real data

Research

(Mean

Substitution)

natural

there

of missing

runs (50 runs at each of the (Nestle

an estimate

techniques

performing

because

by P. Leathwood

1.54” 1.54” 1.66’

ably were used, to save time. As before,

3 levels of missing values) was performed

E-M

1.52” 1.61’ 1.68’

to make

and Row-Column

before

A similar set of simulation

Error

noise in the data.

than

noise.

Confirmatory

X Technique

Mean Squared Row-Co1

1.49” 1.59b 1.61’

was not possible

to be due to worse

Root Mean Subs

5% 35% 65%

the Technique

performing

5b. Per cent Missing

Mean Subs 0.42”

6b. Technique

Root Mean Squared Row-Co1 0.51’

Error E-M 0.446

last:

294

D. Hedderley, I. Wake&g

were

significant;

data (Table

however,

5a) dominated

the proportion the effects.

(Table 5b) follows the same pattern study (Table Substitution

2h) declines

tion of missing

forms similarly with 5 or 35% of the observations ing

as found in the main

the performance the E-M

(before

mance

deteriorating

as

missing),

of Row-Column

with each increase

data, whereas

of missing

The interaction

the

in the propor-

is affected

missing data.

TABLE 7. Confirmatory Analysis: Computing Assessor Configurations

Source

DF

Technique (Means, Row-Co1 or E-M) Per cent Missing (5%, 35% or 65%) %Missing X Technique Residual Total

z 4 441 449

much

less by increases

miss-

level of perfor-

technique of mean

at

65%

substitution

in the proportion

of

based on Imputed and Complete Data MS

FRatio

1.71 368.15 0.37 0.05

33.65 7236.65 7.21

SS

3.42 736.29 1.47 22.43 763.62

Row-Column

while the performance

algorithm

per-

to a similar

Significance


Values quoted are based on Type III Sums of Squares (change in Sum of Squares of model when the relevant term is fitted last: SAS/SII’ATMunual, MS Institute, 1989) from the analysis of the log-transformed variables.

Selected Means (Log, Mean Squared Difference) TABLE 7a. Percentage

Missing TABLE

Root Mean

Per cent

7b. Technique

Squared Error

Missing

Root Mean Squared Error 5%

0.11”

35%

0.34*

65%

0.49’

TABLE 8. Adequacy

StiIUUli

Assessor

10

50

10

Dimens

Thresholds Noise

Mean Subs

Row-Co1

E-M

0.28’

0.26”

0.25”

and Numbers Missing

of Simulation

Runs Within Those

Adequacy Limit

Thresholds

Number of Runs within Threshold Mean

E-M

Row-Co1

Low

5% 35% 65%

0.33

50 26 0

50 43 5

50 46 13

200

High

5% 35% 65%

2.29

50 48 30

50 50 34

50 45 23

30

50

High

5% 35% 65%

0.62

50 46 4

50 47 10

50 43 11

30

200

Low

5% 35% 65%

0.19

50 48 0

50 36 0

50 37 0

10

50

High

5% 35% 65%

0.78

50 44 0

50 38 0

50 32 0

10

200

Low

5% 35% 65%

0.41

50 19 0

50 30 0

50 33 0

30

50

Low

5% 35% 65%

0.15

50 50 0

50 49 0

50 50 0

30

200

High

5% 35% 65%

0.72

50 48 4

50 47 0

50 44 0

Comparing the Stimulus Configurations (Table 6), again all three effects (the imputation technique, the proportion of missing data, and the interaction) are significant, but dominated by the proportion of missing data (Table 6a). Comparing the techniques, Mean Substitution recovers the configuration more accurately than the E-M algorithm, and both perform much better than Row-Column Substitution (Table 6b). Similar results were found for the Assessor Configurations (Tables 7, 7a and 7b).
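For readers wishing to reproduce this kind of comparison, the sketch below shows one way of measuring how close a stimulus configuration computed from imputed data is to the configuration from the complete data. It assumes the two configurations are matched by an orthogonal Procrustes rotation before the root mean squared difference is taken, which may differ in detail from the matching used here.

```python
# A sketch (not necessarily the exact measure used in this study) of comparing
# the stimulus configuration from imputed data with that from complete data:
# centre both, rotate one onto the other (orthogonal Procrustes), and report
# the root mean squared difference between corresponding points.
import numpy as np

def rms_difference(complete_cfg, imputed_cfg):
    a = np.array(complete_cfg, dtype=float)
    b = np.array(imputed_cfg, dtype=float)
    a -= a.mean(axis=0)                      # remove translation
    b -= b.mean(axis=0)
    u, _, vt = np.linalg.svd(b.T @ a)        # best rotation of b onto a
    b_rot = b @ (u @ vt)
    return np.sqrt(np.mean((a - b_rot) ** 2))
```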

Adequacy of the Results

The comparisons above show how far apart the configurations derived from the imputed data and from the complete data are; they do not show whether the differences are large enough to matter in practice. To judge this, an adequacy threshold was set for each simulated data set, based on the 95th percentile of the distribution of differences between configurations obtained by resampling. The adequacy limits, and the numbers of simulation runs which passed those limits, are presented in Table 8.

In general, with 5% of the data missing, all (or an overwhelming majority) of the simulation runs were within the adequacy threshold for all three techniques; with 35% of the values missing, a majority of runs were within the threshold for most of the data sets; and with 65% missing, very few results were adequate. Where differences between the techniques do occur, it is usually at the 35% Missing level; however, it is hard to discern a pattern. For instance, the iterative techniques (E-M and Row-Column Substitution) fare worse than Mean Substitution on the 10 Stimulus, 50 Assessor, 4-dimensional, High Noise data, while Mean Substitution performs noticeably worse than the iterative techniques on the 10 Stimulus, 200 Assessor, 2-dimensional, Low Noise data.

DISCUSSION AND FURTHER DEVELOPMENTS

Performance (on all three performance measures) seems to be more strongly influenced by features of the data set (the proportion of data missing, the level of noise, and the number of stimuli and assessors used) than by which imputation technique one uses. The main effects of these factors dominate, with interactions between them playing a fairly minor role (although they are still significant compared to the variation between simulation runs). The Technique effect, although significant for all three performance measures, never had a very large F ratio, implying that it doesn't make much difference which technique one uses; where the techniques do differ, the differences appear to lie in how they cope as the proportion of missing data and the level of noise increase, with the iterative techniques deteriorating more sharply than the non-iterative one.

The importance of the noise factor was surprising. On the basis of previous studies some effect on the results might be expected, but it is surprising that it has more influence than, say, the size of the data matrix. This may be because the levels of noise used were unrealistic: the levels were chosen to produce a panel of more or less accurate assessors, and hence data sets giving more or less 'noisy' preference maps, rather than to reproduce the noise found in genuine consumer data. However, subsequent analysis of several unpublished consumer preference trials, in which some samples were replicated within tasting sessions, has shown standard deviations of between 1.47 and 2.24 (on a 9-point scale) for the replicates, implying that the noise levels used in this study are reasonable.

The negative effect of the number of subjects on the accuracy with which the product configuration is recovered was also unexpected. Intuitively, a larger data set might be expected to improve the accuracy of the recovery; instead, studies with more subjects tend to be further from the complete-data solution (a higher total squared difference). It may be that, because the subject and product configurations are estimated simultaneously, the uncertainty introduced when fitting a large number of subjects means that the product points are less accurately located, and vice versa; the positive effect of a higher number of subjects on the recovery of their own configuration does not appear to outweigh this negative effect, at least when recovery is judged by the total squared difference for both products and subjects.

It was also surprising that the dimensionality of the preference space which was used to generate the data was not more important. One of the main reasons for including Dimensionality in the study was that Bello (1993) noted that the relative performance of imputation techniques varied depending on the dimensionality of the data, with non-iterative techniques (like Mean Substitution) outperforming the iterative ones on higher-dimensional data. We did not find this effect convincingly in this study: although Mean Substitution did better on the 4-dimensional High Noise data and the iterative techniques were better on 2-dimensional Low Noise data, the effect was never very influential.

Some of these conclusions were supported when the simulation process was applied to a real data set. However, because only one data set was available, it was not possible to confirm the relative importance of a number of the factors (the size of the data set, the level of noise, and the dimensionality of the preference space).


Given that each consumer trial has its own unique structure and features, even if more data sets had been available it is unlikely that they would have fitted the requirements of a factorial arrangement which would have allowed the testing of all the effects studied in the simulation.

The adequacy tests imply that all the techniques give results which are sufficiently close to those which would be obtained from a complete data matrix if only 5% of the values are missing. With 35% of the values missing, the results may be questionable, while at 65% no technique reliably produces sufficiently good results, regardless of the size of the data set or the accuracy of the assessors.

One point raised by a reviewer was that rotating the Product and the Consumer configurations separately, and judging each by its own goodness of fit, may be unreasonable: a better criterion would combine information on the goodness of fit of both sets of points, using a single rotation applied to both the Product and the Consumer configurations. A single rotation was considered initially, but discarded because it was felt that the sum of squares would have been dominated by the Consumer configuration (which always contained far more points than the Product configuration), and so would not give a balanced picture of the performance. One reviewer subsequently suggested a weighted Procrustes criterion in which the Product and Consumer configurations have equal influence despite the different numbers of points in each, which would avoid this problem. A number of other refinements were also suggested: for instance, checking how well the imputed data reconstruct the original matrix of preference scores.
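To make the weighted criterion concrete, the sketch below finds a single rotation for both point sets. Giving each set a weight inversely proportional to its number of points is an assumption made here for illustration only; the text above does not specify the weights.

```python
# Sketch of a single weighted-Procrustes rotation applied to both the Product
# and the Consumer configurations, with each set given equal influence
# (here: weight 1 / number of points per set, an assumption).
# All configurations are assumed to be centred already.
import numpy as np

def joint_rotation(product_ref, product_new, consumer_ref, consumer_new):
    pairs = [(np.asarray(product_ref, float), np.asarray(product_new, float)),
             (np.asarray(consumer_ref, float), np.asarray(consumer_new, float))]
    m = sum((new.T @ ref) / len(ref) for ref, new in pairs)   # equal influence per set
    u, _, vt = np.linalg.svd(m)
    return u @ vt                                             # rotation applied to both 'new' sets

def combined_fit(product_ref, product_new, consumer_ref, consumer_new):
    p_ref, p_new = np.asarray(product_ref, float), np.asarray(product_new, float)
    c_ref, c_new = np.asarray(consumer_ref, float), np.asarray(consumer_new, float)
    r = joint_rotation(p_ref, p_new, c_ref, c_new)
    return np.mean((p_ref - p_new @ r) ** 2) + np.mean((c_ref - c_new @ r) ** 2)
```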

Given the comparatively small differences in performance between the three most successful techniques, our recommendation would be to use mean substitution (replacing missing values with the mean score for the corresponding stimulus), since it is simple to program, executes almost instantaneously, and performs at least as well as the more complex iterative techniques.
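As a minimal sketch of the recommended procedure (rows are assessors, columns are stimuli, and NaN marks a missing score):

```python
# Mean substitution: replace each missing value by the mean of the observed
# scores for the corresponding stimulus (column).
import numpy as np

def mean_substitution(scores):
    filled = np.array(scores, dtype=float)        # copy; NaN marks missing cells
    stimulus_means = np.nanmean(filled, axis=0)   # mean of observed scores per stimulus
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = stimulus_means[cols]
    return filled
```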

One refinement we would recommend would be to take account of the assessor's average preference score, as well as whether the stimulus was scored above or below average by the group of assessors as a whole: for instance, by using regression or ANOVA to calculate assessor and stimulus effects, and then summing the relevant effects and the grand mean to obtain an imputed value. Indeed, since the scores for each subject are centred to have a mean of zero as the first step in preference mapping, the subject's own mean, rather than the stimulus mean, should probably be used to replace missing values if regression or ANOVA estimates cannot be used. In the simulation study this was not an important factor, because no special effort was made to introduce different mean scores for different assessors when constructing the data sets; however, in actual trials, where subjects might be expected to have different mean scores, it may improve performance noticeably.
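A sketch of this refinement, using simple observed-cell means as rough stand-ins for the regression or ANOVA effect estimates described above (a proper two-way main-effects fit would be preferable when much of the data is missing):

```python
# Additive imputation: missing score = grand mean + assessor effect + stimulus effect,
# with the effects estimated here from the observed cells only.
import numpy as np

def additive_imputation(scores):
    filled = np.array(scores, dtype=float)                 # NaN marks missing cells
    grand = np.nanmean(filled)
    assessor_eff = np.nanmean(filled, axis=1) - grand      # row (assessor) effects
    stimulus_eff = np.nanmean(filled, axis=0) - grand      # column (stimulus) effects
    rows, cols = np.where(np.isnan(filled))
    filled[rows, cols] = grand + assessor_eff[rows] + stimulus_eff[cols]
    return filled
```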

Improved performance of both the imputation techniques and the original analysis is probably best sought not in further refinements to the imputation, but in conducting the trial in an appropriate manner in the first place (by reducing sources of random noise, choosing an appropriate sample size, etc.), as in any other consumer trial.

ACKNOWLEDGEMENTS

The authors would like to thank two anonymous reviewers for drawing attention to more appropriate ways of analysing the data; Dr Hal MacFie for guidance throughout the project; Dr Neil Gains for suggestions on the rotation problem; and several attendees at the Second Sensometrics Group Meeting in Edinburgh for suggesting imputation-by-ANOVA. This work was conducted as part of a U.K. Ministry of Agriculture, Fisheries and Food LINK project on Mathematical Methods for Preference Mapping.

REFERENCES

Beale, E. M. L. & Little, R. J. A. (1975). Missing values in multivariate analysis. J. R. Statistical Soc., Series B, 37, 129-45.
Bello, A. L. (1993). Choosing among imputation techniques for incomplete multivariate data; a simulation study. Commun. Stat. Theory & Methods, 22(3), 853-77.
Box, G. E. P., Hunter, W. G. & Hunter, J. S. (1978). Statistics for Experimenters. Wiley, New York.
Buck, S. F. (1960). A method of estimation of missing values in multivariate data suitable for use with an electronic computer. J. R. Statistical Soc., Series B, 22, 302-6.
van Buuren, S. & van Rijckevorsel, J. L. A. (1992). Imputation of missing categorical data by maximising internal consistency. Psychometrika, 57, 567-80.
Dempster, A. P., Laird, N. & Rubin, D. B. (1977). Maximum likelihood estimation from incomplete data via the EM algorithm (with discussion). J. R. Statistical Soc., Series B, 39, 1-38.
Gabriel, K. R. (1971). The biplot graphical display of matrices with application to principal component analysis. Biometrika, 58, 453-67.
Gabriel, K. R. & Zamir, S. (1979). Lower rank approximation of matrices by least squares with any choice of weights. Technometrics, 21, 489-98.
King, B. M. & Arents, P. (1991). A statistical test of consensus obtained from generalised Procrustes analysis. J. Sens. Stud., 6, 37-48.
Kruskal, J. B. (1964). Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika, 29, 1-27.
Krzanowski, W. J. (1988a). Principles of Multivariate Analysis. Oxford University Press, Oxford.
Krzanowski, W. J. (1988b). Missing value imputation in multivariate data using the singular value decomposition of a data matrix. Biometrical Letters, XXV, 31-9.
Little, R. J. A. & Rubin, D. B. (1987). Statistical Analysis with Missing Data. Wiley, New York.
Meulman, J. (1982). Homogeneity Analysis of Incomplete Data. DSWO Press, Leiden.
Rubin, D. B. (1991). EM and beyond. Psychometrika, 56, 241-54.
SAS Institute (1989). The PRINQUAL procedure. In SAS/STAT User's Guide, Chap. 34. SAS Institute Inc., Cary, NC, pp. 1265-323.
SAS Institute (1990). SAS Technical Report R-108: Algorithms for the PRINQUAL and TRANSREG Procedures. SAS Institute Inc., Cary, NC.
Schlich, P. (1994). Uses of change-over designs and repeated measurements in sensory and consumer studies. Food Qual. Pref., 4, 223-36.
Spence, I. & Domoney, D. W. (1974). Single subject incomplete designs for nonmetric multidimensional scaling. Psychometrika, 39, 469-90.
Wakeling, I. N., Raats, M. M. & MacFie, H. J. H. (1992). A new significance test for consensus in generalised Procrustes analysis. J. Sens. Stud., 7, 91-6.
Whelehan, O. P., MacFie, H. J. H. & Baust, N. G. (1987). Use of individual differences scaling for sensory studies: simulated recovery of structure under various missing value rates and error levels. J. Sens. Stud., 1, 1-8.
Wu, C. F. J. (1983). On the convergence properties of the EM algorithm. Ann. of Statistics, 11, 95-103.