Modelling time-intensity curves using prototype curves

Food Quality and Preference, Vol. 8, No. 2, pp. 131-140, 1997. © 1997 Elsevier Science Ltd. Printed in Great Britain. All rights reserved. PII: S0950-3293(96)00039-0

Garmt Dijksterhuis (a)* & Paul Eilers (b)

(a) ID-DLO Institute for Animal Science and Health, Food Science Department, Sensory Laboratory, PO Box 65, NL-8200, Lelystad, The Netherlands
(b) DCMR Environmental Protection Agency, The Netherlands

(Accepted 23 July 1996)

ABSTRACT

Usually time-intensity (TI) curves are summarised using a number of parameters, such as the maximum intensity, the time of maximum intensity, the area under the curve, etc. These parameters are derived from a TI curve averaged over assessors. This averaging has been the subject of some debate and some alternative methods to averaging curves have been proposed. Recently a new approach, 'the projected prototype curve model', is suggested based on the assumption of an underlying smooth curve which is projected onto the data. This projection takes place in both the time and intensity domains separately. The model is applied to a set of TI curves and it is shown to provide a good fit to the data. © 1997 Elsevier Science Ltd. All rights reserved

INTRODUCTION

Time-intensity (TI) studies have been used to study flavour release and the development of taste intensity over time, for about 40 years. One of the first papers on time-intensity studies is the paper by Neilson (1957). Usually TI curves are summarised using a number of parameters, like the maximum intensity, the time of maximum intensity, the area under the curve, etc. (see e.g. Lee and Pangborn, 1986). These parameters are usually derived from a TI curve averaged over assessors. This averaging has been the subject of some debate and some alternative methods to averaging of the TI curves have been proposed (Liu and MacFie, 1990; van Buuren, 1992; Dijksterhuis, 1993). In this paper a new approach is suggested based on the assumption of an underlying smooth curve which is projected onto the data. This projection takes place in both the time and intensity domains.

TI data have some clear properties that seem to call for specific ways of statistical analysis (see also Dijksterhuis, 1995):

• TI curves contain a high number of data points
• there are large individual differences
• there are intra-individual consistencies
• there are differences between stimuli
• TI curves have a distinctive shape (see Fig. 1).

Rationale to use TI methods

There appears to be an increasing recognition of the need for dynamic, instead of static models, and also for non-linear instead of linear models (cf. Luce, 1995). Conventional sensory methods using difference tests, line-scales, etc., implicitly regard the sensory properties under investigation as static phenomena. This implies a model in which the static judgement is a kind of integral of the perception over time, from the moment the stimulus is put in the mouth to the time of swallowing or of expectoration of the stimulus. Changes taking place during this time cannot be inferred from static judgements (see e.g. Dijksterhuis, 1996).

Both physical and psychological processes appear responsible for the dynamism in taste and its perception. In Fig. 2 a tentative model which includes these processes is shown. Several physical and chemical processes turn a food or a drink, the outside stimulus, into the inside stimulus by breaking down the matrix, diluting with saliva, etc. The volatiles that are released trigger physiological processes in the olfactory and taste receptors. The neural responses of the receptors are the starting point for the psychological side of flavour perception, which ultimately leads to a response.

Because there are large individual differences in the processes that take place in the mouth, the release of flavour will differ over individuals. Fischer et al. (1994) show that there exist significant differences in saliva flow rate, causing different temporal perception of gustatory stimuli. It is also documented that there are large individual differences in chewing behaviour (Brown, 1994; Brown et al., 1994). The differences in chewing pattern can lead to differences in flavour release and hence to differences in perception.

*To whom correspondence should be addressed.
Presentation delivered at the Second Rosemary Pangborn Sensory Science Symposium, July 30-August 3, 1995, University of Davis, California, USA.

ANALYSING TI CURVES

In this section the different approaches that have been used thus far are briefly illustrated. The TI data used come from a study in which 14 assessors received three concentrations of caffeine and three concentrations of quinine (Flipsen, 1992). The data from the lowest concentration of caffeine are analysed using different models to illustrate the appropriateness of the models to use for the analysis of TI curves.

Averaging TI curves

The averaging of TI curves takes place over the assessors:

$\hat{y}_{tj} = \bar{y}_t = \frac{1}{J} \sum_{j=1}^{J} y_{tj}$  (1)

where the $y_{tj}$ are the data, i.e. perceived intensities collected at times $t = 1, \ldots, T$, for each assessor $j = 1, \ldots, J$. The average curve minimises

$\sum_{t=1}^{T} \sum_{j=1}^{J} (y_{tj} - \bar{y}_t)^2$  (2)

Figure 3 shows 14 individual TI curves (thin line) and their average TI curve (dashed line). It is clear that the shape of the average curve is not representative of most individual curves. One property of the average curve which clearly does not represent the observed curves is its length. Consider for example curve no. 4 or 7: the observed curve is much shorter than the average curve. Clearly one average curve cannot sufficiently represent all individual observed curves. Analogously, the heights of the observed curves differ considerably from the height of the average curve (see e.g. curves nos 2, 3 and 13).

Principal curves

The next step is to calculate, instead of the usual average curve, a weighted average curve, such that representative curves receive large weights and deviant curves low weights. This is essentially what happens in calculating principal curves using principal component analysis (van Buuren, 1992). There are different ways of standardising the data prior to the principal component analysis, which result in different curves (see Dijksterhuis et al., 1994). The non-centred PCA (see Dijksterhuis, 1993) looks most promising, but this point is not pursued here any further. In addition to the first principal curve, which looks very much like the average curve in the case of a non-centred PCA, 2nd, 3rd, and higher principal curves can be obtained. The loadings of the PCA provide information on the assessors (see Dijksterhuis et al., 1994). Thus we try to approximate the data by

$\hat{y}_{tj} = u_t v_j$  (3)

where $u_t$ is a common, underlying curve, and the $v_j$ are a set of weights per assessor. This amounts to finding $u_t$ and $v_j$ such as to minimise

$\sum_{t=1}^{T} \sum_{j=1}^{J} (y_{tj} - u_t v_j)^2$  (4)

There are certain disadvantages associated with the PCA of TI curves. One is that the ordinal aspect of time is disregarded in the analysis: any permutation of the time points will result in the same solution. The other disadvantage is that the time axis is not scaled; the scaling takes place only in the intensity direction of the curve. In Fig. 4 the same 14 individual curves are shown as in Fig. 3, together with the weighted principal curve. Note that the fitted curves differ in height only. Figure 4 shows that the weighted principal curves are more representative of the individual observed curves with respect to the heights of the curves. The lengths of the curves still seem problematic; see for example curves nos 1, 4 and 7, where the principal curve is much longer than the observed curve. Of course a scaling of the intensity direction only will never result in a shorter curve, except in the uninteresting case when the weight is zero.
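The contrast between the average curve and the first principal curve can be illustrated with a minimal sketch. It assumes synthetic curves that share one shape and differ in height only; the data, the helper names `average_curve` and `first_principal_curve`, and the use of a singular value decomposition for the non-centred PCA are illustrative choices, not the authors' implementation.

```python
import numpy as np

def average_curve(Y):
    """Average over assessors; Y has shape (T, J), one column per assessor."""
    return Y.mean(axis=1)

def first_principal_curve(Y):
    """Best rank-one least-squares fit Y ~ u v^T (non-centred PCA) via the SVD."""
    U, s, Vt = np.linalg.svd(Y, full_matrices=False)
    u, v = s[0] * U[:, 0], Vt[0]
    if v.sum() < 0:          # fix the sign so that the weights are positive
        u, v = -u, -v
    return u, v

# Synthetic (hypothetical) TI curves: one shape, four different heights.
rng = np.random.default_rng(0)
t = np.arange(60.0)
shape = t * np.exp(-t / 10.0)
heights = np.array([0.5, 1.0, 2.0, 3.0])
Y = np.outer(shape, heights) + rng.normal(scale=0.05, size=(60, 4))

avg = average_curve(Y)
u, v = first_principal_curve(Y)
sse_avg = ((Y - avg[:, None]) ** 2).sum()      # residual sum of squares, average curve
sse_pca = ((Y - np.outer(u, v)) ** 2).sum()    # residual sum of squares, principal curve

# Permuting the time points permutes u but leaves the weights v unchanged,
# which shows that the ordering of time plays no role in the PCA.
perm = rng.permutation(len(t))
u_p, v_p = first_principal_curve(Y[perm])
```

Because the average curve with unit weights is itself a rank-one approximation, the principal curve can never fit worse in the least-squares sense; it still cannot shorten or lengthen a curve along the time axis.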

Scaling the time axis

There have been a number of attempts to model the curves also in the time direction. Overbosch et al. (1986) provide a method in which the rising and falling flanks of the curve are modelled separately. Liu and MacFie (1990) note some disadvantages of the Overbosch method and provide an alternative method. The disadvantage of the methods that weight the time axis is that the curves need either to be resampled or certain 'landmarks' on the curves are needed. It may be hard to find good landmarks due to plateaux and jumps in the curves.

FIG. 1. Typical TI curve. Recorded perceived intensity values are plotted against time.

FIG. 2. Two-step model underlying time intensity curves.

Stretching/shrinking both time and intensity axes

A method to stretch or shrink both the intensity and the time axis was suggested by Dijksterhuis and Van den Broek (1995). They scale the entire TI curve isotropically. This means that the curve is stretched or shrunk with the same factor in both directions. However their method only provides an 'incomplete' solution because there is no reason to assume that the intensity and time direction need be scaled with the same amount.

THE PROJECTED PROTOTYPE CURVE MODEL

The analysis of TI data and the associated problems led to the following line of thought. What is needed is a method that enables the stretching and shrinking along time and intensity axes separately. When we can assume a smooth curve underlying the TI curves, we can attempt to formulate a model. In this model the individual observed curves are assumed to be distorted versions of the smooth underlying curve. The distortions will differ per assessor and will be different in the intensity and the time direction. The projected prototype curve model (Eilers, 1993a,b) is a model which can be adapted to meet these needs (see Eilers and Dijksterhuis, 1995).

FIG. 3. Example of 14 individual TI curves (solid line) and their average curve (dashed line). Intensity vs time (sec); one substance, 14 panelists; data and mean curve.

The prototype curve approach consists of three parts:

• a smooth prototype curve $f(t)$
• intensity scale factors $a_j$
• time scale factors $b_j$
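The three parts listed above can be sketched as follows. The cubic `prototype` function is a hypothetical stand-in for the smooth curve $f$ (the paper builds $f$ from B-splines), and the scale factors are made-up values; the sketch only shows how one shape yields curves of different height and length through $a_j f(b_j t)$.

```python
import numpy as np

def prototype(x):
    """Hypothetical smooth prototype on the normalised time scale [0, 1].

    It rises to a maximum of 1 at x = 1/3 and returns to 0 at x = 1;
    outside [0, 1] it is taken to be zero.
    """
    x = np.asarray(x, dtype=float)
    inside = (x >= 0.0) & (x <= 1.0)
    return np.where(inside, 6.75 * x * (1.0 - x) ** 2, 0.0)

def distorted_curve(t, a_j, b_j):
    """Curve of assessor j: intensity scale a_j, time scale b_j."""
    return a_j * prototype(b_j * t)

t = np.arange(0.0, 121.0)                                   # time in seconds
curve_short = distorted_curve(t, a_j=40.0, b_j=1.0 / 30.0)  # high and short
curve_long = distorted_curve(t, a_j=20.0, b_j=1.0 / 90.0)   # low and long
```

A larger $b_j$ compresses the curve in time (the first curve ends after 30 s, the second after 90 s), while $a_j$ changes the height only.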

The smooth underlying prototype curve $f$ is a function of a normalised time scale, $b_j t$. For each assessor $j = 1, \ldots, J$, scale factors in the intensity direction, $a_j$, and in the time direction, $b_j$, are needed. The mathematical problem is to estimate $f$, $a_j$ and $b_j$ (see e.g. Eilers, 1993a). The smooth underlying function $f$ is made out of B-splines. A spline is a curve with some special properties. It is smooth, which in mathematical terms means that it is continuous and differentiable. Figure 5 shows a so-called quadratic B-spline. It is constructed from three polynomials of the second order, i.e. three quadratic functions. These quadratic functions are the parabolas drawn in Fig. 5. The three parabolas join smoothly at so-called knots. Figure 6a shows the same B-spline as in Fig. 5, and Fig. 6b shows four translated copies of it. A number of the splines shown in Fig. 6a can be used to build any other curve. In the prototype curve model, we construct $f$ as a sum of $K$ shifted copies of the B-spline, $B_k(t)$, each multiplied by a coefficient $c_k$:

$f(t) = \sum_{k=1}^{K} c_k B_k(t)$  (5)

The coefficients $c_k$ determine the shape of the resulting prototype curve $f$. In the process of fitting the TI curves, the prototype curve is scaled in both the intensity and time direction to maximise the fit to the observed individual TI curves. In Fig. 7 one TI curve (jagged curve) is shown together with the six spline curves (thin lines) that build the underlying projected prototype curve (thick line). See Appendix A for a more detailed presentation of the method. In Appendix B some remarks on the quality of the fit and the number of estimated parameters are made.

FIG. 4. Example of 14 individual TI curves (solid line) and their principal curve (dashed line). Intensity vs time (sec); one substance, 14 panelists; data and PCA fit.

FIG. 5. Example of a quadratic B-spline curve, and its three constituting parts, three parabolas.

The 14 individual TI curves that were used to illustrate the other methods (averaging in Fig. 3, PCA in Fig. 4) are shown in Fig. 8, together with the fitted projected prototype curves. Eyeballing the curves shows the improved fit compared to the average and the principal curve.
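The construction of a prototype curve as a weighted sum of shifted quadratic B-splines can be sketched in a few lines. The sketch assumes equally spaced knots on the domain from 0 to 1 with four knot intervals, which gives six quadratic B-splines; the function name `bspline_basis` and the coefficient values are hypothetical and serve only as an illustration.

```python
import numpy as np

def bspline_basis(x, n_segments=4, degree=2):
    """Uniform B-spline basis on [0, 1], built with the Cox-de Boor recursion.

    Returns an array of shape (len(x), n_segments + degree); each column is
    one shifted copy of the same B-spline.
    """
    x = np.asarray(x, dtype=float)
    h = 1.0 / n_segments                                   # knot spacing
    knots = h * np.arange(-degree, n_segments + degree + 1)
    # Degree 0: indicator function of each knot interval.
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        # Each degree-d spline blends two neighbouring degree-(d-1) splines.
        B = ((x[:, None] - knots[:-d - 1]) * B[:, :-1]
             + (knots[d + 1:] - x[:, None]) * B[:, 1:]) / (d * h)
    return B

x = np.linspace(0.0, 1.0, 101)
B = bspline_basis(x)                            # six quadratic B-splines
c = np.array([0.0, 0.6, 1.0, 0.5, 0.2, 0.0])    # hypothetical coefficients c_k
f = B @ c                                       # f(x) = sum_k c_k B_k(x)
```

On the domain the B-splines are non-negative and add up to one at every point, so the coefficients act as local control values for the shape of the curve.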

FIG. 6. Example of the same B-spline curve (a) and four translated copies of it (b).

FIG. 7. Example of a TI curve and a smooth prototype curve consisting of six B-splines.

EXAMPLE

Data from Flipsen (1992) are used to illustrate the use of the projected prototype curve model for TI data. Two bitter substances, caffeine and quinine, were presented to 14 assessors in three concentrations (low, medium, high). To get an impression of the data, Fig. 9 shows the average TI curves per (substance × concentration) combination. Clearly the three quinine curves were stronger than the three caffeine curves. Furthermore, the concentration effect is clearly visible for both substances: the higher concentrations show higher curves. It is not easy to infer other properties of the different curves [...].

Initially only the first 60 s of the curves were used, because the curve tails might obscure the results due to several end-effects. Most assessors' curves reach zero within the first 60 s, so the loss of data is relatively low. Other curves never reach zero but remain at a low constant value, which may be due to the assessor failing to retract the mouse completely. It was assumed safe to replace such low constants by zeros. In subsequent analyses the total curve was used, excluding the zero end-parts.

Experimenting with the number and the order of the B-splines showed that these had not much effect on the solution. More than six splines did not improve the fit, so this number was selected for the analyses. Quadratic splines are the simplest 'curved' spline. The fact that their first derivative consists of linear pieces, a theoretical disadvantage, did not appear to deteriorate the resulting curves and their fit. Hence for the final analyses quadratic splines were used.
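The treatment of the curve tails described in the example (replacing a lingering low constant by zeros and then excluding the zero end-part) can be sketched as follows. The threshold value and the helper names `zero_low_tail` and `trim_trailing_zeros` are hypothetical; the paper does not state how low a constant had to be to be replaced.

```python
import numpy as np

def zero_low_tail(y, threshold):
    """Set the trailing run of values at or below `threshold` to zero.

    `threshold` is an assumed cut-off for a 'low constant' tail.
    """
    y = np.asarray(y, dtype=float).copy()
    above = np.nonzero(y > threshold)[0]
    last = above[-1] if above.size else -1    # last clearly non-zero response
    y[last + 1:] = 0.0
    return y

def trim_trailing_zeros(t, y):
    """Drop the zero end-part of a curve, keeping times and values aligned."""
    nonzero = np.nonzero(y)[0]
    end = nonzero[-1] + 1 if nonzero.size else 0
    return t[:end], y[:end]

# Made-up curve that lingers at a low constant value instead of reaching zero.
t_raw = np.arange(8.0)
y_raw = np.array([0.0, 10.0, 30.0, 20.0, 5.0, 2.0, 2.0, 2.0])
y_clean = zero_low_tail(y_raw, threshold=2.0)
t_used, y_used = trim_trailing_zeros(t_raw, y_clean)
```

Restricting a curve to its first 60 s, as was done initially, would simply amount to slicing the first 60 samples when one value per second is recorded.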

For each quinine and caffeine concentration a prototype curve was estimated using the proposed methodology. The resulting prototype curves, and the individual curves, are shown in Fig. 10. To set it apart from the individual curves, the prototype curve was blown up by a factor 1.5, just for illustratory purposes.

FIG. 8. Example of 14 individual TI curves (solid line) and their projected prototype curve (dashed line). Intensity vs time (sec); one substance, 14 panelists; data and PPC fit; caffeine, low concentration.

FIG. 9. Average curves for the two bitter substances (caffeine and quinine) at three concentrations (Lo, M, Hi).

FIG. 10. Prototype curves for the three caffeine and quinine stimuli (panels: caffeine and quinine, each at high, medium and low concentration). The prototype curve is drawn again after scaling it with a factor 1.5, just to set it apart from the individual curves, for a clearer picture.

The prototype curves clearly show the effect of concentration. It can be seen from Fig. 10 that the low concentration curves are the lowest, the high concentration curves are the highest, and the medium concentration curves are in between.

The projected prototype curve for the high concentration of caffeine shows a negative tail. Of course negative perceived intensities are meaningless. This negative swing is caused by one of the individual curves. This curve has a lingering tail of a low value. After removal of this one tail the swing disappeared and the prototype curve appears, visually, as representative as the prototype curves of the other stimuli (see Fig. 11). This unwanted effect, due to the spline fitting procedure, could be circumvented by using a higher number of underlying splines. On the other hand, the one individual curve could also be seen as an outlier which had to be removed anyway.

The three quinine prototype curves in Fig. 10 do not appear to be much different, apart from the height of the curve. A remarkable fact can be seen when comparing the low concentration curve with the medium and high curves for caffeine. The low curve has the longest tail. This seems counter-intuitive because one would expect low concentrations to give shorter curves. A tentative explanation is that the perceived intensity at the onset of ingesting a high stimulus is so high that subsequent judgements of caffeine bitterness are judged less intense. This could be either an effect of adaptation, or some contrast-effect known from psychophysics.

FIG. 11. Prototype curve for the caffeine-high curves, after removal of the low tail of one of the individual curves.

CONCLUSION

The aggregation of time-intensity curves over assessors is a problem because of the substantial individual differences. The hitherto used averaging of individual TI curves gives a rather poor fit to the observed curves. Principal TI curves show increased fit, but lack a scaling of the time axis. The projected prototype curve model gives representative underlying (prototype) TI curves. These curves have a good fit, better than with the averaging and PCA methods.

The projected prototype curve model as it is suggested in this paper appears to indicate a fruitful line of research into the aggregation of individual TI curves. The model is flexible enough to allow tuning to the specific problems encountered. Some of these problems are the occurrence of negative parts of the prototype curve and the expected difficulties of including the tail part of the curves. Though a good fit was obtained with the projected prototype curve model suggested in this paper, more research into the application of the model to time-intensity data is needed.

ACKNOWLEDGEMENTS

The authors thank Joop de Bree (ID-DLO) for his comments on an earlier version of the paper.

REFERENCES

Brown, W. E. (1994) Method to investigate differences in chewing behaviour in humans: I. Use of electromyography in measuring chewing. Journal of Texture Studies 25, 455-468.
Brown, W. E., Shearn, M. and MacFie, H. J. H. (1994) Method to investigate differences in chewing behaviour in humans: II. Use of electromyography during chewing to assess chewing behaviour. Journal of Texture Studies 25, 455-468.
De Boor, C. (1978) A Practical Guide to Splines. Berlin, Springer.
Dierckx, P. (1993) Curve and Surface Fitting with Splines. Clarendon Press, Oxford.
Dijksterhuis, G. B. (1993) Principal component analysis of Time-Intensity Bitterness Curves. Journal of Sensory Studies 8, 317-328.
Dijksterhuis, G. B. (1995) Multivariate Data Analysis in Sensory and Consumer Science. Thesis, Dept of Datatheory, University of Leiden, The Netherlands.
Dijksterhuis, G. B. (1996) Time-Intensity Methodology: Review and Preview. Proceedings of the COST96 meeting 'Interaction of food matrix with small ligands influencing flavour and texture', Dijon, France, 20-22 November, 1995, pp. 79-81.
Dijksterhuis, G. B., Flipsen, M. and Punter, P. H. (1994) Principal component analysis of time-intensity data. Food Quality and Preference 5, 121-127.
Dijksterhuis, G. B. and van den Broek, E. (1995) Matching the shape of TI-curves. Journal of Sensory Studies 10, 149-161.
Eilers, P. H. C. (1993a) Estimating Shapes with Projected Curves. Paper presented at the 8th workshop on Statistical Modelling, Leuven, Belgium.
Eilers, P. H. C. (1993b) Ervaringen met het modelleren van tijd-intensiteit-curven (Experiences with modelling Time-Intensity curves, in Dutch). Unpublished note. DCMR Milieudienst Rijnmond, The Netherlands.
Eilers, P. H. C. and Dijksterhuis, G. B. (1995) Semi Parametric Modelling of Time-Intensity Data with Projected Prototype Curves. Presentation at the 9th Psychometric Society, Leiden, 4-7 July, 1995.
Fischer, U., Boulton, R. B. and Noble, A. C. (1994) Physiological factors contributing to the variability of sensory assessments: Relationship between salivary flow rate and temporal perception of gustatory stimuli. Food Quality and Preference 5, 55-64.
Flipsen, M. (1992) Tijd Intensiteit Onderzoek. I: Algemeen; II: ... (Time Intensity Research. I: General; II: Sweet and bitter solutions, in Dutch). Research report OP and P Utrecht/Agricultural University of Wageningen.
Lee, W. E. and Pangborn, R. M. (1986) Time-intensity: The temporal aspects of sensory perception. Food Technology, November 1986, 71-82.
Liu, Y. H. and MacFie, H. J. H. (1990) Methods for averaging time-intensity curves. Chemical Senses 15, 471-484.
Luce, R. D. (1995) Four tensions concerning mathematical modeling in psychology. Annual Reviews in Psychology 46, 1-26.
Neilson, A. J. (1957) Time intensity studies. Drug and Cosmetic Industry 80, 452-453.
Overbosch, P., van den Enden, J. C. and Keur, B. M. (1986) An improved method for measuring intensity/time relationships in human taste and smell. Chemical Senses 11, 331-338.
Overbosch, P., Agterof, W. G. M. and Haring, P. G. M. (1991) Flavor release in the mouth. Food Reviews International 7, 137-184.
van Buuren, S. (1992) Analyzing time-intensity responses in sensory evaluation. Food Technology 46, 101-104.

APPENDIX A

In this appendix we give some details on the fitting of the model. The algorithms have been implemented in Turbo Pascal and Matlab, while a version in S-plus is under construction. Programs can be obtained from the second author (e-mail address: [email protected]).

Let the data be $(t_{ij}, y_{ij})$, $i = 1, \ldots, m$, $j = 1, \ldots, n$, where $j$ indexes the subjects and $i$ the times $t_{ij}$ of the measurements. When we say that the TI curves have a common shape, with linear stretching of the scales, we are assuming that

$\hat{y}_{ij} = a_j f(b_j t_{ij})$

is a good model for the data. Some data may be missing, or the series might otherwise be of unequal length; therefore we introduce weights $w_{ij}$. For missing data $w_{ij} = 0$, otherwise $w_{ij} = 1$.

Our task is to estimate the $a$s, the $b$s and the prototype curve $f(\cdot)$. As a measure of fit we use the (weighted) sum of squared differences between the $y$s and the $\hat{y}$s. We minimize the least squares function

$S = \sum_{j=1}^{n} \sum_{i=1}^{m} w_{ij} \left( y_{ij} - a_j f(b_j t_{ij}) \right)^2$

We can split the problem in three parts:

• estimate the $a$s for given $f(\cdot)$ and $b$s; this is simple linear regression;
• estimate the $b$s for given $f(\cdot)$ and $a$s; this is a non-linear problem, but we can linearize it in the neighbourhood of good approximations of the $b$s, if we can compute the derivative $\partial f(x)/\partial x$;
• estimate the function $f(\cdot)$ for given $a$s and $b$s; we do not expect $f(\cdot)$ to have a simple parametric form, so we model it with (quadratic) B-splines.

By iteratively solving each of these tasks in turn, until convergence, we hope to solve our problem.

It is necessary to introduce normalization conditions, to make the solution unique. The domain and range of $f(\cdot)$ have to be chosen, and/or the $a$s and $b$s have to be normalized. A practical choice is to take 0 to 1 as the domain of $f(\cdot)$, and to normalize the $a$s such that $\sum_{j=1}^{n} a_j^2 / n = 1$. The $b$s are normalized such that $b_j t_{ij} \le 1$ for all $i$ and $j$ where $w_{ij} = 1$.

The equations for estimating the $a$s are explicit:

$a_j = \sum_{i=1}^{m} w_{ij} y_{ij} f(b_j t_{ij}) \Big/ \sum_{i=1}^{m} w_{ij} f^2(b_j t_{ij})$

To estimate $f(\cdot)$ we use (quadratic) B-splines, as described in the body of the paper:

$f(x) = \sum_{k=1}^{K} c_k B_k(x; d)$

where the degree of the B-splines is indicated explicitly by $d$. B-splines of degree $d$ are computed recursively from B-splines of degree $d - 1$ (De Boor, 1978; Dierckx, 1993), so lower degree B-splines are obtained automatically. The minimization of $S$ leads to a system of linear regression equations.

To estimate the $b$s, we linearize $f(b_j x)$ in the neighbourhood of a guess $\tilde{b}_j$ as follows:

$f(b_j x) = f(\tilde{b}_j x + \Delta b_j x) \approx f(\tilde{b}_j x) + \Delta b_j \, x f'(\tilde{b}_j x)$

This leads to explicit equations for the corrections:

$\Delta b_j = \sum_{i=1}^{m} w_{ij} \left( y_{ij} - a_j f(\tilde{b}_j t_{ij}) \right) a_j t_{ij} f'(\tilde{b}_j t_{ij}) \Big/ \sum_{i=1}^{m} w_{ij} \left( a_j t_{ij} f'(\tilde{b}_j t_{ij}) \right)^2$

The derivative can be computed easily, because of the following property of B-splines on equally spaced knots at distance $h$:

$\frac{\partial}{\partial x} \sum_{k=1}^{K} c_k B_k(x; d) = \frac{1}{h} \sum_{k=2}^{K} (c_k - c_{k-1}) B_k(x; d - 1)$

Now we have all the building blocks for an algorithm. To start the computations, we determine the maximum ($r_j$) of each curve, and the time at which it occurs ($T_j$). We put $a_j = 1/r_j$ and $b_j = 1/T_j$ and normalize $a$ and $b$ as described above. Then we repeatedly estimate $f(\cdot)$, the $a$s and the $b$s, until convergence. Practical experience has shown that 10-20 iterations are sufficient.

Experience has also shown that the number of knots has little influence on the solution. More than five knots on the domain from 0 to 1 reduce the errors by less than 1%. Therefore we work with five knots (two extra knots are used outside the domain from 0 to 1, to construct a complete B-spline basis).

APPENDIX B

Quality of fit and estimated parameters

In this section it is argued that the proposed model uses fewer parameters to model the data than averages or principal curves while still giving a better fit. For each curve there are 60 observations, one each second. The total number of data points, for 14 assessors, is 60 × 14 = 840.

An average curve reduces this number to 60 parameters: one for each second. The computation of principal curves results in one principal component, containing one value for each second, and a weight per assessor; this totals 60 + 14 = 74 parameters. One should lessen this number by one, because of the normalisation of either the component or the parameter per assessor.

The projected prototype curve model gives two parameters for each assessor, one for the scaling factor of the time axis, and one for the scaling factor of the intensity axis. In addition there are seven parameters for the coefficients of the B-splines that build the prototype curve. This amounts to a total of 14 + 14 + 7 = 35 parameters. It is possible to approximate an average curve, or a principal curve, using B-splines. This will reduce the number of parameters drastically, and may give a fairer comparison of the different methods. In this case the average curve will give seven parameters and the principal curve will give 21. Looked upon in this way the projected prototype model introduces more parameters, but also a substantial gain in fit.
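The bookkeeping behind these counts can be written out as a short sketch. The function name `parameter_counts` and its dictionary keys are hypothetical; the numbers follow the description in this appendix (60 seconds, 14 assessors, seven B-spline coefficients), with two scale factors per assessor giving 2 × 14 + 7 = 35 parameters for the projected prototype curve model.

```python
def parameter_counts(n_seconds=60, n_assessors=14, n_splines=7):
    """Number of fitted values per method, before subtracting normalisations."""
    return {
        "data points": n_seconds * n_assessors,
        "average curve": n_seconds,
        "principal curve": n_seconds + n_assessors,
        "projected prototype curve": 2 * n_assessors + n_splines,
        "average curve (B-splines)": n_splines,
        "principal curve (B-splines)": n_splines + n_assessors,
    }

counts = parameter_counts()
for method, count in counts.items():
    print(f"{method}: {count}")
```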

The fit is expressed by $Q$, computed from the differences $y_{ij} - \hat{y}_{ij}$, with $\hat{y}_{ij}$ the fitted value of $y_{ij}$ according to one of the above mentioned models. $Q$ is the sample standard deviation of the differences between the data and the model. For average curves $Q = 11.9$, for the first principal curve $Q = 6.9$ and for the projected prototype curve $Q = 2.9$. This substantiates the conclusion drawn from eyeballing the figures: the projected prototype curve model is superior to both the average curve and the principal curve, in terms of fitting the model to the data.
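The alternating estimation scheme of Appendix A can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' Turbo Pascal or Matlab program: the data are synthetic, the knots are equally spaced with four intervals (six quadratic B-splines), the starting values for the intensity factors are the peak heights themselves, and a step in $b_j$ is only accepted when it lowers the residual sum of squares of that curve. All names (`basis`, `f_and_deriv`, `fit_ppc`) are hypothetical.

```python
import numpy as np

N_SEG, DEG = 4, 2          # assumed: four knot intervals, quadratic B-splines
H = 1.0 / N_SEG            # knot spacing on the domain [0, 1]

def basis(x, degree):
    """Uniform B-splines of the given degree on one shared knot sequence."""
    x = np.asarray(x, dtype=float)
    knots = H * np.arange(-DEG, N_SEG + DEG + 1)
    B = ((x[:, None] >= knots[:-1]) & (x[:, None] < knots[1:])).astype(float)
    for d in range(1, degree + 1):
        B = ((x[:, None] - knots[:-d - 1]) * B[:, :-1]
             + (knots[d + 1:] - x[:, None]) * B[:, 1:]) / (d * H)
    return B

def f_and_deriv(x, c):
    """Prototype f and its derivative from coefficient differences."""
    f = basis(x, DEG) @ c
    df = basis(x, DEG - 1)[:, 1:-1] @ np.diff(c) / H
    return f, df

def fit_ppc(t, Y, W, n_iter=20):
    """Alternate between estimating f (coefficients c), the a's and the b's."""
    n = Y.shape[1]
    rows, cols = np.nonzero(W)                 # observations with weight 1
    y = Y[rows, cols]
    t_last = np.array([t[W[:, j] > 0].max() for j in range(n)])
    a = Y.max(axis=0).astype(float)            # start: peak heights (assumption)
    b = 1.0 / t[Y.argmax(axis=0)]              # start: 1 / time of the peak
    b /= (b * t_last).max()                    # enforce b_j * t_ij <= 1
    for it in range(n_iter + 1):
        x = b[cols] * t[rows]
        # f-step: linear regression for the spline coefficients
        c = np.linalg.lstsq(a[cols, None] * basis(x, DEG), y, rcond=None)[0]
        # a-step: explicit least-squares solution per assessor
        f, _ = f_and_deriv(x, c)
        a = (np.bincount(cols, weights=y * f, minlength=n)
             / np.bincount(cols, weights=f * f, minlength=n))
        scale = np.sqrt(np.mean(a ** 2))       # normalise: mean of a_j^2 is 1
        a, c = a / scale, c * scale
        if it == n_iter:
            break
        # b-step: one linearised (Gauss-Newton) correction per assessor
        f, df = f_and_deriv(x, c)
        res = y - a[cols] * f
        g = a[cols] * t[rows] * df
        db = (np.bincount(cols, weights=res * g, minlength=n)
              / np.bincount(cols, weights=g * g, minlength=n))
        b_try = np.clip(b + db, 0.5 * b, 1.0 / t_last)
        res_try = y - a[cols] * (basis(b_try[cols] * t[rows], DEG) @ c)
        better = (np.bincount(cols, weights=res_try ** 2, minlength=n)
                  < np.bincount(cols, weights=res ** 2, minlength=n))
        b = np.where(better, b_try, b)
    fit = np.zeros_like(Y, dtype=float)
    fit[rows, cols] = a[cols] * (basis(b[cols] * t[rows], DEG) @ c)
    return a, b, c, fit

# Synthetic curves sharing one shape, with different heights and lengths.
t = np.arange(60.0)
lengths = np.array([30.0, 40.0, 50.0, 59.0])
xs = t[:, None] / lengths
W = (xs <= 1.0).astype(float)                  # weight 0 beyond the end of a curve
Y = np.array([20.0, 30.0, 40.0, 50.0]) * 6.75 * xs * (1.0 - xs) ** 2 * W

a, b, c, fit = fit_ppc(t, Y, W)
rmse_ppc = np.sqrt(((Y - fit)[W > 0] ** 2).mean())
rmse_avg = np.sqrt(((Y - Y.mean(axis=1, keepdims=True))[W > 0] ** 2).mean())
```

On these synthetic curves the root mean squared residual of the fitted prototype model (`rmse_ppc`) can be compared with that of the plain average curve (`rmse_avg`), in the spirit of the comparison of $Q$ values in Appendix B.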