ELSEVIER
Food Quality and Preference 6 (1995) 61-67 © 1995 Elsevier Science Limited. Printed in Great Britain. All rights reserved. 0950-3293/95/$9.50+.00
0950-3293(94)000424-x
CONFIDENCE INTERVALS FOR THE TRIANGLE TEST CAN GIVE REASSURANCE THAT PRODUCTS ARE SIMILAR
A. W. MacRae
School of Psychology, The University of Birmingham, Edgbaston, Birmingham, UK, B15 2TT
(Accepted 3 January 1995)
ABSTRACT
One of the major advantages of the triangle test, to set against its low statistical power, is its potential for revealing discriminable sensory differences when the nature of the difference is unknown. That makes it an attractive tool for seeking assurance that there is no sensory difference between samples (after a process change, for instance). Mere absence of significant difference is completely inadequate to give that reassurance and power analysis is much better. However, power analysis requires three somewhat arbitrary parameter values to be selected in advance. An alternative approach based on exact binomial confidence intervals is described which needs only a single parameter, comparable to the alpha level in a test of significance, to be specified. It is shown that the amount of data usually envisaged for seeking reassurance about lack of difference is much too small to do the task adequately.
INTRODUCTION
The triangle test requires an assessor to locate the odd sample in a set of three, of which two are identical. It is widely used for sensory analysis in industry in spite of concern about its poor sensitivity (Ennis, 1990, 1993). One reason for its widespread use is the fact that an assessor can make successful judgements without knowing how the samples differ. That makes it useful for assessing the detectability of sensory changes in a product when a new source of raw material or a changed process is introduced, because in such cases, not only the assessors but also the production experts may be uncertain about the nature of the alteration, if any, in the finished product. However, the only statistical analysis generally used is calculation of the significance of any above-chance success rate. While that is a satisfactory way to demonstrate the existence of a difference, should that be needed, it is woefully inadequate for the purpose of giving reassurance about absence of a perceptible difference. It is certainly not the case that lack of a significant difference constitutes evidence that there is no difference. However, awareness of that fact is not universal, and the current international standard for the triangle test (ISO, 1983) offers no analysis to give reassurance that two versions of a product are similar, though it is intended that the next revision of the ISO standard will include ways of seeking reassurance that samples are nearly indistinguishable.
One approach to assessing the strength of the evidence for similarity is a power analysis in which we determine not only an acceptable level (alpha) of Type-1 error (falsely concluding that there is a real difference when only chance is at work in the data) but also an acceptable level (beta) of Type-2 error (failing to conclude that there is a difference when in fact there is one). Tables to assist in the task have been provided by Schlich (1993), who also discusses the background and advantages of this style of analysis. However, Type-2 error does not have a unique value: for any alpha level and for any particular amount of data, the probability of Type-2 error depends on the size of difference that is considered to make a practical difference to the product; that is, it naturally depends on just what counts as a Type-2 error. Thus the approach requires three parameters of the analysis to be specified in advance: alpha, beta and the smallest degree of detectability that matters.
The analyst has a problem if the outcome of a power analysis falls near a borderline. After all, there is something rather arbitrary about these advance choices, especially selecting the smallest difference that matters in practice, so the analyst may wonder if the analysis should be repeated, perhaps with some change in the size of the smallest difference that is considered important, or in the level of beta considered acceptable.
The alternative approach described here avoids that problem because it starts from the results rather than by choosing parameters in the abstract - or rather, only the significance level, alpha, needs to be selected. Because only one arbitrary parameter (alpha) is involved, the approach described here will usually be simpler to apply and is almost always more conceptually direct.
BINOMIAL CONFIDENCE INTERVALS
The approach focuses on confidence bounds for the estimate of success rate provided by the data. For any statistic calculated from a set of data it is possible to calculate two bounds, one above and one below the observed value, spanning a range of values called the confidence interval. The calculation places the bounds so as to give any desired degree of confidence that the true value of the statistic in question (here, the estimated probability of an assessor making a correct choice) lies within the interval. (Rather than speaking of the ‘true value’, many prefer to speak of the ‘value of the statistic in the population from which these particular results were sampled’ and the statistic calculated from the sample then gives an estimate of the corresponding parameter of that population.) As with a test of significance, there is always a possibility of error in the outcome. In a significance test, we choose an alpha level to set the probability of Type-1 error that we consider tolerable. With confidence intervals, we choose a comparable alpha level to control the probability of the true answer lying outside the bounds. With a significance test, setting alpha too small brings a greater chance of failing to label a real effect significant. With confidence intervals, the penalty of too rigorous an alpha is that the bounds must be far apart so we obtain a wide range of credible values for our estimate of the population parameter.
Confidence intervals are rarely invoked in sensory analysis but are a standard tool of statistics. Often, they are calculated from approximations based on the normal distribution, for example by Smith (1981) in a rare invocation of confidence intervals in sensory analysis, but that is only for convenience rather than by necessity. Here, they are calculated from the binomial distribution, the exact distribution of probabilities of frequencies of right and wrong responses in a finite number of trials with the same probability of success on every trial. One of the first papers about confidence intervals (Clopper & Pearson, 1934) used the binomial distribution to illustrate the idea. Here, the approach has been adapted in an unusual way for convenient use with the triangle test.
It is usual to express confidence as a percentage. For example, ‘the 90% confidence interval’ is the range of possible parameter values for which a directional alpha from each of the bounds is 0.05. That is, if we adopt the probability of success at the bound as a ‘null hypothesis’, the data differ from that with an alpha of 0.05 in a directional test. The sum of these alphas is 0.10, giving a 10% chance of error in one direction or the other. With the triangular test, it is customary to invoke a directional hypothesis and that practice has been followed here. Calculations here are for a 90% confidence interval so as to correspond to a directional significance test at the 0.05 level.
The calculations required for exact binomial confidence intervals are simple in principle but too laborious to perform without a computer. The exact upper bound for a particular alpha is found by carrying out binomial tests of significance on the observed results for various values of the probability, p(C), of a correct choice being made. A trial value for p(C) is chosen which is greater than the observed proportion of correct choices. If a significance test using that p(C) as the ‘null hypothesis’ yields a probability greater than the target alpha, p(C) is increased. That is, it is made more different from the observed proportion of successes. If the probability is less than alpha, p(C) is moved closer to the observed proportion. This process is iterated until a value of p(C) is found for which the significance of the data is equal to the desired alpha. The lower bound is found in the corresponding way beginning with a trial value below the observed proportion.
Figure 1 contains all the information needed for an analysis seeking reassurance that a sensory difference is small. The horizontal axis is scaled from 12 to 150, each scale point representing the total number of trials in a study. Ideally, that will equal the number of assessors, each making a single choice of odd item from a triangle of samples. Values are plotted in steps of three, partly to avoid crowding the graph but also because it is good practice to use multiples of three (and six is even better) to permit counterbalancing the position of the odd item across triangles. The vertical axis on the left shows the upper bound of a 90% confidence interval for the probability of making a correct choice. That is, the true value of p(C) is estimated to exceed the predicted bound on no more than 5% of occasions that the technique is used.
(This corresponds to a directional alpha of 0.05 in a test of significance, as commonly used with triangle tests.) The right vertical axis represents a transformation that is often applied to p(C) to represent the percentage of ‘detectors’, P(D), in the population if each individual either detects the difference reliably or guesses completely at random. The relationship between the axes is that P(D) = 150p(C) - 50. Although the model it relates to is very implausible, the transformation is widely used and is provided here for convenience. These scale values do not map neatly onto the grid lines, so if more than an approximation is wanted it is better to calculate it from p(C) since that can be read to about three decimal places. The curved line labelled ‘0’ represents the upper bound for a result where the number of successes is at the chance level - one third of the number of triangles. That is, on the left, four correct responses out of 12 result in an estimate of the population probability of correct responses having an upper bound of 0.609
while, on the right, 50 correct responses out of 150 have an upper bound of 0.402. Reading from the right-hand scale, these probabilities can be translated into 41 and 10% of ‘detectors’, respectively. The lines above it labelled 1, 2, 3 and so on, represent upper bounds when the number of correct choices exceeds the number expected by chance by 1, 2, 3 and so on. Suppose that 75 triangles yield 35 correct responses. The number expected by chance is 25 (1/3 of 75) and the result exceeds that by 10 so we consult the heavy line labelled ‘10’. We can read from the value of that line in the horizontal position corresponding to 75 triangles that p(C) = 0.568 and P(D) = 35%.
These bounds may all seem alarmingly high. If so, it is because it is not generally appreciated how much data it takes to give adequate assurance of similarity. Significant results with a non-directional alpha of 0.05 can be obtained with as few as eight correct in twelve triangles but, with twelve triangles, an outcome at the chance level (four correct) gives reassurance at the same alpha level of 0.05 only that accuracy does not exceed twice the chance rate. Since that translates into more than 40% of the population detecting the difference, that is not much reassurance at all! Even with 150 triangles, only very limited reassurance of similarity can be given if performance is at the chance level. If this paper does no more than emphasize that point for sensory analysts it will have served a useful function.
Below-chance success frequencies
If a directional test is appropriate, then below-chance success frequencies, should they occur, generate lower confidence bounds than do chance outcomes.
[Figure 1. Left vertical axis: upper bound on p(C); right vertical axis: percentage of ‘detectors’, P(D); horizontal axis: Number of Sets of Three.]
FIG. 1. Upper 90% confidence bounds for the estimated population proportion of correct responses in a triangle task as a function of the number of triangles presented and the excess of correct responses over the proportion expected by chance. Readings on the left represent a probability of correct response in the task which is unlikely to be exceeded by the true probability in the population. Readings on the right represent an estimate of the proportion of the population who detect the difference. Both conclusions are drawn with the same confidence as is given by a directional test of significance at the 0.05 level.
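The iterative search for the upper bound described under BINOMIAL CONFIDENCE INTERVALS can be sketched in a few lines of Python. This is only an illustrative sketch and not the author's program: the function names (binom_cdf, upper_bound, percent_detectors) are hypothetical, and bisection is assumed as the rule for adjusting the trial value of p(C), since the text does not say how successive trial values are chosen.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p): the directional test of 'too few correct'."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_bound(correct, n, alpha=0.05, tol=1e-9):
    """Exact upper confidence bound for p(C), found by bisection (an assumption).

    The bound is the trial value of p(C) at which observing `correct` or fewer
    successes in `n` trials has probability exactly alpha.
    """
    if correct >= n:
        return 1.0
    lo, hi = correct / n, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(correct, n, mid) > alpha:
            lo = mid  # data not yet significant: move further from the observed proportion
        else:
            hi = mid  # data significant: move back towards the observed proportion
    return (lo + hi) / 2


def percent_detectors(p_correct):
    """Transformation given in the text: P(D) = 150 p(C) - 50."""
    return 150 * p_correct - 50


if __name__ == "__main__":
    for correct, n in [(4, 12), (50, 150), (35, 75)]:
        p = upper_bound(correct, n)
        print(correct, n, round(p, 3), round(percent_detectors(p)))
```

Under these assumptions the bounds for 4 correct out of 12 and 50 correct out of 150 should come out close to the 0.609 and 0.402 quoted in the text. The bound is not truncated at the chance level here, so for below-chance outcomes it can fall below 1/3.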
[Figure 2. Vertical axis: lower bound on p(C); horizontal axis: Number of Sets of Three.]
FIG. 2. Lower 90% confidence bounds for the estimated population proportion of correct responses in a triangle task as a function of the number of triangles presented and the excess of correct responses over the proportion expected by chance.
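The lower bounds plotted in Fig. 2 can be approximated with the same kind of search, started below the observed proportion. The sketch below is an illustration under stated assumptions rather than the program used for the figure: the names binom_sf, lower_bound and min_significant are hypothetical, and bisection is again an assumed choice of iteration.

```python
from math import comb


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the directional test of 'too many correct'."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lower_bound(correct, n, alpha=0.05, tol=1e-9):
    """Exact lower confidence bound for p(C), found by bisection (an assumption)."""
    if correct <= 0:
        return 0.0
    lo, hi = 0.0, correct / n
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_sf(correct, n, mid) > alpha:
            hi = mid  # data not yet significant: move further below the observed proportion
        else:
            lo = mid  # data significant: move back towards the observed proportion
    return (lo + hi) / 2


def min_significant(n, chance=1 / 3, alpha=0.05):
    """Smallest number correct that is significant in a directional test at alpha."""
    for k in range(n + 1):
        if binom_sf(k, n, chance) <= alpha:
            return k
    return None


if __name__ == "__main__":
    print(round(lower_bound(50, 60), 3))
    print([min_significant(n) for n in (12, 15, 30)])
```

A lower bound above 1/3 is equivalent to a significant result in a directional test at the same alpha, which is why min_significant only needs the tail probability at the chance level.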
(Although the outcome is attributed to the influence of chance alone, the particular outcome observed is less plausible for some higher value of the population parameter than is an outcome at the chance level.) However, if the true state of affairs is that the sensory difference is imperceptible there is no way to encourage below-chance outcomes to occur - they are just occasional happy accidents. They are irrelevant if the purpose is to demonstrate the existence of a perceptible difference, but if the purpose is to gain reassurance that the sensory difference is small, the analysis should take note of them if they occur. In that case, we use the lines in Fig. 1 labelled -1, -2, -3 and so on, referring to frequencies of success that are 1, 2, 3 and so on below the number expected by chance. If a below-chance outcome does occur, the upper bound may come as low as the chance level. It cannot come below chance since, in order for these lines to be meaningful, we must believe that no conceivable process can generate systematically below-chance outcomes.
With 12 triangles, of which none are correctly identified, the upper bound is 1/3. In the graph, that corresponds to the line labelled ‘-4’ being below the base of the graph in the position corresponding to 12 triangles. Since the chance outcome for 12 triangles is four correct, -4 represents no correct responses. As another example, an upper bound at chance level is given by 60 triangles of which 13 or fewer are correct (because the chance frequency is 20 and the line representing -7 is below the baseline). It must be emphasized, however, that although these outcomes are possible they are very unlikely and it would not be sensible to design a study in the hope that they will occur. The probability of zero successes in 12 triangles is less than 0.01 by chance alone, for instance. If a difference is truly imperceptible, below-chance and above-chance outcomes will occur about equally often.
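The below-chance figures quoted above can be checked with a short calculation. The following is a minimal sketch, assuming a guessing probability of 1/3 and the same directional alpha of 0.05. The helper name prob_at_most and the variable names are hypothetical.

```python
from math import comb

CHANCE = 1 / 3  # probability of a correct choice by guessing in a triangle test


def prob_at_most(k, n, p=CHANCE):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Probability of no correct choices in 12 triangles by chance alone: (2/3)**12.
p_zero_of_12 = prob_at_most(0, 12)

# With zero successes the exact upper bound solves (1 - p)**n = alpha, so it has
# a closed form. It falls below chance and is therefore raised to the chance level.
raw_bound = 1 - 0.05 ** (1 / 12)
reported_bound = max(raw_bound, CHANCE)

# The upper bound reaches the chance level when the outcome is itself
# significantly below chance in a directional test at alpha = 0.05.
thirteen_of_60 = prob_at_most(13, 60)
fourteen_of_60 = prob_at_most(14, 60)

if __name__ == "__main__":
    print(p_zero_of_12, raw_bound, reported_bound)
    print(thirteen_of_60, fourteen_of_60)
```

The truncation at the chance level reflects the requirement stated in the text that no conceivable process generates systematically below-chance outcomes.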
THE LOWER CONFIDENCE BOUND
The lower bound of the confidence interval can also be useful. Figure 2 is set out in the same way as Fig. 1, but displays the lower bound - that is, the value below which we can be confident with an alpha level of 0.05 that the true probability of success does not fall. To say that the lower bound lies above the chance level is to exclude the chance level and anything lower as plausible values for the population parameter underlying the result. In other words, the result is statistically significant. From Fig. 2 we can see which outcomes are significant at the 0.05 level for numbers of trials from 12 to 150. With 12 trials, we need 4 above the chance expectation (which is 4), so 8 out of 12 are significant at the 0.05 level. With 15 trials, we again need 4 above chance expectation, that is 5 + 4, or 9. With 30 we need 5 above chance, which is 10 + 5, or 15, and so on. The graph has no advantage over the usual tables for this purpose, but it has further potential.
If our aim is to show that a difference is noticeable (perhaps an intended improvement), Fig. 2 allows us to find a lower limit for the plausible probability of an assessor making the correct choice of sample, as well as just declaring the result to be significant. For example, if there were 60 triangles and 50 correct choices the outcome is 30 above the chance expectation of 20. Our best estimate of the probability of success, p(C), is 50/60, or 0.833, but from Fig. 2 we also see that the lowest plausible value for p(C) is 0.734. From Fig. 1 we see that the highest plausible value is 0.907. Both statements are made with the same confidence which we have in the reality of a result which is just significant at the 0.05 level and the bounds are reasonably narrow.
DISCUSSION
The methods described here apply equally well to any task where the probability of success by chance alone is 1/3, provided that a directional test of significance is appropriate. The probability of success by chance is 1/3 in 3AFC tests and in matching a target sample to one out of three selections, so the graphs provided here may be usable. However, the plausibility of a directional hypothesis may be less for these procedures. If so, the analysis needs to be different in some respects.
When the purpose is to assess the noticeability of some improvement, or in any other situation where the direction of the sensory difference is known, the triangle task is less efficient than three-alternative forced choice (3AFC). That was demonstrated by MacRae & Geelhoed (1992) and Geelhoed et al. (1994), for example. The difference is caused by the different demands of the task on assessors and the consequently different decision strategies that they follow in each task. An explanation of the difference resides in signal-detection theory and has been expressed in various ways by Ennis (1990, 1993), Frijters (1979), Ura (1960) and others. However, the methods described here refer only to observed numbers of each kind of outcome and model them only by probabilities. They are independent of the view taken of the processes of decision underlying observed performance and do not depend on threshold theory, signal detection theory or any other model of psychophysical performance in particular. For that reason, the principle of a binomial confidence interval is equally suitable for other forced-choice difference tests but some, such as the duo-trio test or pair comparisons where the probability of success by chance is 1/2, require different graphs.
Directional and non-directional hypotheses
If the number of correct choices is less than the number expected by chance, the outcome can be explained in two ways. The first is that the result has happened by chance. The second is that there is some cause. If we believe that no conceivable process could cause a below-chance rate of success, we can interpret a below-chance outcome as especially strong evidence (stronger than a chance number of successes) of lack of perceptible difference. If we are not prepared to take that view, we must regard a below-chance success rate as no better than performance at the chance level. Indeed, it might even be interpreted as evidence that a perceptible difference does exist, with the assessors making systematically inappropriate responses to it.
With the triangle test, the second explanation (a systematic effect) is not very plausible because it virtually requires that the assessors have given false responses deliberately: it is not easy to envisage a mechanism by which they could systematically give the wrong answers by mistake. A cause for systematic below-chance answers is perhaps not entirely inconceivable even with the triangle test but with the 3AFC a cause is certainly imaginable: when the task concerns a low level of some flavour defect, say, it is possible that assessors will wrongly identify which type of sample has the defect and will systematically tend to pick one of the pair as having the defect. How plausible such an explanation is depends on various features of the task, the sensory attribute in question and the assessors.
If such effects are considered possible, two consequences follow, both of which weaken the power of the procedure to give reassurance that sensory differences are small. Firstly, the test needs to be non-directional, so a 95% confidence interval is needed to give an alpha of 0.05, leading to wider bounds. (The graphs here can be
used, however, if they are interpreted as representing an alpha of 0.1.) Secondly, since below-chance outcomes may, to an extent, be evidence for a perceptible difference if any systematic cause can be envisaged for a below-chance success rate, they should certainly not be treated as especially strong evidence against a difference. If any should occur, they can (at best) be interpreted in the same way as a success rate at the chance level. The lines in Fig. 1 labelled with negative numbers should be ignored.
Comparison with the tables by Schlich (1993)
The approaches are to some extent complementary. For any number of triangles presented and number of correct responses, the table in Schlich’s Appendix 1 yields the probability, β, of Type-2 error when declaring that the detectability of a difference does not exceed some fixed amount. Here, Fig. 1 yields the largest degree of detectability that might plausibly exist for a fixed probability (β = 0.05) of Type-2 error. (Higher levels of detectability are therefore judged to be implausible.) Beware that detectability is also expressed differently: here it is the probability of choosing correct responses, p(C), but Schlich gives an estimate of the percentage of ‘detectors’ in the population, which I call P(D) and he calls p_d. To convert between the probability of correct responses and the percentage of ‘detectors’ in the population, we use one of the equations: p(C) = (P(D) + 50) / 150 or P(D) = 150p(C) - 50.
Here is an example of how the two approaches work out. When 30 triangles are tested and 11 correct responses obtained, the appropriate table in Schlich (1993) gives β = 0.00 for 50% of ‘detectors’, β = 0.01 for 37.5% and β = 0.05 for 25% of ‘detectors’. For β = 0.05, Figure 1 gives p(C) as 0.533, by consulting the line for +1 (11 is 1 greater than the chance expectation of 10 correct) above the horizontal scale position corresponding to 30 triangles. Converting p(C) to P(D), the percentage of ‘detectors’, gives 30%, not too different from the 25% indicated by Schlich’s table, bearing in mind that the value of β it gives is expressed to a precision of only one digit. The table additionally gives β values for two other percentages of ‘detectors’ while the graph gives answers for only one β value but with that level of confidence it can be used to discover what is the maximum probability of correct responses after observing any particular number of correct responses. Alternatively, it can reveal the largest number of correct responses that would be consistent with the probability of correct responses not exceeding some desired value.
Practical applications
Suppose a quality assurance manager wants reassurance that a process change has not altered the product [...] the number of ‘detectors’ must be below 10% of consumers. For P(D) = 10, p(C) = (10 + 50) / 150 = 0.4. To bring p(C) down to 0.4 with β = 0.05 can be seen in Fig. 1 to require 150 triangles judged correctly no more often than the chance level. With 150 triangles, the probability that the outcome will be as reassuring as is desired is somewhat less than 0.5, since the observed frequency of success is as likely to be higher as it is to be lower than the chance expectation. Therefore, it may be undesirable to specify in advance what reassurance is required unless there are particular grounds for setting some formal criterion. The alternative approach is to conduct as large a test as possible and determine the degree of reassurance given by the results.
Suppose that the test is, in fact, conducted using 150 triangles. If the number of correct choices turns out to be 56 (that is, 6 more than the chance expectation), the manager can be reassured with a confidence level of 0.05 that the probability of correct choices does not exceed 0.442 by reference to the curved line corresponding to 6, in a horizontal position corresponding to 150 triangles. A probability of being correct of 0.442 corresponds to a percentage of ‘detectors’ of 150 × 0.442 - 50 = 16%, so the result gives reassurance at the usual level of confidence that the percentage is no higher than that.
Amount of data needed
The graphs provided here allow for analysis of a maximum of 150 triangles. That is a larger number than is customary in sensory analysis but is still insufficient to give firm reassurance about lack of difference in most cases. Similar graphs can be prepared for less stringent (larger) beta levels, which generate narrower bounds at the cost of offering less confidence that the true value of the parameter being estimated actually lies between them. But if a company really is concerned to avoid a flavour defect being noticed by consumers, the degree of confidence demanded of the answers should not be less than that which would be acceptable for a decision of the opposite kind - a decision that a difference is indeed noticeable. Conventionally, an alpha of 0.05 or smaller is expected before a difference is considered significant, so it is reasonable to demand that beta should be no larger than that if reassurance about lack of detectable difference is sought.
It was noted earlier that in order to get reassurance about similarity we require much more data than for the purpose of demonstrating a sensory difference. The disparity is so great that most writers on the topic despair of assessing similarity with the degree of confidence expected from tests of significance. Schlich (1993), in his fifth example, envisages an assessment being conducted to reassure a marketing department
that a new ingredient will not cause a difference perceptible to consumers. He takes it as axiomatic that not more than 100 triangles can be used and consequently settles for a probability of ‘detection’ (P(D) in my terminology and p_d in his) of 0.25, at which his analysis offers a selection of tradeoffs between the probabilities of Type-1 and Type-2 errors, one pair of possibilities being 0.0434 and 0.0443 respectively, with the other options more unbalanced between the two types of error. These error probabilities are not particularly exacting and are close to the alpha of 0.05 used here. However, when using 100 triangles they are achieved only at the cost of agreeing that a 25% ‘detection rate’ in the population is acceptable, whereas one can well imagine that a probability of detection of 25% would be completely unacceptable for some types of flavour defect in a food product. The problem is that sensory analysts tend to allow their constraints to be set by what is considered to be an economically practicable amount of sensory testing. Instead, they should determine the requirements of the problem and design whatever programme of testing is logically required to answer the question with an acceptable degree of confidence.
Computer program availability
A stand-alone computer program to generate data similar to that used in Figs 1 and 2 runs on a PC and is available on request as a UUENCODED executable file by sending electronic mail to [email protected], or as an executable file by sending a PC-formatted 3.5 or 5.25 in. disk to the author. The user interface has not yet been optimised so the user needs some computer experience. A version with an improved user interface is under development.
REFERENCES
Clopper, C. J. & Pearson, E. S. (1934). The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika, 26, 404-13.
Ennis, D. M. (1990). Relative power of difference testing methods in sensory evaluation. Food Technol., 44, 114-7.
Ennis, D. M. (1993). The power of sensory discrimination methods. J. Sensory Stud., 8, 353-70.
Frijters, J. E. R. (1979). Variations of the triangular method and the relationship of its unidimensional probabilistic models to three-alternative forced-choice signal detection theory models. Br. J. Mathematical Stat. Psychology, 32, 229-41.
Geelhoed, E. N., MacRae, A. W. & Ennis, D. M. (1994). Preference gives more consistent judgments than oddity only if the task can be modelled as forced choice. Percept. Psychophys., 55, 473-7.
ISO (1983). Sensory analysis - Methodology - Triangular Test. ISO 4120-1983. Geneva: International Organization for Standardization. (Under revision)
MacRae, A. W. & Geelhoed, E. N. (1992). Preference can be more powerful than detection of oddity as a test of discriminability. Percept. Psychophys., 51, 179-81.
Schlich, P. (1993). Risk tables for discrimination tests. Food Qual. Preference, 4, 141-51.
Smith, G. L. (1981). Statistical properties of simple sensory difference tests: Confidence limits and significance tests. J. Sci. Food Agriculture, 32, 513-20.
Ura, S. (1960). Pair, triangle and duo-trio tests. Reports of Statistical Application Research, Union of Japanese Scientists and Engineers, 7, 107-19.