Base size in product testing: A psychophysical viewpoint and analysis


PII: S0950-3293(97)00003-7

Food Quality and Preference Vol. 8, No. 4, pp. 247-255, 1997. © 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain. 0950-3293/97 $17.00+0.00

BASE SIZE IN PRODUCT TESTING: A PSYCHOPHYSICAL VIEWPOINT AND ANALYSIS

Howard R. Moskowitz
Moskowitz Jacobs Inc., White Plains, New York, USA

(Accepted 1 January 1996)

ABSTRACT

In the applied world of product testing the appropriate number of panelists (base size) involves technical and business considerations. Base sizes range from very low (around six; used in expert panelist profiling) to high (hundreds; used in product tests by marketing researchers). Often base sizes are dictated by the requirement that the project identify statistical differences between or among samples. The probabilistic analysis of differences (significance vs. insignificance) derives from statistical theory, with base size used as a method to influence the sampling error (variability). This paper looks at base sizes another way, from the viewpoint of psychophysical scaling. The issue then can be re-stated as ‘what is the necessary base size at which the average rating stabilizes?’ Empirical data suggest that base sizes of 40-50 panelists generate stable averages, and that beyond 50 panelists the average is not particularly affected by the base size. These results hold for actual data for a variety of products, and for different types of attributes, specifically sensory (amount of a characteristic) and hedonic (liking of a characteristic). © 1997 Elsevier Science Ltd

INTRODUCTION

During the past thirty years ‘product testing’ or the evaluation of physical stimuli has become a topic of increasing importance in both the basic and applied research worlds. Food scientists, interested in the properties of food products, have increasingly utilized ‘sensory analysis’ jointly with proximate physical analysis. Psychophysicists, interested in the relation between stimulus characteristics and sensory responses, use panelists to rate the products, and then seek quantitative relations between the physical stimuli and the subjective ratings. Commercially oriented market researchers and sensory analysts assess products using panelists first, in order to guide development and second, in order to estimate the potential viability of products in the marketplace.

No matter what objectives motivate the product test, a recurring debate concerns the appropriate number of panelists to use. This is the ‘base size question’. Base size becomes important because it influences the stability of the average, and cancels out random variability. Base size is especially important for subjective measurement, because of the extreme variability in many types of judgments (e.g. ratings of liking). The key question is ‘how many panelists are needed to achieve the necessary precision and stability of the average?’ Does the test need as few as 6-10 panelists as averred by those promoting expert panels (Stone and Sidel, 1985)? Of course the small base size of 6-10 panelists in expert panel research is often compensated for by numerous replicates. The average rating for a stimulus from expert panelists is precise because the noise or extraneous variability has been canceled out due to the panelists’ training. Or does the test require 100 or more panelists, as market research often dictates? The market researcher is interested both in the representativeness of the consumer panel, and in the stability of the average. The precision of the average comes from the large base size which cancels out the noise.

Some of the differences in the perceived ‘appropriate base size’ can be traced to the intellectual origins of the researchers who cope with the issues. Applied product testing derives from two sources, each with its own viewpoint and prescriptions. One source is the ‘expert’, deriving from the tradition of the perfumer, master brewer, winemaker, etc. In this tradition the expert panelist is assumed to have had extensive training, and to be able to perform like an instrument. Given this intellectual history it is no wonder that those researchers who use expert panelists aver that only a relatively small number of such panelists need to participate (Hubbard, 1990).

Psychophysicists searching for the relation between physical stimulus level and sensory response also use relatively few, unpracticed consumer panelists. Most psychophysical experiments in the scientific literature use base sizes of 10-20 panelists. This small base suffices to reveal the relation between a systematically varied stimulus and the rating of perceived intensity (Stevens, 1975). Even with base sizes of 10, the parameters of the psychophysical power law are reproducible. (The power


law describes the relation between magnitude estimates of intensity and well-defined physical measures of intensity). Psychophysical studies and sensory description studies thus use just enough panelists to smooth out the random variation from person to person, allowing patterns in the data to reveal themselves. Individuals vary, but the underlying assumption is that each of the individuals responds similarly when perceiving the sensory characteristics of the product. The variability becomes a nuisance factor, to be averaged out.

Psychophysical studies of the hedonics of taste and smell occasionally use slightly more panelists (e.g. 20-50), but rarely far more (Conner et al., 1988; Ekman and Akesson, 1964; Pangborn, 1970, 1981; Stevens et al., 1989). Substantial individual differences across people exist in likes and dislikes. The larger base size helps to ensure that the researcher has properly sampled the range of different patterns of likes and dislikes.

At the opposite end of the spectrum lies the market researcher who traces his intellectual heritage to sociology in general, and public opinion surveys in particular. Market researchers recognize that consumers differ dramatically from each other, and thus claim, as a sociologist might, that valid data about the population can only emerge when the average is computed from the response of hundreds of panelists. The smallest base size is often quoted informally as 100, although conventionally many of these researchers use larger base sizes in their studies in order to take into account alpha and beta risks, and the probabilities of making an incorrect decision. In product evaluations, specifically claims tests for product performance, the base size is often considerably larger. The recommended base size for claims substantiation on television networks is 300 panelists or even more (Smithies and Buchanan, 1990).
This large base size is predicated on the notion that a valid average is one which derives from appropriately sampling the population. The foregoing discussion contrasts two rather different points of view. On the one hand are scientific researchers working with products or ‘model’ systems, who hold that nature will reveal its patterns even with a limited number of individuals. These researchers believe that the ‘signal’ or pattern emerging from nature is very strong for the phenomenon under study. If there is any noise, then these researchers reduce that noise by carefully controlling the testing situation in order to eliminate extraneous sources of variability, or sample a larger group of panelists and average out the noise. This first group of researchers focus far more on the stimulus, or on the relation between the properties of the stimulus and the subjective response to those properties. At the other end of the spectrum are the social scientists, who believe that the best way to remove individual differences is to average them out by using many panelists (Kramer and Thiemann, 1987). This second group of researchers focus far more on counting the number of individuals in the population who exhibit a certain type of behavior.

Scope of this paper and analytic considerations

This paper deals with one type of test, the sequential monadic product test, which is used both by academic researchers interested in scientific issues, and applied researchers interested in business issues. Sequential monadic tests present the consumer with a set of products in a randomized order. The panelist evaluates each product, one at a time. For each product the panelist rates a variety of sensory, liking and image characteristics on a scale. The panelist may rate all of the products in the set, or only some of the products in an incomplete design.

The data from sequential monadic tests can help to determine the base size at which the average stabilizes, and thus the cost of the research (at least insofar as the cost of the panel is concerned). This stability analysis can be done by attribute (e.g. do sensory ratings stabilize at a lower base size than liking attributes?), as well as by segment (do homogeneous segments defined by demographic or psychological responsiveness to products require lower base sizes than do heterogeneous populations considered as a whole?).

The primary objective of the research is to determine how high the product scores on a defined scale, and at what point the average stabilizes, allowing the creation of mathematical models relating subjective responses to physical properties. The author’s worldview is thus ‘psychophysical’ rather than ‘statistical’, and the domain is product testing rather than psychophysical science. In a sense this paper deals with ‘processing the average’, in contrast to other papers which deal with ‘processing the variability’ (S. S. Stevens, 1969, pers. comm.).

Other analyses can be done on the data in order to assess the effect of base size (e.g. analysis of variance to analyze variability due to product differences vs. variability due to random error, etc.). These alternative and more traditional analyses derive from the statistical world-view, which deals with sampling error, rather than from psychophysics, which deals with the stability of average ratings and the stability of relations between variables. Studies which process the variability typically look at the distributions of subjective ratings, variability in the system, and the odds of making correct vs. incorrect decisions. Studies which process the average typically look at patterns in the data, trying to discover lawful relations in nature.
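The stability analysis outlined in this section can be illustrated with a short sketch. The Python function below computes the average rating at successive base sizes (10, 20, 30, ...) by averaging the first n panelists in the data. It is only a minimal illustration: the function name `running_averages`, the step of 10, and the assumption that panelists are accumulated in the order in which they appear in the data are choices made here rather than details given in the paper, and the ratings are simulated rather than taken from either study.

```python
import random
from statistics import mean


def running_averages(ratings, step=10):
    """Average of the first n ratings for n = step, 2*step, ...

    `ratings` holds one 0-100 rating per panelist for a single product
    and attribute, in the (assumed) order the panelists appear in the data.
    """
    base_sizes = range(step, len(ratings) + 1, step)
    return {n: mean(ratings[:n]) for n in base_sizes}


if __name__ == "__main__":
    # Simulated ratings for illustration only (not data from the paper).
    rng = random.Random(1997)
    simulated = [min(100, max(0, round(rng.gauss(55, 20)))) for _ in range(90)]
    for n, avg in running_averages(simulated).items():
        print(f"base size {n:3d}: average {avg:5.1f}")
```

Reading down the printed column shows how far the average still moves as each block of ten panelists is added, which is the question the tables in this paper address.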

METHOD

This paper presents data from two product studies (salami, fish sticks). The two product studies were originally commissioned by manufacturers to assess consumer responses to different product formulations and to competitive products. The data were used to guide product reformulation, in light of consumer sensory preferences. These studies provide sufficient data to understand how the average rating stabilizes with increasing base size of panelists.

Experiment 1: ‘smoked flavor’ salamis

Method

In this study 95 panelists participated (40 males, 55 females, all category users of smoked meat products who agreed to participate for a 3.5 h evaluation session). Each panelist rated all nine ‘smoke flavor’ salamis that had been systematically varied by experimental design on two ingredients (spice level, grind level). All nine products were assigned three digit identification numbers. Panelists used an anchored 0-100 point rating scale to assess the various dimensions of liking (overall, appearance, aroma, taste/flavor, texture) and sensory attributes (darkness, flavor intensity, coarseness of grind). The study was run in three separate markets in the US (New York, Chicago, San Francisco), with approximately equal numbers of panelists (n = 31/32) per market.

In the analysis the data for each of the nine products were broken out by total panel, gender, market, and by sensory preference segments (Moskowitz et al., 1985; Moskowitz, 1994). Sensory preference segmentation divides consumers into homogeneous groupings, based upon the pattern relating sensory attribute level and overall liking. An analysis of the stability of the average for total panel versus for sensory segments can reveal whether a truly homogeneous group of consumers (in terms of sensory preferences) generates an average rating for a product which stabilizes with lower base size. Two sensory segments emerged for the smoked flavor salamis: consumers who preferred low spice/flavor and fine texture (n = 35), and consumers who preferred a high spice and coarse texture (n = 60).

Results

The analysis presented here deals with the two most different salamis on a sensory basis, which are here called A and B, respectively. Table 1 shows that the averages for liking and sensory attributes stabilize with increasing base size for the total panel. Table 2 shows a parallel analysis by subgroup. The first two subgroups are male and female, the second two subgroups are the sensory preference segments.

First, Table 1 and Table 2 suggest that as the base size increases the attribute ratings oscillate around the average. The average moves around, albeit in a slowly drifting fashion. However, the data never stabilize to the point where the average is unmovable.

Second, the slow oscillating behavior of the average is not limited to hedonic attributes (e.g. overall liking, liking of flavor, of texture, etc.). Sensory attributes also show oscillation around the average with increasing base size. Thus a base size of 40-50 individuals may be needed to stabilize averages for sensory attributes just as this same base size is needed for stable averages of hedonic attributes. This base size of 40-50 holds when interest is focused on the averages of ratings for a product.

Third, subgroups are neither more nor less robust (in terms of base size) than is the entire panel. Just because a subgroup of consumers is selected on the basis of either a demographic breakout (gender) or on the basis of a sensory preference profile (sensory segment) does not mean that this group generates a tighter average which stabilizes more rapidly than does the average from the total panel. Homogeneous subgroups also show variability of the average, and a drift that needs to be counteracted by a large base size. [These results appear in Table 3, which compares the data for base across total panel and key subgroups, for both products]. Ideally the base size should comprise 50 panelists in order to yield a robust estimate of the average. There is no clear need for the hundred or more panelists required by market researchers, at least in order to obtain stable averages on attributes.

TABLE 1. Average ratings for two salami products (A, B) on liking and sensory attributes as a function of base size: total panel

Liking attributes
Base   Overall A   Overall B   Flavor A   Flavor B   Texture A   Texture B
10     56          74          53         75         52          70
20     50          62          50         62         56          58
30     47          55          48         55         53          52
40     47          54          50         52         50          50
50     49          58          50         56         50          55
60     50          57          52         57         52          55
70     51          57          53         58         54          56
80     52          57          53         58         55          57
90     52          57          52         59         55          56

Sensory attributes
Base   Smoke A   Smoke B   Coarse A   Coarse B
10     50        56        48         36
20     46        49        45         39
30     50        47        49         35
40     50        48        44         34
50     48        50        41         34
60     47        51        41         36
70     50        53        40         39
80     50        52        38         39
90     49        54        38         39
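One way to make the notion of a stabilized average operational is to ask from which base size onward the running average stays within a fixed distance of its final value. The sketch below applies that idea to the overall liking averages for the total panel listed in Table 1. The function name `stable_from` and the tolerance of 3 scale points are assumptions introduced here for illustration; the paper itself does not state a numerical criterion.

```python
def stable_from(bases, averages, tolerance=3):
    """Smallest base size from which every later average stays within
    `tolerance` scale points of the final (largest-base) average."""
    final = averages[-1]
    stable = bases[-1]
    # Walk backwards from the largest base size until the first violation.
    for base, avg in zip(reversed(bases), reversed(averages)):
        if abs(avg - final) > tolerance:
            break
        stable = base
    return stable


bases = [10, 20, 30, 40, 50, 60, 70, 80, 90]
# Overall liking averages for the total panel, as listed in Table 1.
overall_a = [56, 50, 47, 47, 49, 50, 51, 52, 52]
overall_b = [74, 62, 55, 54, 58, 57, 57, 57, 57]

print("Product A stable from base size", stable_from(bases, overall_a))
print("Product B stable from base size", stable_from(bases, overall_b))
```

With a tolerance of 3 points this criterion gives a base size of 50 for product A and 30 for product B; tightening the tolerance to 2 points gives 60 and 50. The exact figure depends on the tolerance chosen, so the criterion should be read as one possible definition rather than the one underlying the paper's conclusions.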

Experiment 2: fish fillets

Experiment 2 represents the more typical product test performed by market researchers, comprising a few products, but many consumer panelists. The study comprised 150 consumers, 73 males and 77 females, distributed approximately equally in four markets around the US (New York, Chicago, Los Angeles, Miami). In a previous study with the same 150 panelists each panelist had been classified as belonging to one of two sensory segments (low impact vs. high impact). This second experiment was a follow-up study to evaluate two new formulations. In Experiment 2, panelists evaluated only two products, here labeled X and Y, respectively. (In the actual study the two stimuli were assigned three digit identification numbers.) The two fish fillets differed substantially in terms of perceived spice level (very low vs. high levels of a flavoring comprising salt, chili pepper, and regular black pepper).

TABLE 2. Average liking and sensory ratings for two salami products (A, B) as a function of base size: results for four subgroups: males and females, sensory segments (low, high impact)

Liking attributes
Subgroup      Base   Overall A   Overall B   Flavor A   Flavor B   Texture A   Texture B
Male          10     51          62          55         63         46          58
Male          20     54          58          53         58         51          57
Male          30     53          59          53         60         53          58
Male          40     53          65          52         65         52          59
Female        10     52          62          55         65         59          62
Female        20     52          56          53         59         54          57
Female        30     50          53          51         55         55          55
Female        40     49          52          50         53         54          54
Low impact    10     38          44          44         45         49          44
Low impact    20     46          45          48         42         51          48
Low impact    30     49          45          49         46         50          49
High impact   10     55          69          53         68         65          67
High impact   20     55          68          54         69         61          69
High impact   30     60          71          59         72         66          67

Sensory attributes
Subgroup   Base   Smoke A   Smoke B   Coarse A   Coarse B
Male       10     61        54        42         43
Male       20     57        53        47         40
Male       30     53        55        41         39
Male       40     50        56        39         38
Female     10     50        59        31         34
Female     20     48        51        28         35
Female     30     50        54        34         38
Female     40     50        53        37         39

The panelists evaluated each of the two products in a sequential order, randomized so that half of the panelists evaluated X first and the other half evaluated Y first. Panelists rated each product on a battery of liking and sensory attributes.

The results for fish fillets parallel those of smoked salami. Table 4 shows the average ratings for the total panel. These data again suggest that base sizes should be approximately 50 in order to achieve stable average ratings (at least for the total panel).

TABLE 3. Average liking ratings for two salami products (A, B) as a function of base size: direct comparison of total panel and four key subgroups

Base   Total A   Male A   Female A   Low impact A   High impact A
10     56        51       52         55             38
20     50        54       52         55             46
30     47        53       50         60             49
40     47        53       49         NA             NA

Base   Total B   Male B   Female B   Low impact B   High impact B
10     74        62       62         69             44
20     62        58       56         68             45
30     55        59       53         71             45
40     54        65       52         NA             NA

TABLE 4. Average ratings for two fish fillet products (X, Y) on liking and sensory attributes as a function of base size: total panel

Liking attributes
Base   Overall X   Overall Y   Flavor X   Flavor Y   Texture X   Texture Y
10     62          60          67         56         66          56
20     61          55          61         50         55          49
30     61          57          58         58         53          54
40     62          59          59         60         55          56
50     61          60          59         61         55          57
60     61          60          59         61         57          57
70     61          62          59         62         58          59
80     61          62          59         62         57          59
90     62          63          59         63         58          59
100    61          63          59         63         58          59
110    60          61          58         62         56          59
120    61          61          59         62         57          58
130    61          61          60         63         57          59
140    60          63          60         64         57          59
150    61          62          60         63         57          59

Sensory attributes
Base   Dark X   Dark Y   Flavor X   Flavor Y   Greasy X   Greasy Y
10     61       41       55         37         51         30
20     58       37       52         43         50         36
30     59       40       55         49         53         37
40     59       41       57         49         52         37
50     59       42       57         49         53         41
60     59       42       58         49         52         41
70     59       42       59         50         50         39
80     60       41       58         49         49         39
90     59       40       57         49         49         40
100    58       57       50         49         40         40
110    57       39       56         49         49         39
120    57       38       58         50         48         38
130    57       39       59         50         47         38
140    57       39       59         50         47         39
150    57       39       59         49         47

TABLE 5. Average liking and sensory ratings for two fish fillet products (X, Y) as a function of base size: four subgroups (males and females, sensory segments (low, high impact))

Liking ratings
Subgroup      Base   Overall X   Overall Y   Flavor X   Flavor Y   Texture X   Texture Y
Male          10     69          68          65         66         64          70
Male          20     65          62          59         62         58          61
Male          30     62          61          59         61         57          58
Male          40     62          60          60         61         59          59
Male          50     64          61          62         61         60          62
Male          60     64          63          62         63         61          62
Male          70     63          62          61         63         59          60
Female        10     64          64          64         64         64          64
Female        20     63          63          63         63         63          63
Female        30     59          59          59         59         59          59
Female        40     61          61          61         61         61          61
Female        50     61          61          61         61         61          61
Female        60     62          62          62         62         62          62
Female        70     61          61          61         61         61          61
Low impact    10     61          61          61         61         61          61
Low impact    20     61          61          61         61         61          61
Low impact    30     61          61          61         61         61          61
Low impact    40     57          57          57         57         57          57
Low impact    50     58          58          58         58         58          58
Low impact    60     57          57          57         57         57          57
Low impact    70     59          59          59         59         59          59
High impact   10     65          68          64         68         66          62
High impact   20     65          70          64         71         63          63
High impact   30     62          67          61         69         60          60
High impact   40     65          65          63         66         61          61
High impact   50     66          66          65         66         60          62
High impact   60     64          62          62         63         58          60
High impact   70     63          64          62         65         58          62

Sensory ratings
Subgroup      Base   Dark X   Dark Y   Flavor X   Flavor Y   Greasy X   Greasy Y
Male          10     62       37       53         43         28         46
Male          20     59       37       55         52         36         40
Male          30     55       35       57         53         41         41
Male          40     54       35       57         50         39         39
Male          50     53       39       58         49         39         39
Male          60     54       38       59         50         39         38
Male          70     54       38       58         49         40         39
Female        10     64       64       64         64         64         64
Female        20     63       63       63         63         63         63
Female        30     59       59       59         59         59         59
Female        40     61       61       61         61         61         61
Female        50     61       61       61         61         61         61
Female        60     62       62       62         62         62         62
Female        70     61       61       61         61         61         61
Low impact    10     61       61       61         61         61         61
Low impact    20     61       61       61         61         61         61
Low impact    30     61       61       61         61         61         61
Low impact    40     57       57       57         57         57         57
Low impact    50     58       58       58         58         58         58
Low impact    60     57       57       57         57         57         57
Low impact    70     59       59       59         59         59         59
High impact   10     64       48       60         53         46         42
High impact   20     64       45       63         48         46         44
High impact   30     60       43       60         47         49         44
High impact   40     61       42       62         49         46         42
High impact   50     59       41       62         50         48         41
High impact   60     58       39       61         49         48         40
High impact   70     59       41       61         50         48         39

Average ratings by subgroup also support the need for base sizes around 50 (see Table 5). Finally, one can compare the total panel and the key subgroups for stability of average as a function of base size (see Table 6). Again the results parallel those found for salami.

TABLE 6. Average liking ratings for two fish fillet products (X, Y) as a function of base size: direct comparison of total panel and four key subgroups

Base   Total X   Male X   Female X
10     62        69       64
20     61        65       63
30     61        62       59
40     62        62       61
50     61        64       61
60     61        64       62
70     61        63       61

Base   Total Y   Male Y   Female Y
10     60        68       64
20     52        62       63
30     57        61       59
40     59        60       61
50     60        61       61
60     60        63       62
70     62        62       61

DISCUSSION AND IMPLICATIONS FOR PRODUCT RESEARCH

This paper calls into question the need for the large (and expensive) base sizes used by market researchers, as well as the very small but convenient base sizes used by academic researchers. These data suggest that there may be an optimum or at least an efficient base size of panelists beyond which there is little additional information to be obtained in terms of the average rating. This base size is approximately 50, although that number can vary between 40 and 60. Base sizes of 100 or more are not necessary to generate stabilized averages when the data are obtained by averages of an attribute scale.

On a practical note, the important thing to keep in mind is that the base size cannot be as low as 10 or 20 (because the average is not yet stable), nor need it be 100 or more (because the average has already stabilized so that the additional 50 panelists beyond the first set of 50 will not influence the average very much). Furthermore, the base size around 50 recurs, whether the data come from the total panel, or from a homogeneous subgroup (either defined exogenously by gender or defined endogenously by the pattern of sensory preferences).

Market researchers, especially practitioners in the industry, opt to use large base sizes because they feel that the data will be more stable. The quest for stability is admirable. The base size of 100 is far too many if the researcher uses scale data, but may be appropriate if the researcher uses binary data (e.g. from preference ratings). With an anchored 0-100 point scale, stable averages for commercial product tests can be obtained with base sizes around 50. On the other hand, the greater the base size the greater the odds of significance for a given average because the standard error of the mean is reduced with greater base size. Thus, for stable averages a base size of 50 may suffice whereas, for significance tests using the sampling distribution of the average, the larger base sizes may be desirable.

The foregoing comparison of stable averages versus variability of the average contrasts two research approaches and, more profoundly, two world views of data. The first approach looks at the average: what is the average and what does that imply about the underlying process or system, or the relation between the average score and some external product treatment? This approach traces back to the psychophysicist and physical scientist, interested in relations among variables. The second approach looks at the variability of the average: what are the odds that the average derived from a study differed from some pre-determined standard? Or, to express this in the more conventional manner, did the treatment have an effect on changing the average, and what are the odds of that effect really existing? This second approach deals with the probability of change, rather than the actual substantive meaning of the average value itself. This second approach traces back to the sociologist’s point of view: do the two populations differ from each other? (Moskowitz, 1994).
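The difference between a stable average and a statistically significant one can be made concrete with the standard error of the mean, which equals the standard deviation divided by the square root of the base size. The sketch below is illustrative only: the standard deviation of 20 scale points is an assumed value, not a figure reported in this paper, and the function name `standard_error` is introduced here for convenience.

```python
import math


def standard_error(sd, n):
    """Standard error of the mean for standard deviation `sd` and base size `n`."""
    return sd / math.sqrt(n)


assumed_sd = 20.0  # hypothetical spread of ratings on a 0-100 scale
for n in (10, 50, 100, 300):
    print(f"base size {n:3d}: standard error {standard_error(assumed_sd, n):.2f}")
```

Doubling the base size from 50 to 100 shrinks the standard error by a factor of about 0.71 (one over the square root of 2), which can matter for a significance test even when the average itself has already stabilized.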

Practical application of these results

Academic researchers working either in consumer research or food science/sensory analysis use small samples. Sometimes these researchers use base sizes as low as 20 or 30. The typical argument for small base sizes is that with base sizes above 30 the researcher can use the z (viz. the normal or Gaussian) distribution rather than the t distribution. With a base size of 30 individuals conventional statistical wisdom assumes that the data can be analyzed by statistics appropriate for normally distributed data (since at a base size of 30 the distribution is Gaussian). This paper suggests that although the average is normally distributed at a base size of 30, the average itself becomes robust at a base size of 50. Base sizes of 50 might be better because it is more likely that the average will remain unchanged with increasing numbers of panelists.

TABLE 6 (contd)

Base   Low impact X   High impact X   Low impact Y   High impact Y
10     61             65              61             68
20     61             65              61             70
30     61             62              61             67
40     57             65              57             65
50     58             66              58             66
60     57             64              57             62
70     59             63              59             64

Trading off base size and early stage exploration

Quite often in the research process the investigator may wish to identify general patterns in the data (e.g. establish that a phenomenon exists, or that a treatment worked). This initial research is exploratory. The rationale for small bases in exploratory research is that at this early stage the precise average is not of interest, but rather just simply an indication as to whether or not the treatment worked. In early stage work the research looks for general patterns, without establishing the numerical parameters of these patterns. The unexpressed but equally important assumption in this early stage exploratory research is that later on the larger base size will more adequately establish the pattern of key relations in the data.

The results in this paper belie the unexpressed confidence that the pattern obtained with a small base size will necessarily reappear with a larger base size. The data suggest that although the best guess of the mean rating for studies with large base sizes is the average obtained in studies with a smaller, more exploratory base size, the promising pattern discovered with small base sizes may not, in fact, reappear. Furthermore, models fit to average data based upon small base sizes may be governed by parameters that would not fit the data for larger base sizes. Simply stated, the patterns discovered with small base sizes may disappear entirely, or change qualitatively when the study is expanded. Happily, however, with base sizes of 40-60 one can be reasonably certain of obtaining reliable averages.

EDITOR'S NOTE

Food Quality and Preference will occasionally publish papers which stimulate discussion along with commentaries. Readers are encouraged to send in further commentaries for publication in future issues.

REFERENCES

Conner, M. T., Haddon, A. V., Pickering, E. S. and Booth, D. A. (1988) The sweet tooth demonstrated: individual differences in preferences for both sweet foods and foods highly sweetened. Journal of Applied Psychology 73, 275-280.

Ekman, G. and Akesson, C. A. (1964) Saltiness, sweetness and preference: a study of quantitative relations in individual subjects. Report 177, Psychological Laboratories, University of Stockholm.

Hubbard, M. R. (1990) Statistical quality control for the food industry. Van Nostrand Reinhold, New York.

Kramer, H. C. and Thiemann, S. (1987) How many subjects: Statistical power analysis in research. Sage, Newbury Park.

Moskowitz, H. R. (1994) Food concepts and products: Just in time development. Food and Nutrition Press, Trumbull.

Moskowitz, H. R., Jacobs, B. E. and Lazar, N. (1985) Product response segmentation and the analysis of individual differences in liking. Journal of Food Quality 8, 168-191.

Pangborn, R. M. (1970) Individual variations in affective responses to taste stimuli. Psychonomic Science 21, 125-128.

Pangborn, R. M. (1981) Individuality in responses to sensory stimuli. In Criteria of food acceptance: how man chooses what he eats, ed. J. Solms and R. L. Hall, pp. 177-219. Forster Verlag, Zurich.

Smithies, R. A. and Buchanan, B. S. (1990) Substantiating a taste claim. Transcript Proceedings, NAD Workshop. Advertising Research Foundation, Council of Better Business Bureaus, New York.

Stevens, D. A., Dolley, D. A. and Laird, J. D. (1989) Explaining individual differences in flavor perception and food acceptance. In Food acceptability, ed. D. M. H. Thomson, pp. 173-180. Elsevier, London.

Stevens, S. S. (1975) Psychophysics: An introduction to its perceptual, neural and social prospects. John Wiley, New York.

Stone, H. and Sidel, J. L. (1985) Sensory evaluation practices. John Wiley, New York.