The effects of alternative methods of collecting similarity data for Multidimensional Scaling


ELSEVIER

Intern. J. of Research in Marketing 12 (1995) 363-371

Tammo H.A. Bijmolt a,*, Michel Wedel b
a Department of Business Administration, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, Netherlands
b University of Groningen, Netherlands

Received March 1994; accepted February 1995

Abstract

In this paper, we study differences between the four data collection methods sorting, paired comparisons, conditional rankings, and triadic combinations. Effects of the judgment task on fatigue, boredom, and learning, on similarity data, and on perceptual maps derived with Multidimensional Scaling (MDS) are investigated. The data are analyzed with MAXSCAL. For each data collection method, subjects responding to the task become fatigued and bored, whereas increases in stimulus knowledge and task insight do not occur. The amount of fatigue and boredom differs between the data collection methods. Moreover, the type of data collection method used affects both the similarity data and the MDS solution.

Keywords: Similarity data; Data collection methods; Multidimensional scaling

1. Introduction

In the application of Multidimensional Scaling (MDS) to marketing problems, several decisions have to be made concerning the selection of the subjects and brands, the data collection method, and the MDS model. These decisions may have a considerable impact on subjects' fatigue, boredom, and learning, on the similarity judgments obtained, and on the configuration of brands provided by the MDS models. Thereby, such decisions critically influence the quality of the data input for the development of marketing strategy, MDS being one of the important tools in marketing research for product positioning (Cooper, 1983). Yet, the impact of many of these decisions is unknown, and recently several authors have indicated the need for studies examining the processes underlying similarity judgments and investigating when and how different data collection methods can be applied (Johnson et al., 1990, 1992; Malhotra, 1987). The aim of this study is to assess the effects of alternative collection methods on the occurrence of fatigue, boredom, and learning of the subjects, on the similarity judgments obtained from them, and on the MDS solutions.

* Corresponding author. Tel. +31-13-663423, fax +31-13-662875, e-mail: [email protected]. This research was sponsored by the Economic Research Foundation, which is part of the Netherlands Organization for Scientific Research (NWO). Part of this study was carried out while the first author was employed at the Department of Marketing and Market Research, University of Groningen. The authors thank Yoshio Takane for his help with the MAXSCAL analyses.

0167-8116/95/$09.50 © 1995 Elsevier Science B.V. All rights reserved
SSDI 0167-8116(95)00012-7


We investigate differences between the four most commonly used data collection methods, namely sorting, paired comparisons, conditional rankings, and triadic combinations (Coxon, 1982). In the sorting task, each subject has to sort the stimuli into a number of groups, according to similarity. The number of groups is determined by the subject during the judgment task. In the paired comparisons task, the stimuli are presented to the subject in all possible pairs. The subject has to rate each pair on an ordinal nine-point scale where the extreme values of the scale represent maximum dissimilarity and maximum similarity. Subjects who perform the conditional ranking task have to order stimuli on the basis of their similarity with an anchor stimulus. Each of the stimuli is in turn presented as the anchor. In the triadic combinations task, subjects have to indicate, for combinations of three stimuli, which two stimuli form the most similar pair and which two stimuli form the least similar pair.¹

Differences in the effects of the data collection methods may be caused by three factors. First, an important aspect of the data collection method is whether or not all the stimuli are presented to the subject at once. In contrast with paired comparisons and triadic combinations, sorting and conditional ranking give a subject the opportunity to inspect the entire set of stimuli at each evaluation. Second, the difficulty of similarity evaluation differs between the data collection methods. For example, the simultaneous comparison of numerous stimuli to a single reference stimulus in the method of conditional ranking is more complicated than comparing only one pair of stimuli in the method of paired comparisons. Finally, the total number of similarity evaluations to be made varies considerably across methods. The extremes are the sorting task on the one hand, which is essentially a single similarity evaluation, and the comparison of all possible triadic combinations on the other hand.

¹ We reduce the number of triadic combinations to be made by each subject from 220 to 80, in order to reduce the variation in the length of the judgment task.
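To make the difference in workload concrete, the following sketch counts the similarity evaluations each method requires for a given number of stimuli. The function name and the way conditional rankings are counted (one ranking per anchor stimulus) are illustrative assumptions rather than definitions from the paper. For 12 stimuli the triad count reproduces the 220 combinations mentioned in footnote 1.

```python
from math import comb


def evaluation_counts(n_stimuli: int) -> dict[str, int]:
    """Number of separate judgments a subject makes under each method."""
    return {
        # One overall grouping of the full stimulus set.
        "sorting": 1,
        # Every unordered pair of stimuli is rated once.
        "paired_comparisons": comb(n_stimuli, 2),
        # Each stimulus serves once as the anchor for a ranking of the others.
        "conditional_rankings": n_stimuli,
        # Every unordered triple, before any reduction of the design.
        "triadic_combinations": comb(n_stimuli, 3),
    }


counts = evaluation_counts(12)
print(counts)
```

Note that the units differ between methods: a single conditional ranking involves ordering all remaining stimuli, so raw counts understate the effort per evaluation for that task.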

2. Research questions

While subjects perform a similarity judgment task, they may experience increases in fatigue, boredom, stimulus knowledge, and task insight.² Although previous studies made assumptions about the effects of fatigue, boredom, stimulus knowledge, and task insight (e.g. Johnson et al., 1990), little research has been done to assess the extent to which they actually occur for alternative data collection methods. The only study known to us that addresses these issues is that of McIntyre and Ryans (1977). Subjects reported to what extent they felt bored and to what extent they found the task difficult. The paired comparison task turned out to be significantly less boring and less difficult than the conditional ranking task. We investigate the following research question on the occurrence of fatigue, boredom, stimulus knowledge, and task insight and differences in these aspects between data collection methods.

Q1: Are there differences between the data collection methods with respect to the occurrence of (a) fatigue, (b) boredom, (c) stimulus knowledge, and (d) task insight?

Possible effects of the data collection methods on the similarity judgments pertain to the extent of completion (number of missing values), the completion time, and the quality of the data. Neidell (1972) found that the method of conditional rankings outperformed the method of triadic combinations in terms of both response rate and completion rate. The effect of the data collection method on the completion time has been investigated in a number of studies (Henry and Stumpf, 1975; Humphreys, 1982; McIntyre and Ryans, 1977). Paired comparisons have been shown to require less completion time than conditional rankings, which in turn appeared to be faster than triadic combinations and ranking of pairs. Humphreys (1982) did not find significant differences in the test-retest reliability between paired comparisons, conditional rankings, triadic combinations, and ranking of pairs. Hence, the completion time required and data quality have been determined for several methods, but the extent of completion or the occurrence of missing values for various tasks is largely unknown. In this study, we investigate whether the data collection methods differ in completion time and extent of completion.

² We define fatigue as "a subjective mental condition caused by the continuation of a mental activity, in which the ability to perform that or a related activity is diminished", boredom as "a subjective mental condition caused by the monotony of a mental activity, in which the motivation to continue that activity is diminished", stimulus knowledge as "the amount of information directly available to the subject while he/she performs the task", and task insight as "the extent to which a subject understands the activities requested in a specific task, which enables him/her to perform the task successfully".

Q2: Are there differences between the data collection methods with respect to (a) completion time and (b) the occurrence of missing values?

The following aspects of MDS solutions are possibly influenced by the data collection methods: the recovery of known distances, the fit of the model to the data, the dimensionality, the error variance, and the coordinates in the perceptual map. The recovery of true known distances has been studied by Rao and Katz (1971), Henry and Stumpf (1975), and McIntyre and Ryans (1977). In a simulation study, Rao and Katz (1971) found that ranking of pairs produced better recoveries of the original configuration than sorting, pick k of n, and conditional ranking. Henry and Stumpf (1975) and McIntyre and Ryans (1977) did not find significant differences in recovery of the true geographical distances between the methods of paired comparisons, conditional rankings, ranking of pairs, and triadic combinations. Although some differences have been found (Humphreys, 1982), configurations derived from various types of similarity data are highly similar in general (Kinnear and Taylor, 1975; Whipple, 1976). Effects on the dimensionality and the error variance of perceptual maps have, to our knowledge, not been studied. We investigate the following research question with respect to effects on the MDS solution:

Q3: Are there differences between the data collection methods with respect to (a) the recovered dimensionality, (b) the amount of error in the data, and (c) the recovered stimulus coordinates?

Table 1 provides an overview of previous research on data collection methods. In previous studies on data collection methods, only a selected number of variables were investigated. Moreover, none of the previous studies compares the data collection methods which are most commonly used in research practice: sorting, paired comparisons, conditional rankings, and triadic combinations (Coxon, 1982). The current knowledge of the extent to which data collection methods have an effect on fatigue, boredom, stimulus knowledge, and task insight, on the similarity judgments, and on the MDS solution is therefore rather fragmented.

3. Study design and method

3.1. Design of the experiment

On the basis of a pilot study among subjects, we selected 12 brands of automobiles and 12 brands of beer for the study. In the pilot study, the questionnaire was tested with respect to the phrasing of the questions, the length of the tasks, and the operationalization of the dependent measures used. For the main study itself, a random sample was drawn from the telephone register of the city of Groningen. One hundred and ninety-five subjects participated in the experiment. In order to motivate the respondents, they received an incentive of ten Dutch guilders (about 6 US$).

In the experiment, subjects evaluate similarities of 12 automobile brands by way of one of the four data collection methods. For each of the four data collection methods, subjects are randomly assigned to one of five experimental groups. The type of judgment made and the stimuli evaluated in a task preceding the main similarity judgment task are varied between the five experimental groups.³ The preceding tasks are standardized to a duration of about fifteen minutes. The design entails 20 groups in total with about 10 subjects per group.

Table 1
Previous research on data collection methods

| Study | Data collection methods a | Subject-related variables b | Similarity judgments b | MDS solution b |
|---|---|---|---|---|
| Rao and Katz (1971) | ST PI CR RP | | | distance recovery (RP > PI = CR = ST) |
| Neidell (1972) | CR TC | | missing values (CR > TC) | |
| Kinnear and Taylor (1975) | PC CR RP TC CT DY | | | coordinates |
| Henry and Stumpf (1975) | CR TC RP | | completion time (CR > RP > TC) | distance recovery (CR = TC = RP) |
| Whipple (1976) | CR TC | | | coordinates |
| McIntyre and Ryans (1977) | PC CR | boredom (PC > CR); task insight (PC > CR) | completion time (PC > CR) | distance recovery (PC = CR) |
| Humphreys (1982) | PC CR RP TC | | completion time (PC > CR > TC = RP); data quality (PC = CR = RP = TC) | fit of data (RP > PC = CR = TC); coordinates |

a ST = sorting, PI = pick k out of n, PC = paired comparisons, CR = conditional rankings, RP = ranking of pairs, TC = triadic combinations, CT = complete method of triads, and DY = dyads.
b Effects considered. A > B means A outperforms B, A = B means A and B have similar performance.
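The 4 × 5 between-subjects design of Section 3.1 can be sketched as a random assignment of 195 subjects to 20 cells. The paper does not describe the assignment mechanism, so the balanced round-robin over a shuffled subject list used here, along with the function name and the seed, is purely an illustrative assumption.

```python
import random
from collections import Counter
from itertools import product

METHODS = ["sorting", "paired_comparisons", "conditional_rankings", "triadic_combinations"]
GROUPS = [1, 2, 3, 4, 5]


def assign(subject_ids, seed=0):
    """Shuffle the subjects, then deal them round-robin over the 20 cells."""
    cells = list(product(METHODS, GROUPS))  # 4 methods x 5 groups
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    return {sid: cells[i % len(cells)] for i, sid in enumerate(shuffled)}


assignment = assign(range(195))
cell_sizes = Counter(assignment.values())
print(len(cell_sizes), min(cell_sizes.values()), max(cell_sizes.values()))
```

With 195 subjects and 20 cells, a balanced scheme like this gives cells of 9 or 10 subjects, in line with the "about 10 subjects per group" stated in the text.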

³ Subjects assigned to group 1 do not perform a preceding task. The preceding task of subjects assigned to group 2 is identical to their main task. In group 3 the preceding task is identical to the main task, but the stimuli are beer brands. Subjects in group 4 are provided a free response task on automobile brands, and subjects in group 5 a free response task on beer brands. The purpose of these groups is to manipulate the fatigue (F), boredom (B), task insight (T), and stimulus knowledge (S) levels of the subjects. In group 2 we hypothesized the occurrence of relatively high levels of F, B, T, and S, in group 3: F and T, in group 4: F and S, in group 5: F. Group 1 served as a control group. Since manipulation checks do not sufficiently confirm the hypothesized effects on the occurrence of fatigue, boredom, task insight, and stimulus knowledge, differences between the five groups are not discussed in detail.

3.2. Maximum Likelihood Multidimensional Scaling

In order to analyze the similarity data obtained in our experiment, we apply the MAXSCAL methodology (Takane, 1981, 1982; Takane and Carroll, 1981). MAXSCAL provides a unifying, maximum likelihood based framework for the analysis of the types of similarity data collected in our study, as it accommodates sorting (Takane, 1981), paired comparisons (Takane, 1981), conditional rankings (Takane and Carroll, 1981), and triadic combinations (Takane, 1982). Maximum Likelihood Multidimensional Scaling (MLMDS) offers a number of advantages over classical MDS models. These advantages are gained by the fact that MLMDS models explicitly assume that the observed dissimilarity data are error-perturbed. Contrary to the classical models, the maximum likelihood approaches estimate variance parameters in addition to the stimulus coordinates. MLMDS enables the researcher to test between alternative models. For example, MLMDS allows significance tests of the number of dimensions that represents the data best.
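The significance test on the number of dimensions mentioned above is, in its generic textbook form, a likelihood ratio test between nested solutions. The notation below is an assumption introduced for illustration and is not taken from the MAXSCAL papers: $L(\hat\theta_r)$ denotes the maximized likelihood of the $r$-dimensional solution and $p_r$ its number of free parameters.

```latex
\[
  \mathrm{LR}
  = -2\left[\ln L(\hat\theta_{r}) - \ln L(\hat\theta_{r+1})\right]
  \;\overset{\text{approx.}}{\sim}\; \chi^{2}_{\nu},
  \qquad
  \nu = p_{r+1} - p_{r}.
\]
```

A large value of LR relative to the chi-square reference distribution favors the solution with the additional dimension. The chi-square approximation is only asymptotic, which is one reason such tests are usually treated with some caution in MDS applications.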

3.3. Dependent measures

We assess the effects on the following characteristics:

Fatigue, boredom, stimulus knowledge, and task insight: Subjects are asked to indicate their feeling of fatigue, boredom, stimulus knowledge, and task insight. Making the subjective ratings should tax the respondents as little as possible to prevent the burden of the task interfering with the experiment. Therefore, we developed four single-item scales, ranging from 0 to 100. The extremes of these scales are labelled: not fatigued-extremely fatigued, not bored-extremely bored, no insight-complete insight, and no knowledge-complete knowledge. This type of rating scale for evaluating subjective qualities is commonly used in psychological measurement (Tuckman, 1988). The scales are included in the questionnaire directly after the introduction of the main similarity judgment task and at the end of this task.

Completion time: During the judgment task, subjects registered the time at several points in the questionnaire. From these recordings the total time needed to complete the task is calculated.

Occurrence of missing values: For each subject, the number of missing similarity judgments is expressed as a percentage of the total number of similarity judgments. In the analysis the variable is coded into two categories: no missing values and missing values.

Dimensionality: We use the likelihood ratio test to evaluate dimensionality. Although there are some theoretical problems in the application of this test, Bijmolt et al. (1994) showed that for MAXSCAL the likelihood ratio test outperforms several alternative decision criteria with respect to recovering the "true" dimensionality. Under conditions similar to those in the current study, they showed the likelihood ratio test in MAXSCAL to indicate the correct dimensionality in about 80% of the cases.

Error variance level: MAXSCAL provides an estimate of the error variance. A small estimated error variance corresponds to a good fit of the distances derived by MAXSCAL to the observed dissimilarity data.

Coordinates: MAXSCAL yields estimates of the stimulus coordinates from which (Euclidean) distances can be directly obtained. As the solutions of MAXSCAL are invariant under rotation and scale transformations, we assess effects on the stimulus coordinates through the distance matrix of the solution.

Table 2
Effects of the data collection methods

| Effects | ST | PC | CR | TC |
|---|---|---|---|---|
| Subject-related variables | | | | |
| Fatigue, mean (μ) c | 16.2 | 24.1 | 32.6 | 31.7 |
| Fatigue, increase (Δ) c | 1.2 | 5.9 | 17.1 | 20.9 |
| Boredom, mean (μ) c | 25.8 | 30.9 | 40.0 | 42.6 |
| Boredom, increase (Δ) c | 0.4 | 6.1 | 21.0 | 28.7 |
| Stimulus knowledge, mean (μ) | 36.5 | 35.2 | 37.4 | 41.2 |
| Stimulus knowledge, increase (Δ) | -0.0 | -3.0 | -0.0 | 2.0 |
| Task insight, mean (μ) | 43.7 | 44.2 | 50.0 | 54.4 |
| Task insight, increase (Δ) | 0.2 | -7.7 | -3.4 | -2.6 |
| Similarity judgments | | | | |
| Completion time (min) c | 3.2 | 7.8 | 19.3 | 18.7 |
| Missing values (%) b | 22.2 | 16.3 | 19.6 | 40.0 |
| MDS solution | | | | |
| Dimensionality c | 2.0 | 2.2 | 3.2 | 3.0 |
| Error variance, 2 dim. | 0.9 | 1.8 | 2.1 | 2.0 |
| Error variance, 3 dim. c | 0.2 | 0.8 | 1.0 | 1.0 |

ST = sorting, PC = paired comparisons, CR = conditional rankings, and TC = triadic combinations.
b,c Significant differences at α = 0.05 and 0.01, respectively.
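The reason for comparing solutions through their distance matrices, as described under "Coordinates" in Section 3.3, can be illustrated with a small sketch: rotating a configuration changes every coordinate but leaves all inter-stimulus distances intact. The coordinates and helper names below are hypothetical and chosen only for illustration.

```python
import math


def distance_matrix(coords):
    """Euclidean distances between all pairs of points."""
    return [[math.dist(p, q) for q in coords] for p in coords]


def rotate_2d(coords, angle):
    """Rotate a two-dimensional configuration about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y) for x, y in coords]


# Hypothetical two-dimensional coordinates for three stimuli.
coords = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.0)]
original = distance_matrix(coords)
rotated = distance_matrix(rotate_2d(coords, math.pi / 3))

# Largest absolute difference between corresponding distances.
max_diff = max(
    abs(a - b) for row_a, row_b in zip(original, rotated) for a, b in zip(row_a, row_b)
)
print(original[0][1], max_diff < 1e-9)
```

A uniform rescaling of the configuration, by contrast, multiplies every distance by the same factor, so distance matrices from different solutions are comparable only up to such a scale constant.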

4. Results

4.1. Effects on fatigue, boredom, stimulus knowledge, and task insight

Research question Q1 about the occurrence of fatigue, boredom, stimulus knowledge, and task insight and differences in these aspects between data collection methods is tested with a repeated measurement ANOVA for each of the four aspects separately. In the ANOVA, the effects of the data collection methods are tested between subjects, and changes in the measures during each task are tested within subjects. Table 2 contains the means and changes in the measurements of these four variables across the task for each data collection method.

As indicated by significant within-subject effects (fatigue: F = 114.11, df = 1,170, p < 0.01; boredom: F = 127.39, df = 1,170, p < 0.01), subjects become more fatigued and bored during the judgment tasks. There is a slight decrease in task insight (F = 5.73, df = 1,168, p = 0.02), but stimulus knowledge does not change significantly during the task (F = 0.11, df = 1,170, p = 0.75). The means as well as the increases of fatigue and boredom differ considerably between the data collection methods (Table 2). The mean level of fatigue and boredom is lowest for sorting, somewhat higher for paired comparisons, and largest for conditional rankings and triadic combinations. While the increase of fatigue and boredom is rather small during the sorting and, to a lesser extent, the paired comparisons tasks, the conditional rankings and triadic combinations cause large increases in fatigue and boredom. Stimulus knowledge and task insight of the subjects are almost constant across data collection methods. Hence, research question 1 cannot be confirmed completely. Whereas the judgment tasks cause an increase in fatigue and boredom, our experiment provides little evidence for learning effects in MDS judgment tasks. The data collection methods differ considerably in the extent to which they cause fatigue and boredom.

4.2. Effects on the similarity judgments

The completion time (research question Q2a) differs significantly between the data collection methods (F = 134.78, df = 3,167, p < 0.01). Sorting is clearly the fastest method, with a mean completion time of 3.2 min. Paired comparisons also take relatively little time (mean of 7.8 min), while conditional rankings and triadic combinations (note the incomplete design using 80 out of 220 triads) take 19.3 and 18.7 min, respectively. Most subjects (75%) complete the task entirely without missing answers. The occurrence of missing values (research question Q2b), coded into two categories,⁴ differs significantly between the data collection methods (χ² = 9.02, df = 3, p = 0.03). The paired comparisons task has the highest percentage of subjects who filled in the entire questionnaire (83.7%). A slightly lower percentage is found in the conditional ranking and sorting tasks, 80.4% and 77.8%, respectively. For the method of triadic combinations many subjects skip a few evaluations, and only 60.0% of the subjects complete all judgments.

⁴ Alternative classifications (e.g. 0%, 1-10%, and 11-100%) lead to similar results.

4.3. Effects on the MDS solution
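Before turning to the MDS solutions, the chi-square test on missing values reported in Section 4.2 can be retraced with a short sketch. The per-method subject counts below are not given in the paper. They are back-calculated assumptions chosen to match the reported completion percentages (77.8%, 83.7%, 80.4%, 60.0%) and the total of 195 subjects. With these counts, the Pearson statistic comes out close to the reported 9.02 with 3 degrees of freedom.

```python
# Assumed counts: (subjects with complete data, subjects assigned to the method).
counts = {
    "sorting": (35, 45),
    "paired_comparisons": (41, 49),
    "conditional_rankings": (41, 51),
    "triadic_combinations": (30, 50),
}


def chi_square_complete_vs_missing(counts):
    """Pearson chi-square for a methods x (complete, missing) table."""
    total_n = sum(n for _, n in counts.values())
    total_complete = sum(c for c, _ in counts.values())
    p_complete = total_complete / total_n  # pooled completion rate
    stat = 0.0
    for complete, n in counts.values():
        cells = ((complete, p_complete), (n - complete, 1.0 - p_complete))
        for observed, p in cells:
            expected = n * p
            stat += (observed - expected) ** 2 / expected
    df = (len(counts) - 1) * (2 - 1)
    return stat, df


stat, df = chi_square_complete_vs_missing(counts)
print(round(stat, 2), df)
```

The agreement with the published statistic suggests the assumed counts are plausible, but they remain a reconstruction and not data from the study.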

First, we discuss the effects of the data collection methods on the dimensionality of the solution of MLMDS (research question Q3a). A MAXSCAL analysis (using a lognormal distribution of the error, which accounts for its skewed distribution) is performed for all 20 groups separately (4 data collection methods × 5 experimental groups). The five experimental groups here provide mere replications, since we do not investigate their differences (see also footnote 3). The likelihood ratio test on average indicates a two-dimensional solution for sorting and paired comparisons, but a three-dimensional solution for conditional rankings and triadic combinations. Hence, there are differences between the data collection methods: the number of dimensions derived from conditional rankings and triadic combinations data is higher than that from sorting and paired comparisons data.

As there is evidence that the two-dimensional solution is appropriate for sorting and paired comparisons and the three-dimensional solution for conditional rankings and triadic combinations, we use these respective solutions to assess the effects on the error variance level (research question Q3b). The estimates of the error variance are 0.9, 1.8, 2.1, and 2.0 for the two-dimensional solutions and 0.2, 0.8, 1.0, and 1.0 for the three-dimensional solutions for sorting, paired comparisons, conditional rankings, and triadic combinations, respectively. The error variances differ strongly between the three-dimensional solutions (F = 15.09, df = 3,16, p < 0.01) and somewhat less strongly between the two-dimensional solutions (F = 2.39, df = 3,16, p = 0.11). Note that the effects for two and three dimensions are consistent.

[Figure: three line plots (Dimension 1, Dimension 2, Dimension 3) locating the automobile brands A to L. Legend: A = BMW, B = Citroën, C = Fiat, D = Ford, F = Mercedes, G = Opel, H = Peugeot, J = Saab, K = Toyota.]

Fig. 1. Line plots of the three dimensions derived with MULTISCALE.

The error variance is substantially lower for sorting (and to a lesser extent for paired comparisons) than for conditional rankings and triadic combinations, which indicates that the sorting data (and to a lesser extent the paired comparisons data) are represented best by the distances obtained with MAXSCAL.

The differences between perceptual maps derived from the data collected with the four alternative methods (research question Q3c) are investigated in two steps. First, we compute the distances from the two-dimensional MAXSCAL solutions for sorting and paired comparisons and from the three-dimensional MAXSCAL solutions for conditional rankings and triadic combinations. Here, we retain the number of dimensions indicated by the likelihood ratio test to be appropriate for each method. The resulting 20 matrices with distances between the automobile brands, corresponding to 20 perceptual maps, are to be compared in the second step. These matrices are, therefore, simultaneously analyzed with MULTISCALE (Ramsay, 1982), a metric maximum likelihood MDS program. We estimate a weighted Euclidean distance model to investigate whether a common perceptual map underlies the distances derived from the data of the four methods and whether the methods cause subjects to weigh the dimensions of this common perceptual map differently (Humphreys, 1982; Kinnear and Taylor, 1975).

Fig. 1 contains line plots of the three dimensions⁵ derived with MULTISCALE. Overall, the solution presented fits the 20 distance matrices reasonably well. The mean correlation between the original distances (from MAXSCAL, computed in the first step) and the derived distances differs significantly between the data collection methods. Paired comparisons have higher correlations on average (0.91) compared to sorting (0.81), conditional rankings (0.83), and triadic combinations (0.87). So the perceptual map derived with MAXSCAL from paired comparison data represents the common underlying perceptual map best.

In order to interpret the dimensions of the common perceptual map, the average ratings of the automobile brands on five attributes and dummy variables for the country-of-origin of each brand are fitted into the perceptual map. The dimensions are interpreted on the basis of attributes which have a cosine larger than 0.80. The first dimension can be interpreted as status, with brands perceived as cheap, economical, small, ugly, and slow on one side and brands perceived as expensive, wasteful of energy, large, beautiful, and fast on the other. We interpret the second and third dimensions to be related to the country-of-origin. The second dimension has Japanese and South-European automobiles as extremes. The third dimension tends to separate German automobiles from the other brands.

We perform ANOVA to assess whether the dimensional weights differ between the data collection methods. The weights for dimension two (F-value = 9.53, df = 3, p < 0.01) and dimension three (F-value = 8.15, df = 3, p < 0.01) differ significantly between data collection methods, but the weights for dimension one do not (F-value = 0.81, df = 3, p = 0.51). If sorting and paired comparisons data are collected, dimension two is given higher weight by the subjects, while if conditional ranking and triadic combinations are used, dimension three receives higher weight. This is related to the fact that for the latter two methods three dimensions emerge, while two dimensions are derived from sorting and paired comparisons data. Apparently, using conditional ranking and triadic combinations, more information is extracted from the subjects and the country-of-origin dimension is split into two separate dimensions reflecting the country-of-origin features.

⁵ The likelihood ratio test indicates the three-dimensional solution of MULTISCALE as the most appropriate.

Table 3
Accumulated knowledge on data collection methods

| Effects | ST | PC | CR | TC |
|---|---|---|---|---|
| Subject-related variables | | | | |
| Fatigue; Boredom | + + + | + + | - | - |
| Stimulus knowledge | + | + | + | + |
| Task insight | + | + | + | + |
| Similarity judgments | | | | |
| Completion time | + + | + | ± | - |
| Missing values; Data quality | + | + + + | + + | + |
| MDS solution | | | | |
| Recovery of known distances; Fit of the data; Dimensionality; Error variance | + + + | + + + + | + + + -- | + + + -- |
| Perceptual map | x | x | x | x |

ST = sorting, PC = paired comparisons, CR = conditional rankings, and TC = triadic combinations.
+ + = very good, + = good, ± = medium, - = poor, x = studied but no ordering possible.
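The weighted Euclidean distance model estimated in Section 4.3 can be illustrated with a minimal sketch. The form used here, in which each data source applies its own non-negative weight to the squared coordinate difference on each dimension of a common space, is the usual individual-differences formulation. Whether MULTISCALE parametrizes the weights in exactly this way is an assumption, and all coordinates and weights below are hypothetical.

```python
import math


def weighted_distance(x_i, x_j, weights):
    """Weighted Euclidean distance between two stimuli for one data source."""
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, x_i, x_j)))


# Hypothetical common-space coordinates on three dimensions
# (status, first country-of-origin dimension, second country-of-origin dimension).
brand_a = (1.0, 0.5, -0.5)
brand_b = (-1.0, 0.0, 0.5)

# Hypothetical weights: the first source ignores dimension three,
# the second gives it full weight and downweights dimension two.
weights_first = (1.0, 1.0, 0.0)
weights_second = (1.0, 0.25, 1.0)

d_first = weighted_distance(brand_a, brand_b, weights_first)
d_second = weighted_distance(brand_a, brand_b, weights_second)
print(d_first, d_second)
```

The same pair of brands ends up at different distances depending on the weights, which is how a single common map can accommodate a source that effectively uses two dimensions alongside one that uses three.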

5. Discussion

In this study, we demonstrate that the choice of a data collection method is critical in an MDS positioning study, as it affects respondent fatigue and boredom, completion time, missing values, and the perceptual map that is retrieved from the respondents. Table 3 integrates the findings of this study and those of previous research to provide an overview of the accumulated knowledge on differences between data collection methods to date.

The methods of sorting, paired comparisons, conditional ranking, and triadic combinations extract an increasing amount of information from respondents in that order, resulting in solutions of higher dimensionality. On the other hand, conditional ranking and triadic combinations require more completion time, cause a larger increase in fatigue and boredom of subjects, and result in higher levels of error variance in the MDS solution. For these reasons, we recommend that conditional rankings and triadic combinations are used only if the stimulus set is relatively small, and in situations where the maximum amount of information is to be extracted from the respondents. If the stimulus set is relatively large, as is commonly encountered in marketing research, the methods of sorting and paired comparisons are better suited for collecting similarity data. The choice between sorting and paired comparisons will depend on characteristics of the application, such as the number of stimuli and whether or not individual-level perceptual maps are desired.

Sorting is clearly the fastest method and causes the least respondent fatigue and boredom. This method is especially appropriate for collecting similarity data if the number of stimuli is very large. A drawback of sorting is that it yields very little information for individual subjects, which makes it impossible to derive individual-level perceptual maps. The use of paired comparisons offers a reasonable compromise between retrieved information and respondent fatigue. This judgment task has proved to take relatively little time and effort from the subjects, and almost all subjects complete the entire task. Furthermore, paired comparisons appear to offer an MDS solution intermediate to those based on sorting, conditional rankings, or triadic combinations. Nevertheless, we recommend that in applying paired comparisons attempts should be made to control respondent fatigue and boredom.

We believe that, next to model development, the further study of data collection methods constitutes an important contribution to a well-founded application of MDS in marketing science and in practice. We suggest that future research should focus on gaining deeper insight into the process of the formation of the similarity judgments, on assessing the biases in perceptual maps caused by various factors, and on developing procedures to reduce such biases.

References

Bijmolt, T.H.A., M. Wedel and M.C. Mellens, 1994. A Monte Carlo evaluation of maximum likelihood Multidimensional Scaling methods. University of Groningen, Institute of Economic Research, research memorandum 566.
Cooper, L.G., 1983. A review of Multidimensional Scaling in Marketing Research. Applied Psychological Measurement 7, 427-450.
Coxon, A.P.M., 1982. The user's guide to Multidimensional Scaling with special reference to the MDS(X) library of computer programs. London: Heinemann.
Henry, W.A. and R.V. Stumpf, 1975. Time and accuracy measures for alternative multidimensional scaling data collection methods. Journal of Marketing Research 12, 165-170.
Humphreys, M.A., 1982. Data collection effects on nonmetric Multidimensional Scaling solutions. Educational and Psychological Measurement 42, 1005-1022.
Johnson, M.D., D.R. Lehmann and D.R. Horne, 1990. The effects of fatigue on judgments of interproduct similarity. International Journal of Research in Marketing 7, 35-43, 53-56.
Johnson, M.D., D.R. Lehmann, C. Fornell and D.R. Horne, 1992. Attribute abstraction, feature-dimensionality, and the scaling of product similarities. International Journal of Research in Marketing 9, 131-147.
Kinnear, T.C. and J.R. Taylor, 1975. The effects of [...] judgment collection procedures on individual differences scaling results: An empirical examination. In: R.C. Curhan (ed.), 1974 combined proceedings, American Marketing Association, 162-163.
Malhotra, N.K., 1987. Validity and structural reliability of Multidimensional Scaling. Journal of Marketing Research 24, 164-173.
McIntyre, S.H. and A.B. Ryans, 1977. Time and accuracy measures for alternative Multidimensional Scaling data collection methods: Some additional results. Journal of Marketing Research 14, 607-610.
Neidell, L.A., 1972. Procedures for obtaining similarities data. Journal of Marketing Research 9, 335-337.
Ramsay, J.O., 1982. Some statistical approaches to Multidimensional Scaling data. Journal of the Royal Statistical Society 145 (A), 285-312.
Rao, V.R. and R. Katz, 1971. Alternative Multidimensional Scaling methods for large stimulus sets. Journal of Marketing Research 8, 488-494.
Takane, Y., 1981. Multidimensional successive categories scaling: A maximum likelihood method. Psychometrika 46, 9-28.
Takane, Y., 1982. The method of triadic combinations: A new treatment and its application. Behaviormetrika 11, 37-48.
Takane, Y. and J.D. Carroll, 1981. Nonmetric maximum likelihood Multidimensional Scaling from directional rankings of similarities. Psychometrika 46, 389-405.
Tuckman, B.W., 1988. The scaling of mood. Educational and Psychological Measurement 48, 419-427.
Whipple, T.W., 1976. Variation among Multidimensional Scaling solutions: An examination of the effect of data collection differences. Journal of Marketing Research 13, 98-103.