Probabilistic Multidimensional Scaling Using a City-Block Metric

Journal of Mathematical Psychology 45, 249–264 (2001), doi:10.1006/jmps.2000.1311, available online at http://www.idealibrary.com

David B. MacKay
Indiana University

Using a probabilistic model, exact and approximate probability density functions (PDFs) for city-block distances and distance ratios are developed. The model assumes that stimuli can be represented by random vectors having multivariate normal distributions. Comparisons with the more common Euclidean PDFs are presented. The potential ability of the proposed model to correctly detect Euclidean and city-block metrics is briefly investigated. These results are then contrasted to those obtained using a deterministic, nonmetric model. © 2001 Academic Press

INTRODUCTION

It has been argued for some time that when the attributes of stimuli are obvious or separable, spatial or geometric models of the postulated psychological space should be constructed using a city-block rather than a Euclidean metric (Arabie, 1991; Attneave, 1950; Householder & Landahl, 1945; Shepard, 1987; Torgerson, 1958). In fact, much of the literature in experimental psychology involves stimuli with separable attributes. However, although much work has been done on probabilistic versions of Euclidean metrics, very little has been done to develop a probabilistic version of the city-block metric. Probabilistic models that assume Gaussian stimuli almost always use a Euclidean metric (viz. Ashby & Gott, 1988; Ashby & Maddox, 1993; Ashby & Perrin, 1988; Bijmolt, 1996; Bossuyt, 1990; De Soete, Carroll, & DeSarbo, 1986; Ennis, Palen, & Mullen, 1988; Hefner, 1958; MacKay, 1989; MacKay & Zinnes, 1995; Zinnes & Griggs, 1974). Exceptions to this pattern of using Euclidean metrics with Gaussian stimuli include Ennis and Ashby (1993) and Ennis and Johnson (1993), who computed the moments of city-block distances among dimensionally independent stimuli, and MacKay, Bowen, and Zinnes (1996), who defined the PDF of a city-block distance ratio in a stimulus space of one dimension.

The author thanks J. L. Zinnes, D. M. Ennis, and an anonymous reviewer for their helpful comments on an earlier draft of this paper. Address correspondence and reprint requests to David B. MacKay, Kelley School of Business, Indiana University, 1309 East Tenth Street, Bloomington, IN 47405-1701. E-mail: mackay@indiana.edu.

0022-2496/01 $35.00 Copyright © 2001 by Academic Press. All rights of reproduction in any form reserved.

Probabilistic geometric models are often estimated using maximum likelihood methods. The absence of readily computed PDFs for non-Euclidean metrics has been a limiting factor in going beyond Euclidean metrics. In this paper, the PDFs for city-block distances and distance ratios among multidimensional stimuli are developed. Computational forms are provided so they may be readily applied. This is followed by a comparison of city-block and Euclidean PDFs which indicates that there are some situations in which the two PDFs are very similar. The potential ability of models based upon the city-block PDF to correctly distinguish a city-block or Euclidean metric under different levels of stimulus variability is briefly explored in a simulation study. Corresponding results for a commonly used nonmetric model are also reported.

CITY-BLOCK DISTANCES

It is assumed that the perceptual aspects of each of n stimuli can be represented by a random vector having a multivariate normal distribution. Specifically, for stimulus $S_j$, $j = 1, \ldots, n$, let $X_j = (x_{j1}, \ldots, x_{jp})$ be a p-dimensional random vector and assume that it has a p-variate normal distribution with mean vector $\mu_j = (\mu_{j1}, \ldots, \mu_{jp})$ and diagonal variance matrix $\Sigma_j$. The city-block distance between $S_i$ and $S_j$ is

$$d_{ij} = \sum_{k=1}^{p} |x_{ik} - x_{jk}| = \sum_{k=1}^{p} |x_{ijk}|. \tag{1}$$
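As an illustration of Eq. (1), the following minimal sketch draws one city-block distance between two stimuli whose coordinates are independent normal variables. It assumes only Python's standard `random` module; the function name `sample_city_block_distance` and the example means and variances are hypothetical and are not taken from the paper.

```python
import random


def sample_city_block_distance(mu_i, var_i, mu_j, var_j, rng):
    """Draw one city-block distance d_ij between two Gaussian stimuli (Eq. 1).

    mu_* and var_* are per-dimension means and variances (diagonal variance
    matrix), so the coordinates on different dimensions are independent.
    """
    d = 0.0
    for m_i, v_i, m_j, v_j in zip(mu_i, var_i, mu_j, var_j):
        x_ik = rng.gauss(m_i, v_i ** 0.5)  # coordinate of stimulus i on dimension k
        x_jk = rng.gauss(m_j, v_j ** 0.5)  # coordinate of stimulus j on dimension k
        d += abs(x_ik - x_jk)              # |x_ik - x_jk| = |x_ijk|
    return d


# Illustrative values only: two stimuli in a two-dimensional space.
rng = random.Random(0)
draws = [
    sample_city_block_distance([0.0, 0.0], [1.0, 0.5], [1.0, 2.0], [1.0, 0.5], rng)
    for _ in range(5)
]
```

With all variances set to zero the function returns the deterministic city-block distance between the two centroids, which is a convenient sanity check.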

When the underlying distribution is normal, absolute values follow a "folded normal distribution." To derive a folded normal PDF, let $y_k = x_{ijk}$ for simplicity and consider a one-dimensional normal PDF for random variable $Y_k$,

$$g(y_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k} \exp\left[-\frac{(y_k - \mu_k)^2}{2\sigma_k^2}\right], \tag{2}$$

where $\mu_k$ and $\sigma_k^2$ are, respectively, the mean and variance of $Y_k$. To restrict the random variable to nonnegative values, consider the situation where $X_k = |Y_k|$. Now, $X_k$ is said to have a folded normal distribution because the distribution can be regarded as being formed by folding the part corresponding to negative $Y_k$ about the vertical axis and then adding it to the positive part. The relation of the normal and folded normal is illustrated in Fig. 1 for three density functions with normal distribution means of one, two, and three, respectively, and variances all of one. The folding effect diminishes as the means of the original normal distributions increase.

FIG. 1. Relation of normal and folded normal PDFs.

The PDF of the folded normal distribution (Leone, Nelson, & Nottingham, 1961; Johnson, Kotz, & Balakrishnan, 1995) is

$$f(x_k) = \frac{1}{\sqrt{2\pi}\,\sigma_k}\left[\exp\left(-\frac{(x_k - \mu_k)^2}{2\sigma_k^2}\right) + \exp\left(-\frac{(x_k + \mu_k)^2}{2\sigma_k^2}\right)\right], \quad x_k \ge 0, \tag{3}$$

or, equivalently, using a hyperbolic cosine function,

$$f(x_k) = \sqrt{2/\pi}\,\sigma_k^{-1}\left[\cosh(\mu_k x_k \sigma_k^{-2}) \exp\left(-\tfrac{1}{2}(x_k^2 + \mu_k^2)\,\sigma_k^{-2}\right)\right], \quad x_k \ge 0. \tag{4}$$
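The equivalence of Eqs. (3) and (4) can be checked numerically. The sketch below assumes only Python's standard `math` module; the function names and the trapezoid helper are illustrative choices rather than part of the original paper.

```python
import math


def folded_normal_pdf(x, mu, sigma):
    """Eq. (3): sum of the two normal branches, defined for x >= 0."""
    if x < 0:
        return 0.0
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * sigma)
    return norm * (math.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
                   + math.exp(-(x + mu) ** 2 / (2 * sigma ** 2)))


def folded_normal_pdf_cosh(x, mu, sigma):
    """Eq. (4): the same density written with a hyperbolic cosine."""
    if x < 0:
        return 0.0
    return (math.sqrt(2.0 / math.pi) / sigma
            * math.cosh(mu * x / sigma ** 2)
            * math.exp(-0.5 * (x ** 2 + mu ** 2) / sigma ** 2))


def trapezoid(f, lo, hi, n=4000):
    """Composite trapezoid rule, used here only as a sanity check."""
    h = (hi - lo) / n
    return h * (0.5 * f(lo) + sum(f(lo + i * h) for i in range(1, n)) + 0.5 * f(hi))


# The density should integrate to approximately one over [0, infinity).
area = trapezoid(lambda x: folded_normal_pdf(x, 1.0, 1.0), 0.0, 12.0)
```

One practical caveat: `math.cosh` overflows for very large arguments, so when $\mu_k x_k / \sigma_k^2$ is large the form of Eq. (3) is the numerically safer of the two.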

To find the PDF of a two-dimensional city-block distance, start with the joint density of two independent folded normal random variables $X_i$ and $X_j$, which is

$$f(x_i, x_j) = \frac{1}{2\pi\sigma_i\sigma_j}\left[\exp\left(-\frac{(x_i - \mu_i)^2}{2\sigma_i^2}\right) + \exp\left(-\frac{(x_i + \mu_i)^2}{2\sigma_i^2}\right)\right] \times \left[\exp\left(-\frac{(x_j - \mu_j)^2}{2\sigma_j^2}\right) + \exp\left(-\frac{(x_j + \mu_j)^2}{2\sigma_j^2}\right)\right]. \tag{5}$$

Transformations of $X_i$ and $X_j$ are sought that will (1) define the city-block distance and (2) be mathematically tractable. Define the transformations $Z_i = X_i + X_j$ and $Z_j = X_i/(X_i + X_j)$. The first transformation provides the desired city-block distance and the second has convenient (0, 1) limits of integration. Then, $X_i = Z_i Z_j$ and $X_j = Z_i - Z_i Z_j$, and the absolute value of the Jacobian of the transformation is $Z_i$. In terms of the new variables,

$$f(z_i, z_j) = \frac{z_i}{2\pi\sigma_i\sigma_j}\left[\exp\left(-\frac{(z_i - \mu_j - z_i z_j)^2}{2\sigma_j^2}\right) + \exp\left(-\frac{(z_i + \mu_j - z_i z_j)^2}{2\sigma_j^2}\right)\right] \times \left[\exp\left(-\frac{(z_i z_j - \mu_i)^2}{2\sigma_i^2}\right) + \exp\left(-\frac{(z_i z_j + \mu_i)^2}{2\sigma_i^2}\right)\right]. \tag{6}$$

To find the density function $g(z_i)$ of $Z_i = X_i + X_j$, integrate out $z_j$,

$$g(z_i) = \int_{z_j=0}^{1} f(z_i, z_j)\,dz_j. \tag{7}$$
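Before turning to a closed form, Eq. (7) can be evaluated by direct numerical integration of Eq. (6). The following is a minimal sketch under that assumption, using only Python's standard `math` module and a midpoint rule; the function names, the number of subintervals, and the example parameters are illustrative and not from the paper.

```python
import math


def joint_density_eq6(z_i, z_j, mu_i, sigma_i, mu_j, sigma_j):
    """Eq. (6): joint density of Z_i = X_i + X_j and Z_j = X_i / (X_i + X_j)."""
    x_i = z_i * z_j        # back-transformed first folded normal variable
    x_j = z_i - z_i * z_j  # back-transformed second folded normal variable
    part_i = (math.exp(-(x_i - mu_i) ** 2 / (2 * sigma_i ** 2))
              + math.exp(-(x_i + mu_i) ** 2 / (2 * sigma_i ** 2)))
    part_j = (math.exp(-(x_j - mu_j) ** 2 / (2 * sigma_j ** 2))
              + math.exp(-(x_j + mu_j) ** 2 / (2 * sigma_j ** 2)))
    # The leading z_i is the absolute value of the Jacobian.
    return z_i / (2 * math.pi * sigma_i * sigma_j) * part_i * part_j


def city_block_pdf_2d_numeric(z_i, mu_i, sigma_i, mu_j, sigma_j, n=400):
    """Eq. (7): integrate z_j out over (0, 1) with the midpoint rule."""
    h = 1.0 / n
    return h * sum(
        joint_density_eq6(z_i, (k + 0.5) * h, mu_i, sigma_i, mu_j, sigma_j)
        for k in range(n)
    )


# Illustrative check: the resulting density should integrate to roughly one.
step = 0.05
values = [city_block_pdf_2d_numeric(k * step, 1.0, 1.0, 0.5, 0.7) for k in range(301)]
area = step * (sum(values) - 0.5 * (values[0] + values[-1]))
```

This brute-force route is useful mainly as a reference against which a closed-form expression can be compared.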

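As shown in the Appendix, the integral simplifies nicely to

$$g(z_i) = \frac{1}{\sqrt{2\pi}\,a}\sum_{k_1=1}^{2}\sum_{k_2=1}^{2}\sum_{k_3=1}^{2}\frac{\Phi\left[\left((-1)^{k_3}\mu_j\sigma_i^2 + (-1)^{k_2}\mu_i\sigma_j^2 + c_{k_1}z_i\right)/(ab)\right] - (1/2)}{\exp\left[\left((-1)^{k_1+k_2}\mu_i + (-1)^{k_1+k_3+1}\mu_j + z_i\right)^2/(2a^2)\right]}, \tag{8}$$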

where $a = \sqrt{\sigma_i^2 + \sigma_j^2}$, $b = \sigma_i\sigma_j$, $c_1 = \sigma_i^2$, $c_2 = \sigma_j^2$, and $\Phi$ is the cumulative distribution function (CDF) of the standardized normal distribution. Similar methods were tried to derive PDFs for city-block distances in higher dimensional spaces, but none were found that did not require numerical integration. When using a transformation like that used to derive the PDF for two-dimensional city-block distances, the PDF for a city-block distance in a three-dimensional space can be represented as

$$h(y_i) = \int_{y_j=0}^{1} f(y_i, y_j)\,dy_j, \tag{9}$$

where $Y_i = X_i + X_j + X_k$, $Y_j = (X_i + X_j)/(X_i + X_j + X_k)$, and $f(y_i, y_j)$ is the joint density function of $y_i$ and $y_j$. While numerical integration has been used for PDF estimation in maximum likelihood multidimensional scaling (MDS) algorithms before, it is desirable to avoid numerical integration due to its time-consuming nature and occasional numerical instability.

To compute likelihood functions of higher dimensional city-block distances, approximations were sought which would make use of either the moments, derived from the Ennis and Johnson (1993) moment generating function $\phi(t)$ for city-block distances, or the corresponding cumulants, derived from the cumulant generating function $\psi(t) = \ln \phi(t)$. The folded normal distribution can be expressed as a function of the noncentral $\chi$ distribution. Springer (1979) has shown that distributions of this type are uniquely determined by their cumulants. A review of approximations for $\chi^2$, $\chi$, and Rayleigh distributions is provided by Johnson, Kotz, & Balakrishnan (1994). Based on their review, Haldane's (1937) method, a generalization of the widely cited Wilson and Hilferty (1931) approximation to the $\chi^2$ distribution, was used to estimate the PDF for city-block distances. Haldane's method approximates the distribution of $[d_{ij}/E(d_{ij})]^h$, where $d_{ij}$ is the city-block distance between $S_i$ and $S_j$, and $E(d_{ij})$ is the expected value, by a Gaussian distribution, with h defined from the cumulants of the distribution of $d_{ij}$ so as to accelerate its convergence to normality. The quality of the approximation improves as the dimensionality of the space increases. A comparison of the PDF estimated by this approximation to that obtained from Eq. (9) in three dimensions thus serves as a "worst case" indicator of the error to be expected from using the approximation in higher dimensions.

The quality of the approximation is excellent. Figure 2 illustrates the differences between the exact and approximate city-block PDFs for three situations. In panel A, the differences $\mu_k$ in the centroids of the stimuli for all three dimensions are 0.1. In panels B and C they are 1.0 and 2.0, respectively. Equal values of $\mu_k$ on all dimensions were chosen to maximize the differences in the city-block and Euclidean metrics. Since studies on the use of similar approximations in Euclidean spaces have shown that the quality of the approximations diminishes as the differences in the magnitudes of the dimensional variances increase (Jensen & Solomon, 1972), an anisotropic variance matrix was defined. Dimensional variances were represented by a diagonal matrix with values (2.0, 1.0, 0.5) on the diagonal. PDFs for the exact city-block, approximate city-block, and Euclidean metrics are included in each panel.

FIG. 2. Exact and approximate PDFs for city-block distances and PDFs for Euclidean distances among stimuli in three-dimensional space. In panel A, the differences between the centroids of the two stimuli are 0.1 on all three dimensions. For panels B and C, the differences are 1.0 and 2.0, respectively. In all three panels, the dimensional variances are (2.0, 1.0, 0.5).

It is only when the variances are very large relative to the differences in the centroids that any appreciable difference occurs in the PDFs of the exact and approximate city-block distances. Even there, the largest absolute difference has a relative magnitude (|Actual PDF − Approximate PDF| / Actual PDF) of only 0.028. In all cases, the mode and curvature of the actual and approximate PDFs are very close. Differences between the city-block and Euclidean PDFs are much greater. As the relative magnitudes of the variances decrease, the distinction between the city-block and Euclidean PDFs increases. These results suggest that in situations where the dimensional variability of the stimuli is high, distinguishing the correct metric may not be easy. Comparisons were also made in isotropic spaces in which the variances are the same on all dimensions. Results (not shown) were slightly better.

CITY-BLOCK DISTANCE RATIOS

PDFs for distance ratios are needed when modeling preference ratio judgments. For preference ratio judgments, a subject indicates which of a pair of stimuli he or she prefers and how many times he or she prefers the more preferred stimulus over the less preferred stimulus. Preference ratios extract more information from the subject on each trial than binary choices and often require fewer replications to estimate model parameters at a given level of accuracy. To find the density function of the ratio of two independent random variables $X_i$ and $X_j$ at $x_i$ and $x_j$, we start with the folded normal density function (4) and define the joint density $f(x_i, x_j)$ of $X_i$, $X_j$ by the hyperbolic cosine representation of Eq. (5),

$$f(x_i, x_j) = (2/\pi)\,\sigma_i^{-1}\sigma_j^{-1}\cosh(\mu_i x_i \sigma_i^{-2})\cosh(\mu_j x_j \sigma_j^{-2}) \times \exp\left(-\tfrac{1}{2}(\mu_i^2 + x_i^2)\,\sigma_i^{-2} - \tfrac{1}{2}(\mu_j^2 + x_j^2)\,\sigma_j^{-2}\right). \tag{10}$$

Define the transformations $Z_i = X_i/X_j$ and $Z_j = X_j$. Then, $X_i = Z_i Z_j$, $X_j = Z_j$, and the absolute value of the Jacobian of the transformation is $Z_j$. In terms of the new variables,

$$f(z_i, z_j) = z_j\,\frac{2}{\pi}\,\sigma_i^{-1}\sigma_j^{-1}\cosh(\mu_i z_i z_j \sigma_i^{-2})\cosh(\mu_j z_j \sigma_j^{-2}) \times \exp\left(-\frac{1}{2}\left(\mu_i^2 + (z_i z_j)^2\right)\sigma_i^{-2} - \frac{1}{2}\left(\mu_j^2 + z_j^2\right)\sigma_j^{-2}\right). \tag{11}$$

To find the density function $g(z_i)$ of $Z_i = X_i/X_j$, integrate out $z_j$,

$$g(z_i) = \int_{z_j=0}^{\infty} f(z_i, z_j)\,dz_j. \tag{12}$$

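Integrating and canceling as before, we get, after a few steps,

$$g(x_i/x_j) = \frac{2\exp\left(-\dfrac{\mu_i^2}{2\sigma_i^2} - \dfrac{\mu_j^2}{2\sigma_j^2}\right)}{\pi\,\sigma_i\sigma_j\,a} + \frac{b\,\exp(d)\,\mathrm{erf}\left(b/\sqrt{2a}\right)}{\sqrt{2\pi}\,\sigma_i\sigma_j\,a^{3/2}} + \frac{c\,\exp(e)\,\mathrm{erf}\left(c/\sqrt{2a}\right)}{\sqrt{2\pi}\,\sigma_i\sigma_j\,a^{3/2}}, \tag{13}$$

where

$$a = (x_i/x_j)^2\,\sigma_i^{-2} + \sigma_j^{-2},$$
$$b = (x_i/x_j)\,\mu_i\sigma_i^{-2} + \mu_j\sigma_j^{-2},$$
$$c = (x_i/x_j)\,\mu_i\sigma_i^{-2} - \mu_j\sigma_j^{-2},$$
$$d = -\frac{\left(\mu_i - (x_i/x_j)\,\mu_j\right)^2}{2\left(\sigma_i^2 + (x_i/x_j)^2\sigma_j^2\right)},$$
$$e = -\frac{\left(\mu_i + (x_i/x_j)\,\mu_j\right)^2}{2\left(\sigma_i^2 + (x_i/x_j)^2\sigma_j^2\right)}.$$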
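The one-dimensional ratio density can be sanity-checked by comparing a closed form built from the quantities a, b, c, d, and e against a direct numerical evaluation of Eq. (12). The sketch below assumes only Python's standard `math` module. The closed-form expression coded here is one that follows from integrating Eq. (11) term by term, and it is written with the error function, which may differ in layout from the paper's printed Eq. (13); the function names and parameter values are illustrative.

```python
import math


def ratio_pdf_closed(z, mu_i, sigma_i, mu_j, sigma_j):
    """Closed-form density of Z = X_i / X_j for two folded normal variables."""
    a = z ** 2 / sigma_i ** 2 + 1.0 / sigma_j ** 2
    b = z * mu_i / sigma_i ** 2 + mu_j / sigma_j ** 2
    c = z * mu_i / sigma_i ** 2 - mu_j / sigma_j ** 2
    denom = 2.0 * (sigma_i ** 2 + z ** 2 * sigma_j ** 2)
    d = -(mu_i - z * mu_j) ** 2 / denom
    e = -(mu_i + z * mu_j) ** 2 / denom
    first = (2.0 * math.exp(-mu_i ** 2 / (2 * sigma_i ** 2) - mu_j ** 2 / (2 * sigma_j ** 2))
             / (math.pi * sigma_i * sigma_j * a))
    second = ((b * math.exp(d) * math.erf(b / math.sqrt(2 * a))
               + c * math.exp(e) * math.erf(c / math.sqrt(2 * a)))
              / (math.sqrt(2 * math.pi) * sigma_i * sigma_j * a ** 1.5))
    return first + second


def ratio_pdf_numeric(z, mu_i, sigma_i, mu_j, sigma_j, n=4000):
    """Eq. (12): trapezoid rule applied to the joint density of Eq. (11)."""
    upper = mu_j + 10.0 * sigma_j  # assumption: the tail beyond this is negligible

    def f(w):
        return (w * (2.0 / math.pi) / (sigma_i * sigma_j)
                * math.cosh(mu_i * z * w / sigma_i ** 2)
                * math.cosh(mu_j * w / sigma_j ** 2)
                * math.exp(-0.5 * (mu_i ** 2 + (z * w) ** 2) / sigma_i ** 2
                           - 0.5 * (mu_j ** 2 + w ** 2) / sigma_j ** 2))

    h = upper / n
    return h * (0.5 * f(0.0) + sum(f(k * h) for k in range(1, n)) + 0.5 * f(upper))
```

When both means are zero and both standard deviations are one, the ratio of the two folded normals is a half-Cauchy variable, so the density at a ratio of one should equal $1/\pi$.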

A closed form expression not requiring numerical integration could not be found for distance ratios in spaces of two or more dimensions. To calculate the PDF using numerical integration, the transformation and integration of Eq. (12) is used where the original variables $X_i$ and $X_j$ are replaced by city-block distances in a two-dimensional space. As before, we now seek a cumulant-based approximation that can be accurately and rapidly used to calculate the PDF of a ratio of city-block distances that are defined in a space of any dimensionality. Cumulants of the ratio are not easily calculated. It is thus better to make use of the property that

$$P(d_{ij}/d_{ik} < r) = P(d_{ij} - r\,d_{ik} < 0) \tag{14}$$

and work with the cumulants of the weighted difference. Normalizing transformations such as Haldane's approximation will not work with differences such as Eq. (14) that can take on negative values. An approximation that has been successfully used in similar situations (Imhof, 1961) is Pearson's (1959) three-moment central $\chi^2$ approximation to the distribution of noncentral $\chi^2$. Following Pearson, if we let $E(R)$ and $\sigma_R$ be the mean and standard deviation of the ratio of two folded normal distributions, we can approximate the distribution of the ratio R by U, where

$$U \sim \left(\frac{\chi^2_\nu - \nu}{(2\nu)^{1/2}}\right)\sigma_R + E(R), \tag{15}$$

and determine $\nu$ so that R and U have equal third cumulants. Since $E(\chi^2_\nu) = \nu$ and $\mathrm{Var}(\chi^2_\nu) = 2\nu$, $E(U) = E(R)$ and $\mathrm{Var}(U) = \sigma_R^2$. The third cumulant of U (Mathai & Provost, 1992, p. 165) is

$$\kappa_3(U) = 2\sqrt{2}\,\sigma_R^3/\sqrt{\nu}. \tag{16}$$

To find the cumulants of the ratio, derive the cumulant-generating function $\psi(t)$ of the folded normal from Ennis and Johnson's moment generating function. In one dimension, for random variable $|D|$ where $D \sim N(\mu, \sigma^2)$,

$$\psi(t) = \ln\left(\exp[-t(2\mu - t\sigma^2)/2]\,\Phi[-(\mu - t\sigma^2)/\sigma] + \exp[t(2\mu + t\sigma^2)/2]\,\Phi[(\mu + t\sigma^2)/\sigma]\right). \tag{17}$$
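The link between Eq. (17) and the cumulants can be illustrated numerically: central differences of the cumulant-generating function at t = 0 should reproduce the mean and variance of the folded normal. The sketch below assumes only Python's standard `math` module; the function names and the step size `h` are illustrative choices, and finite differences are used here only as a check, not as the paper's method of differentiation.

```python
import math


def std_normal_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def folded_normal_cgf(t, mu, sigma):
    """Eq. (17): cumulant-generating function of |D|, D ~ N(mu, sigma^2)."""
    s2 = sigma ** 2
    return math.log(
        math.exp(-t * (2 * mu - t * s2) / 2) * std_normal_cdf(-(mu - t * s2) / sigma)
        + math.exp(t * (2 * mu + t * s2) / 2) * std_normal_cdf((mu + t * s2) / sigma)
    )


def cumulants_by_finite_differences(mu, sigma, h=1e-3):
    """First two cumulants from central differences of the CGF at t = 0."""
    k_plus = folded_normal_cgf(h, mu, sigma)
    k_zero = folded_normal_cgf(0.0, mu, sigma)
    k_minus = folded_normal_cgf(-h, mu, sigma)
    kappa1 = (k_plus - k_minus) / (2 * h)
    kappa2 = (k_plus - 2 * k_zero + k_minus) / h ** 2
    return kappa1, kappa2


def folded_normal_mean_var(mu, sigma):
    """Standard closed-form mean and variance of the folded normal."""
    mean = (math.sqrt(2 / math.pi) * sigma * math.exp(-mu ** 2 / (2 * sigma ** 2))
            + mu * (2 * std_normal_cdf(mu / sigma) - 1))
    return mean, mu ** 2 + sigma ** 2 - mean ** 2
```

The closed-form mean and variance returned by `folded_normal_mean_var` are the per-dimension terms that are summed, with the appropriate weights, to obtain the first two cumulants of the weighted difference.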

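Remembering Eq. (14), the property that the cumulant of a sum of independent random variables is the sum of the cumulants, and the property that if a variate is multiplied by a constant c the sth cumulant is multiplied by $c^s$, then the sth cumulant $\kappa_s$ can readily be found by differentiating Eq. (17) s times with respect to t and letting $t \to 0$. Adding a subscript i for dimensionality and a subscript j to indicate whether we are concerned with the first ($d_{ij}$) or the second ($r\,d_{ik}$) term of Eq. (14), we get the following for the first three cumulants $\kappa_1(R)$, $\kappa_2(R)$, and $\kappa_3(R)$ of the ratio,

$$\kappa_1(R) = \sum_j \sum_i \delta_j \left(\sqrt{\frac{2}{\pi}}\,\sigma_{ij}\exp\left(\frac{-\mu_{ij}^2}{2\sigma_{ij}^2}\right) + \mu_{ij}\left(2\Phi\left(\frac{\mu_{ij}}{\sigma_{ij}}\right) - 1\right)\right), \tag{18}$$

where

$$\delta_j = \begin{cases} 1 & \text{if } j = 1 \\ -r & \text{if } j = 2, \end{cases}$$

$$\kappa_2(R) = \sum_j \sum_i \delta_j \left(\mu_{ij}^2 + \sigma_{ij}^2 - \left(\sqrt{\frac{2}{\pi}}\,\sigma_{ij}\exp\left(\frac{-\mu_{ij}^2}{2\sigma_{ij}^2}\right) + \mu_{ij}\left(2\Phi\left(\frac{\mu_{ij}}{\sigma_{ij}}\right) - 1\right)\right)^2\right), \tag{19}$$

where

$$\delta_j = \begin{cases} 1 & \text{if } j = 1 \\ r^2 & \text{if } j = 2, \end{cases}$$

and

$$\kappa_3(R) = \sum_j \sum_i \delta_j \left[-2\mu_{ij}^2\left(\sqrt{\frac{2}{\pi}}\,\sigma_{ij}\exp\left(\frac{-\mu_{ij}^2}{2\sigma_{ij}^2}\right) + \mu_{ij}\left(2\Phi\left(\frac{\mu_{ij}}{\sigma_{ij}}\right) - 1\right)\right) + 2\left(\sqrt{\frac{2}{\pi}}\,\sigma_{ij}\exp\left(\frac{-\mu_{ij}^2}{2\sigma_{ij}^2}\right) + \mu_{ij}\left(2\Phi\left(\frac{\mu_{ij}}{\sigma_{ij}}\right) - 1\right)\right)^3 - \sqrt{\frac{2}{\pi}}\,\sigma_{ij}^3\exp\left(\frac{-\mu_{ij}^2}{2\sigma_{ij}^2}\right)\right], \tag{20}$$

where

$$\delta_j = \begin{cases} 1 & \text{if } j = 1 \\ -r^3 & \text{if } j = 2. \end{cases}$$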

Equating $\kappa_3(R)$ with $\kappa_3(U)$, we find that $\nu = 8\sigma_R^6/(\kappa_3(R))^2$. From (15), $R \approx U \sim b + c\chi^2_\nu$, where $\approx$ means "is approximately distributed as," $b = \kappa_1(R) - 2\kappa_2(R)^2/\kappa_3(R)$, and $c = \kappa_3(R)/(4\kappa_2(R))$. Since $\nu$ is usually fractional, the incomplete gamma function is used to calculate $\chi^2_\nu$. When the difference in Eq. (14) is nonpositive, the same approximation is used as long as $\kappa_3(R) > 0$. Otherwise, approximate the distribution of $(r\,d_{ik} - d_{ij})$.
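The three-moment approximation described above can be sketched end to end as follows. This is a minimal illustration that assumes only Python's standard `math` module: the regularized incomplete gamma function is computed with a simple power series (adequate for moderate arguments, not a production routine), the case $\kappa_3 = 0$ is not handled, and all function names and the `(mu, sigma)` input layout are hypothetical rather than taken from the paper or from PROSCAL.

```python
import math


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def folded_cumulants(mu, sigma):
    """First three cumulants of |D| for D ~ N(mu, sigma^2)."""
    e = math.sqrt(2 / math.pi) * sigma * math.exp(-mu ** 2 / (2 * sigma ** 2))
    m = e + mu * (2 * std_normal_cdf(mu / sigma) - 1)
    k2 = mu ** 2 + sigma ** 2 - m ** 2
    k3 = -2 * mu ** 2 * m + 2 * m ** 3 - sigma ** 2 * e
    return m, k2, k3


def difference_cumulants(num_dims, den_dims, r):
    """Cumulants of d_ij - r * d_ik; each list holds (mu, sigma) per dimension."""
    k = [0.0, 0.0, 0.0]
    for weight, dims in ((1.0, num_dims), (-r, den_dims)):
        for mu, sigma in dims:
            for s, value in enumerate(folded_cumulants(mu, sigma), start=1):
                k[s - 1] += weight ** s * value  # s-th cumulant scales with weight^s
    return tuple(k)


def reg_lower_gamma(a, x, max_terms=100000):
    """Regularized lower incomplete gamma P(a, x) by its power series."""
    if x <= 0:
        return 0.0
    term, total = 1.0, 1.0
    for n in range(1, max_terms):
        term *= x / (a + n)
        total += term
        if term < 1e-16 * total:
            break
    return total * math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))


def pearson_prob_ratio_below(num_dims, den_dims, r):
    """Approximate P(d_ij / d_ik < r) via the b + c * chi-square_nu form."""
    k1, k2, k3 = difference_cumulants(num_dims, den_dims, r)
    flipped = k3 < 0
    if flipped:  # approximate r * d_ik - d_ij instead, as described in the text
        k1, k3 = -k1, -k3
    nu = 8 * k2 ** 3 / k3 ** 2
    b = k1 - 2 * k2 ** 2 / k3
    c = k3 / (4 * k2)
    cdf_at_zero = reg_lower_gamma(nu / 2, (0.0 - b) / (2 * c))
    return 1.0 - cdf_at_zero if flipped else cdf_at_zero
```

For a single dimension with zero means and unit variances the ratio is half-Cauchy, so the exact probability that the ratio falls below r is $(2/\pi)\arctan r$, which gives a simple reference value for the approximation.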

FIG. 3. Exact and approximate PDFs for city-block distance ratios and PDFs for Euclidean distance ratios among stimuli in a two-dimensional space. In panel A, the dimensional variances in the numerator and denominator are (0.02, 0.01). For panels B and C, the corresponding values are (0.2, 0.1) and (2.0, 1.0), respectively. All covariances are zero valued. In all three panels, the differences in the means of the real and ideal distributions are (2, 2) for the numerator and (1, 0) for the denominator.

The quality of the approximation is again very good. Figure 3 illustrates the quality of the approximation and the role of the variances in distinguishing between city-block and Euclidean metrics. In all three panels, the PDFs are for a ratio of distances in two dimensions where the differences in the means of the ideal and the stimulus in the numerator are (2, 2) and the differences in the means of the ideal and the stimulus in the denominator are (1, 0). As before, anisotropic values were chosen for the variance-covariance matrices of the differences. Equivalent variance-covariance matrices were chosen for the numerator and denominator in all cases. For Panels A, B, and C the values of the variances are (0.02, 0.01), (0.2, 0.1), and (2.0, 1.0). Differences between plotted values of the actual and approximate city-block distance ratio PDFs are less discernible than those for city-block distance PDFs. Numerically, the greatest difference, which occurred in Panel B, had a relative magnitude of 0.0148. Differences between the city-block and Euclidean PDFs are most noticeable when the variances are small. Distinguishing between city-block and Euclidean metrics in empirical studies using distance ratios may be difficult when the magnitudes of the variances are high.

METRIC IDENTIFICATION

The increasing level of similarity between Euclidean and city-block PDFs at higher levels of stimulus variance raises the question of the ability to correctly identify Euclidean and city-block metrics. There is a large literature on metric identification. Arabie (1991) has provided a survey of the current state of MDS using the city-block metric. Much of the recent mathematical work has centered around developing optimization methods that overcome local minima problems with nonmetric models (Groenen & Heiser, 1996; Hubert, Arabie, & Hesson-Mcinnis, 1992).

To obtain an idea of how difficult it might be to distinguish city-block and Euclidean data, dissimilarity judgments were generated using both city-block and Euclidean metrics in a two-dimensional space. Centroids for 12 stimuli were randomly drawn from a uniform distribution in the range [−0.5, 0.5]. An anisotropic Case III (Thurstone, 1927) variance structure was assumed. To estimate the effect of the magnitudes of the variances, 11 sets of data were generated. Twenty-four variances (12 stimuli, two dimensions) were randomly drawn from a uniform distribution in the range [0.0, b] for each set. Values of the upper bound b were equally spaced in the range [0.0, 2.0] for the 11 data sets. One thousand replicates of city-block distances and one thousand replicates of Euclidean distances were simulated for each of the 11 data sets. Each replicate had its own unique values for centroids and variances. One lower-half matrix of 66 similarity judgments was generated for each replicate by independently sampling the coordinates for each judgment and calculating a city-block or Euclidean distance. Using the known parameter values as our estimates, likelihoods for each of the 22,000 sets of data were calculated for both city-block and Euclidean metrics. For the city-block data, the percentage of times the city-block metric had the higher likelihood was calculated for each of the 11 sets of data. A corresponding calculation was made for the Euclidean data.

Results are portrayed in Fig. 4. As expected, the percentage of correct identifications falls off as the level of variance increases. However, the percentage of correct identifications remains quite high, even when the upper bound on the variance is twice the range of the maximum difference in the centroids on either axis. The probability of correctly choosing a Euclidean or city-block metric is quite similar for each level of variation. The ability to achieve these results in an empirical analysis would, of course, depend upon the quality of the optimization process and the presence of other forms of variability, topics beyond the scope of this paper.

FIG. 4. Probability of correctly choosing a city-block or Euclidean metric as a function of the upper bound of a uniform distribution from which stimulus variance parameters were drawn when using probabilistic scaling with a maximum likelihood criterion.

Having seen that the probabilistic models are potentially able to detect the correct metric, even when stimulus variation is high and the city-block and Euclidean PDFs are relatively similar, we then decided to see if this were also true of nonmetric methods. It should not be expected that nonmetric methods would do well in the presence of high stimulus variation, but they may be quite robust in the presence of lower levels of stimulus variation. Accordingly, the above process was repeated with a maximum upper bound b of 0.1. The nonmetric analysis was conducted with KYST (Kruskal, Young, & Seery, 1973). STRESS, a badness of fit measure, was calculated using the known parametric coordinates as coordinate estimates. The percentage of replicates with the lowest STRESS was calculated for each of the 22 sets (11 variance conditions, two metrics) of data. For comparison purposes, the same data were evaluated using the likelihoods from the probabilistic model.

The results are quite striking. Figures 5 and 6 indicate that the nonmetric model does perfectly, in either metric, when there is no stimulus variation. However, only a very small level of stimulus variation, b = 0.01, is enough to seriously diminish the ability of the nonmetric model to correctly predict the city-block metric. When b = 0.10, the probability of correctly detecting the city-block metric is 0.526 while the probability using the maximum likelihood criterion is 0.998. The nonmetric model did a little better when the data were Euclidean. The STRESS formula used was what is known as formula 2. Results using formula 1 (not shown) were quite a bit worse.

FIG. 5. Probability of correctly choosing a city-block metric as a function of the upper bound of a uniform distribution from which stimulus variance parameters were drawn when using probabilistic and nonmetric scaling methods.

FIG. 6. Probability of correctly choosing a Euclidean metric as a function of the upper bound of a uniform distribution from which stimulus variance parameters were drawn when using probabilistic and nonmetric scaling methods.

It may, of course, be argued that the nonmetric analysis presented here is biased since the parametric centroids we have used for calculating STRESS are for a model that is contrary to the one assumed by the nonmetric model. To see if results would improve if the nonmetric program were allowed to estimate its own solution, the preceding analyses were repeated. The parametric coordinates provided the initial values and KYST then estimated its final solution. STRESS values for the final solution were used in the analyses. The probabilities of a correct decision (not reported) were marginally worse. While these nonmetric analyses are very exploratory, the results are compelling enough to recommend that the role of stimulus variation be included in the set of open issues being investigated as reasons for incorrect metric identification.
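The data-generation step of the simulation design described in this section can be sketched as follows. This is a minimal illustration of one replicate only, assuming Python's standard `random` and `math` modules; it does not reproduce the likelihood or STRESS calculations, and the function name `simulate_judgments` and its arguments are hypothetical.

```python
import math
import random


def simulate_judgments(n_stimuli=12, n_dims=2, b=0.5, metric="city-block", rng=None):
    """Generate one replicate: centroids, variances, and a lower-half matrix.

    Centroids are uniform on [-0.5, 0.5] and variances uniform on [0, b].
    Coordinates are sampled independently for every judgment.
    """
    rng = rng or random.Random()
    centroids = [[rng.uniform(-0.5, 0.5) for _ in range(n_dims)]
                 for _ in range(n_stimuli)]
    variances = [[rng.uniform(0.0, b) for _ in range(n_dims)]
                 for _ in range(n_stimuli)]
    judgments = {}
    for i in range(1, n_stimuli):
        for j in range(i):
            x_i = [rng.gauss(m, v ** 0.5) for m, v in zip(centroids[i], variances[i])]
            x_j = [rng.gauss(m, v ** 0.5) for m, v in zip(centroids[j], variances[j])]
            diffs = [abs(p - q) for p, q in zip(x_i, x_j)]
            if metric == "city-block":
                judgments[(i, j)] = sum(diffs)
            else:  # any other value is treated as the Euclidean metric
                judgments[(i, j)] = math.sqrt(sum(t * t for t in diffs))
    return centroids, variances, judgments


# Illustrative use: one city-block replicate with upper variance bound b = 0.2.
centroids, variances, judgments = simulate_judgments(b=0.2, rng=random.Random(42))
```

With 12 stimuli the lower-half matrix contains 12 × 11 / 2 = 66 judgments, matching the count given in the text.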

IMPLEMENTATION

Previous papers (MacKay, 1989; MacKay & Zinnes, 1995; MacKay & Zinnes, 1999) have described the development of PROSCAL, a probabilistic MDS model for similarity, preference ratio, and liking rating data using a Euclidean metric. Incorporation of the proposed city-block metric is straightforward: one simply substitutes the log of the appropriate city-block PDF (actual or approximate) for the log of the corresponding Euclidean PDF in the likelihood function. PROSCAL output includes maximum likelihood estimates of mean and variance parameters for the objects and estimates of user-specified measurement model parameters for transforming judgments to city-block or Euclidean distances. Initial estimates used for the Euclidean metric are also used for the city-block metric. Estimation proceeds in an alternating three-stage sequence. Variances are first estimated holding mean and measurement model estimates fixed, then means are estimated holding variance and measurement model estimates fixed, and finally measurement model parameters are estimated holding mean and variance estimates fixed. This process is repeated until convergence occurs. As with other MDS models, a wide variety of applications are possible. Consistent with many previous implementations of Thurstonian models, it has been assumed that individual responses and respondents are interchangeable. One reason for doing this is to minimize the difficulties caused by incidental parameters (Baker, 1992). For applications that require the estimation of incidental parameters, marginal maximum likelihood estimation procedures should be developed.
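To make the substitution concrete, the sketch below codes the two-dimensional city-block PDF of Eq. (8) and sums its logarithm over a set of observed distances. It assumes only Python's standard `math` module and is not PROSCAL code: the function names are hypothetical, the parameters `mu_i`, `sigma_i`, `mu_j`, `sigma_j` denote the mean and standard deviation of the coordinate difference on each of the two dimensions, and the convolution helper is included only as an independent numerical check.

```python
import math


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def city_block_pdf_2d(z, mu_i, sigma_i, mu_j, sigma_j):
    """Eq. (8): PDF of the sum of two independent folded normal variables."""
    if z < 0:
        return 0.0
    a = math.sqrt(sigma_i ** 2 + sigma_j ** 2)
    b = sigma_i * sigma_j
    c = {1: sigma_i ** 2, 2: sigma_j ** 2}
    total = 0.0
    for k1 in (1, 2):
        for k2 in (1, 2):
            for k3 in (1, 2):
                arg = ((-1) ** k3 * mu_j * sigma_i ** 2
                       + (-1) ** k2 * mu_i * sigma_j ** 2
                       + c[k1] * z) / (a * b)
                shift = (-1) ** (k1 + k2) * mu_i + (-1) ** (k1 + k3 + 1) * mu_j + z
                # Dividing by exp(u) in Eq. (8) is coded as multiplying by exp(-u).
                total += (std_normal_cdf(arg) - 0.5) * math.exp(-shift ** 2 / (2 * a ** 2))
    return total / (math.sqrt(2 * math.pi) * a)


def city_block_pdf_2d_by_convolution(z, mu_i, sigma_i, mu_j, sigma_j, n=4000):
    """Independent check: midpoint-rule convolution of two folded normal PDFs."""
    def fold(x, mu, s):
        return ((math.exp(-(x - mu) ** 2 / (2 * s * s))
                 + math.exp(-(x + mu) ** 2 / (2 * s * s)))
                / (math.sqrt(2 * math.pi) * s))
    h = z / n
    return h * sum(fold((k + 0.5) * h, mu_i, sigma_i) * fold(z - (k + 0.5) * h, mu_j, sigma_j)
                   for k in range(n))


def log_likelihood(distances, mu_i, sigma_i, mu_j, sigma_j):
    """Sum of log city-block PDFs over independent distance judgments."""
    return sum(math.log(city_block_pdf_2d(d, mu_i, sigma_i, mu_j, sigma_j))
               for d in distances)
```

In an estimation routine this log-likelihood would be maximized over the mean and variance parameters; the alternating three-stage scheme described above is one way to organize that search.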

CONCLUSION

Computationally efficient PDFs, actual and approximate, of city-block distances and distance ratios in a multidimensional stimulus space have been developed and discussed. Differences between city-block PDFs and corresponding Euclidean PDFs diminish as the stimulus variance magnitudes increase. Maximum likelihood methods hold promise for distinguishing city-block and Euclidean processes under varying levels of stimulus variability.

APPENDIX

To obtain the log of the PDF of a two-dimensional city-block distance, note that Eq. (6), when expanded, consists of a sum of four similar terms which differ only in their subscripts. The first term is

$$\frac{z_i}{2\pi\sigma_i\sigma_j}\exp\left(-\frac{(z_i z_j - \mu_i)^2}{2\sigma_i^2} - \frac{(z_i - z_i z_j - \mu_j)^2}{2\sigma_j^2}\right). \tag{A1}$$

To integrate each term, we employ an indefinite integral given in Abramowitz and Stegun (1967, p. 303), which, when modified for the notation of our problem, is

$$\int \exp\left[-(a + 2b z_j + c z_j^2)\right] dz_j = \frac{1}{2}\sqrt{\frac{\pi}{c}}\,\exp\left(\frac{b^2 - ac}{c}\right)\mathrm{erf}\left(\sqrt{c}\,z_j + \frac{b}{\sqrt{c}}\right) + \text{const.}, \tag{A2}$$

where the error function erf is defined as

$$\mathrm{erf}(z) = \frac{2}{\sqrt{\pi}}\int_{0}^{z}\exp(-t^2)\,dt. \tag{A3}$$

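Expanding Eq. (A1), the coefficients of Eq. (A2) are as follows:

$$a = (1/2)\left(\frac{\mu_i^2}{\sigma_i^2} + \frac{(z_i - \mu_j)^2}{\sigma_j^2}\right),$$
$$b = (1/2)\left(-\frac{\mu_i z_i}{\sigma_i^2} + \frac{\mu_j z_i - z_i^2}{\sigma_j^2}\right), \tag{A4}$$
$$c = \frac{z_i^2}{2\sigma_i^2} + \frac{z_i^2}{2\sigma_j^2}.$$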

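The definite integral is thus

$$\frac{\exp\left(-\dfrac{(\mu_i + \mu_j - z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)}{2\sqrt{2\pi}\sqrt{\sigma_i^2 + \sigma_j^2}}\left(\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 + (z_i - \mu_i)\sigma_j^2}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) - \mathrm{erf}\left(\frac{\mu_j\sigma_i^2 - z_i\sigma_i^2 - \mu_i\sigma_j^2}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right)\right). \tag{A5}$$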
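As a check on this step, the closed form of Eq. (A5) can be compared with a direct numerical integration of the term in Eq. (A1) over $z_j$ from 0 to 1. The sketch below assumes only Python's standard `math` module; the function names and the midpoint rule are illustrative choices.

```python
import math


def term_a1(z_i, z_j, mu_i, s_i, mu_j, s_j):
    """Eq. (A1): first of the four terms in the expansion of Eq. (6)."""
    return (z_i / (2 * math.pi * s_i * s_j)
            * math.exp(-(z_i * z_j - mu_i) ** 2 / (2 * s_i ** 2)
                       - (z_i - z_i * z_j - mu_j) ** 2 / (2 * s_j ** 2)))


def integral_a1_numeric(z_i, mu_i, s_i, mu_j, s_j, n=4000):
    """Midpoint-rule integral of Eq. (A1) over z_j in (0, 1)."""
    h = 1.0 / n
    return h * sum(term_a1(z_i, (k + 0.5) * h, mu_i, s_i, mu_j, s_j) for k in range(n))


def integral_a5(z_i, mu_i, s_i, mu_j, s_j):
    """Eq. (A5): closed form of the same definite integral."""
    var_sum = s_i ** 2 + s_j ** 2
    scale = math.sqrt(2) * s_i * s_j * math.sqrt(var_sum)
    upper = (mu_j * s_i ** 2 + (z_i - mu_i) * s_j ** 2) / scale
    lower = (mu_j * s_i ** 2 - z_i * s_i ** 2 - mu_i * s_j ** 2) / scale
    prefactor = (math.exp(-(mu_i + mu_j - z_i) ** 2 / (2 * var_sum))
                 / (2 * math.sqrt(2 * math.pi) * math.sqrt(var_sum)))
    return prefactor * (math.erf(upper) - math.erf(lower))
```

Agreement between the two functions at several values of $z_i$ gives some confidence that the signs and subscripts in the error-function arguments are consistent with Eq. (A1).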

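Repeating this process for the other three terms in the expansion of Eq. (6) and combining terms, we get the PDF of a two-dimensional city-block distance expressed in terms of error functions, specifically

$$\begin{aligned}
\frac{1}{2\sqrt{2\pi}\sqrt{\sigma_i^2 + \sigma_j^2}}\Bigg[
&-\exp\left(-\frac{(-\mu_i + \mu_j + z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 + \sigma_j^2(\mu_i - z_i)}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&-\exp\left(-\frac{(\mu_i + \mu_j - z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 - \mu_i\sigma_j^2 - \sigma_i^2 z_i}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&-\exp\left(-\frac{(\mu_i + \mu_j + z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{-\mu_j\sigma_i^2 + \mu_i\sigma_j^2 - \sigma_i^2 z_i}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&-\exp\left(-\frac{(\mu_i - \mu_j + z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 + \mu_i\sigma_j^2 - \sigma_i^2 z_i}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&+\exp\left(-\frac{(-\mu_i + \mu_j + z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 + \mu_i\sigma_j^2 + \sigma_i^2 z_i}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&+\exp\left(-\frac{(\mu_i + \mu_j - z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 + \sigma_j^2(-\mu_i + z_i)}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&-\exp\left(-\frac{(\mu_i + \mu_j + z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 - \sigma_j^2(\mu_i + z_i)}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right) \\
&+\exp\left(-\frac{(\mu_i - \mu_j + z_i)^2}{2(\sigma_i^2 + \sigma_j^2)}\right)\mathrm{erf}\left(\frac{\mu_j\sigma_i^2 + \sigma_j^2(\mu_i + z_i)}{\sqrt{2}\,\sigma_i\sigma_j\sqrt{\sigma_i^2 + \sigma_j^2}}\right)\Bigg].
\end{aligned} \tag{A6}$$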

To get from here to Eq. (8), we just make use of the relationship of the standard normal CDF $\Phi(\cdot)$ to the error function $\mathrm{erf}(\cdot)$,

$$\Phi(x) = (1/2)\left(1 + \mathrm{erf}(x/\sqrt{2})\right), \tag{A7}$$

to represent the results in a more familiar form, and combine terms.

REFERENCES

Abramowitz, M., & Stegun, I. A. (1967). Handbook of mathematical functions with formulas, graphs and mathematical tables. Washington, DC: U.S. Govt. Printing Office.
Arabie, P. (1991). Was Euclid an unnecessarily sophisticated psychologist? Psychometrika, 56, 567–587.
Ashby, F. G., & Gott, R. (1988). Decision rules in the perception and categorization of multidimensional stimuli. Journal of Experimental Psychology: Learning, Memory and Cognition, 14, 33–53.
Ashby, F. G., & Maddox, W. T. (1993). Relations between prototype, exemplar, and decision bound models of categorization. Journal of Mathematical Psychology, 37, 372–400.
Ashby, F. G., & Perrin, N. A. (1988). Toward a unified theory of similarity and recognition. Psychological Review, 95, 124–150.
Attneave, F. (1950). Dimensions of similarity. American Journal of Psychology, 3, 515–556.
Baker, F. B. (1992). Item response theory: Parameter estimation techniques. New York: Dekker.
Bijmolt, T. H. A. (1996). Multidimensional scaling in marketing: Towards integrating data collection and analysis. Capelle aan den IJssel, Netherlands: Labryrint.
Bossuyt, P. (1990). A comparison of probabilistic unfolding theories for paired comparisons data. Berlin: Springer-Verlag.
De Soete, G., Carroll, J. D., & DeSarbo, W. S. (1986). The wandering ideal point model: A probabilistic multidimensional unfolding model for paired comparisons data. Journal of Mathematical Psychology, 30, 28–41.
Ennis, D. M., & Ashby, F. G. (1993). The relative sensitivities of same-different and identification judgment models to perceptual dependence. Psychometrika, 58, 257–279.
Ennis, D. M., & Johnson, N. L. (1993). Thurstone-Shepard similarity models as special cases of moment generating functions. Journal of Mathematical Psychology, 37, 104–110.
Ennis, D. M., Palen, J. J., & Mullen, K. (1988). A multidimensional stochastic theory of similarity. Journal of Mathematical Psychology, 32, 449–465.
Groenen, P. J. F., & Heiser, W. J. (1996). The tunneling method for global optimization in multidimensional scaling. Psychometrika, 61, 529–550.
Haldane, J. B. S. (1937). The approximate normalization of a class of frequency distributions. Biometrika, 29, 392–404.
Hefner, R. A. (1958). Extensions of the law of comparative judgment to discriminable and multidimensional stimuli. Unpublished doctoral dissertation, University of Michigan.
Householder, A. S., & Landahl, H. D. (1945). Mathematical biophysics of the central nervous system. Bloomington, IN: Principia Press.
Hubert, L., Arabie, P., & Hesson-Mcinnis, M. (1992). Multidimensional scaling in the city-block metric: A combinatorial approach. Journal of Classification, 9, 211–236.
Imhof, J. P. (1961). Computing the distribution of quadratic forms in normal variables. Biometrika, 48, 419–426.
Jensen, D. R., & Solomon, H. (1972). A Gaussian approximation to the distribution of a definite quadratic form. Journal of the American Statistical Association, 67, 898–902.
Johnson, N. L., Kotz, S., & Balakrishnan, N. (1994). Continuous univariate distributions (Vol. I). New York: Wiley.
Johnson, N. L., Kotz, S., & Balakrishnan, N. (1995). Continuous univariate distributions (Vol. II). New York: Wiley.
Kruskal, J. B., Young, F. W., & Seery, J. B. (1973). How to use KYST, a very flexible program to do multidimensional scaling and unfolding. Murray Hill, NJ: Bell Telephone Labs.
Leone, F. C., Nelson, L. S., & Nottingham, R. B. (1961). The folded normal distribution. Technometrics, 3, 543–550.
MacKay, D. B. (1989). Probabilistic multidimensional scaling: An anisotropic model for distance judgments. Journal of Mathematical Psychology, 33, 187–205.
MacKay, D. B., Bowen, W. M., & Zinnes, J. L. (1996). A Thurstonian view of the analytic hierarchy process. European Journal of Operational Research, 89, 427–444, and 93, 639–640.
MacKay, D. B., & Zinnes, J. L. (1995). Probabilistic multidimensional unfolding: An anisotropic model for preference ratio judgments. Journal of Mathematical Psychology, 39, 99–111.
MacKay, D. B., & Zinnes, J. L. (1999). PROSCAL: A program for probabilistic scaling. Bloomington, IN: Indiana University.
Mathai, A. M., & Provost, S. B. (1992). Quadratic forms in random variables. New York: Dekker.
Pearson, E. S. (1959). Note on an approximation to the distribution of non-central $\chi^2$. Biometrika, 46, 364.
Shepard, R. N. (1987). Toward a universal law of generalization for psychological science. Science, 237, 1317–1323.
Springer, M. D. (1979). The algebra of random variables. New York: Wiley.
Thurstone, L. L. (1927). A law of comparative judgment. Psychological Review, 34, 273–286.
Torgerson, W. S. (1958). Theory and methods of scaling. New York: Wiley.
Wilson, E. B., & Hilferty, M. M. (1931). The distribution of chi-square. Proceedings of the National Academy of Sciences, 17, 684–688.
Zinnes, J. L., & Griggs, R. A. (1974). Probabilistic, multidimensional unfolding analysis. Psychometrika, 39, 327–350.

Received: August 4, 1998; published online February 16, 2001