JOURNAL
OF MATHEMATICAL
PSYCHOLOGY
30, 306316 (1986)
Bradley-Terry-Lute with an Ordered GERHARD University
Models Response TUTZ
of Regensburg
An extension of the Bradley-Terry Lute model is presented which allows for an ordered response characterizing the strength of preference. The generalization includes models with ties as special cases. The model is derived from assumptions on underlying random utility functions. Necessary and sufficient conditions for the existence of scale values are given in a representation theorem and the uniqueness of the scale is considered. Estimation of parameters, goodness of tit tests, and tests of linear hypotheses are treated in the framework of a weighted least-squares method. 0 1986 Academic Press. Inc.
1. INTRODUCTION A frequently used model for pair comparisons is the Bradley-Terry-Lute or BTL model. Proposed by Bradley and Terry (1952) the model is strongly related to the choice axiom of Lute (1959) and may be described as the choice axiom for pair comparison experiments. In the usual pair comparison experiment, the two alternatives are ordered with respect to preference, strength, or superiority by deciding which one is more preferable, stronger, or superior. Indifference between the alternatives is not allowed. Extensions of the BTL model which make provision for ties have been proposed by Glenn and David (1960), Rao and Kupper (1967), and Davidson (1970). ’ In this paper an extension is considered which allows an ordered response, characterizing the strength of preference for one of the alternatives. A response ordering may come up in a natural way. If a pair of stimuli is presented and the respondent can make a decision in favor of one of the alternatives, he will also know whether his preference is a strong or a weak one. In a discrimination experiment he will be able to decide whether the stimuli are differing only slightly or strongly. If BTL models are applied to sport competitions, more information about the ability of a team will be gathered if the response variable consists of categories for superiority. Of course one of the response categories may be characterized by no preference and the Rao-Kupper model with ties will turn out to be a special case. Address reprint requests to Gerhard Tutz, Lehrstuhl fiir Statistik, University of Regensburg, Postfach, 8400, Regensburg, West Germany. 306 0022-2496186 $3.00 Copyright 0 1986 by Academic Press, Inc. All rights of reproduction in any form reserved.
MODELS
WITH
AN ORDERED
2. THE
307
RESPONSE
MODEL
Suppose A = {a, ,..., a,} is a nonempty set of objects or stimuli. Let pii = ~(a,, aj) denote the probability of ai being prefered over ai. Then the BTL or strict utility model assumes a function U: A -+ Iw with &4a,) plj= eu(n,) + eu(a,)’
The BTL model may be derived from a random utility model (Block & Marschak, 1960). Assume discriminal processes ui+ Ui, i= l,..., k, where for any object ai, Ui is a random variable and ui a constant. If pii is given by pjj = p( Ui + Ui > Uj + Uj) and Ui-
Uj has a logistic distribution D(x)=(l
function +e-““)-I,
a > 0,
then the resulting model is equivalent to the strict utility model (Yellott, 1977). In this model the respondent has to choose between two alternatives. If the pair (a,, aj) is presented he can prefer ai or aj. Models with ties which allow the subject to prefer none of the presented stimuli have been proposed by Rao and Kupper (1967) and Davidson (1970). In a more general case the subject can choose between k ordered response categories. Given the pair (ai, aj) he or she may prefer, for example, ai, strongly or weakly, may be indifferent or may prefer aj weakly or strongly. Thus live response categories follow. Let R = {l,..., k} d enote the response categories and pii, be the probability of response r if (a,, a,) is presented. The pii, define a function p: A x A -+ (0, l)k
with p(ai, aj) = (pii, ,..., pijk). A system (A, p, k) will be called an ordinal response pair comparison system with k response categories. Again let the utilities ui + Ui connected to the simuli ai, i= l,..., n, be such that the differences U,- Ui have a logistic distribution function. Assume threshold parameters 8, ,..., ek- i such that alternative r is chosen if O,- i < (uj + U,) (ui+ Vi) < 0,, i = l,..., k with 8, = -co. Hence, the more ui+ Ui exceeds uj+ U, the more an alternative with a small number will be preferred. Let Xij~ R denote the response variable. If (a,, aj) is presented we have (1) or p(Xq
The model
Uid8,+Ui-Uj)=D(8,+Uj-Uj).
is based on the traditional
category boundaries
(2) approach
used
by
GERHARDTUTZ
308
Edwards and Thurstone (1952), Samejima (1969), and more recently in the context of regression models by McCullagh (1980) and Sterling (1984). Based on this concept a BTL model for ordered alternatives is defined as follows. DEFINITION. An ordinal response pair comparison system with k response categories (A, p, k) is a BTL(k) model if and only if there exist parameters 0, ,..., Ok~ 1 and a real function U: A + R, such that for all ai, uje A, and TE {l,..., k} all probabilities piir are given by
(1) p(X,
where D(x)=
l/(1 +epX), tIk= co, and
(2) p(X,-=r)=p(X,,=k+l-r).
If the pair (ai, uj) is considered, the preference of the first stimulus ai corresponds to the preference of alternatives with small numbers. Condition (2) is a natural symmetry condition, stating that the probability of prefering ai with a certain strength r should be equivalent to prefering uj with the “opposite” strength k + 1 - r. The following remark shows that the parameters oi,..., ok-, are restricted stronger than in regression models (Anderson & Phillips, 1981). In the following the abbreviation ui = ~(a,) will be used. LEMMA.
Let (A, p, k) be a BTL(k)
(1)
8,<
(2)
Zf k is even, then
model. Then
... de,.
ek,2= 0
and
for
ek,2-r = -eklZ+r
r = l,..., k/2.
Zf k is odd, then e,=
Proof:
r = l,..., (k - 1)/2.
for
-ek-r
1. For r= l,..., k - 1 we have p(X, < r + 1) > p(X, < r), and therefore we,+ I +
ld(Ui)
-
U(Uj))
>
D(O, +
u(Ui)
-
U(Uj)).
Because D is a strictly increasing function 13,< 0,+ r follows. 2. From p(X, = r) = p(Xj, = k + 1 - r), r = l,..,, k - 1, it follows that p&r)=
c p(X,=k+ 1 -I) I
l
Thus we have ~,+ui-uj=ln(p,i(r)/(l-p,i(r))=ln((l = -ln(pJk From 0, = -eker
condition
- r)/(l - p,(k-
-pji(k-r))/p,,(k-r)) r))) = -O,_, - (uj- ui).
(2) follows immediately.
1
MODELS
WITH
AN ORDERED
RESPONSE
309
Thus, for k= 2 we have 8, =0 and from the definition of a BTL(k) model the BTL model results immediately as the special case BTL(2). While the scale values ~(a,) represent utility values connected to the alternatives in A, the parameters fJ1,..., 8, characterize the discrimination power of the given response categories. We have p(X, = r) = D(0,) - D(8,- r) with 8, = -co, 0, = GO.If k = 2 we have 8, = 0, characterizing only the symmetry of preferences. If k > 2, e.g., k = 3 we have 0, = -8,. Then the probability of ties is p(X, = 2) = D( -0,) - D(0,). This reflects the “preference” of ties of the subject making decisions or, in the case of two competing teams, the probability that there will be neither a winner nor a loser. With more than two ordered response categories the BTL(k) model allows gathering more information from a subject’s response than the usual BTL model.
3.
REPRESENTATION
AND
UNIQUENESS
The ordered BTL model was motivated by a random utility model and so the definition of a BTL(k) model already postulates the existence of scale values ~(a,). Another approach is to characterize the model by conditions on the probabilities and then show the existence of scale values. The following representation theorem gives necessary and sufficient conditions which in part are similar to the multiplication condition of the common BTL model (Suppes & Zinnes 1963). In the following let for all ai, a,E A pii := p(X,< r) denote a cumulative probability and let wii(r) := pV(r)/( 1 - p&r)) denote the odds. REPRESENTATION THEOREM. A pair comparison system (A, p, k) with ordinal response is a BTL(k) model if and only if the following conditions are satisfied for all a,, uj, a,, u,EA, rE{l,..., k-l}:
(1)
(2) (3)
w&r) wii(r
WjAr) +
=
1 )lwii(r)
Wit(r) =
w,,(r) w&r)
=
w,,(r
+
1)/w,,(r)
p,(r)=l-pj,(k-r).
With minor modifications the representation theorem remains valid for the representation of a reduced BTL(k) system where pair comparisons (ai, ai) are not considered. For this, conditions (l)-(3) have to be formulated such that wii(r) or pii do no appear. The uniqueness of the numerical assignment U: A + [w is considered in the following theorem. UNIQUENESS
THEOREM.
scale in both the narrow Proof:
The function U: A -P [w in a BTL(k) model is a difference and wide senses. The parameters 8, ,..., 8,-, are unique.
1. Let the functions U, ii: A + 08, 0, & {l,..., k) + R satisfy p(X,<
r) =
310
GERHARDTUTZ
D(0, + ~(a,) - ~(a~)) = D(gr + C(a,) - ii(u From ln(wJr)/wkj(r)) = I?(q) - ii for all i, jE { l,..., n}, it follows for all ie { l,..., n}, u(u,) = ii
= ~(a,) - u(+)
- (ii(q) - u(u,))
and therefore ~(a,) = ii + c, where c = u(uI) - z?(ul). 2. From ln(wU(r + 1)/w&r)) = 8,+ 1 - 8, = gr+ r - 8, it follows by the same argument for r = l,..., k - 1, k > 2, that 8, = g, + d, where d is a constant. From O,=g,+d and f3kpl=8k--l+d we get f?,+Ok-,=8,+~k--1+2d and using e,= -ek--l, 8,= -iYk;c-l we have d = 0. If k = 2, 8 1= 8, = 0 is determined by the model. u For the BTL model various numerical assignments different from the u-scale have been considered (Suppes & Zinnes 1963). This may also be done for the BTL(k) model. A special representation results from the transformation vi = eUi, i= l,..., n, z, = e -*,, r = l,..., k, giving p(Xu < r) = Ui/(Ui + T,Uj).
While the parameters ?r are determined uniquely the scale IX A -+ 08, is a ratio scale. A short derivation shows that the special case k = 3, the BTL model with ties, is equivalent to the model suggested by Rao and Kupper (1967) with response probabilities given by p(X,=
l)=ui/(ui+rluj)
p(X, = 3) = Uj/(T1vi + u,).
Thus the representation model.
and uniqueness theorems also apply to the Rao-Kupper
4. ESTIMATION
The BTL(k)
model may be formulated
AND TESTING
as a linear model by
ln(p(Xti < r)/p(X, > r)) =
8, + Ui -
Uje
Thus the general approach of Grizzle, Starmer, and Koch (1969) to the analysis of linear models for categorical data is applicable. The approach allows transformations of the probabilities, possibly nonlinear, to get a linear structure. Estimators and goodness of fit tests as well as tests of linear hypotheses may be formulated in this framework with the same asymptotic properties as the likelihood ratio test
MODELS
WITH
AN ORDERED
311
RESPONSE
based on maximum likelihood estimation. The method of Grizzle et al. (1969) based on a weighted least squares (WLS) estimation procedure is preferable because no iterative procedure is necessary and available program packages such as NONMET II (Kritzer 1977) or SAS can be used. The method has been applied to pair comparison system and more general choice models by Beaver (1977) and Tutz (1985). The data of a pair comparison system may be arranged in the following way: Pairs
Response .. .
1 1
nil
.. .
g
n gl
. ..
k nlk
n1
%k
%
Here the pairs (i, j) are characterized by a single index 1E { l,..., g} and n,r is the frequency of response r if pair I is presented. n,, 1= l,..., g, denotes the number of replications of pair 1. In this design a subset of possible pairs (i, j), i, j E A may also be considered. Let p’ = (pi ,..., pi), where P:
=
(PSI
T..*Y Ps,k
- 1 )v
s = l,..., g
are the parameters of the sth multinomial distribution with components denoting the probability of response r with pair s given. Because the probability is given by psk = 1 - (p,, + . . . + ps,k _, ) it may be neglected. Let @’ = (@i ,..., &) with
ps, psk
where the components fisr = n,,/n, denote the sample estimates. A linear model is given by f(P)
= w
or
f(@)=X/3+&,
(3)
where f := (fi,...,f,,): R'kp')g + R", u < (k - 1) g, has partial derivatives up to second order. X is a fixed (U x m)-design matrix with rg(X) = m < U, p’ = (bl ,..., B,) is a parameter vector and E is a vector of error variables. In formulating the BTL(k) model as a linear model one should take into account that ui,,.., U, are not determined uniquely; X would not have full rank if Us,..., u,, 8 ,,..., ok- 1 are used. Thus the uniquely determined parameters uin := u;-- u,, i = l,..., n - 1, t+ ,..., &..Z _ , if k is even or 8, ,..., 8,,- 1J,2if k is odd are used.
312
GERHARD
TUTZ
Now a BTL(R) model (for k even) has the form
[~~(~~~~~~~~~)~=~
ii
:;)
y;
;I[:;].
If a reduced pair comparison experiment with a subset of pairs is considered, care must be taken that the parameters uin ,..., u,- I,n, 8, ,... are still determined uniquely. If X is a full rank matrix, uniqueness of the parameters is satisfied. If X is not a full rank matrix, the parameters have to be reduced to get uniqueness. Let C(b) be a block diagonal matrix with the usual sample estimates cov(fis) of the covariance matrices of @, in the blocks, where @,I(1 -lid) cov(@J
-Bsds*
...
-
PSI
B&k
~ 1
=; ‘.
s
. .
(
B&k
~ I( 1 -
a&k
- 1) 1.
Let D(@)= (aflap) be the (u x (k- 1) g) matrix of partial derivatives and V(a) = D(p) C(b) D(p) be the estimated covariance matrix of f(p). Then the weighted least squares estimator 1 for /3 is derived by minimizing
Q
W-‘(f(B)-xfl)
and has the form /J= (XT(~)-’ j? is a best, asymptotically
normally $G
for N=n,
x)-I
Av(fi)-’
distributed
(fi+?)+N(O,
f(B).
(4)
estimator with
(XT1X)-‘)
+ ... +n, + co, n,/N -+ 1, > 0. Matrix
S is given by
S= lim NV@). n-02 A test for goodness of fit of the model is given by
Q(8) =fW
W-’
f(B)- (@)’ VP,-‘(Wb
(5)
which has asymptotically a central X*-distribution with u-m degrees of freedom. Special hypotheses Ho. . Cb = 0 with a matrix C, rg C = c i m may be tested by
Q,th = (C~)‘CC(x’W-’ If H,, is true Q,-(B) has a central X*-distribution
C’l -‘Wp^,. with c degrees of freedom.
(6)
313
MODELSWITHANORDEREDRESPONSE
Common hypotheses about the parameters of a BTL model may be formulated within this framework. With C chosen adequately, e.g., the hypothesis of equal preferences HO: ~4,~= . . . = U, _ I,n may be tested by using Q&b).
5. COMPARISON WITH THE BTL MODEL In the BTL model with an ordered response, as well as in BTL models with ties, one has to distinguish between scale values that reflect attributes of the stimuli and scale values that reflect the preference of response categories. The threshold parameters which are of the latter type become important if the value of a future response is to be predicted. While the simple BTL model is only able to predict which team will win in a sport competition, the BTL(k) model, with more than two categories, allows a more specific prediction. In most situations, however, the main interest will be in the parameters u. In psychometrics, the 8values reflect the preference of subjects for the different answer categories, while the u-values refer to the scaling of stimuli. The u values reflect the subject’s reaction to properties of the stimuli. The BTL(k) model is advantageous not only for purposes of prediction but also for drawing inferences about the u values. The estimation of u-values in the BTL(k) model is more precise than in the simple BTL model, because the variance of the estimator is reduced. The improvement is demonstrated by an example which considers the asymptotic variance. The estimated covariance matrix off@) is a block diagonal matrix given by
where g = n(rt - 1)/2 is the number of pairs of distinct stimuli
and
is a symmetric matrix connected to the sth pair with elements
for i
this, the asymptotic covariance matrix (X’S’A’-’ for ,/%(/?-/I) with NV(jj) is easily calculated. Table 1 shows the resuls for the BTL model BTL(R) model with an ordered response having four categories. Three and thus three distinct pairs of stimuli, have been used with u13 = z423and values of 8 for the BTL(4) model.
314
GERHARD TABLE Asymptotic
BTL 1 2 3 4 5
model
1
Variance of fi( I?,, - u,J for a BTL(4) with Three Stimuli and u13 = u2,
0 = 0.25
9.78 16.76 35.95 87.22 228.60
TUTZ
8.71 14.49 29.84 70.53 180.30
O=l
BTL(4) 0=2
7.45 11.12 19.50 39.53 91.79
7.77 10.96 16.69 26.35 46.91
Model
model tJ=3 8.65 12.89 20.64 30.92 43.50
0=4
8=8
9.27 14.81 26.88 46.62 68.92
9.77 16.72 35.70 86.20 228.59
Generally, in the BTL(k) model more parameters have to be estimated than in the simple BTL model. Thus the improvement of the variance of ,/$ti, - uij) is expected to be less if a large number of parameters fI has to be estimated. In a design with only three stimuli and four response categories, two parameters u13, uZ3 and one parameter 8 have to be considered. Therefore the case of only three stimuli which is being considered here is a critical one because the number of u-values is rather low in comparison to the number of t3 values. A much stronger improvement can be expected if the number of stimuli is increased and the number of response categories k and thus the number of 8 values remains fixed. The information about the u values, of course, depends on the preference for answer categories. If only extreme answer categories are chosen, one actually has the case of the simple BTL model. Therefore in Table 1 the improvement is very small for extreme values of 8, such as 19=0.25 or 8= 8.0. In these cases the probability of all but one of the answer categories is almost zero and thus the ordering of the categories has no effect. In the range 0 = 1.0 to 8 = 3.0, however, non-zero probabilities result and an improvement follows, especially for higher u values. APPENDIX:
PRGDF
OF THE
REPRESENTATION
THEOREM
Proof: 1. Let (A, p, r) be a BTL(k) model. Then with ui= u(q) for ie A we have p&r) = e’l/(e” + e“-Or ) and In w&r) = 8, + ui - uj. From this, conditions (1) and (2) follow immediately. (p&r) = 1 - pji(k - r) is shown in the proof of the above lemma.) 2. Assume that conditions (l)-(3) hold. Define ek := CO 4 := ln(w12(l Ywdl)) 8, := In
wJ~)
-
(ul
-
~4~)
for
i = l,..., n
for
I = l,..., k - 1.
MODELS
Condition
WITH
AN ORDERED
315
RESPONSE
(1) may be written as
w,i(r)/wiAr) = w,ArYwjs(r) and we have for all i, j, s, t, 1~ { l,..., n], w,,(r)/wjs(r) = w,i(rYwih-) = WtdrYWjkr) and hence
ln(w,,(l)lWj,(l))=ln(w,,(l)lw,,(l))
=~n(Cw,,(l)I~,~(1)1C~12(1)I~,~(1)1) = u,- uj. For r = 1 and condition
(2) we have
Ur - uj = ln(w,,(rYwjAr))
=ln(Cwts(r)lw,Ar+ 1)l CWjAr+ l)lwi,(r)l Cw,,(r + lYWjAr+ 1)l) =ln(w,,(r
+ l)/wjS(r + 1)).
Knowing that In(wJ2)/wjS(2)) = u,- uj the derivation r = 2 and proceed iteratively as follows:
ln(W,,(r)lwj,(r)) = U, - uj By definition
may now be started with
for
rER.
we have ln(wls(r)lw12(r))
= In wdr) - (0, + u1 - ~2)
and since ln(w,,(r)/w,,(r)) = u2 - u, it follows that ln(w,,(r)) = 8, + u, -u,. Hence ln(w,,(r)/wJr)) = 8, + u, -u, -ln(w,(r)) and using ln(w,,(r)/w,(r)) = u, -u, we obtain In wis(r) = t?, + ui - u,. Therefore for i, s E (l,..., n ), it follows that p,(r) = D(O, + ui - u,). From (3) we get p(X,=
r) = pv(r) - p&r - 1)
=(l
and the conditions
-p,(k-r))-(1
1))
-p,,(k--r+
=pji(k-r+
l)-p,,(k-r)=p(Xji=k-r+
of a BTL(k)
model are satisfied.
1)
i
REFERENCES ANDERSON, J. A. & ordered categorical
480!3013-7
PHILLIPS,
variables.
R. R. (1981). Regression, discrimination Applied Statistics, 30, 22-3 1.
and measurement
models
for
GERHARD TUTZ
316
R. J. (1977). Weighted least-squares analysis of several univariate Bradley-Terry models. Jour-
BEAVER,
nal of the American
Statistical
Association,
12, 629-634.
H. D. & MARXHAK, J. (1960). Random orderings and stochastic theories of responses. In I. Olkin, S. Ghurye, W. Hoeffding, W. Madow, & H. Mann (Eds.), Contributions CO probability and statistics. Stanford: Stanford University Press. BRADLEY, R. A., & TERRY, M. E. (1952). Rank analysis of incomplete block designs, I. The method of pair comparisons. Biometrika, 39, 324-345. DAVIDSON, R. R. (1970). On extending the Bradley-Terry model to accomodate ties in paired comparison experiments. Journal of Mathematical Psychology, 65, 317-328. EDWARDS, A. L., & THLJRSTONE,L. L. (1952). An internal consistency check for scale values determined by the method of successive intervals. Psychometrika, 17, 169-180. GLENN, W., & DAVID, H. (1960). Ties in paired comparison experiments using a modified Thurstone-Mosteller model. Biometrics, 16, 86109. GRIZZLE, J. E., STARMER, C. F., & KOCH, G. G. (1969). Analysis of categorical data by linear models. BLOCK,
Biometrics, KRITZER,
H.
25, 4899504. (1911). NONMET
II, A program
for
analyzing
contingency
tables by weighted
least squares.
William Marsh Rice University. LUCE, R. D. (1959). Individual choice behaviour: A theoretical analysis. New York: Wiley. MCCULLAGH, P. (1980). Regression models for ordinal data. Journal of the Royal Statistical Society Series B, 42, 109142. PLACKETT, R. L. (1974). The analysis of categorical data. London: Griflin. RAO, P. V., & KUPPER, L. L. (1967). Ties in paired comparison experiments: A generalization of the Bradley-Terry model. Journal of the American Statistical Association, 62, 192-204. SAMEIIMA, F. (1969). Estimation of latent ability using a response pattern of graded scores. Psychometrika, Monograph Supplement 17. SUPPES, P., & ZINNES, J. L. (1963). Basic measurement theory. In R. D. Lute, R. R. Bush, & E. Galanter (Eds.), Handbook of mathematical psychology (Vol. 1). (pp. l-76). New York: Wiley. STERLING, W. D. (1984). Fitting linear models to ordinal responses. The British Journal of Mathematical
and Statistical
Psychology,
37, 263-270.
TUTZ, G. (1985). Diskrete probabilistische Reaktionsmodelle als kategoriale Regressionsansatze. Archiv ftir
Psychologie,
137, 99-l
14.
J. I. (1977). The relationship between Lute’s choice axiom, Thurstone’s theory of comparative judgement, and the double exponential distribution. Journal of Mathematical Psychology, 15, 109-144.
YELLOTT,
RECEIVED: May 23, 1985