Statistical Analysis of Data from Experiments in Human Signal Detection

JOURNAL OF MATHEMATICAL PSYCHOLOGY 6, 391-417 (1969)

I. G. ABRAHAMSON AND H. LEVITT¹

Bell Telephone Laboratories, Incorporated, Murray Hill, New Jersey 07974

An attack on some of the problems of analysis of data arising from experiments in human signal detection is presented. Under fairly general assumptions about the underlying noise and signal-plus-noise distributions (viz., that they come from the same location-scale family), techniques for estimating the receiver operating characteristic curve in both "Yes-No" and "Rating-scale" experiments are developed. The properties of these estimates are investigated, and measures of goodness-of-fit are suggested where appropriate.

1. INTRODUCTION

In situations involving detection, the human sensory system may be considered to act as if it employed a statistical decision maker which takes account of such nonsensory factors as relative frequency of occurrence and penalties for incorrect decisions. Although decision-theoretic models of human signal detection have been studied extensively for well over a decade, it is only recently that any attempts have been made at systematic investigation into statistical techniques for analyzing the data (Ogilvie and Creelman, 1966; Dorfman and Alf, 1968; Abrahamson, Levitt, and Landgraf, 1967). This paper presents several methods of handling data obtained from yes-no and rating-scale experiments; an analysis of the forced-choice experiment is reported elsewhere (Abrahamson and Levitt, 1968). Although there is some overlap with that already published, the present work provides a more general and more extensive treatment of the topic. Ogilvie and Creelman, for example, obtained solutions for the logistic distribution only, and Dorfman and Alf restricted their analysis to yes-no data for the case of normal distributions. In this treatment no distributional assumptions are made other than that the hypothesized distributions belong to the same location-scale family, which covers both the above-mentioned cases. (Admittedly, the location-scale restriction is unfortunate, but mathematical tractability forces some constraints.) Simple alternatives to the maximum-likelihood solution are considered, and the relative accuracy and precision of these estimates have been evaluated in a Monte Carlo simulation. In addition, several tests on the consistency of the model are proposed.

¹ We wish to thank Colin Mallows and Derek Hudson for their constructive criticisms and Miss Lorinda Landgraf for her very able assistance in computer programming.

2. BACKGROUND

A variety of decision-theory models have been proposed, differing in emphasis and interpretation (Green and Swets, 1966; Peterson, Birdsall, and Fox, 1954; Treisman, 1965). Our main concern is with the underlying estimation problem, and we describe a model which may be regarded as typical for this purpose. A stimulus of intensity Z is presented to the subject. Z consists of background noise, N, and possibly a fixed signal of intensity S, so that we can assume Z = N or Z = N + S according to whether a signal is absent or present in the stimulus. In practice, N may be applied externally with the signal, or the signal alone may be applied so that N = 0. In either case, we assume E(N) = 0. The stimulus Z gives rise to a sensory effect, J'. Because of the presence of internal noise in the system, a particular value of Z does not always give rise to the same effect J', but we assume that E(J' | Z) increases monotonically with Z over a large range. In the presence of external noise, given the presence (or absence) of S, the variability of J' may be almost entirely due to N. If N = 0, the variability of J' is due to internal noise only. Each of the two distributions of Z, with and without S, corresponds to a distribution of J' in the sensory continuum. J' is not a physically measurable quantity, but we may define a quantitative equivalent of J', denoted by J, which has physical dimensions. The two distributions of J', with and without S, will induce distributions of J with densities f_SN and f_N, say, respectively. The difference between the expected values of J calculated with respect to f_SN and f_N, respectively,

s = E_SN(J) - E_N(J),

is the effective strength of the signal whose real intensity is S. The relationship between S and s is implicit in the relationship between Z and J, and may be assumed known, at least up to a constant factor (i.e., s = rg(S), where g is known but r may not be).
It is assumed that the subject decides that a signal is present in the stimulus only if J' exceeds some critical level or, equivalently, J exceeds some constant c. The decision-theory approach hypothesizes that the subject can control the choice of c in a statistically reasonable way, depending on the known probability that a signal is present in a stimulus and on the relative rewards and penalties which the experimenter attaches to correct and incorrect decisions. In some models (e.g., Peterson et al., 1954; Swets, Tanner, and Birdsall, 1961) it is assumed that the criterion c is derived using the likelihood-ratio procedure; but the decision procedure described here corresponds to a likelihood-ratio procedure only for certain reasonable choices of f_N and f_SN (e.g., normal distributions with equal variances).
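Read literally, this decision rule is easy to simulate. The sketch below assumes the equal-variance normal special case just mentioned (standard-normal internal noise) with hypothetical values of s, c, and the signal probability; it estimates the subject's hit and false-alarm rates directly:

```python
import random

def simulate_trials(n, alpha, s, c, seed=1):
    """Simulate n yes-no trials: with probability alpha the effect J is
    drawn from the signal-plus-noise distribution N(s, 1), otherwise
    from the noise distribution N(0, 1); 'yes' is reported when J > c."""
    rng = random.Random(seed)
    hits = fas = n_signal = n_noise = 0
    for _ in range(n):
        signal = rng.random() < alpha
        j = rng.gauss(s if signal else 0.0, 1.0)
        if signal:
            n_signal += 1
            hits += j > c
        else:
            n_noise += 1
            fas += j > c
    return hits / n_signal, fas / n_noise

# hypothetical parameters: s = 1.5, criterion c = 0.75, alpha = 0.5
hit_rate, fa_rate = simulate_trials(10000, 0.5, 1.5, 0.75)
```

With these values the hit rate should settle near 1 - F(c - s) and the false-alarm rate near 1 - F(c), the probabilities defined in (2.1) below.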

A correct decision that a signal is present is called a hit, while an incorrect decision that a signal is present (when it is not) is called a false alarm. Defining

F_N(x) = ∫_{-∞}^{x} f_N(t) dt  and  F_SN(x) = ∫_{-∞}^{x} f_SN(t) dt,

where F_N and F_SN are assumed to possess uniquely defined inverses, F_N^{-1} and F_SN^{-1}, respectively, the hit and false alarm probabilities are

P{Hit} = 1 - F_SN(c) = P(c),    P{False alarm} = 1 - F_N(c) = p(c)    (2.1)

(see Fig. 1).

FIG. 1. Decision procedure.

For fixed S, c may be varied by changing the relative rewards and penalties for correct and incorrect decisions. The points (p(c), P(c)), as c varies from -∞ to ∞, then represent a curve in the unit square which is called the receiver operating characteristic (ROC) curve, and its equation, obtained by eliminating c from (2.1), is

F_N^{-1}(1 - p) = F_SN^{-1}(1 - P)    (2.2)

(see Fig. 2). We shall assume from now on that f_N and f_SN belong to the same location-scale family. Let f(x) be a known (and sufficiently differentiable) density with zero mean and unit variance. For some numbers s, σ, σ_s, we assume that

f_N(x) = (1/σ) f(x/σ),    f_SN(x) = (1/σ_s) f((x - s)/σ_s);    (2.3)

σ and σ_s are the standard deviations of f_N and f_SN, respectively. Let

F(x) = ∫_{-∞}^{x} f(t) dt.

FIG. 2. ROC curve.

Then (2.1) becomes

p(c) = 1 - F(c/σ),    P(c) = 1 - F((c - s)/σ_s),    (2.4)

and the ROC curve, (2.2), is

F^{-1}(1 - P) = (σ/σ_s) F^{-1}(1 - p) - s/σ_s.    (2.5)

Putting ξ = F^{-1}(1 - p) and η = F^{-1}(1 - P), we can rewrite (2.5) as

η = (σ/σ_s) ξ - s/σ_s.    (2.6)

Equation (2.6) is a representation of the ROC curve as a linear function of (presumably) known transforms of the hit and false alarm probabilities. If σ = σ_s, (2.6) depends only on d' = s/σ, which is used as a measure of the subject's sensitivity. In all cases, the experimenter knows whether or not a signal has been presented and is thus in a position to estimate the subject's hit and false alarm probabilities for a fixed set of experimental conditions, after a sequence of repeated trials. These estimates ordinarily constitute the data. To estimate (2.6), standard techniques are not applicable since both η and ξ are estimated and, furthermore, the joint distribution varies with (ξ, η). The principal aim of this study is to provide methods of estimating the parameters of the model.
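For a single operating point, with F taken to be the standard normal distribution function (the classical choice; Python's `statistics.NormalDist` supplies F^{-1}), the transforms and the equal-variance sensitivity measure d' can be computed as follows. The counts are hypothetical:

```python
from statistics import NormalDist

def dprime(hits, signal_trials, false_alarms, noise_trials):
    """Estimate d' = s/sigma from one operating point, assuming F is
    the standard normal CDF and equal scale factors (sigma = sigma_s)."""
    nd = NormalDist()
    P_hat = hits / signal_trials            # estimated hit probability
    p_hat = false_alarms / noise_trials     # estimated false-alarm probability
    X = nd.inv_cdf(1 - p_hat)               # X = F^{-1}(1 - p̂)
    Y = nd.inv_cdf(1 - P_hat)               # Y = F^{-1}(1 - P̂)
    return X - Y                            # from (2.6) with sigma = sigma_s

d = dprime(80, 100, 20, 100)                # hypothetical counts; d ≈ 1.68
```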

Tests of the various assumptions underlying the model and the decision process are suggested. Attempts are made to investigate questions of bias, precision, consistency, sample size, and power. The main techniques employed are: (a) the use of the asymptotic normality of the multinomial distribution, and (b) the "delta method," or method of propagation of errors (Rao, 1965, Ch. 6). We shall adopt the notation that "X ~ N(μ, Σ)" means the random vector X is (multivariate) normally distributed with mean vector μ and covariance matrix Σ. The sign ≈ should be read as "is approximately distributed as."

3. THE YES-NO EXPERIMENT

A sequence of trials is performed in each of which the subject must state whether or not he thinks a signal is present. As far as the subject is concerned, the signal is presented at random with probability α in any trial and is of fixed strength. The subject is told α and is given a system of rewards and penalties (not necessarily quantitative) to influence his decision procedure, i.e., his choice of c. Let n be the number of trials, u the number of hits, v the number of false alarms, and α the proportion of trials containing a signal. Then p(c) and P(c) can be estimated as

p̂(c) = v/[(1 - α)n],    P̂(c) = u/(αn).

By changing α or the system of rewards and penalties between sequences, estimates of (p(c), P(c)) can be obtained for several values of c. Suppose that n_i trials are made at c_i (i.e., when the subject's criterion is the unknown c_i), the signal being presented on a randomly assigned proportion, α_i, of the trials, and that u_i hits and v_i false alarms result, for i = 1,..., N, for N different sets of conditions. Then, putting

p_i = p(c_i)  and  P_i = P(c_i),    i = 1,..., N,

the estimates are

p̂_i = v_i/[(1 - α_i)n_i]  and  P̂_i = u_i/(α_i n_i),    i = 1,..., N.

Since we assume that the subject decides on the presence or absence of a signal only by comparing J' with a fixed criterion on each trial, and that each J' is independent of those observed in the preceding trials, u_i and v_i are independent binomial variables with parameters (P_i, α_i n_i) and (p_i, (1 - α_i)n_i), respectively. Thus, asymptotically,

n_i^{1/2} [(p̂_i, P̂_i)' - (p_i, P_i)'] ≈ N(0, diag{p_i(1 - p_i)/(1 - α_i), P_i(1 - P_i)/α_i}).    (3.1)

Let

X_i = F^{-1}(1 - p̂_i),  Y_i = F^{-1}(1 - P̂_i),  ξ_i = F^{-1}(1 - p_i),  η_i = F^{-1}(1 - P_i),    (3.2)

and

γ_i = p_i(1 - p_i)/[(1 - α_i) f²(ξ_i)],    δ_i = P_i(1 - P_i)/[α_i f²(η_i)].    (3.3)

If f is normal, γ_i and δ_i correspond to the probit weights (Finney, 1952). Since

X_i - ξ_i ≈ (p_i - p̂_i)/f(ξ_i)  and  Y_i - η_i ≈ (P_i - P̂_i)/f(η_i),    (3.4)

we have asymptotically, by (3.1),

n_i^{1/2} [(X_i, Y_i)' - (ξ_i, η_i)'] ≈ N(0, diag{γ_i, δ_i}).    (3.5)

Estimating η_i by Y_i does not give an unbiased estimate. Assuming that f possesses a continuous derivative, f', the Taylor expansion [which extends (3.4)] gives

Y_i - η_i = (P_i - P̂_i)/f(η_i) - [(P̂_i - P_i)²/2] · f'(η_i*)/f³(η_i*),    (3.6)

where η_i* is between η_i and Y_i, and approaches η_i with probability one as n_i → ∞. Thus the bias in the estimate is

θ_i = E(Y_i - η_i) = -[P_i(1 - P_i)/(2α_i n_i)] · f'(η_i)/f³(η_i) + o(1/n_i)
    = -(δ_i/2n_i) · f'(η_i)/f(η_i) + o(1/n_i) = O(1/n_i).    (3.7a)

Similarly, the bias in X_i as an estimate of ξ_i is

φ_i = E(X_i - ξ_i) = -(γ_i/2n_i) · f'(ξ_i)/f(ξ_i) + o(1/n_i).    (3.7b)

Both θ_i and φ_i may be estimated by substituting p̂_i and P̂_i for p_i and P_i, and X_i and Y_i for ξ_i and η_i, respectively, in (3.7). By using a jackknife procedure to estimate η_i and ξ_i, the bias can be reduced to o(1/n_i) (for a description of the jackknife, see Miller, 1964,
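For normal f the substitution is particularly simple, since f'(y) = -y f(y). A plug-in sketch of the leading bias term of (3.7a) (the function and argument names are ours):

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def bias_Y(P_hat, n_signal):
    """Plug-in estimate of the O(1/n) bias (3.7a) of Y = F^{-1}(1 - P̂)
    when F is standard normal; n_signal plays the role of alpha_i * n_i."""
    Y = NormalDist().inv_cdf(1 - P_hat)
    f = exp(-Y * Y / 2) / sqrt(2 * pi)      # normal density at Y
    # theta = -P(1-P) f'(eta) / (2 alpha n f^3(eta)); with f'(y) = -y f(y)
    # this reduces to P(1-P) eta / (2 alpha n f^2(eta)).
    return P_hat * (1 - P_hat) * Y / (2 * n_signal * f * f)
```

The bias vanishes at P̂ = 0.5 (where Y = 0) and pulls Y toward zero elsewhere, as the sign of the formula indicates.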

and references cited there). Equation (3.5) will then still hold for reasonable choices of F, such as the normal or logistic distribution functions.

CASE (A). EQUAL SCALE FACTORS, σ = σ_s

This case has been considered briefly in Atkinson, Bower, and Crothers (1965, p. 211), where a simple estimate of s/σ is given. The relation to be estimated is

η = ξ - s/σ.    (3.8)

Let

Z_i = X_i - Y_i.    (3.9)

From (3.5) and (3.8) we see that for large n_i,

n_i^{1/2}(Z_i - s/σ) ≈ N(0, τ_i),  where  τ_i = γ_i + δ_i.    (3.10)

τ_i can be estimated by substituting the estimates

γ̂_i = p̂_i(1 - p̂_i)/[(1 - α_i) f²(X_i)]  and  δ̂_i = P̂_i(1 - P̂_i)/[α_i f²(Y_i)]    (3.11)

in (3.10) to get an estimate, τ̂_i. An estimate of d' = s/σ which is linear in the Z's and asymptotically unbiased and efficient (i.e., minimum variance) as M = min_i n_i → ∞ is therefore

(ŝ/σ)_LU = Σ_i (n_i Z_i/τ̂_i) / Σ_i (n_i/τ̂_i).    (3.12)

The bias in this estimate is

B((ŝ/σ)_LU) = E((ŝ/σ)_LU - s/σ) = [Σ_i n_i(φ_i - θ_i)/τ_i] / [Σ_i n_i/τ_i] · (1 + o(1)) = O(1/M),

as M → ∞. The variance of (ŝ/σ)_LU is approximately [Σ_i n_i/τ_i]^{-1}. Apart from estimating d', we might consider whether our assumption about the form of f is reasonable. In order to do this, we need to estimate the c_i's as well. It
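With F standard normal, (3.10)-(3.12) translate directly into code. The data tuples below are hypothetical:

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def phi(x):                                   # standard normal density
    return exp(-x * x / 2) / sqrt(2 * pi)

def pooled_dprime(data):
    """Asymptotically minimum-variance linear estimate (3.12) of s/sigma,
    assuming F standard normal. `data` holds per-condition tuples
    (hits, signal_trials, false_alarms, noise_trials)."""
    nd = NormalDist()
    num = den = 0.0
    for u, ns, v, nn in data:
        P, p = u / ns, v / nn
        Y, X = nd.inv_cdf(1 - P), nd.inv_cdf(1 - p)
        n = ns + nn
        alpha = ns / n
        gamma = p * (1 - p) / ((1 - alpha) * phi(X) ** 2)   # (3.11)
        delta = P * (1 - P) / (alpha * phi(Y) ** 2)
        tau = gamma + delta                                  # (3.10)
        num += n * (X - Y) / tau
        den += n / tau
    return num / den
```

Each operating point contributes its Z_i = X_i - Y_i weighted by n_i/τ̂_i, so noisier points (extreme proportions, small n_i) are downweighted.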

is convenient to parameterize in terms of s/σ and ξ_i = c_i/σ. The likelihood function is proportional to

Π_{i=1}^{N} [1 - F(ξ_i - s/σ)]^{u_i} [F(ξ_i - s/σ)]^{α_i n_i - u_i} [1 - F(ξ_i)]^{v_i} [F(ξ_i)]^{(1-α_i)n_i - v_i}.

Thus

log L = Σ_{i=1}^{N} [u_i log{1 - F(ξ_i - s/σ)} + (α_i n_i - u_i) log F(ξ_i - s/σ)
  + v_i log{1 - F(ξ_i)} + {(1 - α_i) n_i - v_i} log F(ξ_i)].

The maximum likelihood equations are

∂ log L/∂(s/σ) = 0,    (3.13a)

∂ log L/∂ξ_i = 0,    i = 1,..., N.    (3.13b)

Using the facts that X_i - ξ_i → 0 and Y_i - η_i → 0 with probability 1, and that, from (3.4), we have

p̂_i - {1 - F(ξ_i)} ≈ (ξ_i - X_i) f(ξ_i),    P̂_i - {1 - F(η_i)} ≈ (η_i - Y_i) f(η_i),

approximate solutions of (3.13) are given by

Σ_i n_i [Y_i - ξ̂_i + (ŝ/σ)]/δ_i = 0

and

[Y_i - ξ̂_i + (ŝ/σ)]/δ_i + (X_i - ξ̂_i)/γ_i = 0,    i = 1,..., N;

or

(ŝ/σ) = Σ_i (n_i Z_i/τ_i) / Σ_i (n_i/τ_i)

and

ξ̂_i = (1 + δ_i/γ_i)^{-1} [(ŝ/σ) + Y_i + δ_i X_i/γ_i],    i = 1,..., N.

Notice that the approximate ML estimate so obtained is precisely the asymptotic linear minimum-variance unbiased estimate obtained in (3.12). The covariance matrix of [(ŝ/σ), ξ̂] is asymptotically given by the inverse of the information matrix

Cov^{-1}(ŝ/σ, ξ̂) = -E[∂² log L/∂θ ∂θ'],    θ = (s/σ, ξ_1,..., ξ_N),

whose entries are Σ_i n_i/δ_i for the (s/σ, s/σ) element, -n_i/δ_i for the (s/σ, ξ_i) elements, and δ̄_ij n_i(1/γ_i + 1/δ_i) for the (ξ_i, ξ_j) elements (where δ̄_ij = 1 if i = j and 0 if i ≠ j). Inverting this matrix gives

Var(ŝ/σ) = [Σ_k n_k/τ_k]^{-1}  (as before),    (3.14a)

Cov(ŝ/σ, ξ̂_i) = [(1 + δ_i/γ_i) Σ_k n_k/τ_k]^{-1},    (3.14b)

and the (i, j)th entry in Cov^{-1}(ξ̂) is

{Cov^{-1}(ξ̂)}_ij = δ̄_ij(γ_i^{-1} + δ_i^{-1}) n_i - (Σ_k n_k/δ_k)^{-1} n_i n_j/(δ_i δ_j),    i, j = 1,..., N.    (3.14c)


Sotice that the X’lL estimate of (s/u) is not only consistent as the ni’s increase indefinitely, but consistent if only one of the ni’s increases indefinitely. Suppose n, increases without limit. Then form (3.14a) we have, asymptotically, since all the oh’s are positive,

which tends to, as n, increases. Furthermore, from (3.14b) we also have Cov(s/o), li) vanishing as nl increases, so that ii and (j/u) independent. For consistency of all the lI’s, however, one would ni’s increase indefinitely. The goodness-of-fit off and the assumptions that both fN and variance and that s is the location parameter of the problem, may statistic

Kl = i

see that we would are asymptotically require that all the

fsN have

the same be tested with the

[A - u - ~(W>>l”

i=l

F(&/L+){l IV

+ i;

- F(e,/a)}

%(l - “J

[P, - {I - F((& - s)6)}]” F((& - s)/6){1 - F((& - s)/6)} nisi ’

(3.15)

where ξ̂_i = ĉ_i/σ̂ and (ŝ/σ) are the estimates obtained above. Since N + 1 parameters have been estimated, K₁ is approximately distributed as χ²_{N-1}. If the fit is not good, it may be hard to trace the source of the difficulty, because one would have to test the assumptions separately. If one or more of the assumptions are invalid then, as the n_i's increase so that n_i/Σ_j n_j → ν_i, p̂_i and P̂_i will converge to the true false alarm and hit probabilities, p_i and P_i, respectively, while the statistician is, in effect, trying to estimate c and σ by minimizing (3.15) (i.e., by maximizing the exponent in the asymptotic joint normal distribution of the p̂_i's and P̂_i's, which is equivalent to ML estimation for large samples). Thus K₁/Σ_j n_j will converge to

H₁ = min_{c,σ} [ Σ_i ν_i(1 - α_i) [p_i - {1 - F(c_i/σ)}]² / (F(c_i/σ){1 - F(c_i/σ)})
  + Σ_i ν_i α_i [P_i - {1 - F((c_i - s)/σ)}]² / (F((c_i - s)/σ){1 - F((c_i - s)/σ)}) ],    (3.16a)

where the P_i's and p_i's are, respectively, the true hit and false alarm probabilities. If all the assumptions are correct, H₁ = 0. Otherwise, suppose the minimum is achieved at σ⁰ and c_i⁰, and let

p_i⁰ = 1 - F(c_i⁰/σ⁰),    P_i⁰ = 1 - F((c_i⁰ - s)/σ⁰).    (3.16b)
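With F standard normal, (3.15) is easy to evaluate once fitted values of ξ̂_i and ŝ/σ are in hand. A sketch, with hypothetical inputs:

```python
from statistics import NormalDist

nd = NormalDist()

def K1(data, dprime, xis):
    """Goodness-of-fit statistic in the spirit of (3.15), assuming F is
    the standard normal CDF. `data` holds (hits, signal_trials,
    false_alarms, noise_trials); `xis` are the fitted criteria ĉ_i/σ̂."""
    total = 0.0
    for (u, ns, v, nn), xi in zip(data, xis):
        p_hat, P_hat = v / nn, u / ns
        p_fit = 1 - nd.cdf(xi)            # fitted false-alarm probability
        P_fit = 1 - nd.cdf(xi - dprime)   # fitted hit probability
        total += nn * (p_hat - p_fit) ** 2 / (p_fit * (1 - p_fit))
        total += ns * (P_hat - P_fit) ** 2 / (P_fit * (1 - P_fit))
    return total
```

A perfect fit gives K₁ = 0; under the model, K₁ is referred to the χ²_{N-1} distribution as described above.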


For large n_i, (3.15) is approximately given by

K₁ ≈ Σ_{i=1}^{N} [(p̂_i - p_i⁰)² / {p_i(1 - p_i)}] · [p_i(1 - p_i)/{p_i⁰(1 - p_i⁰)}] n_i(1 - α_i)
  + Σ_{i=1}^{N} [(P̂_i - P_i⁰)² / {P_i(1 - P_i)}] · [P_i(1 - P_i)/{P_i⁰(1 - P_i⁰)}] n_i α_i,

which is, asymptotically, a weighted sum of noncentral χ² variables, each with one degree of freedom, so that we have, roughly, the stochastic lower bound

K₁ ≥ λ χ²_{2N}(Δ²),

where χ²_m(z²) denotes a noncentral χ² variable with m degrees of freedom and noncentrality parameter z²,

Δ² = Σ_{i=1}^{N} (t_i² + T_i²),

with the noncentralities t_i² and T_i² given below, and

λ = min_i { p_i(1 - p_i)/[p_i⁰(1 - p_i⁰)],  P_i(1 - P_i)/[P_i⁰(1 - P_i⁰)] }.

Thus, if χ²_{m,ε} is the upper 100ε% point of χ²_m (i.e., P{χ²_m ≥ χ²_{m,ε}} = ε), the power of the test based on K₁ at the 100ε% level of significance is at least

P{χ²_{2N}(Δ²) ≥ λ^{-1} χ²_{N-1,ε}}.

In order to achieve a power of at least β against a specified set of p_i's and P_i's (the p_i⁰'s and P_i⁰'s would then be obtained from (3.16a)), we would have to choose the n_i's sufficiently large so as to make Δ² large enough to satisfy

P{χ²_{2N}(Δ²) ≥ λ^{-1} χ²_{N-1,ε}} = β.    (3.17)

Tables of the noncentral χ² distribution are not readily available, but it is usually adequate to approximate the distribution of χ²_m(z²) by that of aχ²_k, where a and k are fitted by moments and are given by a = (2z² + m)/(z² + m) and k = (m + z²)²/(m + 2z²). The distribution of χ²_m(z²) can also be approximated as normal with mean m + z² and variance 2(m + 2z²). Equation (3.17) will, of course, be conservative in that λχ²_{2N}(Δ²) is a "lower bound" distribution for K₁, and thus it will either underestimate the power for given n_i's or overestimate the n_i's for a given power. Another possibility when considering questions of power is to approximate the distribution of K₁ with that of a scaled χ² variable or a normal variable, by fitting moments. As we have already noted, K₁ is approximately distributed as a weighted sum of noncentral χ² variables, so that we have, approximately,

K₁ ≈ Σ_{i=1}^{N} {g_i χ²_1(t_i²) + G_i χ²_1(T_i²)},

where

g_i = p_i(1 - p_i)/[p_i⁰(1 - p_i⁰)],    G_i = P_i(1 - P_i)/[P_i⁰(1 - P_i⁰)],

and

t_i² = (p_i - p_i⁰)² n_i(1 - α_i)/[p_i(1 - p_i)],    T_i² = (P_i - P_i⁰)² n_i α_i/[P_i(1 - P_i)].

Then

E(K₁) = Σ_{i=1}^{N} {g_i(1 + t_i²) + G_i(1 + T_i²)},

and

V(K₁) = 2 Σ_{i=1}^{N} {g_i²(1 + 2t_i²) + G_i²(1 + 2T_i²)}.
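The normal approximation mentioned above (mean m + z², variance 2(m + 2z²)) gives a quick way to evaluate tail probabilities such as those appearing in (3.17) without noncentral χ² tables. A minimal sketch:

```python
from math import sqrt
from statistics import NormalDist

def noncentral_chi2_sf(x, m, z2):
    """Approximate P{chi2_m(z^2) >= x} by a normal distribution with
    mean m + z^2 and variance 2(m + 2 z^2), as suggested in the text."""
    return 1 - NormalDist(m + z2, sqrt(2 * (m + 2 * z2))).cdf(x)
```

As the text warns for the related power calculations, this approximation can be optimistic in the tails; it is intended for rough sample-size planning only.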

The distribution of K₁ is now approximated by that of aχ²_k, where a = V(K₁)[2E(K₁)]^{-1} and k = E(K₁)/a, or by a normal distribution with mean E(K₁) and variance V(K₁). In a simple application of (3.17), in which a logistic distribution was fitted to data from a normal distribution with N = 5 and α = 0.5, it was found that a lower bound on the sample size necessary to reject the fit at the 5% level with power 0.5 was well over 5000 observations. If, instead of the suggested lower-bound distribution for K₁, a normal approximation to the distribution is used, the predicted sample size is reduced by an order of magnitude. Predictions based on the latter approximation, however, may be somewhat optimistic. Details of these and other computations may be found in Abrahamson and Levitt (1968).

CASE (B). UNEQUAL SCALE FACTORS, σ ≠ σ_s

In practice, it may not be reasonable to assume that the variances of f_N and f_SN are the same (Green and Swets, 1966). We thus have to estimate (2.6), a problem of regression when both variables are subject to error. As before, we may approach this problem using the asymptotic joint normality of X_1,..., X_N, Y_1,..., Y_N, or by using maximum-likelihood techniques directly, as has been done by Dorfman and Alf (1968) for f_N and f_SN normal. Not surprisingly, these give the same asymptotic results. The likelihood function in this case is given by

log L = Σ_{i=1}^{N} [u_i log{1 - F(η_i)} + (α_i n_i - u_i) log F(η_i)
  + v_i log{1 - F(ξ_i)} + {(1 - α_i) n_i - v_i} log F(ξ_i)],    η_i = (σ/σ_s)ξ_i - s/σ_s.

The maximum likelihood equations are given by

∂ log L/∂(σ/σ_s) = 0,    ∂ log L/∂(s/σ_s) = 0,    ∂ log L/∂ξ_i = 0,    i = 1,..., N.    (3.18)

These equations may be solved iteratively using the matrix-approximation method described in Rao (1965), as has been done by both Ogilvie and Creelman (1966) and Dorfman and Alf (1968). An alternative, approximate procedure is to substitute appropriate consistent estimates in (3.18) to obtain

(σ̂/σ_s) = Σ_i w_i(ξ̂_i - ξ̄)(Y_i - Ȳ) / Σ_i w_i(ξ̂_i - ξ̄)²,    (3.19a)

(ŝ/σ_s) = (σ̂/σ_s) ξ̄ - Ȳ,    (3.19b)

ξ̂_i = [X_i/γ_i + (σ̂/σ_s)(Y_i + ŝ/σ_s)/δ_i] / [1/γ_i + (σ̂/σ_s)²/δ_i],    i = 1,..., N,    (3.19c)

where

ξ̄ = Σ_i w_i ξ̂_i / Σ_i w_i,    Ȳ = Σ_i w_i Y_i / Σ_i w_i,    w_i = n_i/[δ_i + (σ/σ_s)² γ_i].    (3.20)

FIG. 3. Data from Michigan report, Observer 2 (Tanner, Swets, and Green, March 1956). The fitted ROC curve is plotted against the proportion of false alarms.


There is no explicit solution to (3.19), since the equations are nonlinear. However, a simple iterative procedure can be used: first put ξ̂_i = X_i in (3.19a) and (3.19b), and substitute the resulting estimates of (σ/σ_s) and (s/σ_s) in (3.19c) to get a new set of ξ̂_i's. The γ_i and δ_i are reestimated with the new ξ̂_i's, and the new weights are used in both (3.19) and (3.20). Convergence appears to be quite rapid. An analysis of the yes-no data in Tanner, Swets, and Green (1956) provided estimates of (σ/σ_s) and (s/σ_s) after the first iteration that were within roughly 1% of the asymptotic estimates. A typical fitted curve is shown in Fig. 3. For the case where an observed proportion is either 0 or 1, an approximate solution may be obtained by using either 1/(4m_i) or 1 - 1/(4m_i), respectively, in place of the observed proportion, where m_i equals α_i n_i for hits and (1 - α_i)n_i for false alarms. A closer approximation to the maximum-likelihood solution may be obtained by replacing the corresponding value of X_i (or Y_i) by an "effective value" obtained by substituting the estimated values of p and ξ (or P and η) in (3.4). This procedure is analogous to that proposed by Fisher for handling extreme proportions in probit analysis (Irwin and Cheeseman, 1939).
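The iteration just described can be sketched as follows, assuming F standard normal and folding n_i into per-point variances. The counts below are hypothetical, generated to lie close to the line η = ξ - 1.5:

```python
from math import exp, pi, sqrt
from statistics import NormalDist

nd = NormalDist()

def phi(x):                                   # standard normal density
    return exp(-x * x / 2) / sqrt(2 * pi)

def fit_roc(data, iters=10):
    """Iterative weighted fit of Y ≈ (sigma/sigma_s) X - s/sigma_s in the
    spirit of (3.19); `data` holds (hits, signal_trials, false_alarms,
    noise_trials) per operating point. Returns (slope, s/sigma_s)."""
    X, Y, g, d = [], [], [], []
    for u, ns, v, nn in data:
        P, p = u / ns, v / nn
        X.append(nd.inv_cdf(1 - p))
        Y.append(nd.inv_cdf(1 - P))
        g.append(p * (1 - p) / (nn * phi(X[-1]) ** 2))   # Var(X_i)
        d.append(P * (1 - P) / (ns * phi(Y[-1]) ** 2))   # Var(Y_i)
    xi, a, b = X[:], 1.0, 0.0                 # start with xi_i = X_i
    for _ in range(iters):
        w = [1 / (dv + a * a * gv) for gv, dv in zip(g, d)]
        sw = sum(w)
        xb = sum(wi * x for wi, x in zip(w, xi)) / sw
        yb = sum(wi * y for wi, y in zip(w, Y)) / sw
        a = (sum(wi * (x - xb) * (y - yb) for wi, x, y in zip(w, xi, Y))
             / sum(wi * (x - xb) ** 2 for wi, x in zip(w, xi)))
        b = a * xb - yb                       # intercept of (2.6) is -s/sigma_s
        xi = [(x / gv + a * (y + b) / dv) / (1 / gv + a * a / dv)
              for x, y, gv, dv in zip(X, Y, g, d)]       # cf. (3.19c)
    return a, b

data = [(841, 1000, 309, 1000), (691, 1000, 159, 1000),
        (500, 1000, 67, 1000), (309, 1000, 23, 1000)]
slope, s_over_sigma_s = fit_roc(data)         # close to 1.0 and 1.5
```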

The covariance matrix for large samples is given by

Cov^{-1}(σ̂/σ_s, ŝ/σ_s, ξ̂) = -E[∂² log L/∂θ ∂θ'],    θ = (σ/σ_s, s/σ_s, ξ_1,..., ξ_N).

With η_i = (σ/σ_s)ξ_i - s/σ_s, the entries of this information matrix are Σ_i n_i ξ_i²/δ_i, -Σ_i n_i ξ_i/δ_i, and Σ_i n_i/δ_i for the (σ/σ_s, σ/σ_s), (σ/σ_s, s/σ_s), and (s/σ_s, s/σ_s) elements; n_i(σ/σ_s)ξ_i/δ_i and -n_i(σ/σ_s)/δ_i for the (σ/σ_s, ξ_i) and (s/σ_s, ξ_i) elements; and δ̄_ij n_i[1/γ_i + (σ/σ_s)²/δ_i] for the (ξ_i, ξ_j) elements, where δ̄_ij = 1 if i = j and 0 if i ≠ j. Thus the asymptotic variances Var(σ̂/σ_s) and Var(ŝ/σ_s), the covariance Cov(σ̂/σ_s, ŝ/σ_s), and Cov^{-1}(ξ̂), Eqs. (3.21a)-(3.21d), are obtained by inverting this matrix, with γ_i, δ_i, and ξ_i replaced by their estimates.

A rough test of the hypothesis that σ = σ_s may be based on the fact that (σ̂/σ_s) is approximately N[(σ/σ_s), ω²], where the variance ω² = Var(σ̂/σ_s) is estimated from (3.21a); the hypothesis is rejected if U = |(σ̂/σ_s) - 1|/ω is "too large." The biases, B(ŝ/σ̂_s) and B(σ̂/σ̂_s), seem very difficult to assess, owing to the complicated nature of the function and the number of estimates in (3.19a). The c_i's can be estimated as ĉ_i = σξ̂_i and the η_i's as η̂_i = (σ̂/σ_s)ξ̂_i - (ŝ/σ_s). Thus, the compatibility of f and the location-scale assumptions on the relationship between f_N and f_SN may be tested using

K₂ = Σ_i [p̂_i - {1 - F(ξ̂_i)}]² / [F(ξ̂_i){1 - F(ξ̂_i)}] · n_i(1 - α_i)
  + Σ_i [P̂_i - {1 - F(η̂_i)}]² / [F(η̂_i){1 - F(η̂_i)}] · n_i α_i.    (3.22)

Since N + 2 parameters have been estimated (σ, σ_s, ξ_1,..., ξ_N), K₂ is approximately distributed as χ²_{N-2}. The validity of the assumption that f_N and f_SN both belong to the location-scale family defined by f should be revealed in the degree of linearity which appears when the points (X_i, Y_i) are plotted. A common measure of linearity is

ρ = Σ_i (X_i - X̄)(Y_i - Ȳ) / [Σ_i (X_i - X̄)² Σ_i (Y_i - Ȳ)²]^{1/2},    (3.23)

where Ȳ = Σ_i Y_i/N and X̄ = Σ_i X_i/N. A value of ρ close to 1 would be judged to reveal linearity. Since the distribution of (X_i, Y_i) varies from point to point, even the asymptotic distribution of ρ appears to be analytically intractable. Empirical estimates of the cumulative distribution of ρ for several typical cases are shown in Fig. 4.
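Equation (3.23) is an ordinary product-moment correlation of the plotted points; a direct sketch:

```python
from math import sqrt

def linearity_rho(xs, ys):
    """Linearity measure (3.23) for the transformed points (X_i, Y_i)."""
    N = len(xs)
    xb, yb = sum(xs) / N, sum(ys) / N
    sxy = sum((x - xb) * (y - yb) for x, y in zip(xs, ys))
    sxx = sum((x - xb) ** 2 for x in xs)
    syy = sum((y - yb) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)
```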

FIG. 4. Ordered plot of ρ estimates from Monte Carlo study. The data sets A, B, C, and D are defined in Table I.

The plot of (X_i, Y_i) will not be linear if F is not the true distribution function underlying the model. If G is the true underlying distribution and the location-scale assumptions hold, the ROC curve, when transformed by F^{-1} (which would be linear when G = F), actually plots F^{-1}(G[(c - s)/σ_s]) against F^{-1}(G(c/σ)), i.e., the curve

y = F^{-1}( G[ (σ/σ_s) G^{-1}(F(x)) - s/σ_s ] ).

Considerations of power and sample size are similar to those of Case (A).

A test of whether the observer is using a prescribed set of criteria may be obtained by considering the statistic

K₃ = [ξ̂ - ξ*]' [Cov^{-1}(ξ̂)] [ξ̂ - ξ*],    (3.24)

where ξ* = (ξ₁*, ξ₂*,..., ξ_N*)' corresponds to the prescribed set of criteria and ξ̂ contains the estimated operating points. Although there is some reason to believe that K₃ should be roughly chi-square with N - 2 degrees of freedom, a Monte Carlo check on several sets of data shows a good approximation to chi-square with N degrees of freedom (see Fig. 7). If σ = σ_s, the prescribed criteria may be defined in terms of the likelihood ratios, L_i, where

L_i = f_SN(c_i)/f_N(c_i) = f(ξ_i* - s/σ)/f(ξ_i*).    (3.25)

K₃ may thus be determined using the ξ* obtained from (3.25) and Cov^{-1}(ξ̂) obtained from (3.14c). Simpler, consistent estimates of (σ/σ_s) have been derived using procedures developed by Madansky (1959). Two such estimators of the slope are

(σ̂/σ_s) = { [Σ_i n_i(Y_i - Ȳ)²/δ̂_i - (N - 1)] / [Σ_i n_i(X_i - X̄)²/δ̂_i - (N - 1)(Σ_i γ̂_i/δ̂_i)/N] }^{1/2}    (3.26a)

and

(σ̂/σ_s) = { [Σ_i (Y_i - Ȳ)² - Σ_i δ̂_i/n_i] / [Σ_i (X_i - X̄)² - Σ_i γ̂_i/n_i] }^{1/2},    (3.26b)

where

Ȳ = Σ_{i=1}^{N} Y_i/N,    X̄ = Σ_{i=1}^{N} X_i/N.

Estimates of s/σ_s may be obtained from (3.19b).
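A moment-corrected estimator in the spirit of (3.26b) is straightforward to code. The per-point sampling variances (γ̂_i/n_i for X_i and δ̂_i/n_i for Y_i) are passed in explicitly:

```python
from math import sqrt

def slope_corrected(xs, ys, var_x, var_y):
    """Consistent slope estimate in the spirit of (3.26b): the raw sums
    of squares of Y and X are reduced by the summed sampling variances
    of the individual points before taking the ratio."""
    N = len(xs)
    xb, yb = sum(xs) / N, sum(ys) / N
    num = sum((y - yb) ** 2 for y in ys) - sum(var_y)
    den = sum((x - xb) ** 2 for x in xs) - sum(var_x)
    return sqrt(num / den)
```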

MONTE CARLO STUDIES

A Monte Carlo simulation was carried out in order to compare and check the validity of the estimation procedures and to obtain empirical estimates of the distribution of the various estimates. Four sets of data were considered, with (s/σ_s, σ/σ_s; ξ) equal to (1, 1; 5, 5.25, 5.5, 5.75, 6), (1, 1; 5.5, 5.75, 6, 6.25, 6.5), (0.5, 0.75; 4.25, 4.75, 5.25, 5.75, 6.25), and (1, 1; 4.5, 4.75, 5, 5.25, 5.5, 5.75, 6, 6.25, 6.5). These sets of


data are referred to as A, B, C, and D, respectively. A binomial sampling process was simulated at each value of ξ_i and η_i, with α_i and n_i fixed at 0.5 and 200, respectively. The simulation was repeated 100 times for each set of data. Table I shows the average values of (ŝ/σ_s) and (σ̂/σ_s) and their associated standard errors. The estimates given by (3.26a) and (3.26b) are virtually as accurate as the maximum-likelihood estimates and only marginally less efficient. Good agreement was obtained between the average maximum-likelihood estimates of the standard error (i.e., [V(ŝ/σ_s)]^{1/2} and [V(σ̂/σ_s)]^{1/2}) and the standard errors of the estimates obtained in the Monte Carlo simulation. The distributions of the estimates about their mean values were found to be approximately normal. The cumulative distributions of the test statistics ρ, U, K₂, and K₃ have been obtained empirically using ordered plots. Several typical results are shown in Figs. 4, 5, 6, and 7, respectively. As expected, U was found to be approximately half-normal for σ/σ_s = 1. The straight line in Fig. 5 shows the expected plot for N(0, 1). The half-normal plot of U for σ/σ_s = 0.75 is approximately that for N{-0.25/[V(σ̂/σ_s)]^{1/2}, 1}. Figure 6 plots the ordered values of K₂ against the quantiles of a gamma distribution with parameter η, where η = (N - 2)/2. The straight line shows the expected plot for χ²_{N-2}. K₂ was found to be approximately chi-square with N - 2 degrees of freedom. K₃ was also found to be approximately chi-square, but in this case a good fit was obtained with N rather than N - 2 degrees of freedom. Figure 7 plots the ordered values of K₃ against the quantiles of the gamma distribution corresponding

FIG. 5. Half-normal plot of U estimates.

FIG. 6. Gamma plot of K₂ estimates (gamma quantiles, η = 1.5).

FIG. 7. Gamma plot of K₃ estimates (gamma quantiles, η = 2.5).

to χ²_N; the expected plot of χ²_N is shown as a straight line. The distribution of ρ appears to depend in a complicated way on the δ_i, γ_i, α_i, and n_i. Estimates of these distributions are shown in Fig. 4. Despite the large differences between the curves, values of ρ seldom fell below 0.6 for all four cases. This may not be the case, however, for n_iα_i or n_i(1 - α_i) substantially less than 100.
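A miniature version of such a simulation (standard-normal F, a single operating point, hypothetical parameter values) is easy to reproduce:

```python
import random
from statistics import NormalDist, mean, stdev

nd = NormalDist()

def one_replication(rng, xi, dprime, n=200, alpha=0.5):
    """One Monte Carlo replication in the style of the study: binomial
    sampling at a fixed operating point (xi, eta = xi - d'), then the
    single-point estimate Z = X - Y of d'."""
    ns = round(alpha * n)
    nn = n - ns
    P = 1 - nd.cdf(xi - dprime)     # true hit probability
    p = 1 - nd.cdf(xi)              # true false-alarm probability
    u = sum(rng.random() < P for _ in range(ns))
    v = sum(rng.random() < p for _ in range(nn))
    return nd.inv_cdf(1 - v / nn) - nd.inv_cdf(1 - u / ns)

rng = random.Random(0)
est = [one_replication(rng, 0.5, 1.0) for _ in range(100)]
```

The empirical mean and standard deviation of `est` can then be compared with the asymptotic values implied by (3.10).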

4. THE RATING-SCALE EXPERIMENT

One of the drawbacks of the experiment described in Sec. 3 is that it requires a new set of n_i observations to determine the ith point on the ROC curve. Not only is this a rather time-consuming process but, since several sessions will usually be required, the possible presence of session-to-session variation may result in inconsistent data. A more efficient procedure (and one which is more likely to result in consistent maintenance of criteria) is to ask the subject to rate his degree of certainty at each trial instead of giving a simple yes-or-no response. In the experiment we now consider, the subject must state at each trial his degree of certainty (or confidence) that a signal is present. k increasing degrees of certainty are defined: C_1, C_2,..., C_k. We assume that a subject responds by selecting k - 1 levels of J, c_2 < c_3 < ... < c_k (c_1 = -∞, c_{k+1} = ∞), and classifying the received stimulus Z as C_i if J appears to lie between c_i and c_{i+1}. Under the previously mentioned assumption about the monotonicity of E(J' | Z), this amounts to an equivalent classification rule on J'. The classification probabilities are

P{C_i | SN} = Q_i = Q_i(c) = ∫_{c_i}^{c_{i+1}} f_SN(t) dt = ∫_{(c_i - s)/σ_s}^{(c_{i+1} - s)/σ_s} f(t) dt;    (4.1a)

P{C_i | N} = q_i = q_i(c) = ∫_{c_i}^{c_{i+1}} f_N(t) dt = ∫_{c_i/σ}^{c_{i+1}/σ} f(t) dt.    (4.1b)

Let α be the relative frequency with which the signal of strength s occurs, let n be the total number of trials, and let w_i and x_i be the numbers of trials classified as C_i for which the signal is, respectively, present and absent. Then estimates of the Q_i's and q_i's are

Q̂_i = w_i/(αn)  and  q̂_i = x_i/[(1 - α)n].

The random vectors (Q̂_1,..., Q̂_k) and (q̂_1,..., q̂_k) are independent. Each c_i can be regarded as defining a binary decision situation, as in Sec. 3. At the ith stage, the


subject "decides that a signal is present" if J > c_i, i.e., if J falls in A_i = ∪_{j≥i} C_j, so that the "hit" probabilities are

P_i = P(c_i) = P(A_i | SN) = ∫_{c_i}^{∞} f_SN(t) dt = Σ_{j≥i} Q_j = 1 - F((c_i - s)/σ_s).    (4.2)

The "false alarm" probabilities are

p_i = p(c_i) = P(A_i | N) = Σ_{j≥i} q_j = 1 - F(c_i/σ).    (4.3)

Obviously, Q_i = P_i - P_{i+1} and q_i = p_i - p_{i+1}; and once again the point corresponding to c_i on the ROC curve is given by

F^{-1}(1 - P_i) = (σ/σ_s) F^{-1}(1 - p_i) - s/σ_s.
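The cumulation in (4.2)-(4.3) is a pair of tail sums over the rating categories. A sketch, with hypothetical category counts (every trial is assumed classified, so the column totals play the roles of αn and (1 - α)n):

```python
def rating_to_roc(w, x):
    """Convert rating-category counts into cumulative hit and false-alarm
    estimates (4.2)-(4.3): P̂_i = sum_{j>=i} w_j / sum_j w_j, and
    similarly p̂_i; w[i], x[i] are the counts of category C_{i+1} on
    signal and noise trials, respectively."""
    def tail_props(counts):
        total = sum(counts)
        out, run = [], 0
        for c in reversed(counts):
            run += c
            out.append(run / total)
        return out[::-1]
    return tail_props(w), tail_props(x)

P_hat, p_hat = rating_to_roc([10, 20, 30, 40], [40, 30, 20, 10])
# P_hat = [1.0, 0.9, 0.7, 0.4]; p_hat = [1.0, 0.6, 0.3, 0.1]
```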

Now, from standard multinomial theory, we know that for large n, (Q̂_1,..., Q̂_{k-1}) is approximately jointly normal with

αn Cov(Q̂_i, Q̂_j) = Q_i(1 - Q_i)  for i = j,    -Q_iQ_j  for i ≠ j,

and similarly (but independently)

(1 - α)n Cov(q̂_i, q̂_j) = q_i(1 - q_i)  for i = j,    -q_iq_j  for i ≠ j.

With P̂_i = Σ_{j≥i} Q̂_j and p̂_i = Σ_{j≥i} q̂_j, it follows that (P̂_1,..., P̂_k) is independent of (p̂_1,..., p̂_k) and, asymptotically, each is (degenerately) jointly normally distributed with

Var(P̂_i) = P_i(1 - P_i)/(αn),    Cov(P̂_i, P̂_j) = P_j(1 - P_i)/(αn),    i < j;

Var(p̂_i) = p_i(1 - p_i)/[(1 - α)n],    Cov(p̂_i, p̂_j) = p_j(1 - p_i)/[(1 - α)n],    i < j.
i
For i ≤ j, let

γ_ij = γ_ji = p_j(1 - p_i) / {(1 - α) f[F^{-1}(1 - p_i)] f[F^{-1}(1 - p_j)]},    (4.4a)

δ_ij = δ_ji = P_j(1 - P_i) / {α f[F^{-1}(1 - P_i)] f[F^{-1}(1 - P_j)]},    (4.4b)

τ_ij = τ_ji = γ_ij + δ_ij,    (4.4c)

Γ = {γ_ij},    Δ = {δ_ij},    T = {τ_ij},    (i, j ≤ k - 1).    (4.4d)

Let X_i, Y_i, ξ_i, and η_i be defined as in (3.2). For large n, the X's and Y's are approximately jointly normal with E(X_i) ≈ ξ_i, E(Y_i) ≈ η_i, and n Cov(X_i, X_j) = γ_ij, n Cov(Y_i, Y_j) = δ_ij, and Cov(X_i, Y_j) = 0. If X' = (X_1,..., X_{k-1}) and Y' = (Y_1,..., Y_{k-1}), we have

n^{1/2}(X - ξ) ≈ N(0, Γ)  and  n^{1/2}(Y - η) ≈ N(0, Δ).

CASE (A). EQUAL SCALE FACTORS, σ = σ_s

Again, we have to estimate (3.8). Let Z_i again be defined as in (3.9): Z_i = X_i - Y_i. For large n, Z' = (Z_1,..., Z_{k-1}) is approximately (k - 1)-variate normal, with

E(Z) ≈ (s/σ) e,    e' = (1, 1,..., 1),

and n Cov(Z) = T. An estimate T̂ of T can be formed by replacing P_i and p_i by P̂_i and p̂_i, respectively, in (4.4). Then an approximately minimum variance estimate of s/σ is

(ŝ/σ) = e'T̂^{-1}Z / e'T̂^{-1}e = Σ_ij τ̂^{ij} Z_j / Σ_ij τ̂^{ij},    (4.5a)

where τ̂^{ij} is the (i, j)th element of T̂^{-1}. The variance of (ŝ/σ) is, in this case, approximately

[n Σ_ij τ̂^{ij}]^{-1},

and the bias is

B(ŝ/σ) = Σ_ij τ̂^{ij}(φ_j - θ_j) / Σ_ij τ̂^{ij},
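Equation (4.5a) needs only the solution of one linear system, T a = e, rather than a full matrix inverse. A self-contained sketch using Gaussian elimination (no claim that this matches the authors' computation):

```python
def gls_mean(z, T):
    """Estimate (4.5a): e'T^{-1}Z / e'T^{-1}e, computed by solving
    T a = e with Gauss-Jordan elimination and partial pivoting."""
    k = len(z)
    A = [row[:] + [1.0] for row in T]        # augmented system [T | e]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [u - factor * v for u, v in zip(A[r], A[col])]
    a = [A[i][k] / A[i][i] for i in range(k)]   # a = T^{-1} e
    return sum(ai * zi for ai, zi in zip(a, z)) / sum(a)
```

When T̂ is diagonal this reduces to the simple precision-weighted mean used in (3.12); the off-diagonal τ̂_ij account for the correlation between the rating-scale points.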

where 0, and #j are defined by (3.7). To test the goodness-of-fit off, we assume that c2 ,:.., clc are constant and, using u = 8 given by (4.5), estimate ci as o(~ where the ti’s maximize the (estimated) sample likelihood:

$$L(\xi) = \prod_{i=1}^{k} [Q_i(\sigma\xi)]^{w_i} [q_i(\sigma\xi)]^{x_i},$$

$$\log L(\xi) = \sum_{i=1}^{k} \left[ w_i \log Q_i(\sigma\xi) + x_i \log q_i(\sigma\xi) \right].$$
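Returning to (4.5a): the minimum-variance estimate is just a generalized-least-squares weighted mean of the $Z_i$, with weights given by the row sums of $\hat T^{-1}$. A minimal numerical sketch (the matrix $\hat T$ and the differences $Z$ are supplied directly here, purely for illustration, rather than computed from data):

```python
import numpy as np

def estimate_s_over_sigma(Z, T_hat, n):
    """Generalized-least-squares mean of Z, per (4.5a)-(4.5b)."""
    Tinv = np.linalg.inv(T_hat)
    e = np.ones(len(Z))
    denom = e @ Tinv @ e                 # sum of the elements of T-hat inverse
    est = (e @ Tinv @ Z) / denom         # (4.5a)
    var = 1.0 / (n * denom)              # (4.5b): approximate variance
    return est, var

Z = np.array([0.9, 1.0, 1.1])            # illustrative differences Z_i = X_i - Y_i
T_hat = np.array([[2.0, 0.5, 0.2],
                  [0.5, 2.0, 0.5],
                  [0.2, 0.5, 2.0]])
est, var = estimate_s_over_sigma(Z, T_hat, n=200)
```

With this symmetric $\hat T$ and symmetric $Z$, the weighted mean reproduces the ordinary mean 1.0; an unequal $\hat T$ would downweight the noisier $Z_i$.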


The "maximum likelihood" (ML) equations are

$$\frac{\partial \log L}{\partial \xi_i} = f(\eta_i)\left[\frac{w_{i-1}}{F(\eta_i) - F(\eta_{i-1})} - \frac{w_i}{F(\eta_{i+1}) - F(\eta_i)}\right] + f(\xi_i)\left[\frac{x_{i-1}}{F(\xi_i) - F(\xi_{i-1})} - \frac{x_i}{F(\xi_{i+1}) - F(\xi_i)}\right] = 0, \qquad i = 2, 3, \dots, k. \tag{4.6}$$

Solving these equations will be quite difficult, and even more so if the ML estimate of $s/\sigma$ is also sought, which is why we recommend substituting $\hat\sigma$ for $\sigma$ in (4.6); the general case appears quite impracticable, and in any case $\hat\sigma$ is asymptotically equivalent to the ML estimate. Relying on the strong consistency of the estimators $\hat Q_i$, $\hat q_i$, $\hat P_i$, and $\hat p_i$, (4.6) can be approximated as

$$f(\eta_i)\,\frac{w_{i-1}\left[1 - F(\eta_i) - \hat P_{i+1}\right] - w_i\left[\hat P_{i-1} - 1 + F(\eta_i)\right]}{\hat Q_{i-1} \hat Q_i} + f(\xi_i)\,\frac{x_{i-1}\left[1 - F(\xi_i) - \hat p_{i+1}\right] - x_i\left[\hat p_{i-1} - 1 + F(\xi_i)\right]}{\hat q_{i-1} \hat q_i} = 0, \qquad i = 2, \dots, k, \tag{4.7}$$

or

$$\begin{aligned}
&\alpha f(\eta_i) F(\eta_i)\left(\hat Q_{i-1}^{-1} + \hat Q_i^{-1}\right) + (1 - \alpha)\, f(\xi_i) F(\xi_i)\left(\hat q_{i-1}^{-1} + \hat q_i^{-1}\right) \\
&\qquad = \alpha f(Y_i)\left[\frac{1 - \hat P_{i-1}}{\hat Q_{i-1}} + \frac{1 - \hat P_{i+1}}{\hat Q_i}\right] + (1 - \alpha)\, f(X_i)\left[\frac{1 - \hat p_{i-1}}{\hat q_{i-1}} + \frac{1 - \hat p_{i+1}}{\hat q_i}\right], \qquad i = 2, \dots, k.
\end{aligned} \tag{4.8}$$

The RHS of (4.8) is a function of the data only and is hence known. The LHS is a function of $c_i$ ($= \sigma \xi_i$) and can be plotted, so that (at least) a graphical solution is feasible. The covariance matrix of the $\hat\xi_i$'s so obtained is approximately given by $\Psi^{-1}(s/\sigma, \xi)$, where the $(i, j)$th entries of $\Psi(s/\sigma, \xi)$ are given by

$$\Psi_{11} = \alpha n \sum_{i=1}^{k} \{f(\eta_{i+1}) - f(\eta_i)\}^2 / Q_i\,,$$

$$\Psi_{1i} = \Psi_{i1} = \alpha n\, f(\eta_i)\left[\frac{f(\eta_{i+1}) - f(\eta_i)}{Q_i} - \frac{f(\eta_i) - f(\eta_{i-1})}{Q_{i-1}}\right], \qquad i = 2, \dots, k,$$

and for $i, j \geq 2$,

$$\Psi_{ii} = n\left[\alpha f^2(\eta_i)\left(Q_{i-1}^{-1} + Q_i^{-1}\right) + (1 - \alpha) f^2(\xi_i)\left(q_{i-1}^{-1} + q_i^{-1}\right)\right],$$

$$\Psi_{i, i-1} = -n\left[\alpha f(\eta_i) f(\eta_{i-1})/Q_{i-1} + (1 - \alpha) f(\xi_i) f(\xi_{i-1})/q_{i-1}\right],$$

$$\Psi_{ij} = 0, \qquad |i - j| > 1.$$

Using the solution $\xi_1 = -\infty$, $\hat\xi_2, \dots, \hat\xi_k$, $\xi_{k+1} = +\infty$ of (4.6), (4.7), or (4.8) and


$\hat\eta_i = \hat\xi_i - (\hat s/\hat\sigma)$, a goodness-of-fit criterion is (remembering that $k$ parameters have been estimated)

$$K_A = n\left[\alpha \sum_{i=1}^{k} \frac{\{\hat Q_i - Q_i(A)\}^2}{Q_i(A)} + (1 - \alpha) \sum_{i=1}^{k} \frac{\{\hat q_i - q_i(A)\}^2}{q_i(A)}\right],$$

where

$$Q_i(A) = F(\hat\eta_{i+1}) - F(\hat\eta_i), \qquad q_i(A) = F(\hat\xi_{i+1}) - F(\hat\xi_i), \qquad i = 1, 2, \dots, k.$$

$K_A$ should be approximately distributed as $\chi^2_{k-2}$.
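The criterion $K_A$ compares the observed category proportions with those fitted at the estimated cutpoints. A sketch (normal $F$ assumed for illustration; in practice $\hat\eta$ and $\hat\xi$ would come from solving (4.6)-(4.8), whereas here they are supplied directly):

```python
import numpy as np
from statistics import NormalDist

Phi = np.vectorize(NormalDist().cdf)     # F taken as standard normal, for illustration

def K_A(Q_hat, q_hat, eta_hat, xi_hat, n, alpha):
    # eta_hat and xi_hat include the boundary cutpoints -inf and +inf, so the
    # fitted cell probabilities Q_i(A) and q_i(A) each sum to one.
    QA = np.diff(Phi(eta_hat))           # Q_i(A) = F(eta_{i+1}) - F(eta_i)
    qA = np.diff(Phi(xi_hat))            # q_i(A) = F(xi_{i+1}) - F(xi_i)
    return n * (alpha * np.sum((Q_hat - QA) ** 2 / QA)
                + (1.0 - alpha) * np.sum((q_hat - qA) ** 2 / qA))

xi_hat = np.array([-np.inf, -0.5, 0.5, np.inf])
eta_hat = xi_hat - 1.0                   # eta_i = xi_i - s/sigma, with s/sigma = 1
Q_hat, q_hat = np.diff(Phi(eta_hat)), np.diff(Phi(xi_hat))
perfect = K_A(Q_hat, q_hat, eta_hat, xi_hat, n=100, alpha=0.5)
misfit = K_A(Q_hat + np.array([0.05, -0.05, 0.0]), q_hat,
             eta_hat, xi_hat, n=100, alpha=0.5)
```

A perfect fit gives $K_A = 0$; any departure of the observed proportions from the fitted probabilities makes the statistic strictly positive.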

CASE (B). UNEQUAL SCALE FACTORS

Again we have to estimate the relation (2.6), where the data are $X$ and $Y$ as defined previously in this section. The procedure is the same as that in Sec. 3(B), but complicated by the nonindependence among the $X_i$'s and among the $Y_i$'s. To estimate $(\sigma/\sigma_s)$ and $(s/\sigma_s)$, we need to minimize, in $\xi$, $(\sigma/\sigma_s)$, and $(s/\sigma_s)$, the (estimated) quadratic form in the exponent of the asymptotic joint normal density of $X$ and $Y$:

$$-\log L^* = (X - \xi)'\, \hat\Gamma^{-1} (X - \xi) + \left(Y - \frac{\sigma}{\sigma_s}\, \xi + \frac{s}{\sigma_s}\, e\right)' \hat\Delta^{-1} \left(Y - \frac{\sigma}{\sigma_s}\, \xi + \frac{s}{\sigma_s}\, e\right),$$

where $e' = (1, 1, \dots, 1)$ and $\hat\Gamma$ and $\hat\Delta$ are the usual estimates of $\Gamma$ and $\Delta$ defined in (4.4). This leads to

$$\widehat{\left(\frac{\sigma}{\sigma_s}\right)} = \frac{(Y - \bar Y e)'\, \hat\Delta^{-1} (\xi - \bar\xi e)}{(\xi - \bar\xi e)'\, \hat\Delta^{-1} (\xi - \bar\xi e)}\,, \tag{4.9a}$$

$$\widehat{\left(\frac{s}{\sigma_s}\right)} = \widehat{\left(\frac{\sigma}{\sigma_s}\right)}\, \bar\xi - \bar Y, \tag{4.9b}$$

where

$$\bar Y = \frac{e'\, \hat\Delta^{-1} Y}{e'\, \hat\Delta^{-1} e}\,, \qquad \bar\xi = \frac{e'\, \hat\Delta^{-1} \xi}{e'\, \hat\Delta^{-1} e}\,,$$

and

$$\hat\xi = \left[\hat\Gamma^{-1} + \widehat{\left(\frac{\sigma}{\sigma_s}\right)}^{2} \hat\Delta^{-1}\right]^{-1} \left\{\hat\Gamma^{-1} X + \widehat{\left(\frac{\sigma}{\sigma_s}\right)} \hat\Delta^{-1}\left[Y + \widehat{\left(\frac{s}{\sigma_s}\right)} e\right]\right\}, \tag{4.9c}$$

the solution of which seems quite a formidable task.
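Although the simultaneous solution of (4.9a)-(4.9c) is, as noted, formidable in closed form, one simple numerical attack (our sketch under illustrative inputs, not a procedure given in the paper) is to alternate: compute the slope and intercept from (4.9a)-(4.9b) at the current $\xi$, then update $\xi$ by (4.9c), and repeat:

```python
import numpy as np

def fit_case_b(X, Y, Gamma_hat, Delta_hat, iters=200):
    """Alternating iteration on (4.9a)-(4.9c); b = sigma/sigma_s, a = s/sigma_s."""
    Gi, Di = np.linalg.inv(Gamma_hat), np.linalg.inv(Delta_hat)
    e = np.ones(len(X))
    xi = X.copy()                                  # start from the raw noise points
    for _ in range(iters):
        xbar = (e @ Di @ xi) / (e @ Di @ e)
        ybar = (e @ Di @ Y) / (e @ Di @ e)
        b = ((Y - ybar * e) @ Di @ (xi - xbar * e)
             / ((xi - xbar * e) @ Di @ (xi - xbar * e)))          # (4.9a)
        a = b * xbar - ybar                                       # (4.9b)
        xi = np.linalg.solve(Gi + b * b * Di,
                             Gi @ X + b * Di @ (Y + a * e))       # (4.9c)
    return b, a, xi

# Noiseless check: Y lies exactly on the line Y = b*xi - a*e with b = 0.5, a = 1.0.
X = np.array([-1.0, 0.0, 1.0, 2.0])
Y = 0.5 * X - 1.0
b, a, xi = fit_case_b(X, Y, np.eye(4), np.eye(4))
```

On exact data the iteration has the true $(b, a, \xi)$ as a fixed point; with sampling noise, convergence would need to be monitored.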

REFERENCES

ABRAHAMSON, I. G., AND LEVITT, H. Statistical analysis of data from experiments in human signal detection. Unpublished report, Bell Telephone Laboratories, February, 1968 (available on request).

ABRAHAMSON, I. G., LEVITT, H., AND LANDGRAF, L. Statistical estimation of parameters of decision-theory models. Paper presented at the 74th Meeting of the Acoustical Society of America, November, 1967. Journal of the Acoustical Society of America, 1967, 42, 1195. (Abstract.)

ATKINSON, R. C., BOWER, G. H., AND CROTHERS, E. J. An introduction to mathematical learning theory. New York: Wiley, 1965.

DORFMAN, D. D., AND ALF, E., JR. Maximum likelihood estimation of parameters of signal detection theory: A direct solution. Psychometrika, 1968, 33, 117-124.

FINNEY, D. J. Probit analysis. (2nd ed.) Cambridge: Cambridge University Press, 1952.

GREEN, D. M. Psychoacoustics and detection theory. Journal of the Acoustical Society of America, 1960, 32, 1189-1203.

GREEN, D. M., AND SWETS, J. A. Signal detection theory and psychophysics. New York: Wiley, 1966.

GUILFORD, J. P. Psychometric methods. New York: McGraw-Hill, 1954.

IRWIN, J. O., AND CHEESEMAN, E. A. On the maximum-likelihood method of determining dosage-response curves and approximations to the median effective dose, in cases of a quantal response. Journal of the Royal Statistical Society, 1939, Suppl., 6, 174-185.

MADANSKY, A. The fitting of straight lines when both variables are subject to error. Journal of the American Statistical Association, 1959, 54, 173-205.

MILLER, R. G. A trustworthy jackknife. Annals of Mathematical Statistics, 1964, 35, 1594-1605.

OGILVIE, J. C., AND CREELMAN, C. D. Maximum likelihood estimation of ROC curve parameters. Paper read at Eastern Psychological Association, New York, April, 1966.

PETERSON, W. W., BIRDSALL, T. G., AND FOX, W. C. The theory of signal detectability. Transactions I.R.E. Professional Group on Information Theory, 1954, PGIT-4, 171-212. Reprinted in R. D. Luce, R. R. Bush, and E. Galanter (Eds.), Readings in mathematical psychology. New York: Wiley, 1963.

RAO, C. R. Linear statistical inference and its applications. New York: Wiley, 1965.

SWETS, J. A. (Ed.) Signal detection and recognition by human observers. New York: Wiley, 1964.

SWETS, J. A., TANNER, W. P., AND BIRDSALL, T. G. Decision processes in perception. Psychological Review, 1961, 68, 301-340.

TANNER, W. P., SWETS, J. A., AND GREEN, D. M. Some general properties of the hearing mechanism. Technical Report 30, Electronic Defense Group, Department of Electrical Engineering, University of Michigan, 1956.

TREISMAN, M. T. Signal detection theory and Crozier's law: Derivation of a new sensory scaling procedure. Journal of Mathematical Psychology, 1965, 2, 205-217.

RECEIVED: March 8, 1968