COMPUTERS
AND
BIOMEDICAL
Generalized
6,588-595 (1973)
RESEARCH
Analysis
of Quanta1
Dose-Response
Curves*
RUDOLPH H. DE .JONC Departments of Anesthesiology and Pharmacology and the Anesthesia Research Center, lJninir:ersityof Washington School of Medicine, Seattle, Washington 98195 Received March 26, 1973 Statistical advances now have broadened the scope of quanta1 dose-response analysis. Currently used probit and logit techniques suffer from two drawbacks, of which their inability to handle 0 % and 100 % responses is particularly troublesome. As a consequence, observations must be bunched into dose groups. The present method generalizes quanta1 dose analysis by treating observations as individual yes-or-no events. Grouping thus is no longer essential. Further advantages are that the dose-response curve is defined in terms of the salient parameter, the EDscr (median dose), so yielding its magnitude and variance directly.
1 NTRODUCTION
When an organism responds in a yes-or-no fashion to a stimulus the response is said to be quantal. Examples of biological quanta1 responses are dead-alive, sick-cured, awake-anesthetized, and so on. As many drug studies deal with this type of response, quanta1 dose-response relations in particular have been explored extensively. When the response of a group of subjects to various drug doses is studied, a dose-response relation is obtained that can be represented graphically. It has been found empirically that plotting the logarithm of the drug dose versus the percentage of subjects responding yields a smooth sigmoid curve which rises rather steeply in its center portion and changes but little at its lower and upper tails (Fig. I ). To approach such a dose-response curve statistically, one must find a mathematical model that matches the main features of the experimental observations. Two models provide this match: the cumulative normal distribution (area under the normal probability density curve) and the logistic function. By suitable transformation, either curve can be shaped into a linear function that greatly simplifies its statistical description. As the quanta1 dose-response curve matches the curves, so it matches their linear transformations. Quanta1 data thus can be treated (with suitable weighting factors) with the techniques of least-squares linear fitting. * Work supported by Research Career Development Award 5K3-GM28168-07 thesia Research Center Grant No. GM1 5991. 588 Copyright 0 1973 by Academic Press, Inc. All rights ofreproduction Printed in Great Britain
in any form
reserved.
and by Anes-
589
DOSE-RESPONSE CURVE
I
5
10
1.1
I IO
\2 5
I
/
12 /
158
/
I
13 14 Log dose / I
20
25
I
I
15
1.6
I
31.6--z
DOSt?
FIG. 1. Representative dose-response curve for an inhalation anesthetic. Dose (in Torr) is shown in arithmetic as well as logarithmic units. The percentage of animals not responding to a noxious stimulus is shown on the vertical axis.
Probits-transforms of proportions scaled to the normal distribution functionand logits-transforms of proportions scaled to the logistic function-permit the previously mentioned linear operations and have been used for many years. But, probit and logit analysis suffer from two major drawbacks, neither one having been solved adequately till recently. The purpose of this paper is to introduce nongrouped dose-response analysis, and a FORTRAN program to implement it.
GENERAL CONSIDERATIONS
The shortcomings of probit and logit analysis arise from a common factor, namely that 0% and 100% responses correspond respectively to minus and plus infinity transforms-thus are essentially unmanageable. Several ways around this problem have been developed such as the “working probit” (I), the “rule of 2n” (2), and graphical methods (3). The second restriction, an immediate consequence of the preceding, is the need to gather subjects into equi-dose groups so that the proportion of subjects responding can be calculated. While no unsurmountable restriction in a controlled environment, grouping is an impediment to clinical studies where the dose must be tailored to the patient’s needs. Grouping also introduces a subtle computational bias, for it obscures the fundamental yes-no nature of an individual’s response. A further, though less formidable, problem is that the location and direction parameters of the logit or probit line (i.e., intercept and slope) are not the units of primary interest. It is the median dose (EDSo : dose producing a response in half the subjects) that is wanted. Again, as a consequence of being derived from roundabout parameters, computation of the variance of the ED,, can be a considerable and not altogether satisfactory task.
590
DEJONG
One way around the grouping problem is to treat each experimental subject as an individual unit that does or does not respond. Nongrouped single quanta1 analysisthus getsaround the two major drawbacks of probit or logit transformation in one stroke. In fact, it avoids linear transformation altogether. The theoretical foundations for this approach have long been known (seeGarwood (4) for instance), but solution of the nonanalytical joint distribution function did not becomepractical till the advent of high speedcomputers. Soon thereafter. nongrouping in a logistic model framework began to be applied to biomedical problems, with multivariate analysis of the coronary heart diseasestudy (5) being one landmark. Recently Waud (6) adapted these techniques to dose-response relations by parametrizing in terms of the ED,o and slope. Responsesin the usual assay procedure can now be coded simply as either one (“yes”) or zero (“no”). It should be immediately apparent that this treatment frees the experimenter from the bonds of grouping, for every subject is treated as an individual responseunit reacting to a specific drug dose. While grouping, of course, still is permissible, it is no longer essential.A further advantage of Waud’s method is the direct computation of the median doseand its variance. For completeness’ sake, it should be noted that these procedures-the present one included-are open-ended, thus do not provide for a “stop point.” While the estimate of the ED,o probably will not be affected much, the variance estimate may be degraded by sequential methods of experimentation. That is, sample size and doselevels preferably would have been fixed prior to the dose-responseassay. THEORY
The problem at hand is one of fitting a suitable two-parameter function to binomial variates having a value of either 0 or 1. The parameters are derived by maximizing the joint density.function--the product of individual binomial probabilities of occurrence (7). To facilitate computation, the logarithm of the product-a sum of terms-is maximized by term-by-term differentiation. The sum of partial derivatives so obtained is set to zero, and the two equations in two unknowns solved for the parameters (#). The resultant two simultaneous normal equations have no exact analytical solution. Instead, they are approximated by a first-order Taylor expansion about initial estimatesof the parameters. The approximations are then refined recursively until the changefrom one estimateto the next becomesnegligible. The solution involves a rather cumbersome matrix of second order partial derivatives well suited to computer evaluation. A bonus of expansion is that this Hessian is the inverse variance-covariance matrix; variances and standard errors of the estimatesthus can be computed directly. Elements of the matrix (seeWaud (6) for details) can be expressedin terms of observed and expected probabilities. The former, of course, are the actual experimentally observed findings. The latter,
DOSE-RESPONSE
591
CURVE
the expected probabilities, are derived from the mathematical model chosen to represent the dose-response relationship, thus are values one would predict when substituting the appropriate dose in the model function. As mentioned, two models-cumulative probability and logistic-are generally used. Since, for practical purposes, the two yield essentially identical curves, one might prudently choose that model that most simplifies the computational procedure. This model, as shown by Walker and Duncan (5), for instance, is the logistic function. (Garwood (4) has detailed the recursive solution of the normal equations using the cumulative probability function as a model.) The logistic function, with location parameter CIand scale parameter (slope) fl is represented by p?x (1)
p=m’
where p is the probability of an event occurring within limits of 0 and 1 and x the drug dose. Designating by q the probability of an event not occurring, we obtain q = 1 -p. Substituting into (l), we obtain
q=-L. ea+@I+ 1 Neither p nor q look very manageable at this stage; their ratio p/q = ea+5x,
however, is a considerable simplification, logarithm of (3),
particularly
ln(p/q) = CI+ fix.
13)
on obtaining
the natural (4)
The (natural) log of the ratio of subjects responding and subjects not responding is called the logit logit (P>= In (p/q) = a + fix.
(5)
Logits thus should lie on a straight line with slope B and logit-axis intercept of a. The usual regression methods estimate the parameters CIand p of the line which, by interpolation, yields an estimate of the EDsO. Whether solving the normal logit equations by minimizing the error sum of squares (2) or by maximizing the joint probability (7), one is still left with two parameters neither one of which is of immediate use. Reformulating the logistic function in terms of dose (or concentration) and median dose (EDsO) yields an expression in parameters of much more immediate interest. Representing the (unknown) median dose or concentration by 44, and the
592 (known) dose or concentration into (I),
DE JONG
by D, where D = exp(dose), yields, by substitution
where B = dp/dD is the slope of the function. Using (6) as the model, one can now solve the normal equations in terms of median dose M and slope @.As mentioned. the variance and standard error of these estimatesis obtained as an immediate byproduct of the procedure. INITIAL ESTIMATES The first-order Taylor approximation to the tangent plane of the likelihood function is a poor estimate except in the immediate neighborhood of the exact values of the parameters. Nonconvergence, where the estimates get farther and farther removed from the true values, is common. Reasonably close estimates of the median dose and slope of the curve thus are a necessaryprerequisite, short of venturing into second or higher order expansions. Adequate estimates of the ED50 can be obtained by eye from a provisional hand-drawn dose-response curve; and equating the slope to I suffices for the majority of instances. A more convenient method of estimating both parameters prior to expansion was adopted for the present program. A standard logit-log dose least-squaresregression is performed on the data by subroutine APPROX. As the responsesare input as 0 or l-respectively representing logits of minus and plus infinity-it became necessaryto find alternative expressions.The “rule of 2n” proposed by Berkson (2) certainly is the simplest and more than adequatefor the present purpose. With n representing the total number of observations, the expressionsbecome logit (1) cm= In (2n -~ 1) and logit (0) =y-In (2n --- I ). Regressionthen proceeds to yield the parameters of this logit line from which the median dose(p = 4 = 0.5) iscomputed. The slopep of the provisional line is obtained from the regressionand passedto the main program along with the estimate of the ED,,. CODING This program was originally developed as an interactive routine for a Digital Equipment Corporation PDP-15 user terminal. To broaden the application to batch-processing, simplification was required. FORTRAN was chosen since it is available on the majority of large installations and commonly also for mini-
DOSE-RESPONSE
593
CURVE
computers. Extensive error checking was retained to aid the user in tracking down problems. Output comprises a table of successive estimates of ED,, and slope, the first pass representing approximations returned by a subroutine performing ordinary
. EDso f SE. . SLOPE ? SE. .OEGREES FREEDOM . LOGIT EON. ~---
--_
AN
’
BUILD UP MATRIX
FIG. 2. Flowchart for DOSRES program.
logit regression. When ED=,,, and slope estimatesdiffer by lessthan 0.1% from the preceding values, convergence is considered adequate, and the final values with their standard errors printed (Fig. 2). The degreesof freedom are printed next to permit calculation of the 95 % confidence range of the estimates using the corresponding t-statistic. Finally, the logit equation is output so that dose values other than the ED,, (ED95 for instance) can be calculated, and the dose-responsecurve plotted. The program has been checked in-house and against results published by others. The data in Table 1 for instance have been used by several authors as a severe test since four of the sevendose levels yield 100% mortality. The present program computed ED,, = 0.1492 and slope = 2.126 with respective standard errors of 0.0400 and 0.777; as against EDJo = 0.149, slope = 2.1263 and respective standard errors of 0.0402 and 0.771 obtained by Waud (6), who terminated processingwhen
594
DE
JONG
TABLE I DEATHS FROM Dose (mg)
0.0625 0.125 0.25 0.5 1.o 2.0 4.0
BACER~AL
TCAWN”
Sample size (n)
Deaths (Y)
5 5 5 5 5 5 5
1 3 3 5 5 5 5
Fraction dead (r/n) 0.2
0.4 0.h 1.o I.o I .o I.0 I___.___
0 Data cited by Waud (6) and Garwood (4).
the parameters changed lessthan 1%. (Waud’s data were reproduced exactly when cut-off was reduced from the present 0.1% to lx,.) INPUT
Assuming that most userswould use punched cards as a batch-processor input medium, the following is aimed at them. (An interactive FORTRAN program using teletype and punched paper tape as alternate input media is available as DECUS Library program # 15-62.) A program listing will be sent on request-note that the comment lines may be omitted for the actual run. Each observation, comprising a subject’s responseto one dose, is punched on one card in (11, F12.0) format. The response, being 0 or 1, is entered in integer format (11)in column 1. Columns 2-13 are for the floating point value of the dosein (F12.0) format. The input format statement on line 910 may be altered to accommodate additional response-dosepairs on one card. The logical input unit is assumedto be 4. If it is not, the READ statement on line 40 must be altered accordingly. The logical output unit is assumedto be 5. If it is not, change the WRITE statementsto the line printer’s unit number. End-of-data is conveniently signaled by a blank card (a zero dose would be biologically meaningless).The program checks the input data for errors : a response other than 0 or I, or a negative dose, will be flagged with an error message,and the card ignored. Input exceeding 100 responsedose pairs (not counting the sentinel card) is flagged as buffer overflow, with processing terminated by program exit. More than 100 data pairs can be accommodated by redimensioning the blank common variables RES and DOS, and resetting the counter (K) in statement 50 accordingly. Actual processing of sums, products, and logarithms is in double precision to minimize rounding and truncation errors of matrix inversion. Should a solution (i.e., EDso and slope essentially stationary) not be reached in 25 passes,the program
DOSE-RESPONSE CURVE
595
terminates with “HOPELESS! NO SOLUTION.” A zero determinant also is a terminal error. Negative variance is a nonterminal error but the standard error cannot be computed of course, as shown by “VARIANCE < 0; NO S.E.” REFERENCES 1. FINNEY, D. J. “Probit Analysis,” 2nd ed. Cambridge Univ. Press, London/New York, 1964. 2. BERKSON, J. A statistically precise and relatively simple method of estimating the bio-assay with quanta1 response, based on the logistic function. J. Amer. Statist. Assn. 4&M-599 (1953). 3. LITCHFIELD, J. T., AND WILCOXON, F. A simplified method of dose-effect experiment. J. Pharmacol. Exp. Ther. %,99-113 (1949). 4. GARWOOD, F. The application of maximum likelihood to dosage-mortality curves. Biometrika 32,46-58 (1942). 5. WALKER, S. H., AND DUNCAN, D. B. Estimation of the probability of an event as a function of several independent variables. Biometrika 54,167-179 (1967). 6. WAUD, D. R. On biological assays involving quanta1 responses. J. Pharmacol. Exp. Ther. 183, 577-607 (1972). 7. SILVERSTONE, H. Estimating the logistic curve. J. Amer. Statist. Assn. 52,567-577 (1957).