COMPUTER PROGRAMS IN BIOMEDICINE 2 (1972) 111-117. NORTH-HOLLANDPUBLISHINGCOMPANY
MAXIMUM LIKELIHOOD ESTIMATION OF PARAMETERS IN MULTIEXPONENTIAL FITS WHEN DATA FOLLOW POISSON DISTRIBUTION* T. SANDOR and G.D. WILSON Harvard Medical School and Peter Bent Brigham Hospital, Boston, Massachusetts, USA and Information Design, Inc., Lexington, Massachusetts, USA A computer program is described which uses the maximum likelihood method to fit data which obey Poisson fluctuation with the following function
N = ~ A v e x p (-art). V
The program provides a simultaneous optimum estimation of the parametersA v, and a v v = 1, 2. . . . . p and their variances. For the assessment of the goodness of fit the values of X2 and o, the residual variance, are also computed. A complete discussion of the problem is given in Sandor et al., and the same methods of estimation are used in the program. Maximum likelihood
Curve fitting
Poisson distribution
1. Computational methods 1.1. Mathematical approach The study of tracer kinetic data pursued at both the Peter Bent Brigham Hospital and Harvard Medical School has indicated the need for a curve fitting method which can provide a simultaneous optimum estimation of the parameters and their variances for the following fitting function
N= ~Avex p (-art),
v = 1,2 . . . . , p .
(1)
In our application N and t denote counts and time, respectively, and the A v and the a v are u n k n o w n parameters. The value o f p depends on the nature of the particular problem being considered but it usually does not exceed 5. Since N represents counts accumulated during short time intervals from a radioactive source, it may be assumed to follow Poisson statistics. In the following we shall briefly describe the maximum likelihood procedure which was used in order to estimate the "best" values of the components of the parameter vectors A = A 1, A2, • • •, Ap and ff = a l , c ~ 2 , . . . , ap in eq. (1), provide a program listing in FORTRAN IV for the IBM 360/65 computer and finally give an example to demonstrate the performance of the program.
* This work was supported in part by grants HEl1668, HE05832, and GM01910 from the National Institutes of Health, U.S. Publick Health Service.
112
T. SANDOR and G.D. WILSON
1.2. The method* The prerequisite of the maximum likelihood method is to find the likelihood function of the problem. If we have k independent observations -ff = nl , n 2 . . . . , n k and the probability of t h e / t h observation is p(nj; A, ~), then the joint probability of the observations is
j=k L(h-;.,4, a-)= 1-I p(nj;A,a). j=l
(2)
The probability distribution of the observations, or more explicitly the probability distribution of the counts in thejth interval can be written in accordance with the Poisson distribution as exp ( - (n/))
p(n/;A, if) -
(nj)nj
(nj) r
(3)
where
t]
v=p
tj_ 1
v=l
(ni): f
~
(3a)
Avexp(--%t) dt
denotes the expectation value of the counts in the jth time interval characterized by the boundaries t/_ 1 and t/. Substituting the expression ofp(nj; A, ~) from eq. (3) into eq. (2) yields
(
j=k
(nl}nl (n2)n2 . . . (nk)nk
j--1
(nl)! ( n 2 ) ! . . . (nk)!
(4)
The likelihood equations can be obtained by differentiating the logarithm of the likelihood function with respect to each of the parameters A 1, A2, • • •, Ap and a 1 , a 2 , . . . , ap and equate them to zero. This procedure results in the system of equations
tj exp ( - a v t ) d t = o,
(5a)
j=l v = l , 2 . . . . . p. nj
_
t exp (--%t) dt = 0,
/=l
(5b)
tj-1
Eqs. (5a) and (5b) are functions of both.4 and ff through (n/) and their solutioniprovides the "best" estimate of the parameters. The maximum likelihood method also furnishes the variances of the parameters. These are the elements of the diagonal of the dispersion matrix, the inverse of which is straightforwardly given by the second derivatives oflnL with respect to the parameters. The general element of the inverted matrix (V -~ ) is * A more detailed description of the method was published in [ 1 ].
MAXIMUM LIKELIHOODIN MULTIEXPONENTIALFITS
V-'
[ ~21nL
113
(6)
( rs ) = - ~--~-~r~s] A r = ~ r ' as=~ s'
where Ar and &s are the "best values of the parameters resulting from the solution of eqs. (5a) and (5b). The dispersion matrix V can then be obtained by inverting ( V - 1 ) , that is V= ( v - l ) -1"
1.3. Numerical procedures The practical applicability of the maximum likelihood method to various biomedical problems largely depends on how efficiently the system of equations (5a) and (5b) associated with the method can be solved. These equations are nonlinear in the parameters.4 and a and an appropriate numerical technique has to be employed to obtain a solution for the parameters. A technique for the non-linear least-squares method based on the Gauss-Newton iterative procedure has been developed by Hartley [2]. We have modified this technique to be applicable to the system of non-linear equations associated with the maximum likelihood method where the data follow the Poisson distribution. The program requires an input parameter-vector that may be denoted as 0 {-'4-,~} = 0A1,0A2 . . . . . 0Ap, 0°q, 0or2. . . . . 0¢tp. This input vector is not computed by the present program. The procedure uses the first two terms of the Taylor expansion to determine the difference between the "best" values of the parameters and those from the respective approximations resulting from a given step of the iterative procedure. The method results in convergence for sufficiently good input parameter-vector. When p is small (_< 3) convergence is usually obtained even when the input vector substantially deviates from the "true" vector. When p is 4 or 5, the input vector can be more critical. Eqs. (5a) and (5b) can be contracted in the following manner with a simplifying change in notation:
_]+nj ]= 1
-0, OOi
i = 1,2 . . . . . 2p,
(5c)
where the previously def'med parameters.4 = A 1 ,A 2 . . . . . Ap and ff =Otl, ot2 . . . . . ty.p are replaced by 0- = the Taylor expansion ofeq. (5c) can be written as
01,02 . . . . . 02p. With this notation j=k /=1
h=2p {I ~])l ~2(n/) nj a(nj) a(nj) } ~ -1 + OOiaOh (nl)2 aOi ~0h Dh h=l - -
,
i,h=
1,2 .....
2p,
(7)
/=1 where Dh is the difference between the "best'~ value of the h th component of the parameter and the value of the same component for a given step in the iteration. Starting with a parameter vector 00 = {00i}, i = 1, 2 , . . . , 2p, corrections to the elements 00i are computed. These corrections are proportional to the elements of the solution D1, D 2 . . . . . D2p of eq. (7) which can be written in the matrix form
T. SANDOR and G.D. WILSON
114
?
?
l ~-AR~Y ISITIALIZE C~IPUTEq'AND s
CARDS
! t
EXPECTED VALUES
ANDOAT
1 I CONTROL PRINT PART OF CARDS AS USER'S AID
C~PUTE -~X ANDFOET
1
1
1
INITIALIZE ANDCO~IPUTE CONSTANTSAND COEFFICIENTS
A's AND D ' s
TE~TION?
,
l IC~POTEAND[ PRINT a, Xz PA~NETERS
I
?
SET?
r--)
Fig. 1. Cll
C12
•..
C1 h
""
C1 2p
D1
C1
C21
C22
•..
Czh
..
C22 p
D2
C2
Cil
Ci2
...
Cih
..
Ci2 p
Di
C,.
C2p 1
C2p 2
"'"
C2ph
..
C2p2p
D2p
_C2vJ
(8)
where
]=1
OOiaOh
(n])2 00i
~Oh
(9)
and
ci--- ]=k ~
I-- 1 + n~n]l O(n])oOi
(10)
]=1 Once the
Di, i =
1, 2 , . . . ,
2p have been determined, a new parameter-vector 10 is defined by the equations
lOi = oOi + vDi,
i = 1, 2 , . . . ,
2p
(11)
where v is a proportionality constant which can assume values between - 1 and 1. Upon replacing 0 0 in the above Gauss-Newton equations by the derived parameter-vector 10, corrections to the elements lOi canbe calculated. By repetition of the entire procedure successive approximations 2 0, 3 0, . . . , can be obtained•
MAXIMUM LIKELIHOODIN MULTIEXPONENTIALFITS
115
2. Program structure The main steps of the program are shown in the flow chart. Explanation for certain steps in the flow chart is given in the following. INITIALIZE AND COMPUTE CONSTANT AND COEFFICIENTS. The coefficients are defined by eq. (9) and the constants by eq. (10). The subroutine involved in the initialization is CINIT. The computation of the coefficients and constants in eqs. (9) and (10), which are denoted as C(K,L) and CON(L) in the program, uses the expressions: FA1, FA2, AF(K,L), FF(K) and FF(L). These are related to eqs. (9) and (10) as indicated below FA1 = - 1 + Y(I)/FU(I), FA2 = Y(I)/[FU(I)] 2, where Y(I) = n/. is the jth observation, FU(I) = (n/.), the expected (or calculated) values [eq. (3a)]. The FU(I)'s are computed in the subroutine EXV, which in turn utilizes subroutine COE to generate the appropriate exponential terms in eq. (3a). The AF(K,L) are related to the second partial derivatives of (n/) which are computed in the subroutine SPDEV. FF(K) and FF(L) are the appropriate first partial derivatives of (n/) computed in the subroutine FDEV. SOLVE SYSTEM OF EQUATIONS. When the constants are computed the system of simultaneous linear equations given by eqs. (8) can be solved. This is done by calling subroutine SIMQ that is available from the IBM Scientific Subroutine Package. The solution of eqs. (8) provides D i (i = 1,2 . . . . ,2p) which are the corrections for the respective parameters as specified by eq. (11). INITIALIZE Q-ARRAY AND COMPUTE Q's. The function Q is defined as Q(0-) = J~_..j '='L'k j=l
,,j in%>
i k -
i=1
It is initialized by calling subroutine QINIT. The values of this function are computed for v-values ranging from - 1 to +1 in 0.2 increments [v is defined in eq. (11)]. The procedure is performed by using do-loops and calling subroutines QCAL and COMP. COMPUTE Q-MAX AND FOPT. The v-value for which the value of the function Q is the largest is called FOPT in the program. COMPUTE NEW A's AND B's (B = a) When FOPT is found the new improved parameters are determined in accordance with eq. (11). LAST ITERATION? The number of iterations is set by the constant NIMAX. If NO, the current number of iterations is less than NIMAX the program returns to the point indicated in the flow chart. If YES, then the next main step is executed.
116
T. SANDOR and G.D. WILSON
COMPUTE EXPECTED VALUES. The values of FU(I) are computed by using the A's and B's determined in the last iterating cycle. CALCULATE AND PRINT X2, o, AND PARAMETERS. To help the investigator assessing the fit, the residual variance, defined as
O'=
k - 2p ] = 1 [n/. (meas.) - n/. (calc.)] 2
and X2 defined as In/. (meas.) - n! (calc.)] 2 / r/j (calc.)
X2 = i=1
are determined for the best iterative cycle by using subroutines SIGMA and CHI. The variances of the parameters are determined in accordance with eq. (6) by using subroutine MINV that is available from the IBM Scientific Subroutine Package. At the end of the printout in addition to the parameters and their (+) errors the respective half lives (TH) defined as (ln2)/B are also printed. It may happen that convergence is not achieved within the number of cycles specified by NIMAX. For such cases the values of several or all variances are negative. This is indicated by the word "imaginary" at the place of the error. It can also happen that the input-value of the parameters are so much different from the "true" values that the computation diverges. A typical sign of this case is that after a few cycles one of the FU values becomes negative. When this occurs, further execution of the program is stopped except for providing a printout of the measured and calculated data as well as the parameters. LAST DATA SET? The program has the capability of evaluating several independent data sets, the number of which is specified by a constant LL.
3. Sample run
3.1. Sample input This program does not provide the approximate values of the parameters A and B which are needed to begin the iterative procedure. These input data are to by supplied by the user. The user must specify the number of exponential terms in the fitting function. This is achieved by setting the dimension values of the following arrays F, ML, M, C, D and FF both in the main program and in the respective subroutines twice as large as the number of exponentials IP; for a three-exponential case, for instance F(6,6), ML(6), M(6), C(6,6), D(6), FF(6). The dimensioning in the above cases must be exact. The other arrays that have to be dimensioned are: Q, T, Y, FU, A, B, FA, E, EE, AF, and CON. These arrays may have dimension values larger than necessary; this feature serves the convenience of the user when he wants to use the program for data sets of various length. Further quantities to be specified are IP, the number of exponential terms, LL, the number of data sets to be fitted, NIMAX, the number of cycles in the iteration,
MAXIMUM LIKELIHOOD IN MULTIEXPONENTIALFITS IQN, AQN,
117
the number of Q-values to be computed which are _> 0; it also controls the number of Q-values which are < 0, this is used in defining QINT, the interval length for the computation of Q, which is given by QINT = 1/AQN.
3.2. Sample output The output has several features which have been introduced only to help the user assessing the convergence. For this reason the following quantities are printed after each cycle in the iteration: The values of A, C, CON, and FOPT. After the completion of the computation, the measured values of the observations T and Y, and the calculated (expected) values of Y denoted as FU are given in tabular form. Finally the values of the parameters and their errors are shown.
4. Hardware requirements The program is written in double precision FORTRAN IV for the IBM 360/65 at the Harvard Computing Center, Cambridge, Mass.
5. Availability of the program A listing of the program is available at cost from Information Design, Inc., P.O. Box 340, Lexington, Mass. 02173.
Acknowledgements The authors are grateful to Drs. Margaret F. Conroy and Norman K. Hollenberg for valuable discussions and critical remarks.
References [1] T. Sandor, M.F. Conroy and N.K. HoUenberg,Mathematical Biosciences9 (1970) 149. [2] H.O. Hartley, Technometrics 3 (1961) 269.