A note on fitting multiexponential functions of time to experimental data


MATHEMATICAL BIOSCIENCES

M. W. SIMPSON-MORGAN
Department of Experimental Pathology, John Curtin School of Medical Research, Australian National University, Canberra, Australia

Communicated by Richard Bellman

ABSTRACT

A method based on the theory of difference equations, which was described recently for fitting multiexponential functions of time to experimental data, is shown to be unsuitable for experimental data with inherent error variation. While it is substantially correct in theory and works with exact values of such functions, the criterion given in the method for determining the number of exponentials for data with inherent error is incorrect and the number of exponentials cannot be so determined. Even when the number of exponentials is assumed correctly, the method will not work for points lying on a known function when the values of these points have been rounded to a relative accuracy greater than could be expected for data from most biological experiments.

INTRODUCTION

Parsons [1] recently described a method for fitting multiexponential functions of time to experimental data. Although this method is essentially correct in theory and can easily be shown to work with exact values of the sums of exponentials at equally spaced intervals of time, it fails when used with rounded values of these sums, even when the precision of these rounded values is better, by an order of magnitude or more, than can be expected for most biological data.

The method fails for two reasons. First, the number of exponentials cannot be determined as outlined in [1], as there is no "suitable criterion" for the determinant D(p + 1, n) being zero. The inequality given by Parsons [1] as being a suitable criterion takes into consideration the error of only one element of the determinant in estimating the likely error of its calculated value, whereas the actual error is compounded from the errors of all the elements and, besides being difficult to estimate, would be bigger than indicated by the suitable criterion. Second, even if the number of exponentials is known, the successful application of the method requires data with a precision seldom attainable in biological experiments.

Mathematical Biosciences 5 (1969), 195-199. Copyright © 1969 by American Elsevier Publishing Company, Inc.

EXAMPLE

Using the notation in [1] throughout, consider

    r(t) = A1 exp(-a1 t) + A2 exp(-a2 t),

where A1 = A2 = 0.5, a1 = 0.02, and a2 = 0.2.

TABLE I
RESULTS OF CALCULATIONS USING VALUES OF r(t) ROUNDED TO FOUR DECIMAL PLACES (a)

                                                                        T = 1              T = 2
 n   D(2, n)          D(3, n)         Error (b)       (2)/[(1) × (3)]   a1       a2        a1       a2
     (1)              (2)             (3)             (c)
 1   5.19 × 10^-3     3.7 × 10^-7     4.0 × 10^-5     [...]             0.0179   0.196     0.0208   0.201
 2   4.[..] × 10^-3   9.4 × 10^-7     4.0 × 10^-5     [...]             0.0114   0.186     0.0222   0.206
 3   3.[..] × 10^-3   4.8 × 10^-7     0.2 × 10^-5     [...]             0.0291   0.225     0.0196   0.199
 4   2.67 × 10^-3     4.5 × 10^-7     2.0 × 10^-5     [...]             0.0182   0.194     0.0186   0.195
 5   2.16 × 10^-3     13.6 × 10^-7    1.5 × 10^-5     42.0              0.0208   0.201     0.0197   0.199
 6   1.85 × 10^-3     12.4 × 10^-7    3.0 × 10^-5     22.4              0.0106   0.172     0.0193   0.198
 7   1.23 × 10^-3     4.1 × 10^-7     3.4 × 10^-5     8.3               0.0291   0.258

 Mean values                                                            0.0190   0.205     0.0200   0.200
 Calculated from mean values of X1 and X2                               0.0196   0.205     0.0200   0.200
 Calculated from mean values of c1 and c2                               0.0204   0.204     0.0201   0.200

(a) The values of D(2, n) and D(3, n) were calculated using consecutive values of r(t), and values of a1 and a2 using consecutive values of r(t) (T = 1), or alternate values of r(t) (T = 2).
(b) Rounding error in r_{n+4}.
(c) Values in this column greater than 1 indicate that, according to Parsons' suitable criterion [1], D(3, n) is nonzero.
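The determinants in Table I, and the point that the error of a determinant is compounded from the errors of all its elements, can be explored with a short Python sketch. It rests on assumptions that are not confirmed by the text: D(2, n) and D(3, n) are taken to be the determinants built from consecutive values r_n, r_{n+1}, ... (which is consistent with footnote (b) and with the ratio column of Table I), and `error_bounds` is a simple first-order worst-case estimate introduced here for illustration, not a formula from this note or from [1]. The names `det2`, `det3`, `error_bounds`, and `DELTA` are hypothetical.

```python
import math

A1 = A2 = 0.5         # amplitudes of the example function
a1, a2 = 0.02, 0.2    # rate constants of the example function


def r(t):
    """Exact value of r(t) = A1 exp(-a1 t) + A2 exp(-a2 t)."""
    return A1 * math.exp(-a1 * t) + A2 * math.exp(-a2 * t)


# Values at t = 1, ..., 12, keyed by t: exact, and rounded to four decimals.
exact = {t: r(t) for t in range(1, 13)}
rounded = {t: round(value, 4) for t, value in exact.items()}


def det2(v, n):
    """D(2, n), assumed to be det [[r_n, r_n+1], [r_n+1, r_n+2]]."""
    return v[n] * v[n + 2] - v[n + 1] ** 2


def det3(v, n):
    """D(3, n), assumed to be the 3 x 3 determinant built from r_n ... r_n+4."""
    a, b, c, d, e = (v[n + k] for k in range(5))
    return a * c * e - a * d * d - b * b * e + 2 * b * c * d - c ** 3


def error_bounds(v, n, delta):
    """First-order error bounds on D(3, n) if each value may be off by delta.

    Returns (single, full): the bound from the error of r_n+4 alone, and a
    worst-case bound from the errors of all five values (illustrative only).
    """
    a, b, c, d, e = (v[n + k] for k in range(5))
    # Partial derivatives of D(3, n) with respect to r_n, ..., r_n+4.
    grad = [
        c * e - d * d,
        2 * (c * d - b * e),
        a * e + 2 * b * d - 3 * c * c,
        2 * (b * c - a * d),
        a * c - b * b,          # equals D(2, n)
    ]
    single = abs(grad[4]) * delta
    full = sum(abs(g) for g in grad) * delta
    return single, full


DELTA = 0.5e-4  # largest possible rounding error at four decimal places
for n in range(1, 8):
    single, full = error_bounds(rounded, n, DELTA)
    print(n, f"{det2(rounded, n):.2e}", f"{det3(rounded, n):.1e}",
          f"{single:.1e}", f"{full:.1e}")
```

Under these assumptions the rounded data give D(2, 1) near 5.19 × 10^-3 and D(3, 1) near 3.8 × 10^-7, in line with the first row of Table I. For n = 1 the calculated D(3, 1) exceeds the single-element bound but lies well inside the all-element bound, which illustrates why a test based on one element can report a nonzero determinant for data that do lie on a sum of two exponentials.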

Values of r(t) were calculated, accurate to seven decimal places, for t = 1, 2, 3, ..., 12, and were then rounded off to four, three, or two decimal places. These rounded values were then treated as experimental data according to [1]. The results of calculations using values of r(t) rounded to four decimal places are presented in Table I. All determinants D(2, n) and D(3, n), n = 1, 2, 3, ..., 7, were nonzero, as determined by the suitable criterion in [1], which should indicate that the values of r(t) do not lie on a curve representing the sum of two exponential functions of time. Nevertheless, since the maximum rounding error in any of the values of r(t) used was less than 0.01%, the values of a1 and a2 were calculated assuming that there were two exponentials. It can be seen from Table I that values of a1 and a2 calculated from successive sets of simultaneous equations (7) in [1] for n = 1, 2, 3, ..., 7 varied appreciably. However, the mean values of a1 and a2, and values of a1 and a2 calculated from mean values of X1 and X2, or from mean values of c1 and c2, were similar and reasonably close to the true values of 0.02 and 0.2. When alternate values of r(t) were used in the calculations, that is, making T = 2, all determinants D(2, n) and D(3, n) were again nonzero, but the values of a1 and a2 calculated from successive sets of simultaneous equations varied less, and the mean values of a1 and a2, and the values of a1 and a2 calculated from mean values of X1 and X2 or c1 and c2, were nearly exactly correct. The results of these calculations are also given in Table I.

For values of r(t) rounded to three decimal places (maximum rounding error [...]), values of c1 and c2 varied considerably and many led to values of X1 greater than 1, suggesting an exponential that increases with time. Moreover, the mean value of X1 and the value of X1 calculated from mean values of c1 and c2 were both greater than 1. When alternate values of r(t) were used to calculate a1 and a2, only one calculated value of X1 was greater than 1. Values calculated for a1 and a2 varied considerably, but the mean of all values of a1 and a2, and the values of a1 and a2 calculated from the means of all values of X1 and X2, or c1 and c2, were closer to the true values of a1 and a2. When the values of c1 and c2 that led to the value of X1 greater than 1, and the corresponding values of X1 and X2, and a1 and a2, were omitted in calculating the various means, the different estimates of a1 and a2 were closer to each other, but not much closer to the true values of these constants. The results of these calculations are presented in Table III.

TABLE II
ATTEMPTS TO CALCULATE a1 AND a2 FROM CONSECUTIVE VALUES OF r(t) ROUNDED TO THREE DECIMAL PLACES (a)

 n       c1       c2       X1          X2
 1       2.041    1.023    1.158       0.884
 2       1.664    0.679    0.949       0.715
 3       1.934    0.927    1.056       0.878
 4       1.675    0.688    0.975       0.717
 5       1.813    0.816    0.983       0.830
 6       1.753    0.759    1.499       0.253
 7       2.098    1.085    1.171       0.926

 Mean    1.854    0.854    1.110 (b)   0.743 (b)
                           1.001 (c)   0.852 (c)

(a) All figures in this and Table III were calculated accurate to seven decimal places but have been rounded for convenience of presentation.
(b) Mean values of X1 and X2.
(c) Values of X1 and X2 calculated from mean values of c1 and c2.

TABLE III
CALCULATION OF a1 AND a2 FROM ALTERNATE VALUES OF r(t) ROUNDED TO THREE DECIMAL PLACES

 n    c1       c2       X1       X2       a1         a2
 1    1.633    0.646    0.964    0.670    0.0186     0.200
 2    1.598    0.616    0.950    0.648    0.0254     0.217
 3    1.571    0.592    0.943    0.628    0.0293     0.233
 4    1.633    0.646    0.961    0.672    0.0199     0.199
 5    1.756    0.754    1.006    0.750    -0.0030    0.129
 6    1.674    0.682    0.973    0.701    0.0139     0.178

 Mean values                                 0.0173     0.192
 Calculated from mean values of X1 and X2    0.0172     0.194
 Calculated from mean values of c1 and c2    0.0187     0.192

When values of r(t) rounded to two decimal places (maximum rounding error < 1.0%) were used in the calculations, all determinants D(2, n) using consecutive or alternate values of r(t) were effectively zero. All attempts to calculate a1 and a2 as earlier, or even using every third value of r(t) (T = 3), failed. Moreover, all other attempts to use this method of analysis for sums of exponentials accurate to only two decimal places have failed, even when multiple regression techniques were used to calculate the constants c from a large number of simulated experimental points.
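The calculation of a1 and a2 from successive sets of four values can be sketched in Python as follows. This is a minimal illustration under assumptions not confirmed by the text: the difference equation is taken as r_{n+2T} = c1 r_{n+T} - c2 r_n, X1 and X2 as the roots of X^2 - c1 X + c2 = 0, and a_i = -ln(X_i)/T. That convention agrees with the tabulated values (for example, X1 + X2 equals c1 in Tables II and III), but it is not guaranteed to be the exact form of Eq. (7) in [1]. The function name `estimate_rates` is hypothetical.

```python
import math


def r(t, A1=0.5, A2=0.5, a1=0.02, a2=0.2):
    """Exact value of the example function r(t)."""
    return A1 * math.exp(-a1 * t) + A2 * math.exp(-a2 * t)


def estimate_rates(v, n, T=1):
    """Estimate (a1, a2) from v[n], v[n+T], v[n+2T], v[n+3T].

    Returns None when the 2 x 2 system is singular, when the roots are
    complex, or when a root is not positive (no real logarithm).
    """
    r0, r1, r2, r3 = (v[n + k * T] for k in range(4))
    det = r0 * r2 - r1 * r1              # D(2, n) for spacing T
    if det == 0:
        return None
    # Solve r2 = c1 r1 - c2 r0 and r3 = c1 r2 - c2 r1 by Cramer's rule.
    c1 = (r0 * r3 - r1 * r2) / det
    c2 = (r1 * r3 - r2 * r2) / det
    disc = c1 * c1 - 4 * c2
    if disc < 0:
        return None
    X1 = (c1 + math.sqrt(disc)) / 2      # larger root, slower exponential
    X2 = (c1 - math.sqrt(disc)) / 2      # smaller root, faster exponential
    if X1 <= 0 or X2 <= 0:
        return None
    return -math.log(X1) / T, -math.log(X2) / T


exact = {t: r(t) for t in range(1, 13)}
four_dp = {t: round(value, 4) for t, value in exact.items()}

print(estimate_rates(exact, 1))          # exact data
for n in range(1, 8):
    print(n, estimate_rates(four_dp, n)) # data rounded to four decimals
```

A negative estimate of a1 corresponds to a root X1 greater than 1, the situation described above for data rounded to three decimal places. The same function can be applied to values rounded to three or two decimal places, or with T = 2 or T = 3, to explore the behaviour reported in this example.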

DISCUSSION

At first sight the method of analysis presented by Parsons [1] is attractive for its simplicity, and as the inequality following Eq. (6) in that article should be [...], the method theoretically could be applied to functions that increase with time and tend asymptotically to an unknown constant value. However, it is not possible to determine the number of exponentials necessary to fit experimental points using the criterion suggested by Parsons [1]. The example presented in this article shows that with simulated experimental points that are more precise than could be hoped for in practice, the criterion used to determine the number of exponentials does not work, but that if the number of exponentials can be determined by some other method, reasonable estimates of the exponents can be calculated by various averaging procedures. Unfortunately, however, errors of data from biological experiments are seldom likely to be less than ± 1%, and under these conditions the method of analysis fails, even when the number of exponentials is known. Thus any workers considering using this method of analysis should determine whether their results are sufficiently precise to allow its use, particularly before they commit time to the development of computer programs for its implementation or of a satisfactory way to determine the number of exponentials.

Even when this method of analysis might be applicable, it should be pointed out also that there are many biological situations where the constants A in the function r(t) are not all positive, so that it is possible to have points of inflection in the function. In these circumstances, although the individual exponentials are all changing most rapidly at earlier points of time, their sum is not necessarily doing so; moreover, earlier values are not necessarily the greatest in magnitude.
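The remark about constants A of mixed sign can be illustrated with a small Python sketch. The amplitudes A1 = 1 and A2 = -1 are hypothetical values chosen here for illustration and do not appear in the note; the rate constants are those of the example. Setting the first or second derivative of r(t) = A1 exp(-a1 t) + A2 exp(-a2 t) to zero gives exp((a2 - a1) t) = -A2 a2^k / (A1 a1^k) with k = 1 or 2, which has a real solution only when A1 and A2 differ in sign. The function name `stationary_time` is hypothetical.

```python
import math


def r(t, A1, a1, A2, a2):
    """Sum of two exponentials with arbitrary amplitudes."""
    return A1 * math.exp(-a1 * t) + A2 * math.exp(-a2 * t)


def stationary_time(A1, a1, A2, a2, order=1):
    """Time at which the derivative of the given order vanishes, or None.

    order=1 gives the extremum of r(t); order=2 gives its point of inflection.
    The result may be negative, that is, before the start of the experiment.
    """
    if a1 == a2:
        return None
    ratio = -(A2 * a2 ** order) / (A1 * a1 ** order)
    if ratio <= 0:
        return None                      # amplitudes of the same sign
    return math.log(ratio) / (a2 - a1)


# Hypothetical amplitudes of opposite sign, rate constants from the example.
params = dict(A1=1.0, a1=0.02, A2=-1.0, a2=0.2)
t_peak = stationary_time(order=1, **params)
t_inflection = stationary_time(order=2, **params)
print(t_peak, t_inflection, r(1, **params), r(t_peak, **params))
```

With these hypothetical amplitudes the function rises to a maximum well after t = 1 and has a point of inflection later still, so the earliest values are neither the largest nor the most rapidly changing. For the example with A1 = A2 = 0.5 the function returns None for both orders.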
Prudence would seem to dictate that rather than terminate an experiment early, as suggested by Parsons [1], with the consequent risk of incorrectly analyzing the data, it should be continued for sufficient time to allow such possible errors to become apparent. For many experiments, this will entail continuing the experiment until only the most slowly decaying exponential remains.

REFERENCE

1 D. H. Parsons, Math. Biosci. 2 (1968), 123-128.

Mathematical Biosciences 5 (1969), 195-199