Some applications of a digital computer program to estimate biological parameters by non-linear regression analysis


BIOCHIMICA ET BIOPHYSICA ACTA


BBA 26738


GORDON L. ATKINS
Department of Biochemistry, University of Edinburgh Medical School, Teviot Place, Edinburgh, EH8 9AG (Great Britain)
(Received June 24th, 1971)

SUMMARY

1. A digital computer program for non-linear regression analysis has been used to fit a wide range of biological functions directly to experimental data.
2. The applications included a sine curve, the Michaelis-Menten equation, oxygen binding to haemoglobin, the kinetics of allosteric enzymes and power functions.
3. A sine curve has also been fitted to data in histogram form by minimising the sum of squares of areas.
4. Where a comparison can be made, the goodness of fit was as good as, or better than, that using other methods.

INTRODUCTION

The previous paper¹ describes a digital computer program which can be used to calculate biological parameters by a non-linear regression analysis. The purposes of this paper are: (1) to illustrate how easily it can be adapted for use with a wide range of functions, (2) to make a comparison with a function, the Michaelis-Menten equation, where non-linear regression has already been used but with a different minimisation procedure, and (3) to illustrate some novel applications of non-linear regression to functions of biological interest.

METHOD

Suitable sets of data were taken from published experiments. Five of the sets were taken from tables of results; the other two sets were calculated from figures. The program¹ was adapted for each function by modifying the main program and the subroutine SUMSQUARES. The data were equally weighted for most of the applications because in only one case could weighting factors be calculated from standard deviations of the data points. The programs, written in IMP, were run on the IBM 360/50 at the Edinburgh Regional Computing Centre.

Biochim. Biophys. Acta, 252 (1971) 421-426
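The adaptation described in the Method, where only the fitted function inside the sum-of-squares subroutine changes from one application to the next, can be illustrated with a minimal Python sketch. This is not the author's IMP code: the names `weighted_sum_of_squares`, `sine_model` and `michaelis_menten`, and the three data points, are hypothetical. The weights default to unity to mirror the equal weighting used for most applications.

```python
import math


def weighted_sum_of_squares(model, params, xs, ys, weights=None):
    """Return sum of w_i * (y_i - model(x_i, params))**2 over all data points."""
    if weights is None:
        weights = [1.0] * len(xs)  # equal weighting, as used for most data sets
    return sum(w * (y - model(x, params)) ** 2
               for x, y, w in zip(xs, ys, weights))


def sine_model(t, params):
    """P = theta1 * sin(theta2 * t - theta3)."""
    amplitude, angular_freq, phase = params
    return amplitude * math.sin(angular_freq * t - phase)


def michaelis_menten(s, params):
    """v = V * [S] / (Km + [S])."""
    v_max, k_m = params
    return v_max * s / (k_m + s)


# Hypothetical illustration data (NOT taken from the paper).
substrate = [0.5, 1.0, 2.0]
velocity = [0.30, 0.45, 0.55]

# Swapping the model is the only change needed between applications.
ss_mm = weighted_sum_of_squares(michaelis_menten, (0.69, 0.60),
                                substrate, velocity)

# Weights as reciprocals of variances, given hypothetical standard deviations.
std_devs = [0.02, 0.02, 0.04]
ss_weighted = weighted_sum_of_squares(
    michaelis_menten, (0.69, 0.60), substrate, velocity,
    weights=[1.0 / sd ** 2 for sd in std_devs])
```

A minimiser would call `weighted_sum_of_squares` repeatedly while adjusting `params`; the minimisation procedure itself belongs to the previous paper and is not reproduced here.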


RESULTS AND DISCUSSION

Sine function

Many substances in animals show a periodic variation in concentration. The rates at which they are excreted also change periodically. For example, MILLS (ref. 2, Fig. 1C) provides data for the rate of phosphate excretion by four men during 24 h, but does not fit any function to them. His data were used here as if each phosphate determination referred to a precise time rather than being the mean value for a period of 1 h. A sine function was fitted to them: phosphate excretion rate as a percentage deviation from the 24-h mean, $P = \theta_1 \sin(\theta_2 t - \theta_3)$. The initial estimates ($\theta_1 = 60$, $\theta_2 = 0.25$ and $\theta_3 = 1.0$) were obtained by drawing a sine curve by eye. Weighting factors were calculated as the reciprocals of the variances using the standard deviations indicated by MILLS². The function was not constrained to a 24-h periodicity. The parameter estimation converged at six iterations to give the function $P = 47.993 \sin(0.27580t - 0.56982)$. The result is shown in Fig. 1.
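As a worked check on the remark that the fit was not constrained to a 24-h periodicity, the period implied by the fitted angular frequency can be computed directly. The sketch below assumes that t is in hours and that the argument of the sine is in radians (the section does not state the angular unit, although a 24-h period would then correspond to θ2 = 2π/24 ≈ 0.2618, close to the initial estimate of 0.25). The amplitude is read as 47.993, and the function and variable names are illustrative.

```python
import math

# Fitted parameters quoted in the text (amplitude read as 47.993).
THETA_1, THETA_2, THETA_3 = 47.993, 0.27580, 0.56982


def phosphate_deviation(t_hours):
    """Fitted deviation of phosphate excretion rate from the 24-h mean."""
    return THETA_1 * math.sin(THETA_2 * t_hours - THETA_3)


# Period implied by the unconstrained fit: 2*pi / theta2 (roughly 22.8 h).
fitted_period_hours = 2.0 * math.pi / THETA_2

# First time at which the sine argument equals pi/2, i.e. the first maximum.
peak_time_hours = (math.pi / 2.0 + THETA_3) / THETA_2
```

Under these assumptions the unconstrained fit gives a period a little over an hour shorter than 24 h.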

Michaelis-Menten equation

A number of programs have been published³⁻⁵ for fitting the Michaelis-Menten equation, $v = V[S]/(K_m + [S])$, to enzyme kinetic data. Two³,⁴ use the non-linear regression method suggested by WILKINSON⁶ but the other⁵ uses the linear transform method of BLISS AND JAMES⁷. The disadvantage of using a transformation is that bias may be introduced into the regression analysis. This is discussed later. The data of WILKINSON (ref. 6, Table I) give the initial velocities (v, µmoles NAD⁺ per 3 min per mg enzyme protein) of an enzyme-catalysed reaction at various substrate concentrations ([S], mM NMN). The initial estimates were those of BLISS AND JAMES⁷ (who used the same data set): $V = 0.676$ µmole NAD⁺ per 3 min per mg enzyme protein and $K_m = 0.559$ mM. The Michaelis-Menten equation was fitted directly to the data, using equal weighting. Three iterations in the program gave the final values: $V = 0.690395$, $K_m = 0.596539$ and residual variance $= 1.840 \cdot 10^{-4}$. These results were very similar to those of WILKINSON⁶: 0.690, 0.595 and $1.841 \cdot 10^{-4}$; and BLISS AND JAMES⁷: 0.690397, 0.59664 and $1.840 \cdot 10^{-4}$, respectively. The result is shown in Fig. 2.

Fig. 1. A sine function, $P = 47.993 \sin(0.27580t - 0.56982)$, fitted to data for the urinary excretion of phosphate.

Fig. 2. The Michaelis-Menten equation fitted directly to data for the initial velocities of a reaction (v) at various substrate concentrations ([S]).

Allosteric proteins: Haemoglobin

BARCROFT⁸ obtained data for the fractional saturation of haemoglobin by oxygen (y/100) at various oxygen concentrations (x, mm Hg). These data, when plotted, produce the sigmoid curve usually shown by allosteric proteins. He fitted to them, by an undisclosed method, the function of HILL⁹: $y/100 = Kx^n/(1 + Kx^n)$ and obtained $K = 2.92 \cdot 10^{-4}$ and $n = 2.5$. BARCROFT's data (ref. 8, table on p. 485) were used here with equal weighting factors, and the initial estimates were the parameters calculated by BARCROFT. The program converged at six iterations to give the values $K = 1.86 \cdot 10^{-4}$ and $n = 2.62$. The standard deviation of residuals was 0.01429 whereas BARCROFT's parameters gave 0.01598. The result is shown in Fig. 3.

Allosteric enzymes

The Hill equation can be converted to a linear form (the Hill plot¹⁰) by using a logarithmic transformation. CHANGEAUX¹¹ used a modified Hill equation, $v = V[S]^n/(K + [S]^n)$, and the corresponding Hill plot, $\log(v/(V - v)) = n \log[S] - \log K$, to describe the kinetics of an allosteric enzyme. The disadvantage of using this Hill plot is that one of the parameters to be estimated (V) appears in the dependent variable.
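The difficulty with the Hill plot can be made concrete with a small numerical sketch. It generates noise-free velocities from the modified Hill equation using hypothetical parameter values (V = 100, K = 4, n = 2, not taken from any data set in this paper) and then regresses log(v/(V − v)) on log [S] by ordinary least squares, once with the correct V and once with an incorrect assumed V. The helper names are illustrative.

```python
import math


def hill_velocity(s, v_max, k, n):
    """Modified Hill equation: v = V * [S]**n / (K + [S]**n)."""
    return v_max * s ** n / (k + s ** n)


def hill_plot_slope(substrate, velocities, assumed_v_max):
    """Least-squares slope of log10(v / (V - v)) against log10([S])."""
    xs = [math.log10(s) for s in substrate]
    ys = [math.log10(v / (assumed_v_max - v)) for v in velocities]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical "true" parameters and substrate concentrations.
TRUE_V, TRUE_K, TRUE_N = 100.0, 4.0, 2.0
substrate = [0.5, 1.0, 2.0, 4.0, 8.0]
velocities = [hill_velocity(s, TRUE_V, TRUE_K, TRUE_N) for s in substrate]

# The transformed dependent variable needs a value of V before it can be formed.
slope_correct_v = hill_plot_slope(substrate, velocities, TRUE_V)
slope_wrong_v = hill_plot_slope(substrate, velocities, 1.2 * TRUE_V)
```

With the correct V the slope recovers n. With V overestimated by 20% the apparent slope falls noticeably below n, even for error-free data, which is why the Hill plot cannot be drawn reliably until V has been fixed by some other means.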
A digital computer program could avoid the problem of V appearing in the dependent variable, and one such program has been published¹². However, it depends on using linear forms of the Hill equation so that the data are given unequal weighting. The data set used here was that of WIEKER et al. (ref. 12, Fig. 3) which contains values of v and [S], but no units are defined. Using initial estimates of $V = 200$, $K = 1.9$, $n = 2.0$ and weighting factors equal to unity, the program converged at seven iterations to give $V = 197.09$, $K = 3.8425$, $n = 2.2182$ and standard deviation of residuals $= 5.9561$. WIEKER et al.¹², using their program with initial estimates $V = 200$, $K_{0.5} = 1.905$, $n = 2.15$, obtained final values $V = 199.087$, $K_{0.5} = 1.9049$ (equivalent to $K = 4.0090$), $n = 2.155$ and S.D. of residuals $= 6.2181$. The fit obtained by the program of WIEKER et al. was not as good as that described here, possibly because the linear forms produced bias or gave a different weighting to the data. The result is shown in Fig. 4.

Fig. 3. The original Hill equation, $y/100 = Kx^n/(1 + Kx^n)$, fitted to data for the binding of oxygen to haemoglobin.

Fig. 4. The Hill equation fitted directly to data for the initial velocities (v) of a reaction catalysed by an allosteric enzyme at various substrate concentrations ([S]).

Power functions

Tracer kinetic data when collected over a long time range are often fitted to power functions. For example, HOLTZMAN¹³ gives a function $y = \theta_1[(t + \theta_2)/\theta_2]^{-\theta_3}$ and WISE et al.¹⁴ use a function $y = \theta_1 t^{-\theta_2} e^{-\theta_3 t}$. The dependent variable, y, is either the fraction of radioactive material remaining in an animal or the specific activity of the material at time t. However, no program for fitting power functions appears to have been published. The data set used here was that of WISE et al. (ref. 14, Table 1) where y is the relative specific activity of Ca²⁺ in the plasma and urine of a man after a single injection of ⁴[?]Ca²⁺ (µCi per µCi injected per g Ca²⁺). HOLTZMAN's¹³ function was fitted to these data, equally weighted, using initial estimates $\theta_1 = 0.4$, $\theta_2 = 0.3$ and $\theta_3 = 1.0$. The final values, after three iterations, were $\theta_1 = 1.6850$, $\theta_2 = 0.0017311$, $\theta_3 = 0.33532$ and S.D. of residuals $= 0.025145$. For the function of WISE et al.¹⁴ the initial estimates were $\theta_1 = 0.4$, $\theta_2 = 0.3$ and $\theta_3 = 0.004$. Six iterations gave the final values of $\theta_1 = 0.43855$, $\theta_2 = 0.30232$, $\theta_3 = 0.0024362$ and S.D. of residuals $= 0.020014$. The function fitted by WISE et al. to their own data, by a method not described, was $y = 0.43354\, t^{-0.[?]7574} e^{-0.00[?]957t}$. It fits the data less well, with a S.D. of residuals $= 0.026311$. The present parameter estimation shows that (i) the second function describes the experimental data better than does the first one, and (ii) a better fit is obtained by using the present digital computer program. The results are shown in Fig. 5.

Fig. 5. Two forms of power function fitted to tracer kinetic data. For (a) the fitted function was specific activity $= 1.6850\,[(t + 0.0017317)/0.0017317]^{-0.33532}$. For (b) the function was specific activity $= 0.43855\, t^{-0.30232} e^{-0.0024362t}$. For convenience the results are drawn on a double log plot.

Fig. 6. A sine function fitted to data in histogram form by minimising a sum of squares of areas.

Data in histogram form

Often data can only be collected in histogram form, e.g. for materials which are excreted in urine when the urine is collected over relatively long times. FORT AND MILLS¹⁵ published values for the excretion rate of K⁺ over 24 h, when urine was collected from a man during seven unequal time periods. They fitted the data to a sine function: rate of K⁺ excretion (µequiv/min) $=$ mean excretion rate during 24 h $+\ \theta_1 \sin(\theta_2 t - \theta_3)$ by minimising the sum of squares of areas between the fitted curve and the histogram. They used a digital computer program, but gave no details about it. The data of FORT AND MILLS (ref. 15, Fig. 1B), with equal weighting and initial estimates $\theta_1 = 30$, $\theta_2 = 0.26$, $\theta_3 = 1.57$, were fitted here to the above sine function. The estimation converged at ten iterations to give $\theta_1 = 29.763$, $\theta_2 = 0.31195$ and $\theta_3 = 1.0898$. The period was not constrained to 24 h. The result is shown in Fig. 6.

General comments

In non-linear regression analysis standard errors for the estimated parameters can only be determined approximately.
Their calculation is only useful when it is desired to compare the results of fitting a given function to many data sets. Standard errors are not given for the parameters estimated above, because each function is only fitted to one data set.

The main disadvantage of a linear transformation method is that although the error in the measurement of a variable is usually normally distributed, the error in the transformed variable will not be. The effect is to introduce bias into the regression analysis. The other disadvantage is that some transformed data are given greater weighting than others, so that proper weighting factors must be used to correct this. The problems of bias and correct weighting factors are discussed in detail by WILKINSON⁶ and DOWD AND RIGGS¹⁶.

CONCLUSION

The program described in the previous paper¹ has been applied to a wide range of functions which are often used in biology. Unlike most previously published methods for fitting these functions to experimental data (by digital computer or not) the program fits the functions directly without their being transformed to a simpler form. It can also be easily adapted to minimise either a sum of squares of residuals or a sum of squares of areas. When a comparison can be made, the program gives a fit which is as good as, or better than, those of other methods.
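One way to read "a sum of squares of areas" for histogram data is to compare, for each collection period, the area of the histogram bar with the integral of the fitted curve over the same period. The sketch below follows that reading, which is an assumption: the exact definition is not given in this paper. The function names, the collection periods, the mean rate and the parameter values are all hypothetical, and the bar heights are generated from the curve itself so that the objective can be checked.

```python
import math


def sine_area(t_start, t_end, mean_rate, theta):
    """Integral of mean_rate + theta1*sin(theta2*t - theta3) over [t_start, t_end]."""
    amplitude, angular_freq, phase = theta
    oscillating = -(amplitude / angular_freq) * (
        math.cos(angular_freq * t_end - phase)
        - math.cos(angular_freq * t_start - phase))
    return mean_rate * (t_end - t_start) + oscillating


def sum_of_squares_of_areas(periods, bar_heights, mean_rate, theta):
    """Sum over periods of (histogram bar area - area under the curve)**2."""
    total = 0.0
    for (t_start, t_end), height in zip(periods, bar_heights):
        bar_area = height * (t_end - t_start)
        curve_area = sine_area(t_start, t_end, mean_rate, theta)
        total += (bar_area - curve_area) ** 2
    return total


# Hypothetical unequal collection periods (hours) and generating parameters.
periods = [(0.0, 3.0), (3.0, 8.0), (8.0, 10.0), (10.0, 17.0), (17.0, 24.0)]
mean_rate = 50.0
generating_theta = (30.0, 0.26, 1.57)

# Each bar height is the mean of the generating curve over its period.
bar_heights = [sine_area(a, b, mean_rate, generating_theta) / (b - a)
               for a, b in periods]

objective_at_truth = sum_of_squares_of_areas(
    periods, bar_heights, mean_rate, generating_theta)
objective_off = sum_of_squares_of_areas(
    periods, bar_heights, mean_rate, (30.0, 0.31, 1.57))
```

Because the integral of the sine has a closed form, this objective can replace the sum of squares of residuals in a minimiser without any numerical integration. The objective is essentially zero at the generating parameters and grows when the angular frequency is perturbed.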


ACKNOWLEDGEMENTS

I would like to thank Miss Helen Wightman and Miss Caroline Thompson for their technical assistance. Part of this work was supported by a grant from the Science Research Council.

REFERENCES

1 G. L. ATKINS, Biochim. Biophys. Acta, 252 (1971) 405.
2 J. N. MILLS, Physiol. Rev., 46 (1966) 128.
3 W. W. CLELAND, Nature, 198 (1963) 463.
4 K. R. HANSON, R. LING AND E. HAVIR, Biochem. Biophys. Res. Commun., 29 (1967) 194.
5 T. G. HOY AND D. M. GOLDBERG, Int. J. Biomed. Computing, 2 (1971) 71.
6 G. N. WILKINSON, Biochem. J., 80 (1961) 324.
7 C. I. BLISS AND A. T. JAMES, Biometrics, 22 (1966) 573.
8 J. BARCROFT, Biochem. J., 7 (1913) 481.
9 A. V. HILL, J. Physiol. London, 40 (1910) Proc. iv.
10 J. WYMAN, Cold Spring Harbor Symp. Quant. Biol., 28 (1963) 483.
11 J. P. CHANGEAUX, Bull. Soc. Chim. Biol., 46 (1964) 947.
12 H. J. WIEKER, N. J. JOHANNES AND B. HESS, FEBS Lett., 8 (1970) 178.
13 R. B. HOLTZMAN, Radiat. Res., 25 (1965) 277.
14 M. E. WISE, S. B. OSBORN, J. ANDERSON AND R. W. S. TOMLINSON, Math. Biosci., 2 (1968) 199.
15 A. FORT AND J. N. MILLS, Nature, 226 (1970) 657.
16 J. E. DOWD AND D. S. RIGGS, J. Biol. Chem., 240 (1965) 863.
