Agricultural and Forest Meteorology, 32 (1984) 71--77
71
Elsevier Science Publishers B.V., Amsterdam -- Printed in The Netherlands FITTING CROP GROWTH DATA USING MARQUARDT'S ALGORITHM* D.W. STEWART Agrometeorology Section, Land Resource Research Institute, Research Branch, Agriculture Canada, Ottawa, Ontario K I A 0C6 (Canada)
(Received June 15, 1983; revision accepted March 18, 1984) ABSTRACT Stewart, D.W., 1984. Fitting crop growth data using Marquardt's algorithm. Agric. For. Meteorol., 32: 71--77. A least-squares method using Marquardt's algorithm is used to fit crop growth data to a non-linear growth model. This non-linear model is expressed as a differential equation which cannot be integrated analytically. For comparison purposes, two other non-linear models based on the logistic and Gompertz equation are fitted to the same data. INTRODUCTION Many m e t h o d s have been d e v e l o p e d to relate c r o p yield t o w e a t h e r variables. T h e s e range f r o m linear regression analysis as used b y T h o m p s o n ( 1 9 6 2 , 1969), and Williams ( 1 9 7 2 ) t o v e r y c o m p l e x s i m u l a t i o n m o d e l s such as E L C R O S (De Wit et al., 1 9 7 1 ) and SIMED {Holt et al., 1975). T h e f o r m e r are empirical, additive m o d e l s using m o n t h l y t e m p e r a t u r e and p r e c i p i t a t i o n averages and have a relatively large n u m b e r o f c o e f f i c i e n t s which are calculated by least squares. H e n c e , a large d a t a base is n e e d e d for a reliable solution. S i m u l a t i o n models, by c o n t r a s t , are c o n c e p t u a l in n a t u r e with c o e f f i c i e n t s derived f r o m physical and physiological m e a s u r e m e n t s . T h e y are best used in c o n j u n c t i o n with an e x p e r i m e n t a l p r o g r a m w h e r e gaining u n d e r standing o f g r o w t h - e n v i r o n m e n t i n t e r a c t i o n s is t h e p r i m a r y objective. F o r m a n y applied p r o b l e m s , such as large-area yield p r e d i c t i o n , s i m u l a t i o n m o d e l s require physiological and e n v i r o n m e n t a l i n f o r m a t i o n which is n o t available. What is n e e d e d are simpler m o d e l s which can be driven b y available m e t e o r o l o g i c a l data b u t which are n o t r e s t r i c t e d to the empirical, additive f o r m o f linear regression. T h e r e is also considerable advantage in using daily w e a t h e r variables d i r e c t l y as a great deal o f detail can be lost using m o n t h l y normals. E x a m p l e s o f simple simulation m o d e l s are described in Baier (1973), Selirio and B r o w n ( 1 9 7 9 ) and B y r n e and D r u m m o n d (1980). Since these m o d e l s are non-linear, a m e t h o d o t h e r t h a n linear regression m u s t be used to evaluate c o e f f i c i e n t s in the models. F u r t h e r m o r e , the m o d e l o f B y r n e and D r u m m o n d ( 1 9 8 0 ) is e x p r e s s e d as a differential e q u a t i o n which creates special p r o b l e m s f o r c o e f f i c i e n t evaluation. In this paper, I describe a m e t h o d
* LRRI Contribution No. 80-102 0168-1923/84/$03.00
© 1984 Elsevier Science Publishers B.V.
72 for fitting non-linear models to dry-matter data with emphasis placed on the problems which occur when the model is a differential equation. METHODS
Byrne and D r u m m o n d (1980) derived the equation
dW/dt = alaEpW/(1.0 + a2W ) - a 3 W
(1)
where W is crop dry mass, t is time, a is the available fraction of water in the root zone, Ep is pan evaporation and a l , a2 and a3 are coefficients. This equation cannot be integrated analytically to express W as a straight-forward function of c~Ep and t. To calculate the coefficients, Byrne and D r u m m o n d (1980) minimized the function F1 defined as
F, = ~_ (dW/dt--al(~E,W/(1.0 + a21/¢) + a3W) 2
(2)
i=1
where W is measured crop dry mass, dW/dt is estimated rate of dry matter accumulation from measurements and n is the number of measurements during the growing period. The main difficulty with this m e t h o d is the use of dW/dt since any scatter in W will be amplified in dW/dt. In fact Byrne and D r u m m o n d (1980) did reject some data because of erratic behaviour of
dliC/dt. Therefore, there is some advantage to using only W and to minimizing F2, the least-squares function, expressed as
Fz = ~ (W-- l~) 2
(3)
i=l
where W is expressed as a function of a E p , a l, a z, a3 and W0. W0 is a constant of integration of eq. 1 which is set equal to the initial dry mass. To calculate F2, eq. 1 is expressed in finite difference form as
( W i - Wi-l)/At = al(O~Ep)i~r/(1.O + a z W ) - - a 3 f f r
(4)
where = (We + Wi-~)/2.0
(5)
Wi and (~Ep)i represent dry mass and modified pan evaporation on the ith day, Wi-a represents dry mass on the (i -- 1)th day and At is a time step of one day. Rearranging eq. 4 and solving for Wi results in the following quadratic equation Wi = ( - - B + (B 2 -- 4AC) °'s)/2A
(6)
where A = az(1.0 + a3At/2.0 )
(7)
73
B = 2.0 -- a 1 ( O L E p ) i A t + a 3 At(1.0 + a 2 Wi -
1)
(8)
and 2.0--al(aEp)iAt)--a2Wi_12(1.0--a3At/2.0)
C = Wi-l(a3At--
(9)
The negative sign immediately in front of (B 2 -- 4AC) °'5 can be disregarded as it leads to negative and thus imaginary values of Wi. Thus Wi can be calculated for each day given estimates of a l , a2, a3 and W0. For example, W~ can be calculated since Wi-1 is W0 for the first day. Then W2 can be calculated knowing W~, etc. To obtain a l, a2, a3 and W0, eq. 6 is differentiated with respect to each coefficient and the resulting four non-linear equations with four u n k n o w n s are solved using Marquardt's algorithm (Marquardt, 1963; Conway et al., 1970). In this case, eq. 6 cannot be differentiated analytically. However, Nash (1979) developed a very efficient version of Marquardt's algorithm with a numerical differentiation scheme and this has been used to solve for the four coefficients. In solving for a l, a2, a3 and W0, values of c~Ep and W are needed. These were tabulated by Byrne and D r u m m o n d (1980) at ten-day intervals and are presented in Table II. I assumed that (~Ep values represent an average for the ten-day intervals. Values of (~Ep measured each day could have been used directly if they were available from the Byrne and D r u m m o n d (1980) paper. To illustrate a more c o m m o n problem of non-linear fitting, models based on the logistic and Gompertz equations were fitted to the same data. Both equations were derived in a general way by Thornley (1976). The logistic equation assumes that rate of dry matter production is proportional to the product of (~Ep, W and (WMAx - - W ) where W M A x is a m a x i m u m dry mass attained when t becomes infinite. That is dW/dt
= D 10~EpW(WMAx
--
W)
(10)
which when integrated results in W =- WMAX/(1.0 + b 2 e x p ( - - b l I 1 ))
(11)
where bl
~- b l WMAX
b2
=
(WMA x --
(12)
Wo)/W 0
(13)
and I1 =
7 J
(aEp)dt
(14)
i=1
bl is a coefficient and m is the number of days from the beginning of the growing period. I1 can be approximated by simple daily summation expressed as
74
Il =
~ (O~Ep)iAt i=l
(15)
The Gompertz equation is derived from two differential equations, i.e.
dW/dt = ~ E v x W
(16)
dx/dt = --c~x
(17)
where x is a second state variable representing plant senescence. Integration results in W = Wo exp (c212)
(18)
where
I2
= ; aEp exp(--clt)dt
(19)
i=1
and c~ and c 2 are coefficients. 12 can be replaced by a daily summation expressed as 12 =
~ (O~Ep)i e x p ( - - c l t ) A t i=!
(20)
Equations 11 and 16 can be differentiated analytically with respect to their u n k n o w n coefficients and can be fitted to data using any standard program of the Marquardt algorithm (see for example, Conway et al., 1970; Ray, 1982). RESULTS AND DISCUSSION
Values of coefficients of each model with their standard errors are given in Table I. Table II shows values of dry mass for each model, W, (~Ep and the standard error of estimate and correlation coefficient of each fit. All models gave good estimates of dry mass with extremely high correlation coefficients. These coefficients were expected since W is highly correlated with time (r = 0.979). It would be difficult to rank the models on their goodness of fit since the data base is quite limited. However, the standard errors ranked the models in the same order in which they would be ranked on theoretical grounds. The largest standard error and thus the worst fit occurred with the logistic equation, the most empirical of the models. WMAX is a theoretical limit when t approaches infinity and varies for different growing periods. Since coefficients should be independent of the plant environment, the logistic equation as developed here is n o t suitable as a growth model. Selirio and Brown (1979) have used a logistic equation in a different
75 TABLE I Coefficients with their standard error for the differential equation by Byrne and Drummond (1980), the logistic equation and Gompertz equation Coefficient
Values
Standard Error
Dimensions
11.24 0.01128 0.0007154 0.01698
28.28 0.0802 0.0000549 0.0139
kgha -1 mm ha kg -1 day- I
75.79 6687.0 0.005004
36.33 386.3 0.000605
kg ha -I kg ha -l mm
1.598 0.03501 0.02357
3.056 0.0509 0.0857
kg ha -1 day -l mm -1
Differentialequation W0 al a? a3
Log~ticequation W0 WMA X
bl
Gomper~ equation W0 cl c?
TABLE II Values of measured dry mass (1~) and dry mass calculated from the differential equation of Byrne and Drummond (1980) (WB), the logistic equation (WL) and the Gompertz equation (WG) (all in kg ha -1 ) Time (days)
0~Ep (mm day -1 )
I~
WB
WL
WG
10 20 30 40 50 60 70 80 90 100 110
15.7 9.3 10.6 10.1 14.3 20.6 15.2 10.5 9.9 14.7 14.2
30 60 130 860 1390 2450 4260 4690 5030 5720 6600
54 123 295 596 1331 2900 4016 4537 4937 5860 6628
164 258 427 679 1254 2627 3883 4686 5306 5946 6301
34 121 339 676 1346 2708 3894 4647 5227 5910 6426
264 0.997
289 0.996
266 0.997
Standard error o f e s t i m a t e Correlation coefficient
way. T h e i r m o d e l can be d e s c r i b e d by
dW/dt ' =
(dW/dt')if(c~)
(21)
w h e r e f ( a ) = a / 0 . 8 w h e n a ~ 0 . 8 , f(c~) = 1 .0 w h e n a / > 0 . 8 a n d (dW/dt')i = b W ( W M A x - - W ) ; ( d W / d t ' h is a n i d e a l i z e d r a t e o f g r o w t h w h e n w a t e r is n o t
76 limiting, t' is an a c c u m u l a t i o n o f h e a t u n i t s f r o m t h e b e g i n n i n g o f t h e growing p e r i o d , b is a c o e f f i c i e n t , a n d WMAx r e p r e s e n t s an u p p e r genetic yield limit f o r ideal g r o w t h c o n d i t i o n s . T h e G o m p e r t z e q u a t i o n avoids the use o f WMAx and idealized g r o w t h b y c o m b i n i n g g r o w t h and s e n e s c e n c e into o n e relatively s i m p l e e q u a t i o n . Its u s e f u l n e s s c o u l d p r o b a b l y be enlarged b y r e p l a c i n g t in eq. 18 w i t h a h e a t unit accumulation. T h e d i f f e r e n t i a l e q u a t i o n b y B y r n e and D r u m m o n d has an e x t e n s i v e t h e o r e t i c a l d e v e l o p m e n t ( B y r n e , 1 9 7 3 ; B y r n e et al., 1976). I t is m o r e c o m p l i c a t e d to fit to d a t a t h a n t h e logistic or G o m p e r t z e q u a t i o n b e c a u s e o f t h e n e e d to d i f f e r e n t i a t e n u m e r i c a l l y t h e i n t e g r a t e d e q u a t i o n w i t h r e s p e c t to t h e c o e f f i c i e n t s . H o w e v e r , this d o e s p r o d u c e a fit t o d r y - m a t t e r d a t a w i t h o u t e s t i m a t e s o f dW/dt. T h e c o m p l e x i t y o f biological s y s t e m s o f t e n d i c t a t e s t h a t d i f f e r e n t i a l e q u a t i o n s u s e d t o d e s c r i b e such s y s t e m s c a n n o t be i n t e g r a t e d analytically. T h e simple finite d i f f e r e n c e s c h e m e d e s c r i b e d h e r e c o m b i n e d w i t h n u m e r i c a l d i f f e r e n t i a t i o n and the M a r q u a r d t a l g o r i t h m s h o u l d t h e r e f o r e h a v e w i d e a p p l i c a t i o n in agricultural research. ACKNOWLEDGEMENT I t h a n k Mr. Binns, E n g i n e e r i n g a n d Statistical R e s e a r c h I n s t i t u t e , Agric u l t u r e C a n a d a f o r his v a l u a b l e c o m m e n t s o n this p a p e r . REFERENCES Baler, W., 1973. Crop-weather analysis model: review and model development. J. App. Meteorol., 12 : 937--947. Byrne, G.F., 1973. An approach to growth curve analysis. Agric. Meteorol., 11: 161--168. Byrne, G.F., Torssell, R.W.R. and Sastry, P.S.N., 1976. Plant growth curves in mixtures and climatological response. Agric. Meteorol., 16: 37--44. Byrne, G.F. and Drummond, J.E., 1980. Fitting a growth curve equation to field data. Agric. Meteorol., 22: 1--9. Conway, G.R., Glass, N.R. and Wilcox, J.C., 1970. Fitting non-linear models to biological data by Marquardt's algorithm. Ecology, 51: 503--507. De Wit, C.J., Brower, R. and Penning de Vries, F.W.J., 1971. A dynamic model of plant and crop growth. In: P.O. Wareing and J.R. Cooper (Editors), Potential Crop Production, A Case Study. Heinemann Educational Books, London, pp. 177--142. Holt, D.A., Bula, R.J., Miles, G.E., Schreiber, M.M. and Peart, R.M., 1975. Environmental physiology, modeling and simulation of alfalfa growth. I. Conceptual development of SIMED. Research Bulletin 906. Agricultural Experimental Station, Purdue University, West Lafayette, IN, in cooperation with ARS, USDA. Marquardt, D.W., 1963. An algorithm for least-squares estimation of non-linear parameters. J. Soc. Ind. Appl. Math., 11: 431--441. Nash, J.C., 1979. Compact Numerical Methods for Computers: Linear Algebra and Function Minimization. Adam Hilger, Bristol. 227 pp. Ray, A.A., 1982. The NLIN Procedure, ASA User's Guide: Statistics pp. 18--37. SAS Institute Inc., Cary, NC. Selirio, I.S. and Brown, D.M., 1979. Soil moisture based simulation of forage yield. Agric. Meteorol., 20: 99-114.
77 Thompson, L.M., 1962. Evaluation of weather factors in the production of wheat. J. Soil Water Conserv., 17 : 149--156. Thompson, L.M., 1969. Weather and technology in the production of corn in the U.S. corn belt. Agron. J., 61 : 453--456. Thornley, J.H.M., 1976. Mathematical models in plant physiology. Academic Press, London, 318 pp. Williams, G.D.V., 1972. Geographical variations in yield wheat relationships over a large growing region. Agric. Meteorol., 9: 265--283.