A rainfall generator for agricultural applications in the tropics


Agricultural and Forest Meteorology, 63 (1993) 1-19
Elsevier Science Publishers B.V., Amsterdam



P.G. Jones^a and P.K. Thornton^b

^a The International Centre for Tropical Agriculture (CIAT), AA 6713, Cali, Colombia
^b The International Fertilizer Development Center (IFDC), PO Box 2040, Muscle Shoals, AL 35662, USA

(Received 20 March 1992; revision accepted 10 September 1992)

ABSTRACT Jones, P.G. and Thornton, P.K., 1993. A rainfall generator for agricultural applications in the tropics. Agric. For. Meteorol., 63: 1-19. A new rainfall generator, based on a third-order Markov chain, has been developed to simulate the year-to-year variation in rainfall amount that is observed in the tropics. To do this, some of the parameters of the model are themselves sampled randomly. The model has been fitted to over 40 sites in the tropics. The model was tested in detail for three sites in Central and South America and in the African Sahel. Few significant differences were found between the historical and simulated means and variances of yearly and monthly rainfall amount and raindays per month. There were some problems with simulating wet and dry sequences of longer than 9 days. When historical and simulated rainfall records were used to drive a crop simulation model, there was no significant difference in the mean and variance of maize yield response. The results reported suggest that the rainfall model performs adequately for many applications.

INTRODUCTION

The implications of outstandingly good or bad years for the resource-poor farmer in the tropics can be profound, much more so than for many temperate farmers who tend to be cushioned from rare or catastrophic events by the effects of government policies such as crop insurance or guaranteed minimum prices. For economically vulnerable farmers, the impact of extreme weather can have an importance that far outweighs its apparent probability of occurrence.

Simulation modelling techniques are increasingly being used as a means of extrapolating the results of field trials and of designing and assessing new agrotechnology that can then be passed on to farmers. One of the cornerstones of many types of modelling activity is weather generation, where a statistical model is used to generate a time series of the climatic variables of interest. The generated series is assumed to possess the same properties as the historical time series that was used to estimate the statistical model.

In the tropics and subtropics, one of the most important single determinants of agricultural activity is rainfall. Existing rainfall generators tend to suffer from an inability to model accurately the year-to-year variation in rainfall that is frequently observed, and thus they cannot model very wet or very dry years. It is these tails of the distribution that may be of vital importance in assessing agrotechnology.

This paper describes a new rainfall model based on a third-order Markov process. A feature of the model is that some of the parameters are sampled randomly. The results of testing are presented, both in terms of descriptive statistics and in terms of the rainfall model's functioning when driving a simple water balance model and a simulation model of the growth and development of maize. The results are compared with those obtained by using the Richardson rainfall model WGEN (Richardson, 1985) for the same sites.

Correspondence to: P.K. Thornton, The International Fertilizer Development Center, PO Box 2040, Muscle Shoals, AL 35662, USA.

0168-1923/93/$06.00 © 1993 Elsevier Science Publishers B.V. All rights reserved.

THE RAINFALL MODEL

Rainfall is modelled using a two-stage, third-order Markov chain. First, it is determined whether a given day is wet; this depends on whether rain fell on each of the 3 previous days. If the day is wet, the amount of rainfall is then determined. The model is described below.

Probability of a wet day

The probability of Day i being wet (W) is defined as

$$P(W/D_1 D_2 D_3) = \Phi^{-1}(b_i + a_{i-1} d_1 + a_{i-2} d_2 + a_{i-3} d_3) \qquad (1)$$

where $\Phi^{-1}$ is the inverse of the probit function, $b_i$ is the monthly baseline probit of a wet day following 3 consecutive dry days, $a_m$ are binary coefficients for rain (1) or no rain (0) on Day m, and $d_m$ are lag constants. Thus the probability of a wet day following 3 dry days is $\Phi^{-1}(b_i)$, and the probability of a wet day following 3 wet days is $\Phi^{-1}(b_i + d_1 + d_2 + d_3)$, for example. This part of the model is thus specified by 15 parameters: the baseline probabilities, $b_i$, derived for each month, and three lag constants, $d_1$, $d_2$ and $d_3$, that are unchanging from month to month.

In setting up this model, we used the generalized linear modelling package (GLIM) (Baker and Nelder, 1978) to analyze historical data from over 40 stations in the tropics and subtropics, located in Niger, Burkina Faso, Colombia, Brazil, Ecuador, Paraguay, and the USA. The length of the daily rainfall records at each site varied from 20 to 52 years. Tables were produced of rain day incidence dependent on state by month; that is to say, counts were tallied and summed over years for each occurrence of the eight possible states of rain or no rain on the previous 3 days to produce two tables. One table showed the existence of the states, while the other showed the occurrence of rain on the day following each of these states.

In analyzing the data, we used a binomial error term and a probit link function; we treated the occurrence of rain on Day i - 1, Day i - 2 and Day i - 3 as the independent variables and the monthly total as another variable. This allowed us to test the significance of the lag constants by using a chi-squared statistic. The results showed conclusively that a third-order Markov rainfall model was necessary, since the chi-squared statistic, owing to the inclusion of the third-order lag in the model, was highly significant for 92% of the tropical locations studied. This method of fitting the model also allowed us to test the significance of any interaction between the lag constants and the probabilities for the 12 months. Although certain datasets did show small interaction effects, this was generally not the rule, and it was concluded that under a probit transform the lag effects could be considered additive to the monthly effects (see eqn. (1)). The residual deviance, tested as a chi-squared statistic, was insignificant in almost all cases.

Rainfall on a wet day

Rainfall is modelled by using the censored gamma distribution, restricted below 0.1 mm, to determine daily rainfall amounts on those days that rainfall is experienced (Sterne and Coe, 1982). The method of maximum likelihood is used to estimate the mean and shape parameters of this distribution for each calendar month, thus giving rise to 24 additional model parameters.
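
The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the inverse of the probit function is taken to be the standard normal cumulative distribution function; the baseline probit, mean per event and shape are the January values for Palmira in Table 1; the three lag constants are used as illustrative values; and the 0.1 mm restriction is handled by simply redrawing amounts below the threshold, which is an assumption about how the censoring is applied. The function names are hypothetical.

```python
import random
from statistics import NormalDist

# Illustrative parameters for a single month (Palmira, January, Table 1).
BASELINE_PROBIT = -0.940          # b: probit of P(wet | 3 dry days)
LAGS = (0.4910, 0.2396, 0.1590)   # d1, d2, d3 (illustrative lag constants)
MEAN_PER_EVENT = 7.6              # x: mean rainfall per rain event
SHAPE = 0.637                     # alpha: gamma shape parameter


def wet_probability(baseline, lags, history):
    """Eq. (1): history = (a_{i-1}, a_{i-2}, a_{i-3}), each 1 (rain) or 0."""
    probit = baseline + sum(a * d for a, d in zip(history, lags))
    return NormalDist().cdf(probit)


def generate_days(n_days, rng, threshold=0.1):
    """Generate daily rainfall amounts with the two-stage scheme."""
    history = (0, 0, 0)           # start after three dry days
    series = []
    for _ in range(n_days):
        wet = rng.random() < wet_probability(BASELINE_PROBIT, LAGS, history)
        amount = 0.0
        if wet:
            # A gamma mean equals shape * scale, so scale = mean / shape.
            scale = MEAN_PER_EVENT / SHAPE
            amount = rng.gammavariate(SHAPE, scale)
            # Assumption: amounts below the threshold are redrawn.
            while amount < threshold:
                amount = rng.gammavariate(SHAPE, scale)
        series.append(amount)
        history = (1 if wet else 0,) + history[:2]
    return series


if __name__ == "__main__":
    print(generate_days(10, random.Random(1)))
```

In a full generator the monthly parameters would vary through the year; here they are held fixed to keep the sketch short.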

Interpolating back to daily data

In generating rainfall records, the monthly baseline probabilities (the probability of rain after no rain for 3 successive days) are interpolated to daily probabilities by using the 12-point Fourier transform described in Jones (1987). The lag effects are then added to each day's probit transform of the baseline probability to produce a matrix of 365 or 366 days by eight states (wet or dry conditions on 3 successive days). The inverse probit transform is then used to transform this matrix to normal probabilities. Similarly, the monthly mean and shape parameters of the gamma distribution of rainfall amounts are interpolated to daily values by using the 12-point Fourier transform.

This basic model gives excellent simulations of short-term variation in rainfall. However, like all Markov approaches to date, it fails to predict year-to-year variation as measured by the annual variance of rainfall. A further stage of development is clearly necessary, and another level of variability must be incorporated.
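
The interpolation step can be illustrated as follows. The details of the transform in Jones (1987) are not given here, so this sketch assumes an ordinary trigonometric (Fourier) interpolation through 12 monthly values placed at equally spaced mid-month positions; the function name `fourier_interpolate` and the equal-month spacing are assumptions made for the example. The input values are the monthly baseline probits for Palmira from Table 1.

```python
import math


def fourier_interpolate(monthly, n_days=365):
    """Trigonometric interpolation of monthly values to n_days daily values.

    Assumes the monthly values sit at equally spaced mid-month positions.
    """
    n = len(monthly)
    times = [(j + 0.5) / n for j in range(n)]   # fractions of the year
    mean = sum(monthly) / n
    harmonics = []
    for k in range(1, n // 2 + 1):
        # The highest (Nyquist) harmonic carries half the usual weight.
        weight = 1.0 / n if k == n // 2 else 2.0 / n
        a = weight * sum(x * math.cos(2 * math.pi * k * t)
                         for x, t in zip(monthly, times))
        b = weight * sum(x * math.sin(2 * math.pi * k * t)
                         for x, t in zip(monthly, times))
        harmonics.append((k, a, b))
    daily = []
    for day in range(n_days):
        t = (day + 0.5) / n_days
        value = mean + sum(a * math.cos(2 * math.pi * k * t)
                           + b * math.sin(2 * math.pi * k * t)
                           for k, a, b in harmonics)
        daily.append(value)
    return daily


# Monthly baseline probits for Palmira (Table 1).
PALMIRA_B = [-0.940, -0.741, -0.738, -0.498, -0.530, -0.715,
             -0.969, -0.968, -0.779, -0.432, -0.489, -0.686]

if __name__ == "__main__":
    daily_b = fourier_interpolate(PALMIRA_B)
    print(len(daily_b), round(min(daily_b), 3), round(max(daily_b), 3))
```

Interpolating in the probit scale, as the text describes, keeps the daily values unbounded, so the retransformed probabilities always fall between 0 and 1.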

Annual variance and the variability of parameters

The parameters of the model, being simply estimates obtained from sometimes short data sets, have associated standard errors. We proposed that any random sampling should be based on the uncertainty of the parameter estimates themselves. The 12 monthly baseline probabilities, $b_i$, are autocorrelated because of the yearly progression of weather, even in the tropics; thus, a resampling scheme must take these correlations into account. In what may be referred to as the 'sampling' version of the basic model, this is done by randomly sampling from a 12-variate normal distribution. We have not tested the possibility of resampling the lag constants ($d_1$, $d_2$, and $d_3$) because the method based on sampling the baseline probabilities results in a good fit of the test data. The resampling scheme can be represented by

$$b^*_i = s_i RN_i + b_i, \qquad i = 1, \ldots, 12 \qquad (2)$$

where $b^*_i$ is the sampled value of $b_i$, the baseline probability of rain; $s_i$ is the standard deviation of $b_i$; and $RN_i$ is a random normal number. The resampling algorithm involves the Cholesky square root decomposition of the correlation matrix of monthly rainfall. The correct correlation matrix to use would be that of the baseline probabilities in their probit transform. In practice, however, this is difficult to calculate with short data sets. We thus assumed a surrogate correlation matrix and used the standard errors per year obtained in the original GLIM analysis multiplied by the square root of n - 1, where n is the number of years. The pseudo-random normal number generator of Marsaglia and Bray (1964) is used for rapid resampling of the 12 monthly baseline probabilities in their probit transform. The algorithm then adds in the lag constants and produces a new matrix of 365 or 366 days by eight states for each year for which rainfall records are required.
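
A minimal sketch of eqn. (2) follows, assuming that the correlated random normal numbers are obtained by multiplying independent standard normals by the lower Cholesky factor of a correlation matrix. The correlation matrix used here (0.3 raised to the month separation) is a made-up stand-in, not the surrogate matrix of the paper; Python's built-in normal generator replaces the Marsaglia and Bray (1964) generator; and the sb column of Table 1 for Palmira is treated as the standard deviation $s_i$. All function and variable names are hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def resample_baselines(b, s, corr, rng):
    """Eq. (2): b*_i = s_i * RN_i + b_i, with RN correlated through corr."""
    lower = cholesky(corr)
    z = [rng.gauss(0.0, 1.0) for _ in b]          # independent normals
    rn = [sum(lower[i][k] * z[k] for k in range(i + 1))
          for i in range(len(b))]                 # correlated normals
    return [bi + si * ri for bi, si, ri in zip(b, s, rn)]


# Palmira baseline probits and sb values from Table 1.
B = [-0.940, -0.741, -0.738, -0.498, -0.530, -0.715,
     -0.969, -0.968, -0.779, -0.432, -0.489, -0.686]
S = [0.256, 0.258, 0.247, 0.251, 0.249, 0.251,
     0.257, 0.257, 0.251, 0.251, 0.253, 0.247]
# Hypothetical correlation matrix: correlation decays with month separation.
CORR = [[0.3 ** abs(i - j) for j in range(12)] for i in range(12)]

if __name__ == "__main__":
    sampled = resample_baselines(B, S, CORR, random.Random(3))
    print([round(v, 3) for v in sampled])
```

One such draw would be made for each simulated year, before the lag constants are added and the daily matrix is rebuilt.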

Problems with the probit transform

In the course of testing the model with random resampling, we found that it did not work well when the rainfall probabilities were very low. This includes certain months in the test data sets for Guatemala City and Tillabery, Niger (see the following section). Subsequent analysis showed that the use of the probit transform produces a systematic bias. When resampling is used, low probabilities are overestimated and high probabilities are underestimated after retransformation.

Simulations of completely random numbers were used to evaluate the empirical relationship of the standard error to the overall probability level. Probits, produced from runs of up to 200 years, were summed to monthly means and retransformed to probabilities. The variances of the retransformed monthly mean probabilities were then compared with the actual variances introduced in the simulations. An exhaustive analysis using GENSTAT (Numerical Algorithms Group, 1980) was performed to identify the discrepancy in the model. The bias in the monthly probabilities was found to be related completely (explaining 100% of the variance) and simply, although empirically, to the probability level and the standard deviation. In the algorithm for the rainfall model with sampling, this relationship is used to correct the monthly baseline probabilities by adding to them the correction factor $D_i$, defined as

$$D_i = b_i (0.55228 s_i^2 - 0.26154 s_i^3) \qquad (3)$$

where, for Month i, $b_i$ is the baseline probability of a wet day following 3 dry days, and $s_i$ is the standard deviation of the baseline probability.

TESTING THE MODEL

Sites and data

The following sites were chosen for detailed testing of the rainfall model:
(1) The national observatory in Guatemala City, Guatemala, with daily rainfall data covering the period 1932 to 1971 (latitude 14.35°N, longitude 90.32°W, height above sea level 1501 m).
(2) Palmira, in the department of Valle, Colombia, with daily rainfall data covering the period 1932 to 1981 (latitude 3.30°N, longitude 76.18°W, height above sea level 965 m).
(3) Tillabery, Niger, with daily rainfall data covering the period 1933 to 1976, minus the years 1940-1944 inclusive (latitude 14.12°N, longitude 1.27°W, height above sea level 290 m).

The variation shown by these sites is summarized in Fig. 1, which shows monthly mean rainfall. For all data analysis, incomplete years were discarded from further consideration. For Guatemala City, 1938 and 1968 were discarded, leaving 38 years intact. For Palmira, all 52 years in the sample were used. For the Sahelian site, 1933, 1939, 1947 and 1972 were discarded, leaving a sample of 35 years.

Models tested

Three rainfall models were tested:
(1) the third-order Markov chain model outlined above, without parameter sampling, denoted TM;
(2) the third-order Markov model with sampling, denoted TMS;
(3) Richardson's (1985) model, denoted RM.

For the two versions of the third-order Markov model, parameters were estimated by using the procedure outlined above. Parameter values are shown in Table 1. At Tillabery, no rainfall has been recorded in January since 1933, and rainfall is rare in February, March, April, November and December. The standard errors of the monthly baseline probits for these months are essentially indeterminate. Rather than use the huge standard errors as calculated using GLIM, we decided to limit these arbitrarily. Subsequent analysis showed that the choice of the limit is far from critical, in that the baseline probability of rainfall for such months is so low.

Fig. 1. Monthly mean rainfall at three sites: (a) the National Observatory in Guatemala City; (b) Palmira, Colombia; (c) Tillabery, Niger.

TABLE 1

Parameters derived for the third-order Markov rainfall model

         Guatemala City (n = 38)            Palmira (n = 52)                   Tillabery (n = 35)
Month      x       α       b      sb         x       α       b      sb         x       α       b      sb
1        2.0   0.924  -1.723   0.377       7.6   0.637  -0.940   0.256       0.0   0.000  -4.562   0.415
2        2.4   0.935  -1.796   0.415       7.2   0.644  -0.741   0.258      10.5   0.318  -2.767   0.415
3        4.5   0.638  -1.650   0.352       8.3   0.627  -0.738   0.247       6.1   0.515  -2.518   0.415
4        5.4   0.753  -1.359   0.298      10.0   0.634  -0.498   0.251       5.1   0.518  -2.064   0.415
5        9.9   0.767  -0.654   0.244       8.9   0.624  -0.530   0.249       5.8   0.727  -1.324   0.298
6       11.1   0.899  -0.160   0.283       6.5   0.587  -0.715   0.251       7.7   0.593  -0.818   0.257
7       10.2   0.836  -0.386   0.260       3.7   0.621  -0.969   0.257       9.9   0.752  -0.445   0.259
8        9.9   0.896  -0.440   0.256       5.4   0.666  -0.968   0.257      13.1   0.759  -0.329   0.274
9       10.9   0.863  -0.145   0.290       6.6   0.595  -0.779   0.251       9.6   0.735  -0.777   0.257
10       9.3   0.708  -0.615   0.249       9.2   0.675  -0.432   0.251       7.4   0.604  -1.609   0.341
11       5.2   0.782  -1.275   0.284       7.2   0.684  -0.489   0.253       3.1   0.712  -2.638   0.415
12       4.0   0.632  -1.669   0.355       6.6   0.681  -0.686   0.247       2.1   1.550  -2.944   0.415
D1                    0.7894                             0.4910                             0.1380
D2                    0.1312                             0.2396                             0.1530
D3                    0.1608                             0.1590                             0.1252

x is the mean rainfall per rain event (= α/β), α is the shape parameter of the gamma distribution, b is the baseline probit P(W/DDD), D1, D2 and D3 are the three lag probits, and sb is the variance of b used for sampling.

Richardson's model was chosen as a representative 'benchmark' weather simulator against which to test the third-order Markov model. It has been tested and used extensively in crop modelling and other applications (see International Benchmark Sites for Agrotechnology Transfer, 1989, for example), and it is a comparatively straightforward model that is easy to use. It simulates daily values of rainfall, maximum and minimum temperature and solar radiation for a given location (Richardson and Wright, 1984). Rainfall for each day is generated independently of the other variables, and the values of the other variables are then generated, depending on whether the day is wet or dry. The model was designed to preserve the dependence in time, the correlation between the variables, and the seasonal characteristics in actual weather data for the location of interest (Richardson, 1985).

Richardson's model uses a first-order Markov chain to generate rainfall. The probability of rain on any day is conditional on the occurrence of rain on the preceding day. Thus only two transitional probabilities are required. If Day i is wet, then rainfall amount is determined by sampling from a gamma distribution specified by α, the shape parameter, and β, the scale parameter. The values of P_i(W/W), P_i(W/D), α and β are held constant for any month, but they vary from month to month. The values calculated for the three sites are shown in Table 2.
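
For comparison with the third-order scheme, the first-order chain just described can be sketched as below. This is a simplified illustration, not the WGEN code: it uses the January parameters for Palmira from Table 2, holds them fixed for every day, and treats β as the scale parameter of the gamma distribution (so that the mean amount is αβ). The function name is hypothetical.

```python
import random

# January parameters for Palmira (Table 2).
P_WET_AFTER_WET = 0.428   # P(W/W)
P_WET_AFTER_DRY = 0.184   # P(W/D)
ALPHA = 0.636             # gamma shape
BETA = 11.919             # gamma scale


def generate_first_order(n_days, rng, start_wet=False):
    """First-order Markov occurrence with gamma-distributed wet-day amounts."""
    wet = start_wet
    series = []
    for _ in range(n_days):
        p_wet = P_WET_AFTER_WET if wet else P_WET_AFTER_DRY
        wet = rng.random() < p_wet
        series.append(rng.gammavariate(ALPHA, BETA) if wet else 0.0)
    return series


if __name__ == "__main__":
    rain = generate_first_order(31, random.Random(5))
    print(sum(1 for r in rain if r > 0.0), "wet days;",
          round(sum(rain), 1), "total")
```

With these two transition probabilities the long-run fraction of wet days is P(W/D) / (1 - P(W/W) + P(W/D)), about 0.24, which is a quick sanity check on any implementation.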

TABLE 2

Parameters derived for the Richardson rainfall model

Parameter        J       F       M       A       M       J       J       A       S       O       N       D

Guatemala City (n = 38)
P(W/W)       0.236   0.289   0.232   0.318   0.669   0.776   0.728   0.707   0.777   0.655   0.390   0.260
P(W/D)       0.038   0.032   0.049   0.093   0.246   0.566   0.422   0.396   0.608   0.292   0.100   0.048
α            0.904   0.930   0.631   0.752   0.767   0.892   0.829   0.893   0.866   0.698   0.758   0.621
β            2.143   2.644   7.369   7.203  13.068  12.500  12.274  11.142  12.648  13.258   6.777   6.510

Palmira (n = 52)
P(W/W)       0.428   0.520   0.472   0.558   0.577   0.496   0.397   0.385   0.453   0.567   0.569   0.517
P(W/D)       0.184   0.246   0.267   0.384   0.350   0.273   0.182   0.184   0.249   0.443   0.391   0.280
α            0.636   0.645   0.627   0.633   0.624   0.586   0.621   0.665   0.595   0.674   0.683   0.681
β           11.919  11.193  13.219  15.722  14.254  11.046   5.942   8.044  11.131  13.617  10.555   9.737

Tillabery (n = 35)
P(W/W)       0.000   0.333   0.167   0.136   0.210   0.267   0.390   0.433   0.325   0.171   0.400   0.000
P(W/D)       0.000   0.002   0.006   0.019   0.089   0.235   0.408   0.487   0.243   0.054   0.002   0.000
α            0.000   0.318   0.515   0.518   0.710   0.585   0.755   0.750   0.731   0.598   0.752   0.000
β            0.000  32.868  11.868   9.921   8.136  13.180  13.092  17.594  13.190  12.628   4.954   0.000

P(W/W) and P(W/D) are probabilities of a wet day following a wet and a dry day respectively; α and β are the shape and scale parameters of the gamma distribution for rainfall amount.

Rainfall sequence generation

For all simulated sequences, 250 years of daily rainfall data were generated in an attempt to obtain asymptotic values of the generated parameters. For the Richardson model, this takes approximately 160 s on a 33 MHz 386 personal computer with a mathematics coprocessor. The third-order model with sampling takes some 190 s on the same machine.

Model comparison

Rainfall amount

Simulated monthly mean rainfall amounts, yearly totals, and variances are given in Table 3 for all three sites for the three rainfall models used. For the drier months in Guatemala City and Tillabery, calculated variances of rainfall are large and unstable; these are omitted from Table 3. Statistical differences at the 5% level between the simulated and the historical values are marked with an asterisk. Monthly and yearly distributions were tested for normality by using the Lilliefors and Shapiro-Wilk tests. If the null hypothesis of normality was not rejected in both tests, then t and F statistics were calculated. If there was any indication of non-normality, then the Mann-Whitney and Squared Ranks tests (Conover, 1980) were used instead to test the means and the variances, respectively.

All three rainfall models simulated monthly mean amounts that were in general close to the historical values. Of the 108 simulated monthly values, no monthly mean generated by any of the models failed the appropriate parametric or non-parametric test for equality with the appropriate historical monthly mean. Both RM and the third-order model without sampling, TM, significantly under-estimated the yearly variance in all cases, whereas the sampling model (TMS) led to no significant differences in the yearly variances and infrequent significant differences in the monthly variances (in most cases, failure was due to excessive variability, rather than too little).

Guatemala

0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

57.3 1574.3 59.0 1289.7 53.0 1251.7 60.1 1886.1

0.9 0.5 2.2 2.8 -

68.0 1980.3 66.6 1083.8" 68.3 1532.3 66.9 1614.7

3.0 -

3.4

3.0

3.0 2.7 2.4

2.9 2.4

F

10.2 -

8.6 8.8 7.5

1.2 1.6 1.1 , 1.1 -

86.3 2724.4 85.3 1891.8" 89.8 2436.4 88.3 2762.4

M

3.4 4.1 3.7 3.8 -

138.7 5000.6 145.3 3452.6* 140.6 3248.4* 137.8 5723.4

19.4 487.6 19.9 327.4* 18.8 310.0" 19.8 404.0

A

18.0 375.5 17.7 248.1" 17.4 221.5" 18.7 356.4

124.9 2994.6 125.5 2678.3 131.3 2916.5 130.9 4739.9*

129.3 5977.5 131.3 3267.2* 125.6 3680.5* 120.0 4173.5

M

56.2 1614.4 58.4 1051.6" 58.3 1240.4 53.7 1500.3

68.5 1791.0 68.4 1124.6" 70.3 1391.7 71.1 2227.4

239.4 6084.3 236.7 3472.9* 233.1 4180.4" 242.7 6692.9

J

122.9 2808.1 129.1 2377.3 121.1 2416.2 113.3 3025.5

26.5 359.2 25.7 255.1" 27.9 353.0 26.9 328.3

191.7 6087.0 199.1 4094.3* 188.8 3912.4" 187.2 6614.3

J

188.8 6338.4 189.4 3839.7* 176.1 4381.6" 180.7 6304.9

38.2 814.2 42.5 629.8 36.9 647.9 40.2 903.2

177.7 4997.0 179.3 3949.8 167.0 3773.2 169.2 5822.5

A

76.9 1900.2 74.3 1675.5 71.3 1598.9 74.8 2292.8

61.9 1262.9 59.1 908.3 63.9 1248.8 61.9 1671.8

240.4 10007.7 238.7 4162.0" 230.6 3979.6* 230.7 7340.7

S

14.4 621.7 16.5 370.0* 21.1 380.0* 22.0 589.4

143.9 4138.1 147.1 2773.3* 140.0 3876.9 132.6 3919.2

132.6 5921.3 132.8 3116.9" 143.7 3727.4* 145.0 5238.5

O

7.7 8.1 7.3 7.8 -

0.1 0.0 0.2 0.2

0.8 0.7 0.8

75.9 2205.1 77.8 1241.9" 76.3 1466.3" 77.5 2081.4

D

0.4

103.0 2938.5 104.0 1885.5" 108.3 1881.5" 105.1 2990.1

22.3 583.4 2.9 328.9* 25.8 415.9 29.3 715.0

N

483.2 18428.6 492.2 10296.3* 473.1 10370.3* 471.8 16907.3

993.2 39202.1 1006.1 20983.9* 1006.6 20528.8* 999.2 50167.0

1175.0 44712.7 1182.7 23721.1" 1153.7 25738.4* 1168.3 44145.7

Year

38

250

250

250

35

250

250

250

52

250

250

250

n

Mean (x, mm) and variance (var). Significant differences between historical (Hist) and simulated values are indicated by an asterisk. RM is the Richardson model, TM is the third-order Markov model without sampling, and TMS is the third-order Markov model with sampling.

Hist x var RM x var TM x var TMS x var

Tillabery

Hist x var RM x var TM x var TMS x var

Palmira

Hist x var RM x var TM x var TMS x var

J

Rainfall amounts per month

TABLE 3


TABLE 4

Raindays per month

Guatemala City (Hist n = 38; RM, TM and TMS n = 250)
Month    Hist x  Hist var    RM x   RM var    TM x   TM var   TMS x  TMS var
J           1.5       2.8     1.3     2.0      1.6     2.2      1.8     4.8*
F           1.2       3.7     1.1     1.6*     1.2     1.4*     1.3     2.9
M           1.8       3.3     1.8     2.7      1.8     2.6      2.1     5.9*
A           3.6       4.4     3.6     5.4      3.7     5.6      3.7    11.8*
M          12.9      27.1    12.9    17.1*    12.2    17.2*    12.2    29.5
J          21.5      15.4    21.1     9.4*    21.6    13.1     21.5    29.0*
J          18.8      17.5    19.0    14.0     18.6    18.2     18.2    36.6*
A          17.9      27.4    17.7    14.9*    17.0    17.6*    16.8    37.1
S          21.9       8.5    21.7     9.8     21.5    12.5*    21.1    33.0*
O          14.3      23.9    14.3    15.3*    15.0    17.9     14.6    34.1
N           4.3      10.4     4.3     5.5*     4.6     6.8*     5.1    13.5
D           1.9       4.5     1.8     2.7*     1.8     2.4*     2.1     5.9
Year      121.7     231.9   120.8    96.5*   120.5   118.1*   120.6   334.4

Palmira (Hist n = 52; RM, TM and TMS n = 250)
Month    Hist x  Hist var    RM x   RM var    TM x   TM var   TMS x  TMS var
J           7.6      14.9     7.5     9.2*     7.4    11.0      8.3    25.3*
F           9.4      25.3     9.0     9.5*     9.5    12.8*     9.2    24.1
M          10.4      21.6    10.2    10.9*    10.3    12.7*    10.8    26.6
A          13.9      23.4    14.0     9.6*    14.0    15.4*    13.7    26.6
M          14.0      26.0    13.3    12.0*    14.6    13.7*    14.5    32.1
J          10.6      21.5    10.6     9.4*    10.6    13.4*    10.6    27.9
J           7.2      18.1     6.8     8.2*     7.2    10.0*     7.3    17.2
A           7.1      14.2     7.3     8.2*     6.9    10.5      7.5    22.1*
S           9.3      21.8     8.7     8.9*     9.7    12.1*     9.5    21.4
O          15.7      14.9    15.0     9.4*    15.1    18.8     15.1    32.5*
N          14.3      25.0    13.9    10.7*    14.7    18.0     14.5    34.5
D          11.4      21.5    11.0    11.9*    12.1    14.2*    12.0    29.7
Year      131.0    1081.1   127.3   129.8*   132.2   149.7*   133.0   762.9

Tillabery (Hist n = 35; RM, TM and TMS n = 250)
Month    Hist x  Hist var    RM x   RM var    TM x   TM var   TMS x  TMS var
J           0.0       0.0     0.0     0.0      0.0     0.0      0.0     0.0
F           0.1       0.1     0.1     0.1      0.2     0.2*     0.2     0.2*
M           0.2       0.2     0.3     0.4*     0.2     0.2      0.2     0.2
A           0.7       0.9     0.7     0.8      0.7     0.7      0.7     1.9*
M           3.1       5.5     3.0     3.3*     2.9     3.1*     3.2     6.7
J           7.3       5.2     6.8     5.4      7.3     6.6      7.1    13.8*
J          12.4      18.2    12.4     6.9*    12.1     8.9*    12.0    20.5
A          14.3      11.2    14.3     6.7*    13.5    11.6     13.7    23.0*
S           8.0       8.0     7.7     6.4      7.8     7.9      7.5    13.3*
O           1.9       3.9     2.0     2.6*     2.6*    2.9      2.7*    4.7
N           0.1       0.2     0.1     0.3*     0.1     0.1*     0.2     0.2
D           0.1       0.1     0.0     0.0      0.1     0.1      0.1     0.1
Year       48.1     102.9    47.3    36.8*    47.4    40.0*    47.4   123.6

Mean (x) and variance (var). Significant differences between historical (Hist) and simulated values are indicated by an asterisk. RM is the Richardson model, TM is the third-order Markov model without sampling, and TMS is the third-order Markov model with sampling.

Raindays per month

An analysis of the mean number of rain days per month is given in Table 4. Statistically significant differences in means and variances were tested as for rainfall amount by using the non-parametric tests referred to above. For the rainfall model TMS, raindays per month is an output that is being manipulated directly owing to random resampling; the monthly means should match the historical monthly means, and the variances should increase, compared with models RM and TM (as for rainfall per month). This can be observed from Table 4 in all cases, apart from October at Tillabery, for the sampling model. Generally, all models reproduced mean raindays per month successfully, whereas simulated variances for the non-sampling models were again under-estimated. Note that TMS over-estimated variance in all cases of significant difference.
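
The raindays statistics compared above can be computed from any daily series along the following lines. This is a sketch under assumptions: a day counts as a rain day when its amount is at least 0.1 mm (matching the threshold used for the gamma distribution), every year is treated as having 365 days, and the function names are hypothetical.

```python
from statistics import mean, variance

MONTH_LENGTHS = (31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)


def raindays_per_month(year_of_daily_rain, threshold=0.1):
    """Count rain days in each calendar month of one 365-day year."""
    counts = []
    start = 0
    for length in MONTH_LENGTHS:
        month = year_of_daily_rain[start:start + length]
        counts.append(sum(1 for r in month if r >= threshold))
        start += length
    return counts


def monthly_mean_and_variance(years):
    """Mean and sample variance of raindays per month across years."""
    per_year = [raindays_per_month(y) for y in years]
    by_month = zip(*per_year)       # regroup: one tuple of counts per month
    return [(mean(m), variance(m)) for m in by_month]


if __name__ == "__main__":
    dry_year = [0.0] * 365
    wet_year = [1.0] * 365
    print(monthly_mean_and_variance([dry_year, wet_year])[0])
```

The same two functions applied to the historical and the simulated records give the x and var pairs that the significance tests compare.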

Consecutive dry sequences per month

For all historical and generated sequences, runs of successive dry days were counted for each month. These were then cumulated (an example is shown in Fig. 2 for the month of April at Guatemala City) and tested for departure from the relevant historical cumulative distribution function using the Kolmogorov test. A chi-squared test was also performed on these distributions. These were two-, three-, or four-cell tests, depending on the distribution. Any cells with low counts (n < 5) were amalgamated with a neighbouring cell. These results are shown in Table 5, in terms of failure at the 5% level for the respective tests. The relatively poor performance of RM for Palmira is noteworthy, as is that of TM and TMS for Tillabery. This suggests that the different models have varying abilities to model within-month dry sequences at different sites. Long-sequence analysis, discussed later, was also used for further investigation of this aspect.
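
Counting dry runs and cumulating them can be sketched as follows. The code assumes it is given the daily amounts for one month at a time, that a day is dry when its amount is below 0.1 mm, and that the Kolmogorov statistic is the largest absolute difference between the two cumulative curves; the function names are hypothetical, and the comparison against critical values is left out.

```python
from collections import Counter


def dry_run_lengths(daily_rain, threshold=0.1):
    """Lengths of runs of consecutive dry days within one month's record."""
    runs = []
    current = 0
    for amount in daily_rain:
        if amount < threshold:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def cumulative_distribution(run_lengths, max_length):
    """Cumulative proportion of runs with length <= 1, 2, ..., max_length."""
    counts = Counter(run_lengths)
    total = len(run_lengths)
    running = 0
    cdf = []
    for length in range(1, max_length + 1):
        running += counts.get(length, 0)
        cdf.append(running / total)
    return cdf


def kolmogorov_distance(cdf_a, cdf_b):
    """Largest absolute difference between two cumulative curves."""
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))


if __name__ == "__main__":
    runs = dry_run_lengths([0, 0, 5.2, 0, 3.1, 0, 0, 0])
    print(runs, cumulative_distribution(runs, 3))
```

Runs are cut at the edges of the series passed in, so a dry spell spanning two months is counted as two shorter runs when months are processed separately.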

Third-order state probability analysis

The ability of the various models to simulate 3-day wet-dry sequences was tested in the following way. For each sequence, the number of monthly occurrences of the eight wet-dry states associated with the third-order model (000, 001, 010, 011, 100, 101, 110, 111, where 0 signifies dry and 1 wet) was counted. These state occurrences were then compared with the historical state counts, and a chi-squared test was performed. Because overlapping triads were tested, the resultant values of chi-squared were divided by three. Results by month and by state are shown in Table 6, together with statistically significant departures. There is some indication that the monthly state occurrences for the sampling model are being distorted somewhat, particularly in the wetter months. It is interesting that RM, as a first-order Markov model, reproduces the historical occurrences quite well (this ability is built into TM). Some caution is needed in interpreting this test, because the occurrence of the eight states is not completely independent: for example, the occurrence of triad 010 implies the previous occurrence of triads 101 or 001.
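
The triad count and the scaled chi-squared value can be sketched as below. The exact form of the test is not spelled out in the text, so this example assumes that the expected counts are the historical proportions rescaled to the length of the simulated record; the division by three follows the text. Function names are hypothetical.

```python
from itertools import product

# The eight wet-dry states: 000, 001, ..., 111 (0 = dry, 1 = wet).
STATES = ["".join(bits) for bits in product("01", repeat=3)]


def triad_counts(wet_flags):
    """Count overlapping 3-day wet-dry states in a sequence of 0/1 flags."""
    counts = dict.fromkeys(STATES, 0)
    for i in range(len(wet_flags) - 2):
        key = "".join(str(int(bool(flag))) for flag in wet_flags[i:i + 3])
        counts[key] += 1
    return counts


def scaled_chi_squared(simulated, historical):
    """Chi-squared of simulated against historical counts, divided by three."""
    sim_total = sum(simulated.values())
    hist_total = sum(historical.values())
    chi2 = 0.0
    for state in STATES:
        # Assumption: expected counts follow the historical proportions.
        expected = historical[state] * sim_total / hist_total
        if expected > 0:
            chi2 += (simulated[state] - expected) ** 2 / expected
    return chi2 / 3.0


if __name__ == "__main__":
    hist = triad_counts([0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 1])
    sim = triad_counts([0, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 0])
    print(round(scaled_chi_squared(sim, hist), 3))
```

States with no historical occurrences are skipped in this sketch, which is one simple way of avoiding a division by zero.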

Pentad and long-sequence analysis Wet-dry sequences over longer periods of time were tested by comparing simulated and observed frequencies for 5-day state occurrences for each entire

Fig. 2. Cumulative probability functions showing the number of consecutive dry days in April for historical rainfall data (52 years) and for simulated rainfall data (250 years): (a) Guatemala City, National Observatory; (b) Richardson model (RM); (c) third-order Markov model without sampling (TM); (d) third-order Markov model with sampling (TMS). Horizontal axis: dry sequence length.

TABLE 5

Consecutive dry sequences by month By month

J F

Palmira

Guatemala RM

TM

C

C

TMS

M A

M J J

Tillabery

RM

TM

TMS

RM

CK C C CK CK

C

C

C

C

C

A

C

S

C

O N D

C

CK

TM

TMS

C

C

C

C C C

C C CK

CK CK

C

CK

Results of the χ² (C) and Kolmogorov (K) tests. K or C indicates failure at the 5% level; insufficient data is indicated by -. Richardson model (RM), third-order Markov model without sampling (TM), third-order model with sampling (TMS).

sequence. Thus, each year was split into 73 non-overlapping 5-day sequences, which were converted into the appropriate wet-dry pentad; the occurrences were then counted and cumulated. A 32-cell ($2^5$, ranging from 00000, 5 dry days, to 11111, 5 wet days) chi-squared test was performed. The maximum number of consecutive wet and dry days was obtained from each sequence, together with the number of wet and dry runs that exceeded 9 consecutive days. The simulated numbers of long wet and dry sequences were compared with the historical numbers by using a two-cell chi-squared test. Results are shown in Table 7.

The analysis of pentad frequency showed significant departures from historical records for both RM and TMS at Guatemala City and Palmira. All rainfall models had difficulty in simulating long wet and long dry sequences (more than 9 consecutive days). Similarly, the goodness-of-fit between simulated and historical maximum dry and wet sequences was poor for all models: RM and TMS overestimated both the maximum wet and dry sequence for Guatemala City; the maximum dry sequence for Palmira was underestimated by all three models; and for Tillabery, the models overestimated the maximum dry sequence and underestimated the maximum wet sequence.

Onset and duration of the growing season

Daily ratios of actual to potential evapotranspiration were obtained for all rainfall sequences by using a water balance model based on WATBAL (Reddy, 1979). This routine calculates available soil water, runoff, and EA/ET


TABLE 6

Triad states probability analysis Guatemala RM

Palmira

Tillabery

TM

TMS

RM

TM

TMS

RM

TM

TMS

3.41 11.52 0.37 0.81 4.78 7.82 7.64 4.27 9.95 1.98 2.17 3.77

2.83 0.94 2.52 4.84 1.51 23.44* 14.92* 18.07* 29.75* 1.90 1.99 1.54

2.51 9.78 9.81 8.09 8.38 3.38 7.24 6.80 11.27 10.93 7.16 3.93

4.99 2.47 1.70 2.26 4.15 0.34 2.37 2.37 2.82 11.41 5.71 2.87

6.02 1.10 2.89 7.14 11.94 7.62 6.75 3.70 3.51 40.02* 15.09* 7.55

0.00 0.55 1.46 6.89 1.13 2.57 3.11 3.81 5.45 3.15 0.58 0.51

0.00 1.40 1.60 3.33 3.72 3.48 3.01 3.22 3.07 15.47* 15.09* 2.71

0.00 3.23 1.23 1.46 0.75 10.45 10.63 18.43* 3.76 14.21* 0.83 4.14

27.41* 3.42 2.21 5.50 3.87 7.93 4.02 10.75

3.72 1.66 0.88 1.55 1.59 4.73 1.64 2.51

10.56 1.65 3.59 2.22 1.97 4.95 1.79 6.62

17.44 2.07 6.30 2.27 2.66 11.11 2.23 4.09

5.10 0.68 1.62 2.99 2.57 5.40 3.08 4.52

10.30 4.46 13.88 9.25 6.03 7.10 10.06 7.47

15.10 5.84 13.43 1.29 7.90 8.01 2.28 7.70

By Month (7 d.f.) J F M A M J J A S O N D

2.11 7.51 1.31 0.71 3.79 4.52 2.45 1.24 4.32 5.67 4.18 3.34

By State (0 = dry, 1 = wet)

(11 d.f.) 000 001 010 011 100 101 110 111

2.97 3.09 2.31 3.44 2.41 11.29 2.02 5.38

8.35 4.48 4.66 2.56 4.36 6.92 1.72 16.19

Values of χ². An asterisk indicates significant difference from historical values at the 5% level. Richardson model (RM), third-order Markov model without sampling (TM), third-order model with sampling (TMS).

for one time period (here a day) in response to inputs of rainfall during the period, soil water capacity, evaporation and a crop cover factor. Soil water capacity was set at 100 mm; a crop cover factor of 1 and a potential evaporation constant of 8 mm day⁻¹ were used throughout. In order to compare historical with simulated growing season variables, the following were calculated from each yearly series: (1) the day number on which EA/ET first exceeds 0.8 for 5 consecutive days (a proxy for the start of the growing season); (2) the day number on which EA/ET last exceeds 0.5 for 8 consecutive days (a proxy for the end of the growing season); (3) the number of days between the start and end of the season on which EA/ET exceeds 0.5 (number of 'growing' days); (4) the number of days between the start and end of the season on which EA/ET is less than 0.5 (number of 'stress' days). For the sites tested, there were years when the criteria for the start and end


TABLE 7

Pentad and long wet and dry sequence analysis

                                                Historical   RM        TM        TMS
Guatemala
  Pentad χ² (31 d.f.)                           -            53.1*     43.5      64.9*
  Long (>9) wet/dry sequences χ² (1 d.f.)       -            3.0       7.6*      7.9*
  Longest dry sequence (days)                   150          163       137       168
  Longest wet sequence (days)                   37           42        36        48
Palmira
  Pentad χ² (31 d.f.)                           -            154.8*    41.9      136.5*
  Long (>9) wet/dry sequences χ² (1 d.f.)       -            19.5*     2.5       55.9*
  Longest dry sequence (days)                   62           41        46        57
  Longest wet sequence (days)                   20           15        24        38
Tillabery
  Pentad χ² (31 d.f.)                           -            41.8      32.6      42.6
  Long (>9) wet/dry sequences χ² (1 d.f.)       -            9.9*      12.7*     12.5*
  Longest dry sequence (days)                   245          264       246       273
  Longest wet sequence (days)                   32           10        13        17

An asterisk indicates a significant difference at the 5% level compared with the historical data. Richardson model (RM), third-order Markov model without sampling (TM), third-order model with sampling (TMS).
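The pentad and long-sequence counts that feed Table 7 can be sketched as follows. This is only an illustration under stated assumptions, not the authors' program: the wet-day threshold `WET_THRESHOLD_MM`, the function names, and the choice to take expected cell frequencies directly from historical proportions are all hypothetical.

```python
from collections import Counter
from itertools import groupby

# ASSUMPTION: a day is "wet" if rainfall exceeds this amount (mm).
WET_THRESHOLD_MM = 0.0


def pentad_counts(years):
    """Count wet-dry pentad patterns ('00000'..'11111') over 365-day years."""
    counts = Counter()
    for rain in years:
        wet = ["1" if r > WET_THRESHOLD_MM else "0" for r in rain[:365]]
        for start in range(0, 365, 5):  # 73 non-overlapping 5-day blocks
            counts["".join(wet[start:start + 5])] += 1
    return counts


def chi_squared(observed, expected_props):
    """Pearson chi-squared of observed counts against expected proportions.

    Cells whose expected count is zero are skipped; cells absent from
    expected_props are ignored.
    """
    total = sum(observed.values())
    stat = 0.0
    for cell, p in expected_props.items():
        expected = p * total
        if expected > 0:
            stat += (observed.get(cell, 0) - expected) ** 2 / expected
    return stat


def run_lengths(rain):
    """Return (wet_runs, dry_runs): lengths of consecutive wet and dry spells."""
    wet_runs, dry_runs = [], []
    for is_wet, group in groupby(r > WET_THRESHOLD_MM for r in rain):
        (wet_runs if is_wet else dry_runs).append(sum(1 for _ in group))
    return wet_runs, dry_runs


def long_run_count(runs, limit=9):
    """Number of runs longer than `limit` days (more than 9 by default)."""
    return sum(1 for n in runs if n > limit)
```

With historical proportions computed from `pentad_counts(historical_years)`, `chi_squared(pentad_counts(simulated_years), historical_props)` gives a statistic on up to 31 degrees of freedom when all 32 cells are populated. The longest wet and dry sequences are simply the maxima of the lists returned by `run_lengths`.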

of the growing season (as defined above) were never satisfied. These years, denoted indeterminate years, were excluded from the analysis (Table 8). Yearly variances for the four variables were then calculated, and significant differences were again tested using the t and F tests (for normal distributions) or the Mann-Whitney or Squared Ranks tests for non-normal distributions. By these criteria, Guatemala and Tillabery have a fairly sharply defined start and end to the growing season. Results for Palmira are excluded from Table 8 because of the absence of a well-defined dry season (see Jones (1987) for the problems of defining a growing season for Palmira). The results again illustrate that all models simulate means well. The addition of sampling to the third-order Markov model tended to increase the variances substantially. It is noteworthy that the variance at the end of the growing season is so much less than that at the start for the sites analyzed. For the agriculturalist, this might well be thought of as a perversity of nature, in that the critical decision as to when to sow has to be taken when the variability is highest.
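The four growing-season variables, and the notion of an indeterminate year, can be illustrated with a short sketch that takes one year's daily EA/ET series as input. The function name and arguments are hypothetical. Two conventions are assumptions rather than statements from the text: the start is taken as the first day of the qualifying 5-day run, and the end as the last day of the final qualifying 8-day run.

```python
def growing_season(ratio, start_level=0.8, start_run=5,
                   end_level=0.5, end_run=8, growing_level=0.5):
    """Return (start_day, end_day, growing_days, stress_days), 1-based days.

    `ratio` is one year's daily EA/ET series. Returns None for an
    indeterminate year, i.e. when the start or end criterion is never met.
    """
    n = len(ratio)

    # (1) First run of `start_run` days with EA/ET above `start_level`.
    start = None
    for i in range(n - start_run + 1):
        if all(x > start_level for x in ratio[i:i + start_run]):
            start = i  # ASSUMPTION: first day of the run marks the start
            break

    # (2) Last run of `end_run` days with EA/ET above `end_level`.
    end = None
    for i in range(n - end_run, -1, -1):
        if all(x > end_level for x in ratio[i:i + end_run]):
            end = i + end_run - 1  # ASSUMPTION: last day of the run marks the end
            break

    if start is None or end is None or end < start:
        return None

    # (3) and (4): count days within the season, both end days included.
    season = ratio[start:end + 1]
    growing = sum(1 for x in season if x > growing_level)
    stress = sum(1 for x in season if x < growing_level)
    return start + 1, end + 1, growing, stress
```

Applying the function to every year of a record and discarding the `None` results reproduces the exclusion of indeterminate years before means and variances are computed.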


TABLE 8

Growing season variable analysis

              Start of growing      End of growing        Number of             Number of             Indeterminate
              season (day no.)      season (day no.)      growing days          stress days           years (total)
              Mean     Var          Mean     Var          Mean     Var          Mean     Var
Guatemala
  Historical  162.6    787.85       311.6    250.19       112.1    994.16       37.8     548.08       1 (38)
  RM          158.7    361.49*      315.1*   194.04       120.6    475.66*      36.8     359.38*      0 (250)
  TM          161.7    399.33*      311.3    140.02       116.3    521.85*      34.4     214.01*      0 (250)
  TMS         162.7    422.96*      312.5    170.29       112.1    641.58*      38.7     406.27       0 (250)
Tillabery
  Historical  217.5    572.10       271.6    107.26       35.4     368.40       19.6     193.16       8 (35)
  RM          217.4    412.33       274.9    220.89*      32.8     229.07       25.6*    277.43       49 (250)
  TM          217.6    469.73       273.9    250.86*      30.5     181.55       26.8*    389.14       43 (250)
  TMS         218.4    487.56       274.5    255.65*      32.5     246.23       24.6     372.46       45 (250)

An asterisk indicates significant difference at the 5% level compared with the historical data. Richardson model (RM), third-order Markov model without sampling (TM), third-order model with sampling (TMS).

Crop growth and development using CERES-Maize

Historical and simulated rainfall sequences were used as input to the CERES-Maize model (Ritchie et al., 1989). This is a simulation model of crop growth and development that operates on a daily time step. The model is sensitive to nitrogen, water, soil, climate and a wide range of management factors. Potential dry matter production is a function of photosynthetically active radiation, and the dry matter produced on any day is partitioned between the plant organs that are growing at that time. The phenology of the crop is driven by the accumulation of daily thermal time. Data inputs include daily weather variables (maximum and minimum temperatures, rainfall and solar radiation), crop and management information and soil pedon data. CERES-Maize has been tested in many locations around the world, and performs well in a wide variety of environments (see Keating et al. (1991) for a description of extensive validation work in Kenya in rain-fed conditions, with simulated and observed maize yields ranging from 0 to 8000 kg ha⁻¹).

The maize model was run for one site only, Palmira, Colombia. Because the Richardson weather generator simulates the variables required for running CERES-Maize and because temperatures and solar radiation are dependent on rainfall, this model was used in conjunction with the appropriate historical and the other simulated rainfall sequences. Temperature and solar radiation data for 12 years from CIAT, located some 3 miles from Palmira, were used


TABLE 9

Results of CERES-Maize simulations at Palmira, Colombia

              Yield (t ha⁻¹)        Days, E-M             Rainfall, E-M (mm)    n
              Mean     Var          Mean     Var          Mean     Var
Historical    6.65     1.520        83.6     1.64         188      7396.5       49
RM            6.92     0.904*       83.5     2.05         188      3384.1*      245
TM            6.76     1.176        83.6     2.12         181      4352.4*      245
TMS           6.72     1.400        83.6     2.01         189      5366.9       245

Mean and variance (Var) of yield, number of days from emergence to maturity (E-M), and total rainfall from emergence to maturity. An asterisk indicates significant difference at the 5% level compared with historical data.

to estimate the simulation parameter values for RM, and these were then used, with the relevant rainfall record, to produce a complete set of weather variables for CERES-Maize. For all four rainfall sequences, the same 'treatment' was run over a number of years: the cultivar was CORNL281, sowing date was 30 November, the soil was a loamy clayey Typic Pellustert located at CIAT, nitrogen was assumed to be non-limiting, and all initial conditions were identical. The object was to isolate as much as possible the effects of the rainfall generators used, when comparing them with the historical rainfall sequence. Results are tabulated in Table 9 in terms of mean and variance of maize yield, the number of days from crop emergence to maturity, and total rainfall during the same period. Yields are shown as cumulative probability functions in Fig. 3.

Fig. 3. Cumulative yield functions simulated using CERES-Maize, historical (n = 49) and simulated (n = 245) rainfall sequences, and identical crop management regimes: Palmira, Colombia. (Horizontal axis: yield, t ha⁻¹; vertical axis: cumulative probability, 0.0 to 1.0.)
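A cumulative yield function of the kind plotted in Fig. 3 is an empirical distribution of the simulated yields. The sketch below shows one way such a curve could be built from a list of yearly yields; the function names are hypothetical, the plotting position i/n is an assumption, and no yields from the study are reproduced.

```python
def ecdf(values):
    """Return (value, cumulative probability) pairs for the sorted values.

    ASSUMPTION: the i-th smallest of n values is given probability i/n.
    """
    ordered = sorted(values)
    n = len(ordered)
    return [(v, (i + 1) / n) for i, v in enumerate(ordered)]


def prob_at_or_below(values, threshold):
    """Fraction of values that do not exceed the threshold."""
    return sum(1 for v in values if v <= threshold) / len(values)
```

Comparing `prob_at_or_below` for historical and simulated yields at a low threshold is a simple way to examine the lower tail of the distributions, where the rainfall models differ most.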


When the Richardson model was used, the variance of grain yield was significantly different from the historical value, and the critical (significance) level for the t-test comparing the historical and the simulated mean yield was approximately 0.07. ('Historical' is not an entirely appropriate description, because maximum and minimum temperature and solar radiation were still simulated in this case in response to historical rainfall sequences.) Note that at the top of the yield distribution, water is non-limiting, and all rainfall models converge to produce a similar maximum yield. This value of 9.2 t ha⁻¹ may be taken as the yield in optimal conditions for this variety and the particular set of management conditions that pertain. At the lower end of the yield distribution, however, the superiority of the sampling rainfall model becomes apparent.

CONCLUSIONS

The results of the testing reported above suggest that the third-order Markov model performs adequately for many applications. The ability to add in more monthly and yearly variance has also been demonstrated, and this appears to take place without upsetting the overall performance of the model. The third-order Markov chain model with sampling should be appropriate for situations where this increased variance is deemed important. The results also show that the Richardson model and third-order Markov model without sampling perform similarly. There are some indications, however, that the third-order model without sampling increases certain annual variances in the analyses performed above, as compared with the Richardson model.

The three sites used for testing show very different characteristics. It would perhaps be surprising if the same statistical structure were capable of simulating all aspects of rainfall in a completely satisfactory manner at all sites. In particular, long wet and dry sequences are not modelled very well by any of the generators tested. This underlines the fact that caution is always needed when using a statistical weather generator: 'acceptable' performance rather depends on the use to which the simulated rainfall records are to be put.

The computations required in estimating the larger numbers of parameters involved in the third-order Markov chain model are not a significant problem. The generation of rainfall data using the sampling model involves less than a 20% increase in the time required, compared with the Richardson model.

Work is currently being undertaken to sidestep the necessity of using an external statistical package (GLIM) to estimate some of the parameters for the third-order model. In addition, the possibility of using the third-order sampling model for spatial interpolation of rainfall records is being investigated.


ACKNOWLEDGEMENTS

The authors gratefully acknowledge the comments and helpful suggestions of Julio Henao and two anonymous referees; remaining errors and omissions are the authors' responsibility.

REFERENCES

Baker, R.J. and Nelder, J.A., 1978. Generalized Linear Interactive Modelling Manual, release 3. Numerical Algorithms Group, Oxford, UK.
Conover, W.J., 1980. Practical Non-Parametric Statistics, 2nd edition. Wiley, New York, pp. 216-248.
International Benchmark Sites Network for Agrotechnology Transfer, 1989. Decision Support System for Agrotechnology Transfer v.2.1: User's Guide. Department of Agronomy and Soil Science, College of Tropical Agriculture and Human Resources, University of Hawaii, Honolulu, HI.
Jones, P.G., 1987. Current availability and deficiencies in data relevant to agro-ecological studies in the geographic area covered by the IARCS. In: A.H. Bunting (Ed), Agricultural Environments. CAB International, Wallingford, UK, pp. 69-83.
Keating, B.A., Godwin, D.C. and Watiki, J.M., 1991. Optimising nitrogen inputs in response to climatic risk. In: R.C. Muchow and J.A. Bellamy (Eds), Climatic Risk in Crop Production: Models and Management for the Semiarid Tropics and Subtropics. CAB International, Wallingford, UK, pp. 329-358.
Marsaglia, G. and Bray, T.A., 1964. A convenient method for generating normal variables. Society for Industrial and Applied Mathematics Review, 6(3): 260-264.
Numerical Algorithms Group, 1980. GENSTAT, A General Statistical Program. Numerical Algorithms Group, Oxford, UK.
Reddy, S.J., 1979. WATBAL User's Guide: a program for calculating soil water balance, written in BASIC Plus. Mimeo, ICRISAT, Patancheru, India.
Richardson, C.W., 1985. Weather simulation for crop management models. Trans. ASAE, 28(5): 1602-1606.
Richardson, C.W. and Wright, D.A., 1984. WGEN: A Model for Generating Daily Weather Variables. USDA, Agricultural Research Service, ARS-8, 83 pp.
Ritchie, J.T., Singh, U., Godwin, D.C. and Hunt, L.A., 1989. A User's Guide to CERES-Maize V2.10. International Fertilizer Development Center, Muscle Shoals, AL, 86 pp.
Stern, R.D. and Coe, R., 1982. The use of rainfall models in agricultural planning. Agric. Meteorol., 26: 35-50.