Agricultural and Forest Meteorology 90 (1998) 51–63
Spatial scales of climate information for simulating wheat and maize productivity: the case of the US Great Plains

William E. Easterling a,*, Albert Weiss b,1, Cynthia J. Hays b,2, Linda O. Mearns c,3

a Department of Geography, Pennsylvania State University, University Park, PA 16802, USA
b School of Natural Resource Sciences, University of Nebraska-Lincoln, Lincoln, NE, USA
c National Center for Atmospheric Research, Boulder, CO, USA
Received 2 December 1996; received in revised form 26 September 1997; accepted 7 November 1997
Abstract

The spatial aggregation of climate and soils data for use in site-specific crop models to estimate regional yields is examined. The purpose of this exercise is to determine the optimum spatial resolution of observed climate and soils data for simulating major crops grown in the central Great Plains (maize, wheat), beginning at a scale of 2.8° × 2.8° (T42), which is close to that of the European Centre for Medium-Range Weather Forecasts (ECMWF) general circulation model (GCM) grid cell, and progressively disaggregating climate and soils data to finer spatial scales. Using the Erosion Productivity Impact Calculator (EPIC) crop model, observed crop yields for the period 1984–1992 are compared with yields simulated with observed 1984–1992 climate. The goal is to identify the spatial resolution of climate and soils data that minimizes statistical error between observed and modeled yields. Agreement between simulated and observed maize and wheat yields was greatly improved when climate data were disaggregated to approximately 1° × 1° resolution. No disaggregation results for hay were statistically significant. Disaggregation of climate data finer than the 1° × 1° resolution gave no further improvement in agreement. Disaggregation of soils data gave no additional improvement beyond that of the disaggregation of climate data. © 1998 Elsevier Science B.V.

Keywords: Climate change; Agriculture; Scaling; Crop model
* Corresponding author. Fax: +1-814-863-7943; e-mail: [email protected]
1 Fax: +1-402-472-6614; e-mail: [email protected]
2 Fax: +1-402-472-6614; e-mail: [email protected]
3 Fax: +1-303-497-1625; e-mail: [email protected]

0168-1923/98/$19.00 © 1998 Elsevier Science B.V. All rights reserved. PII S0168-1923(97)00091-9

1. Introduction

The scale resolution of most crop models used in simulating climate change impacts is, for all practical purposes, a single point on the earth's surface. Obtaining regional estimates of crop response to climate change requires either scaling the model simulations from point estimates or scaling the data inputs from point measurements and then performing the simulations. Both ways will result in some loss of information due to aggregation error (discussed below). Focusing on the latter, the effect of the spatial scale of climate inputs on crop simulations is examined in this paper.

A common approach to estimating crop yield response to climate change is to insert simulated
change in climate elements computed by general circulation models (GCMs) indirectly into physiological crop models (e.g., Ritchie et al., 1989; Rosenzweig, 1989; Rosenzweig and Parry, 1994; Mearns et al., 1992). This approach involves adjusting the historical climate records of a regional distribution of observing stations to reflect the differences between GCM-simulated current climate and GCM-simulated climate change from greenhouse forcing. Typically, GCM simulations of climate change are computed for a global grid where grid cells are of the order of 3° latitude or coarser. Hence, GCM climate change simulations from a single grid point are uniformly applied to all climate stations within a 3° (approximately 250 km × 320 km in the mid-latitudes), or larger, grid cell. The statistical properties of the observed climate series are not affected by the GCM adjustments; hence, the statistical properties of the GCM time series are not imparted to the observed climate series. The adjusted climate records are incorporated into crop models and simulations are performed. The scaling of station-level simulations to the regional level, if done at all, is usually accomplished by averaging the simulations across stations (sometimes weighting the average by some criterion such as the percent land area represented by each station).

An alternative scaling procedure is to average the climate data inputs (adjusted by the GCM climate change simulations as described above) of two or more climate stations across an area of dimensions determined by sensitivity analysis, and then perform the crop simulations. Assuming that the ideal areal dimensions for averaging the climate data are smaller than a GCM grid cell or any other region serving as the scaling target (e.g., state or province), the areal (subregional) simulations can then be averaged to aggregate yields to the regional level. This latter approach approximates a partitioning strategy outlined below and is followed in this research.
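The adjustment step described above, in which one GCM-simulated change is applied uniformly to every station record in a grid cell, can be illustrated with a minimal sketch. It assumes a purely additive temperature change; the station names, temperature values, GCM means and the function `apply_gcm_change` are all hypothetical and are not taken from the studies cited.

```python
from statistics import mean, pvariance


def apply_gcm_change(station_temps_c, delta_t_c):
    """Shift every station's observed record by one grid-cell-wide change."""
    return {
        station: [t + delta_t_c for t in temps]
        for station, temps in station_temps_c.items()
    }


# Hypothetical daily maximum temperatures (deg C) at two stations in one GCM cell.
observed = {
    "station_a": [24.0, 27.5, 30.0, 22.5],
    "station_b": [21.0, 25.0, 28.5, 20.5],
}

# Hypothetical GCM output for the cell: changed-climate run minus control run.
gcm_control_mean_c = 23.0
gcm_changed_mean_c = 26.0
delta_t = gcm_changed_mean_c - gcm_control_mean_c

adjusted = apply_gcm_change(observed, delta_t)

for name in observed:
    shift = mean(adjusted[name]) - mean(observed[name])
    print(name, "mean shift:", shift,
          "variance before/after:",
          pvariance(observed[name]), pvariance(adjusted[name]))
```

Under this additive assumption each station's mean moves by the same amount while its variance is untouched, which is the sense in which the statistical properties of the observed series are left unaffected.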
While we recognize that the vast majority of crop–climate impact studies have used the first scaling procedure described above, we examine the partitioning approach as an alternative scaling procedure that deserves testing. We argue that identifying the spatial resolution of observed climate data at which observed and simulated crop yields are in closest agreement is a useful starting point for determining the appropriate spatial scale for simulating regional crop response to climate change. We make this argument because we are optimistic that, with steady improvements, climate models will eventually simulate the statistical properties of current climate accurately across a range of spatial scales, at which time the optimal scale of climate model simulations for computing crop yields will be no different from that of observed climate information.

The purpose of this paper is to examine the effect that different levels of spatial aggregation (i.e., the size of the areas over which multiple in situ data points are synthesized into a single areal estimate) of climate and soils data used in crop models have on the agreement between modeled and observed crop yields. We begin with aggregation at the GCM grid scale and progressively disaggregate to finer spatial scales of resolution. Specifically, this paper evaluates the relative strengths of relationships among climate and soils factors and crop yields over a range of spatial scales. The coarsest resolution matches the 2.8° × 2.8° grid point spacing of the European Centre for Medium-Range Weather Forecasts (ECMWF) model (hereafter referred to as the 'GCM' grid scale); the finest resolution is one thirty-sixth of the ECMWF GCM cell size⁴ (0.5° × 0.5°, or approximately 40 km × 50 km point spacing in the mid-latitudes). This finest scale is typical of that used for nested regional climate model experiments (Giorgi and Mearns, 1991). The study area is described by a network of grid cells covering a large portion of the central Great Plains of the US, replicating much of the area contained in the recent MINK study (Rosenberg et al., 1993).
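The cell dimensions quoted above can be checked with a quick spherical-Earth approximation. The sketch below is only an illustration: the constant `KM_PER_DEGREE`, the central latitude of 40.7°N and the function `cell_size_km` are assumptions introduced here, and the finest cell is treated as exactly one sixth of a 2.8° side (about 0.47°) rather than the rounded 0.5°.

```python
import math

KM_PER_DEGREE = 111.32  # assumed length of one degree of arc on a spherical Earth


def cell_size_km(gcm_cell_deg, cells_per_side, center_lat_deg):
    """Return (east-west km, north-south km) of one sub-cell of a GCM cell."""
    side_deg = gcm_cell_deg / cells_per_side
    north_south = side_deg * KM_PER_DEGREE
    # Meridians converge poleward, so east-west distance shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(center_lat_deg))
    return east_west, north_south


CENTER_LAT = 40.7  # assumed central latitude of the study area (deg N)

for per_side in (1, 3, 6):
    ew, ns = cell_size_km(2.8, per_side, CENTER_LAT)
    print(f"{per_side * per_side:2d} cell(s) per GCM cell: {ew:.0f} km x {ns:.0f} km")
```

With these assumptions the one-thirty-sixth cell comes out close to the 40 km × 50 km spacing given in the text; a different central latitude would change the east-west figure.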
2. Relevant literature

LeDuc and Holt (1987) point out that scant attention has been given to the problem of scaling mechanistic model simulations of crop production across areas of contrasting soils and climate. Dugas et al.
4 This is the finest resolution possible with the spatial density of cooperative meteorological observing stations available in the study area.
(1983) examined the decay in the strength of the relationship between simulated yields at a single point and those at increasing distances from that point in the southern Great Plains, finding that the decay in agreement was nearly linear at distances of 0–150 km and logarithmic beyond those distances. However, they made no comparison of agreement between simulated and observed yields over those distances. Lal et al. (1993) coupled a geographic information system (GIS) to the DSSAT V2.1 crop models (International Benchmark Sites Network for Agrotechnology Transfer Project, 1989) in order to provide spatially detailed soils and meteorological information to the crop models for the estimation of regional crop productivity. They also used the GIS to search for areas with environmental characteristics similar to those that were modeled, as a means of broadening site-specific model applications. Haskett et al. (1995) used 21 years (1971–1991) of climate data aggregated to the state climate division level in Iowa to compare soybean yields simulated with the GLYCIM model with observed yields at the county and state levels. They found that GLYCIM accurately simulated observed soybean yields at both the county and state levels over those 21 years. They did not, however, attempt to evaluate the effect of more or less aggregation of climate data on the performance of GLYCIM.

Mechanistic ecological models, including crop models, consist of nonlinear variable–process relationships (LeDuc and Holt, 1987; Rastetter et al., 1992). Simple areal averaging of variable values or of model estimates as a means of aggregating site-specific simulations is likely to introduce aggregation error of a magnitude that is proportional to the degree of nonlinearity (e.g., concavity) of the functional form of the variable–process relationships. This problem is known as the 'fallacy of averages'.
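The dependence of this aggregation error on nonlinearity can be made concrete with a second-order Taylor expansion. The sketch below is a standard textbook argument rather than a derivation given in the paper; the symbols are notation introduced here: $f$ is any smooth model response (a stand-in for a crop model), $x_i$ is the input at site $i$ of $N$ sites, $\bar{x}$ is the areal mean of the inputs and $\sigma^2$ their variance.

```latex
\begin{align}
\bar{y} &= \frac{1}{N}\sum_{i=1}^{N} f(x_i)
  \approx f(\bar{x}) + \tfrac{1}{2}\, f''(\bar{x})\,\sigma^{2},
  \qquad
  \sigma^{2} = \frac{1}{N}\sum_{i=1}^{N} (x_i - \bar{x})^{2}, \\
f(\bar{x}) - \bar{y} &\approx -\tfrac{1}{2}\, f''(\bar{x})\,\sigma^{2}.
\end{align}
```

The first-order term vanishes because the deviations $x_i - \bar{x}$ sum to zero. For a concave response ($f'' < 0$), running the model once on averaged inputs therefore overestimates the average of the site-level outputs, and the error grows with both the curvature and the variance of the inputs within the averaging area.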
A number of statistical approaches for dealing with problems of nonlinearity in model aggregation are suggested by Rastetter et al. (1992). One such approach is known as 'partitioning'. In this research, crop growth factors (soils, climate, management) are partitioned into progressively smaller sized land units, beginning at the GCM grid cell size, such that environmental and management variability within units is minimized. White and Running (1994) argue that this approach allows the approximation of a
linear variable–process relationship. Scale analysis leads to the selection of spatial resolution that reduces areal variance and reduces aggregation errors associated with nonlinearities in the crop model functions.
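The effect of partitioning on aggregation error can be illustrated with a small numerical sketch. Everything in it is hypothetical: `hypothetical_yield` is a made-up concave response and not the EPIC model, the 36 sites lie along an invented one-dimensional precipitation gradient, and the units are contiguous and of equal size. It is meant only to show the mechanism, not to reproduce any result from this study.

```python
import math


def hypothetical_yield(precip_mm):
    """Concave, saturating yield response (t/ha); a stand-in, NOT the EPIC model."""
    return 10.0 * (1.0 - math.exp(-precip_mm / 300.0))


def mean(values):
    return sum(values) / len(values)


def partitioned_estimate(site_precip, n_units):
    """Average inputs within each of n_units contiguous, equal-sized units,
    run the model once per unit, then average the unit yields."""
    unit_size = len(site_precip) // n_units
    unit_yields = []
    for start in range(0, len(site_precip), unit_size):
        unit_mean = mean(site_precip[start:start + unit_size])
        unit_yields.append(hypothetical_yield(unit_mean))
    return mean(unit_yields)


# Hypothetical west-to-east seasonal precipitation gradient across 36 sites (mm).
site_precip = [200.0 + 15.0 * i for i in range(36)]

# Reference value: run the model at every site, then average the outputs.
reference = mean([hypothetical_yield(p) for p in site_precip])

errors = {n: partitioned_estimate(site_precip, n) - reference for n in (1, 9, 36)}
for n, err in errors.items():
    print(f"{n:2d} unit(s): aggregation error = {err:.4f} t/ha")
```

Because each finer partition here is a refinement of the coarser one and the response is concave, the error shrinks as within-unit variability is reduced and disappears when every unit holds a single site.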
3. Materials and methods

In this section, the region of study, the crop model used in the analysis, the allocation of modeling data to a climate model grid network and the procedures used to compare observed and simulated yields over a continuum of spatial scales within the grid network are described.

The area chosen for this study encompasses a majority of the Missouri–Iowa–Nebraska–Kansas (MINK) region. This choice allowed the crop model data sets created for the MINK study (Easterling et al., 1992) to be used in this study. The area studied is 91.3°W–102.6°W longitude by 37.9°N–43.5°N latitude. It is overlain by eight GCM grid cells that are 2.8° × 2.8° (236 km × 312 km) and includes the majority of the MINK region (Fig. 1). Three different crops were modeled for the period of 1984–1992 using the Erosion Productivity Impact Calculator (EPIC), a crop growth model described below. The crops modeled were dryland meadow hay (in the two northwest GCM grid cells), dryland wheat (in the two southwest GCM grid cells) and dryland maize (in the four eastern GCM grid cells). This distribution mimics the predominant crops currently grown in those grid cells.

3.1. The EPIC model

EPIC is a crop simulation model developed to estimate the relationship between soil erosion and crop productivity (Williams et al., 1984). In addition to routines for soil erosion and plant growth, the model includes components for weather simulation, hydrology, nutrient cycling, tillage and crop management. EPIC operates on a daily time step. Among factors simulated by EPIC are evapotranspiration (based on the Penman–Monteith model), soil temperature, crop potential growth, growth constraints (water stress, stress due to high or low temperature, nitrogen and phosphorus
Fig. 1. GCM grid and nested mesogrid network for the study area, including locations of cooperative weather stations and those stations also used for generating solar radiation, relative humidity and windspeed data stochastically.
stress) and yield. EPIC uses a single model for simulating all crops, although each crop has unique values for the model parameters. The crop growth model uses radiation-use efficiency in calculating photosynthetic production of biomass. The potential biomass is adjusted daily for stress from the following factors: water, temperature, nutrients (nitrogen and phosphorus), aeration and radiation. Atmospheric CO2 concentrations influence photosynthesis through the radiation-use efficiency term. They influence water use efficiency through the stomatal conductance term in the Penman–Monteith model. For this study, CO2 concentrations are held at current ambient levels. Crop yields are estimated by multiplying the above-ground biomass at maturity (determined by accumulation of heat units or specified harvest date) by a harvest index (economic yield divided by above-ground biomass) for the particular crop.

EPIC requires data on soil physical properties (for example, bulk density, water-holding capacity, wilting point) and management (for example, fertilization, tillage, planting, harvesting, irrigation). The weather variables necessary for driving the EPIC model are daily values of precipitation, minimum/maximum air temperature, solar radiation, windspeed and relative humidity. A detailed discussion of EPIC input data attributes unique to this study is given in Section 3.2.

EPIC has been subjected to numerous validation exercises. Extensive tests of EPIC simulations were conducted at over 150 sites and on more than 10
crop species, and those tests generally concluded that EPIC adequately simulated crop yields (Kiniry et al., 1990; Rosenberg et al., 1992). Easterling et al. (1996) specifically validated the EPIC yield predictions during recent episodes of climate extremes that emulate expected attributes of climate warming in the Great Plains.

3.2. EPIC input data for this study

3.2.1. Soil data

The State Soil Geographic (STATSGO) database (USDA, 1994) was used with a geographic information system to determine the dominant soil (defined as the most extensive soil) for each of the different grid scales. The STATSGO soil association maps are compiled from generalized soil survey maps at 1:250,000 scale. The STATSGO data are divided into map units with a minimum resolution of 625 ha, which can contain up to 21 soils. The percentage of each soil is known within each map unit, but its distribution within each map unit is not known. Therefore, the dominant soil is only an estimate, since a map unit is not necessarily totally contained within a grid cell. STATSGO also lists a prime farmland classification for each soil. This classification rates farmland from 0 to 9 based on soil and climatic properties and was used to determine the dominant soil type depending on what crop was being modeled in an area. Only soils with the rating of prime farmland were considered for maize and wheat; soils from all ratings of farmland were considered for hay.

3.2.2. Observed and simulated climate data

For this study, observed daily values of minimum/maximum temperature and precipitation from cooperative weather stations were used. The other climate variables required by EPIC (solar radiation, windspeed, relative humidity) were not available as daily values, nor were they as widely available as temperature and precipitation across all cooperative stations; hence, they were simulated. EPIC provides options for simulating various combinations of the above weather variables with a stochastic weather generator.
Solar radiation is simulated from monthly means of daily solar radiation and is adjusted for days with precipitation (Richardson, 1981). Windspeed is simulated using a model that considers monthly averages of average daily speed and direction. Relative humidity is simulated from monthly averages and is adjusted to account for days with precipitation (Williams et al., 1990). A nearest neighbor criterion was used to assign simulated daily values of solar radiation, windspeed and relative humidity to each of the cooperative stations (i.e., data on those values from the closest station with data available were assigned to cooperative stations missing those values).

The observed climate variables were obtained from cooperative weather stations that measured minimum/maximum temperature and precipitation during the 1983–1992 period. The 10-year period is too short to draw reliable statistical inferences from the EPIC simulations. Hence, for each climate station, a synthetic 100-year record of daily climate values was created by repeating the 1983–1992 daily climate values nine times and then concatenating the nine sets plus the original values; this repetition is consistent with previous model applications (Easterling et al., 1996). Lengthening the climate record synthetically minimizes yield distortions by EPIC introduced by initial values of cumulative controls of crop growth (e.g., water balance, soil nutrient levels). In a subsequent step, the ten 1983 simulated yield values were removed to minimize the serial discontinuity caused by having the 1983 climate occur immediately after the 1992 climate ten times. The remaining nine simulated yield values for each year (1984–1992) were averaged to obtain nine annual yield estimates.

Determining the allocation of climate data for each GCM grid cell, or finer-resolution mesogrid cell, was more complex than it was for soils. Four different methods were considered: (1) a straight average of the cooperative stations within a cell; (2) an inverse distance method using the five closest stations to the center of the cell (Hubbard, 1994); (3) averaging EPIC yield estimates over all cooperative stations used to obtain those estimates in a grid cell; and (4) using a single station as representative of the entire grid cell. An ANOVA was performed using absolute error values (AE, calculated as the absolute difference between EPIC-simulated and observed yields) to test for significant differences among the four
methods. Mesogrid cells at the smallest resolution (0.5° × 0.5°) within the two northeastern (maize) and the two southwestern (wheat) GCM grid cells (Fig. 1) were randomly selected to eliminate the possibility of spatial correlation. Results show that, for the northeast grid cells, some methods were significantly different from each other (α ≤ 0.05), but the straight averaging method and the inverse distance method were not significantly different (α > 0.05) from each other, and those two methods had the smallest mean absolute error. In the southwest grid cells, none of the methods were significantly different. Thus, for simplicity, we chose the straight averaging technique, since the added difficulty of the inverse distance technique gave little improvement in the agreement between simulated and observed yields. The observed climate was the average of the weather stations contained within a grid cell. The areal density of stations across the study area ranged from approximately one station per 400 km² in the eastern mesogrid cells to one station per 2000 km² in the western mesogrid cells. For three of the smallest mesogrid cells (0.5° × 0.5°) in the northwest part of the study area, station coverage was inadequate. In those three cells, a 'nearest neighbor' criterion was used to allocate climate data from nearby locations.

3.2.3. Observed yield data

Observed yields were tabulated for all GCM and nested mesogrid cells using data from the National Agricultural Statistics Service (NASS) county yield estimates (USDA, 1984–1992). An average yield for each grid cell was computed with a geographic information system using the counties in each grid cell, where each county's yield was weighted by the proportion of grid cell area encompassed by that county. Meadow (wild) hay for forage is the dominant crop grown in the northwestern GCM grid cells. The meadow hay yields were from the NASS statistics category of 'other hay' (all hay except alfalfa). In the years that the agricultural census was taken during the 1984–1992 period, the majority of the other hay acreage was in wild hay, so this category was used to estimate the yields of dryland meadow hay. Hay yields for South Dakota, which are not reported in non-census years, were estimated from linear regression equations relating hay yields from census years between adjacent counties in South Dakota and Nebraska.

3.3. Preparation of EPIC representative farms

Representative farms developed for the MINK project (described by Easterling et al., 1992) were used in this research. Detailed profiles of relevant production characteristics (cultural practices, planting and harvesting dates, heat units required for maturity) were compiled for a selection of 'typical' farm enterprises across the MINK region. The range of values for those production characteristics is shown in Table 1. This range was used to prepare EPIC to simulate the diversity of production systems across the MINK region.
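The area weighting used for the observed grid-cell yields in Section 3.2.3 can be sketched as follows. This is a minimal illustration under stated assumptions: the function `area_weighted_yield`, the county fractions and the yields are hypothetical, and dividing by the summed weights (so that partial county coverage of a cell is handled) is a choice made here rather than a detail reported in the paper.

```python
def area_weighted_yield(counties):
    """Area-weighted mean yield for one grid cell.

    counties: list of (fraction_of_cell_area, county_yield_t_ha) pairs.
    """
    total_weight = sum(fraction for fraction, _ in counties)
    if total_weight <= 0:
        raise ValueError("grid cell has no county coverage")
    weighted_sum = sum(fraction * county_yield for fraction, county_yield in counties)
    return weighted_sum / total_weight


# Hypothetical grid cell overlapped by three counties.
cell_counties = [(0.5, 8.0), (0.3, 6.0), (0.2, 10.0)]
cell_yield = area_weighted_yield(cell_counties)
print(f"grid-cell yield: {cell_yield:.2f} t/ha")
```

A county covering half of the cell contributes half of the weight, so a small county with an unusual yield has little influence on the cell average.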
Table 1
Crop management practices used in the EPIC model to help differentiate representative farms of the study area

Production characteristic | Maize | Wheat | Meadow hay
Nitrogen | Automatic, <200 (kg/ha) total, –100 (kg/ha) at planting with the remaining applied at nitrogen stress level of 0.85 | 40 (kg/ha) applied before planting | 30 (kg/ha) applied in spring
Planting dates (allocated by MLRA) | April 25–May 10 | Sept. 1–Sept. 20 | April 1
Heat units to maturity (°C) | 1500–1800 (allocated by MLRA) | 1400–1560 (allocated by MLRA) | 1875–2200 (allocated from normal heat unit accumulation)
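The heat-unit row of Table 1 ties into the yield calculation described in Section 3.1, where maturity is reached by accumulating heat units and yield is above-ground biomass times a harvest index. The sketch below uses a common growing-degree-day form for daily heat units, which is an assumption here because the paper does not give EPIC's exact formulation; the base temperature, the constant daily temperatures, the biomass and the harvest index are all hypothetical values chosen for illustration.

```python
def heat_units(t_max_c, t_min_c, t_base_c):
    """Daily heat units: mean temperature above the base temperature, never negative."""
    return max(0.0, (t_max_c + t_min_c) / 2.0 - t_base_c)


def days_to_maturity(daily_max_min, t_base_c, heat_units_to_maturity):
    """Return the first day on which accumulated heat units reach the requirement,
    or None if the record ends before maturity."""
    total = 0.0
    for day, (t_max, t_min) in enumerate(daily_max_min, start=1):
        total += heat_units(t_max, t_min, t_base_c)
        if total >= heat_units_to_maturity:
            return day
    return None


def crop_yield(above_ground_biomass, harvest_index):
    """Economic yield as biomass at maturity times the harvest index."""
    return above_ground_biomass * harvest_index


# Hypothetical season: constant 30/16 deg C days, assumed base temperature of 8 deg C.
season = [(30.0, 16.0)] * 150
maturity_day = days_to_maturity(season, t_base_c=8.0, heat_units_to_maturity=1500.0)
print("maturity reached on day", maturity_day)
print("yield (t/ha):", crop_yield(above_ground_biomass=18.0, harvest_index=0.5))
```

A farm assigned a larger heat-unit requirement, as in the upper end of the ranges in Table 1, would reach maturity later under the same weather.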
3.4. Allocation of data to a grid network

Tests of the effects of progressive disaggregation of climate and soils information on agreement between simulated and observed crop yields require the allocation of crop model data sets (climate and soils) to a spatial hierarchy of climate model grid cells, beginning at the coarsest or GCM resolution (2.8° × 2.8°) and progressing to finer resolutions. Hence, each of the eight GCM grid cells was subdivided into a network of 9 (0.9° × 0.9° or 79 km × 104 km) and then 36 (0.5° × 0.5° or 39 km × 52 km) equal sub-GCM grid cells that we call a 'mesogrid' network. In addition to the 9 and 36 mesogrid divisions, the extreme northeastern GCM grid cell (eastern Iowa) and extreme southwestern GCM grid cell (western Kansas) were further subdivided into 4 (1.4° × 1.4° or 118 km × 156 km), 16 (0.7° × 0.7° or 59 km × 78 km) and 25 (0.6° × 0.6° or 47 km × 62 km) equal mesogrids, enabling more detailed scale comparisons than in the other cells.

3.5. Procedures

EPIC simulations were performed for various levels of disaggregation of soils and climate data (over the hierarchy of grid cells discussed in Section 3.4) in order to examine scale effects on agreement between modeled and observed yields. The disaggregation combinations tested are summarized in Table 2. At each level of disaggregation, the results were averaged over the mesogrids for ease of comparison with the coarsest level of disaggregation. Particular attention was given to sorting out the importance of soils data resolution vis-à-vis climate data resolution. To this end, an additional 36 simulations were done at the 0.5° × 0.5° mesogrid cell level for the northeastern GCM grid cell, where a soil with highly contrasting soil water capacity (almost twice the available water content) to the dominant soil in each mesogrid cell was examined pairwise with the dominant soil.

Table 2
Pairwise combinations of soils and climate data aggregations simulated with EPIC in this study

Climate aggregation (cell size) | Soil aggregation (cell size): 2.8° | 1.4° | 0.9° | 0.7° | 0.6° | 0.5°
2.8° (ECMWF) | Xa | - | - | - | - | Xa
1.4° | Xb | Xb | - | - | - | -
0.9° | Xa | - | Xa | - | - | -
0.7° | Xb | - | - | Xb | - | -
0.6° | Xb | - | - | - | Xb | -
0.5° | Xa | - | - | - | - | Xa

a Pairwise combinations performed for all GCM grid cells.
b Pairwise combinations performed for northeastern and southwestern GCM grid cells only.

Linear regression was used to describe the covariance between observed and modeled yields at the three main levels of spatial aggregation of data (from the GCM level to the finest mesogrid level). Segmented regression (SAS/STAT, 1989) was used to model MAEs (mean absolute errors) vs. the range of mesogrid cell sizes in two illustrative GCM grid cells: the northeastern GCM grid cell (eastern Iowa maize) and the southwestern GCM grid cell (western Kansas wheat). A segmented regression consists of two regression equations that are constrained so that the resulting curve is smooth and continuous, and that are selected by minimizing the sum of squares of the equations. The two equations intersect at the joint (inflection point) x0. We argue that the inflection point x0 is the grid cell resolution beyond which further disaggregation of soils or climate information yields diminishing marginal improvement in agreement between observed and modeled yields.
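A segmented regression of the kind described above can be sketched in a few lines. The example assumes one particular functional form, a quadratic segment that joins a flat plateau smoothly at x0, and fits it by profiling over candidate join points with ordinary least squares at each candidate. This is an illustrative stand-in, not the SAS/STAT procedure used in the study; the function names, the candidate grid and the synthetic MAE values (generated from a known curve with its join at 20 cells) are all hypothetical.

```python
def quadratic_plateau(x, x0, plateau, curvature):
    """y = plateau + curvature * (x - x0)**2 for x < x0, and y = plateau otherwise.
    Value and slope both match at x0, so the curve is continuous and smooth."""
    return plateau + curvature * (x - x0) ** 2 if x < x0 else plateau


def fit_quadratic_plateau(xs, ys, candidates):
    """For each candidate join point x0 the model is linear in (plateau, curvature),
    so fit those by least squares and keep the x0 with the smallest sum of squares."""
    best = None
    n = len(xs)
    for x0 in candidates:
        z = [(x - x0) ** 2 if x < x0 else 0.0 for x in xs]
        z_mean = sum(z) / n
        y_mean = sum(ys) / n
        szz = sum((zi - z_mean) ** 2 for zi in z)
        if szz == 0.0:
            continue  # every point lies on the plateau; curvature not identifiable
        curvature = sum((zi - z_mean) * (yi - y_mean) for zi, yi in zip(z, ys)) / szz
        plateau = y_mean - curvature * z_mean
        sse = sum((yi - (plateau + curvature * zi)) ** 2 for zi, yi in zip(z, ys))
        if best is None or sse < best[0]:
            best = (sse, x0, plateau, curvature)
    return best


# Hypothetical x values: number of mesogrid cells per GCM grid cell.
xs = [1, 4, 9, 16, 25, 36]
# Synthetic MAE values generated from a known curve (join at 20 cells); not study data.
ys = [quadratic_plateau(x, 20, 0.45, 0.001) for x in xs]

best_fit = fit_quadratic_plateau(xs, ys, candidates=range(10, 31))
sse, x0, plateau, curvature = best_fit
print(f"estimated join point x0 = {x0}, plateau MAE = {plateau:.3f}")
```

Profiling over x0 keeps the fit to closed-form least squares; a nonlinear optimizer that estimates x0 continuously would be the more usual choice with real, noisy data.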
4. Results

The large number of combinations of soils and climate data over the different levels of spatial aggregation required a simple scheme to identify unique combinations in tables and figures. Each unique combination of soils and climate data is identified by SiCi, where S is soils data and C is climate data disaggregated to the i level (i = 1: GCM 2.8° × 2.8°; i = 9: 0.9° × 0.9°; i = 36: 0.5° × 0.5°).

4.1. Linear regression results
Observed yields were regressed against simulated yields for three contrasting GCM grid cells (maize in the eastern Iowa cell and wheat in the western
Fig. 2. Linear regressions of observed vs. EPIC-simulated maize yields in eastern Iowa, with progressively disaggregated climate and soils data.
Kansas cell, and meadow hay in the northwestern Nebraska cell). At the GCM level (S1C1), observed yields explained only 43% of the variation in simulated crop yields over the 1984–1992 period in the eastern Iowa cell (Fig. 2). For the case where soils and climate data were simultaneously disaggregated, a large improvement in goodness of fit (r²) between observed and simulated yields occurred in the step from S1C1 (r² = 0.427) to S9C9 (r² = 0.658). Further disaggregation of soils and climate data to S36C36 added no improvement in goodness of fit beyond that of S9C9 (Fig. 2). When only one soil classification (S1) was used with all levels of climate data (C1, C9, C36), the results were virtually identical to the above (Fig. 3). These results suggest that the disaggregation of climate, not soils, was the only factor that improved agreement between observed and simulated yields.

Fig. 3. Linear regressions of observed vs. EPIC-simulated maize yields in eastern Iowa, with progressively disaggregated climate data only.

Fig. 4. Linear regressions of observed vs. EPIC-simulated maize yields in eastern Iowa to examine soil sensitivity, contrasting soils (Seaton vs. Nicollet) aggregated at the GCM level and climate disaggregated to one thirty-sixth (C36) of the GCM level.

An additional test was performed to clarify the importance of disaggregation of soils data vis-à-vis disaggregation of climate data. In this test, a soil (Seaton soil) with much higher water-holding capacity than the S1 soil (Nicollet soil) used in the foregoing analysis was compared with the S1 soil to see whether it made any difference in goodness of fit. Climate data were disaggregated to C36 for both soils. Virtually no difference between the two contrasting soils was evident in the goodness of fit (Fig. 4). This result reinforces the conclusion that, in the range of GCM and mesogrid scales examined here, agreement between observed maize yields and maize yields simulated with the EPIC model is not sensitive to increased spatial resolution of soils data, but is highly sensitive to increased spatial resolution of climate data.

The trends identified above in the eastern Iowa maize regression held for winter wheat in western Kansas, although the relationships between observed and simulated yields were weaker at all levels of climate and soils data aggregation. Regression results for meadow hay in northwestern Nebraska were not significant and are not reported further in this analysis.

It is noteworthy that the regression equations usually underestimate crop yields, especially when climate data were disaggregated below the C1 level. This is evident in Figs. 2 and 3, where the best-fit lines fall below the 1:1 line for all equations using disaggregated climate data. Because of this underestimation or bias, the mean absolute errors were not
Table 3
Mean absolute error (MAE) of normalized yields (t/ha) for all crops at various levels of soils and climate data aggregation

Level of soil and climate data aggregation | Dryland maize | Dryland wheat | Dryland meadow hay
S1C1 | 0.686 | 0.290 | 0.190
S1C9 | 0.468 | 0.218 | 0.180
S1C36 | 0.500 | 0.209 | 0.182
S9C9 | 0.452 | 0.201 | 0.184
S36C36 | 0.480 | 0.204 | 0.184
S36C1 | 0.679 | 0.301 | 0.174
always consistent with the linear regression results. The cause of the bias is only speculated here to be the absence of important explanatory variables. The underestimation bias tended to inflate mean absolute errors (MAE), especially when disaggregated climate data were used, even when the regression equation had a strong fit. For example, a model using disaggregated C36 climate data could have an MAE nearly equal to the MAE for a model using C1 climate data, even though the linear regressions showed that, with C36 data, the slope of the regression line approximated the 1:1 line (indicative of a strong fit), whereas with C1 data the slope was nearly horizontal (indicative of a weak fit). The point here is that the bias obscures the true effect of soil and climate aggregation on MAEs and, hence, makes analysis difficult.

The underestimation bias was eliminated by normalizing the yield data to a mean of 0 (the average yield over the 1984–1992 period was subtracted from the yield for each individual year) for both the observed and simulated yields. The variance of the yields was left unchanged. Absolute errors were calculated using the normalized yields. The means of the absolute errors are listed in Table 3. Analysis of variance was performed to determine the statistical significance of differences in normalized absolute errors among the SiCi levels of disaggregation. The test was separated into two parts,⁵ one
5 Meadow hay results were not significant and are not reported here.
Table 4
Probabilities (α) that the absolute errors from normalized yield are equal for all dryland maize at various levels of soils and climate data aggregation

Level of soil and climate disaggregation | S1C1 | S1C9 | S1C36 | S9C9 | S36C36 | S36C1
S1C1 | - | 0.0015 | 0.0066 | 0.0007 | 0.0027 | 0.9061
S1C9 | 0.0015 | - | 0.6345 | 0.8072 | 0.8570 | 0.0022
S1C36 | 0.0066 | 0.6345 | - | 0.4721 | 0.7678 | 0.0093
S9C9 | 0.0007 | 0.8072 | 0.4721 | - | 0.6714 | 0.0010
S36C36 | 0.0027 | 0.8570 | 0.7678 | 0.6714 | - | 0.0039
S36C1 | 0.9061 | 0.0022 | 0.0093 | 0.0010 | 0.0039 | -
for maize and one for wheat, since the normalized absolute errors were of different magnitudes for each crop. The significance levels (probabilities, α) for each pairwise comparison of SiCi for maize and wheat are shown in Tables 4 and 5. The results of the above test show that the normalized absolute errors, for maize and wheat, are significantly different (α ≤ 0.05) between C1 and C9 and between C1 and C36, but not between C9 and C36 (Tables 4 and 5). This supports the conclusion that disaggregation of climate data from a resolution of C1: 2.8° × 2.8° (240 km × 310 km) to a resolution of C9: 0.9° × 0.9° (79 km × 104 km) results in a significant improvement in agreement between simulated and observed maize and wheat yields. Similar improvement is gained by disaggregating climate data from a resolution of 2.8° × 2.8° to a resolution of C36: 0.5° × 0.5° (39 km × 52
Table 5
Probabilities (α) that the absolute errors from normalized yield are equal for all dryland wheat at various levels of soils and climate data aggregation
Rows and columns: level of soil and climate disaggregation
          S1C1     S1C9     S1C36    S9C9     S36C36   S36C1
S1C1      -        0.0068   0.0027   0.0010   0.0015   0.6843
S1C9      0.0068   -        0.7555   0.5214   0.6069   0.0020
S1C36     0.0027   0.7555   -        0.7411   0.8388   0.0008
S9C9      0.0010   0.5214   0.7411   -        0.8988   0.0002
S36C36    0.0015   0.6069   0.8388   0.8988   -        0.0004
S36C1     0.6843   0.0020   0.0008   0.0002   0.0004   -
km). Hence, the main improvement in agreement between simulated and observed yields resulted from the initial disaggregation of climate data from C1 to C9, with no significant improvements with further disaggregation of climate data. Disaggregation of soils data gave no significant improvements in agreement beyond what was achieved with disaggregation of climate data for maize and wheat (Tables 4 and 5).
4.2. Segmented regression results
Segmented regression analysis allowed the examination of the yields across a more detailed and extensive range of data aggregation scales than the previous analyses. It permitted the pinpointing of the scale at which further disaggregation of climate and/or soils data had little or no effect on the agreement between observed and modeled yields. For maize in the eastern Iowa GCM grid cell and nested mesogrids, the segment of the distribution x < x0 was quadratic whether soils were disaggregated or not (Figs. 5 and 6). In the case where soils data are not disaggregated but climate data are, the inflection point x0 was computed to be at resolution S1C19 (54 km × 72 km) (Fig. 5). The slope of the segment of the distribution x > x0 was zero; hence, there was no improvement in agreement between simulated and observed maize yields with further disaggregation of climate data. When the soils data in each mesogrid were also disaggregated with the climate data, the inflection point x0 was computed to be at resolution S15C15 (60 km × 80 km) (Fig. 6). The slope of the distribution x > x0 was zero. These results suggest that as the spatial resolution of climate and soils data increases in the eastern Great Plains, so does the agreement between observed and EPIC-simulated maize yields, up to a resolution in the range of 54–60 km × 72–80 km, depending on whether or not soils data are disaggregated along with climate data. No improvement in agreement between simulated and observed yields with further disaggregation of soils or climate data beyond x0 was realized.
Segmented regression was not possible for wheat in the western Kansas GCM grid cell and nested mesogrids. The distribution of MAEs was approximately linear with grid cell size and the slope of the distribution was horizontal, regardless of how soils data were treated.
Fig. 6. Segmented regression of mean absolute error between observed and EPIC-simulated maize yield in eastern Iowa vs. spatial divisions (1–36) of the GCM grid cell, with progressively disaggregated climate and soils data.
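The quadratic-plus-plateau form described in Section 4.2 can be illustrated with a minimal sketch. It assumes a model that is quadratic for x < x0 with its vertex at x0 and flat (slope zero) for x ≥ x0, and it locates x0 by a simple grid search with ordinary least squares. The function name `fit_quadratic_plateau`, the candidate grid, and the synthetic MAE values are all hypothetical; this is not the procedure or the data used in the study.

```python
import numpy as np

def fit_quadratic_plateau(x, y, candidates):
    """Grid-search the join point x0 of a quadratic-plateau model.

    Assumed model (for illustration only):
        y = p + c * (min(x, x0) - x0)**2
    i.e. a quadratic for x < x0 whose vertex sits at x0, and a flat
    segment with value p (slope zero) for x >= x0.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    best = None
    for x0 in candidates:
        # For a fixed x0 the model is linear in (p, c), so ordinary
        # least squares gives the best plateau and curvature.
        basis = (np.minimum(x, x0) - x0) ** 2
        design = np.column_stack([np.ones_like(x), basis])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        sse = float(np.sum((design @ coef - y) ** 2))
        if best is None or sse < best["sse"]:
            best = {"x0": float(x0), "plateau": float(coef[0]),
                    "curvature": float(coef[1]), "sse": sse}
    return best

# Synthetic, noise-free MAE values (NOT the paper's data): error falls
# quadratically until 15 divisions of the grid cell, then stays flat.
divisions = np.arange(1, 37)
mae = 0.5 + 0.004 * (np.minimum(divisions, 15) - 15.0) ** 2

fit = fit_quadratic_plateau(divisions, mae, candidates=np.arange(2, 36))
print(fit["x0"], fit["plateau"], fit["curvature"])
```

With real, noisy MAE values the grid search would return the candidate join point with the smallest residual sum of squares; a nonlinear least-squares routine could refine x0 between grid points.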
5. Discussion
Fig. 5. Segmented regression of mean absolute error between observed and EPIC-simulated maize yield in eastern Iowa vs. spatial divisions (1–36) of the GCM grid cell, with progressively disaggregated climate data only.
The results from the linear regression analyses for maize in eastern Iowa and wheat in western Kansas indicate that, when climate data were aggregated (using simple spatial averaging) at the GCM scale of resolution (2.8° × 2.8°), covariance between EPIC-simulated crop yields and observed crop yields over the period 1984–1992 was weak. This weak covariance is likely to become weaker still as the resolution of climate data coarsens to the scales of the suite of GCMs most commonly used in climate change analysis (approximately 3°). This calls into question the reliability of past regional estimates of yield response to climate change using a relatively sparse spatial distribution of climate stations.
Disaggregation of climate and soils data from the GCM scale to progressively smaller scales of resolution resulted in stronger covariance between modeled and observed wheat and maize yields. The statistical significance of differences in normalized absolute errors between the GCM 2.8° × 2.8° resolution and the 0.9° × 0.9° resolution supports the regression finding that the greatest increase in goodness-of-fit between observed and modeled yields occurred in the disaggregation step from the GCM resolution to the 0.9° × 0.9° resolution. Agreement between EPIC-simulated and observed maize and wheat yields improved with this initial level of disaggregation of the climate data. No significant further improvement was gained by disaggregating climate data to the 0.5° × 0.5° resolution.
A partial explanation for the improvement in agreement between modeled and observed yields in the first step of disaggregation from the GCM scale possibly concerns the effect of coarse-scale averaging on precipitation frequency and intensity. In reaching an areal average based on daily precipitation values, the larger the number of stations included in the areal average, the more evenly distributed the precipitation will be, with fewer extreme rainfall events. Hence, at the GCM scale, precipitation occurs as frequent low-intensity events. Disaggregation of the climate data (i.e., using fewer observing stations in a smaller area) causes the distribution of precipitation events to approach the 'true' properties (i.e., properties of a single rain gauge), resulting in more accurate simulation of crop yields. It is also reasonable to suggest that, as with precipitation, averaging over space attenuates the daily variability of temperature (i.e., fewer temperature extremes).
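The effect of areal averaging on precipitation frequency and intensity described above can be illustrated with a small simulation. This is only a sketch under strong assumptions: the station records are synthetic, rain occurrence is taken as independent between stations (real stations are spatially correlated, which would weaken the effect), and the wet-day probability, mean amount, and wet-day threshold are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_stations = 3650, 25

# Hypothetical station records: each station is wet on 30% of days,
# independently of the others (an assumption), with exponentially
# distributed amounts averaging 10 mm on wet days.
wet = rng.random((n_days, n_stations)) < 0.3
amount = rng.exponential(scale=10.0, size=(n_days, n_stations))
daily = np.where(wet, amount, 0.0)

def summarize(series, wet_threshold=0.1):
    """Wet-day frequency, mean wet-day amount, and maximum daily amount."""
    wet_days = series > wet_threshold
    return {
        "wet_day_fraction": float(np.mean(wet_days)),
        "mean_wet_day_mm": float(series[wet_days].mean()),
        "max_mm": float(series.max()),
    }

single = summarize(daily[:, 0])        # one rain gauge
areal = summarize(daily.mean(axis=1))  # simple average over all stations

print("single station:", single)
print("areal average: ", areal)
```

In this synthetic setting the areal average rains on far more days than any single gauge, but with much smaller wet-day amounts and a much lower daily maximum, which is the 'frequent low-intensity events' pattern the text attributes to GCM-scale averaging.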
The point here is that there seems to be a trade-off between the regionality of crop simulations gained by aggregating the climate data and the loss of realistic statistical properties of the climate data, and that there is some level of climate data aggregation at which those properties are sufficiently preserved and beyond which the properties disappear, thus leading to loss of agreement between simulated and observed yields. It is useful to point out that some degree of averaging out of extremes (of either temperature or precipitation) is probably beneficial to the functioning of the crop models, since the models do not deal realistically with extreme daily events such as excessively high temperatures (e.g., above 40°C) and damaging precipitation events (e.g., hail and high winds). Furthermore, a number of factors are probably responsible for our inability to improve agreement between simulated and observed yields at finer than 1° resolution of climate data, including: (1) possible inaccuracy of observed county-level yields, which are meant to 'represent' areas approximately the size of our smallest meso-grid cell; and (2) the possibility that biological and human controls of yield variability (e.g., pest and pathogen outbreaks, differences in management practices) dominate climate controls at finer spatial scales.
Disaggregation of soils data had little effect on either the regression results or the analysis of the normalized absolute errors. Even when using a greatly contrasting soil to the one used to characterize a test mesogrid cell, there was little or no measurable effect on EPIC-simulated yields at the scales used in this research. This lack of a measurable soil effect with disaggregation may be partly attributed to the fact that yield data reflect all soils in a given area (usually a county); unless a large percentage of yields in a county are associated with a single soil type, adequate soil disaggregation is not occurring even at the finest level of resolution in this study. Moreover, this finding may be limited to regions of relatively simple terrain and ample precipitation as found, for the most part, across this study area.
This finding does, however, lend confidence to the current practice within surface–atmosphere interaction models (often the source of surface energy flux calculations needed in GCMs) of using only one soil to represent an area as large as 1° × 1°, especially in important agricultural plains regions.
Nothing could be concluded reliably from the analysis of wild meadow hay in the western ECMWF grid cells. The interannual variability of hay yields was so low that it was not possible to compute a best-fit equation that was statistically significant.
Segmented regression allowed a precise estimate to be made of the ideal scale of disaggregation of climate data, that is, the scale that minimizes statistical error of maize yields. An important finding of this exercise was that the optimum spatial resolution of climate data for simulating regional maize yields need not be the smallest areal unit possible (i.e., a single point). For eastern Iowa maize, the optimum resolution of climate data was about 60 km × 80 km, or just finer than 1° × 1° resolution. In order to generalize the findings of this research, the methodology used in this study needs to be applied to other regions. Other crop models, in addition to EPIC, should be used in future research in order to test the robustness of the above findings.
6. Conclusion
The results reported in Section 4 lead us to conclude that averaging climate data over individual stations across 1° regions in the Great Plains and then performing crop simulations is an effective scaling procedure. We argue that we have identified a level of spatial partitioning of climate data that preserves linear variable–process relationships within the EPIC model (i.e., linear relationships between spatially averaged climate and simulated yields). Thus, partitioning allows the averaging of climate data while minimizing aggregation error. As techniques of downscaling GCM outputs to smaller areas improve, the above conclusion warrants their use in order to achieve the spatial resolution of climate information necessary to estimate regional crop productivity reliably in response to climate change. Simulation of crop yield response to climate change using GCMs (as outlined in Section 1) needs to be compared with simulation of yield response to climate change using downscaled climate changes.
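The notion of aggregation error invoked above can be made concrete with a toy calculation: the error is the difference between running a model on spatially averaged inputs and averaging the model outputs obtained at each station. The sketch below is only an illustration; both response functions and the station rainfall totals are hypothetical and stand in for, but are not, the EPIC model or the study data.

```python
import numpy as np

def linear_response(rain_mm):
    """Hypothetical yield (t/ha) that responds linearly to rainfall."""
    return 2.0 + 0.01 * rain_mm

def saturating_response(rain_mm):
    """Hypothetical yield (t/ha) that levels off at high rainfall."""
    return 8.0 * (1.0 - np.exp(-rain_mm / 300.0))

def aggregation_error(response, inputs):
    """Yield from averaged inputs minus the average of per-station yields."""
    return float(response(inputs.mean()) - response(inputs).mean())

# Hypothetical growing-season rainfall (mm) at five stations in one cell.
station_rain = np.array([150.0, 300.0, 450.0, 600.0, 900.0])

err_linear = aggregation_error(linear_response, station_rain)
err_nonlinear = aggregation_error(saturating_response, station_rain)
print(err_linear, err_nonlinear)
```

For the linear response the two routes agree to within rounding, so averaging the inputs introduces no error; for the saturating response, averaging the inputs first overstates the mean yield. The narrower the range of inputs within a cell, the closer a nonlinear response is to linear over that range, which is consistent with the argument that partitioning the grid cell limits aggregation error.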
Acknowledgements
The authors thank Deanna Batty for help with word processing, and Dr. Ken Hubbard and Dr. Dennis Jelinski for their review. This research was supported in part by the National Science Foundation (NSF-DEB-9523612) and in part by the US Department of Energy's (DOE) National Institute for Global Environmental Change (NIGEC) through the NIGEC Great Plains Regional Center at the University of Nebraska-Lincoln (DOE Cooperative Agreement No. DE-FC03-90ER61010). Financial support does not constitute an endorsement by DOE of the views expressed in this article/report.