ELSEVIER
Calibration-Based Models for Correction of Area Estimates Derived from Coarse Resolution Land-Cover Data Aaron Moody* and Curtis E. Woodcock* Calibration-based area correction models provide improved estimates of cover-type proportions measured at coarse scales. Three separate regions (Plumas and Stanislaus National Forests and Lake Tahoe Basin) in the northern Sierra Nevada serve as test sites: the first for model calibration and the others for extrapolation and validation of the area correction methods. The use of one (or a few) large sites to calibrate models for extrapolative application over large areas represents a global test-site sampling strategy that most easily can be employed for the development of global land cover and land-cover change products. The inverse estimator, calibrated on scale-specific interclass transition matrices, produces the best overall results at the coarser scales for both validation sites. Good results probably occur because the conditional probabilities in the transition matrices implicitly carry information about the spatial organization of the landscape. This method performs poorly at finer scales, or in cases where the calibration and validation sites have dissimilar spatial organization, or class confusions. The slope estimator, calibrated on the coefficients of scale-specific proportion transition lines, produces improved estimates over uncorrected values at all scales. The slope estimator appears to generalize most successfully because it characterizes the basic tendency of small classes to diminish and large classes to increase in size as the landscape is represented at increasingly coarser scales. The positive results achieved using the inverse estimator and the slope estimator indicate potential for
*Department of Geography, University of North Carolina at Chapel Hill, North Carolina t Department of Geography and Center for Remote Sensing, Boston University, Boston, Massachusetts Address correspondence to Aaron Moody, Dept. of Geography, Univ. of North Carolina, Chapel Hill, NC 27599-3220. Received 5 August 1995; revised 2 January 1996. REMOTE SENS. ENVIRON. 58:225-241 (1996) ©Elsevier Science Inc., 1996 655 Avenue of the Americas, New York, NY 10010
using such a posteriori calibration methods to improve coarse resolution land-cover area estimates over large regions. © Elsevier Science Inc., 1996
INTRODUCTION
Many problems in earth system science and resource management require accurate data on the nature, extent, and location of land-surface cover. For example, models of net primary productivity, surface energy balance, hydrologic processes, biogeochemical cycling, and climate all incorporate some representation of land cover to drive submodel components. Likewise, management of ecological resources depends on reliable measurements of land surface characteristics, as well as natural and human-induced land-cover changes. These general fields of inquiry are increasingly focused at continental or global scales. At these scales, frequent assessment of land cover and land-cover change can be accomplished most effectively using synoptic, low spatial resolution, high temporal frequency remotely sensed data. The best current option for meeting these criteria are data from the Advanced Very High Resolution Radiometer (AVHRR) sensors on board the NOAA series of polar-orbiting meteorological satellites. However, both the thematic and proportional accuracy of land-cover data are influenced by the size of the instantaneous field of view (IFOV), or pixel size (Gervin et al., 1985; Townshend and Justice, 1988). This scaledependence of accuracy is related not only to the spatial resolution of the sensor, but to the interaction between the sensor resolution and the spatial characteristics of the phenomenon being mapped (Woodcock and Strahler, 1987). Of course, the spatial characteristics (and thus classification accuracy) are themselves related to the taxonomy used to classify the surface. These 0034-4257 / 96 / $15.00 PII S0034-4257(96)00036-3
226 Moody and Woodcock
accuracy issues have implications for understanding transformation processes, for managing threatened resources, and for deriving land-surface parameters for input to models of earth system processes. To achieve the desired accuracies for global datasets, it is necessary either to improve techniques for extracting land-cover information from coarse-scale remotely sensed data, or to develop methods for a posteriori correction of landcover area estimates. We evaluate two methods for improving coarseresolution estimates of land-cover proportions using calibration-based correction procedures. Scaling models developed over a calibration site are inverted and applied to validation locations. The models are evaluated with respect to their ability to improve coarse-scale estimates of land-cover proportions for the validation sites. The calibration site is the Plumas National Forest and the validation sites are the Stanislaus National Forest and the Lake Tahoe Basin. All three sites are located in the Sierra Nevada Mountains and are populated by the same basic cover types. The context for this study relates to the development of l-kin global land-cover data using measurements from the Moderate Resolution Imaging Spectroradiometer (MODIS) scheduled for launch on the EOS-AM (Earth Observing System) platform in 1998 (Strahler et al., 1995). The MODIS sensor is the primary EOS instrument for large-scale monitoring of terrestrial and oceanic systems. MODIS will have 36 narrow spectral bands in the range of 0.4-14/tm, with 250-m resolution in red and near-infrared (NIR) bands, 500-m resolution in the five visible and NIR bands, and 1000-m resolution in the remaining bands (Running et al., 1994). Algorithms are presently under development by the Land Group of the MODIS Science Team to produce land cover and land-cover change products on a seasonal basis at 1-km spatial resolution using these data. Calibration and validation are important elements of algorithm design, product evaluation, and product evolution for this project (Strahler et al., 1995). An important component of the calibration/validation strategy' for the MODIS land-cover algorithm is a global network of test sites for which high resolution satellite data and validated land-cover classifications will be available. The test sites are spatially distributed in an attempt to represent the range of global biomes (Strahler et al., 1995). However, due to the scale of the product and the cost associated with developing test areas, the sites are sparsely distributed in relation to the scales at which land cover might vary. This sampling approach represents a departure from the usual strategy for the validation of remote sensing classification results. Typically there is a manageable limit to the spatial extent of satellite-based classifications within which it is possible to identify and visit an appropriately distributed, representative sample of presum-
ably independent test sites. This is a statistical approach where the population of interest is defined, an appropriate sampling scheme is applied, and estimates of population parameters and associated confidence intervals are inferred from the sampled data. At the global scale there are considerable economic and logistic challenges to sampling a large nmnber of field sites within each regional biome. As such, calibration and validation sites are likely to be used in an extrapolative sense to assess and refine results over broad regions. This approach presumes a level of determinism in the relationships of interest, implying that the calibrated model can be extrapolated to the population, and precluding the development of confidence and prediction intervals. This approach is limited in terms of evaluating the quality of the model and model predictions with respect to statistical significance. In this research we assume that an intensive sampling strategy will not always be feasible at the global scale, and the models are implemented and evaluated in this deterministic or extrapolative framework. However, a notable example of the probabilistic sampling strategy for a large-scale mapping project is the Forest Resources Assessment 1990 Project (FRA90) conducted by the Food and Agricultural Organization (FAO) for the global assessment of tropical forest resources (FAO, 1995). In this study, 117 Landsat Multispectral Scanner scenes representing roughly 10% of the tropical belt, were selected using a stratified random sampling design. For each sampled site, data from 1980 and 1990 were acquired and used to estimate land cover area and land-cover change within the tropics. Although costly, this approach does provide a probabilistic sample from which statistical inferences can be drawn. The Plumas National Forest is one of the MODIS test sites and will serve for validation and calibration over an area that includes the northern and central Sierra Nevada range (Strahler et al., 1995). The Stanislaus National Forest and the Tahoe Basin are typical of the region which the Plumas will represent. It is important to reiterate that the Plumas is an external calibration site and, as such, site-wide coefficients from the Plumas provide calibration for the models used to correct site-wide proportion estimates for the validation sites. This is a departure in strategy from related research efforts discussed in the Background section below. BACKGROUND
Efforts to map continental or global land cover from remote sensing have typically used time-series data from the NOAA-AVHRR sensors at either 1.2 km 2 or roughly 18 km 2 resolution. Classifications are usually based on time series of maximum-value-composited NDVI (Normalized Difference Vegetation Index) data. Methods
Correction of Area Estimates from Coarse Resolution Data
include a) unsupervised clustering based on temporal signatures (Townshend et al., 1987; Loveland et al., 1991); b) clustering based on variables that are derived from temporal signatures (Lloyd, 1990); c) supervised classification in temporal space (Ehrlich and Lambin, 1996); or d) decision tree classification based on predetermined critical thresholds in NDVI and sometimes surface temperature values derived from AVHRR thermal bands using the split-window algorithm of Price (1984) (Running et al., 1994; Borak et al., 1995; Ehrlich and Lambin, 1995). As presently defined, the MODIS global land-cover product will use a supervised artificial neural network algorithm that is trained on timetrajectories of surface reflectances, surface temperature, spatial texture, directional information, and other measured and ancillary data (Strahler et al., 1995). A variety of factors can lead to errors in remote sensing classifications performed using low spatial resolution data. One source of error relates to the interaction between the spatial pattern, or the scales of variability in the landscape, and the spatial resolution at which the surface is being measured and represented (Woodcock and Strahler, 1987; Townshend and Justice, 1988). Mismatches between the scales of spatial variability and sampling scale can lead to over- or underestimation of land-cover areas (Moody and Woodcock, 1994; Mayaux and Lambin, 1995). For a given land-cover taxonomy, classes that dominate the landscape will be increasingly overrepresented as the landscape is sampled at progressively coarser resolutions. Conversely, small classes will be overwhelmed by the signal for the more dominant classes, and will diminish in size at coarser scales (Turner et al., 1989; Moody and Woodcock, 1994; Mayaux and Lambin, 1995). This general effect is modulated by other elements of landscape organization, specifically the level of spatial association of the classes and the interclass adjacencies in the landscape (Turner et al., 1989; Moody and Woodcock, 1995). In remote sensing classification, this scenario is further convolved with influences resulting from nonlinear spectral mixing, atmospheric effects, and sensor response characteristics. These latter effects relate to energy transfer in the remote sensing measurement stream from the surface, through the atmosphere, and through the sensing system. While these measurement considerations may affect land-cover area estimates, it is convenient for the purpose of isolating some of the confounding factors to assume that scale-dependent error results solely from the spatial effects outlined above. Under this scenario it should be possible to model analytically the scaledependent change in area estimates when the spatial properties of the landscape are known (Turner et al., 1989). An alternative approach for correcting proportion error is to develop empirical scaling relationships based on representative areas for which accurate high-resolution land-cover data are available. These scaling models
22 7
then can be inverted and applied to other locations of the same general type for which only coarse-scale data are available. This type of model is potentially feasible for operational application to global data sets and is the approach tested in this article. Several calibration-based models of this general nature have emerged in the remote sensing literature. These studies have frequently employed one of two related methods for area correction that are derived from classical regression and typically referred to as the classical estimator and the inverse estimator (Czaplewski and CaRs, 1992; Walsh and Burk, 1993; Kalkhan et al., 1995). Both methods use linear models in which the relationship between actual and observed class proportions is realized by a class transition matrix. The coefficients for the transition matrix are determined from ground or other calibration data that serve as a representation of the true surface. Investigations frequently focus on comparing the statistical properties of the classical and inverse methods when ground or airphoto data are used to calibrate area estimates from TM-based (Thematic Mapper) classifications (Czaplewski and Catts, 1992; Walsh and Burk, 1993; Kalkhan et al., 1995). The underlying premise is that misclassifications may bias areal estimates and that these estimates can be improved once the interclass confusions for a set of sample sites are known. There is some general consensus that the inverse estimator is the most robust method (Czaplewski and Catts, 1992; Walsh and Burk, 1993; Kalkhan et al., 1995), In its simplest form, and as typically presented, the inverse estimator is defined as tc = L* r ,
(1)
where tc is a k × 1 vector of true proportions for each class k, Pc is a k × k matrix of class transition probabilities, and rc is a k× 1 vector of area estimates from the classified map (Kalkhan et al., 1995). The inverse estimator is the basis for the first model tested in this analysis, although a different s~Tnbology is adopted and the context is slightly different (see Methods). A less common approach, referred to here as the slope estimator, is based on a linear regression model that relates coarse resolution estimates of class proportions to actual or reference proportions (Mayaux and Lambin, 1995; Roesch et al., 1995). This is a form of "classical" model in which the known but incorrect values are estimated based on unknown correct values (Brown, 1982). For example, if Pk0 represents the reference proportions for each class k in the test site and Pk~ represents the measured proportions at some coarse spatial resolution r also for the test site, then
ekr =fl0 + #1" ek0 q- e r r o r .
(2)
This model can be inverted to estimate Pk0 in cases where only Pkr is known. The model in Eq. (2) can
228
Mood~¢and Woodcock
characterize the fundamental scaling relationship where small classes tend to disappear, and large classes increase in size as the landscape is represented at progressively coarser scales. In this case, a single linear model is developed for each scale. However, separate linear models for each class k or nonlinear models might also be used to characterize these relationships (Moody and Woodcock, 1995). Mayaux and Lambin (1995) have experimented with regression-based approaches for improving estimates of forest area for the entire tropical belt using a two-class case. Their method is based on a proportion-scaling relationship between 30-m and 1-km data using measurements of forested and nonforested areas at these two resolutions. The coefficients of the linear model are additionally adjusted by an index of spatial pattern derived from the coarse resolution data (Mayaux and Lambin, 1995). This regression-type model is the basis for the second method adopted in our analysis, although our data contains multiple classes and a set of five increasingly coarser resolutions. With an eye toward general global application, we also examine the stability, and thus extensibility, of the slope relationships by evaluating them for the series of scales at a range of sites including a spatially and compositionally distinct site (Nile Delta). Considering multiple scales provides insight to the performance of the calibration models for intermediate sensor resolutions that will be provided by MODIS, MISR (Multi-Angle Imaging Spectroradiometer) and other future sensors. For example, different model types may perform better at different scales, and/or there may be intermediate scales at which the improvement in area estimates afforded by such models is limited. A multiresohition analysis also allows assessment of the form of scaling behavior and model performance across scales. The resolutions examined here correspond to the 250 m, 500 m, and 1000 m MODIS resolutions.
METHODS Study Sites The calibration site is the Plumas National Forest in the northern Sierra Nevada Mountains in California. The vegetation is composed of shrub formations and pine and oak woodlands at the lower elevations, mixed conifer and riparian hardwoods at intermediate elevations, and mixed conifer combined with brush at higher elevations. Brush and grasslands are distributed throughout the area and rock outcrops exist at high elevations. The primary validation site is the Stanislans National Forest. This area is also located in the Sierra Nevada roughly 1.5 ° south of the Plumas, and its vegetation can be characterized in much the same way. Both sites have been studied as part of a project to develop vegeta-
tion mapping and timber inventory procedures for the U.S. Forest Service (Woodcock et al., 1994a). Landcover maps have been produced using Landsat TM imagery and unsupervised image classification supported by air-photo and field validation (Woodcock et al., 1994a). Basic class types for both sites include grass/ barren, brush, hardwood, conifer, and water. The accuracy of these maps has been assessed using methods based on fuzzy sets (Woodcock et al., 1994b). Using a stringent MAX operator, the general land-cover classes are 84% and 79% accurate for the Phimas and Stanislaus sites, respectively. These values increase to 93% and 88% using a RIGHT operator (Woodcock and Gopal, 1992; Woodcock et al., 1994b). The Lake Tahoe Basin is used as an additional validation site for testing the calibrated models. This location contains the same basic classes as the Plumas and Stanislaus and has been mapped using the same classification procedures. The Nile Delta is used as a fourth site that is included in some of the analysis as an extreme example of extrapolating the calibration data. For each site, the 30-m land-cover data are aggregated to a series of coarser scales using a plurality-based aggregation procedure. For each coarser resolution of interest, a sampling grid is coded with the value of the most frequently occurring class within each grid cell. This method was used to generate new maps at 150 m, 240 m, 510 m, and 1020 m resolutions. The resulting dataset allows modeling of changes in land-cover proportions as a function of aggregation level. It is consequently possible to calibrate the models using the Plumas data, and invert them for the validation sites to estimate "true" proportions from coarse-scale area measurements. The measured class proportions at each scale for the three Sierra Nevada sites are given in Tables la-c and shown graphically in Figures la-e.
Proportion Correction Methods Using the Plumas data, the relationship between 30-m class proportions and proportions at each lower spatial resolution are calibrated for both the inverse estimator and the slope estimator methods. Once calibrated, each method is then inversely applied to the coarse-scale measurements of class proportions for the validation sites at each of the four scales (150 m, 240 m, 510 m, and 1020 m) and evaluated with respect to its ability to estimate the actual proportions as determined at 30 m. The results are further evaluated to indicate how model performance changes at different scales, whether the relative performance of the two models changes with scale, and the sensitivity of model performance to general landscape characteristics.
Inverse Estimator The methodology for the inverse estimator usually involves developing a transition matrix from a large sample
Correction of Area Estimates from Coarse Resolution Data
229
Table la. Class Proportions at Each Scale of Analysis for the Plumas National Forest Data Plumas Class Proportions Resolution Class Type
30 m
150 m
240 m
510 m
1020 m
Barren Brush Hardwood Water Conifer
5.70 20.2 18.0 1.0 51.2
3.70 18.4 16.5 1.6 57.2
3.10 16.8 16.2 1.1 59.9
2.20 14.0 15.7 1.1 64.6
2.60 11.1 15.3 1.O 68.7
Table lb. Class Proportions at Each Scale of Analysis for the Stanislaus National Forest Data Stanislaus Class Proportions Resolution Class Type
30 m
150 m
240 m
510 m
1020 m
Barren Brush Hardwood Water Conifer
17.1 24.8 10.9 0.47 44.3
16.4 23.1 10.2 0.51 48.8
15.7 22.2 9.80 0.52 51.0
14.6 20.1 9.20 0.53 55.2
13.9 19.0 8,50 0.47 58.0
Table lc. Class Proportions at Each Scale of Analysis for the Tahoe
Basin Data Tahoe-Basin Class Proportions Resolution Class Type
30 m
150 m
240 m
510 m
1020 m
Barren Brush Hardwood Water Conifer
14.9 25.4 1.05 0.40 53.3
13.5 25.3 0.53 0.34 59.4
12.9 24.0 0.33 0.31 61.7
12.7 21.4 0.14 0.14 65.3
12.3 17.5 0.0 0.0 70.1
of test sites distributed within the study area. In our analysis, however, the transition matrix is developed from the single calibration site and tested on the validation sites. As noted previously, this approach follows from the idea that, at the global scale, the sampling density will be such that each site will probably serve as a representative sample for a large area. While this limitation is likely the case in the M O D I S scenario, it may not apply for all large-area land cover studies, such as the aforementioned FAO study (FAO, 1995). If Trc is a k × k classtransition matrix developed from the calibration site c, Pr~ is a vector of the measured class pro_.~ortions from the validation site v at resolution r, and P0v is a vector of the estimates of true class proportions for site v, then Pot, = Trc* Pry.
(3)
The elements of the matrix Trc represent the percentage of each class k which is misclassified into each of the
other classes at resolution r (Kalkhan et ai., 1995). A separate transition matrix is developed for each of the four scales considered. Slope Estimator
The slope estimator is based on a linear model relating the actual and estimated cover-type proportions for the calibration site as
A0v = e~ro-/~0rc
#lre
(4)
where ~Orc and ~lrc are the intercept and slope of the proportion transition line developed from the calibration site at resolution r using the model in Eq. (2). In this case, the same coefficients/%~c and/%c are used for all classes k. The calibrated model for each scale is based on 20 measured pairs of values. That is, for a given scale, the calibration site is divided into four equal sized subregions. For each subregion, the initial proportions
a 0 A X
b
Plumas Stanislaus Tahoe
x
X
A
g,
X
1.0. Z~ A
o.
o
go. 2
m
~.
o
~o
4~o
800
d,o
200
~o'oo
,~o
,~o
8~o
1000
Resolution
Resolution
C
d
A
Q
O
o
A
z~
to
A
A
X
X
x
x
x o
X
X
200
0 6
x
,o0
o~o
800
x
2;o
1000
Resolution
400
600
800
10'0 0
Resolution
e
o.
X
x
o
.,=_ o
X
0
A A
A
2oo
400
6~o
8~o
lObO
Resolution
Figure 1. Class proportions as estimated acros:, scales for the Plumas National Forest, Stanislaus National Forest, and Lake Tahoe Basin: a) barren; b) brush; c) hardwood; d) water; e) conifer.
Correction of Area Estimatesfrom CoarseResolution Data 231
(P0) and the coarse-resolution proportions (Pr) are measured for each of the five classes. Using these data, a separate model is developed for each of the four scales considered (150 m, 240 m, 510 m, and 1020 m). The models are then used to correct coarse-resolution proportion estimates for the validation sites, and the results are evaluated with respect to model performance at each scale. To investigate the potential for extrapolation of this procedure, the true transition relationships are calculated for each of the four separate sites and evaluated with respect to the stability of their slopes across scales. As indicated previously, the Plumas, Stanislaus, and Tahoe are all relatively similar sites in the northern Sierra Nevada. The Nile Delta is a large low-lying site with a long history of human habitation. Multiple regression is used for a preliminary investigation into the relationship between land-cover spatial characteristics and the slope rate of change (SRC); that is, the rate at which the /~-values of the proportion transition lines change with increasing cell size (r). For this analysis, the data from the four subregions of the Plumas and an additional seven regions from the Stanislaus are pooled in order to increase the number of subregions within which proportion transitions and spatial characteristics are measured. The Stanislaus, therefore, no longer serves as a validation site in this context. For each subregion, proportion transition lines are calculated at each resolution. To approximate the SRC (which is nonlinear) we use the slope of the relationship between resolution (r) and the//1-values from the proportion transition lines. The resulting SRC values are then modeled as a function of (a) the proportion of the dominant class within the subregion (maximum class size) and (b) the mean patch size for each subregion. Patch size is determined using the r.le software developed by Baker and Cai (1992) and implemented using the Geographical Resource Analysis Support System (GRASS) software (USA-CERL, 1991).
Evaluation Criterion For the model validation phase of the analysis, the success of each proportion correction method is determined for each of the coarser resolutions (150 m, 240 m, 510 m, and 1020 m) using a measure of total error defined as
TEr= ~ I 'k°v--Pk°~ l
(5)
where Pk0vis the actual proportion of class type k for the validation site v and ihk0v is the estimated proportion. RESULTS AND DISCUSSION Calibrated Proportion Estimates Changes in class proportions due to aggregation of the 30 m Stanislaus land-cover map are illustrated in Fig-
ure 2. The distance between the line for a given class and the line below it represents the size of that class at the resolution of interest. For this site, there are small reductions in the proportions of barren and hardwood, a moderate reduction for brush, and a large increase for conifer (Fig. 2). The goal of our analysis is to reduce the scale-dependent changes in class proportions so that coarse-scale proportion estimates are closer to the proportions observed at 30-m resolution. The inverse estimator and slope estimator methods are tested for their ability to accomplish this. As described above, each correction method is calibrated on data from the Plumas site and subsequently inverted and applied to the Stanislaus and Tahoe sites. Proportion errors (Pkr~-Pk0~) based on the 1020 m proportion estimates are shown in Figures 3 and 4 for the Stanislaus and Tahoe sites, respectively. The uncorrected data are presented along with results from the two calibration methods. Symbols corresponding to the individual classes are displayed with respect to their original proportions along the zero line on the x-axis. Total errors for each method are also provided. Inverse EslJmator The transition matrix presented in Table 2 contains class confusions between the 30-m data and the 1020-m data for the Plumas. The matrix diagonal indicates the proportion of pixels originally assigned to each class that are still correctly labeled after aggregation. The off-diagonals down a column give the proportion of pixels that are labeled as one class, but really belong to another class. For example, 64 % of the 30-m pixels that are classified as barren after aggregation to 1020 m actually are barren at the original resolution. Roughly 20% of those pixels are actually brush at the original resolution, and so on. Relatively large transitions occur between barren and brush, brush and conifer and hardwood and conifer. These class pairs also tend to be spatially associated with each other in the landscape. Transition matrices, calibrated on the Plumas site for each scale, provide the matrices. (Trc) used for the inverse estimator [Eq. (3)]. Of the two methods, the inverse estimator produces the best results at 1020 m with a TEr of 10.26 and 16.73 for the Stanislaus and Tahoe sites, respectively (Figs. 3 and 4). At the Stanislaus site, this method performs worst for the hardwood class, which "absorbs" a much larger percentage of the conifer class in the calibration site (12.7%, see Table 2) than it does in the validation site (6%). Hardwood is therefore overcorrected using the inverse estimator. The opposite assessment applies to the undercorrection of barren which absorbs 4 % of the conifer class for the calibration site and nearly 8% for the validation site. Transitions from conifer to the other classes are important because, since conifer is such a large class these percentages represent a substantial area.
232 Moodyand Woodcock
Stanislaus Class Proportions O
....
O
barren [ i
brush hardwood water conifer
..........*.......
C
._o t~ o
o o
J
O} O~
o _m c5
......"..'2..:..:2.2.:2!:--=.=.==.=...... ~ . . . . . ~.~..==........................................................................ ~.......
E
O i'M O
o.
.........................
W. . . . . . . . . . . . . . . . . . . . . . . . . . .
t .................................................................
~. . . . . . . . . . . . . . . . . . . . . . . . . . .
•.......
O
50
100 Log(Resolution) (m)
500
1000
Figure 2. Class proportions for the Stanislaus (validation site) as a function of aggregation level. The size of a given class at a given resolution is represented by the distance between the line for the class of interest and the line below it. The inverse estimator probably performs well because the transition matrix carries information related to the relative size, spatial pattern, and typical adjacencies of the classes in the landscape. Cover types which are spatially disaggregated, such as brush, will tend to have a low value along the diagonal (correct classification) and will be redistributed among other classes which are m o r e highly aggregated spatially. This effect will be modified in part by the original size of the class under consideration. Similarly, there will be frequent transition between classes that tend to be adjacent to one another in the landscape. For example brush and conifer have a high degree of spatial association in the Plmnas Forest, which is reflected by the relatively large transition values between these two classes in Table 2. For the Stanislaus, the inverse estimator succeeds because the validation and calibration sites are similar in terms of spatial patterns, class composition, and class
proportions. The method may be sensitive, however, to a variety of situations that can occur when using an external calibration site. First, if the two sites do not contain the same suite of class types, then either the method will not work due to incompatible matrix dimensions, or if this problem is avoided, then proportions will be assigned to classes that do not actually exist. Second, if the relative proportions of the c o m p o n e n t classes in the two sites differ greatly, then they will have different transitional patterns, leading to errors in proportion estimates. For example, while the calibration site contains 18% hardwood, the Tahoe contains almost none. This disparity leads to the considerable overcorrection of hardwood for the Tahoe (Fig. 3). For the other classes, the Tahoe proportions are more similar to those found in the calibration site, and the model works well in these cases. It is interesting to note that the remaining classes in the Tahoe are all slightly
Correction of Area Estimatesfrom Coarse Resolution Data 233
Stanislaus: Estimation Errors for Calibrated Proportions (1020m)
T--
Uncorrected: TE : 25.05
[
Slope Estimator: TE = 16.80 nverse ~-st mater: TE = 10.26
T--
[] X <>
0
•
•
I
I
water hardwood barren brush conifer
O0
¢,0 0
/
n ,
~t
// /,/
n
Od
/ 0
// //
I
I
0.0
0.1
I
0.2 Original Proportion (Pko)
I
I
0.3
0.4
Figure 3. Corrected proportion estimates for the Stanislaus at 1020 m resolution. Results are presented for the uncorrected estimates, inverse estimator, and slope estimator. Symbols corresponding to the individual classes are displayed with respect to their original size along the zero-line on the x-axis. underestimated. This is probably because the hardwood class is absorbing more of each class than it should, due to the mismatch between the two sites. In a related case, if the spatial patterns (class-specific aggregation and interclass adjacencies) differ between sites, then the transitional patterns in the validation site will be different from those expressed in the calibrated matrix and poor area estimates will result. These problems with the inverse estimator are all resolved in situations where the transition matrix can be calibrated from a representative sample of small sites within the region of interest. This may not be a realistic scenario for global analysis, however, as the cost of sampling enough sites is prohibitive.
Slope Estimator Model Performance Figure 5 shows transition lines relating initial class proportions to coarse-resolution proportions. Each l i n e is
based on 20 values: five cover types for four subregions. As expected, the slopes increase in magnitude with coarser resolution. This expresses the general relationship that small classes disappear and large classes increase as the spatial resolution is degraded. The slope and intercept coefficients from these lines are used to supply the values of/~0rc and •lrc in Eq. (4) for calculating the proportion estimates for the validation sites. At 1020 m, the slope estimator reduces the TEr to 16.80 and 20.78 for the Stanislaus and Tahoe sites, respectively. For the Stanislaus, this method produces moderate overcorrections for water and conifer and undercorrects barren and brush (Fig. 3). The slope estimation probably performs well because it reflects the general relationship between class proportion and scale. That is, the slope of the lines increase with scale in response to the tendency for small classes to diminish and large classes to increase as the scene is aggregated
234
Moodyand Woodcock
Tahoe Basin: Estimation Errors for Calibrated Proportions (1020m)
Uncorrected: TE = 28.7 .Slope Estimator: TE = 20.78 inverse Estimator: TE = 16.73
tO
--[] × • •
\4\
0
\, \, 0
water hardwood barren brush conifer
\,\ \, \, N,
i
//// // / / / o
0
.f
n m
j.
// / //
\\
//
I
I
I
I
I
I
0.0
0.1
0,2
0.3
0.4
0.5
Original Proportion (Pko) Figure 4. C o r r e c t e d p r o p o r t i o n estimates for t h e T a h o e Basin at 1020 m resolution. I n t e r p r e t a t i o n is t h e same as Figure 3.
at coarser resolutions (Table 1, Fig. 1, and Fig. 5). This effect is moderated, however, by the spatial organization of the landscape. For example, a small but highly aggregated class (such as water) will disappear much more slowly than a dispersed class of the same size. This
effect is illustrated by the stability of the proportions for water seen in Tables la and lb and Figure le. Note also the stability of hardwood (a small- to moderatesized, aggregated class) in comparison to the instability of brush (a moderate-sized, disaggregated class) (Figs.
Table 2. Transition Matrix T~c R e p r e s e n t i n g Pixel R e a s s i g n m e n t D u e to Aggregation from 30 m to 1020 m Resolution for t h e Plumas (Calibration) site" Composition o f Land-Cover Classes at 1020 Meters Aggregated Cover Type Components Barren Brush Hardwoud Water Conifer
Barren
Brush
Hardwood
Water
Conifer
0.644 0.196 0.053 0.007 0.092
0.118 0.491 0.122 0.003 0.245
0.040 0.154 0.520 0.004 0.277
0.068 0.060 0.047 0.645 0.179
0.039 0.174 0.127 0.003 0.643
<'This matrL'¢ is used for the inverse estimator [Eq. (3)] to correct the 1020 in Stanislans Forest and Tahoe Basin proportion estimates.
235
Correction of Area Estimates from Coarse Resolution Data
Proportion Transition Lines Calibrated from Plumas 0
Z e r o - o n e line: S l o p e = 1.0 150m: Slope = 1.069 240m: Slope = 1.124 510m: Slope = 1.216 1 0 2 0 m : S l o p e = 1.301
......... .... -----
¢,..0 0
j
-'c" O.. v t...o 1E o
J
/ J
//
f
s~
,,i
."
.-
,
./
/
.~-
13_
I L'%I (:3
r~
• [] • Z~
150 m 240 m 510 m 1020 m
•
Z~
(:3 0 i
i
i
i
i
i
i
0.0
0.1
0.2
0.3
0.4
0.5
0.6
Initial P r o p o r t i o n ( P k o )
Figure 5. Calibrated transition lines for the Plumas (calibration site) used to support the slope estimator method.
lb, and lc). As a result of this d e p e n d e n c e on spatial pattern, the slope of the transition lines may vary considerably from one landscape to the next, as well as between classes. This suggests the importance of investigating class-specific as well as landscape-specific models, although only the latter are presently considered. For a landscape where the spatial distribution of classes is random, the function describing the transition line [Eq. (2)] for a given scale will characterize the fundamental relationship between class size and proportion error mentioned above. For realistic landscapes, the function deviates from this random situation, but the general relationship will still be expressed. Beyond this basic relationship, the slope estimator is sensitive to the particular characteristics of the calibration site, such as the spatial pattern of the classes and the initial class proportions in the scene. That is, the proportion transition slope for a given scale will depend on the complex interactions between resolution, spatial organi-
zation, and class proportions. The slope estimator does not depend, however, on the presence of the exact same suite of classes between the calibration and validation sites. In this sense, the slope estimator is a more extensible method than the inverse estimator, for which the classes present in the calibration and validation sites must be the same. If, as we assume, the general relationship expressed by the slope estimator is extensible, then this method should result in an overall site-wide improvement in proportion estimates as expressed by a measure such as TEr. On a class-specific basis, however, the performance may be quite variable. As such, the slope estimator should work moderately well across a wide range of landscapes, but probably not as well as a finely calibrated inverse estimator would for a narrower range of sites. It is interesting that the general patterns of model performance are similar for the Stanislaus and Tahoe sites (Figs. 3 and 4). For both sites, the inverse estimator
236
Moody and Woodcock
C) ('0
Plumas Stanislaus Tahoe Basin Nile Delta
Ckl T-"
#/ 1//
C) 0,1
,>"" 7 /
(I) Q. 0 tO (tl (..-
IO
,, t.O (~
C~ O "1-'-
0
I
I
I
I
I
200
400
600
800
1000
Resolution (m)
Figure 6. Slopes of the transition lines calibrated for four separate sites as a fimetion of scale.
overestimates hardwood and pertbrms quite well for the other classes. Similarly, the slope estimator water, greatly underestimates brush and overestimates conifer. Notice also that the slope estimator provides nearly the same value as the uncorrected estimate for brush at both sites. These similarities indicate that the two validation sites differ from the calibration site in much the same ways. For example, Tables l a - c show that, at 30 m, both validation sites have much more barren, more brush, and much less hardwood than the calibration site.
Slope Stability As described in the Methods section, we evaluate the stability of the transition relationships for each of the test sites. The transition slopes calculated for the four separate sites are presented in Figure 6. For each site, the slope vahles increase with increasing resolution cell size as expected. This demonstrates the stability of the general trend for large classes to get larger and small classes to get smaller with increasing cell size. A dispar-
ity is evident, however, between the scaling behavior of the three Sierra Nevada sites, and that of the Nile Delta. The steeper lines for the Sierra sites suggest that the underlying relationship is more pronounced and immediate at these sites, while the class proportions at the Egypt site are more stable within the range of resolutions considered. The Sierra sites are all relatively natural, mountainous landscapes with strong topographic and microclimatie controls on vegetation distribution. As such, these sites are characterized by heterogeneous patterns of vegetation where many small patches of class types are interspersed with other types. In this situation large classes will rapidly begin to dominate the landscape at coarse scales, because the smaller classes are mixed with other types rather than existing in homogeneous patches. In contrast, the Nile Delta is a large fiat landscape that has undergone dramatic modification by humans. As such, the environmental controls on the spatial extent of the component class types are fairly simple, and classes tend to occur in large homogeneous areas,
Correction of Area Estimates from Coarse Resolution Data 23 7
Stanislaus
Uncorrected ,SlopeEstimator reverse ~-stimator
]
0
LM
~
0 I--
0
/
T'-
/
/
/
/
/
/
/
/
/
/
I
~
I
/
/
/ tt~ I
I
I
I
I
200
400
600
800
1000
Resolution(m) Figure 7. Comparison of the total error for the correction methods across all scales. These are calibrated on the Plumas and tested on the Stanislaus. even when they are relatively insignificant classes in the overall landscape. These results suggest the possibility of a library of transition slopes for different classes of landscapes that are known to scale in distinct ways. For example, in this case one might calibrate the model for a naturally vegetated, high mountainous area using the average of the three Sierra sites, and calibrate for low-lying agricultural areas using the values from the Nile Delta or similar areas. This strategy would clearly involve a large amount of empirical evaluation and would also benefit from an improved understanding of the specific relationships between landscape spatial pattern and proportion error. Overall Performance Figures 7 and 8 show changes in total error (TE,) as a function of resolution for each scaling method when the Plumas-calibrated models are applied to the Stanislaus
and Tahoe, respectively. Note that for both methods and at both sites, the improvement of the model-based area estimates over the uncorrected estimates increases as a function of scale. For both sites, the inverse estimator extrapolates poorly at fine scales but does considerably better than the slope estimator at coarse scales. Even though the slope estimator is not the best method at the coarser scales, it always produces improved results over the uncorrected data. Conversely, the inverse estimator actually does worse than the uncorrected data at fine scales. W e expect that the inverse estimator performs badly at fine scales because, at these resolutions, the transition matrices (T,c) are affected by highfrequency peculiarities of the calibration site rather than general landscape-scale transition patterns. This may also indicate that the fine-grained structure of the test sites differ from the calibration site, but the coarsegrained structures are more similar. These results are consistent with the discussion of
238 Moody and Woodcock
Tahoe Basin
tO Od
Uncorrected
Slope Esti .m.ato.r verse ~-sumat0r
0 t'M / LU
//
tl/
/
/
/
/
/
/
f
/
/
I',r-o~ ° ~
-~
j/.--./~ j-J / / / / / /
0
/
/
3/,/
/
/ I
200
I
I
400
600
I
800
I
1000
Resolution (m) Figure 8. Comparison of the total error for the correction methods across all scales. These are calibrated on the Plumas and tested on the Tahoe Basin.
Figures 3 and 4 above. That is, the slope estimator, as a generalizable procedure, should perform moderately well and will probably do so over a wide variety of landscapes. This method, however, incorporates no information that is specific to the spatial characteristics of the landscapes in question and so will fail to do extremely well, even in similar landscapes such as the three sites considered here. The inverse estimator is more specific to the typical adjacencies and landscapescale patterns in the calibration site and thus implicitly carries some information about spatial organization. While this method does well in the case of the sites presented here, it remains to be seen whether it is more or less extensible than the slope estimator, particularly in the context of region-based calibration for global land-cover characterization. A remaining question regards the relationship between the slope rate of change (SRC) and the spatial
characteristics of the land-cover classes. Although not our primary purpose here, we address this issue briefly using data from the Plumas and Stanislaus subregions as described in the Methods section. Recall that SRC or the scale-dependent rate of change in the slopes of the transition lines is modeled as a function of maximum class size and mean patch size. The regression results shown in Table 3 concur with expected relationships between proportion error and land-cover properties. The positive coefficient for maximum class size indicates that as the size of the largest class increases, there is an increasingly large shift in class proportions with each scale change. Similarly, the negative coefficient for mean patch size indicates that as the typical patch size in the scene increases, the resistance of class proportions to aggregation also increases; and the slopes of the proportion transition lines change more slowly as a function of scale. A surface illustrating the relationship between
Correction of Area Estimates from Coarse Resolution Data
239
"°°'°""°-°.. °. . . . "'-.°. .° °--..°.
"'°'"'"'"'"'...°.....°.. eq._ 04
.
T'*
0 OT'_. .,r,..r~
b 0 0
Figure 9. Best fit surface generalizing relationship between maximum class size, mean patch size, and the slope rate of change (SRC) for the slope estimator method.
m a x i m u m class size, m e a n patch size, and SRC is developed using the 11 data points and shown in Figure 9. While interpreting a surface created from so few data points is problematic, Figure 9 is presented as an indication of the type of controls that may be operating in this landscape type. For this small dataset, m a x i m u m class size and mean patch size appear important in controlling the rate at which proportional errors will develop as a function of scale. However, this preliminary assessment has some
limitations. First, the p o w e r of the statistical analysis is limited due to the small sample size. Second, since all the data subsets come from similar sites, they represent only one end of the spectrum in terms of the potential complexity of classes in a landscape. Note, for example, the m u c h lower SRC in the Nile Delta data (Fig. 5), which is coincident with very large patch sizes. A better understanding of these relationships is important and requires a m o r e complete analysis than is presented here.
Table 3. Regression Summary for the Model Relating the Slope Rate of Change (SRC) to the Size of the Maximum Class and the Mean Patch Size for 11 Subregions from the Plumas and Stanislaus Sites a
CONCLUSIONS
Intercept Maximum class size Mean patch size
Coefficient
Standard Error
0.0117 0.0092 - 0.0006
0.0031 0.0039 0.0002
t-value e > Itl
3.76 2.39 - 2.55
0.006 0.044 0.034
a The adjusted Re for the model is 0.62 F = 6.579 on 2 and 8 degrees of freedom, p = 0.02.
Methods for correcting low-resolution m e a s u r e m e n t s of land-cover proportions are respectively calibrated and tested on a set of extensive, forested areas in the Sierra Nevada Mountains. In contrast to related analyses that calibrate models with many test sites that are internal to the study area, we calibrate the models on a large external test site and apply the models in an extrapolative sense on spatially distinct validation sites. This approach more realistically accommodates the nature
240
Moody and Woodcock
of high-confidence data that likely will be available for global land-cover mapping missions such as the MODIS land cover/land-cover change activities. Results achieved using the inverse estimator and the slope estimator indicate potential for using a posteriori calibration methods to improve land-cover proportion estimates over large areas. In both cases, while classspecific accuracies are quite variable, considerable imprnvements in total error are realized, particularly at coarser scales. If properly integrated into the process of developing global land-cover datasets, these methods should provide improved estimates of class proportions and areas of land-cover change. Considerable effort is needed to understand better the specific sensitivities of these methods to the spatial and proportional characteristics of the calibration and validation sites. While the inverse estimator performs well, this method may break down when used in the extrapolative sense, and produce spurious results if the calibration site either has very different interclass associations or contains different component classes from other areas in the region. If this method is too sensitive to the characteristics of the calibration site, it will not be well suited to the production of global data sets where intensive calibration activities are not usually feasible. The slope estimator responds to a more generalizable scaling behavior where large classes increase in size and small classes decrease when larger cell sizes are used to sample the landscape. In this study, the same calibration model is used for each class in the taxonomy, although nonlinear and class specific models might also be used to great advantage. While different landscape types will have different transition slopes, within-type calibrations may be fairly consistent. If so, then this method may be more regionally stable, or stable within general landscape taxa than the inverse estimator. For the sites considered here, the slope estimator is also more consistently effective across scales and produces improved results even at fine resolutions when the inverse estimator extrapolates poorly. The slope estimator can probably be improved with the addition of variables that explicitly describe the spatial characteristics of the landscape, such as class aggregation, maximum class size, or patch size (Mayaux and Lambin, 1995). The incorporation of spatial measures especially may improve results for cover types that are of a moderate size in the landscape. That is, in those cases where the general relationship (large classes grow and small classes shrink with aggregation) is least stable, measures of spatial pattern may help to resolve the scale-dependence of cover-type proportions. As a logical extension of this research, the assessment of these methods in a probabilistic sampling framework based on a large n u m b e r of sites would provide the necessary information to objectively assess the qual-
ity and reliability of modeled estimates. For example, the combination of these methods with data such as the FAO samples could be used to improve the calibration and evaluation of modeled area estimates. Additionally, such models may be improved by incorporating nonlinear and / or class-specific terms. This work was' supported in part by the National Aeronautics and Space Administration under Contract NAS5-31369. We gratefully acknowledge Raymond Czaplewski and Eric Lambin for thoughtful comments on earlier drafts of the manuscript. We also thank Scott Macomber and Mary Pax-Lenney for the use of larut-cover maps.
REFERENCES
Baker, W. L., and Cai, Y. (1992), The r.le programs for multiscale analysis of landscape structure using the GRASS geographical information system, Landscape Ecol. 7(4): 291-302. Borak, J., Fisher, P., Strahler, A., and Moody, A. (1995), Local-scale evaluation of a technique for land-cover classification based on composited NDVI data, in Proc. Am. Soc. Photogramm. and Remote Sensing Conf. 27 Feb.-2 Mar., Charlotte, NC, American Society for Photogrammetry and Remote Sensing (ASPRS), Bethesda, MD, Vol. 3, pp. 796805. Brown, P. J. (1982), Multivariate calibration, J. Roy. Star. Soc. 3:287-321. Czaplewski, R. L., and Catts, G. P. (1992), Calibration of remotely sensed proportion or area estimates for misclassification error, Remote Sens. Environ. 39:29-43. Ehrlich, D., and Lambin, E. F. (1996), Broad scale land-cover classification and interannual climatic variability. Int. J. Remote Sens. 17(5):845-862. Food and Agriculture Organization (FAO) (1995), Forest resources assessment 1990: global synthesis, Rome, Italy, FAO Forestry Paper No. 124, 93 pp. Gervin, J., Kerber, A., Witt, R., Lu, Y., and Sekhon, R. (1985), Comparison of level I land cover classification for MSS and AVHRR data, lnt. J. Remote Sens. 6(1):47-57. Kalkhan, M. A., Reich, R. M., and Czaplewski, R. L. (1995), Evaluation of statistical properties of the inverse estimator for remotely sensed areal estimates using simple random sampling in Proc. Ant. Soe. Photogramm. and Remote Sensing Conf., 27 Feb.-2 Mar., Charlotte, NC, American Society for Photogrammetry and Remote Sensing (ASPRS), Bethesda, MD, Vol. 3, 258-270. Lloyd, D. (1990), A phenological classification of terrestrial vegetation cover using shortwave vegetation index imagery, Int. J. Remote Sens. 11:2269-2279. Loveland, T. R., Merchant, J. W., Ohlen, D. O., and Brown, J. F. (1991). Development of a land-cover characteristics database fi~r the conterminous U.S., Photogramm. Eng. Remote Sens. 57:1453-1463. Mayaux, P., and Lambin, E. F. (1995), Estimation of tropical forest area from coarse spatial resolution data: a two step correction fimction for proportional errors, Remote Sens. Environ. 53:1-16.
Correction of Area Estimates from Coarse Resolution Data
Moody, A., and Woodcock, C. E. (1994), Scale-dependent errors in the estimation of land-cover proportions -implications for global land-cover datasets, Photogramm. Eng. Remote Sens. 60(5):585-594. Moody, A., and Woodcock, C. E. (1995), The influence of scale and the spatial characteristics of landscapes on land-cover mapping using remote sensing, Landscape Ecol. 10(6):363379. Price, J. C. (1984), Land surface temperature measurements from the split-window channels of the NOAA 7 Advanced Very High Resolution Radiometer, J. Geophy. Res. 89(D5): 7231-7237. Roesch, F. A., Van Deusen, P. C., and Zhu, Z. (1995), A comparison of various estimators for updating forest area coverage using AVHRR and forest inventory data, Photogram. Eng. Remote Sens. 61(3):307-311. Running, S., Loveland, T., and Pierce, L. (1994), A vegetation classification logic based on remote sensing for use in global biogeochemical models, Ambio 23(1):77-81. Strahler, A., Moody, A., Lambin, E., et al. (1995), MODIS Land Cover Product: Algorithm Technical Basis Document, NASA document for MODIS Product No. 11, Parameter Nos. 2669 and 2671, Greenbelt, MD. Townshend, J., and Justice, C. (1988), Selecting the spatial resolution of satellite sensors required for global monitoring of land transformations, Int. J. Remote Sens. 9(2):187-236. Townshend, J., Justice, C., and Kalb, V (1987), Characteriza-
241
tion and classification of South American land cover types using satellite data, Int. J. Remote Sens. 8(8):1189-1207. Turner, M., O'Neill, R., Gardner, R., and Milne, B. (1989), Effects of changing spatial scale on the analysis of landscape pattern, Landscape Ecol. 3(3 / 4):153-162. USA-CERL (1991), GRASS 4.0 User's Reference Manual, U.S. Army Corps of Engineers Construction and Engineering Research Laboratory, Champaign, IL. Walsh, T. A., and Burk, T. E. (1993), Calibration of satellite classifications of land area, Remote Sens. Environ. 46:281290. Woodcock, C. E., and Gopal, S. (1992), Accuracy assessment of the Stanislaus Forest vegetation map using fi~zzy sets, in Pro& Fourth Biennial Remote Sensing Applications Conf. Bethesda, MD, April, Orlando, FL, American Society for Photogrammetry and Remote Sensing (ASPRS), 378-394. Woodcock, C. E., and Strahler, A. H. (1987), The factor of scale in remote sensing, Remote Sens. Environ. 21:311332. Woodcock, C. E., Collins, J., Gopal, S., et al. (1994a), Mapping forest vegetation using Landsat TM imagery and a canopy reflectance model, Remote Sens. Environ. 50:240-254. Woodcock, C. E., Gopal, S., Macomber, S. A., and Jakabhazy, V. D. (1994b), Accuracy assessment of the vegetation map of the Plumas National Forest, Technical Report, Center for Remote Sensing, Boston University, 19 pp.