International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
Contents lists available at SciVerse ScienceDirect
International Journal of Applied Earth Observation and Geoinformation journal homepage: www.elsevier.com/locate/jag
Area estimation from a sample of satellite images: The impact of stratification on the clustering efficiency Francisco Javier Gallego ∗ , Hans Jürgen Stibig Joint Research Centre of the EC, Italy
a r t i c l e
i n f o
Article history: Received 8 July 2011 Accepted 7 March 2012 Keywords: Stratified sampling Area estimation Land cover Cluster sampling Correlogram
a b s t r a c t Several projects dealing with land cover area estimation in large regions consider samples of sites to be analysed with high or very high resolution satellite images. This paper analyses the impact of stratification on the efficiency of sampling schemes of large-support units or clusters with a size between 5 km × 5 km and 30 km × 30 km. Cluster sampling schemes are compared with samples of unclustered points, both without and with stratification. The correlograms of land cover classes provide a useful tool to assess the sampling value of clusters in terms of variance; this sampling value is expressed as “equivalent number of points” of a cluster. We show that the “equivalent number of points” is generally higher for stratified cluster sampling than for non-stratified cluster sampling, whose values remain however moderate. When land cover data are acquired by photo-interpretation of tiles extracted from larger images, such as Landsat TM, a sampling plan based on a larger number of smaller clusters might be more efficient. © 2012 Joint Research Centre of the European Commission. Published by Elsevier B.V. All rights reserved.
1. Introduction: stratified spatial sampling Spatial sampling is a frequent tool in various types of surveys. In demographic or social sciences it can be seen as an alternative to more traditional approaches (Haining, 1990; Kumar, 2007). In forestry there is a long tradition of spatial sampling (Finney, 1948; Bickford et al., 1963; Scott, 1998) because list-frame alternatives are seldom feasible. For the same reason the use of spatial sampling in ecology started very early (Bormann, 1953; Cottam and Curtis, 1956). Spatial sampling is widely used in virtually all environmental fields, from soil (Brus and De Gruijter, 1997) to plant ecology (Fortin et al., 1989) or wildlife studies (Morrison et al., 2008), in which the size of the population is largely unknown and indirect sampling methods need to be used (Buckland et al., 2001). Stein and Ettema (2003) give an overview of spatial sampling techniques. For agricultural statistics the expression “area frame sampling” is often used for spatial sampling to highlight that it is an alternative or a complement to “list frame sampling” (Faulkenberry and Garoui, 1991; Ford et al., 1986; Pradhan, 2001). In an area frame survey the sampling frame is defined by a cartographic representation of the surveyed territory and a rule that defines the sampling units. Area frames generally match quite precisely the population, but the concepts are different. For example the population can be the European Union (EU), and the sampling frame may be a
∗ Corresponding author at: JRC, tp. 483, Via Fermi 2749, 21027 Ispra (VA), Italy. Tel.: +39 0332 785101; fax: +39 0332 789029. E-mail address:
[email protected] (F.J. Gallego).
representation of the EU in a given cartographic projection divided into squares of 1 km2 . The choice of a projection, necessary for the precise definition of the frame, is not a trivial question, especially if the territory is large and contains regions or countries for which the cartographic material is available in different projection systems. For area estimation it is desirable that the cartographic projection is equal-area. For the specific case of the EU, the INSPIRE initiative (INitiative for SPatial data InfrastructuRE) recommends a Lambert-Azimuthal Equal Area projection (Annoni et al., 2001). The units of an area frame can be points, transects (lines of a certain length) or pieces of territory, often named segments. When the units of the frame are points, the survey may be called a “point frame survey”. Points are in principle dimensionless, but in practice they may be defined as having a certain size or support, for example the EU Land Use and Cover Area-frame Survey (LUCAS) defines a point as a circle of 3 m diameter (Gallego and Delincé, 2010). Stratification generally improves the standard error of estimates. The recommended number of strata is also a frequently debated issue: classical textbooks of sampling techniques recommend a small number of strata unless the covariable on the basis of which the stratification is defined is very strongly correlated with the target variable(s) (Cochran, 1977, pp. 132–134), but other studies prefer building a large number of small strata (Brus et al., 1999). Stratifications with a large number of strata are often based on a geometric partition, for example the cells of a square grid. Remote sensing studies targeting area estimation may need to sample satellite images if the target region is very large. For example the TREES-2 project (TRopical Ecosystem
0303-2434/$ – see front matter © 2012 Joint Research Centre of the European Commission. Published by Elsevier B.V. All rights reserved. doi:10.1016/j.jag.2012.03.003
140
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
Environment observation by Satellites) used a sample of 100 Landsat TM images (full or quarter scenes) to estimate rainforest changes between 1993 and 1997 (Achard et al., 2002). The FRA2010 (Forest Resource Assessment) remote sensing survey (http://www.fao.org/forestry/fra/remotesensingsurvey/en/) analyses TM images on a sample of tiles of 10 km × 10 km and 20 km × 20 km located with a systematic pattern every degree of latitude–longitude across the world (Eva et al., 2010; Potapov et al., 2011; Brink and Eva, 2009). In Europe a sample of approximately 200 sites of 10 km × 10 km is being assessed by the Geoland2 Project (http://www.gmes-geoland.info). An example of failure of an estimation system based on a sample of satellite images is the so-called MARS Activity B (Meyer-Roux and Vossen, 1994) whose aim was providing rapid estimates of crop area change in the EU based on a panel or sample of 60 squares of 40 km × 40 km. The failure of this method had little to do with the use of a sample or with the sampling method; it was rather due to the use of pixel counting as a basis for the estimation without calibration with ground data. With this method the margin for subjectivity of the estimates was much wider than the inter-annual variability of crop areas and the role of images became purely cosmetic (Gallego, 2006). Stehman et al. (2005) compare the efficiency of sampling units of 5 km × 5 km, 10 km × 10 km and 20 km × 20 km using the map of the NOAA-Coastal Change Analysis program (CCAP). Although initially the system is fed with Landsat-TM images, the sampling scheme might be better adapted to very high resolution (VHR) images. Sampling satellite images is also useful for estimating the thematic accuracy of land cover maps (Stehman, 2009). Stehman (2005) analyses the issue of image sampling from the perspective of the overall error obtained by combination of the systematic error and sampling error, highlighting that a sampling approach can give better estimates if it reduces the bias. A practical way to assess the potential variance and possible bias of area estimates using a sampling plan is considering a land cover map as pseudo-truth (Eva et al., 2010; Stehman et al., 2005). With an exhaustive knowledge of the region it is easy to check if an estimator is biased or to compute the standard error of different possible estimators. In classical survey literature the usefulness of cluster sampling is generally assessed by comparing the variance yielded by a simple random sample of n unclustered elementary units with the variance yielded by a sample of n elementary units grouped in n/M clusters. The elementary units are usually well defined entities, such as a person, a household or an enterprise. Clustering typically leads to an increase of variance that is compensated by lower costs. In spatial sampling we need to attribute a size or support to the sampling unit that is to some extent arbitrary and so is the number M of elementary units in a cluster; thus comparing with a sample of n/M clusters may be misleading. A square of 10 km × 10 km may be seen as a sampling unit with an enlarged support. We compare the variance yielded by a sample of n points (elementary units with a small support) and the variance yielded by a sample of n largesupport units to quantify the effect of enlarging the support (and increasing the cost per unit). However we prefer to look at the large support units as clusters of points because we use as a key tool the intra-cluster correlation (ICC) that is typical of the cluster sample analysis and we make a less usual comparison between a sample of n unclustered points and a sample of n × M clustered points. The sampling efficiency of clusters with a given size in terms of variance depends on the spatial structure of the target variable. For model-based processes the sampling error can be predicted from the spatial structure described by the variogram (Domburg et al., 1994; de Gruijter et al., 2006). The method developed by Domburg et al. not only applies to cluster sampling, but also covers a wider range of sampling approaches.
The sampling efficiency can be quantified estimating a coefficient Q such that a sample of nQ unclustered points yields the same variance as a sample of n large-support units or clusters. This coefficient Q depends on the spatial structure of each land cover type and can be called the value or “equivalent number of points” of a cluster (Gallego, in press). The aim of this paper is exploring the impact of stratification in the cost-efficiency assessment of clusters compared with unclustered sampling of points.
2. Areas of study and data We consider two test areas, far away and different from each other: The 15 countries that were members of the EU in 2001 (EU15), and South-East Asia (SEA). In both test areas we have used Global Land Cover 2000 (GLC2000) (Mayaux et al., 2006) as a basis for stratification. GLC2000 has been produced using VEGA 2000, a dataset of 14 months of daily global data acquired by the VEGETATION instrument on board SPOT 4. The nomenclature is based on the Land Cover Classification System (LCCS), developed by the FAO (Di Gregorio, 2005). The EU15 test area measures approximately 3.2 million km2 . Besides GLC2000, used for stratification, two sets of data were used as source of land cover information with fine resolution and coarse resolution: As coarse resolution data we used CLC2000, produced by photointerpretation of a mosaic of Landsat ETM+ images (JRC-EEA, 2005). CLC2000 can be downloaded from the data service of the European Environment Agency (http://www.eea.europa.eu/dataand-maps/data). CLC2000 is a polygon layer with a minimum mapping unit of 25 ha, but it is usually distributed as a raster layer with 100 m resolution. This is the format we have used for this work. The results in terms of “equivalent number of points” can be interpreted as “number of 100 m pixels”, but the results are the same for a smaller support size with only tiny changes due to boundary effects: the reason is that the information given by a “small point” (say 1 m) is the same as the information given by the corresponding 100 m pixel. Thus saying that a cluster or sampling unit of 10 km is equivalent to 5 pixels of 100 m can be read also as “equivalent to 5 points of 1 m”. As fine resolution data set, we used LUCAS 2001 (Land Use/Cover Area-Frame survey). LUCAS 2001 was carried out in EU15 with a two-stage systematic sampling that can be looked at as a singlestage systematic sampling (Gallego and Bamps, 2008; Gallego and Delincé, 2010): Primary Sampling Units (PSU) were selected with a grid of 18 km without stratification. Each PSU is a cluster of 10 points following a 5 × 2 rectangular pattern with a 300 m step. The point is conceived as a circle of 3 m diameter. For South-East Asia (SEA) we had no fine resolution data equivalent to LUCAS. The main data came from the photo-interpretation of 416 clusters of approximately 20 km × 20 km carried out in the TREES-3 project (Achard et al., 2010; Bodart et al., 2011). The clusters are located in the 1◦ latitude–longitude nodes (Fig. 1). They are linked with the design of the FAO FRA2010 sample (Potapov et al., 2011). The photo-interpretation nomenclature was rather simplified: Tree cover, tree cover mosaic, other wooded land, other land cover (including cropland) and water, plus different types of missing data (clouds, shadows, and other missing data). We specifically address a deforestation or “tree cover loss” indicator between 1990 and 2000. The indicator is computed as the area that changes from “tree cover” to “other” (wooded or not) plus half the area that changes from “tree cover” to “mosaic” and half the area that changes from “mosaic” to “other”; this implicitly assumes that the mosaic has roughly 50% of forest.
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
141
Fig. 1. Photo-interpreted sample in South-East Asia.
LUCAS and TREES-3 do not provide exhaustive “pseudo-truth” land cover data. This has limited some of the tests and analysis that can be performed using CLC2000 as target data set. 3. Methods
V (y¯ ch ) =
We consider square clusters of elementary units (for example pixels). We make the simplifying assumption that all clusters in the sampling population have the same size M, so that the population of N elementary units is divided into N = N/M clusters. To compare sampling plans based on clusters and sampling plans based on unclustered points, a frequent approach is defining two sample designs (clustered and unclustered) with the same survey cost and trying to find out which design yields a lower variance of the estimates. Here we consider a different way to communicate with the user: If a sample of n clusters yields the same variance as a sample of n × Q unclustered points, we can say that a cluster is worth Q unclustered points. The ICC is a traditional tool to assess cluster sampling: if yij is the value for the elementary unit j within cluster i, the variance in each stratum h is: N
1 = (yij − y¯ h )2 estimated by Nh M − 1 h
Sh2
M
i=1 j=1
sh2 =
1 nh M − 1
We take a stratified random sampling of nh clusters in each stratum h. Here we consider a single stage scheme, i.e. all the M units in each sampled cluster are observed; in this case the variance of the estimate y¯ ch can be written (Cochran, 1977, ch. 9):
n h
Nh − nh 2 S (1 + (M − 1)M,h ) nh Nh M h
where the subscript c refers to a cluster-based estimation. If we have data for a suitable sample of points inside each cluster, the ICC M,h can be estimated by ˆ M,h =
(yij − y¯ h )2
(1)
i=1 j=1
nh T
1 nh M(M − 1)sh2
(yij − y¯ h )(yik − y¯ h )
(3)
i=1 (j,k)=1
where (j,k) refers to all T = M(M − 1) possible pairs of points inside the cluster i and sh2 is the classical estimator for the variance. Obviously all formulas lose the strata subindex h when we consider non-stratified sampling. The ICC is closely related to the ratio between the within-cluster and between-clusters components of the variance. Combining the classical analysis of variance for stratified sampling and cluster sampling (Lohr, 2010, Ch. 3.4 and 5.2), we can get three main components (Table 1): variance within clusters, between clusters inside each stratum and between strata. M,h = 1 −
M
(2)
2h M3h ≈ (M − 1)(2h + 3h ) (2h + 3h )
(4)
If suitable data are available, the impact of stratification on the clustering efficiency can be examined by comparing the ICC (or alternatively the variance components) for stratified and
Table 1 Variance components for stratified cluster sampling. Degrees of freedom
Sum of squares
Variance
H
Between strata
H−1
1 =
¯ Nh (y¯ h − y)
2
S12 = 1 /(H − 1)
h=1 N
Between clusters in each stratum
Nh − 1
2h =
h
M(y¯ i − y¯ h )
2
2 S2h = 2h /(Nh − 1)
i=1 N
Within clusters in each stratum
Nh (M − 1)
3h =
M h
i=1
j=1
(yij − y¯ i )
2
2 S3h = 3h /Nh (M − 1)
142
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
non-stratified sampling. In particular when we use a land cover map as “truth”, computing the ICC and Q is easy: we have an exhaustive knowledge of the population and we can simulate any type of sampling. We could have used this approach with the coarse resolution data set CLC2000 in EU15, but we prefer to follow an approach that is suitable for all our test cases. For SEA, having data for a sample of clusters of 20 km × 20 km, we can easily simulate smaller clusters, but the assessment of larger clusters is not straightforward. An additional difficulty appears when we want to exploit the LUCAS 2001 point observations: the data available for each cluster of 10 km × 10 km or 20 km × 20 km is absolutely insufficient: only 10 points arranged in a rectangle of 1500 m × 300 m. For a given cluster size M the approximate value (or equivalent number of points) of a cluster in each stratum is QM,h ≈ 1/M,h unless the sampling rate is large or the sample size nh and M are small (Gallego, in press). The value QM,h is quite stable in each stratum when nh changes, but the overall value of QM in the region is not so stable when the sample size nh changes in a way that is not proportional from one stratum to another. We assume now that the sampling rate for different strata is constant (i.e., proportional allocation). The computation of a pooled value of QM across strata is easier if we consider the area estimates for a given land cover type zˆh = Dh y¯ h , where Dh is the total area of stratum h. We have QM,h ≈ V (ˆzh )/V (ˆzch ) and QM ≈ V (ˆz )/V (ˆzc ), where V (ˆzh ) is the variance of the area h and V (ˆz ) = estimation with a sample of nh points in stratum V (ˆzh ) is the sum across strata. V (ˆzch ) and V (ˆzc ) =
h
V (ˆzch ) are
h
the corresponding variances based on a sample of nh clusters in each stratum. We can derive
Q V (ˆzch ) h M,h QM = h
(5)
V (ˆzch )
This means that QM is a weighted average of the value QM,h in each stratum with a weight V (ˆzch ) = Dh2 V (y¯ ch ). We can group the pairs of points inside the same cluster of points into groups of pairs at a distance d with a certain tolerance. We have a partition of all possible pairs into a finite set of groups and we
can write T =
d , where d is the number of pairs at distance
d
d inside a cluster. The suitability of the sample inside the cluster means amongst others that d is sufficiently large for each distance d; this condition is not met with the LUCAS-2001 sample. Systematic sampling schemes generally provide a large number of pairs for certain distances, but information is missing for other distances that are not compatible with the sampling plan. For example the FRA2010 sample used in SEA provide a large number of pairs at a distance of less than 28 km (length of the diagonal of each cluster), but there are no pairs of points at a distance between 28 km and 80 km. We can rewrite Eq. (5) as 1 d 1 = nd T d S 2 h nh
ˆ M,h
dM
i=1 d=0
d(j,k)=d
(yij − y¯ h )(yik − y¯ h ) =
dM
d
d=0
T
(d)(6) ˆ
where dM is the maximum distance of pairs inside a cluster and (d) ˆ is the estimated correlogram at distance d. Thus the intracluster correlation can be computed as a weighted average of the correlogram of elementary units, that can be estimated from a sample of points, such as LUCAS, that are not adapted to estimate the intra-cluster correlation. We can use Eq. (6) to estimate the intra-cluster correlation ˆ M,h if we have no limitation on the availability of data to compute the correlogram for a sufficiently dense set of distances d, but in this case straightforward computation of ˆ M,h through
Fig. 2. Computed and adjusted correlograms for tree cover loss in South-East Asia (non stratified). Table 2 Strata definition in EU15 and strata extent (in %) for each cluster size. Strata
Cropland per cluster in GLC2000 (%)
5 km
10 km
20 km
30 km
0 1 2 3 4 5 6
0 0–5 5–20 20–40 40–60 60–80 >80
23.1 10.4 12.2 9.5 9.3 11.0 24.5
18.5 11.2 13.3 11.2 10.9 12.2 22.7
10.0 15.8 14.6 12.6 12.5 14.7 19.8
6.4 17.6 14.7 13.7 13.7 16.0 17.9
Eq. (3) is also possible and Eq. (6) does not give a particular advantage. The limitation that arises from the missing (d) ˆ for certain distances can be overcome by fitting a decreasing function to the correlogram. With our data we have found that the family of func˜ d = exp(a × db ) gave a good fit to the correlograms. This tions means that we assume a smoothly decreasing correlogram (Fig. 2); the assumption seems to be confirmed if we consider the values of the correlograms computed on a very large number of pairs (d > 10 000). The very large number of pairs required to have stable correlograms may seem surprising in front of a statistical tradition that considers that a sample is large when the sample size is above 30, but is supported by the variance computation of correlograms for 0–1 variables (Gallego, in press). The stratifications in both test areas have been based on GLC2000. In EU15 the stratification rule was focused on agricultural land cover. It was defined through the percentage of agricultural land per cell (cluster) according to GLC2000 (Table 2). The intervals were arbitrarily chosen, but roughly correspond to the usual choice in agricultural area frame surveys (Taylor et al., 1997). In SEA the stratification was more focused on forest; GLC2000 was recoded into 3 classes: tree cover, tree cover mosaic, and other land cover (including other natural or woody vegetation and agricultural land). On this basis, 4 strata were defined (Table 3). As Tables 2 and 3 show, the cluster size has a strong impact on the stratification in the European test case and a more moderate impact in South-East Asia. As it can be expected, strata that correTable 3 Strata definition in South-East Asia and strata extent (in %) for each cluster size. Strata
GLC2000
1 2 3 4
Tree cover ≥50% Mosaic tree cover ≥50% Other land cover ≥50% Each class <50%
Cluster size 5 km
10 km
20 km
30 km
43.9 13.7 37.3 5.1
42.9 12.9 37.1 7.1
42.2 12.3 36.5 9.0
41.7 12.1 36.6 9.6
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146 Table 4 Equivalent number of points of sites of different size for vineyards in EU 15 (fine resolution). Strata
0 1 2 3 4 5 6
Site size 5 km
10 km
20 km
30 km
18.1 7.5 8.9 5.6 4.8 6.9 3.2
52.0 13.4 10.6 6.6 6.0 10.2 3.7
177 28.6 12.9 8.0 8.0 16.6 4.4
378 48.4 14.7 9.0 9.6 23.1 5.0
spond to strong dominance or absence of a given land cover class are larger if the cell size is small (fine resolution of stratification), while mixed strata are larger when the cell size is large (coarse resolution stratification).
For EU15 we consider in this paper three examples of land cover type that are reported both by fine resolution (LUCAS) and by coarse resolution data (CLC): arable land, forest and vineyard. Arable land and forest are two major classes, while vineyard has been chosen as an example of relatively minor land cover type with a more irregular spatial distribution. In addition we examine two examples of annual crops (wheat and sunflower) that are not reported by CLC because CLC is not updated every year and therefore does not consider classes that usually change from one year to another. Fig. 3 ˜ d = exp(a × db ) of the class represents the adjusted correlograms vineyard at fine scale, i.e. using LUCAS data in each stratum of the GLC2000-based stratification defined in Table 2. The value of a cluster, in terms of equivalent number of points, changes very much from one stratum to another. Table 4 illustrates this variability for the example of vineyards (LUCAS data) while Table 5 reports the equivalent number of points per stratum for the “tree cover loss” indicator in SEA. These two examples correspond to substantially different situations: vineyards in the EU have a very different spatial structure from one stratum to another, for example in stratum 0 it only appears in a few scattered plots; spatial patterns of tree cover loss in SEA seem to change much less from one stratum to another. As stated above, the overall value QM for a cluster is a weighted average of the value in each stratum, but the weight depends on the sampling plan (sampling rate in each stratum). In order to obtain an overall indication of the value QM , we have assumed a sample size in each stratum proportional to the size of the stratum (uniform sampling rate). This leads us to the values QM for stratified and non-stratified random sampling reported in Tables 6 and 7. In the particular case of vineyards fine resolution data, we can see that the high values of QM,h for stratum h = 0 have little impact on the overall value QM ; indeed the weight of stratum 0 for the application of Eq. (3) is less than 2%, while stratum 6 has a weight around 55%.
Table 5 Equivalent number of points of sites of different size for tree cover loss in SEA.
1 2 3 4
5. Discussion and conclusions Sampling schemes with sampling units (5 km × 5 km to 30 km × 30 km) are more and more often used to derive estimates of land cover area or change in large regions, as well as to assess the accuracy of land cover maps with HR or VHR satellite images. We look at these large support sampling units as clusters of smaller units and we combine classical tools of survey sampling analysis with a spatial statistics tool (correlogram) to explore the combined impact of clustering (or large support effect) with stratification. The results show an increase of the “equivalent number of points” or “value” (in terms of estimation variance) QM of a cluster when we consider stratified sampling compared with the corresponding value under non-stratified sampling. The stratification efficiency (Cochran, 1977) is related with the variance components (Table 1). If we assume a constant sampling rate we have for unclustered schemes u =
(1 + 2 + 3 )(N − H) (2 + 3 )(N − 1)
(7)
And for clustered schemes:
4. Results
Strata
143
Site size 5 km
10 km
20 km
30 km
3.1 3.2 2.7 2.7
4.3 3.8 3.5 3.6
6.5 4.5 5.1 5.3
8.8 5.1 6.7 6.9
c =
(1 + 2 )(N − H) 2 (N − 1)
where 2 =
H
2h and 3 =
h=1
(8) H
3h .
h=1
If the variance of y does not change too much from one stratum to another, the pooled ICC is approximately: M,st ≈
2 2 + 3
(9)
On the other hand in a non-stratified sampling the betweenclusters sum of squares 2 incorporates the between-strata sum of squares 1 and the ICC becomes: M,nst ≈
1 + 2 1 + 2 + 3
(10)
From Eqs. (9) and (10) we conclude that we should expect in general M,st < M,nst and therefore QM,st > QM,nst . The more efficient the stratification the higher is 1 and the stronger the difference between M,st and M,nst . If we consider a fixed cluster size M and several possible stratifications, 3 and the sum 1 + 2 are fixed and so is M,nst . An efficient stratification (high 1 ) is associated to a low 2 , to a low strata-pooled ICC M,st and therefore to a high value QM,st of a cluster. From Eqs. (7)–(10) we can derive QM,nst M,st u ≈ ≈ QM,st M,nst c
(11)
Thus the ratio between the ICC for stratified and non-stratified sampling is close to the ratio between the efficiency of the stratification u for unclustered and c for clustered schemes. We have made several assumptions and approximations (in particular the homogeneity of variance across strata) and we may have doubts on the validity of the approximation (11). We can check with real data. Our EU high resolution data (LUCAS) are not suitable to estimate the stratification efficiency for clusters. For the data coming from photo-interpretation in the EU (CLC) and SEA (TREES-3) we have compared the stratification efficiency for clustered and unclustered schemes and QM for stratified and non-stratified sampling. The results confirm that the increase of QM,st for stratified sampling compared to QM,nst is associated to the increase of the stratification efficiency for clustered sampling compared with unclustered sampling, although the ratios do not fully coincide. Table 8 reports the comparison for the cluster sizes that are being considered by the Geoland 2 project in the EU (10 km) and TREES-3 in SEA (20 km).
144
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
Fig. 3. Adjusted correlograms per stratum for arable land in EU15.
Table 6 Overall values QM for sites of different sizes in EU15 under stratified (Str) and non-stratified sampling (Nst). Site size 5 km
Arable land coarse resolution Arable land fine resolution Vineyards coarse resolution Vineyards fine resolution Forest coarse resolution Forest fine resolution Wheat Sunflower
10 km
20 km
30 km
Str
Nst
Str
Nst
Str
Nst
Str
Nst
3.2 4.4 3.1 4.9 2.9 2.8 6.9 12.3
1.9 2.4 1.9 2.8 2.4 2.4 5.0 7.0
4.5 5.6 3.8 6.5 3.4 3.3 8.8 24.1
2.4 2.9 2.5 3.7 2.8 2.7 7.9 15.1
4.8 7.2 8.3 9.5 4.1 4.3 10.6 65.8
2.6 3.3 3.4 5.2 3.2 3.1 8.9 16.7
5.8 9.1 11.5 13.1 4.7 4.8 12.3 120.1
2.8 3.8 4.4 6.6 3.5 3.4 10.8 21.3
Table 7 Overall values QM for sites off different sizes in South-East Asia under stratified (Str) and non-stratified sampling (Nst). Site size 5 km
Tree cover 2000 Mosaic tree cover 2000 Other wooded 2000 Other land cover 2000 Tree cover loss 1990–2000
10 km
20 km
30 km
Str
Nst
Str
Nst
Str
Nst
Str
Nst
2.4 4.3 3.7 2.7 3.0
1.8 3.6 2.9 1.8 2.8
2.9 5.4 4.5 3.3 4.0
2.0 4.4 3.6 2.0 3.7
4.1 6.8 5.6 3.8 5.8
2.4 5.5 4.7 2.3 5.3
4.9 8.3 7.5 6.2 7.8
2.7 6.4 5.6 2.5 6.9
Eq. (11) can be also derived with a reasoning following the scheme represented in Fig. 4: A stratified sample of n clusters (or sampling units with an enlarged support) is equivalent in terms of variance to a stratified sample of QM,st × n unclustered points and to a simple random sample of ˚u × QM,st × n (small) points. If we
follow the other side of the scheme we conclude that it is equivalent to ˚c × n unstratified clusters and to QM,nst × ˚c × n unstratified points. Eq. (11) means that the result we have obtained can be read in two different ways:
Table 8 Comparison between QM and stratification efficiency for coarse resolution land cover classes (clusters of 10 km for CLC in EU15, and clusters of 20 km for TREES-3 in SEA). QM
Stratification efficiency
Stratified
Non-stratified
Clustered
Unclustered
EU15: Arable land CLC EU15: Vineyards CLC EU15: Forest CLC
4.5 3.8 3.4
2.4 2.5 2.8
2.69 1.28 1.60
1.39 1.01 1.15
SEA: Tree cover 2000 SEA: Mosaic tree cover 2000 SEA: Other wooded 2000 SEA: Other land cover 2000 SEA: Tree cover loss 1990–2000
4.1 6.8 5.6 3.8 5.8
2.4 5.5 4.7 2.3 5.3
2.48 1.23 1.15 1.71 1.06
1.38 1.04 1.03 1.21 1.01
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
Fig. 4. Link between stratification efficiency for clustered and unclustered sampling and equivalent number of points for stratified and non-stratified sampling.
• the contribution of clustering or enlarging the support of sampling units to reducing the variance of the estimations is stronger for stratified than for non stratified sampling. • the contribution of stratification to reducing the variance of the estimations is stronger for cluster sampling than for unclustered sampling of points. We can look at a large support unit (e.g. 10 km × 10 km) as a cluster of pixels with a size that can be of the order of 1 m2 , if we consider VHR images, or approximately 30 m × 30 m, if we consider Landsat-TM images. The results reported show that, even if the size M of a cluster is of the order of thousands or millions of pixels, its value in terms of sampling variance is in most cases less than 10 unclustered pixels. This highlights once again how meaningless are questions of the type “is 1% a sufficient sampling rate in this region to have good land cover area estimates?” A further development of this paper might come from the paper by Domburg et al. (1994) that provides an expression of the variance of estimators for model-based spatial processes with a wide range of sampling schemes (not only cluster sampling). The expression obtained by Domburg et al. could be an alternative to Eq. (2) as a starting point to generalize a sampling efficiency measure expressed as “equivalent number of points”. Cost-efficiency considerations are more heterogeneous and depend very much on the data acquisition approach and on the logistic difficulties of each region if we consider in situ observation as an alternative. In the EU, where the territorial and organizational infrastructure is generally good, the ground survey cost per point is moderate when we have a dense sample: in the case of LUCAS, the average unit cost per ground observation ranges between 14D and 23D (Eurostat, 2007), depending on the imputation criteria of general management cost. Surprisingly the cost per point did not change much between the cluster sampling scheme used in LUCAS 2001 and the unclustered two-phase sampling approach used in LUCAS 2006 and LUCAS 2009 (Gallego and Delincé, 2010). With this benchmark, cost-efficiency assessments become extremely challenging for systems based on a sample of VHR images: to be competitive, the cost per cluster should not be more than a few hundred Euros, including image purchase and treatment; this is far from being feasible, unless access to VHR images become a free public service. It is now the case for most coarse resolution images and seems to be the direction towards which the distribution of medium resolution images (10–50 m) is going. In South-East Asia logistic difficulties are much sharper and intense surveys based on in situ observations in forest areas may be discarded a priori. Therefore the question needs to be put in different terms, depending on the type of images to be used. If VHR images are considered crucial, values of QM for different cluster sizes can be used for the trade-off between VHR images of different resolution. In the context of FRA2010 or other projects that use image tiles extracted from Landsat TM or similar images,
145
there is a wider range of sampling plans that can be considered and compared. Would it have been more efficient extracting a larger number of smaller tiles? For example would it be feasible to sample a cluster of 1 km × 1 km every 10 longitude and latitude instead of a cluster of 20 km × 20 km every degree? Of course photointerpretation quality may be degraded if the tile size goes below a certain threshold that needs to be carefully chosen. The sampling variance indications suggested in this paper are only one element to be combined with an analysis of the thematic characteristics of a project. The concept of “equivalent number of points” gives an easyto-understand indicator for users who wish to assess the possible sampling error of different approaches in the context of a more comprehensive comparison that includes other aspects, such as thematic identification accuracy, organizational complexity and cost. In South-East Asia spatial correlation is significantly lower for tree cover loss than for tree cover area and the equivalent number of points is higher. This is likely to be true for most land cover change types, but still needs more case-studies to be considered as an “empirical truth”. In this paper we have compared sampling schemes on the basis of the variance of the associated estimators. We have not taken into account the issue of bias. All the area estimators we have used are unbiased, but bias can come from the observation modes that are best adapted to each sampling approach. In all cases the most likely source of bias is misclassification or misidentification of the land cover type. More specifically the estimation bias for a given class is close to the difference between the commission error and the omission error obtained from a confusion matrix. If we have an unbiased estimation of the confusion matrix, we can correct the bias of the estimator based on the map area using a calibration estimator (Card, 1982; Gallego, 2004). For samples of unclustered points the most usual data collection methods are field surveys or photo-interpretation when access to the points is problematic or very expensive. For field surveys we can consider the example of LUCAS (Eurostat, 2007): a check on a subsample of 8200 points by supervisors gave a disagreement around 2–3% for dominant crops that can go up to 50% for rare crops that surveyors are not able to identify. For major land cover classes (arable land, forest) the disagreement is around 2%, except when their definition leaves a wide margin to interpretation (e.g.: natural vegetation). The rate of errors in photo-interpretation is usually higher than in field survey, except when location on the ground is problematic, but it is generally accepted to be lower than in automatic image classification (Congalton and Green, 1999). For single stage cluster sampling with large clusters, as the ones we have considered, the natural data collection method is image classification. The bias of image classification has been often mentioned in the literature. It can be reduced by tuning the classification parameters or changing the training data. If the operator has some information on the area of each land cover class the bias can be corrected to some extent, although a certain subjectivity is introduced. The issue of margin for subjectivity in direct area estimation from classified images (pixel counting) is seldom discussed and analysed. This introduces additional difficulties on the comparison of sampling approaches from the point of view of bias, that remains a topic for deeper analysis. We have compared clustered and unclustered samples (or samples based on large support units with small support units) using a stratification that is adapted to clusters. This might give an unfair advantage to cluster sampling. We have compared the efficiency of the stratification we have used with more detailed stratifications. If both are based on GLC2000 the difference is that the detailed stratification can allocate each 1 km2 pixel to a different stratum, regardless of the behaviour of the neighbouring pixel. In both test
146
F.J. Gallego, H.J. Stibig / International Journal of Applied Earth Observation and Geoinformation 22 (2013) 139–146
regions the reduction of variance obtained with more detailed stratification is very small: always between 1% and 4% for all the land cover classes we have examined. However it might happen that detailed stratification based on more detailed land cover maps gives a more substantial advantage to unclustered or small-support sampling units. References Achard, F., Eva, H.D., Stibig, H.J., Mayaux, Ph., Gallego, J., Richards, T., Malingreau, J.P., 2002. Determination of deforestation rates of the world’s humid tropical forests. Science 297, 999–1002. Achard, F., Stibig, H.J., Eva, H.D., Lindquist, E.J., Bouvet, A., Arino, O., Mayaux, P., 2010. Estimating tropical deforestation from Earth observation data. Carbon Management 1 (2), 271–287. Annoni, A., Luzet, C., Gubler, E., Ihde, J., 2001. Map Projections for Europe. Report EUR 20120 EN. JRC, Ispra, Italy, 131 pp. Bickford, C.A., Mayer, C.E., Ware, K.D., 1963. An efficient sampling design for forest inventory: the northeastern forest resurvey. Journal of Forestry 61 (11), 826–833. Bodart, C., Eva, H., Beuchle, R., Rai, R., Simonetti, D., Stibig, H.-J., Brink, A., Lindquist, E., Achard, F., 2011. Pre-processing of a sample of multi-scene and multi-date landsat imagery used to monitor forest cover changes over the tropics. ISPRS Journal of Photogrammetry and Remote Sensing 66 (5), 555–563. Bormann, F.H., 1953. The statistical efficiency of sample plot size and shape in forest ecology. Ecology 34, 474–487. Brink, A.B., Eva, H.D., 2009. Monitoring 25 years of land cover change dynamics in Africa: a sample based remote sensing approach. Applied Geography 29 (4), 501–512. Brus, D.J., De Gruijter, J.J., 1997. Random sampling or geostatistical modelling? Choosing between design-based and model-based sampling strategies for soil (with discussion). Geoderma 80 (1–2), 1–44. Brus, D.J., Spätjens, L.E.E.M., De Gruijter, J.J., 1999. A sampling scheme for estimating the mean extractable phosphorus concentration of fields for environmental regulation. Geoderma 89 (1–2), 129–148. Buckland, S.T., Anderson, D.R., Burnham, K.P., Laake, J.L., Borchers, D.L., Thomas, L., 2001. Introduction to Distance Sampling: Estimating Abundance of Wildlife Populations. Oxford University Press, New York. Card, D.H., 1982. Using known map category marginal frequencies to improve estimates of thematic map accuracy. Photogrammetric Engineering and Remote Sensing 48 (3), 431–439. Cochran, W., 1977. Sampling Techniques. John Wiley and Sons, New York. Congalton, R.G., Green, K., 1999. Assessing the Accuracy of Remotely Sensed Data: Principles and Practices. Lewis Publishers, 137 pp. Cottam, G., Curtis, J.T., 1956. The use of distance measures in phytosociological sampling. Ecology 37, 451–460. de Gruijter, J.J., Brus, D.J., Bierkens, M.F.P., Knotters, M., 2006. Sampling for Natural Resource Monitoring. Springer, Berlin. Di Gregorio, 2005. Land cover classification system: classification concepts and user manual: LCCS, FAO environment and natural resources series. N.8. http://books.google.com/last (accessed 21.06.11). Domburg, P., de Gruijter, J.J., Brus, D.J., 1994. A structured approach to designing soil survey schemes with prediction of sampling error from variograms. Geoderma 62 (1–3), 151–164. Eurostat, 2007. LUCAS 2006 Quality Report. Standing Committee for Agricultural Statistics, 22–23 November 2007. Document ESTAT/CPSA/522a, Luxemburg. Eva, H., Carboni, S., Achard, F., Stach, N., Durieux, L., Faure, J.F., Mollicone, D., 2010. Monitoring forest areas from continental to territorial levels using a sample of medium spatial resolution satellite imagery. ISPRS Journal of Photogrammetry and Remote Sensing 65 (2), 191–197.
Faulkenberry, G.D., Garoui, A., 1991. Estimating a population total using an area frame. Journal of the American Statistical Association 86 (414), 445–449. Finney, D.J., 1948. Random and systematic sampling in timber surveys. Forestry 22, 64–99. Ford, B.L., Nealon, J., Tortora, R.D., 1986. Area frame estimators in agricultural surveys: sampling versus nonsampling errors. Agricultural Economics Research 38 (2), 1–10. Fortin, M.-J., Drapeau, P., Legendre, P., 1989. Spatial autocorrelation and sampling design in plant ecology. Vegetatio 83 (1–2), 209–222. Gallego, F.J., 2006. Review of the Main Remote Sensing Methods for Crop Area Estimates, Remote Sensing Support to Crop Yield Forecast and Area Estimates ISPRS archives, XXXVI, 8/W48, pp. 65–70 http://www.isprs.org/proceedings/XXXVI/8W48/65 XXXVI-8-W48.pdf. Gallego, F.J., 2004. Remote sensing and land cover area estimation. International Journal of Remote Sensing 25 (15), 3019–3047. Gallego, F.J. The efficiency of sampling very high resolution images for area estimation in the European Union. International Journal of Remote Sensing, in press. Gallego, J., Bamps, C., 2008. Using CORINE land cover and the point survey LUCAS for area estimation. International Journal of Applied Earth Observation and Geoinformation 10, 467–475. Gallego, F.J., Delincé, J., 2010. The European Land Use and Cover Area-Frame statistical survey (LUCAS). In: Benedetti, R., Bee, M., Espa, G., Piersimoni, F., Wiley (Eds.), Agricultural Survey Methods. John Wiley & sons, pp. 151–168 (Chapter 10). Haining, R., 1990. Spatial Data Analysis in the Social and Environmental Sciences. JRC-EEA, 2005. In: Lima, V. (Ed.), CORINE Land Cover Updating for the Year 2000: Image 2000 and CLC2000; Products and Methods. JRC, Ispra, Italy, Report EUR 21757 EN. Kumar, N., 2007. Spatial sampling design for a demographic and health survey. Population Research and Policy Review 26 (5–6), 581–599. Lohr, Sh., 2010. Sampling: Design and Analysis, 2nd ed. Brooks/Cole, Boston. Mayaux, P., Eva, H., Gallego, J., Strahler, A.H., Herold, M., Agrawal, S., Naumov, S., et al., 2006. Validation of the global land cover 2000 map. IEEE Transactions on Geoscience and Remote Sensing 44 (7), 1728–1737. Meyer-Roux, J., Vossen, P., 1994. The first phase of the MARS project, 1988–1993: overview, methods and results, 1994. In: Conference on the MARS Project: Overview and Prospectives: Report EUR 15599 EN Office for Official Publications of the E.U., Luxembourg, Belgirate, Italy, 17–18 November, 1993, pp. 33–79. Morrison, M.L., Block, W.M., Strickland, M.D., Collier, B.A., Peterson, M.J., 2008. Wildlife Study Design, 2nd ed. Springer, New York, 386 pp. Potapov, P., Hansen, M.C., Gerrand, A.M., Lindquist, E.J., Pittman, K., Turubanova, S., Wilkie, M.L., 2011. The global landsat imagery database for the FAO FRA remote sensing survey. International Journal of Digital Earth 4 (1), 2–21. Pradhan, S., 2001. Crop area estimation using GIS, remote sensing and area frame sampling. International Journal of Applied Earth Observation and Geoinformation 3 (1), 86–92. Scott, C.T., 1998. Sampling methods for estimating change in forest resources. Ecological Applications 8 (2), 228–233. Stehman, S.V., 2005. Comparing estimators of gross change derived from complete coverage mapping versus statistical sampling of remotely sensed data. Remote Sensing of Environment 96 (3–4), 466–474. Stehman, S.V., 2009. Model-assisted estimation as a unifying framework for estimating the area of land cover and land-cover change from remote sensing. Remote Sensing of Environment 113 (11), 2455–2462. Stehman, S.V., Sohl, T.L., Loveland, T.R., 2005. An evaluation of sampling strategies to improve precision of estimates of gross change in land use and land cover. International Journal of Remote Sensing 26 (22), 4941–4957. Stein, A., Ettema, Ch., 2003. An overview of spatial sampling procedures and experimental design of spatial studies for ecosystem comparisons. Agriculture, Ecosystems and Environment 94 (1), 31–47. Taylor, J., Sannier, C., Delincé, J., Gallego, F.J., 1997. Regional crop inventories in Europe assisted by remote sensing: 1988–1993. In: Synthesis Report. EUR 17319 EN. Office for Publications of the EC, Luxembourg, 71 pp.