International Journal of Applied Earth Observation and Geoinformation 18 (2012) 283–292
Object-based sub-pixel mapping of buildings incorporating the prior shape information from remotely sensed imagery

Feng Ling a,∗, Xiaodong Li b, Fei Xiao b, Shiming Fang c, Yun Du b

a State Key Laboratory of Geodesy and Earth's Dynamics, Institute of Geodesy and Geophysics, Chinese Academy of Sciences, Wuhan 430077, China
b Key Laboratory for Environment and Disaster Monitoring and Evaluation, Hubei Province, Institute of Geodesy and Geophysics, Chinese Academy of Sciences, Wuhan 430077, China
c Faculty of Earth Resources, China University of Geosciences, Wuhan 430074, China
Article info
Article history: Received 31 August 2010; accepted 22 February 2012
Keywords: Sub-pixel mapping; Super-resolution; Object-based; Buildings; Spatial pattern; Scale
Abstract
Sub-pixel mapping (SPM) is a promising method to predict the spatial locations of land cover classes at the sub-pixel scale for remotely sensed imagery, using the fraction images generated by soft classification as input. At present, SPM treats all sub-pixels of different land cover classes with the same strategy by maximizing their spatial dependence. Although maximal spatial dependence is a simple way to describe the spatial pattern of land cover classes and has proved to be an effective principle for SPM, it does not reflect real-world situations. Given that spatial patterns are land cover class- or object-specific, each land cover class or object should be designated its own specific spatial pattern description when SPM is applied. In this paper, a novel object-based sub-pixel mapping (OBSPM) method is proposed to map buildings at the sub-pixel scale. On the basis of the prior information of the building shape (i.e., the building boundaries are parallel or perpendicular to the main orientation), a novel anisotropic spatial dependence model is adopted in the SPM procedure. The proposed OBSPM model includes three main steps: building segmentation, building feature extraction, and anisotropic SPM of buildings. The proposed model is evaluated with a simulated synthetic image and an actual AVIRIS image. The results show that OBSPM obtains more accurate building maps than do conventional SPM models, and that the accuracy of the fraction images and the spatial resolution of the remotely sensed images are two crucial factors that influence the OBSPM results. Furthermore, extending the OBSPM model to more land cover classes, so as to incorporate more specific prior information, is a promising way of enhancing the applicability of SPM to practical situations.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction

When remotely sensed imagery at coarse spatial resolution is used to extract land cover information, mixed pixels inevitably occur. Obtaining land cover information at the sub-pixel scale is therefore an important issue for the remote sensing community (Atkinson et al., 1997; Fisher, 1997; Cracknell, 1998; Foody, 1998; Bonnett and Campbell, 2002). Although soft classification can be used to estimate the area proportion of each land cover class within mixed pixels, the spatial distribution of land cover classes in each mixed pixel is not provided. Sub-pixel mapping (SPM), also referred to as super-resolution mapping, is a technique used to predict the spatial distribution of land cover within mixed pixels of remotely sensed images by transforming the fraction images derived from soft classification into a finer scale hard classification map (Atkinson, 1997, 2009; Foody, 2002; Mertens et al., 2003).
∗ Corresponding author. E-mail addresses: [email protected], [email protected] (F. Ling).
doi:10.1016/j.jag.2012.02.008
In general, SPM can be viewed as the post processing of soft classification to provide more land cover information at the sub-pixel scale; it has been successfully used to improve the spatial resolution of remotely sensed images in many applications, such as ground control point refinement (Foody, 2002), shoreline mapping (Foody et al., 2005; Muslim et al., 2007) and land cover change mapping at the sub-pixel scale (Ling et al., 2011). SPM is implemented with the assumption of spatial dependence, a principle that pertains to the tendency of spatially proximate observations of a given property to be more similar than are distant observations (Atkinson, 1997). It is a simple but effective method for describing the spatial distribution patterns of land cover. SPM is typically formulated as an optimization model, and numerous algorithms have been proposed. These include Hopfield neural networks (Tatem et al., 2001, 2002), genetic algorithms (Mertens et al., 2003), linear optimization (Verhoeye and De Wulf, 2002), the Markov random field model (Kasetkasem et al., 2005; Tolpekin and Stein, 2009), pixel swapping algorithm (Atkinson, 2005; Makido et al., 2007), and the sub-pixel/pixel attraction model (Mertens et al., 2006; Ge et al., 2009).
Various methods for improving the effectiveness of SPM have also been put forward. One approach is to incorporate additional site-specific land cover information into SPM. Many kinds of datasets, such as the digital elevation model (DEM) (Ling et al., 2008), panchromatic band images (Nguyen et al., 2006), vector boundaries (Aplin and Atkinson, 2001), LIDAR data (Nguyen et al., 2005), and multiple sub-pixel shifted images (Ling et al., 2010), have been used to generate this information. Another approach is to use specific spatial information of different land cover classes. Thornton et al. (2007) proposed a linear pixel swapping algorithm, in which an anisotropic model is used to describe linear features in the SPM model. However, a serious defect of conventional SPM models is that the specific spatial patterns of different land cover classes are disregarded, and the same spatial pattern description is applied to all land cover classes. These methods can be viewed as sub-pixel-based models because they treat all sub-pixels in terms of the same spatial pattern description. This technique does not reflect real-world situations. Given that spatial patterns are land cover class- or object-specific, each land cover class or object should be designated its own specific spatial pattern description when SPM is applied.

Object-based image analysis (OBIA) is another widely used and effective classification method for remotely sensed images (Al-Khudhairy et al., 2005; Yu et al., 2006; Blaschke, 2010; Corcoran et al., 2010). Unlike the pixel-based classification method, OBIA partitions the entire image into different objects by image segmentation. All pixels belonging to the same object share the same features, which are used in the subsequent classification procedure. The OBIA method exhibited better accuracy than did the pixel-based method in many cases (Blaschke, 2010). On the basis of OBIA, sub-pixel-based SPM can be extended to object-based SPM by incorporating specific object information into the SPM procedure; in this way, a more accurate result can be expected (Atkinson, 2009).

In this paper, we propose a novel object-based sub-pixel mapping (OBSPM) model and use it to produce sub-pixel scale building maps. The rest of the paper is organized as follows. In the next section, we briefly review SPM principles and describe the algorithms used in the study. Section 3 presents the experiments conducted to validate the accuracy of the proposed model. Section 4 discusses some major issues influencing the proposed model, and Section 5 concludes the paper.
2. Methodology

2.1. Sub-pixel mapping

SPM, a technique for estimating the spatial distribution of land cover classes at the sub-pixel scale, uses the fraction images estimated by soft classification as input. Suppose that the fraction images of C land cover classes are obtained from a remotely sensed image with coarse spatial resolution R. To generate a sub-pixel land cover map at a finer resolution r, the zoom factor is set to z (= R/r) and each coarse-resolution image pixel is divided into z^2 sub-pixels. All the sub-pixels are considered to be pure pixels, and each sub-pixel should be assigned to one certain land cover class. All possible sub-pixel assignments are evaluated, and the assignment that best matches the prior spatial pattern model is the resultant fine-resolution land cover map. In general, SPM can be established in a manner that minimizes an objective function E, which is defined by a goal and a constraint as:

E = E_{Spatial} + \theta \cdot E_{Area}    (1)

where E_Spatial denotes the SPM goal of matching the spatial pattern of the final sub-pixel land cover map to a prior model; E_Area is the SPM constraint, in which the original class proportion per pixel is maintained in the final sub-pixel land cover map; and θ acts as a tradeoff parameter that balances the influence of E_Spatial and E_Area.

E_Spatial describes the spatial pattern of different land cover classes at the sub-pixel scale, and the manner by which this variable is defined is crucial in the implementation of SPM. Many different prior models have been proposed for SPM, and these can be classified into H- and L-resolution models in accordance with the relationship between the spatial resolutions of pixels in remotely sensed images and land cover patches (Atkinson, 2009). The L-resolution model features cases wherein pixels are much larger than the objects of interest; in this model, the spatial pattern of land cover classes is represented by spatial statistics models, such as the indicator variogram, two-point histogram, and multi-point histogram (Tatem et al., 2002; Boucher et al., 2008; Boucher, 2009). In the H-resolution model, the pixels are smaller than the objects of interest. Here, the goal of SPM is to maximize the spatial dependence between neighboring sub-pixels. The current study focuses on H-resolution SPM because it is more suitable for the construction of sub-pixel building maps (Atkinson, 2009).

In H-resolution SPM, the maximal spatial dependence principle is used as the prior model (Verhoeye and De Wulf, 2002; Mertens et al., 2003, 2006; Thornton et al., 2006; Atkinson, 2009; Ling et al., 2010). A spatial dependence (SD) for each coarse-resolution pixel is calculated as:

SD = \sum_{i=1}^{z^2} \sum_{c=1}^{C} x_{ic} \times SD_{ic}    (2)

x_{ic} = \begin{cases} 1 & \text{if sub-pixel } i \text{ is assigned to land cover class } c \\ 0 & \text{otherwise} \end{cases}    (3)

where SD_ic is the spatial dependence for sub-pixel i when it is assigned to land cover class c. For sub-pixel p_i, SD_ic is computed as a distance-weighted function of the sub-pixel neighbors of p_i. In conventional SPM models, the isotropic exponential distance decay window is used for every sub-pixel and is calculated as:

SD_{ic} = \sum_{n=1}^{N} \lambda_{p_i,p_n} \, x_c(p_n)    (4)

\lambda_{p_i,p_n} = \frac{1}{\Omega} \exp\left( \frac{-d(p_i,p_n)}{w} \right)    (5)

x_c(p_n) = \begin{cases} 1 & \text{if sub-pixel } p_n \text{ is assigned to land cover class } c \\ 0 & \text{otherwise} \end{cases}    (6)

where N is the number of neighbor sub-pixels, λ_{p_i,p_n} is the normalized distance-dependent weight determined by the distance d(p_i, p_n) between the centers of sub-pixel p_i and its neighbor sub-pixel p_n, and Ω is the normalization constant chosen so that \sum_{n=1}^{N} \lambda_{p_i,p_n} = 1. w denotes the non-linear parameter of the distance decay model.

E_Area is the area constraint of each coarse-resolution pixel, aiming to minimize the least squared error of the area percentages (AE). This constraint is mathematically expressed as follows:

AE = \sum_{c=1}^{C} \left( \alpha_c^{FI} - \alpha_c^{SPM} \right)^2    (7)
Fig. 1. The considered sub-pixels (grey areas) in an 11 × 11 window when the spatial dependence of the target sub-pixel (black areas) is calculated. (a) Isotropic spatial dependence; (b)–(d) anisotropic spatial dependencies for the three main orientations. For anisotropic spatial dependencies, two directions (parallel and perpendicular to the main orientation of the building) within the window are defined first; then, all the sub-pixels that intersect with the directional lines in the window are used to represent the anisotropic spatial pattern of the target building sub-pixel.

where α_c^FI is the inputted fraction image of class c; α_c^SPM represents the area fraction of class c in the final sub-pixel map and is calculated as:

\alpha_c^{SPM} = \frac{\sum_{i=1}^{z^2} x_{ic}}{z^2}    (8)
SPM aims to maximize the sum of SD and minimize the sum of AE over the whole image simultaneously. The objective function E can then be expressed as:

E = -\sum_{R} \sum_{i=1}^{z^2} \sum_{c=1}^{C} x_{ic} \times SD_{ic} + \theta \cdot \sum_{R} \sum_{c=1}^{C} \left( \alpha_c^{FI} - \alpha_c^{SPM} \right)^2    (9)

where the outer summations run over all coarse-resolution pixels R of the image.
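For illustration only, the following Python sketch evaluates the objective of Eq. (9) for a candidate sub-pixel labeling, using the isotropic exponential distance decay weights of Eqs. (4)–(6). It is not the implementation used in this study; the function and variable names, the default window size, and the handling of image borders are assumptions made for the example.

    import numpy as np

    def objective(labels, fractions, z, w=5.0, theta=0.03, win=5):
        """Evaluate Eq. (9): E = -sum(SD) + theta * sum(AE).

        labels    : (H*z, W*z) int array, candidate class of every sub-pixel
        fractions : (C, H, W) float array, soft-classified fraction images
        z         : zoom factor (each coarse pixel holds z*z sub-pixels)
        w         : non-linear parameter of the distance decay model, Eq. (5)
        theta     : tradeoff parameter balancing the two terms, Eq. (1)
        win       : odd size of the (isotropic) neighbourhood window
        """
        C, H, W = fractions.shape
        r = win // 2

        # Normalised isotropic distance-decay weights, Eq. (5)
        yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
        lam = np.exp(-np.hypot(yy, xx) / w)
        lam[r, r] = 0.0                  # a sub-pixel is not its own neighbour
        lam /= lam.sum()                 # Omega chosen so the weights sum to 1

        # Spatial term: SD_ic of every sub-pixel for its assigned class, Eqs. (2)-(4)
        pad = np.pad(labels, r, mode='edge')
        sd_sum = 0.0
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dy == 0 and dx == 0:
                    continue
                shifted = pad[r + dy:r + dy + labels.shape[0],
                              r + dx:r + dx + labels.shape[1]]
                sd_sum += lam[r + dy, r + dx] * np.sum(shifted == labels)

        # Area term: squared difference between the input fractions and the
        # fractions realised by the candidate labeling, Eqs. (7)-(8)
        ae_sum = 0.0
        for c in range(C):
            realised = (labels == c).reshape(H, z, W, z).mean(axis=(1, 3))
            ae_sum += np.sum((fractions[c] - realised) ** 2)

        return -sd_sum + theta * ae_sum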
To search for the solution of the SPM model, we employ the simulated annealing (SA) algorithm (Tolpekin and Stein, 2009). The annealing schedule is defined on the basis of a power law decay function, where the temperature T at iteration n is modified according to:

T_n = \eta \cdot T_{n-1}    (10)

The parameter η ∈ (0, 1) controls the rate of temperature decrease. The SA algorithm is implemented through the following steps:

Step 1: The sub-pixels of each class within each coarse-resolution pixel are randomly labeled according to their numbers, calculated using the inputted fraction images.
Step 2: All the sub-pixel labels are updated using the Metropolis–Hastings sampler. The number of successful updates is then counted.
Step 3: If the number of successful updates is lower than 0.1% of the total number of pixels after three consecutive iterations, the optimization is terminated; otherwise, Step 2 is repeated.
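A minimal sketch of this annealing loop for the two-class (background/building) case is given below. It relies on the objective() function sketched in Section 2.1, uses a simple single-site label flip as the proposal, and recomputes the full energy for every proposal, which is clear but slow; these simplifications, as well as the names and defaults, are ours and not the authors' implementation.

    import numpy as np

    def anneal(fractions, z, t0=3.0, eta=0.9, n_iter=100, rng=None):
        """Simulated annealing for a two-class SPM problem (0 = background, 1 = building).

        fractions : (2, H, W) fraction images; fractions[1] is the building fraction
        t0, eta   : annealing schedule of Eq. (10)
        """
        rng = np.random.default_rng(rng)
        C, H, W = fractions.shape

        # Step 1: random initial labeling that honours the per-pixel class counts
        labels = np.zeros((H * z, W * z), dtype=int)
        for i in range(H):
            for j in range(W):
                n_build = int(round(fractions[1, i, j] * z * z))
                block = np.zeros(z * z, dtype=int)
                block[:n_build] = 1
                rng.shuffle(block)
                labels[i * z:(i + 1) * z, j * z:(j + 1) * z] = block.reshape(z, z)

        temp, idle = t0, 0
        energy = objective(labels, fractions, z)
        for _ in range(n_iter):
            success = 0
            # Step 2: Metropolis-style single-site updates visited in random order
            for s in rng.permutation(labels.size):
                y, x = divmod(s, labels.shape[1])
                proposal = labels.copy()
                proposal[y, x] = 1 - proposal[y, x]      # flip the label
                new_energy = objective(proposal, fractions, z)
                delta = new_energy - energy
                if delta < 0 or rng.random() < np.exp(-delta / temp):
                    labels, energy, success = proposal, new_energy, success + 1
            temp *= eta                                   # Eq. (10)
            # Step 3: stop after three consecutive near-idle iterations
            idle = idle + 1 if success < 0.001 * labels.size else 0
            if idle >= 3:
                break
        return labels

A production implementation would evaluate only the local change in energy caused by each flip rather than recomputing Eq. (9) over the whole image.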
2.2. Prior model for buildings

Maximal spatial dependence is widely used as a spatial pattern description for SPM when no other sub-pixel information is available. However, this approach inevitably presents limitations because the distribution mechanisms of many land cover classes are not governed by this principle. For example, buildings have mutually perpendicular boundary directions. If the main orientation of a building is extracted, most building boundaries can be determined to be either parallel or perpendicular to the main orientation. This prior information about buildings has been used for building extraction or building boundary regularization (Lee et al., 2003; Sampath and Shan, 2007; Sohn and Dowman, 2007). To incorporate information on the specific shape of a building, the isotropic spatial dependence of each sub-pixel in conventional SPM models is revised by creating an anisotropic exponential distance decay window for every sub-pixel on the basis of the main orientation of the building. Given that the boundary sub-pixels can belong to lines that are either parallel or perpendicular to the main orientation, we consider both directions as possible solutions and therefore calculate spatial dependencies in both directions.

A simple example shown in Fig. 1 illustrates both spatial dependencies. For each building sub-pixel, we apply a moving window with the center pixel as the target sub-pixel to calculate the spatial dependence. In this example, the size of the moving window is set to 11. When the conventional isotropic spatial dependence is calculated, all the grey sub-pixels in the moving window are included, as shown in Fig. 1(a). Conversely, when the anisotropic spatial dependence is calculated, two directions (parallel and perpendicular to the main orientation of the building) within the moving window are defined. All the sub-pixels that intersect with the directional lines in the moving window are then used to represent the anisotropic spatial pattern of the target building sub-pixel. Three directions (0°, 22.5°, and 45°) and their perpendicular directions (−90°, −67.5°, and −45°) are shown in Fig. 1(b)–(d). Only the grey sub-pixels are considered when the anisotropic spatial dependence is calculated.

2.3. Proposed algorithm

To generate the final sub-pixel building map, the application of the proposed OBSPM model is divided into three steps: building segmentation, building feature extraction, and anisotropic SPM of the buildings. The entire procedure is as follows.

Step 1. Building segmentation

The first step in the proposed OBSPM model is image segmentation using the inputted fraction images. The goal is to group the sub-pixels into two land cover classes, namely buildings and background objects. To perform the segmentation procedure, the conventional SPM model (CSPM) with isotropic spatial dependence is first applied. Although the CSPM is unsuitable for building mapping at the sub-pixel scale, maximizing the spatial dependence enables the grouping of building sub-pixels into different patches. With the resultant sub-pixel building map generated using the CSPM, the buildings are segmented by a simple seeded region growing method, which assigns connected sub-pixels the same building label. When all the building sub-pixels are grouped into separate buildings, the feature of each building can be extracted individually in the next step.

Step 2. Building feature extraction

Identifying the main orientation of each building is the primary task in this step. The Hough transform, which has been widely used for line extraction (Lee et al., 2003), is employed to extract the principal building orientation.
Fig. 2. Simulated artificial imagery and SPM results. (a) IKONOS 1 m panchromatic image (420 × 390 pixels); (b) land cover map extracted from the image by manual digitization; (c) simulated fraction image obtained by degrading building map in (b) with R = 10 m; (d) sub-pixel building map generated by CSPM; (e) extracted building boundaries from (d) by edge detection with a Sobel operator; (f) main orientations of the building estimated with Hough transform; (g) sub-pixel building map generated by OBSPM; (h) error map of CSPM; and (i) error map of OBSPM. In the error maps, black indicates building sub-pixels detected as background, while grey indicates background sub-pixels detected as buildings.
To carry out the Hough transform, a binary building boundary map is first generated by the Sobel edge detection algorithm for each building. The Hough transform is then performed on the boundary map. The orientation of each building is identified by determining the dominant distribution in the polar Hough parameter space. For each building, two orthogonal peaks in the parameter space are extracted, and the corresponding direction is considered the building's main orientation.

Step 3. Anisotropic SPM of buildings

For each building, SPM with the anisotropic spatial pattern is applied to image subsets of the inputted fraction images. The considered area includes all coarse-resolution pixels containing the target building. For this area, an anisotropic spatial dependence template is generated with the estimated main orientation of the building. The SA algorithm is performed to generate the sub-pixel map of the building. After the sub-pixel maps of all the buildings are generated individually, they are combined to generate the final sub-pixel building map.
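As an illustration of the anisotropic template described in Section 2.2, the following Python sketch marks the sub-pixels of a moving window that intersect the two lines through the window centre running parallel and perpendicular to a given main orientation. The function name, the default window size of 11, and the line rasterisation rule are our own choices; in the proposed procedure the orientation would come from the Hough transform of Step 2, and the resulting mask would restrict which neighbours contribute to SD_ic.

    import numpy as np

    def anisotropic_mask(orientation_deg, win=11):
        """Boolean neighbour mask for the anisotropic spatial dependence (Fig. 1(b)-(d))."""
        r = win // 2
        mask = np.zeros((win, win), dtype=bool)
        for angle in (orientation_deg, orientation_deg + 90.0):
            theta = np.deg2rad(angle)
            dx, dy = np.cos(theta), np.sin(theta)
            # sample the line densely and mark every cell it passes through
            for t in np.linspace(-r * np.sqrt(2), r * np.sqrt(2), 8 * win):
                col = int(round(r + t * dx))
                row = int(round(r - t * dy))     # image rows grow downwards
                if 0 <= row < win and 0 <= col < win:
                    mask[row, col] = True
        mask[r, r] = False                        # exclude the target sub-pixel itself
        return mask

    # Example: the neighbours considered for a building whose main orientation is 22.5 degrees
    print(anisotropic_mask(22.5).astype(int))

The distance-decay weights of Eq. (5) are then computed and normalised only over the sub-pixels retained by this mask.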
2.4. Accuracy assessment

To quantitatively evaluate the quality of the obtained sub-pixel building map, we compare the result with the reference high-resolution building map for each building. Two measures, namely, interpretation accuracy (IA) and object accuracy (OA), are calculated:

IA = \frac{N_{SPM\&REF}}{N_{REF}} \times 100\%    (11)

OA = \frac{N_{SPM\&REF}}{N_{SPM}} \times 100\%    (12)

where N_{SPM&REF} is the number of pixels labeled as buildings in both the SPM result and the reference map, N_{REF} is the total number of pixels labeled as buildings in the reference map, and N_{SPM} denotes the total number of pixels labeled as buildings in the SPM result.

The rectangularity index (RI) is utilized as a quantitative descriptor of building shape to test the performance of OBSPM. RI is calculated as:

RI = \frac{A}{A_{MBR}}    (13)

where A is the area of a given building, and A_{MBR} is the area of the building's minimum bounding rectangle (MBR) (Rosin, 2003).

For the entire sub-pixel building map, the Kappa coefficient is used to evaluate the accuracy of OBSPM. Moreover, we compare the RI in the SPM result with that in the reference fine-resolution map using the mean value (M_RI) and the root mean square error (RMSE_RI), which are calculated as:

M_{RI} = \frac{1}{N_b} \sum_{i=1}^{N_b} RI_{SPM}^{i}    (14)

RMSE_{RI} = \sqrt{ \frac{1}{N_b} \sum_{i=1}^{N_b} \left( RI_{SPM}^{i} - RI_{REF}^{i} \right)^2 }    (15)

where N_b denotes the number of buildings, RI_{SPM}^{i} represents the RI of the ith building in the SPM result, and RI_{REF}^{i} is the RI of the ith building in the reference fine-resolution map.
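The following Python sketch illustrates how these measures can be computed from binary building masks. It is an illustrative helper, not the authors' code: the function names are ours, and the minimum bounding rectangle of Eq. (13) is approximated here by brute-force rotation of the pixel coordinates in 1° steps rather than by an exact rotating-calipers solution.

    import numpy as np

    def ia_oa(spm_mask, ref_mask):
        """Interpretation accuracy (Eq. 11) and object accuracy (Eq. 12), in percent."""
        overlap = np.logical_and(spm_mask, ref_mask).sum()
        ia = 100.0 * overlap / ref_mask.sum()
        oa = 100.0 * overlap / spm_mask.sum()
        return ia, oa

    def rectangularity(mask):
        """Rectangularity index (Eq. 13): building area over the area of an
        (approximate) minimum bounding rectangle."""
        ys, xs = np.nonzero(mask)
        pts = np.column_stack([xs, ys]).astype(float)
        area = float(pts.shape[0])                 # pixel count as the building area
        best = np.inf
        for deg in np.arange(0.0, 90.0, 1.0):
            th = np.deg2rad(deg)
            rot = pts @ np.array([[np.cos(th), -np.sin(th)],
                                  [np.sin(th),  np.cos(th)]])
            extent = (np.ptp(rot[:, 0]) + 1.0) * (np.ptp(rot[:, 1]) + 1.0)
            best = min(best, extent)
        return area / best

    def ri_statistics(ri_spm, ri_ref):
        """Mean RI (Eq. 14) and RMSE of RI against the reference (Eq. 15)."""
        ri_spm, ri_ref = np.asarray(ri_spm, float), np.asarray(ri_ref, float)
        return ri_spm.mean(), np.sqrt(np.mean((ri_spm - ri_ref) ** 2))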
3. Experimental results

In this section, we provide the results of two experiments designed to illustrate and validate the accuracy of the proposed OBSPM model. Two sets of remotely sensed data are considered: a simulated artificial image and an actual remotely sensed image. An 11 × 11 window is used to calculate the spatial dependence of each sub-pixel. The non-linear parameter w is set to 5, and the tradeoff parameter θ is set to 0.03. For the annealing schedule, we adopted T0 = 3.0 and η = 0.9 (Tolpekin and Stein, 2009).

3.1. Simulated artificial image

In this experiment, we applied the OBSPM to a simulated artificial image, which included buildings of different shapes. The artificial image was simulated from an actual panchromatic (PAN) IKONOS image of 420 × 390 pixels (at 1 m spatial resolution) located in Wuhan City, Hubei Province, China. All the buildings were mapped by manually digitizing them from the PAN image (Fig. 2(a)). Using the high-resolution building map in Fig. 2(b) as reference, three different coarse-resolution images with R = 5 m, 10 m, and 15 m were simulated. To avoid introducing extra errors caused by soft classification, the fraction images were obtained by directly degrading the high-resolution building map. The area proportion of buildings in each coarse-resolution pixel was calculated with a window size corresponding to the spatial resolution. The corresponding fraction images were considered the soft classification result and were then used as the input for OBSPM.

The results for R = 10 m shown in Fig. 2(c)–(i) illustrate the entire OBSPM procedure. The simulated building fraction image shown in Fig. 2(c) was used as the input for OBSPM. Initially, the CSPM was applied to generate a preliminary sub-pixel building map (Fig. 2(d)), and then the building patches were segmented. Subsequently, the binary building boundary map (Fig. 2(e)) was generated using the Sobel edge detection algorithm. The Hough transform was individually applied to all the building boundaries to estimate the main orientations of the buildings; Fig. 2(f) shows the results. Finally, the SA algorithm was used to generate a sub-pixel building map for each building with the anisotropic spatial dependence determined by the main orientation. All the sub-pixel building maps were combined to generate the final sub-pixel building map (Fig. 2(g)).
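The degradation used to simulate the fraction images amounts to block-averaging the binary high-resolution building map. A minimal sketch is given below; the function name and the assumption that the image dimensions are exact multiples of the zoom factor are ours.

    import numpy as np

    def simulate_fractions(building_map, z):
        """Degrade a binary high-resolution building map (1 = building) into a
        coarse-resolution building fraction image for zoom factor z, i.e. the
        proportion of building sub-pixels inside each z-by-z block."""
        h, w = building_map.shape
        assert h % z == 0 and w % z == 0, "dimensions must be multiples of z"
        blocks = building_map.reshape(h // z, z, w // z, z)
        return blocks.mean(axis=(1, 3)).astype(float)

    # Example (hypothetical variable): fractions for R = 10 m from the 1 m map
    # correspond to simulate_fractions(reference_map, z=10)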
Table 1
Accuracy statistics of CSPM and OBSPM for the simulated synthetic image.

ID   MO    IA (%)            OA (%)            RI
           CSPM     OBSPM    CSPM     OBSPM    REF      CSPM     OBSPM
1    0°    97.7     100      96.8     100      1        0.842    1
2    0°    89.1     100      93.7     100      0.694    0.621    0.694
3    0°    95.0     100      96.7     100      1        0.860    1
4    0°    95.0     100      97.5     100      1        0.838    1
5    0°    95.7     100      98.4     100      1        0.891    1
6    16°   74.1     74.2     92.3     92.5     0.866    0.754    0.811
7    16°   93.7     94.7     96.9     100      0.929    0.814    0.910
8    16°   86.2     86.3     91.1     100      0.883    0.758    0.830
9    18°   90.5     90.7     95.6     99.6     0.891    0.760    0.856
10   51°   95.5     93.1     98.5     99.6     0.898    0.843    0.863
11   51°   93.4     93.5     98.0     99.6     0.898    0.814    0.862
12   51°   93.0     91.8     97.8     98.7     0.891    0.809    0.869
13   0°    89.7     100      98.2     100      1        0.824    1
14   0°    95.6     97.6     96.6     97.2     0.781    0.747    0.785
15   0°    93.7     100      98.5     100      1        0.857    1
16   16°   94.5     94.9     97.1     100      0.927    0.831    0.902
17   16°   96.2     96.6     97.9     99.5     0.882    0.800    0.866
18   0°    96.4     99.3     96.3     99.2     0.818    0.781    0.819
19   0°    81.3     100      96.8     100      1        0.784    1
20   0°    83.6     100      94.1     100      0.678    0.638    0.678
21   72°   91.4     92.7     96.9     97.6     0.690    0.816    0.850
22   75°   97.9     98.5     98.2     99.0     0.528    0.551    0.563
23   0°    86.3     100      98.5     100      1        0.801    1
24   0°    97.7     100      99.4     100      1        0.910    1
25   71°   94.2     95.3     97.5     100      0.937    0.824    0.905
26   74°   91.3     95.1     96.9     99.6     0.902    0.783    0.874
27   74°   66.2     67.2     89.1     89.3     0.795    0.712    0.824

Note: ID is the building number shown in Fig. 2(b); MO is the main orientation.
Fig. 2(h) and (i) show the error maps of CSPM and OBSPM, respectively; the maps were drawn by comparing the results of the models with the reference map shown in Fig. 2(b).

The visual comparison with the reference high-resolution building map (Fig. 2(b)) shows that the final sub-pixel building map (Fig. 2(d)) produced by CSPM is inaccurate. The buildings produced by CSPM have more roundish boundaries than do the reference buildings, whose boundaries are mostly square. This result is attributed to the isotropic spatial dependence, which causes the land cover patches to become rounded. Compared with the building map generated by CSPM, that produced by OBSPM (Fig. 2(g)) is visually more accurate. The building shapes and locations are considerably similar to those in the reference map, and the square corners of the buildings are well mapped. The error maps in Fig. 2(h) and (i) show that, for CSPM, all the buildings have errors along their boundaries. For OBSPM, about half of the buildings in the left portion are precisely mapped, and the other buildings have fewer errors than do those in CSPM.

The accuracy statistics in Table 1 show a considerable increase in accuracy for OBSPM. The main orientation (MO) of each building is an important factor that influences the mapping accuracy of OBSPM. At a 0° MO, the mapping accuracies, including the IA and OA of many buildings, reach 100%, except for the 14th and 18th buildings, which are not mapped perfectly because of their complex boundary shapes. Compared with the accuracy of the buildings with a main orientation of 0°, that of the other buildings is lower because these orientations cannot be precisely represented in the 11 × 11 window. Nevertheless, a noticeable improvement is obtained by OBSPM compared with CSPM. It is also noticed that the values of RI for all the sub-pixel buildings produced by OBSPM are higher than those produced by CSPM, showing that the buildings produced by OBSPM are more likely to represent rectangular buildings. Moreover, except for the 14th, 18th, 21st, 22nd and 27th buildings, the values of RI for all buildings in the reference map are not less than those produced by CSPM and OBSPM. The exceptions are attributed to irregular building shapes.
Table 2
Accuracy statistics of CSPM and OBSPM for the entire actual AVIRIS image.

Kappa             MRI               RMSERI
CSPM     OBSPM    CSPM     OBSPM    CSPM     OBSPM
0.8292   0.8419   0.7818   0.8098   0.0971   0.0732
For example, the 14th and 18th buildings include an empty area in the middle, and the 21st and 22nd buildings have long and narrow parts. In these cases, RI is no longer suitable for evaluating the rectangularity of the building shapes (Rosin, 2003), and the results of CSPM and OBSPM are actually not as good as the reference upon visual inspection.

3.2. Actual images

To further assess the capability of OBSPM to address a natural scene, an image acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) was used in this experiment. The study area is an 80 × 80 pixel AVIRIS scene (at a 20 m spatial resolution) of Moffett Field, at the southern end of San Francisco Bay, acquired in 1992 (Fig. 3(a)). The reference high-resolution image was collected by manual digitization using QuickBird images acquired in August 2003 in Google Earth (Fig. 3(b)). Five buildings within the two circles in Fig. 3(b) do not exist in the AVIRIS image; thus, they were excluded from the reference building map, as shown in Fig. 3(c).

Before OBSPM was applied, we first generated the fraction image of the buildings from the original AVIRIS image by soft classification. Eight endmembers were adopted, including two kinds of building roofs and six kinds of backgrounds. Representative pixels of each endmember were manually selected from the image scene. These pixels were assumed to be pure pixels, and their average was used to represent the endmember signature. The constrained linear mixture model was used to extract the fraction images for all the endmembers. These fractions were recombined to generate the final fraction images, which include only two endmembers, that is, the building roof and the background. The difference in area proportion for the building roof class between the fraction image and the reference map was evaluated by the root mean square error (Tolpekin and Stein, 2009), and the unmixed result (Fig. 3(e)) yielded a root mean square error of 0.12. The error image is shown in Fig. 3(f). A very large fraction error is observed in the 1st building, primarily because this building has a signature similar to that of the background. Another large fraction error exists in the 2nd building, whose surrounding square has a signature similar to that of the building roof. The square is therefore recognized as building roof, leading to an overestimated building fraction.

In this experiment, the zoom factor was set to z = 8 to simulate an image with a 2.5 m spatial resolution, similar to that of the multi-spectral imagery of QuickBird. When the high-resolution building map in Fig. 3(c) was used as the reference, all the incomplete buildings near the border were discarded. The 1st building was also excluded because of the large fraction error. The final reference high-resolution building map is shown in Fig. 3(g).

Fig. 3(h) and (i) show the resultant building maps produced by CSPM and OBSPM, respectively. The visual comparison of the results shows that OBSPM prevails over CSPM. For CSPM, the square corners of all the buildings become roundish, and their locations are difficult to determine. By contrast, the sub-pixel building map generated by OBSPM is visually more accurate, and almost all the buildings have square shapes similar to those of the reference high-resolution building map. The Kappa coefficient and MRI of OBSPM are higher than those of CSPM, and the RMSERI of OBSPM is lower than that of CSPM (Table 2), showing the effectiveness of the proposed OBSPM model.
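As an illustration of the soft-classification step, the following Python sketch estimates per-pixel endmember fractions with a non-negative, sum-to-one constrained linear mixture model. It is only an example of such a solver, not the method used here: the function name and the row-augmentation trick for the sum-to-one constraint are our own choices, and SciPy is assumed to be available.

    import numpy as np
    from scipy.optimize import nnls

    def constrained_unmix(pixel, endmembers, delta=1000.0):
        """Fractions of each endmember in one pixel spectrum.

        pixel      : (B,) reflectance vector
        endmembers : (B, M) matrix whose columns are the M endmember signatures
        delta      : weight of the appended row that pushes the fractions to sum to one
        Non-negativity is handled by nnls.
        """
        B, M = endmembers.shape
        a = np.vstack([endmembers, delta * np.ones((1, M))])
        b = np.concatenate([pixel, [delta]])
        fractions, _ = nnls(a, b)
        return fractions / fractions.sum()   # guard against small residual drift

    # Example (hypothetical names): f = constrained_unmix(spectrum, E), where
    # spectrum has B bands and E holds the eight endmember signatures as columns.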
4. Discussion

4.1. Model parameters

The neighborhood window size in SPM plays an important role in producing the sub-pixel map. In the proposed OBSPM model, the isotropic spatial dependence is first applied to produce an intermediate sub-pixel building map for building segmentation and orientation extraction. In our experiments, a 5 × 5 window is sufficient for this stage (Makido and Shortridge, 2007), and using a larger window size is unnecessary. In contrast, when the anisotropic spatial dependence is applied to generate the final sub-pixel building maps, a much larger window size is needed to represent the building orientation efficiently. A 9 × 9 window has an angle resolution of about 4°, meaning that lines whose angles differ by less than 4° have the same spatial pattern when they are represented by a 9 × 9 grid. The angle resolution is about 3° for an 11 × 11 window and about 2° for a 13 × 13 window. In our experiments, a window size of 11 × 11 is a suitable choice. Although a larger window size can slightly improve the ability to distinguish orientations, it cannot further improve the performance of the model and only increases its complexity.

The non-linear parameter w in the distance decay model determines the influence of neighboring sub-pixels. For the isotropic spatial dependence, because only a small window size is used, the value of w does not strongly influence the result. By contrast, w has a more significant effect on the final sub-pixel map when the anisotropic spatial dependence is applied. With a small value of w, the weights decrease sharply with the distance between the target sub-pixel and its neighboring sub-pixels, meaning that the anisotropic spatial dependence only takes effect within a small lag distance. As a result, linear features are difficult to represent in the final sub-pixel map. Fig. 4(a) shows an example of the sub-pixel building map obtained by OBSPM with w = 1, in which jagged building edges are obvious. Based on our experiments, a value of w greater than 5 is recommended to avoid this shortcoming.

The balance parameter θ is a key parameter of the proposed model. If the value of θ is too small, the solution is unsmoothed and susceptible to noise in the inputted fraction images. On the other hand, if the value of θ is too large, the spatial term has a dominating effect on the solution and makes the final sub-pixel map over-smoothed. For the isotropic spatial dependence, a small value of θ will generate unsmoothed patches, or even separate the target building into many isolated buildings, whereas a large value of θ will generate over-smoothed patches, from which the main orientation can hardly be estimated correctly because the buildings become rounded. When the anisotropic spatial dependence is applied, a small value of θ will generate unsmoothed buildings (Fig. 4(b)), whereas a large value of θ will shrink buildings or even eliminate some of them (Fig. 4(c)). For the time being, however, there is no efficient automatic technique to select the value of θ; this value needs to be decided through trial experiments with prior information about the study area.

The parameters in SA influence the effectiveness and convergence of the model. For the annealing schedule, the optimal values of the parameters T0 and η depend on the zoom factor, and larger zoom factors require larger values of both parameters. In our experiments, in the case of a zoom factor of 10, fine results were obtained with T0 = 3.0 and η = 0.9 as suggested by Tolpekin and Stein (2009), and convergence was reached in fewer than 100 iterations for all tests.
Fig. 3. Actual AVIRIS image and SPM results. (a) Actual AVIRIS image with a spatial resolution of 20 m (80 × 80 pixels); (b) QuickBird imagery in Google Earth; (c) building map extracted from the image in (b) by manual digitization; (d) actual building fraction image; (e) building fraction image estimated by soft classification; (f) building fraction error image; (g) reference high-resolution building image generated from (c); (h) sub-pixel building map generated by CSPM; and (i) sub-pixel building map generated by OBSPM.
Fig. 4. SPM results of AVIRIS imagery with different model parameters. (a) Sub-pixel building map generated by OBSPM with w = 1 and θ = 0.03; (b) sub-pixel building map generated by OBSPM with w = 5 and θ = 0.01; (c) sub-pixel building map generated by OBSPM with w = 5 and θ = 1.
Fig. 5. SPM results of simulated artificial imagery. (a) and (b) Sub-pixel building maps generated by CSPM and OBSPM with R = 5 m; (c) and (d) error maps of (a) and (b), respectively; (e) and (f) sub-pixel building maps generated by CSPM and OBSPM with R = 15 m; (g) and (h) error maps of (e) and (f), respectively. In the error maps, black indicates building sub-pixels detected as background, while grey indicates background sub-pixels detected as buildings.
4.2. Image spatial resolution

To analyze the effect of the spatial resolution of the coarse-resolution imagery on the accuracy of OBSPM, the final sub-pixel building maps generated by CSPM and OBSPM with R = 5 m and R = 15 m for the synthetic image (Fig. 5) were also compared. The results show that the spatial resolution of the original coarse-resolution imagery affects the feasibility of OBSPM in two aspects.

The first is the distinguishability of adjacent buildings. For R = 5 m and R = 10 m, all the buildings can be separated. For R = 15 m, the 3rd, 4th, and 5th buildings, the 6th and 7th buildings, the 8th and 16th buildings, as well as the 21st, 25th, and 26th buildings, are grouped into unitary objects in the final sub-pixel building map. This grouping occurs because the mixed pixels between these buildings contain sub-pixels that belong to different adjacent buildings, as the inter-building distances are relatively small. The inter-building distance between the 6th and 7th buildings is about 7.5 m, and the inter-building distances between the 3rd and 4th buildings, the 4th and 5th buildings, the 8th and 16th buildings, the 21st and 25th buildings, and the 25th and 26th buildings range from 10 m to 13 m, all of which are less than 15 m. Thus, when OBSPM is applied, the spatial resolution of the original imagery should be finer than the distance between adjacent buildings.

The second aspect is the ability to preserve building details. The coarser the spatial resolution of the coarse-resolution imagery, the fewer details are preserved. For example, the four square corners in the center of the 14th building can be well reconstructed when R = 5 m. However, when R = 10 m or R = 15 m, these square corners become rounded. Another example is the 27th building, whose recessed portion is well mapped when R = 5 m, but is destroyed when R = 10 m and R = 15 m.
Table 3 shows the accuracy statistics for SPM with the coarse-resolution images of different spatial resolutions. For all three coarse-resolution images, the Kappa coefficients and MRI of OBSPM are higher than those of CSPM, and the RMSERI values of OBSPM are smaller than those of CSPM (when R = 15 m, MRI and RMSERI are not calculated because many buildings are incorrectly merged), indicating that OBSPM is superior to CSPM. For both CSPM and OBSPM, the Kappa coefficient and MRI decrease at a coarser spatial resolution R, and RMSERI is lower at a finer spatial resolution. This means that a finer spatial resolution produces a sub-pixel building map with higher precision.

4.3. Error sources

When applying the OBSPM model to real remotely sensed imagery, the fraction error caused by soft classification imposes a considerably negative effect on the final sub-pixel building map. Taking Fig. 3 as an example, the 1st building was not mapped, and in the 2nd building, many pixels of the surrounding square were mapped as building roof, resulting in a building much larger than the reference. Improving the unmixing accuracy using more suitable soft classification models, such as the multiple endmember spectral mixture analysis technique (Li et al., 2005; Powell et al., 2007) and artificial neural networks (Weng and Hu, 2008; Somers et al., 2011), necessitates further study so that a more precise building map can be obtained at the sub-pixel scale.
Table 3
Accuracy statistics of CSPM and OBSPM for the entire simulated synthetic image.

          Kappa              MRI                RMSERI
          CSPM     OBSPM     CSPM     OBSPM     CSPM     OBSPM
R = 5 m   0.9885   0.9965    0.8532   0.8811    0.0397   0.0068
R = 10 m  0.9447   0.9753    0.7875   0.8800    0.1185   0.0389
R = 15 m  0.8522   0.8618    –        –         –        –

Note: The MRI and RMSERI for R = 15 m are not calculated because many buildings are incorrectly merged together.
In addition, the heterogeneity of building roof materials present in many scenes may add complexity that influences the accuracy of the fraction images and imposes additional requirements on the number of spectral bands of the input satellite image.

Another problem in applying the OBSPM model to real remotely sensed imagery is the main orientation of the building generated by the Hough transform. This estimate is influenced by both the fraction error and the relationship between the spatial resolution of the imagery and the building area. In Fig. 3, for large buildings such as the 2nd building, the main orientation can still be estimated correctly although a large fraction error is observed. For small buildings, the fraction error plays an important role in the main orientation extraction. For example, the main orientations of the 3rd and 4th buildings can be estimated correctly. However, for the 5th and 6th buildings, which have main orientations and areas similar to those of the 3rd and 4th buildings, the main orientations are incorrectly estimated because of larger fraction errors. Thus, when OBSPM is applied, the spatial resolution of the coarse-resolution imagery should be smaller than the size of the building.

The third issue in applying the OBSPM model to real remotely sensed imagery is the shape of the buildings. First, as explained in the experiment on simulated images, the minor features of a building are difficult to reconstruct when the mixed pixels have a coarse resolution. Moreover, in the experiment on real remotely sensed imagery, some building shapes do not obey the proposed prior model; that is, the building boundaries are not parallel or perpendicular to the main orientation. For example, in Fig. 3, part of the 7th building has a round boundary, while the 8th and 9th buildings have irregular polygonal boundaries. In these cases, the buildings cannot be mapped precisely even if the fractions are correct.

5. Conclusions

In this paper, we propose a novel OBSPM model and apply it to construct sub-pixel scale building maps. Conventional SPM models treat all the sub-pixels of different classes with the same prior information, which is inconsistent with actual situations. To overcome this shortcoming, the proposed OBSPM model assigns different spatial pattern descriptions to sub-pixels in different objects. This enables the prior information of different objects to be incorporated into the SPM procedure, thereby improving the accuracy of the resultant land cover map. On the basis of the prior information of the building shape (i.e., the building boundaries are parallel or perpendicular to the main orientation), a novel anisotropic spatial dependence model is adopted in the SPM procedure. The proposed OBSPM model, which involves building segmentation, building feature extraction, and anisotropic SPM of buildings, is evaluated with a simulated synthetic image and an actual AVIRIS image. The results show that OBSPM obtains more accurate building maps than do conventional SPM models. Furthermore, the accuracy of the fraction images and the spatial resolution of the remotely sensed images are two crucial factors that influence the OBSPM results.

Although the proposed OBSPM model is used here for sub-pixel building mapping, it can also be applied to other rectangular objects that share the same prior information on spatial pattern. Moreover, aside from the prior information of buildings used in this study, many other land cover classes have their own distinctive features.
Given that incorporating more specific prior information is a promising approach to improving the accuracy of SPM, extending the OBSPM model to more land cover classes may be an effective way to enhance the applicability of SPM to practical situations.

Acknowledgements

This work was funded by the National Natural Science Foundation of China (No. 40801186 and No. 40801045) and the Wuhan Youth Chenguang Project (No. 200950431218).
References

Al-Khudhairy, D.H.A., Caravaggi, I., Glada, S., 2005. Structural damage assessments from Ikonos data using change detection, object-oriented segmentation, and classification techniques. Photogrammetric Engineering and Remote Sensing 71, 825–837.
Aplin, P., Atkinson, P.M., 2001. Sub-pixel land cover mapping for per-field classification. International Journal of Remote Sensing 22, 2853–2858.
Atkinson, P.M., 1997. Mapping sub-pixel boundaries from remotely sensed images. In: Innovations in GIS 4. Taylor & Francis, London, UK, pp. 166–180.
Atkinson, P.M., 2005. Sub-pixel target mapping from soft-classified remotely sensed imagery. Photogrammetric Engineering and Remote Sensing 71, 839–846.
Atkinson, P.M., 2009. Issues of uncertainty in super-resolution mapping and their implications for the design of an inter-comparison study. International Journal of Remote Sensing 30, 5293–5308.
Atkinson, P.M., Cutler, M.E.J., Lewis, H., 1997. Mapping sub-pixel proportional land cover with AVHRR imagery. International Journal of Remote Sensing 18, 917–935.
Blaschke, T., 2010. Object based image analysis for remote sensing. ISPRS Journal of Photogrammetry and Remote Sensing 65, 2–16.
Bonnett, R., Campbell, J.B., 2002. Introduction to Remote Sensing, 3rd ed. Taylor & Francis, New York.
Boucher, A., 2009. Sub-pixel mapping of coarse satellite remote sensing images with stochastic simulations from training images. Mathematical Geosciences 41, 265–290.
Boucher, A., Kyriakidis, P.C., Cronkite-Ratcliff, C., 2008. Geostatistical solutions for super-resolution land cover mapping. IEEE Transactions on Geoscience and Remote Sensing 46, 272–283.
Corcoran, P., Winstanley, A., Mooney, P., 2010. Segmentation performance evaluation for object-based remotely sensed image analysis. International Journal of Remote Sensing 31, 617–645.
Cracknell, A.P., 1998. Synergy in remote sensing – what's in a pixel? International Journal of Remote Sensing 19, 2025–2047.
Fisher, P., 1997. The pixel: a snare and a delusion. International Journal of Remote Sensing 18, 679–685.
Foody, G.M., 1998. Sharpening fuzzy classification output to refine the representation of sub-pixel land cover distribution. International Journal of Remote Sensing 19, 2593–2599.
Foody, G.M., 2002. The role of soft classification techniques in the refinement of estimates of ground control point location. Photogrammetric Engineering and Remote Sensing 68, 897–903.
Foody, G.M., Muslim, A.M., Atkinson, P.M., 2005. Super-resolution mapping of the waterline from remotely sensed data. International Journal of Remote Sensing 26, 5381–5392.
Ge, Y., Li, S.P., Lakhan, V.C., 2009. Development and testing of a subpixel mapping algorithm. IEEE Transactions on Geoscience and Remote Sensing 47, 2155–2164.
Kasetkasem, T., Arora, M.K., Varshney, P.K., 2005. Super-resolution land cover mapping using a Markov random field based approach. Remote Sensing of Environment 96, 302–314.
Lee, D.S., Shan, J., Bethel, J.S., 2003. Class-guided building extraction from Ikonos imagery. Photogrammetric Engineering and Remote Sensing 69, 143–150.
Li, L., Ustin, S.L., Lay, M., 2005. Application of multiple endmember spectral mixture analysis (MESMA) to AVIRIS imagery for coastal salt marsh mapping: a case study in China Camp, CA, USA. International Journal of Remote Sensing 26, 5193–5207.
Ling, F., Li, W.B., Du, Y., Li, X.D., 2011. Land cover change mapping at the subpixel scale with different spatial-resolution remotely sensed imagery. IEEE Geoscience and Remote Sensing Letters 8, 182–186.
Ling, F., Xiao, F., Du, Y., Xue, H.P., Ren, X.Y., 2008. Waterline mapping at the subpixel scale from remote sensing imagery with high-resolution digital elevation models. International Journal of Remote Sensing 29, 1809–1815.
Ling, F., Xiao, F., Du, Y., Xue, H.P., Wu, S.J., 2010. Super-resolution land cover mapping with multiple sub-pixel shifted remotely sensed images. International Journal of Remote Sensing 31, 5023–5040.
Makido, Y., Shortridge, A., 2007. Weighting function alternatives for a subpixel allocation model. Photogrammetric Engineering and Remote Sensing 73, 1233–1240.
Makido, Y., Shortridge, A., Messina, J.P., 2007. Assessing alternatives for modeling the spatial distribution of multiple land-cover classes at sub-pixel scales. Photogrammetric Engineering and Remote Sensing 73, 935–943.
Mertens, K.C., De Baets, B., Verbeke, L.P.C., De Wulf, R.R., 2006. A sub-pixel mapping algorithm based on sub-pixel/pixel spatial attraction models. International Journal of Remote Sensing 27, 3293–3310.
Mertens, K.C., Verbeke, L.P.C., Ducheyne, E.I., De Wulf, R.R., 2003. Using genetic algorithms in sub-pixel mapping. International Journal of Remote Sensing 24, 4241–4247.
Muslim, A.M., Foody, G.M., Atkinson, P.M., 2007. Shoreline mapping from coarse-spatial resolution remote sensing imagery of Seberang Takir, Malaysia. Journal of Coastal Research 23, 1399–1408.
Nguyen, M.Q., Atkinson, P.M., Lewis, H.G., 2005. Superresolution mapping using a Hopfield neural network with LIDAR data. IEEE Geoscience and Remote Sensing Letters 2, 366–370.
Nguyen, M.Q., Atkinson, P.M., Lewis, H.G., 2006. Superresolution mapping using a Hopfield neural network with fused images. IEEE Transactions on Geoscience and Remote Sensing 44, 736–749.
Powell, R.L., Roberts, D.A., Dennison, P.E., Hess, L.L., 2007. Sub-pixel mapping of urban land cover using multiple endmember spectral mixture analysis: Manaus, Brazil. Remote Sensing of Environment 106, 253–267.
Rosin, P.L., 2003. Measuring shape: ellipticity, rectangularity, and triangularity. Machine Vision and Applications 14, 172–184.
Sampath, A., Shan, J., 2007. Building boundary tracing and regularization from airborne lidar point clouds. Photogrammetric Engineering and Remote Sensing 73, 805–812.
Sohn, G., Dowman, I., 2007. Data fusion of high-resolution satellite imagery and LiDAR data for automatic building extraction. ISPRS Journal of Photogrammetry and Remote Sensing 62, 43–63.
Somers, B., Asner, G.P., Tits, L., Coppin, P., 2011. Endmember variability in spectral mixture analysis: a review. Remote Sensing of Environment 115, 1603–1616.
Tatem, A.J., Lewis, H.G., Atkinson, P.M., Nixon, M.S., 2001. Super-resolution target identification from remotely sensed images using a Hopfield neural network. IEEE Transactions on Geoscience and Remote Sensing 39, 781–796.
Tatem, A.J., Lewis, H.G., Atkinson, P.M., Nixon, M.S., 2002. Super-resolution land cover pattern prediction using a Hopfield neural network. Remote Sensing of Environment 79, 1–14.
Thornton, M.W., Atkinson, P.M., Holland, D.A., 2006. Sub-pixel mapping of rural land cover objects from fine spatial resolution satellite sensor imagery using super-resolution swapping. International Journal of Remote Sensing 27, 473–491.
Thornton, M.W., Atkinson, P.M., Holland, D.A., 2007. A linearised pixel-swapping method for mapping rural linear land cover features from fine spatial resolution remotely sensed imagery. Computers and Geosciences 33, 1261–1272.
Tolpekin, V., Stein, A., 2009. Quantification of the effects of land-cover-class spectral separability on the accuracy of Markov-random-field-based superresolution mapping. IEEE Transactions on Geoscience and Remote Sensing 47, 3283–3297.
Verhoeye, J., De Wulf, R., 2002. Land cover mapping at sub-pixel scales using linear optimization techniques. Remote Sensing of Environment 79, 96–104.
Weng, Q., Hu, X., 2008. Medium spatial resolution satellite imagery for estimating and mapping urban impervious surfaces using LSMA and ANN. IEEE Transactions on Geoscience and Remote Sensing 46, 2397–2406.
Yu, Q., Gong, P., Clinton, N., Biging, G., Kelly, M., Schirokauer, D., 2006. Object-based detailed vegetation classification with airborne high spatial resolution remote sensing imagery. Photogrammetric Engineering and Remote Sensing 72, 799–811.