ISPRS Journal of Photogrammetry and Remote Sensing 105 (2015) 107–119
Semantic classification of urban buildings combining VHR image and GIS data: An improved random forest approach Shihong Du ⇑, Fangli Zhang, Xiuyuan Zhang Institute of Remote Sensing and GIS, Peking University, Beijing 100871, China
Article info
Article history: Received 26 November 2014 Received in revised form 16 March 2015 Accepted 17 March 2015
Keywords: Very high resolution (VHR) images Urban buildings Semantic classification Random forest Object-based image analysis (OBIA)
Abstract
While most existing studies have focused on extracting geometric information on buildings, only a few have concentrated on semantic information. Without semantic information, many demands arising from environmental and social issues cannot be satisfied. This study presents an approach to semantically classify buildings into much finer categories than those of existing studies by learning a random forest (RF) classifier from a large number of imbalanced samples with high-dimensional features. First, a two-level segmentation mechanism combining GIS data and VHR images produces single image objects at a large scale and intra-object components at a small scale. Second, a semi-supervised method chooses a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. Third, two important improvements to the RF classifier are made: a voting-distribution ranked rule for reducing the influence of imbalanced samples on classification accuracy and a feature importance measurement for evaluating each feature's contribution to the recognition of each category. Fourth, the semantic classification of urban buildings is conducted in Beijing, and the results demonstrate that the proposed approach is effective and accurate. The seven categories used in the study are finer than those in existing work and more helpful for studying many environmental and social problems. © 2015 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
1. Introduction
As main sites of urban activities and important components of cities, urban buildings are vital foundations of urban studies. Semantic classification of buildings labels buildings using a set of semantic categories cognized and conceptualized by people, such as low-story shantytowns, middle-story apartments, high-story apartments, administrative buildings, and commercial buildings. These categories strongly correlate with urban environment analyses (e.g. ecological and environmental evaluation), urban resource allocation (e.g. resource management, transportation planning, and disaster reduction) and urban social analyses (e.g. population estimation and market research) (Wu et al., 2005). Existing work has focused on how to extract building contours or accurately distinguish buildings from non-buildings. However, geometric information alone cannot fulfill the demands of research on urban ecology, resources and society (Paul et al., 2001). Therefore, semantic classification of urban buildings is required.
⇑ Corresponding author. Tel.: +86 10 62750294; fax: +86 10 62751961.
Geometric analyses of buildings aim to extract geometric contours of buildings or distinguish buildings from other objects by using geometric or spectral features. In the middle-to-late 1980s, researchers started to extract urban buildings from aerial photos (Huertas and Nevatia, 1988). With the explosive increase in image data and the continuous development of sensor techniques, techniques for extracting urban buildings have made great progress. From the perspective of the images used, buildings can be extracted from either low- and medium-resolution images or high-resolution images (Lin and Nevatia, 1998). Due to the limits of spatial resolution, only large areas of buildings or residential areas instead of individual buildings can be obtained from low- and medium-resolution images (Nevatia et al., 1997). On the other hand, VHR images can provide finer texture and more accurate locations of buildings. Thus, they are used more widely to acquire buildings with high accuracy (Myint et al., 2011). From the perspective of extraction methods, existing work generally falls into edge-based geometric grouping or object-based classification. The former first extracts edges from images, and then uses geometric models of buildings as prior constraints to find edges belonging to the same buildings and group them into complete contours. These works have often used optical VHR data (Kim and Muller,
E-mail address: [email protected] (S. Du).
http://dx.doi.org/10.1016/j.isprsjprs.2015.03.011
0924-2716/© 2015 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
1999; Sirmacek and Unsalan, 2009; Ok, 2013) or a combination of optical and LiDAR data (Sohn and Dowman, 2007; Awrangjeb et al., 2010, 2013). Unlike edge-based methods, object-based methods first segment VHR images into image objects, and then distinguish image objects of buildings from those of non-buildings using image features (Myint et al., 2011). However, in VHR images, a lot of detailed information emerges, and the heterogeneity of buildings becomes much larger. Consequently, it is difficult to find appropriate segmentation scales and image features to classify complete buildings with different shapes, sizes, and structures.
Semantic analyses of urban buildings concentrate on distinguishing different categories of buildings. These categories are cognized and conceptualized by people and described in natural language. More importantly, they are strongly correlated with environmental and social variables and have special implications for these variables. A few studies have concentrated on recognizing the categories or neighborhoods of urban buildings. For identifying the categories of buildings, Lu et al. (2014) used spatial attributes calculated from LiDAR and other land-use features to classify buildings into three categories: single-family houses, multiple-family houses, and non-residential buildings. Belgiu et al. (2014) used airborne laser scanning data to group buildings into three categories: residential/small buildings, apartments/block buildings, and industrial/factory buildings. For the classification of neighborhoods, Graesser et al. (2012) defined urban neighborhoods as homogeneous zones and classified them as formal and informal areas, but they did not recognize subtypes, such as residential, commercial, and industrial structures. Other work in this field includes extracting unplanned settlements (Kuffer et al., 2014) and slums (Kohli et al., 2012) from VHR images.
Based on the analyses above, most existing studies have focused on extracting geometric information on buildings, while only a few have concentrated on semantic analysis. In addition, some important issues remain to be resolved. First, existing work on semantic analyses has distinguished too few categories to satisfy the many demands of environmental or social sciences (Graesser et al., 2012; Kohli et al., 2012; Belgiu et al., 2014; Kuffer et al., 2014; Lu et al., 2014). Second, there have been no appropriate segmentation scales and algorithms to produce single image objects for diverse buildings. This lack of computational methods leads to low classification accuracies, as image features strongly depend on segmentation scales. Third, a small number of manually chosen samples and features may be practical for classifying a few categories of buildings. To distinguish between more building categories that vary greatly in size, shape, structure, and spectrum, however, a large number of samples and high-dimensional, heterogeneous features are required. In this situation, the samples are often imbalanced, and the features are often auto-correlated and differ in importance for distinguishing different categories. Unfortunately, there is still a lack of related work to reduce the influences of imbalanced samples on classification and to evaluate the importance of features for classifying each category. To resolve the issues raised above, this study presents a two-level segmentation mechanism (i.e. a large-scale layer constrained by GIS data for producing single image objects and a small-scale layer providing intra-object component features) and a semi-supervised method to choose a large number of unbiased samples by considering the spatial proximity and intra-cluster similarity of buildings. The random forest (RF) classifier is used to semantically classify buildings, for it is capable of handling a large number of samples and high-dimensional, heterogeneous features.
Moreover, to improve classification accuracy and evaluate feature importance, two improvements to the RF classifier are presented: a voting-distribution ranked rule for reducing the influences of imbalanced samples and a feature importance measurement for each category based on Gini descent and a path tracing strategy.
The first contribution of this study is the improvement of the RF classifier in its voting rule and feature importance evaluation. Although some researchers have used the RF classifier to classify VHR images, effective approaches to handling imbalanced samples and evaluating feature importance for each category are still lacking. The improvements fill this gap from a methodological perspective. Another contribution is the semantic classification of urban buildings. The seven categories used in this study are finer than those used in existing studies and more appropriate for many environmental and social variables, such as population distribution (Wu et al., 2005; Lu et al., 2006) and small-scale heating networks (Geiß et al., 2011). Existing categories are ineffective at handling these variables. This situation is worse in China because the inter-category differences in the capacity for housing families are very large. Therefore, this study is motivated by both theoretical and practical demands.
2. Semantic category system of buildings
This section first discusses the cognition and representation of urban buildings in the real physical world, the geoinformatic world, and the cognition world, and then analyzes the transformations of real-world urban buildings to object features and semantic categories. Finally, it constructs a semantic category system of buildings.
2.1. Category system of urban buildings
Urban buildings, made of various materials with assorted styles and appearances in the real physical world, are the basis on which people cognize semantic categories and remote sensors sense buildings (the middle section of Fig. 1). In the geoinformatic world (the right section of Fig. 1), buildings are abstracted into contours in GIS data and into image pixels or image objects in VHR images. Thus, they are described in terms of spectrums, shapes, and textures.
In the cognition world, people cognize, understand, and communicate their ideas about buildings through appropriate semantic categories (the left section of Fig. 1). Therefore, building a semantic category system helps to transform the feature representations in the geoinformatic world into the concepts of the cognition world. The goal of semantic classification is to build relationships between the concepts of buildings in the cognition world and the features of buildings in the geoinformatic world. Therefore, the semantic category system can be built by discriminating the appearances and functions of urban buildings, including low-story (LS) shantytowns, medium-story (MS) apartments, high-rising (HR) apartments, administrative (AD) buildings, commercial (CM) buildings, industrial (ID) buildings, and auxiliary (AU) buildings (Table 1).
2.2. Inter-category variations of buildings
Fig. 2 illustrates typical images for each of the seven categories of buildings. It is clear that these categories vary greatly in the following aspects. First, the buildings in those categories have different sizes. Most buildings are single objects, while some (e.g. LS shantytowns) refer to the extents of spatially dense buildings and are much larger than other buildings. Even buildings in the same category may have different sizes. Second, spectral values differ significantly between the seven categories. Generally, LS shantytowns are represented as gray pixels, while CM and ID buildings often consist of colored pixels. Some categories of buildings (e.g. AD buildings)
Fig. 1. Urban buildings in the three worlds.
Table 1
Semantic classification system for urban buildings.
Category | Code | Definition
Low-story (LS) shantytowns | 0 | Bungalows densely distributed for daily life
Medium-story (MS) apartments | 1 | Slab-type apartment buildings like rectangles with larger lengths than widths, and normally lower than ten stories (Fig. 2b)
High-rising (HR) apartments | 2 | Residential buildings which are squares and higher than 12 stories (Fig. 2c)
Administrative (AD) buildings | 3 | Office buildings for government, medical, educational and commercial administration
Commercial (CM) buildings | 4 | Buildings for commodity trading
Industrial (ID) buildings | 5 | Buildings for industrial production and processing
Auxiliary (AU) buildings | 6 | Annex buildings
may be composed of multi-colored pixels. Therefore, spectral features can distinguish urban buildings to some degree. Third, the structures of urban buildings vary greatly. For example, LS shantytowns are composed of small buildings distributed both densely and randomly in their extents (Fig. 2a), while MS apartments are often more homogeneous and composed of one component, and HR apartments and AD buildings often have multiple components, each with a distinct size, shape, and spectrum. Fourth, the seven categories have different shapes: MS apartments are often rectangular, HR apartments may be rectangular or circular with holes (Fig. 2c), and AD buildings are polygons with complex shapes.
The variations of urban buildings in size, shape, structure, and spectrum pose many challenges to semantically classifying buildings. First, although some work has been done on automatically parameterizing multi-scale segmentation (Drăguţ et al., 2014), it is impractical to find appropriate segmentation scales or algorithms to extract image objects for one or multiple categories without guidance from prior information. Inappropriate segmentation often produces over- or under-segmented image objects instead of single objects for buildings, which leads to a lack of appropriate features for describing the properties of buildings as components and as a whole. Second, a large number of samples and high-dimensional image features are required because a few samples are insufficient to train a classifier. How to choose a large number of samples and guarantee that they are evenly distributed over the whole study area remains to be addressed. Third, a powerful classifier is needed to handle a large number of samples with high-dimensional features and to evaluate which features are important for classifying different categories of buildings.
Fig. 2. Typical images of various buildings: (a) LS shantytowns, (b) MS apartments, (c) HR apartments; the remaining panels show AD, CM, ID and AU buildings.
3. Methodology
To solve the issues raised above (the complete segmentation of buildings, the choice of training samples, and the classification approach of semantic categories), this study presents an approach for semantic classification of urban buildings. Four steps are required (Fig. 3):
(1) VHR image segmentation constrained by GIS data. Since existing segmentation methods cannot produce a single image object for each building, this study adopts a segmentation constrained by GIS data to obtain a single object for each building. The pixels inside each GIS contour are first merged into a single object, and then image features are computed.
(2) Feature extraction of buildings. Image features are the bridge connecting building objects to semantic categories. In this study, four types of features are used, including spectrum, texture, geometry, and spatial distribution. The first three are used to choose samples and classify buildings, while the last is used to evaluate clustering results and choose samples.
(3) Sample selection with a semi-supervised method. A large number of samples is required for semantic classification. The ISODATA algorithm is first used to cluster buildings according to spectrum, geometry, and texture features. Then, intra-cluster similarity is used to choose corresponding samples of semantic categories in a semi-supervised way.
(4) Semantic classification of buildings using the improved RF classifier. The three steps above can collect a large number of samples as well as high-dimensional and heterogeneous features for each sample. To train a classifier from these complex features and numerous samples, a semantic classification approach using the RF classifier is presented. Furthermore, RF is improved to reduce the influences of imbalanced samples and to evaluate the contribution of each feature to each category.
3.1. Image segmentation constrained by GIS data
Due to the complexity of roof structures, the influences of sidewalls, shadows and occlusions, and the great variations in image features of different buildings in VHR images, the accuracy of classification is further limited by the difficulty in obtaining single-image objects for diverse buildings relying solely on image information. Therefore, contours of GIS buildings are used as constraints to segment VHR images. The two-level segmentation proceeds in the following two steps (Fig. 4). For the first step at
Fig. 3. Flowchart of semantic classification of urban buildings (GIS-constrained segmentation of RS imagery into image objects; extraction of spectral, texture, geometric and structure features; ISODATA clustering and K-NN analysis for sampling; random forest training with OOB assessment and vote rule adjustment; classification of semantic categories and feature importance).
extracted (Table 2). Spectral and geometric features can distinguish buildings with different colors, sizes and shapes (Nevatia and Babu, 1980) in VHR images. Furthermore, some buildings cannot be distinguished by visual features: texture and distribution features are used instead to describe the distributions of pixels or sub-objects.
3.2.1. Spectral features
Spectral features, such as the average, brightness, maximal difference ratio, and standard deviation of pixel values, are determined by the construction materials and appearances of buildings. These features can be defined using the spectral values of pixels at four bands, i.e., one definition often corresponds to four features. A total of 47 spectral features are used.
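A few of the spectral definitions in Table 2 can be illustrated with a minimal sketch. It assumes that an image object is given as a list of per-pixel (blue, green, red, NIR) tuples and that brightness weights all bands equally; the function name `spectral_features`, the band order and the equal weights are assumptions made for illustration, not details taken from the study.

```python
from statistics import fmean, pstdev

BANDS = ("blue", "green", "red", "nir")  # assumed band order


def spectral_features(pixels):
    """Compute a few spectral features for one image object.

    pixels: list of (blue, green, red, nir) tuples, one tuple per pixel.
    """
    columns = list(zip(*pixels))              # one sequence of values per band
    means = [fmean(col) for col in columns]   # [Mean], one value per band
    stds = [pstdev(col) for col in columns]   # [Std. dev], one value per band
    brightness = fmean(means)                 # assumption: equal band weights
    # [Max. diff.]: largest difference between band means, relative to brightness
    max_diff = (max(means) - min(means)) / brightness if brightness else 0.0
    red, nir = means[2], means[3]
    ndvi = (nir - red) / (nir + red) if nir + red else 0.0
    features = {"brightness": brightness, "max_diff": max_diff, "ndvi": ndvi}
    for band, mean, std in zip(BANDS, means, stds):
        features[f"mean_{band}"] = mean
        features[f"std_{band}"] = std
        # [Ratio]: band mean relative to brightness
        features[f"ratio_{band}"] = mean / brightness if brightness else 0.0
    return features


example = spectral_features([(40, 50, 60, 70), (60, 70, 80, 130)])
```

Because each definition is evaluated once per band, a single definition yields four features, which is how a small set of definitions grows into dozens of spectral features.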
Fig. 4. Two-level image segmentation constrained by GIS data.
the first level, the pixels within each building contour are directly merged to create a single image object. For the second step at the second level, each image object obtained in the first step is further segmented into sub-objects using smaller segmentation scales (Baatz and Schäpe, 2000). The image features of image objects help describe the shapes and spectrums of buildings from a global perspective, while those of sub-objects disclose the internal structures from a local perspective.
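The two levels can be sketched as follows, under strong simplifying assumptions. The GIS contours are assumed to be already rasterized into a grid of building ids (0 for background), and the second level uses a simple 4-connected region growing on brightness as a stand-in for the multi-scale segmentation cited above, which it does not reproduce. The names `level1_objects`, `level2_subobjects` and the threshold `max_diff` are hypothetical.

```python
from collections import defaultdict, deque


def level1_objects(building_ids):
    """Level 1: merge all pixels that share a building id into one object.

    building_ids: 2D list, 0 = background, k > 0 = id of the GIS contour
    covering that pixel (assumed to be rasterized beforehand).
    """
    objects = defaultdict(list)
    for r, row in enumerate(building_ids):
        for c, bid in enumerate(row):
            if bid != 0:
                objects[bid].append((r, c))
    return dict(objects)


def level2_subobjects(pixels, brightness, max_diff):
    """Level 2: split one object into sub-objects of similar brightness.

    Stand-in technique: 4-connected region growing, where a neighbor joins
    a region if its brightness differs from the current pixel by at most
    max_diff. A smaller max_diff produces more, smaller sub-objects.
    """
    remaining = set(pixels)
    subobjects = []
    while remaining:
        seed = min(remaining)          # deterministic choice of seed pixel
        remaining.remove(seed)
        region, queue = [seed], deque([seed])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (nr, nc) in remaining and \
                        abs(brightness[nr][nc] - brightness[r][c]) <= max_diff:
                    remaining.remove((nr, nc))
                    region.append((nr, nc))
                    queue.append((nr, nc))
        subobjects.append(sorted(region))
    return subobjects
```

Growing regions only inside the pixel set of one object guarantees that sub-objects never cross a building contour, which is the point of constraining the segmentation by GIS data.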
3.2. Feature extraction of buildings
Based on the two-level image segmentation above, spectral, texture, geometric and distribution features can be defined and
3.2.2. Texture features
Image objects differ both in gray levels at different bands and in the distributions of sub-objects and pixels. Accordingly, texture features include the structure features defined on sub-objects and the textures of pixels derived from the gray-level co-occurrence matrix (GLCM) (Honeycutt and Plotnick, 2008). The texture features derived from GLCM depend on the direction (0°, 45°, 90° and 135°), and in total, 16 features are obtained from four bands in four directions (Table 2). The structure features refer to the standard deviations of the mean spectrum, areas and main directions of sub-objects. Altogether, there are 216 texture features for classification. Fig. 5 illustrates the structure features of three buildings: a, b and c. Building a has the lowest standard deviation of average spectrum and b has the largest in both average spectrum and standard deviation. If only average spectrums are considered, buildings a
Table 2
The defined features of urban buildings.
Types | Names | Meanings | Ranges
Spectrum (47) | [Mean] | The average spectrum of pixels | [m, M]
Spectrum (47) | [Brightness] | The weighted average spectrum of pixels on each band | [m, M]
Spectrum (47) | [Max. diff.] | The ratio of the maximal difference of average spectrum of each band to the brightness | [0, 1]
Spectrum (47) | [Std. dev] | The standard deviation of pixels' grayscale of an image object | [0, 1]
Spectrum (47) | [Skewness] | The skewness of grayscale histogram of pixels | [M3, M3]
Spectrum (47) | [Ratio] | The ratio of average spectrum to brightness | [0, 1]
Spectrum (47) | [HIS] | The hue, saturation and brightness of RGB color | [0, 1]
Spectrum (47) | [Mean border] | The average spectrum on the inside and outside border | [m, M]
Spectrum (47) | [Contrast border] | The average differences between border pixels and their neighborhood | [M, M]
Spectrum (47) | [Circular] | The average spectrum and brightness of pixels within a ring around the center of a building | [m, M]
Spectrum (47) | [NDVI] | Normalized difference vegetation index | [−1, 1]
Texture (216) | [SO std. dev.] | The standard deviation of spectrums of sub-objects | [0, 1]
Texture (216) | [SO mean diff.] | The average and standard deviation of spectrum differences between adjacent sub-objects | [0, 1]
Texture (216) | [SO area] | The standard deviation of the areas of sub-objects | [0, N]
Texture (216) | [SO density] | The standard deviation of the average density of sub-objects | [0, 1]
Texture (216) | [SO direction] | The average and standard deviation of main directions of all sub-objects | [0, 180]
Texture (216) | [Homogeneity] | The homogeneity derived from GLCM | [0, 1]
Texture (216) | [Contrast] | The contrast derived from GLCM | [0, 65,025]
Texture (216) | [Dissimilarity] | The heterogeneity parameters derived from GLCM | [0, 255]
Texture (216) | [Entropy] | The information entropy derived from GLCM | [0, 10,404]
Texture (216) | [2nd Mom.] | The secondary moment derived from GLCM | [0, 1]
Texture (216) | [Correlation] | The correlation derived from GLCM | [0, 1]
Texture (216) | [GLDV] | The vector composed of diagonal elements of GLCM | [0, 255]
Geometry (44) | [Area] | The number of pixels within image objects | [0, N]
Geometry (44) | [Border length] | The number of pixels on the inner and outer borders | [0, 1]
Geometry (44) | [Length/width] | The length–width ratio of the envelope rectangle | [0, 1]
Geometry (44) | [Border index] | The ratio between the border lengths of a building and the smallest enclosing rectangle | [1, 1]
Geometry (44) | [Compactness] | The ratio of the product of length and width to the area of an image object | [0, 1]
Geometry (44) | [Ellipse fit] | The goodness of a building fitting into an ellipse | [0, 1]
Geometry (44) | [Main direction] | The direction of the eigenvector belonging to the larger of the two eigenvalues | [0, 180]
Geometry (44) | [Radius ellipse] | The radiuses of maximal inscribed ellipse and minimal circumscribed ellipse | [0, 1]
Geometry (44) | [Rectangular fit] | The goodness of a building fitting into a rectangle | [0, 1]
Geometry (44) | [Roundness] | The difference between the radiuses of the maximal inscribed and minimum circumscribed ellipses | [0, 1]
Geometry (44) | [Shape index] | The ratio of perimeter of a building to four times the square root of its area | [1, 1]
Geometry (44) | [Sub-objects] | The number of sub-objects | [1, 1]
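Several of the geometric definitions in Table 2 translate directly into code. The sketch below assumes an image object is a set of (row, column) pixel coordinates, measures the border length as the number of pixel edges exposed to the outside, and uses an axis-aligned bounding box as the envelope rectangle. These choices and the name `geometric_features` are assumptions for illustration; the software used in the study may define the features differently.

```python
import math


def geometric_features(pixels):
    """Compute area, border length, length/width ratio and shape index.

    pixels: iterable of (row, col) coordinates belonging to one image object.
    """
    cells = set(pixels)
    area = len(cells)                                   # [Area]: pixel count
    # Border length: pixel edges whose 4-neighbor lies outside the object
    border = sum(
        1
        for (r, c) in cells
        for n in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1))
        if n not in cells
    )
    rows = [r for r, _ in cells]
    cols = [c for _, c in cells]
    height = max(rows) - min(rows) + 1
    width = max(cols) - min(cols) + 1
    return {
        "area": area,
        "border_length": border,
        # [Length/width] of the axis-aligned envelope rectangle (assumption)
        "length_width": max(height, width) / min(height, width),
        # [Shape index]: perimeter over four times the square root of the area
        "shape_index": border / (4 * math.sqrt(area)),
    }


slab = [(r, c) for r in range(2) for c in range(8)]     # 2 x 8 slab-like object
square = [(r, c) for r in range(4) for c in range(4)]   # 4 x 4 square object
```

With the same area of 16 pixels, the slab gets a shape index of 1.25 and a length/width ratio of 4, while the square gets 1.0 for both, which shows how such features can separate elongated slab-type buildings from square ones.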
Fig. 5. The structure features of sub-objects (distributions of sub-object brightness values, shown as mean and standard deviation, for buildings a, b and c over the Blue, Green, Red and NIR bands).
and b can be confused with each other as their spectrums are very similar. However, if standard deviations are considered, the three buildings are distinct, for the differences between standard deviations are larger than those between average spectrums.
3.2.3. Geometric features
Because spectral information is insufficient and easily confused, it is hard to accurately classify buildings relying solely on spectral features. At the first segmentation level, image objects as wholes can directly measure the geometric shapes and arrangements of buildings. Besides the area and perimeter, how well image objects fit basic shapes of similar sizes (e.g. square, circle and ellipse) is important in distinguishing different categories; the corresponding features include shape index, rectangular fit, length of skeleton lines, roundness, and elliptic fit (Trimble, 2011).
3.2.4. Distribution features
Distribution features measure the spatial proximity and category similarity between buildings. Buildings that neighbor each other in space and are similar in shape are more likely to belong to the same category. Two distribution features serve this purpose: the proportion of buildings belonging to the same category in the K-nearest neighbors (abbreviated as the proportion in KNN), and the average distance between a building and its K-nearest neighbors with the same category (abbreviated as the average distance of KNN). The proportion in KNN (Eq. (1)) considers both spatial proximity and categorical similarity to measure the degree to which a building belongs to the category of its neighbors. The average distance of KNN (Eq. (2)) measures the distance to the buildings counted in the proportion in KNN. Therefore, the two features contribute to choosing samples.
The proportion in KNN:
\[ \mathrm{Prop}_K(o) = \frac{|KNN(o)|}{K} \times 100\% \tag{1} \]
The average distance of KNN:
\[ \mathrm{Dis}_K(o) = \frac{\sum_{o_i \in KNN(o)} \mathrm{dis}(o, o_i)}{|KNN(o)|} \tag{2} \]
where $KNN(o)$ refers to the K-nearest neighbors with the same category of $o$, $|KNN(o)|$ the number of buildings in $KNN(o)$, and $\mathrm{dis}(o, o_i)$ the distance between buildings $o$ and $o_i$.
3.3. Sample selection with a semi-supervised method
The quality of samples heavily determines the accuracy of semantic classification. Thus, it is of great importance to obtain a
large number of samples that are evenly distributed over the whole study area and in proportion to the ratio of each category of buildings to the total buildings. That is, the samples should reflect the real distribution of the features in each category. However, manual selection cannot guarantee a sufficient and even distribution of samples in a large study area, let alone that the samples obey the real distributions. The discrepancies between the samples and the real distributions lie in the following two aspects. First, the distributions of feature values of the selected samples do not obey the real ones in the image data. Second, the ratio of the samples in each category to the total samples does not match the actual one in the image data. These two discrepancies will affect the classification results. For the first discrepancy, this study presents a semi-supervised approach to choose a large number of samples guided by the results of the ISODATA algorithm (Memarsadeghi et al., 2007). This approach uses feature similarity to group image objects into clusters by measuring how feature values are distributed over different clusters (Fig. 6). Furthermore, objects with high confidence are chosen as samples. Formally, the confidence is defined as $\mathrm{conf}(o) = \mathrm{Prop}_K(o) / \mathrm{Dis}_K(o)$. The more neighbors of an object belong to the same cluster and the shorter the distances to those neighbors, the higher the confidence and the more likely the object is to be chosen as a sample. Generally, one category often corresponds to multiple clusters; thus, the categories of samples are manually specified by combining multiple clusters into one category. Because clusters are obtained by considering the feature similarity and feature distribution of buildings, the feature distributions of samples should be consistent with the actual ones. However, this approach inevitably produces imbalanced samples. In the physical world, buildings of some categories greatly outnumber those of others, and so do their samples.
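The two distribution features and the confidence can be sketched as follows, assuming each building is represented by a centroid and a cluster label and that distances are Euclidean distances between centroids. The function name `knn_confidence`, the centroid representation and the handling of the edge cases (no same-cluster neighbor, zero distance) are assumptions for illustration. The proportion is returned as a fraction rather than a percentage.

```python
import math


def knn_confidence(index, centroids, clusters, k):
    """Return (Prop_K, Dis_K, conf) for the building at position `index`.

    centroids: list of (x, y) building centroids (assumed representation).
    clusters:  list of cluster labels, e.g. from ISODATA clustering.
    """
    ox, oy = centroids[index]
    # K nearest neighbors of the building, as (distance, position) pairs
    nearest = sorted(
        (math.hypot(x - ox, y - oy), j)
        for j, (x, y) in enumerate(centroids)
        if j != index
    )[:k]
    # KNN(o): neighbors among the K nearest that share the building's cluster
    same = [d for d, j in nearest if clusters[j] == clusters[index]]
    prop = len(same) / k                       # Eq. (1), as a fraction
    if not same:
        return prop, math.inf, 0.0             # no support from neighbors
    dis = sum(same) / len(same)                # Eq. (2)
    conf = prop / dis if dis > 0 else math.inf
    return prop, dis, conf


centroids = [(0, 0), (1, 0), (0, 2), (5, 5)]
clusters = ["A", "A", "A", "B"]
prop, dis, conf = knn_confidence(0, centroids, clusters, k=3)
```

In this toy layout, the first building has two of its three nearest neighbors in the same cluster at an average distance of 1.5, so it is a stronger sample candidate than the isolated building of cluster B, whose confidence is zero.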
Therefore, the imbalance of samples will be considered in the following sections.
3.4. Improved RF classifier for semantic classification
The presented semi-supervised method can obtain a large number of samples (in the thousands) with high-dimensional (in the hundreds) and heterogeneous features. To reasonably exploit the samples and their features, the RF classifier is employed: it randomly selects samples and features to train decision trees and integrates all trained decision trees to vote for the most popular category (Breiman, 2000). In this study, RF is improved to evaluate the contribution of each feature to classifying each category and to reduce the influences of imbalanced samples on classification results.
Fig. 6. Sample selection using ISODATA clustering.
3.4.1. The principles of RF classifier
Based on the bagging ensemble learning algorithm (Efron, 1979), the RF classifier trains each decision tree independently (Pal, 2005), and the votes of all decision trees determine the final results. The training steps are as follows: (1) choose a subset of samples using the bootstrap sampling method, (2) randomly choose $\sqrt{M}$ features from the $M$ available ones for each node, (3) construct a CART decision tree from the chosen samples by using the Gini coefficient (Eq. (3)) as information gain (Quinlan, 1986), and (4) repeat until $N$ CART decision trees, i.e. a RF, are built.
\[ \mathrm{Gini}(S_{LM}) = 1 - \sum_{i=1}^{K} p_i^2 \tag{3} \]
where $S_{LM}$ refers to the chosen subset consisting of $L$ samples and $M$ features, $p_i$ the frequency of samples of the $i$th category, and $K$ the number of categories. The classification process of decision trees is exactly the same as the training process. For each building, each tree independently predicts a category; accordingly, the resulting category is the most popular one.
3.4.2. Learning from imbalanced samples: a voting-distribution ranked rule
Traditional RF uses the simple majority voting rule to make decisions and tends to misclassify the minority categories. Therefore, imbalanced samples heavily affect the classification accuracy. Existing work on learning from imbalanced data includes balanced random forest (BRF) and weighted random forest (WRF) (Chen et al., 2004). BRF copies a small number of samples for the minority categories and randomly chooses the same number of samples for the majority categories. WRF weighs the samples of different categories to reduce the differences in samples. However, BRF needs to adaptively identify the number of copied samples, which changes the original distribution of samples. Meanwhile, WRF requires a weight to be specified for each category, and how to specify the weights is still unresolved. Therefore, a new voting rule, the voting-distribution ranked rule, is proposed to replace the simple majority voting rule. Suppose there are $N$ decision trees and $K$ categories, and each tree has one vote for a sample $o$; then the votes of the $N$ trees can be represented as a distribution $\mathrm{Vote}(o) = (n_1, n_2, \ldots, n_K)$ with $\sum_{i=1}^{K} n_i = N$. The simple majority vote rule (Kontschieder et al., 2014) assigns a sample to the most popular category. The vote distributions of out-of-bag (OOB) samples are significant in discovering confused categories. For example, let $n_1$ and $n_2$ be the first and second maximum votes; if they are close, unclassified samples tend to be misclassified. Due to the influence of imbalanced samples, it is easier for majority categories than for minority categories to obtain more votes. Once the votes of the majority categories exceed those they deserve, misclassification will occur. Fortunately, imbalanced samples do not hide the vote distribution over categories, even though they affect the number of votes for each category. Let $p_i = n_i / N$ $(i = 1, 2, \ldots, K)$ be the probability of the $i$th category; then the vote distribution of unclassified sample $o$ is defined as $\mathrm{prob}(o) = (p_1, p_2, \ldots, p_K)$. For each OOB sample, a vote distribution can be obtained; as a result, multiple vote distributions can be obtained for each category since there are many OOB samples for each category. The reliable distributions (ranked in the top 5% of the probabilistic distributions) of each category are averaged into a representative one. Accordingly, for an unclassified sample, its vote distribution is compared with the representative distributions of the $K$ categories. The sample is then assigned to the category with the largest similarity, that is, the smallest distance (Eq. (4)).

$$\mathrm{Class}(o) = \arg\min_{i} \sqrt{\sum_{j} \left(\tilde{p}_j - \tilde{p}_{ij}\right)^2} \qquad (i, j = 1, 2, \ldots, K) \qquad (4)$$
where $\tilde{p}_i$ refers to the representative distribution of category $i$ (with components $\tilde{p}_{ij}$) and $\tilde{p}$ to the vote distribution of the unclassified sample $o$ (with components $\tilde{p}_j$).

3.4.3. Evaluating feature importance: a Gini coefficient descent approach
To rank feature importance per category, a Gini coefficient descent approach is proposed. Two existing methods have been used to rank feature importance: permutation importance and Gini importance (Calle and Urrea, 2011). The first method randomly permutes the values of feature $v_i$ in all samples and then classifies the OOB samples with the trained RF; the average decrease in accuracy caused by the permutation is regarded as the importance of $v_i$. Gini importance computes the information gain $\Delta\mathrm{Gini}$ at the branch nodes that split on feature $v_i$ in each tree, and the total $\Delta\mathrm{Gini}$ over all trees is taken as the feature importance. Permutation importance measures the importance of each feature independently, while Gini importance evaluates the importance of each feature using the information on all features in all decision trees. Therefore, to obtain feature importance per category, a path-tracing approach based on Gini coefficient descent is presented. To estimate the importance of feature $v_i$ to category $c_j$, the leaf nodes labeled $c_j$ in all trees are first chosen, and the paths from these leaves back to the branch nodes that split samples on $v_i$ are traced. Let $\Delta\mathrm{Gini}_{v_i}(c_j)$ be the information gain of one path from a branch node using $v_i$ to a leaf labeled $c_j$. The total information gain over all such paths is then defined as the importance of $v_i$ to category $c_j$ (Eq. (5)). The importance of every feature to every category can be obtained in the same way.

$$VI_G(v_i, c_j) = \sum \Delta\mathrm{Gini}_{v_i}(c_j) \qquad (5)$$
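
As an illustration of the voting-distribution ranked rule of Section 3.4.2 (Eq. (4)), the following minimal Python sketch builds representative vote distributions from OOB samples and assigns a new sample to the nearest one. It is not the authors' implementation: the function names, the toy numbers and the `top_fraction` parameter are hypothetical, and "reliable" is assumed here to mean the OOB distributions with the largest vote share for their own true category, which the text does not state explicitly.

```python
from math import sqrt


def vote_distribution(votes, n_classes):
    """Turn the per-tree votes of one sample (class indices) into prob(o)."""
    counts = [0] * n_classes
    for v in votes:
        counts[v] += 1
    return [c / len(votes) for c in counts]


def representative_distributions(oob_dists, oob_labels, n_classes, top_fraction=0.05):
    """Average the most reliable OOB vote distributions of each category.

    Assumption: reliability is ranked by the vote share given to the
    sample's true category (the paper only says "ranked in top 5%").
    """
    reps = []
    for k in range(n_classes):
        dists = [d for d, y in zip(oob_dists, oob_labels) if y == k]
        if not dists:
            raise ValueError(f"no OOB samples for category {k}")
        dists.sort(key=lambda d: d[k], reverse=True)
        n_keep = max(1, int(len(dists) * top_fraction))
        top = dists[:n_keep]
        reps.append([sum(d[j] for d in top) / n_keep for j in range(n_classes)])
    return reps


def classify(dist, reps):
    """Eq. (4): pick the category whose representative distribution is nearest."""
    distances = [sqrt(sum((p - q) ** 2 for p, q in zip(dist, rep))) for rep in reps]
    return min(range(len(reps)), key=distances.__getitem__)


# Toy data (hypothetical): category 0 is the majority, category 1 the minority.
oob_dists = [[0.9, 0.1], [0.8, 0.2], [0.6, 0.4], [0.55, 0.45]]
oob_labels = [0, 0, 1, 1]
# top_fraction=0.5 only because the toy set is tiny; the paper uses 5%.
reps = representative_distributions(oob_dists, oob_labels, n_classes=2, top_fraction=0.5)

# 20 trees: 13 vote for category 0 and 7 for category 1.
new_sample = vote_distribution([0] * 13 + [1] * 7, n_classes=2)
majority_label = max(range(2), key=new_sample.__getitem__)
ranked_label = classify(new_sample, reps)
print(majority_label, ranked_label)
```

In this toy case the simple majority rule returns category 0, whereas the ranked rule returns category 1, because the sample's vote distribution is closer to what typical minority-category samples receive.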
4. Case studies
To verify the feasibility of the presented approach, a series of experiments was conducted, including sample selection, semantic classification with the traditional RF, semantic classification with the improved RF for handling imbalanced samples, and evaluation of feature importance.
4.1. Study area and used data
Since urban buildings vary notably in size, shape, structure and spectrum, the experiments focus on them to test the presented method. The study area is located in the Haidian and Xicheng districts of Beijing (Fig. 7); it belongs to the city expansion area and is full of high-tech companies, culture and education institutions, commercial and service businesses, and research institutions. In addition, owing to rapid urbanization in recent years, the area has a large number of informal settlements and developing zones. It is therefore important to analyze the semantic categories of buildings in this area.
For semantic classification of buildings, both GIS and Quickbird data are used:
(1) Quickbird image data. The panchromatic band with a resolution of 0.61 m and the four multispectral bands with a resolution of 2.44 m are fused to produce four-band data with a resolution of 0.61 m. The spectral, geometric, and texture features are extracted from the Quickbird image.
(2) GIS data of buildings. To obtain a single-image object for each building, the contours in the GIS data were used.
The study area covers 38.7 km² and contains 8831 buildings. The largest contour, about 96,462 m², represents an LS shantytown, while the smallest contour is only 70 m² and refers to an AU building. Before semantic classification and sample selection, two-level segmentation was conducted and 307 image features were computed for each image object. In total, 8831 image objects and 15,258 sub-objects were obtained. The largest object has 118 sub-objects, while the smallest object has only one sub-object.
Fig. 7. Study area and Quickbird data.
4.2. Selected samples with the semi-supervised method
The presented semi-supervised method (Section 3.3) was used to choose 2747 unbiased samples from the 8831 buildings in the study area (Fig. 8). As shown in Fig. 8, the selected samples of each category are distributed sparsely and evenly over the whole study area instead of being clustered together. Without the assistance of the semi-supervised method, it would be a heavy burden for users to choose so many samples from such a large number of candidates while keeping them distributed over the whole study area.
Fig. 8. The selected samples and unclassified buildings.
4.3. Results of parameter sensitivity analysis
Two types of parameters are used in this study: those defining image features and those of the RF classifier. For the first kind of parameters,
multiple features are defined using different parameter values, which is how the 307 features used for classification are obtained. Therefore, only the parameter sensitivity analysis of the RF classifier is provided. The overall accuracy of the RF classifier is mainly determined by the number of decision trees, as the performance of RF is insensitive to the other parameters (Cutler et al., 2007). For each decision tree, about two-thirds of the samples are chosen for training, and the remaining ones are used as OOB samples to evaluate the accuracy. Table 3 reports the overall accuracies of RF classifiers with different numbers of decision trees. Accuracy tends to increase with the number of decision trees and to become stable once the number of trees exceeds 200. Accordingly, the RF classifier with 200 decision trees is used to classify buildings.

Table 3 The overall accuracies of RF classifiers with different numbers of trees.
| Number of trees | 50 | 100 | 150 | 200 | 250 | 300 |
|---|---|---|---|---|---|---|
| The simple majority voting rule | 0.7117 | 0.7059 | 0.7011 | 0.7175 | 0.7135 | 0.7084 |
| The voting-distribution ranked rule | 0.6727 | 0.7353 | 0.7561 | 0.7750 | 0.7521 | 0.7685 |

4.4. Classification results with the simple majority vote rule
An RF classifier with 200 decision trees is trained using the chosen 2747 samples, and the 6084 unclassified buildings are classified using the trained RF. For each OOB sample, each decision tree has one vote. The final category is the one with the most votes, and the confusion matrix for accuracy assessment is created from the predicted and existing categories of the OOB samples (Table 4). The overall accuracy is 71.50%, and the overall kappa coefficient is 0.59. As Table 4 shows, the overall accuracy is greatly reduced by the misclassification of 303 high-story apartments and 35 commercial buildings. The possible reasons are: (1) the samples are unevenly distributed over the categories, and (2) classification difficulty varies considerably between categories. Among the 2747 samples, MS apartments greatly outnumber buildings of other categories, accounting for 39.06% of all samples, while CM buildings have the smallest number of samples, only 1.27%. The imbalanced samples are in proportion to the real distribution of buildings in the physical world, but they can greatly affect the classification accuracy. For example, when a feature is chosen to partition samples at a branch node, the partition stops if only a few samples remain at that node. Therefore, if a category has too few samples to be distinguished from other categories, it tends to be overlooked. Furthermore, the feature differences between categories also affect the classification results. Based on an understanding of building semantics and the object features of each category, it is relatively easy to classify LS shantytowns and AU buildings because of their distinctive areas, as well as MS and HR apartments because of their distinctive geometric shapes. However, it is relatively difficult to distinguish CM and ID buildings because of their confused and diverse features, which leads to a low classification accuracy for these two categories.
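
The degree of imbalance described above can be checked with a short calculation. The sketch below is only an illustration, assuming the per-category sample counts given in the Total column of Table 4; it recomputes the quoted shares of the largest and smallest categories and evaluates Eq. (3) on the whole sample set. The variable and function names are hypothetical.

```python
def gini(counts):
    """Eq. (3): 1 minus the sum of squared category frequencies."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


# Per-category sample counts for category codes 0..6 (Total column of Table 4).
sample_counts = [200, 1073, 303, 430, 35, 97, 609]
total = sum(sample_counts)

# Share of each category in percent.
shares = [100.0 * c / total for c in sample_counts]

print(total)
print([round(s, 2) for s in shares])
print(round(gini(sample_counts), 4))
```

The entries 1073 and 35 reproduce the 39.06% and 1.27% quoted for MS apartments and CM buildings. A perfectly balanced seven-category set would have a Gini coefficient of 1 − 1/7 ≈ 0.857, so a lower value for the sample set reflects the dominance of a few categories.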
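Table 4 Confusion matrix and overall accuracy of the original RF (columns 0–6: reference categories).
| Category code | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Total | User's accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 159 | 22 | 0 | 18 | 0 | 1 | 0 | 200 | 79.50 |
| 1 | 1 | 1050 | 0 | 21 | 0 | 0 | 1 | 1073 | 97.86 |
| 2 | 0 | 255 | 0 | 4 | 0 | 0 | 4 | 303 | 0.00 |
| 3 | 8 | 99 | 0 | 322 | 0 | 1 | 0 | 430 | 74.88 |
| 4 | 9 | 4 | 0 | 21 | 0 | 1 | 0 | 35 | 0.00 |
| 5 | 13 | 20 | 0 | 52 | 0 | 12 | 0 | 97 | 12.37 |
| 6 | 0 | 180 | 0 | 8 | 0 | 0 | 421 | 609 | 69.13 |
| Total | 190 | 1670 | 0 | 446 | 0 | 15 | 426 | 2747 | |
| Producer's accuracy (%) | 83.68 | 62.87 | – | 72.20 | – | 80.00 | 98.83 | | |

Overall accuracy = 71.50%. Overall kappa statistics = 0.59.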
Fig. 9. The distributions of voting proportions of OOB samples. The colored vertical segments represent the correct results, while the white ones the wrong results. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
4.5. Classification results with the voting-distribution ranked rule
The imbalanced samples result from the real distribution of buildings in each category, and they lead to a certain bias when the categories are identified using the simple majority voting rule. Fig. 9 illustrates the distribution of voting results for the 2747 samples, where the x-axis denotes all OOB samples and the y-axis the probability distribution of the voting results. Each vertical line represents the votes for the categories assigned to a sample, with the height of each colored segment being the ratio of the votes for one category to the votes for all categories. Take the region in the circle as an example: the piecewise vertical segments represent the vote distribution of 1670 MS apartments, of which only 1050 are correctly classified (Table 4) while the other 620 are misclassified.
The minority categories are likely to be misclassified. Owing to the small number of samples, MS apartments, like CM and AD buildings, are often confused with HR apartments. This confusion makes it difficult to classify these buildings correctly with the simple majority voting rule. The voting-distribution ranked rule presented in Section 3.4.2 is used to reclassify the unclassified buildings, and the results are reported in Table 5. The overall accuracy increases to 79.54% and the kappa coefficient rises markedly to 0.72, demonstrating that the new voting rule improves the classification accuracy: the number of misclassified MS and HR apartments is greatly reduced. Fig. 10 illustrates the improved classification results obtained with the voting-distribution ranked rule. The spatial distributions of the different categories are consistent with the actual situation. LS shantytowns are mainly located on the periphery of the city, HR apartments in the core area, and MS apartments are distributed uniformly and locally clustered. LS shantytowns, MS apartments, HR apartments and AU buildings are classified with high accuracy, while AD, CM and ID buildings have relatively low accuracy, mainly because of the complexity of the buildings themselves.

Fig. 10. The classification results with the improved voting rule.

Table 5 Confusion matrix and overall accuracy of the improved voting rule (columns 0–6: reference categories).
| Category code | 0 | 1 | 2 | 3 | 4 | 5 | 6 | Total | User's accuracy (%) |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 158 | 21 | 0 | 17 | 2 | 2 | 0 | 200 | 79.00 |
| 1 | 1 | 1041 | 7 | 23 | 0 | 0 | 1 | 1073 | 97.02 |
| 2 | 0 | 134 | 162 | 3 | 0 | 0 | 4 | 303 | 53.47 |
| 3 | 5 | 86 | 3 | 327 | 3 | 6 | 0 | 430 | 76.05 |
| 4 | 7 | 0 | 0 | 3 | 22 | 3 | 0 | 35 | 62.86 |
| 5 | 6 | 15 | 3 | 44 | 3 | 26 | 0 | 97 | 26.80 |
| 6 | 0 | 140 | 13 | 7 | 0 | 0 | 449 | 609 | 73.73 |
| Total | 177 | 1437 | 188 | 424 | 30 | 37 | 454 | 2747 | |
| Producer's accuracy (%) | 89.27 | 72.44 | 86.17 | 77.12 | 73.33 | 70.27 | 98.90 | | |

Overall accuracy = 79.54%. Overall kappa statistics = 0.72.
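
The accuracy figures reported with Table 5 can be recomputed from the confusion matrix itself. The following Python sketch is an illustration only: it assumes the standard definitions of overall accuracy and Cohen's kappa, and the function name `accuracy_metrics` is hypothetical. The row-wise and column-wise ratios correspond to the values listed in the table as user's and producer's accuracy, respectively.

```python
def accuracy_metrics(matrix):
    """Return overall accuracy, Cohen's kappa, and row-/column-wise accuracies."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    diagonal = [matrix[i][i] for i in range(n)]

    overall = sum(diagonal) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    # None marks a category whose row or column is empty.
    row_acc = [d / r if r else None for d, r in zip(diagonal, row_totals)]
    col_acc = [d / c if c else None for d, c in zip(diagonal, col_totals)]
    return overall, kappa, row_acc, col_acc


# Confusion matrix of Table 5 (rows and columns: category codes 0..6).
TABLE_5 = [
    [158, 21, 0, 17, 2, 2, 0],
    [1, 1041, 7, 23, 0, 0, 1],
    [0, 134, 162, 3, 0, 0, 4],
    [5, 86, 3, 327, 3, 6, 0],
    [7, 0, 0, 3, 22, 3, 0],
    [6, 15, 3, 44, 3, 26, 0],
    [0, 140, 13, 7, 0, 0, 449],
]

overall, kappa, row_acc, col_acc = accuracy_metrics(TABLE_5)
print(round(100 * overall, 2), round(kappa, 2))
print([round(100 * a, 2) for a in row_acc])
print([round(100 * a, 2) for a in col_acc])
```

The same function can be applied to any square confusion matrix, for example to compare the two voting rules on identical OOB samples.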
4.6. Evaluation of feature importance
The image features used in this study are highly correlated, high-dimensional and heterogeneous; therefore, it is necessary to evaluate the contributions of these features to classification. Based on the trained RF, the feature importance analysis method (Gini descent) of Section 3.4.3 was used to evaluate the importance of the 307 features. As shown in Table 6, texture and geometric features contribute greatly to most categories, while spectral features perform poorly because they are easily confused. LS shantytowns are much larger than other buildings, and thus geometric features (e.g. area and perimeter) are significant. In addition, AU buildings are usually relatively small and easily distinguished by geometric features. Geometric features also play important roles in classifying MS apartments and ID and CM buildings because of the buildings' unique shapes and areas; however, HR apartments often have complex roofs and structures with distinctive texture, and thus can be identified by texture features. Composed of diverse sub-objects, AD buildings vary greatly in materials and styles; thus, they can be classified by combining geometric and texture features. In other words, geometric and texture features are more important to classification than spectral ones.

Table 6 The feature importance scores to seven categories (feature contribution rank, %).
| Semantic category | Spectrum | Geometry | Texture |
|---|---|---|---|
| LS shantytowns | 15.36 | 31.80 | 52.84 |
| MS apartments | 12.30 | 45.14 | 42.56 |
| HR apartments | 25.94 | 15.37 | 58.69 |
| AD buildings | 10.71 | 40.27 | 49.02 |
| CM buildings | 3.39 | 55.00 | 41.62 |
| ID buildings | 15.27 | 38.28 | 46.45 |
| AU buildings | 15.80 | 63.07 | 21.13 |

5. Discussion
5.1. The effectiveness of the presented approach
Since existing studies have mainly focused on completely different aspects and used different data and methods (Kohli et al., 2012; Graesser et al., 2012; Belgiu et al., 2014; Kuffer et al., 2014; Lu et al., 2014), it is hard to compare this study with them directly. Accordingly, the differences are summarized as follows. First, the seven categories adopted in this study are much finer than those of existing studies, which focused on three categories (Lu et al., 2014; Belgiu et al., 2014) or on special residential zones, such as formal/informal areas, unplanned settlements and slums (Graesser et al., 2012; Kohli et al., 2012; Kuffer et al., 2014). The semantic categories in existing studies are too broad, as they cannot effectively distinguish residential, industrial and commercial buildings and their subtypes. Therefore, our work is better suited than existing studies to environmental and social studies. Second, different strategies for computing image features are used. The two-level segmentation strategy can obtain single-image objects for diverse categories of buildings regardless of their sizes, structures and shapes. Thus, it can obtain object features (especially geometric ones) at the first segmentation level to characterize buildings as wholes, and the features of sub-objects at the second level to distinguish buildings with complex structures from those with simple ones. As a result, 307 features were used in our study, far more than the fewer than 50 features used in existing studies (Lu et al., 2014; Belgiu et al., 2014). Our strategy outperforms multi-scale image segmentation, which cannot produce single objects and appropriate features for classification. Therefore, geometric features can help to improve the recognition of some categories of buildings. For example, geometric features are important to LS shantytowns and AU buildings, and shapes and areas to MS apartments and ID and CM buildings. However, these features cannot be exploited in existing studies.
Therefore, although LiDAR data were not used in this study, the overall accuracy is still higher than that of existing studies in urban areas. Third, a large number of samples and a larger study area were used in this study: 2747 samples were chosen to train the RF classifier, 6084 buildings were classified, and the study area covers 38.7 km², which is large enough to illustrate the effectiveness of the presented approach. In conclusion, considering that our semantic categories are finer and the inter-category variations are larger than in existing studies (Lu et al., 2014; Belgiu et al., 2014), the overall accuracy of 79.54% is acceptable, and the presented approach is effective.
5.2. The scalability of the presented approach
The improved RF classifier in this study uses spectral, texture, shape and distribution features to classify buildings in urban areas, and the image features are derived from QB imagery and GIS data. If the categories of buildings are distinguishable by these four types of image features, the RF classifier will work effectively. Therefore, when QB imagery (or other VHR imagery with similar resolution, such as IKONOS and GeoEye) and GIS data (such as OpenStreetMap) are available in other places to define those image features, the presented approach needs no modification. However, if different data are used, some modifications are required. For example, if LiDAR data or WorldView-2 data are available, more image features can be incorporated, while if only an RGB fusion image is available, fewer image features can be defined. Therefore, if a different VHR image is used, only the image features need to be modified.
5.3. The limitations of the presented approach
The presented approach also has some limitations. First, GIS data, as prior restrictions for image segmentation, help to greatly reduce the errors caused by image segmentation. However,
locational discrepancies exist between GIS data and VHR images, which may affect the classification accuracy to a certain extent. Therefore, accurate geo-registration is required to reduce the discrepancies before the two data sets are used. Second, LiDAR data have proved useful in classifying buildings (Sohn and Dowman, 2007; Awrangjeb et al., 2010, 2013), but they were unavailable and thus not adopted in this study. Since LiDAR data provide height information on buildings, they can help characterize the structure of complex buildings and define new image features related to building heights, leading to a higher classification accuracy. Third, buildings that are in the same category and neighbor each other in space often tend to have similar shapes. Thus, shape classification or clustering techniques (Belongie et al., 2002) can be incorporated to improve classification accuracy. Fourth, a large number of unbiased samples are required for semantic classification of buildings. In this study, a semi-supervised approach combining spatial proximity and intra-cluster similarity was presented to improve the efficiency of choosing samples, but the labels of the samples were still identified manually by users. Accordingly, the efficiency of choosing unbiased samples can be further improved by identifying the labels of samples automatically (Jirka et al., 2014).
6. Conclusion and future work
This study presents a complete semantic category system, feature extraction, and improved classification approach for semantic classification of urban buildings. Four scientific tasks are resolved. First, GIS data were used to constrain the image segmentation so as to produce a single-image object for each building. Then, at the second level, each image object was further split into sub-objects to measure the internal heterogeneity of buildings.
Next, the ISODATA algorithm was used to group image objects into clusters by using the extracted features, and a large number of unbiased samples were chosen by considering spatial proximity and intra-cluster similarity. The chosen samples reflect the real distributions of buildings in the physical world. Subsequently, the voting-distribution ranked rule was presented to improve the RF classifier by reducing the classification error caused by imbalanced samples. Finally, a path-tracing approach was presented to evaluate the importance of features for classifying buildings. The classification results of the improved and original RF were compared, and the accuracy increased from 71.50% to 79.54%, demonstrating the effectiveness of the improved approach. Moreover, the results agree closely with human recognition. Furthermore, the approach can also be applied to other types of VHR images.
Nevertheless, there are still some limitations in this study. Although GIS data were introduced as prior restrictions for image segmentation to reduce segmentation errors, the location discrepancies between GIS and image data and the influences of shadows and occlusions on feature extraction can affect the classification accuracy to a certain extent. The finer categories of buildings obtained in this study should be helpful in estimating urban population and heating consumption, but this still needs to be demonstrated quantitatively. These issues need to be addressed in the future.
Acknowledgements
The work presented in this paper was supported by the National Natural Science Foundation of China (No. 41471315).
References
Awrangjeb, M., Ravanbakhsh, M., Fraser, C.S., 2010. Automatic detection of residential buildings using LIDAR data and multispectral imagery. ISPRS J. Photogramm. Remote Sens. 65, 457–467.
Awrangjeb, M., Zhang, C., Fraser, C.S., 2013. Automatic extraction of building roofs using LIDAR data and multispectral imagery. ISPRS J. Photogramm. Remote Sens. 83, 1–18.
Baatz, M., Schape, A., 2000. Multiresolution segmentation: An optimization approach for high quality multi-scale image segmentation. J. Photogr. Sci. Remote Sens. 58 (3–4), 12–23.
Belgiu, M., Tomljenovic, I., Lampoltshammer, et al., 2014. Ontology-based classification of building types detected from airborne laser scanning data. Remote Sens. 6, 1347–1366.
Belongie, S., Malik, J., Puzicha, J., 2002. Shape matching and object recognition using shape contexts. IEEE Trans. Pattern Anal. Mach. Intell. 24 (4), 509–522.
Breiman, L., 2000. Randomizing outputs to increase prediction accuracy. Mach. Learn. 40 (3), 229–242.
Cutler, D.R., Edwards Jr., T.C., Beard, K.H., et al., 2007. Random forests for classification in ecology. Ecology 88 (11), 2783–2792.
Calle, M., Urrea, V., 2011. Letter to the editor: stability of random forest importance measures. Brief Bioinform. 12, 86–89.
Chen, C., Liaw, A., Breiman, L., 2004. Using random forest to learn imbalanced data. Technical Report of Department of Statistics, UC, Berkeley.
Drăguţ, L., Csillik, O., Eisank, C., Tiede, D., 2014. Automated parameterisation for multi-scale image segmentation on multiple layers. ISPRS J. Photogramm. Remote Sens. 88, 119–127.
Efron, B., 1979. Bootstrap methods: another look at the jackknife. Ann. Stat. 7 (1), 1–26.
Graesser, J., Cheriyadat, A., Vatsavai, R.R., Chandola, V., Long, J., Bright, E., 2012. Image based characterization of formal and informal neighborhoods in an urban landscape. IEEE J. Select. Top. Appl. Earth Observat. Remote Sens. 5 (4), 1164–1176.
Geiß, C., Taubenböck, H., Wurm, M., Esch, T., Nast, M., Schillings, C., Blaschke, T., 2011. Remote sensing-based characterization of settlement structures for assessing local potential of district heat. Remote Sens. 3 (7), 1447–1471.
Honeycutt, C.E., Plotnick, R., 2008. Image analysis techniques and gray-level co-occurrence matrices (GLCM) for calculating bioturbation indices and characterizing biogenic sedimentary structures. Comput. Geosci. 34 (11), 1461–1472.
Huertas, A., Nevatia, R., 1988. Detecting buildings in aerial images. Comput. Vision, Graph., Image Process. 41 (2), 131–152.
Jirka, V., Feder, M., Pavlovicova, J., Oravec, M., 2014. Face recognition system with automatic training samples selection using self-organizing map. In: The 56th International Symposium on ELMAR, pp. 23–26.
Kim, T., Muller, J.-P., 1999. Development of a graph-based approach for building detection. Image Vis. Comput. 17 (1), 3–14.
Kohli, D., Sliuzas, R., Kerle, N., Stein, A., 2012. An ontology of slums for image-based classification. Comput. Environ. Urban Syst. 36, 154–163.
Kontschieder, P., Bulo, S.R., Pelillo, M., Bischof, H., 2014. Structured labels in random forests for semantic labelling and object detection. IEEE Trans. Pattern Anal. Mach. Intell. 36 (10), 2104–2116.
Kuffer, M., Barros, J., Sliuzas, R.V., 2014. The development of a morphological unplanned settlement index using very-high-resolution imagery. Comput. Environ. Urban Syst. 48, 138–152.
Lin, C., Nevatia, R., 1998. Building detection and description from a single intensity image. Comput. Vis. Image Und. 72 (2), 101–121.
Lu, D., Weng, Q., Li, G., 2006. Residential population estimation using a remote sensing derived impervious surface approach. Int. J. Remote Sens. 27, 3553–3570.
Lu, Z., Im, J., Rhee, J., Hodgson, M., 2014. Building type classification using spatial and landscape attributes derived from LIDAR remote sensing data. Landscape Urban Plan. 130, 134–148.
Memarsadeghi, N., Mount, D.M., Netanyahu, N.S., Moigne, J.Le., 2007. A fast implementation of the ISODATA clustering algorithm. Int. J. Comput. Geom. Ap. 17, 71–103.
Myint, S.W. et al., 2011. Per-pixel vs. object-based classification of urban land cover extraction using high spatial resolution imagery. Remote Sens. Environ. 115 (5), 1145–1161.
Nevatia, R., Babu, K.R., 1980. Linear feature extraction and description. Comput. Graph. Image Process. 13 (3), 257–269.
Nevatia, R., Lin, C., Huertas, A., 1997. A system for building detection from aerial images. In: Automatic Extraction of Man-Made Objects From Aerial and Space Images (II). Birkhäuser Basel, pp. 77–86.
Ok, A.O., 2013. Automated detection of buildings from single VHR multispectral images using shadow information and graph cuts. ISPRS J. Photogramm. Remote Sens. 86, 21–40.
Pal, M., 2005. Random forest classifier for remote sensing classification. Int. J. Remote Sens. 26 (1), 217–222.
Paul, S. et al., 2001. Census from heaven: an estimate of the global human population using night-time satellite imagery. Int. J. Remote Sens. 22 (16), 3061–3076.
Quinlan, J.R., 1986. Induction of decision trees. Mach. Learn. 1 (1), 81–106.
Sirmacek, B., Unsalan, C., 2009. Urban-area and building detection using SIFT keypoints and graph theory. IEEE Trans. Geosci. Remote Sens. 47, 1156–1167.
Sohn, G., Dowman, I., 2007. Data fusion of high-resolution satellite imagery and LiDAR data for automatic building extraction. ISPRS J. Photogramm. Remote Sens. 62, 43–63.
Trimble Germany, 2011. eCognition Developer 8.7 Reference Book. Trimble Germany, Munich, Germany, pp. 262–272.
Wu, S., Qiu, X., Wang, L., 2005. Population estimation methods in GIS and remote sensing: a review. GI Sci. Remote Sens. 42 (1), 80–96.