Infrared Physics and Technology 92 (2018) 203–218
Contents lists available at ScienceDirect
Infrared Physics & Technology journal homepage: www.elsevier.com/locate/infrared
Regular article
A top-down strategy for buildings extraction from complex urban scenes using airborne LiDAR point clouds
Ronggang Huang a,⁎, Bisheng Yang b, Fuxun Liang b, Wenxia Dai b, Jianping Li b, Mao Tian b, Wenxue Xu c
a State Key Laboratory of Geodesy and Earth’s Dynamics, Institute of Geodesy and Geophysics, Chinese Academy of Sciences, Wuhan, China
b State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, Wuhan, China
c The First Institute of Oceanography, SOA, Qingdao, China
ARTICLE INFO
Keywords: Airborne LiDAR point clouds; Top-down strategy; Building extraction; The object entity
ABSTRACT
In the field of airborne LiDAR point cloud processing, the extraction of buildings has been an active research area for many years. However, it is still difficult to distinguish buildings from vegetation and other objects by using the point entity or the segment entity in complex urban scenes. Therefore, this paper proposes a simple and novel method using a top-down strategy based on the object entity, which takes an object and its surrounding points as a unit for analysis. Firstly, ground points are separated from non-ground points, and non-ground points are segmented for the detection of smooth regions. Secondly, the top-level processing recognizes the building regions from smooth regions based on their geometric and penetrating features. Finally, the down-level processing is employed to remove non-building points around buildings from each building region using topological, geometric and penetrating features. The ISPRS benchmark dataset is selected to perform the experiment, the result is compared with the state-of-the-art methods, and a parameter sensitivity analysis is performed to evaluate the robustness of the proposed method. The evaluation result shows that the proposed method achieves good performance with respect to area-based and object-based quality, and that it is robust when parameters are within reasonable ranges. Furthermore, the proposed method is used to process one large-scale dataset of Toronto. The result shows that the proposed method achieves a completeness of 96.2%, a correctness of 96.8% and a quality of 93.2% at the object level.
⁎ Corresponding author. E-mail address: [email protected] (R. Huang).
https://doi.org/10.1016/j.infrared.2018.05.021
Received 24 January 2018; Received in revised form 23 May 2018; Accepted 23 May 2018; Available online 25 May 2018
1350-4495/© 2018 Elsevier B.V. All rights reserved.

1. Introduction
Airborne LiDAR has become a mature technology for capturing 3D building information for the generation of city models [1]. For this purpose, the extraction of building points is an important task. Although many relevant articles have been published in the last decade, it is still a challenge in complex urban scenes [2,3]. For example, it is difficult to separate buildings from vegetation while preserving small rooftop furniture.
According to the types of data sources that are used, the published methods can be divided into three categories: DSM (Digital Surface Model)-based methods, point cloud-based methods, and methods that fuse images with DSMs or point clouds [2]. DSM-based methods suffer from interpolation error and information loss [4–9]. Moreover, although methods that fuse images with range data can help to distinguish buildings from vegetation [10–15], their quality is degraded by several problems, such as the computation of the NDVI in shadows and the perspective distortion of an image in a zone with high-rise buildings. Therefore, many researchers extract buildings directly from point clouds [16], which better exploits the potential of geometric features and multiple returns.
Currently, point cloud-based methods can be categorized into point-based methods and segment-based methods [17]. The point-based methods use features based on the point entity (e.g., height-related, multi-return-related and waveform-related features), and a class label is assigned to each point [18–20]. However, point features can only describe the characteristics of a point and its neighbours, and they are easily disturbed by noise and observation errors, which readily leads to misclassification. To address this problem, many researchers began to study segment-based methods. The segment-based methods first group points with similar point features into segments, and features are then calculated from all points of each segment to distinguish buildings from vegetation and other objects [16,21–23].
Fig. 1. Framework of the proposed method.
Fig. 2. The process of building points extraction. (a) Raw point clouds. (b) The filtered result. (c) Non-ground segments. (d) Detected smooth regions. (e) Recognized building regions. (f) The result of extracting building points.
Fig. 3. Raw point clouds of Areas 1–3. (a) Area 1. (b) Area 2. (c) Area 3.
Fig. 4. Raw point clouds of Areas 4 and 5. (a) Area 4. (b) Area 5.
The segment features can be statistics of all point features within a segment, and they can also be new features, such as the segment’s size and shape. Generally, the segment features better describe the characteristics of the points and their contextual information. Some prior knowledge needs to be formulated for extracting buildings in the segment-based methods, including the following: (i) the building roof has good flatness while the surface of vegetation is rough; (ii) there are almost no penetrated points below roofs, whereas there may be many penetrated points in vegetation regions; and (iii) the building has a large size, and it is generally higher than the ground by a predefined value that is usually the average human height [24]. However, the reported segment-based methods take each segment as an individual unit, and the features based on a single segment represent only a part of an object rather than the entire object. It is therefore difficult to express the prior knowledge accurately, and the separation of buildings from vegetation or other objects may fail. For example, vegetation segments with large areas and low penetrating ratios may be mistaken for buildings, and many small pieces of rooftop furniture may be erroneously removed due to their small geometric sizes.
Therefore, we propose a novel method to extract building points from airborne LiDAR point clouds in a robust manner using a top-down strategy based on the object entity, which takes the points of an object and its surrounding points as a unit. Firstly, ground points are separated from non-ground points. Secondly, the smooth regions are detected from the non-ground points, and building regions are extracted from the smooth regions by using geometric and penetrating features based on the object entity in the top-level processing. Finally, building points are refined for each building region in the down-level processing. The main contributions are the following:
• improving the performance of distinguishing buildings from vegetation by analysing their differences based on the object entity in complex urban scenes, and
• better preserving building details (e.g., small roof furniture) by considering surface characteristics and the spatial relationship between details and the corresponding building.
The remainder of this paper is organized as follows. The proposed method is elaborated in Section 2. In Section 3, experimental studies are undertaken to evaluate the proposed method. The parameter sensitivity is discussed in Section 4. Finally, conclusions are drawn in Section 5.

Fig. 5. Raw point clouds of the large-scale data.

Table 1. Parameter settings of two geometric metrics.
Parameters | Areas 1–3 | Area 4 | Area 5 | Description
$t_a$ / m² | 5.0 | 50.0 | 50.0 | Geometric size of a building
$t_w$ / m | 2.0 | 5.0 | 10.0 | Geometric size of a building

Fig. 6. Evaluation results by ISPRS for Areas 1–5. The labels A–D mark the locations shown in detail in Fig. 7.

2. Buildings extraction using a top-down strategy
The proposed method is performed in a hierarchical process, as shown in Fig. 1. The point clouds are first classified into ground and non-ground points by the method of Yang et al. [25], and a reference DEM (Digital Elevation Model) is generated. Then, building points are extracted from the non-ground points within a detection scheme. The scheme consists of two steps: a top-level processing is used for recognizing building regions, and a down-level processing is used for separating building points from non-building points around the building.

2.1. Top-level processing: recognition of building regions
To improve the performance of distinguishing buildings from vegetation and other objects, the proposed method first detects building regions using the object entity. More specifically, the process includes two steps. The first step segments point clouds into planes and merges adjacent planes to obtain the smooth regions. The second step detects building regions from the smooth regions via several geometric and penetrating features.

2.1.1. Detecting smooth regions (SRs)
Generally, the roof of a building is a smooth surface, and it can be expressed by an ensemble of many planar faces. Therefore, the proposed method applies a planar segmentation method [26] to the non-ground points NGP, which results in a set of planar segments denoted as RPS. Because small segments are more likely to be non-buildings (e.g., vegetation and outliers), small segments are removed from the segmentation result. The remaining segments are denoted as PS, and the points of the removed small segments are denoted as P. Based on the segments PS, the topological relationship is analysed for each pair of segments. If some boundary points of the two segments lie within a set distance of each other (e.g., twice the point spacing), the two segments are determined to be adjacent, and they are merged. In this way, the segments PS are grouped into many clusters. However, a cluster may be only a part of an object. To guarantee the completeness of an object, each cluster is buffered with a distance. The choice of this distance is very important. If the distance is too small, many points cannot be added to the corresponding object. If the distance is too large, a false result may occur in the recognition of the building regions. Therefore, an adaptive distance is employed for each cluster. The buffer distance is iteratively increased by a fixed value (e.g., twice the point spacing), and the iteration is terminated when one of two conditions is no longer satisfied. These two conditions are the following: the distance
is smaller than a threshold, such as 3 m, and the area of the buffer zone is smaller than that of the corresponding cluster. The buffering result of a cluster is called a smooth region, which is regarded as an object entity, as shown in Fig. 2d.

Table 2. Evaluation results by ISPRS in five test areas. The average results of the two cities are given in the Avg. rows.
City | Dataset | Area-based CP /% | Area-based CR /% | Area-based Q /% | Object-based CP /% | Object-based CR /% | Object-based Q /% | RMS /m
Vaihingen | Area 1 | 91.8 | 98.6 | 90.6 | 91.9 | 100.0 | 91.9 | 0.9
Vaihingen | Area 2 | 87.3 | 99.0 | 86.5 | 85.7 | 100.0 | 85.7 | 0.7
Vaihingen | Area 3 | 90.2 | 98.1 | 88.7 | 85.7 | 98.0 | 84.2 | 0.6
Vaihingen | Avg. | 89.8 | 98.6 | 88.6 | 87.8 | 99.3 | 87.3 | 0.73
Toronto | Area 4 | 94.7 | 95.5 | 90.6 | 98.3 | 96.6 | 94.9 | 0.8
Toronto | Area 5 | 96.9 | 93.7 | 91.0 | 84.2 | 94.1 | 80.0 | 0.7
Toronto | Avg. | 95.8 | 94.7 | 90.8 | 91.3 | 95.4 | 87.5 | 0.75

2.1.2. Extracting building regions (BRs)
Among the smooth regions SRs, some regions may be non-buildings, such as large vegetation and cars. To remove these non-building regions (NBRs), the proposed method assumes that a building has a smooth surface with a large size and that a laser beam hardly penetrates the roof. Therefore, the proposed method employs geometric features and the penetrating capacity based on the object entity (i.e., a smooth region). For a smooth region $SR_i$, which includes segments $PS_{SR_i}$ and individual points $P_{SR_i}$, these features are described as follows.

• The area and width of a smooth region ($R_a$, $R_w$)
A building should have a large size for individual living. Two thresholds ($t_a$ and $t_w$) can be set to remove small objects, such as cars and some vegetation. In general, these two thresholds are tuned according to the urban scene: they may be small for a residential area, but they should be set to larger values in commercial zones. The two values are calculated by converting the points to a binary raster ($br(SR_i)$) with a resolution of 1.0 m. A pixel with the grey value 1 is valid, and a pixel with the grey value 0 is void. The area $R_a$ is the total number of valid pixels, as given by Eq. (1). The width is obtained by iterative morphological erosion, Eqs. (2) and (3), until all pixels are void, and $R_w$ is defined as twice the number of iterations.
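The area and width measures of a smooth region (Eqs. (1)–(3)) can be illustrated with a minimal sketch. The paper does not state the shape of the structuring element $I$, so a 3 × 3 square is assumed here, and pixels outside the raster are treated as void; the function names `area`, `erode` and `width` are hypothetical and not taken from the authors' implementation.

```python
from typing import List

Raster = List[List[int]]  # 1 = valid pixel, 0 = void pixel (1.0 m resolution)


def area(raster: Raster) -> int:
    """R_a in Eq. (1): the number of valid pixels."""
    return sum(sum(row) for row in raster)


def erode(raster: Raster) -> Raster:
    """One binary erosion (Eq. (2)).

    Assumption: 3x3 square structuring element; pixels outside the
    raster count as void, so border pixels are always eroded.
    """
    rows, cols = len(raster), len(raster[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            keep = True
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < rows and 0 <= cc < cols
                    if not inside or raster[rr][cc] == 0:
                        keep = False
            out[r][c] = 1 if keep else 0
    return out


def width(raster: Raster) -> int:
    """R_w: twice the number of erosions needed until all pixels are void."""
    iterations = 0
    current = raster
    while area(current) > 0:
        current = erode(current)
        iterations += 1
    return 2 * iterations


# A 5 x 7 pixel block of valid pixels, i.e. roughly a 5 m x 7 m footprint.
block = [[1] * 7 for _ in range(5)]
print(area(block), width(block))
```

For the 5 × 7 block, three erosions are needed before all pixels are void, so the sketch reports an area of 35 pixels and a width of 6; the width is an approximation of the short side because it is quantised to twice the iteration count.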
$$R_a = \mathrm{PNum}(br(SR_i)) \tag{1}$$
$$E^{(1)}(br) = br \ominus I \tag{2}$$
$$E^{(n)}(br) = E^{(1)}\big(E^{(n-1)}(br)\big) = (\cdots((br \ominus I) \ominus I) \cdots) \ominus I \tag{3}$$
where $\mathrm{PNum}()$ is a counter of the valid pixels, $\ominus$ is the symbol of morphological erosion, $I$ is the structuring element, $E^{(1)}(br)$ is the result of one morphological erosion, $E^{(n)}(br)$ is the result after iterative morphological erosion, and $n$ is the number of iterations.

• The height of a smooth region ($R_h$)
A building should be higher than a certain value for people going in and out. The proposed method assumes that, within a smooth region, some boundary points of the segmented points should be higher than the DEM by a threshold. The elevation difference ($t_h$) is specified by the average human height, such as 1.5 m. However, a building may be attached to a slope, and only a part of the boundary points may satisfy the condition. Therefore, $R_h$ is calculated by Eq. (4).
$$R_h = \mathrm{Qua1}(H_b) \tag{4}$$
where $H_b$ is the set of heights between the boundary points and the DEM, sorted from the highest to the lowest, and the function $\mathrm{Qua1}()$ returns the first quartile of $H_b$.

• The area ratio of segments ($R_s$)
The value is the total area of the segments $PS_{SR_i}$ divided by the area of the smooth region $SR_i$. The area values are also calculated by counting the valid pixels after converting $SR_i$ and $PS_{SR_i}$ to two binary raster images ($br(SR_i)$ and $br(PS_{SR_i})$), and the ratio is derived by Eq. (5). The value reflects the surface flatness of a smooth region, which is relatively high in a building region but relatively low in a vegetation region. Therefore, a threshold $t_s$ is defined. If the value is less than the threshold, $SR_i$ is labelled as a non-building.
$$R_s = \frac{\mathrm{PNum}(br(PS_{SR_i}))}{\mathrm{PNum}(br(SR_i))} \tag{5}$$

• The area ratio of extracted ground points ($R_g$)
The value is obtained by dividing the area of the ground points ($G$) by that of the smooth region $SR_i$. Generally, the value describes the penetrating capability, and it is relatively low in a building region. In contrast, the laser beam can penetrate vegetation and arrive at the ground, thus resulting in a relatively high ratio in a vegetation region. Therefore, a threshold $t_g$ can be employed. If the value is larger than the threshold, $SR_i$ is labelled as a non-building. In theory, the threshold should be close to zero. However, there are a few special situations in urban scenes where the laser beam can arrive at the ground below roofs. Additionally, there is often some vegetation around the building. Consequently, the threshold should be larger than zero, but not too large.
$$R_g = \frac{\mathrm{PNum}(br(G))}{\mathrm{PNum}(br(SR_i))} \tag{6}$$
where $br(G)$ is a binary image converted from the ground points within the smooth region $SR_i$.

• The ratio of segmented points ($R_{sp}$)
The value is calculated by dividing the number of points in $PS_{SR_i}$ by the number of all points in $SR_i$, as shown in Eq. (7). The value describes both the object’s surface flatness and the penetrating capability. Generally, the value is high in a building region and low in a vegetation region. Therefore, a threshold $t_{sp}$ can be employed. If the value is less than the threshold, $SR_i$ is labelled as a non-building.
$$R_{sp} = \frac{\mathrm{Num}(PS_{SR_i})}{\mathrm{Num}(SR_i)} \tag{7}$$
where $\mathrm{Num}()$ is a counter of the number of points.

Based on the above six features at the level of the entire smooth region, non-building smooth regions are removed by Eq. (8), and the remaining smooth regions are classified as building regions. Fig. 2e is the result of recognizing building regions.
$$SR_i \in \begin{cases} NBRs & \text{if } R_a < t_a \,\lor\, R_w < t_w \,\lor\, R_h < t_h \,\lor\, R_s < t_s \,\lor\, R_g > t_g \,\lor\, R_{sp} < t_{sp} \\ BRs & \text{otherwise} \end{cases} \tag{8}$$

Fig. 7. The details of the evaluation results in several cases. (a) A false positive at location A of Fig. 6. (b) An example of preserving the annex structure at location B of Fig. 6. (c) Examples of preserving rooftop furniture and removing rooftop vegetation points at location C of Fig. 6. (d) A false negative at location D of Fig. 6 because of missing data.

Fig. 8. The performance comparison of the proposed method and the state-of-the-art methods within the Vaihingen datasets (Areas 1–3). (a) The comparison of the object-based quality. (b) The comparison of the area-based quality. (c) The comparison of the RMS.

2.2. Down-level processing: extraction of building points
Although the building regions have been extracted, there may be some non-building points (e.g., vegetation and vehicles) around a building. Generally, it can be assumed that these objects consist of small segments and individual points and that they are near the border of the building. Therefore, the proposed method detects small segments using an area threshold ($t_{sa}$), together with individual points, as candidates for non-building objects. Then, the detected points are treated as unclassified and grouped into different clusters by a region growing method with a two-dimensional distance (i.e., twice the point spacing). Each cluster is then classified by the topological, geometric and penetrating rules listed below.
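The grouping of candidate points in Section 2.2 can be sketched as a plain distance-based region growing in the horizontal plane. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `region_grow_2d` is hypothetical, points are given as (x, y) pairs, and a brute-force neighbour search is used where a real pipeline would normally rely on a spatial index.

```python
from collections import deque
from math import hypot
from typing import List, Sequence, Tuple


def region_grow_2d(points: Sequence[Tuple[float, float]],
                   max_dist: float) -> List[List[int]]:
    """Group point indices into clusters.

    Two points end up in the same cluster if they are linked by a chain
    of points whose consecutive 2D distances are all <= max_dist.
    """
    unvisited = set(range(len(points)))
    clusters: List[List[int]] = []
    while unvisited:
        seed = min(unvisited)          # deterministic seed choice
        unvisited.remove(seed)
        queue = deque([seed])
        cluster = [seed]
        while queue:
            i = queue.popleft()
            xi, yi = points[i]
            # Brute-force neighbour search (assumption: small inputs).
            near = [j for j in unvisited
                    if hypot(points[j][0] - xi, points[j][1] - yi) <= max_dist]
            for j in near:
                unvisited.remove(j)
                queue.append(j)
                cluster.append(j)
        clusters.append(sorted(cluster))
    return clusters


# Hypothetical candidates with a point spacing of 0.3 m, so the growing
# distance is twice the spacing, as stated in the text.
point_spacing = 0.3
candidates = [(0.0, 0.0), (0.5, 0.0), (1.0, 0.0), (5.0, 5.0), (5.4, 5.0)]
print(region_grow_2d(candidates, 2 * point_spacing))
```

With these hypothetical coordinates the first three points form one chain and the last two form another, so two clusters are returned; each cluster would then be passed to the rules that follow.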
• Topological relationship rule
Firstly, the boundaries of a cluster and the corresponding building region are extracted by the α-shape algorithm [27]. Then, it is determined whether the cluster is near the border of the building. If the boundary of a cluster lies within twice the point spacing of the boundary of the building region, the cluster is regarded as near the border of the building. Otherwise, the cluster is regarded as inside the building. If a cluster is inside the building, the segments of the cluster are directly classified as building. If a cluster is near the border of a building, it is further processed by the subsequent geometric and penetrating rules.

• Geometric and penetrating rules
Several features (i.e., $R_s$, $R_g$ and $R_{sp}$) of the remaining clusters are calculated. Then, each cluster is labelled as building or non-building according to the corresponding formulas in Section 2.1.2. Fig. 2f is the final result of extracting the building points.

Fig. 9. The performance comparison of the proposed method and the state-of-the-art methods within the Toronto datasets (Areas 4–5). (a) The comparison of the object-based quality. (b) The comparison of the area-based quality. (c) The comparison of the RMS.

Fig. 10. The process of detecting building points from the large-scale dataset. (a) The filtering result. (b) The result of extracting building points.

3. Experiment results and analysis
To validate the performance of the proposed method, two datasets are selected to perform the experiments. Firstly, the ISPRS benchmark dataset¹ is selected for quantitatively and qualitatively evaluating the proposed method [2]. Its test areas are from two types of urban scenes. The first three test areas (i.e., Areas 1–3) are located in residential regions of Vaihingen, Germany, and the last two areas (i.e., Areas 4 and 5) are within commercial zones of Toronto, Canada. The raw point clouds of these areas used in our experiments are shown in Figs. 3 and 4. In Areas 1–3, the mean point density is 6.7 points/m², while the mean point density is 4.0 points/m² in the region covered by only one strip. In Areas 4 and 5, the mean point density is 6.0 points/m². In these areas, there are many challenges for extracting building points, including diverse roof types, complex building shapes, rooftop furniture, trees near/inside buildings and high-rise buildings. To further validate the effectiveness of the proposed method, another large-scale dataset is selected from a commercial zone in the city of Toronto, as shown in Fig. 5. This dataset covers 1.45 km², and its mean point density is 6.0 points/m². In addition, this large-scale area is covered by high-rise and multi-story buildings with complex rooftop structures, and there is much vegetation around the buildings.
¹ http://www2.isprs.org/commissions/comm3/wg4/tests.html.
To quantitatively evaluate the result of the proposed method, several quantitative indicators are adopted [28]: the completeness (CP), correctness (CR) and quality (Q), which are computed at the area or object level, and the root mean square (RMS) of the errors between the extracted building boundaries and the reference boundaries.
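The three indicators can be illustrated with a short sketch. It assumes the commonly used definitions in terms of true positives (TP), false positives (FP) and false negatives (FN), namely CP = TP/(TP+FN), CR = TP/(TP+FP) and Q = TP/(TP+FP+FN); the text above does not spell these formulas out, and the function names are hypothetical.

```python
def completeness(tp: float, fn: float) -> float:
    """CP = TP / (TP + FN); assumes TP + FN > 0."""
    return tp / (tp + fn)


def correctness(tp: float, fp: float) -> float:
    """CR = TP / (TP + FP); assumes TP + FP > 0."""
    return tp / (tp + fp)


def quality(tp: float, fp: float, fn: float) -> float:
    """Q = TP / (TP + FP + FN); assumes the denominator is positive."""
    return tp / (tp + fp + fn)


def quality_from_rates(cp: float, cr: float) -> float:
    """Q expressed through CP and CR.

    Under the definitions above, 1/Q = 1/CP + 1/CR - 1, which allows a
    consistency check when only the two rates are reported.
    """
    return 1.0 / (1.0 / cp + 1.0 / cr - 1.0)


# Consistency check with the object-level rates reported for the
# large-scale Toronto dataset (CP = 96.2%, CR = 96.8%).
print(round(100 * quality_from_rates(0.962, 0.968), 1))
```

Under these assumed definitions, the reported completeness and correctness of the large-scale test imply a quality of about 93.2%, which agrees with the value stated in the abstract.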
3.1. Tests using the ISPRS benchmark dataset
Five benchmark areas were processed by the proposed method. The main parameters are listed in Table 1. In these five areas, $t_s$, $t_g$ and $t_{sp}$ are uniformly set to 0.5, and $t_{sa}$ is 5.0 m². However, two parameters (i.e., $t_a$ and $t_w$) were tuned according to the urban scenes. The results were submitted to ISPRS for evaluation. The evaluation results are shown in Fig. 6 and Table 2. As seen from Table 2, the average values of the area-based quality are 88.6% and 90.8% in the Vaihingen and Toronto datasets, respectively, and the average values of the object-based quality are 87.3% and 87.5%, respectively. This indicates that the proposed method has good performance for extracting buildings from two different types of urban scenes. The details of the analysis are described as follows:
• The average values of area-based correctness are 98.6% and 94.7% in the Vaihingen and Toronto datasets, and the average values of the object-based correctness are 99.3% and 95.4%, respectively. The high values show that the proposed method can robustly distinguish buildings from vegetation or other objects in two types of urban scenes by analysing their differences based on the object entity. There are only a few false positives at the object level, and these false positives are very similar to buildings, as shown in Fig. 7a.
• The average values of area-based completeness are 89.8% and 95.8% in the Vaihingen and Toronto datasets, and the average values of the object-based completeness are 87.8% and 91.3%, respectively. This shows that the proposed method can extract the majority of buildings, as shown in the yellow areas of Fig. 6. However, because of small building size, low point density and missing data, some building points may be lost, as shown in Fig. 7d and the blue pixels in Fig. 6.
• Moreover, the proposed method preserves the annex structures (e.g., door eaves) and rooftop furniture well, as shown in Fig. 7b and c. In addition, most of the vegetation points on the rooftop are removed by the down-level processing, as shown in Fig. 7c.
Furthermore, the performance of the proposed method was compared with other methods from the website of the ISPRS evaluation results². Many other methods take part in the evaluation of the Vaihingen datasets, but only a few methods take part in the evaluation of the Toronto datasets. The compared results of Vaihingen and Toronto are collected from 35 and 9 methods, respectively. The comparison results are shown in Figs. 8 and 9. They show that the proposed method is the best in terms of the average object-based quality within the Vaihingen datasets. ZJU and LJU2 are the best in terms of the average area-based quality, but their values are larger than that of the proposed method by only 1.1%. This is because orthophotos were provided by ISPRS and the buildings in Vaihingen are not very tall, so ZJU and LJU2, which combine the DSM and the orthophoto, could obtain the best results. Regarding the RMS, the proposed method is ranked fourth, and its value is larger than the best value by 0.13 m. For the Toronto datasets, the proposed method is the best in terms of the average values of the area-based quality, the object-based quality and the RMS. Moreover, there are very few methods that combine images with DSMs or point clouds within the evaluation list of the Toronto datasets, and none of their results is good with respect to the object-based and area-based qualities. This suggests that these methods are not suitable for regions with high-rise buildings in commercial zones. In other words, compared with these state-of-the-art methods, the proposed method has good performance in various complex urban scenes.
² http://www2.isprs.org/commissions/comm3/wg4/results.html.

Fig. 11. The first special case for describing the details of detecting a building with annexes and vegetation on the rooftop. (a) Raw point clouds. (b) The corresponding 3D model with textures from Google Earth. (c) The result of building points detection. (d) The local view of preserving the annex structures (e.g., the stairway points) rendered by red dots. (e) The local view of removing tree points on the rooftop rendered by green dots.

Fig. 12. The second special case for describing the details of detecting a building with small facets. (a) Raw point clouds. (b) The corresponding 3D model with textures from Google Earth. (c) The result of building points detection. (d) The local view of small roof facets of the corresponding 3D model from Google Earth. (e) The local view of preserving small facets in the building detection result by the proposed method. (f) The local view of the result of the segment-based method, where some small facets are erroneously removed.

3.2. The test using a large-scale dataset
Firstly, the procedure of the proposed method was executed to extract the building points from the large-scale dataset. The process of building points detection is illustrated in Fig. 10. Fig. 10a is the result of extracting the ground points, and Fig. 10b is the result of building points detection. To better show the details of building points detection in a local view, we selected three special cases, as shown in Figs. 11–13.
• In the first case of Fig. 11, there are two outdoor stairways attached to the wall, many trees near the building, and several trees located on the rooftop. According to the building points detection result of Fig. 11c and the two enlarged views of Fig. 11d and e, the proposed method can preserve small annex structures (e.g., stairways and door eaves) and remove vegetation points on the rooftop.
• In the second case, the main building is composed of many small facets in different stories, as shown in Fig. 12a and b. Fig. 12c is the result of the proposed method. Fig. 12d and e are the enlarged views of the rectangle in Fig. 12c. The enlarged views show that the proposed method has good performance in the extraction of small facets. In contrast, Fig. 12f is the result of a simple segment-based method. The simple segment-based method takes each facet/segment as an individual unit (i.e., the segment entity), and then it extracts building segments by using the geometric and penetrating features (i.e., area, width and ground points below the roof segment). Therefore, many small facets are erroneously removed by the simple segment-based method, as shown in Fig. 12f.
• In the third case, the building has several small facets, and there are
many extracted ground points below the roofs, as shown in Fig. 13. Because the ratio of the ground points to the entire building is not too high, the proposed method can successfully extract all building points. However, the ratio between ground points and a single segment is very high in many facets, and this causes several facets to be erroneously removed by the simple segment-based method, as shown in Fig. 13d. Besides, small facets are also erroneously removed by the area/width threshold in Fig. 13d.
In other words, the proposed method can better describe the inherent characteristics of a building and other objects (e.g., vegetation) by the object entity, and can robustly extract buildings. To further analyse and evaluate the detection result, the outlines of extracted buildings were superposed on an existing map, which was taken as the reference. The reference map was captured from the website³. Then, the extracted outlines were checked one by one to determine whether each is a true positive (TP) or a false positive (FP), and to count how many buildings are lost in the detection result (FN). Due to the difference between the acquisition epochs of the LiDAR point clouds and the reference map, the evaluation results of some buildings may be uncertain. Therefore, we manually rechecked the ambiguous buildings based on images from the same epoch as the acquisition of the LiDAR point clouds. The black dots of Fig. 14 were caused by inconsistent references in the map and images, and they were not considered in the quantitative evaluation. Fig. 14 and Table 3 are the final evaluation results. The completeness and correctness are larger than 96.0%, and the quality is 93.2%. This shows that the proposed method has good performance in buildings detection and that there are only a few FPs and FNs. The FPs mainly consist of a few large building-like objects. The FNs are small buildings with few points and small areas.
³ http://map.toronto.ca/maps/map.jsp?app=TorontoMaps_v2.

Fig. 13. The third special case for describing the details of detecting a building with many ground points below the rooftop. (a) The raw point cloud. (b) The corresponding 3D model with textures from Google Earth. (c) The result of building point detection based on the proposed method. (d) The result of the segment-based method in which many roof segments are erroneously removed.

4. Discussion
Generally, parameter tuning is beneficial for achieving the best result. However, the parameter sensitivity is important for the feasibility and robustness of the proposed method. Here, we performed experiments using many parameter configurations and analysed their influence on the performance of the proposed method. In this paper, we choose five parameters in the key process of distinguishing a building region from vegetation and other objects based on the object entity. These five parameters are the area ($t_a$), width ($t_w$), area ratio of segments ($t_s$), area ratio of extracted ground points ($t_g$) and ratio of segmented points ($t_{sp}$). In addition, the benchmark dataset (Area 3) is taken as the case in this discussion. In the parameter configuration, $t_a$ ranges from 5 m² to 30 m² with an interval of 5 m². $t_w$ is defined as the square root of the corresponding $t_a$. $t_s$ and $t_{sp}$ range from 0.4 to 0.9 with an interval of 0.1. $t_g$ ranges from 0.3 to 0.6 with an interval of 0.1. In total, there are 6 × 6 × 6 × 4 = 864 parameter configurations. The raw point clouds of Area 3 were processed by the proposed method based on each parameter configuration. The area-based quality was calculated and is illustrated in Fig. 15. It ranges from 72.5% to 88.7%, and all values are grouped into three clusters. There are two large inflection points where the change of the quality is larger than 5.0%. The split lines of the two inflection points are at 141 and 281 on the axis of the parameter configuration index, respectively. At the first split line, the quality jumps from 73.0% to 79.7%. At the second split line, the quality increases from 80.2% to 87.4%. The large inflection is mainly caused by the misclassification of some building regions when an unsuitable parameter configuration is set, such as an overly large value for $t_{sp}$ (e.g., 0.9). When a large building region is misclassified, the area-based completeness of the building is significantly decreased, and the area-based quality also decreases suddenly. Statistically, configurations with a quality larger than 87.4% account for 67.5% of the parameter configurations. This shows that the proposed method obtains good results for the majority of parameter configurations.
More specifically, we analyse the influence of each parameter on the quality, as shown in Fig. 16. Because the width threshold changes with the area threshold, the influence of the width threshold is not analysed separately. As seen from Fig. 16b and c, the changes in $t_s$ and $t_g$ have almost no influence on the quality. Increasing the area threshold also does not affect the quality until the area threshold reaches 30 m², as shown in Fig. 16a. When the area threshold is 30 m², a building region is removed, and the quality decreases to 87.6%. In comparison with the other parameters, the parameter $t_{sp}$ has a relatively large influence on the quality, as shown in Fig. 16d. When $t_{sp}$ is set between 0.4 and 0.7, the change in the quality is small, and the difference between the minimum and the maximum values is only 0.8%. However, a significant reduction of the quality is produced when the parameter is set to 0.8 or 0.9. This shows that $R_{sp}$ is relatively low in some building regions. The main reasons may be the low quality of the segmentation of some building points and the presence of some vegetation around the building. In other words, the proposed method is relatively robust when the parameters are set to values within reasonable ranges.
5. Conclusion
In this study, we propose a method to extract buildings solely from airborne LiDAR point clouds using a top-down strategy. First, the ground and non-ground points are separated. Second, top-level processing recognizes building regions via surface characteristics and penetrating capacities, which are calculated on the object entity rather than on the point or segment entities. Finally, non-building points are removed from the building regions by down-level processing. To verify the validity and robustness of the proposed method, five benchmark datasets covering two types of urban scenes, provided by ISPRS, were selected for evaluation and performance comparison, together with one large-scale dataset. The results demonstrate that the proposed method robustly extracts buildings with details (e.g., door eaves and roof furniture) and performs well in distinguishing buildings from vegetation and other objects in various urban scenes. Additionally, a parameter sensitivity analysis was performed, which shows that the proposed method obtains robust results when the parameters are set to reasonable values. However, a few buildings may be erroneously removed when the point density is too low or the building is too small. In future work, spatial reasoning will be used to address these problems; for example, prior knowledge of roof shapes and walls could be added to extract building points.
Fig. 14. The evaluation result of building detection based on the 2D map.
Table 3 Performance evaluation of the proposed method (TP: true positives; FP: false positives; FN: false negatives; CP: completeness; CR: correctness; Q: quality).
TP     FP     FN     CP/%    CR/%    Q/%
301    10     12     96.2    96.8    93.2
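The object-level figures reported in Table 3 can be reproduced from the TP, FP and FN counts. The sketch below assumes the standard definitions of completeness, correctness and quality; the function name is illustrative and not part of the paper.

```python
def detection_metrics(tp: int, fp: int, fn: int) -> dict:
    """Object-level completeness, correctness and quality, in percent.

    Assumes the standard definitions:
      completeness = TP / (TP + FN)
      correctness  = TP / (TP + FP)
      quality      = TP / (TP + FP + FN)
    """
    completeness = tp / (tp + fn)
    correctness = tp / (tp + fp)
    quality = tp / (tp + fp + fn)
    return {
        "completeness": round(100 * completeness, 1),
        "correctness": round(100 * correctness, 1),
        "quality": round(100 * quality, 1),
    }


# Counts from Table 3 (Toronto dataset, object level).
metrics = detection_metrics(tp=301, fp=10, fn=12)
print(metrics)  # {'completeness': 96.2, 'correctness': 96.8, 'quality': 93.2}
```

Because quality penalizes both false positives and false negatives in a single denominator, it is always the lowest of the three values.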
Fig. 15. The distribution graph of the quality of all parameter configurations, where the quality is sorted from the smallest to the largest.
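One way to locate the inflection points on a sorted quality curve such as the one in Fig. 15 is to scan for jumps between neighbouring values that exceed 5.0 percentage points. The sketch below is an assumption about how this could be done, not the authors' procedure. It uses only the six quality values quoted in the text, so the indices it reports refer to this short list and not to the configuration indices 141 and 281.

```python
def find_inflections(sorted_quality, min_jump=5.0):
    """Return (index, before, after) for each jump larger than min_jump.

    sorted_quality must be in ascending order; index is the position of the
    value that follows the jump.
    """
    jumps = []
    for i in range(1, len(sorted_quality)):
        before, after = sorted_quality[i - 1], sorted_quality[i]
        if after - before > min_jump:
            jumps.append((i, before, after))
    return jumps


# Only the values quoted in the text: the overall minimum and maximum, and the
# values on either side of the two split lines. Not the full 864-value curve.
quoted_quality = [72.5, 73.0, 79.7, 80.2, 87.4, 88.7]

jumps = find_inflections(quoted_quality)
for index, before, after in jumps:
    print(index, before, after)
```

The two detected jumps split the values into three groups, matching the three clusters described in the sensitivity analysis.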
Fig. 16. The relationship graphs between the four parameters and the corresponding qualities. When a parameter is analysed, the other parameters are set according to Table 1. (a) The influence of ta on the quality. (b) The influence of ts on the quality. (c) The influence of tg on the quality. (d) The influence of tsp on the quality.
6. Conflict of interest
All authors declare no conflict of interest.
Acknowledgments
This study was jointly supported by the NSFC project (No. 41531177), the China Postdoctoral Science Foundation (No. 2017M622553), the National Key Research and Development Program of China (No. 2016YFF0103501) and the Basic Scientific Fund for National Public Research Institutes of China (No. 2015P13).