ISPRS Journal of Photogrammetry and Remote Sensing xxx (2018) xxx–xxx
Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis
Erzhuo Che *, Michael J. Olsen
Oregon State University, Corvallis, OR 97331, United States
Article info
Article history: Received 6 September 2017; Received in revised form 16 December 2017; Accepted 25 January 2018; Available online xxxx
Keywords: Terrestrial Laser Scanning; Lidar; Segmentation; Edge detection; Region growing; Feature extraction
Abstract
Point cloud segmentation groups points with similar attributes with respect to geometric, colorimetric, radiometric, and/or other information to support Terrestrial Laser Scanning (TLS) data processing such as feature extraction, classification, modeling, analysis, and so forth. In this paper we propose a segmentation method consisting of two main steps. First, a novel feature extraction approach, NORmal VAriation ANAlysis (Norvana), eliminates some noise points and extracts edge points without requiring a general (and error-prone) normal estimation at each point. Second, region growing groups the points on a smooth surface to obtain the segmentation result. For efficiency, both steps exploit the angular grid structure storing each TLS scan, which is often neglected by segmentation algorithms developed primarily for unorganized point clouds. Additionally, unlike existing methods exploiting the angular grid structure, which can only be applied to a single scan, the proposed method is able to segment multiple registered scans simultaneously. The algorithm also takes advantage of parallel programming for efficiency. In the experiment, both qualitative and quantitative evaluations are performed on two datasets while the robustness and efficiency of the proposed method are analyzed and discussed. © 2018 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
1. Introduction
Terrestrial Laser Scanning (TLS), an effective and efficient 3D data acquisition approach utilizing Light Detection and Ranging (lidar), has been widely used in a variety of applications such as topographic mapping, engineering surveying, forest management, industrial facilities, cultural heritage, geohazard analysis, and so forth. TLS datasets can contain many millions or even billions of discrete points; hence, it can be very difficult, both computationally and practically, to process or analyze each point individually. The point cloud needs to be discretized into simpler features or shapes based on common attributes to support further processing and analysis in these applications. This process, known as segmentation, groups points with similar attributes with respect to geometric, colorimetric, radiometric, and/or other information. The grouped points can then be used for feature extraction, classification, modeling, analysis, and so on. Many segmentation approaches have been developed and tested on Airborne Laser Scanning (ALS) data. While some techniques can be applied or easily adapted to TLS data (Grilli et al.,
2017), TLS has notable differences from ALS and Mobile Laser Scanning (MLS) in characteristics such as view angles, spatial resolution (and its variability), and applicability for an area of interest. An object is usually scanned by TLS from several surrounding scan positions, while ALS scans an object from above. Although MLS acquires data from the side of an object, similar to TLS, MLS has less flexibility because it requires accessibility for a vehicle carrying the MLS platform. In addition, MLS and ALS are designed to cover a large area in a short period of time, while TLS usually focuses on a smaller area, enabling more detail to be captured. The spatial resolution (point density) of TLS data also varies significantly across the scene, by orders of magnitude, due to the fixed (static) setup and scan pattern. Thus, with respect to data size and geometric complexity, segmentation of TLS data presents different challenges compared with ALS and MLS data. Existing segmentation approaches specific to TLS can be categorized into point cloud-based and image-based approaches. In the following sections, these approaches are summarized.
1.1. Point cloud-based segmentation
Point cloud-based approaches segment the data primarily using 3D geometric characteristics. Most of these methods group points through either region growing or clustering techniques. The
https://doi.org/10.1016/j.isprsjprs.2018.01.019 0924-2716/© 2018 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
Please cite this article in press as: Che, E., Olsen, M.J. Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis. ISPRS J. Photogram. Remote Sensing (2018), https://doi.org/10.1016/j.isprsjprs.2018.01.019
primary difference between them is that the criteria for region growing focus on the relationships between points in a neighborhood, whereas a clustering procedure relies on the attributes computed at each individual point.
1.1.1. Region growing
In general, for each segment, region growing initiates from one or more seed points, either manually selected or meeting a specific criterion. Then, the growing process groups the points in a neighborhood iteratively through additional criteria to determine whether to continue growing or to break. Rabbani et al. (2006) present a region growing-based method for the segmentation of smooth surfaces. A threshold of maximum residuals of plane fitting is provided to automatically select seed points. The growing process is then performed with a criterion comparing the normal vector between the current point and its neighbor. To cope with both planar and pole-like objects, Habib and Lin (2016) propose a region-growing, multi-class, simultaneous segmentation procedure which initiates from optimal seed regions selected based on the residuals of fitting a planar or pole-like feature. Belton and Lichti (2006) discuss techniques to classify a point as a surface, boundary, or edge point and perform region growing to segment the point cloud. The classification ensures that boundary and edge points are not selected as seed points. Similarly, Nurunnabi et al. (2012) implement a modified Principal Component Analysis (PCA) to perform a more robust normal estimation and feature extraction for the subsequent segmentation. In addition to the geometry of the objects, such as surface roughness and curvature, Dimitrov and Golparvar-Fard (2015) present a multi-scale feature detection approach considering point density, which changes dramatically in TLS data due to the scan pattern. Another approach to overcome the challenge of variable point density is resampling. For example, Vo et al.
(2015) generate an adaptive octree to resample the data into voxels such that a region growing-based coarse segmentation can be performed first.
1.1.2. Clustering
Some segmentation methods based on clustering techniques group the points using one or more geometric attributes computed for each individual point. The attributes can be an n-dimensional feature vector that can distinguish points belonging to different classes. For example, Vosselman et al. (2004) implement a 3D Hough transform to extract parameterized shapes such as planes, cylinders, and spheres. Similarly, Maalek et al. (2015) first extract planar features and linear features from the point cloud using PCA and then cluster the planar feature points from the plane parameters. Lari et al. (2011) utilize point density to classify the point cloud into planar and rough surfaces, where the normal of the best-fitting plane at each point is used for computing the attributes. Kim et al. (2016) propose a segmentation of planar surfaces using the magnitude of the normal position vector for a cylindrical neighborhood, which uses two sets of best-fitting plane parameters against two origins as attributes. To segment and classify a more complex natural scene, Brodu and Lague (2012) present a multi-scale dimensionality attribute for segmentation and classification, where PCA is primarily used to describe the local point distribution at different scales. Some methods utilize pattern recognition or machine learning clustering approaches for point cloud segmentation. Biosca and Lerma (2008) present a clustering method for segmentation where, for each point, a plane best fit to its neighbors is used for computing the feature vector, including the height difference, normal direction, and projected distance against the origin. Next, Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) are utilized to group the points into segments. Yang and Dong (2013) classify the point cloud according to geometric features using support
vector machines (SVMs). The classification result is segmented by defining a set of rules and can be further refined by merging the segments based on topological connectivity. Weinmann et al. (2015) propose a framework for semantic point cloud interpretation consisting of optimal neighborhood selection, feature extraction, feature selection, and supervised classification. For the supervised classification, various machine learning methods are discussed and tested in the experiment. With a similar workflow, Hackel et al. (2016) propose a more efficient method of semantic classification and demonstrate its effectiveness with both TLS and MLS data. Some other methods resample the data into 3D voxels to organize the point cloud and simplify the computation. Aijazi et al. (2013) utilize the position, normal, color, and intensity information to assign a feature vector to each voxel, which is further used in clustering and classification. Li et al. (2017a) separate ground and non-ground voxels, cluster them based on local point density, and refine them through a merging and re-assignment process. Su et al. (2016) present a segmentation algorithm for industrial sites where an octree-based split is performed based on a graph theory analysis. The criteria of proximity, orientation, and curvature are used for a merging process. Xu et al. (2016) propose a hierarchical segmentation method that first divides the point cloud into patches and then merges over-segmented patches by setting a grouping criterion at different levels. Similar to the concept of voxels, Li et al. (2017b) utilize Normal Distribution Transform (NDT) cells to resample the data and segment the data based on RANSAC.
1.2. Image-based segmentation
An image-based segmentation method often follows three steps: (1) projecting or structuring the TLS data into a 2D image (e.g., exploiting the angular grid structure used in acquiring a TLS scan) including single or multiple bands; (2) performing an image segmentation; and (3) mapping the segments back to the 3D point cloud data. There are two major advantages associated with image-based segmentation methods (Mahmoudabadi et al., 2016). First, processing the data in 2D is often more computationally efficient than within 3D space. Second, a substantial number of available techniques for image processing (e.g., image segmentation and edge detection) can potentially be applied to the 2D image derived from TLS data. Gorte (2007) presents a segmentation algorithm using a three-band image consisting of range (defined as the projected distance along the normal direction to the scan origin), horizontal angle, and vertical angle. Then, by setting a criterion based on the range image gradients, image segmentation is performed. The results of the experiment show that it works properly on vertical planes but fails on horizontal planes. Zhou et al. (2016) improve this approach by fine-tuning the computation of the plane parameters instead of using a coarse estimation. Barnea and Filin (2013) utilize the mean-shift algorithm to segment an image with three bands based on range, normal, and color, respectively, followed by a refinement integrating the results of the different bands. Mahmoudabadi et al. (2013) implement Simple Linear Iterative Clustering for segmenting the point cloud on a panoramic image. A Support Vector Machine (SVM) is utilized to categorize the segments into multiple classes. In addition to the range, normal, and color information used in the aforementioned methods, Mahmoudabadi et al. (2016) associate more information, including intensity and incidence angle, in the segmentation process.
High Dynamic Range (HDR) imaging is utilized to minimize color inconsistencies across multiple images due to variable lighting conditions. A panoramic image map with a series of bands is generated with all of the characteristics and then segmented to
identify and extract edges. All of these edges from the input metrics are integrated to obtain the final segmentation results. Weinmann and Jutzi (2015) further exploit this scan grid to derive various metrics for evaluating the quality of each point in a TLS scan, which can potentially improve the existing image-based methods.
1.3. Challenges in segmentation
Although many existing point cloud segmentation methods have been demonstrated to effectively segment TLS data, there are still significant limitations and challenges that warrant further research: (1) Many existing segmentation methods require normal estimation before analyzing and grouping the data. Nevertheless, the estimation of normals is highly dependent on the parameters selected for defining the neighbors (e.g., radius in a spherical neighborhood or number of neighbor points in k-NN). Despite a number of approaches to adaptively define neighbors, normal estimation can still be unreliable at edges or rough surfaces given that the normal is undefined at an edge by definition. (2) Efficiency is critical for processing lidar data due to the immense volume of data. Some methods down-sample the data or use a small subset of data for testing, which contains significantly fewer points and less detail than are present in the actual dataset. The principal challenge with down-sampling the dataset is that the loss of detail can adversely affect the quality of segmentation, particularly when smaller, detailed objects are of interest. (3) Although some image-based approaches exploit the angular grid structure used in storing a TLS scan for efficient processing, all of these approaches are only capable of segmenting a single scan individually, without considering information from overlapping scans. (4) For machine learning approaches, it can be difficult and time-consuming to collect sufficient training samples to segment a dataset.
In addition, for each dataset, a highly specific training dataset may be needed because of differences in the scan set-up (e.g., scan positions, scan resolution) and the nature of the scene (e.g., types of objects and the corresponding distinguishable attributes). (5) Some methods use colorimetric information co-registered to the point cloud data. Although most TLS systems have an integrated camera, the photographic images and the point cloud data are usually not collected simultaneously, resulting in inconsistencies from the temporal difference (several minutes) in acquisition. Consequently, occlusion effects from moving objects in the scene may cause the segmentation to fail (Mahmoudabadi et al., 2017). Moreover, photographic images suffer from variations in lighting throughout the scene, such as shadows, which are prevalent in an outdoor scene. (6) Several methods utilize intensity as an attribute for segmentation because it is collected simultaneously with the point cloud and is ordinarily not affected by lighting conditions. Nevertheless, for high quality results, radiometric calibration (Kashani et al., 2015) is usually necessary to provide consistent intensity information, which otherwise degrades due to factors such as range and incidence angle. While normalizing or transforming intensity values to reflectance is as simple as applying an energy transmission model to each point, deriving the coefficients of the model through a radiometric calibration requires substantial effort through rigorous testing of scanning different materials on the site
under a wide range of geometric conditions (e.g., range and incidence angle). The calibration is also unique to a specific scanner. To overcome these challenges, this paper presents a fast segmentation method for TLS scans consisting of two steps. First, NORmal VAriation ANAlysis (Norvana), a novel feature extraction approach, eliminates some noise points and extracts edge points without requiring a general (and highly error-prone) normal estimation at each point. Second, region growing groups the points on a smooth surface to obtain the segmentation result. Both steps take advantage of the angular grid structure used in storing a single TLS scan, which is often neglected by segmentation algorithms for unorganized point clouds. In our previous work (Che and Olsen, 2017a), we implement a preliminary workflow of Norvana for a single scan, and its effectiveness in segmenting both indoor and outdoor datasets is demonstrated through qualitative evaluation. In this paper, a key contribution is that the proposed method associates the data from multiple scans while preserving this structure through a novel indexing scheme. This indexing technique ensures that the processing is fast and can be readily implemented with parallel programming for improved efficiency. Further, the proposed method is robust to different types of objects and parameter settings. It also efficiently detects noise such as mixed pixels and occlusion effects. To validate our approach, we test our method on two real TLS datasets from different scenes with complex features and present both qualitative and quantitative evaluations. In these evaluations, the effectiveness, efficiency, accuracy, applicability, and robustness of the proposed method are demonstrated and discussed.
2. Methodology
The proposed method consists of two primary steps, Normal Variation Analysis (Norvana) and Region Growing (Fig. 1).
For a TLS dataset including multiple scans, we first generate an index table for each scan based on the scan pattern such that the processing results from individual scans can be integrated together. Then, points lying on silhouette edges, intersection edges, and smooth surfaces are extracted, respectively, through Norvana. This process, as will be explained below, can also help remove some of the noise in the point cloud. Lastly, region growing is utilized for grouping the points lying on smooth surfaces.
2.1. Indexing grid structure
Some TLS data formats (e.g., ASTM E57 (Huber, 2011) and Leica PTX) structure each scan by grouping the point cloud into a grid using the local coordinates with respect to the scan origin and storing the corresponding transformation parameters. The translation component of a transformation matrix represents the coordinates of the scan position (the scan origin) while the rotation component defines the orientation of the scanner (e.g., roll, pitch, yaw). Alternatively, this transformation information can be captured in a translation vector and a quaternion. Such a data structure provides not only a 2D index for each point (row and column), potentially making the processing efficient, but also topology information such as connectivity between points (Che and Olsen, 2017b). To our knowledge, all of the existing methods exploiting this structure for TLS processing can only process each single scan individually. Further, discussion about merging the results from multiple scans is limited. To solve this problem, we present an innovative approach to associate multiple scans stored in grid structures to efficiently process each individual scan in the grid structure as well
Fig. 1. Workflow of the proposed segmentation process based on Norvana.
as consider the information from overlapping scans to minimize segmentation errors from occlusions. To illustrate this approach, we demonstrate its application to nearest neighbor searching in a given scan as an example. Note that nearest neighbor searching is also later utilized as a step in our proposed segmentation process. First, the data needs to be organized into a grid. Ideally, in a grid structure, the points in each row and column will have consistent vertical and horizontal angles, respectively, within a scanner's coordinate system. This assumption has gone unquestioned in most prior research utilizing the scan grid structure. However, after carefully evaluating a variety of TLS scans from different sensors, we have determined that each scanline is actually independent and stored in an individual column, ordered by time of acquisition. First, depending on the mechanics of a scanner and the scan pattern, the scanlines can be stored out of order because some scanlines are repeated during the scan (e.g., overlap at the start and end of a 360° scan). In addition, while each scanline has a constant number of records and a consistent increment of vertical angle, there can be a slight offset in vertical angles between scanlines given slight differences in where each scanline starts and stops. In some cases, scanlines can overlap slightly when scans are completed at high resolutions that exceed the ability to resolve the horizontal angle at which the laser is fired. Scanlines can also overlap or be out of order as a result of instability due to slight oscillations of the scanner head as it rotates, especially for high resolution scans. To correct the index of each point in the grid structure as written by the manufacturer, we first sort the scanlines by the median horizontal angle of each scanline.
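As a rough illustration, the scanline sorting and resolution estimation can be sketched as follows. This is a minimal sketch under assumed conventions, not the authors' implementation: the column-major array layout, the function name, and the use of `arctan2` on local x/y coordinates for horizontal angles are all our assumptions.

```python
import numpy as np

def sort_scanlines(scan, num_rows):
    """Sort the columns (scanlines) of a TLS scan grid by median horizontal angle.

    `scan` is assumed to be an (n_points, 3) array of local Cartesian
    coordinates, column-major by scanline: each consecutive block of
    `num_rows` records is one scanline (hypothetical layout).
    """
    cols = scan.reshape(-1, num_rows, 3)           # (n_scanlines, num_rows, 3)
    # Horizontal angle of every point from its local x/y coordinates.
    h_angles = np.arctan2(cols[..., 1], cols[..., 0])
    # The median per scanline is robust to no-return records and noise.
    medians = np.median(h_angles, axis=1)
    order = np.argsort(medians)
    sorted_cols = cols[order]
    # Horizontal angular resolution ~ median increment between adjacent scanlines.
    h_res = np.median(np.diff(medians[order]))
    return sorted_cols, h_res
```

Sorting only the per-scanline medians, rather than all points, mirrors the efficiency argument in the text: each scanline is treated as an independent unit.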
Note that, for efficiency, we only sort the scanlines rather than the entire point cloud because each scanline can be considered as an independent dataset. Then, the angular increment between adjacent points in the same scanline can be used to estimate the vertical angular resolution, while the horizontal angular resolution can be estimated by calculating the angular increment between adjacent scanlines. Once the scan pattern, including field of view and angular resolution, is obtained, we adjust the vertical index of each scanline such that the variance of the vertical angles in each row is minimized. After correcting the indices of the original grid for each scan, we generate an index table matching the scan pattern for linking horizontal and vertical angles to the corresponding original indices. To search for the nearest neighbor in a scan for a given point with its global coordinates [X Y Z]^T, the local coordinates [x y z]^T of this given point in this scan can first be computed by Eq. (1) using the translation vector T and the rotation matrix R, which represent the position and the orientation of the scanner, respectively:

[x y z]^T = R^{-1}([X Y Z]^T - T)    (1)

A given point can be mapped to the index table by computing the horizontal angle ψ and vertical angle θ using its local coordinates. Then, a tolerance distance T_NN^Dist can be given when searching for the nearest neighbor from the points in other scans that lie in the same cell as the given point (Fig. 2). T_NN^Dist is used mainly to avoid combining points obtained on different sides of features such as walls.
2.2. Norvana
Estimating a normal vector at each point is a common procedure required in TLS data processing and analysis, especially in segmentation and classification. A normal vector and its
Fig. 2. Schematic illustrating nearest neighbor searching using an index table of the angular grid structure: P is a given point scanned in Scan #1 while P_nn is its nearest neighbor in Scan #2. d_nn is the 3D Euclidean distance (in global coordinates) between P and P_nn, which is compared against the threshold T_NN^Dist.
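The transform of Eq. (1) and the cell lookup can be sketched together as a nearest-neighbor query. This is a hedged sketch, not the authors' code: the dictionary layout of the index table, the parameter names, and the rounding used to map angles to cell indices are our assumptions.

```python
import numpy as np

def nearest_neighbor(p_global, R, T, index_table, points_global,
                     h0, v0, h_res, v_res, t_dist_nn):
    """Look up the nearest neighbor of a point in another scan via its
    angular index table (a sketch; the table layout is an assumption).

    `index_table[(i, j)]` holds indices into `points_global` for the cell
    at vertical row i and horizontal column j; (h0, v0) are the angles of
    cell (0, 0) and (h_res, v_res) the angular resolutions.
    """
    # Eq. (1): map the global point into the other scan's local frame.
    p_local = R.T @ (p_global - T)          # R is a rotation, so R^-1 = R^T
    x, y, z = p_local
    psi = np.arctan2(y, x)                  # horizontal angle
    theta = np.arctan2(z, np.hypot(x, y))   # vertical angle
    i = int(round((theta - v0) / v_res))
    j = int(round((psi - h0) / h_res))
    best, best_d = None, t_dist_nn
    for k in index_table.get((i, j), []):
        d = np.linalg.norm(points_global[k] - p_global)
        if d < best_d:                      # reject matches beyond T_NN^Dist
            best, best_d = k, d
    return best, best_d
```

Initializing `best_d` to the tolerance implements the rejection rule directly: any candidate farther than T_NN^Dist is never accepted.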
derivatives (e.g., gradient and curvature) can be further used to set the criteria or compute the attributes for grouping the points and distinguishing different classes. Fitting a regular shape (e.g., plane, cylinder) and statistical analysis (e.g., PCA) are two main approaches to normal estimation. Unfortunately, there are two main limitations with such approaches. (1) Ideally, the result of the neighbor searching at a point in 3D space (e.g., k-NN, local neighbors (spherical or cylindrical)) is evenly distributed, which is difficult to achieve in most cases because the point density in a TLS dataset changes dramatically with the different geometries between various objects and the scan origin. (2) The geometrical or statistical models used in normal estimation can cause an overfitting problem, especially for objects that cannot be mathematically described as a simple geometry and for features without a clear definition of normal (e.g., a point lying on an edge resulting from the intersection of multiple planes). Rather than perform an error-prone normal estimation at each point, we introduce an original segmentation approach, Norvana, to extract smooth surfaces by detecting edges as well as removing noise such as mixed pixels. In Norvana, each point is analyzed with its eight neighbors in the angular grid structure (Fig. 3a) due to three main advantages: (1) the eight adjacent neighbors in the angular grid structure help preserve as much detail as possible by considering the topology information between a point and its adjacent neighbors (Che and Olsen, 2017b); (2) similar to k-NN, this neighborhood is adaptive to the point density because the number of neighbors is fixed; (3) the neighboring points matched to the scan pattern through the grid structure surround the center point such that a more unbiased spatial distribution is obtained.
With the neighbor points defined in the grid structure, our approach extracts two types of edges: silhouette edges and intersection edges, which are discussed in the following sections. Parallel programming can be easily implemented to improve the computation performance significantly because each point can be analyzed independently in every step.
2.2.1. Silhouette edge detection
Because of their low power, the visible or near-infrared laser pulses used in TLS do not normally penetrate most objects except transparent and translucent ones (Che and Olsen, 2017a), resulting in occlusions or data gaps (shadows) occurring behind an object observed in a single scan. The points lying on the edge between an object and the shadows are defined as silhouette edges in this paper. There are two main situations resulting in silhouette edges, which behave differently in a TLS scan. Both are considered in our approach by checking all the points (laser pulses) with valid returns and skipping those without returns. The first situation often occurs at the boundary of a roof or a window in an outdoor scene when there is nothing captured by the laser behind the object. As a result, there will be one or more adjacent neighbor points (laser pulses) with no return, and such silhouette edge points can be simply detected by checking the eight adjacent neighbors of a point (Fig. 3(a)). The second situation occurs when there is another object detected behind the front object, resulting in a significant change in range and a shadow produced between the two objects. Unfortunately, there are two principal limitations of detecting range variations to extract the silhouette edges. On one hand, criteria able to cope with various objects in a scan can be complicated, considering factors such as the shape of an object, the scan geometry (range and angle), the scan pattern, and so forth.
On the other hand, the occurrence of mixed pixels (Lichti et al., 2005) when the objects are relatively close together can make it even more difficult to model this situation because the mixed pixels are lying on the boundary of those shadows and form an artificial plane. To this end, in the proposed approach, the shadows
Fig. 3. Silhouette edge detection and mixed pixel removal: (a) Schematic illustrating the first situation when a silhouette edge (red point) occurs in a scan. Note that laser pulses with no returns are recorded in the angular grid structure for TLS data; (b) Silhouette edge detection based on the proxy incidence angle (modified from (Che and Olsen, 2017a)): point A lying on the yellow object is the current point under analysis while point B lying on the blue object is one of its adjacent neighbor points. d_AB, ρ_A, and ρ_B are the distance between A and B, and their 3D Euclidean distances from the scan origin O, respectively; n_AB is one component of the normal of AB, which is coplanar with triangle OAB, and the corresponding proxy incidence angle is α. Note that if the footprint of a laser beam (green line) covers both objects and they are located close to one another, mixed pixels may occur between the two objects (green dashes); (c) A schematic (exaggerated) comparing the proxy incidence angle and the actual incidence angle: Points A, B, and C are three points lying on a smooth surface where the actual incidence angle of point A is 0°. α_AB and α_AC are the proxy incidence angles at point A computed with its neighbor points B and C, respectively. As shown in the figure, the proxy incidence angles are larger than the actual incidence angle such that we tend to be more aggressive in detecting silhouette edges and mixed pixels. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
between two objects can be assumed as an imaginary surface with a large incidence angle where the boundary of the surface is the silhouette edge while the mixed pixels are between the silhouette
edges of these objects. By making these assumptions to model the shadows rather than the relationships between different objects, the criteria can be simplified. The silhouette edge points and the mixed pixels can then be distinguished by further checking whether the point with a large proxy incidence angle is, in fact, lying on an object. Nevertheless, computing the incidence angle can still be unreliable because it requires a normal estimation on a surface which does not actually exist. To solve this problem, a proxy incidence angle is defined as the angle on the 2D plane defined by the current point, its adjacent neighbor, and the scan origin (Fig. 3(b)). The proxy incidence angle can be considered an estimate of the actual incidence angle in a certain direction and can be computed using the cosine law (see the equation in Fig. 3(b)). By computing the proxy incidence angles at a point with all eight of its neighbors, this analysis can be conducted in multiple directions and allows more aggressive detection of mixed pixels and silhouette edges, resulting in more robust segmentation (Fig. 3(c)). A point is labeled as a silhouette edge candidate (which includes both silhouette edges and mixed pixels) if any proxy incidence angle at this point exceeds the given threshold of the maximum incidence angle T_α. Then, we further check the adjacent neighbors of each silhouette edge candidate. If any of its neighbors is not labeled as a silhouette candidate, the point is labeled as a silhouette edge point. Otherwise, it is labeled as a mixed pixel. Although the proposed approach may mislabel distant points lying on an oblique surface, such as ground points, those points would likely not be utilized directly since they suffer from low point density, poor ranging accuracy, and limited capability of capturing geometric features. If that area is of interest, additional scans would capture that section.
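The proxy incidence angle test described above can be sketched as follows. This is a hedged reconstruction of the cosine-law construction in Fig. 3(b), assuming the proxy angle is the complement of the triangle angle at A between the ray OA and the segment AB; the function names and the candidate test are our illustration, not the authors' code.

```python
import numpy as np

def proxy_incidence_angle(a, b, origin):
    """Proxy incidence angle at point `a` with respect to neighbor `b`,
    computed in the plane through `a`, `b`, and the scan origin."""
    rho_a = np.linalg.norm(a - origin)   # range of A
    rho_b = np.linalg.norm(b - origin)   # range of B
    d_ab = np.linalg.norm(a - b)         # distance between A and B
    # Cosine law gives the angle at A between ray OA and segment AB...
    cos_beta = (rho_a**2 + d_ab**2 - rho_b**2) / (2 * rho_a * d_ab)
    beta = np.arccos(np.clip(cos_beta, -1.0, 1.0))
    # ...and the proxy incidence angle is its complement.
    return abs(np.pi / 2 - beta)

def is_silhouette_candidate(p, neighbors, origin, t_alpha):
    """Label `p` a silhouette edge candidate if any proxy incidence angle
    with its grid neighbors exceeds the threshold T_alpha; `None` marks
    a no-return neighbor."""
    return any(proxy_incidence_angle(p, n, origin) > t_alpha
               for n in neighbors if n is not None)
```

For a surface facing the scanner head-on, the proxy angle is near 0; for a neighbor lying along the laser ray (the imaginary surface spanning a shadow), it approaches 90°, which is why a single threshold T_α separates the two cases.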
As a result, the threshold T a should be a function of the scan's angular resolution as well as the maximum acceptable obliquity (i.e., incidence angle) for a scan of a surface.

2.2.2. Intersection edge detection

An intersection edge is defined as the intersection of multiple smooth surfaces. Normal estimation is not as robust on an intersection edge as on a smooth surface because over-fitting will most likely occur at the edge point. The principal reason for this difficulty is that there is no clear definition of the normal at an edge. In our approach, after extracting the silhouette edges and mixed pixels, we detect the intersection edges in each scan without using the normal vector at each point, such that the proposed method is independent of a general normal estimation. As a result, the normals on smooth surfaces can be used in subsequent processing and analysis, while the intersection edges can be handled differently. There are three key steps in the proposed intersection edge detection at each point: (1) a triangular mesh is generated around the point with its eight neighbors, resulting in eight shared edges between these triangles; (2) the normal of each triangle is computed, and the normal gradient across each shared edge determines whether the point lies on an intersection edge or a smooth surface; and (3) the result is further refined by mapping the edge points in one scan to the others. Generating the triangular mesh requires minimal computational effort because the points needed to form the triangles can be quickly found with the grid structure. In the first step, at each point that is not labeled as a silhouette edge point or mixed pixel, we search for a neighbor point in each of the eight directions in the angular grid structure within a given threshold of minimum distance T Dist TIN from this center point.
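Steps (1) and (2) above can be sketched as follows. This is a minimal pure-Python illustration (names are ours): triangle normals come from cross products and are oriented toward the scan origin, and the angle between the normals of two triangles sharing an edge serves as the normal gradient tested against T DNorm.

```python
import math

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])
def dot(u, v): return u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
def unit(u):
    n = math.sqrt(dot(u, u))
    return (u[0]/n, u[1]/n, u[2]/n)

def triangle_normal(a, b, c, origin):
    """Unit normal of triangle (a, b, c), flipped to point toward the
    scan origin so that all normals are oriented consistently."""
    n = unit(cross(sub(b, a), sub(c, a)))
    return n if dot(n, sub(origin, a)) > 0 else (-n[0], -n[1], -n[2])

def normal_gradient(n1, n2):
    """Angle (degrees) between the normals of two adjacent triangles."""
    return math.degrees(math.acos(max(-1.0, min(1.0, dot(n1, n2)))))

def is_intersection_edge(center, ring, origin, t_dnorm=25.0):
    """Fan-triangulate the center point with its ring of neighbors and
    test the maximum normal gradient across the shared edges."""
    normals = [triangle_normal(center, ring[i], ring[(i + 1) % len(ring)], origin)
               for i in range(len(ring))]
    grads = [normal_gradient(normals[i], normals[(i + 1) % len(normals)])
             for i in range(len(normals))]
    return max(grads) > t_dnorm
```

For a point on a planar patch all triangle normals agree and the maximum gradient is near zero; for a point on the crease between two planes, the gradient across the shared edges spanning the crease approaches the dihedral angle and exceeds T DNorm.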
Within these neighboring points, Norvana can be performed at a relatively consistent scale, such that over-segmentation artifacts caused by a mismatch between the point density and the target level of detail can be limited without needing to resample the data. The current point is labeled as an unclassified point if any of the neighbor points is labeled as a silhouette edge or mixed pixel, which leaves a buffer zone around the silhouette edges, mixed pixels, and oblique surfaces for other potential processes. Once the neighboring points of the current point under analysis are defined, we generate a triangular mesh in which the current point is a vertex shared by all the triangles (Fig. 4). The normal of each triangle can be calculated by the cross product of any two of its edges. Note that because a normal computed this way can point to either side of the surface, we orient each normal toward the scan origin to ensure that all normal vectors point outward from the surface. Then, the normal gradient across each shared edge (the orange and purple solid edges in Fig. 4) is computed from its adjacent triangles, and the maximum gradient is compared against a specified threshold of maximum normal gradient, T DNorm, to label the current point as an intersection edge point or a smooth surface point. To further refine the results after this analysis has been performed on all the scans individually, all of the edge points in each scan are mapped to the other scans. For a point labeled as an intersection edge point, its nearest neighbor within the given threshold T Dist NN in another scan is searched for and also labeled as an intersection edge point. The proposed method is more aggressive in detecting edge points than in preserving smooth surface points because smooth surfaces are more likely to be captured than edges in TLS data. Additionally, region growing is not robust to false positive errors in recognizing smooth surfaces.

2.3. Region growing

Region growing is widely used in point cloud segmentation (Rabbani et al., 2006) due to three important advantages: (1) it is generally efficient because, theoretically, each point is analyzed only once during the growing process; (2) the iterative process in
Fig. 4. Intersection edge detection based on normal variation analysis (modified from (Che and Olsen, 2017a)): Point C (red) is the center point being analyzed with its eight neighbors (blue) in the angular grid structure; Points T, C, and B lie on or nearly on the intersection of two planes, while the other points lie on only one of the planes. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Please cite this article in press as: Che, E., Olsen, M.J. Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis. ISPRS J. Photogram. Remote Sensing (2018), https://doi.org/10.1016/j.isprsjprs.2018.01.019
region growing enables it to cope with more complex shapes; and (3) the connectivity between the points in a neighborhood can be considered during processing, such that the criteria can be straightforward with a limited number of parameters. However, there are two limitations of region growing for segmenting TLS data (Grilli et al., 2017). First, it can be inconsistent because different selections of seed points can lead to different segmentation results. In addition, region growing can be less robust to points with an inaccurate normal estimate (e.g., edges) and to noise that can artificially connect multiple objects (e.g., mixed pixels). Fortunately, Norvana helps overcome these challenges. After processing the data through Norvana, only the points lying on a smooth surface are preserved. Hence, region growing within the smooth surfaces can be consistent and more robust because the outliers (edges and noise) have been eliminated. Moreover, the criterion for region growing (the normal gradient) has already been embedded in Norvana during the intersection edge detection. Thus, in the proposed method, region growing is utilized solely for grouping the points in the area enclosed by the extracted edge points and the unclassified points. For consistency with the previous procedures, we implement region growing across the entire dataset, including multiple angular grid structures (Fig. 5). As we cycle through the data, the process starts from the first point in a scan labeled as a smooth surface point, and a segment ID is assigned to this point. Then, its eight adjacent neighbors and its nearest neighbors within the threshold T Dist NN in all the other scans are examined to ensure that the process groups points from all the scans. If a neighbor point is labeled as a smooth surface point, it is assigned the same segment ID and the same growing process is performed from this point.
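The growing procedure over the angular grid can be sketched as a stack-based flood fill. For brevity this is a single-scan sketch with our own names; the multi-scan variant described above additionally queries the nearest neighbor within T Dist NN in every other scan when expanding a segment.

```python
def region_grow(labels):
    """labels: 2D grid mirroring the angular grid structure, where each
    cell is 'surface', 'edge', or None (silhouette edge, mixed pixel, or
    unclassified).  Returns a grid of segment IDs for surface points."""
    rows, cols = len(labels), len(labels[0])
    seg = [[None] * cols for _ in range(rows)]
    next_id = 0
    for r0 in range(rows):
        for c0 in range(cols):
            if labels[r0][c0] != 'surface' or seg[r0][c0] is not None:
                continue
            # Seed a new segment and grow through the 8 grid neighbors.
            stack = [(r0, c0)]
            seg[r0][c0] = next_id
            while stack:
                r, c = stack.pop()
                for dr in (-1, 0, 1):
                    for dc in (-1, 0, 1):
                        rr, cc = r + dr, c + dc
                        if (0 <= rr < rows and 0 <= cc < cols
                                and labels[rr][cc] == 'surface'
                                and seg[rr][cc] is None):
                            seg[rr][cc] = next_id
                            stack.append((rr, cc))
            next_id += 1
    return seg
```

Because edge and unclassified cells are never entered, the extracted edges enclose the regions, and each surface point is visited exactly once, consistent with the efficiency claim above.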
Once there is no valid neighbor to grow to, a new segment ID is assigned to the next smooth surface point that has not yet been visited. After the proposed region growing, all points lying on smooth surfaces are grouped into segments, with a unique ID assigned to each segment. Notice that the proposed region growing essentially groups the points enclosed by the edge points, such that the order of seeking
neighbors in the same scan as well as in different scans does not affect the segmentation result. Additionally, the nearest neighbor searching within T Dist NN, combined with the silhouette edge detection results, can effectively separate occlusions (e.g., moving objects) from the object of interest.

3. Experiments

To evaluate the proposed segmentation framework qualitatively and quantitatively, we use two experimental datasets: one indoor and one outdoor. Both datasets are collected at a resolution of 0.01 m @ 25 m (equivalent to an angular resolution of 0.023°) with a Leica ScanStation P40 (Fig. 6). The first dataset (Indoor dataset, Fig. 6(a)) contains 1,205,600 points in a single scan of several objects of different geometric shapes at an approximate range of 5 m from the scanner. The indoor dataset is primarily used as an example for the discussion of parameter selection and the accuracy assessment. In addition to the first experimental setup, another dataset (Outdoor dataset, Fig. 6(b)), acquired for an ornate building located on the campus of Oregon State University, tests the effectiveness and demonstrates the robustness and efficiency of the proposed method. Eight scans totaling 391,788,948 points are collected within the scene, which contains a variety of objects as well as noise.

3.1. Indoor dataset

3.1.1. Parameter selection

To detect silhouette edges and remove mixed pixels, a threshold T a is necessary. One limitation of using a single angular threshold T a is that points obtained across an oblique surface can be misclassified as mixed pixels. Theoretically, with the known full waveform, beam divergence, and other information, an oblique surface could be distinguished from mixed pixels by adding constraints considering range (Hartzell et al., 2015). However, the full waveform or the signal processing function is not usually accessible to most users of terrestrial laser scanners.
Fortunately, in the practice of collecting sufficient data for the area of
Fig. 5. Schematic illustrating the proposed region growing process in multiple scans with angular grid structures. The red point in Scan #1 is the current point being considered during the region growing. It is growing to its eight neighbors in the same scan as well as its nearest neighbors in the other scans (Scans #2 and #3) with the given threshold T Dist NN for the neighbor searching. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 6. The TLS datasets for testing the proposed segmentation: (a) Indoor dataset (single scan); (b) Outdoor dataset (multiple scans).
interest, those surfaces oblique to one scan are usually captured by another scan position from a closer range and a more direct view. In addition to considering the point density, a point lying on an oblique surface or at a distant range can sometimes be eliminated by range filtering because of its lower accuracy. Ground points, in particular, usually suffer the most from this situation. As a result, we use 85° as the default setting of T a, which means that if the scanner is set up 2 m above a flat ground surface, the proposed Norvana can work effectively for ground points within an approximate range of up to 23 m. T Dist TIN and T DNorm are the parameters constraining the shape of the triangles and the differences in normals, respectively. However, for a scene including non-planar objects such as pipes, these parameters can be more difficult to determine. To cope with such
Please cite this article in press as: Che, E., Olsen, M.J. Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis. ISPRS J. Photogram. Remote Sensing (2018), https://doi.org/10.1016/j.isprsjprs.2018.01.019
E. Che, M.J. Olsen / ISPRS Journal of Photogrammetry and Remote Sensing xxx (2018) xxx–xxx
situation, provided that the point spacing is finer than T Dist TIN, the largest curvature tolerated on a smooth surface can be computed by combining T Dist TIN and T DNorm (Fig. 7). In other words, T Dist TIN can be specified based on the target point density or level of detail, and T DNorm can then be computed from the largest curvature. For example, in the indoor dataset, T Dist TIN is set to 0.01 m considering the point density as well as the target accuracy and level of detail in modeling, while the diameter of the pipe, the object of interest with the largest curvature in the scene, is approximately 0.05 m. Based on the equation shown in Fig. 7, T DNorm can then be solved to be approximately 23°. Notice that this theoretical parameter can be fine-tuned by balancing the aggressiveness in detecting edges against that in grouping smooth surfaces. Additionally, the surface roughness is another factor to consider so that Norvana is performed at an appropriate scale. As a result, to be slightly conservative in extracting the thin pipe in this case, T Dist TIN and T DNorm are set to 0.01 m and 25°, respectively.

3.1.2. Accuracy assessment

It is challenging to perform a point-based accuracy assessment of a segmentation algorithm for TLS data because: (1) segmenting the data manually can be time consuming due to the high resolution;
Fig. 7. Schematic illustrating the selection of the parameters T Dist TIN and T DNorm based on the point density and curvature of the surface.
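The equation in Fig. 7 is not legible in this text, but the chord geometry the figure illustrates suggests the following relationship: a chord of length T Dist TIN on a circle of diameter D subtends a normal change of

```latex
T_{\Delta Norm} = 2\arcsin\!\left(\frac{T_{Dist\,TIN}}{D}\right),
```

a plausible reconstruction (ours, not verified against the figure) that is consistent with the values reported in the text: with $T_{Dist\,TIN} = 0.01$ m and $D = 0.05$ m, $2\arcsin(0.2) \approx 23^\circ$.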
(2) processing the data point-by-point is very subjective, especially at boundaries and noise points; and (3) it is difficult to evaluate the manual segmentation results quantitatively. To overcome these challenges, we propose a novel approach for accuracy assessment of the segmentation. For ground truth, we manually extract the objects with geometric shapes and subsequently fit geometric models via least squares. We then run our automatic segmentation, select each major segment, and fit a geometric model. In this way, we can evaluate the proposed segmentation method not only by comparing the models generated from the manual and automatic segmentation results, but also by the quality of the model fitting. In the indoor dataset, seven objects of different shapes and sizes (1 plane, 2 spheres, 2 cylinders, and 2 cones) are selected to be modeled for the accuracy assessment (Fig. 8). In this example, the silhouette edges (blue) are extracted properly and the mixed pixels (red) are labeled correctly. After the intersection edges (black) are detected, the smooth surface points are grouped and randomly colored. According to the summary of modeling quality (Table 1), there is no significant difference in errors between the manual approach and the proposed semi-automatic approach based on Norvana. However, the number of points for each model extracted by Norvana is consistently smaller than for manual extraction. There are three principal reasons for this: (1) visually recognizing and manually selecting points lying on an object can be very subjective at the boundaries; (2) Norvana creates a buffer around the silhouette and intersection edge points, and the width of this buffer is approximately equal to T Dist TIN; and (3) away from the boundaries, more edge points are detected on a rough surface because Norvana is relatively aggressive in detecting edges when enclosing a region (e.g., Cylinder2).
However, it is worth noting that this difference is marginal in most cases because the main discrepancies occur in areas with mixed pixels. To further evaluate the proposed segmentation method, we compare the modeling results in position, orientation, and shape (see Table 2 for descriptions). Even though the comparison results (Table 3) show that, overall, an accurate model can be generated by segmentation through Norvana, the differences in position and orientation for Cone1, Cone2, and Cylinder1 are significantly larger than for the other objects. One reason for this difference is that, with manually extracted points, it can be difficult to determine the boundary between points representing the objects and mixed pixels.
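As an illustration of the model-fitting side of this assessment, a sphere can be recovered from a segment's points by a linear (algebraic) least-squares fit. This is a generic sketch under our own names, not the authors' exact fitting code: the sphere equation |p|² = 2c·p + (R² − |c|²) is linear in the center c and the auxiliary term k = R² − |c|², so the normal equations reduce to a 4×4 solve.

```python
import math

def solve4(A, b):
    """Gaussian elimination with partial pivoting for a 4x4 system."""
    n = 4
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(M[i][k]))
        M[k], M[p] = M[p], M[k]
        for i in range(k + 1, n):
            f = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= f * M[k][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def fit_sphere(points):
    """Algebraic sphere fit: |p|^2 = 2 c.p + k is linear in (cx, cy, cz, k)
    with k = R^2 - |c|^2.  Solve the normal equations, then recover R."""
    AtA = [[0.0] * 4 for _ in range(4)]
    Atb = [0.0] * 4
    for (x, y, z) in points:
        row = (2 * x, 2 * y, 2 * z, 1.0)
        rhs = x * x + y * y + z * z
        for i in range(4):
            Atb[i] += row[i] * rhs
            for j in range(4):
                AtA[i][j] += row[i] * row[j]
    cx, cy, cz, k = solve4(AtA, Atb)
    radius = math.sqrt(k + cx * cx + cy * cy + cz * cz)
    return (cx, cy, cz), radius
```

The signed distances dist(p, c) − R of the segment's points to the fitted surface give fitting-error statistics of the kind summarized in Table 1, and comparing the fitted centers and diameters between the manual and Norvana segments gives comparisons of the kind in Table 3.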
Fig. 8. Accuracy assessment by comparing the models generated through the semi-automatic modeling through Norvana and a manual extraction approach.
Table 1
Summary of modeling quality.

Object      Method    Fitting Error Statistics (m)                     # of Points
                      Mean     Std. Dev.   Abs. Mean   Abs. Max.
Plane       Manual    0.0000   0.0013      0.0011      0.0047          132,983
            Norvana   0.0000   0.0013      0.0010      0.0044          126,435
Sphere1     Manual    0.0000   0.0022      0.0018      0.0091          119,872
            Norvana   0.0000   0.0022      0.0018      0.0091          117,257
Sphere2     Manual    0.0000   0.0010      0.0008      0.0048          40,109
            Norvana   0.0000   0.0010      0.0007      0.0060          39,200
Cylinder1   Manual    0.0000   0.0002      0.0002      0.0018          4,661
            Norvana   0.0000   0.0004      0.0003      0.0014          3,595
Cylinder2   Manual    0.0000   0.0004      0.0003      0.0017          11,890
            Norvana   0.0000   0.0004      0.0003      0.0017          5,668
Cone1       Manual    0.0000   0.0006      0.0004      0.0028          34,298
            Norvana   0.0000   0.0005      0.0004      0.0025          30,881
Cone2       Manual    0.0000   0.0006      0.0005      0.0026          15,428
            Norvana   0.0000   0.0005      0.0004      0.0062          13,329
Table 2
Definitions of the attributes used for the model comparison.

Object      Position                           Orientation     Shape
Plane       Centroid coordinates               Normal vector   –
Sphere      Center coordinates                 –               Diameter
Cylinder    Centroid coordinates on the axis   Axis vector     Diameter
Cone        Vertex coordinates at the top      Axis vector     Diameter at the bottom
Hence, the manual segmentation likely includes some mixed pixels, which can slightly tilt the fitted model; robust modeling techniques (e.g., RANSAC) may be needed to improve the results.

3.2. Outdoor dataset

3.2.1. Qualitative evaluation

An outdoor dataset can be more challenging than an indoor scene because: (1) various types of objects are often present in the scene; (2) a wide range of point densities exists on these objects; and (3) more occlusions occur within the scene. To further test the effectiveness and efficiency of the proposed segmentation, we run the Norvana segmentation on an outdoor dataset consisting of 8 registered scans. Considering the point density and the target
level of detail for segmentation, T a, T Dist TIN, and T DNorm are set to 85°, 0.03 m, and 25°, respectively, while T Dist NN is set to 0.01 m based on the reported accuracy of the registration. The overall segmentation result (Fig. 9) can be evaluated qualitatively, and several close-up views of different types of objects are selected for further discussion (Figs. 10–15). Buildings can be among the most important objects in an urban scene, and building facades are usually well captured by TLS due to the view angle and high resolution. The proposed segmentation method performs effectively on both planar and curved building façades (Fig. 10). Some relatively small building elements (e.g., window sills and frames) are segmented appropriately. The edge points can also be used as input for applications such as architectural analysis and building documentation, where the data size can be reduced significantly (Liang et al., 2014) while still effectively communicating the key structure. In addition, Norvana performs well in segmenting stairs, which are composed of a series of horizontal and vertical planes (Fig. 11). Notice that a data gap caused by occlusion in one scan is properly filled in by another scan (e.g., the area with an inconsistent point density in B2) because the proposed region growing approach groups the points from multiple scans into the same segment. Moreover, there are a number of man-made and natural cylindrical objects, including columns, pipes, and tree trunks, that are often desired segmentation targets. Even though these cylindrical
Table 3
Comparison of modeling results.

Object      Method    Position (m): Coordinates    Diff.    Orientation: Vector          Diff.     Shape (m): Diameter   Diff.
Plane       Manual    (1.9066, 4.2223, 0.6789)     0.0031   (0.0802, 0.9968, 0.0056)     0.0069°   –                     –
            Norvana   (1.9060, 4.2223, 0.6819)              (0.0803, 0.9968, 0.0056)               –
Sphere1     Manual    (2.8593, 3.5774, 0.7849)     0.0001   –                            –         0.7277                0.0002
            Norvana   (2.8592, 3.5774, 0.7848)              –                                      0.7275
Sphere2     Manual    (2.4190, 4.0183, 0.9289)     0.0004   –                            –         0.4391                0.0008
            Norvana   (2.4193, 4.0186, 0.9290)              –                                      0.4399
Cylinder1   Manual    (2.3291, 3.6007, 1.7933)     0.0015   (0.0125, 0.0006, 0.9999)     0.7468°   0.1170                0.0006
            Norvana   (2.3290, 3.6004, 1.7948)              (0.0004, 0.0007, 1.0000)               0.1164
Cylinder2   Manual    (2.5182, 4.1763, 1.7224)     0.0039   (0.8424, 0.5388, 0.0042)     0.0152°   0.0525                0.0021
            Norvana   (2.5219, 4.1753, 1.7229)              (0.8424, 0.5388, 0.0040)               0.0546
Cone1       Manual    (1.7895, 3.9237, 0.9671)     0.0105   (0.0099, 0.0137, 0.9999)     0.1776°   0.2638                0.0031
            Norvana   (1.7902, 3.9255, 0.9568)              (0.0111, 0.0166, 0.9998)               0.2607
Cone2       Manual    (2.7308, 3.3559, 1.2688)     0.0090   (0.0064, 0.0097, 0.9999)     0.1509°   0.1835                0.0000
            Norvana   (2.7300, 3.3543, 1.2776)              (0.0054, 0.0072, 1.0000)               0.1835
Fig. 9. Overview of the outdoor dataset (left), where the tags show the locations of the close-up views shown in the following figures. In the segmentation result (right), the segments are randomly colored and the black points indicate the intersection edge points.
Fig. 10. Segmentation results for different shapes of the building façade.
objects cover a wide range of sizes and point densities, the segmentation results appear sound for these human-made objects (Fig. 12). Tree leaves and branches are mostly detected as silhouette edges and noise points, such that the upper part of a tree is either over-segmented or eliminated. Nevertheless, the lower trunk is segmented reasonably well and would be suitable input for specific processing and analysis such as individual tree detection. While overall the proposed method separates the ground and non-ground points effectively, it fails to group the points lying on the grass in the middle of the scene, where a limited number of points are detected as smooth surface while the other points are either detected as intersection edges or removed as silhouette edges or noise. To further explore this difficulty, we select an area with short plants (Fig. 13) from one scan. At close range, the points are too close to each other, resulting in ranging errors (Barnea and Filin, 2013) that cause those points to be labeled as silhouette edges. Although the silhouette edges of the flowers and the leaves can be
detected correctly with the mixed pixels extracted, with increasing range the bare earth is less likely to be captured. As a result, the proposed Norvana segmentation cannot segment the rough ground surface covered by grass. To cope with this situation, existing ground filtering methods (e.g., Che and Olsen, 2017b) can be applied to extract ground points if desired. Unlike the grass, most of the points on the pavement and sidewalk are labeled as smooth surfaces, such that they are clearly separated from the grass. However, while grouping points on the ground, some points are over-segmented while others are under-segmented. As discussed in Section 3.1.1, ground points far from the scanner are falsely identified as mixed pixels because the ground surface becomes too oblique to pass the threshold T a. Thus, over-segmentation caused by occlusions is more likely to occur in areas with less overlap between scans with a valid view angle (Fig. 14). Aside from the areas where over-segmentation occurs, the pavement and sidewalk are under-segmented because they are connected by a ramp in
Fig. 11. Segmentation results of the stairs. Note the effective results despite the significant amount of occlusions present.
Fig. 12. The segmentation results for different cylindrical objects present within the scene.
between (Fig. 15). Depending on the specific application, there can be different ways to mitigate the over-segmentation and under-segmentation problems. For example, if the segmentation result is used to fit models, the segments can be merged or split based on the model fitting results.

3.2.2. Robustness test

In this section, we test the robustness of the proposed segmentation in three aspects: parameter settings, scan resolution, and computational efficiency. First, to test the parameter robustness,
we select a portion of a large tree trunk that is over-segmented with the parameter settings used in the previous section because of its rough texture and complex shape. Next, we run the proposed segmentation on the same object using a series of parameter combinations (Fig. 16). The test shows that it is possible to fine-tune the parameters to obtain a reasonable segmentation result based on the desired level of detail. With an increasing T Dist TIN, the rough textures are more likely to be preserved in the segment rather than detected as intersection edges. T DNorm can be set accordingly to fit the complex shape of the objects. Overall, a wide range of
Fig. 13. Ground in close range and short plants where the silhouette edge points and mixed pixels are colored in blue and red respectively while the bottom right figure illustrates the smooth surfaces and intersection edges from the same point of view. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 14. Over-segmented pavement and sidewalk where the silhouette edge points and mixed pixels are colored in blue and red, respectively. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.)
Fig. 15. Under-segmented pavement and sidewalk caused by a connecting ramp.
Fig. 16. Parameter robustness test for a complex tree trunk.
Fig. 17. Scan resolution robustness test.
parameter combinations can work for this object, which means the parameter selection can be fairly flexible and the results are robust to the parameter settings.
The previous qualitative evaluation discussed how the proposed segmentation performs with different point densities. To further test the robustness to scan resolution, we
Fig. 18. Computation time with different number of points processed and number of threads.
down-sample the original data (1/4, 1/9, 1/16, and 1/25), run our segmentation with the same parameter settings, and evaluate the segmentation on the building façade as an example (Fig. 17). With decreasing scan resolution, the details on the building façade are less likely to be captured completely, especially for the scans further from the object. Even though some elements are grouped together because the intersection edge points cannot enclose their regions, the proposed method still provides a reasonable result for both segments and edges without fine-tuning the parameters for the varying point density. To test the efficiency robustness, we down-sample at various increments down to 1/100 and run the Norvana segmentation. Because we take advantage of parallel programming in Norvana, the efficiency is also tested using different numbers of threads. The processor used in this work is an Intel(R) Xeon(R) CPU E5620 @ 2.40 GHz with 4 cores supporting 8 threads. Note that the computation time and performance exclude the data I/O operations. The correlation between the number of points and the computation time is close to linear (Fig. 18), indicating strong scalability of the proposed algorithm with increased threads; hence, the Norvana segmentation is capable of processing larger datasets efficiently. Avoiding iterative and complex computations, as well as limiting the number of passes through the data, are the primary reasons Norvana achieves such high performance even when processing datasets as large as hundreds of millions of points. The computation performance using 8 threads is over 1 million points per second. Processing would be even faster using a higher-performance CPU or more threads because the usage of every thread is 100% during the Norvana stage of segmentation, indicating that the overhead of data transfer within physical memory scarcely affects the computation performance.
Further analysis and evaluation of parallel speed-up and efficiency could be achieved by running larger benchmark data on various processing architectures, which is outside the scope of this manuscript.

4. Conclusions

This paper proposes a fast segmentation method for Terrestrial Laser Scanning (TLS) data consisting of two steps: (1) normal variation analysis (Norvana); and (2) region growing. Norvana is a novel approach able to detect edges as well as filter noise such as mixed pixels. Through testing on two datasets in the experiments, we demonstrate and discuss the effectiveness and efficiency of the proposed method in terms of parameter selection, accuracy assessment, qualitative evaluation, and robustness. The proposed method can be used in many applications as a general segmentation framework; hence, further refinement could be achieved by adding detailed criteria to remove, split, merge, or model the segments based on the unique requirements of a specific application, if desired. Norvana segmentation provides several important contributions: (1) Norvana functions as an efficient edge detection technique; (2) Norvana can remove mixed pixels; (3) Norvana can cope with occlusion effects by processing multiple structured scans simultaneously; (4) normal estimation at each point is not required, such that the segmentation can potentially be used as preprocessing to improve a general normal estimation; and (5) by exploiting the angular grid structure and taking advantage of parallel programming, a dataset containing hundreds of millions of points can be processed at a rate of over 1 million points per second with the processing architecture (8 threads) used in the experiments. (This processing speed is on par with the current acquisition rates of terrestrial laser scanners.) The proposed algorithm is developed for structured TLS data and is not directly applicable to unorganized point clouds obtained from Airborne Laser Scanning (ALS), Mobile Laser Scanning (MLS), and Structure from Motion (SfM). Therefore, in future work, we will continue our work on reconstructing or building the angular grid structure (e.g., placing a virtual scanner to capture the scene in an organized fashion) for these alternative sensors and utilizing the proposed segmentation method for further processing and analysis.

Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. CMMI-1351487.
The authors also appreciate the support of Leica Geosystems and David Evans and Associates, who provided hardware and software used in this research, as well as Dr. Mike Bailey’s (OSU) assistance with some of the preliminary tests. We thank the anonymous reviewers who provided helpful feedback on the manuscript.
Please cite this article in press as: Che, E., Olsen, M.J. Multi-scan segmentation of terrestrial laser scanning data based on normal variation analysis. ISPRS J. Photogram. Remote Sensing (2018), https://doi.org/10.1016/j.isprsjprs.2018.01.019