Generalization considerations and solutions for point cloud hillslope classifiers

Luke Weidner, Gabriel Walton, Ryan Kromer

PII: S0169-555X(20)30009-X
DOI: https://doi.org/10.1016/j.geomorph.2020.107039
Reference: GEOMOR 107039
To appear in: Geomorphology
Received date: 6 December 2019
Revised date: 7 January 2020
Accepted date: 9 January 2020

Please cite this article as: L. Weidner, G. Walton and R. Kromer, Generalization considerations and solutions for point cloud hillslope classifiers, Geomorphology (2020), https://doi.org/10.1016/j.geomorph.2020.107039
© 2020 Published by Elsevier.
Weidner, Luke1*; Walton, Gabriel1; Kromer, Ryan1

1 Colorado School of Mines, 1500 Illinois Street, Golden, Colorado 80401, USA

*Corresponding author email: [email protected]
Keywords: machine learning; point cloud; classification; domain adaptation
Abstract
Point cloud classifiers have the potential to rapidly perform landscape characterization for a variety of applications. The generalization (i.e., transferability to new sites) of such classifiers could improve their accessibility and usefulness for both engineers and researchers alike, but guidelines for classifier generalization are lacking in the literature. This study develops and applies a Random Forest machine learning classifier for Terrestrial Laser Scanning (TLS) point clouds, and generalizes the classifier to point clouds from several different locations. The classifier is trained to identify basic hillslope topographic features, including vegetation, soil, talus, and bedrock, using multi-scale geometric features of the point cloud. Four rock and soil slopes in western Colorado were scanned using TLS. Generalization experiments were performed testing point density, occlusion, and between-site domain variance factors, and all factors showed a significant influence on generalization accuracy. Several methods for improving classifier generalization accuracy were tested and compared, including combining training data from multiple sites, imposing probability thresholds, and a Domain Adaptation methodology known as Active Learning. It was found that incorporating data from multiple sites resulted in improved generalization accuracy, but in most cases the largest improvements in accuracy were associated with adding new training data from the target site. In this case, using Active Learning resulted in significant accuracy improvements with an over 90% reduction in the number of added training points. The results suggest that scanning characteristics are important factors in classifier generalization accuracy, but their effects can be mitigated by using the techniques described herein.
1. Introduction
Machine learning and remote sensing have been increasingly used in engineering geology applications to gain new insights into dynamic earth surface processes (Jaboyedoff et al., 2012; Abellán et al., 2014; Eitel et al., 2016; Lary et al., 2016). LiDAR and photogrammetry are commonly used to create surface models for use in any number of subsequent engineering and geomorphologic analyses (Lan et al., 2010; Van Den Eeckhaut et al., 2012; Westoby et al., 2012; Tarolli, 2014; Weidner et al., 2019a). Recent applications include automated rock slope monitoring and characterizing soil/rock erosion dynamics (Abellán et al., 2014; Eltner and Baumgart, 2015; Kromer et al., 2017; Wagner et al., 2017; Kromer et al., 2019). In many cases, 2.5-D products (e.g., raster representations) are an unreasonable simplification of the site geometry, and the raw point cloud data are instead used directly for analysis (Lato and Vöge, 2012; Telling et al., 2017). An emerging issue is that interpreting large point cloud datasets to extract semantic features is often difficult and time consuming. Extensive filtering and manual editing are often used to distinguish between different objects of interest, such as vegetation, snow, soil and rock debris, alluvial deposits, talus fans, and bedrock outcrops (van Veen et al., 2017; Bonneau and Hutchinson, 2019).
Tools for solving this problem have been developed in the past few years. Brodu and Lague (2012) developed a tool for 3D classification of point clouds by calculating local point statistics at multiple scales. Their tool has an easy-to-use GUI in the open-source CloudCompare software (Girardeau-Montaut, 2018). Elsewhere, Dunham et al. (2017) created a knowledge-driven decision tree classifier to identify rockfall source areas in outcrops. This method defines thresholds of slope angle and slope roughness to classify points as intact and fractured rock, talus, and overhangs, but does not account for vegetation, which must be classified by other means.

Machine Learning (ML) based methods, using algorithms like Random Forest (Weinmann et al., 2015; Mayr et al., 2017; Becker et al., 2018) and deep neural networks (Qi et al., 2017), have been used to identify many semantic objects simultaneously with a single classifier. In an ML model, parameters are learned automatically, and the only manual requirement is to create training data. These methods have only recently begun to be applied to engineering geology and geomorphology (Mayr et al., 2017). One of the most widespread uses of ML in engineering geology is landslide susceptibility mapping (Reichenbach et al., 2018). In a different application area, Carter (2018) used Random Forest and slope morphology classification separately to identify rockfall source zones in airborne LiDAR and satellite orthoimagery. Mills and Fotopoulos (2015) used an ML classifier to identify and remove wire mesh from rock surface point clouds to improve surface reconstructions. Walton et al. (2016, 2019) used a similar approach to automatically characterize sedimentary beds in an open-pit mine face using mobile LiDAR data. Mayr et al. (2017) used an object-based classification method to identify slope morphology and extract landslide areas. Beretta et al. (2019) classified rock lithology and surface materials in an open pit mine using RGB features from UAV photogrammetry data. Bonneau and Hutchinson (2018) classified a cliff and talus slope based on grain size to track erosion and deposition patterns. While these studies highlight that ML has a multitude of important uses, the generalization of these methods is, for the most part, unexplored.

Applying ML models assumes that the features of the target scene are from the same distribution as the source, but this is not always an accurate assumption in reality (Pan and Yang, 2010). The distribution of the available training dataset may be at best an approximation of the distribution of the target application, and this is particularly relevant to the geosciences, where datasets are typically much smaller in size and heterogeneity through space and time is the norm (Lary et al., 2016). In geoscience practice, limitations on generalization are well known, and Domain Adaptation, which attempts to improve the transferability of ML methods, has become a major area of research (Tuia et al., 2016). The existing literature in engineering geology and geomorphology mostly deals with developing and applying an algorithm to one or a few test locations, and the generalization of these algorithms is suggested as feasible but is rarely demonstrated. In validating a general point cloud classifier for urbanized environments, Becker et al. (2018) performed inter-dataset comparison, observing reduced classification accuracy in cases where the training and testing datasets contained differing object types. These types of validation analyses are an important step in ensuring that developed ML methods are ultimately useful to the greater research community.

For point cloud applications in engineering geology and geomorphology, there is a lack of guidance on the generalization potential of ML classifiers. For example, the accuracy of a pre-built classifier may differ from a site-specific one if the training examples are dissimilar to the target slope conditions, and if this is the case, the best approach may be to build a classifier from scratch using all new training data. Variables controlled by aspects of data collection, such as different point spacing or different scanning geometries, are related to the differences in accuracy between training and testing (Mills, 2015). Further, it remains unclear whether combining data from multiple locations could be used to improve classification accuracy.

The current study investigates these effects using a general rock and soil slope morphology classifier on four spatially and morphologically diverse locations. Experiments were performed by varying point spacing, scanning geometry, and various combinations of training datasets. We demonstrate several techniques that can be used to improve classification accuracy by modifying or augmenting a pre-existing classifier. We employ a prototype Domain Adaptation methodology using Active Learning, whereby the addition of new training data can be expedited.
2. Data collection and site descriptions

LiDAR data were collected at several sites in western Colorado, USA along Interstate 70 using a FARO Focus X330 laser scanner, the locations of which are shown in Figure 1. Scans were collected at two or more positions along the slope at each site to be merged in post-processing. Scans were processed in the CloudCompare software (Girardeau-Montaut, 2018) by first aligning multiple scan locations using the Iterative Closest Point (ICP) algorithm. Aligned scans were then merged to create a single final point cloud with higher point density and reduced occlusion.
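Although our alignment was performed interactively in CloudCompare, the same align-and-merge step can be sketched with the open-source Open3D Python library (an assumed equivalent, not our actual workflow; the file names are placeholders):

```python
import numpy as np
import open3d as o3d

# Two hypothetical scan positions of the same slope (file names are placeholders)
source = o3d.io.read_point_cloud("scan_position_2.ply")
target = o3d.io.read_point_cloud("scan_position_1.ply")

# Point-to-point ICP, refining a coarse initial alignment (identity here for brevity);
# the 0.5 m correspondence distance is an illustrative tolerance
result = o3d.pipelines.registration.registration_icp(
    source, target, 0.5, np.eye(4),
    o3d.pipelines.registration.TransformationEstimationPointToPoint())

# Apply the estimated rigid transform and merge into one denser, less occluded cloud
source.transform(result.transformation)
merged = target + source
o3d.io.write_point_cloud("merged_site.ply", merged)
```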
Table 1 summarizes the characteristics of sites used for this study. Figures 2 and 3 show the labeled point clouds of these sites (see Section 3.1 for explanation of point labels). These sites were chosen because of their interest to the Colorado Department of Transportation as rockfall hazard zones, but they also happen to represent multiple rock types and slope geometries commonly seen in Colorado.

Glenwood Canyon, Colorado, is a steep-walled river canyon cut by the Colorado River east of Glenwood Springs, Colorado. The construction of the Interstate Highway 70 (I-70) corridor in the 1990s prompted increased study of the natural hazards present in the canyon (Mejía-Navarro et al., 1994). The bedrock at the monitoring site used for this study consists of a jointed Paleoproterozoic granite complex found in the western end of the canyon (Kirkham et al., 2009). The sub-vertical slope is approximately 50 meters high and has one main catchment bench. The slope, which is directly adjacent to the Glenwood Canyon pedestrian path, generated several large rockfalls (> 1 m³) between 2016 and 2018.

The Indian Head, DeBeque Canyon, and Palisades sites are located within Cretaceous sandstones and shales along the valley walls adjacent to the Colorado River. As such, they exhibit an alternating ridge and soil slope morphology according to the variability in bed erosive potential, with ridge units more likely to source rockfalls and large blocks capable of reaching the roadway. At the DeBeque Canyon monitoring site, a large, deep-seated landslide is present in addition to the surrounding rock slopes. The DeBeque Canyon Landslide is a very slow-moving (Cruden and Varnes, 1996) complex landslide adjacent to and undercutting I-70. The landslide has been a persistent hazard since roads were constructed in the area in the 1920s (White, 2005). Current mitigation efforts include in-situ instrumentation and LiDAR displacement monitoring.
Figure 1. Overview map showing the site locations along Interstate 70 in western Colorado.
Figure 2. DeBeque and Glenwood processed point clouds with manual material labels shown as different colors. A subset of representative areas is labeled on the slope with colors indicated in the legend, and white/gray points are not labeled.
Figure 3. Palisade and Indian Head point clouds showing manual material labels. A subset of representative areas is labeled on the slope with colors indicated in the legend, and white/gray points are not labeled.
Table 1. LiDAR point cloud locations and characteristics.
Site Name          Type                          Point Spacing     No. Scan Positions   Location
Indian Head (IH)   Mesa cliff and talus system   0.02 m - 0.04 m   2                    DeBeque, Colorado
Glenwood (GW)      Cut rock slope                0.01 m - 0.02 m   3                    Glenwood Springs, Colorado
DeBeque (DB)       Mesa landslide complex        0.02 m - 0.1 m    4                    DeBeque, Colorado
Palisade (PAL)     Mesa slope                    0.02 m - 0.06 m   2                    Palisade, Colorado

3. Methods
3.1 Labeling material types

Point cloud classification tasks typically use supervised ML methods, where manually labeled training and validation data are supplied to the algorithm. Ideally, by observing the training data, the ML algorithm can reasonably reproduce interpretations made by an experienced engineering geologist. Labeling points can be accomplished manually using open-source software, such as the CloudCompare project. The manual labels created by the authors are illustrated for each site in Figures 2 and 3. Manual labeling of the entire site would be prohibitively time consuming and potentially inaccurate, as the edges of the point cloud tend to exhibit much more extreme variations in occlusion and point spacing, making interpretation difficult. Instead, a subset of representative areas was labeled throughout each site. The rest of this section describes how manual class labels for vegetation, soil, talus, and bedrock were interpreted from the point cloud.

Areas where the point cloud appears predominantly 3-dimensional and sparse, especially in small semi-spherical shapes, were considered vegetation. Photographs of the sites, and in some cases point coloration, were also used to manually distinguish this class. On slopes with grasses and shrubbery, smaller vegetation objects tend to blend in with the rock and soil surfaces, and it becomes impractical to distinguish vegetation from the rock or soil surface. In this case, the surface was classified based on its roughness characteristics as either soil or talus.

The distinction between soil and talus is not strictly defined. Many slopes in Colorado contain particle sizes ranging from clay to boulders in irregular mixtures. Bonneau and Hutchinson (2018) recently demonstrated automatic discrimination between particles of different sizes on a talus slope using geometric features of a point cloud. They used a size threshold of 0.25 m, separating the surface into two classes depending on whether the particles were of larger or smaller diameter. We adopt a more general definition of different particle sizes to account for a wider variety of cases encountered in our datasets. Surfaces that appear to consist of more than 50% fine-grained material (sand size or smaller) are defined as belonging to the "soil" class, while surfaces with less than 50% fine-grained material are defined as talus. Another roughly equivalent definition would be to define soil as being "matrix" supported, while talus is "clast" supported.

Bedrock is defined as continuous regions of rock, and in practice this includes detached block failures, which may or may not be connected to bedrock at depth. In terms of slope, bedrock has broadly sub-vertical or overhanging surfaces but could locally have slopes at almost any angle due to bedding and fractures. The ability of a classifier to identify a transition between detached block failures of varying sizes and bedrock depends in part on the scales chosen to calculate neighborhood point features (see Section 3.2.1 below). For example, if neighborhood statistics are calculated at a maximum radius of 3 m, but a detached block has a radius of 5 m, the block is unlikely to be correctly classified.

3.2 Random Forest classifier
We used a Random Forest algorithm to perform our ML experiments. Besides being among the highest-performing algorithms in comparative tests (Fernandez-Delgado et al., 2014), Random Forest is implemented in many programming languages, including MATLAB, Python, C++, and R. Some commercial point cloud software packages also employ Random Forest or a similar ensemble technique, such as Gradient Boosting, for point cloud classification (Becker et al., 2018). One key aspect of this method is that the output of the classifier is a series of probability values indicating the likelihood that a given point falls into each class. Probability outputs can be visually inspected, allowing the engineering geologist to understand the relative uncertainty in the outputs and establish confidence bounds.

The "TreeBagger" MATLAB function (MATLAB Statistics and Machine Learning Toolbox) was used to implement the Random Forest. Random Forest learns feature relationships automatically, but the learning process is governed by several rules called hyperparameters that influence the learned relationships. We have demonstrated elsewhere, however, that for this application output accuracy is not very sensitive to hyperparameter selection (Weidner et al., 2019b). For this study, we used 100 trees and a maximum tree depth of 1000 for moderate protection against overfitting, but otherwise all hyperparameters were given default values, as described in the MATLAB documentation.

The F score metric was used to evaluate the performance of each classifier by comparing the true label with the predicted label for each point (Weinmann et al., 2015; Mayr et al., 2017). F score is defined as the harmonic mean of precision and recall, as follows in Equation 1:

$$F = \frac{2 \times \mathrm{precision} \times \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \qquad (1)$$

A potential side effect of Random Forest's ability to fit an arbitrary non-linear decision boundary is that it can be prone to overfitting. We performed a preliminary comparison of Random Forest with a linear classifier, both in cases where training data for the testing site were and were not included. From this comparison, we found that Random Forest had on average 2% to 14% higher F score than the linear classifier in all cases (see Supplementary Figure 1), and there were no consistent differences in performance depending on whether or not training data from the testing site were included. This indicated that the use of a non-linear classification method is beneficial or neutral, not detrimental, with respect to generalization accuracy.
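For illustration, the sketch below reproduces this training and evaluation workflow with Python's scikit-learn instead of MATLAB's TreeBagger (an assumed equivalent, shown with randomly generated stand-in data):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

# Stand-in data: n points x 56 geometric features, 4 classes (veg, soil, talus, bedrock)
rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(1000, 56)), rng.integers(0, 4, size=1000)
X_test, y_test = rng.normal(size=(500, 56)), rng.integers(0, 4, size=500)

# 100 trees with deep growth permitted, analogous to the paper's 100 trees and
# maximum depth of 1000; all other hyperparameters left at their defaults
rf = RandomForestClassifier(n_estimators=100, max_depth=1000).fit(X_train, y_train)

proba = rf.predict_proba(X_test)                # per-point class probabilities (rows sum to 1)
y_pred = rf.classes_[np.argmax(proba, axis=1)]  # default label = most probable class

# Equation 1: F score per class, then averaged across classes
print(f1_score(y_test, y_pred, average=None))
print(f1_score(y_test, y_pred, average="macro"))
```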
3.2.1 Point features

A similar approach to those used in previous studies was implemented, using multi-scale neighborhood statistics as features to train the algorithms (Brodu and Lague, 2012; Mills and Fotopoulos, 2015; Weinmann et al., 2015; Mayr et al., 2017; Weidner et al., 2019b). Features are calculated at a series of different "scales" (i.e., search radii), a technique that has been demonstrated to result in higher accuracy classification in previous studies (Brodu and Lague, 2012). For our analysis, eight search radii (scales) between 3 m and 0.1 m were used for all features (3, 2, 1.5, 1, 0.75, 0.5, 0.25, and 0.1 m). Scales were chosen based on observation of the typical sizes of objects in the clouds and were intended to be large enough to capture the variation of most large structures on rock slopes, such as fracture patterns and tree foliage.

In order to be applicable to point clouds generated from any scanner, only intrinsic geometric features were used in our classifiers, namely dimensionality features and slope statistics. Geometric dimensionality features are defined in terms of the eigenvalues of the covariance matrix of the neighborhood point X, Y, and Z coordinates. The three eigenvalues $\lambda_1, \lambda_2, \lambda_3$, where $\lambda_1 \geq \lambda_2 \geq \lambda_3$, are reported as a proportion of the sum (Equation 2):

$$e_i = \frac{\lambda_i}{\lambda_1 + \lambda_2 + \lambda_3}, \quad i = 1, 2, 3 \qquad (2)$$

These allow the classifier to distinguish between point neighborhoods that are predominantly 1-dimensional, 2-dimensional, and 3-dimensional. In addition, these features are relatively simple and intuitive, while still being powerful discriminators. In order to also distinguish between surfaces of varying orientations and roughness, we add basic statistics of the slope angle, as derived from normal vector orientations calculated in CloudCompare (see Weidner et al., 2019b). We used the mean, standard deviation, skew, and kurtosis of the slope angles as features. The total number of features for each point, including both dimensionality and slope features at all scales, is 56.
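As a concrete sketch of this feature computation (an assumed Python implementation; our processing used MATLAB and CloudCompare), the normalized eigenvalues of Equation 2 and the slope-angle statistics can be assembled per core point as follows, with the per-point slope angle assumed precomputed in a fourth column:

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.stats import skew, kurtosis

def point_features(cloud, core_points, radii=(3, 2, 1.5, 1, 0.75, 0.5, 0.25, 0.1)):
    """Per core point and per scale: 3 normalized eigenvalues (Equation 2) plus
    4 slope-angle statistics -> 7 features x 8 scales = 56 features total.
    cloud: (N, 4) array of X, Y, Z plus a precomputed slope angle per point."""
    tree = cKDTree(cloud[:, :3])
    feats = []
    for p in core_points:
        row = []
        for r in radii:
            nbrs = cloud[tree.query_ball_point(p[:3], r)]
            # Eigenvalues of the neighborhood XYZ covariance, sorted so l1 >= l2 >= l3;
            # assumes each neighborhood contains at least a few non-collinear points
            lam = np.sort(np.linalg.eigvalsh(np.cov(nbrs[:, :3].T)))[::-1]
            row.extend(lam / lam.sum())
            s = nbrs[:, 3]  # slope angles within the neighborhood
            row.extend([s.mean(), s.std(), skew(s), kurtosis(s)])
        feats.append(row)
    return np.array(feats)
```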
A commonly used approach to manage the computation requirements of this procedure is to calculate features only on a set of "core" points subsampled from the full point cloud (Brodu and Lague, 2012). The full point cloud is used for feature calculation at the core points. In the output cloud, core point labels are propagated to their nearest neighbors such that every point in the full cloud has a label. This allows for a reasonably high resolution classification while also significantly reducing the number of required calculations. Generally, the number of core points used in our classifiers after subsampling varies from 100,000 to 400,000, corresponding to core point spacings of between around 0.04 m and 0.2 m (depending on the site considered).
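The label propagation step can be sketched with a nearest-neighbor query (an assumed SciPy implementation, not our MATLAB code):

```python
import numpy as np
from scipy.spatial import cKDTree

def propagate_labels(full_xyz, core_xyz, core_labels):
    """Give every point in the full cloud the label of its nearest core point."""
    _, idx = cKDTree(core_xyz).query(full_xyz, k=1)
    return np.asarray(core_labels)[idx]
```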
3.3 Experiments performed

3.3.1 Point spacing experiment

Previous studies have discussed the potential influence of point density on geometric features, and several schemes have been proposed to derive features robust to this effect, many of which include multi-scale features like those used in this study (Brodu and Lague, 2012; Lin et al., 2014; Wang et al., 2015). Classifiers trained on a site with high point density may still have reduced performance if applied to a site with much lower point density, however. With that in mind, an experiment was performed to measure the influence of point density on classification performance. The full Glenwood point cloud (the subject of detailed study by Weidner et al., 2019b) was spatially subsampled at six different levels between point spacings of 0.01 m and 0.16 m, and point features were calculated for each case. Then, a classifier was trained and tested for each combination of point spacings.

The same core points were used for all classifiers. Therefore, it is critical to note that the training and testing points are not independent in this experiment, so output performances are not representative of actual generalization performance, only of the relative effect of point spacing.
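Conceptually, the experiment is a grid of train/test spacing combinations. The sketch below uses hypothetical stand-in data, and the intermediate spacing levels shown are illustrative, as only the 0.01 m and 0.16 m endpoints are stated above:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score

# Stand-in data: X maps each spacing to a 56-feature matrix computed at the same
# core points; y holds the shared core-point labels (both randomly generated here)
spacings = [0.01, 0.02, 0.04, 0.08, 0.12, 0.16]  # six levels; middle values illustrative
rng = np.random.default_rng(0)
y = rng.integers(0, 4, size=2000)
X = {s: rng.normal(size=(2000, 56)) for s in spacings}

scores = np.zeros((len(spacings), len(spacings)))
for i, s_train in enumerate(spacings):
    rf = RandomForestClassifier(n_estimators=100).fit(X[s_train], y)
    for j, s_test in enumerate(spacings):
        # Rows = training spacing, columns = testing spacing (cf. Figure 6)
        scores[i, j] = f1_score(y, rf.predict(X[s_test]), average="macro")
```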
3.3.2 Occlusion experiment
Shadow regions in point clouds, referred to as areas of occlusion, result from objects blocking the line of sight of the scanner. Self-occlusion of objects in a terrestrially-collected point cloud is typically unavoidable, and this can influence the geometric features calculated in these regions. Mills (2015) calculated multi-scale features for LiDAR scans of a rock outcrop taken from six different vantage points. He found that scan positions that were more oblique and self-occluded had higher variance in their calculated geometric features, and suggested that this results in less reliable classifier performance. Our goal is to quantify this with real-world estimates of the change in classification accuracy.

We performed an occlusion test for the Glenwood site, which was scanned from three different vantage points along the approximately 300 m-wide slope (Weidner et al., 2019b). Many shadow areas are created in the point clouds due to the shape of blocks and the extent of deep channels incised into the rockmass (Figure 4). For each of the three individual scans, as well as for the merged full-slope point cloud, core points and geometric features were calculated. Then, the merged full-cloud features were used to train a classifier, which was tested on the features from individual scans. Note that due to the relatively low prevalence of soil at this site, as well as for simplicity, the soil and talus classes were merged into a single "talus" class for this experiment.

In a similar fashion to the point spacing experiment, the training and testing points used in this test are spatially intermixed with each other, and the performance metrics are therefore an overestimate of the actual generalization performance. To confirm that the relative relationships observed are credible, we also performed the same test but trained the classifier using data from Indian Head, a separate site.
Figure 4. A region of the Glenwood site showing significant variation in occlusion of the target slope based on the position of the scanner. The plan drawing to the right illustrates the positions of the scan locations relative to the Region Of Interest (ROI), shown in the left panels.

3.3.3 Site generalization experiment

The goal of the site generalization experiment was to evaluate the effect on classification accuracy when the ML algorithms are applied to multiple different rock slopes. This is intended to capture the influence of several factors on classification accuracy, including point spacing variability, occlusion, true differences in slope morphology between sites, and labeling errors.

The general testing procedure followed is illustrated in Figure 5. Four sites were used for classifier training: Glenwood, DeBeque, Palisade, and Indian Head (see Figures 2 and 3). The labels for each site were manually divided into spatially disjoint training and validation regions. Points in the training region were used to train the classifier, and points in the validation region were used to test the generalization accuracy. A classifier was trained for several possible combinations of sites; the final classifier contains training data from all 4 sites. For example, for testing the DeBeque site, classifiers were trained individually on the Glenwood, Indian Head, Palisades, and DeBeque training sets. Then classifiers were trained in groups of two sites (not including DeBeque), a group of three sites (not including DeBeque), and all four sites together. Validation statistics were then calculated using the DeBeque validation point set.

The classes used in this analysis were vegetation, soil, talus, and bedrock. Each site, however, has varying amounts of these four classes, and this is reflected in the training data. As multiple sites are added to the training data, it becomes possible for one site to dominate and reduce the influence of other sites in the classifier. We chose to balance classes between sites, such that the class proportions are the same for each site, and consequently each site has the same number of training examples. This was done by undersampling points from the training data for each class, where the number of points sampled is equal to the number of points for the site with the fewest instances. Expressed in equation form, this procedure appears as follows in Equation 3:

$$n_{c,s} = \min_{s' \in S} N_{c,s'} \qquad (3)$$

where $N_{c,s}$ is the number of available training points of class $c$ at site $s$, $S$ is the set of training sites, and $n_{c,s}$ is the number of points retained for class $c$ at site $s$ after undersampling.
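A minimal sketch of this undersampling step, assuming NumPy arrays of per-point class labels and site identifiers (an illustrative implementation of Equation 3, not our MATLAB code):

```python
import numpy as np

def balance_classes(labels, sites, seed=0):
    """Undersample so every (class, site) pair keeps the same number of points,
    per Equation 3: the count for class c at the site with the fewest instances."""
    rng = np.random.default_rng(seed)
    keep = []
    for c in np.unique(labels):
        per_site = [np.flatnonzero((labels == c) & (sites == s)) for s in np.unique(sites)]
        n = min(len(ix) for ix in per_site)  # fewest instances of class c at any site
        keep.extend(rng.choice(ix, size=n, replace=False) for ix in per_site)
    return np.concatenate(keep)              # indices of the balanced training subset
```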
3.4 Classifier modification techniques

3.4.1 Modifying probability thresholds

Each point in an output prediction is assigned a probability for each class, where the sum of all probabilities is equal to 1, and the final output label is assigned as the class with the highest probability. Alternatively, thresholds can be manually assigned to make the classifier more or less sensitive to certain classes and reject points below the threshold. This allows a degree of control and flexibility, as the user can visually inspect and adjust the classification results. In addition to the option of manually adjusting the probabilities for each class, we propose to display the maximum probability value for each point as a proxy for confidence. With this method, a point with similar probability values for multiple classes indicates disagreement or "confusion" in the classifier, and these points will correspondingly have lower confidence. The minimum confidence value is 0.25 (all classes equally probable), and the maximum is 1 (one class "certain"). Classifier confusion can result when objects of different true classes have similar geometric characteristics, which could have some physical meaning. If the classifier is confused between soil and talus, for example, this could indicate that the surface roughness is somewhere between the pure soil and talus endmembers.

In this study, we demonstrate manual thresholding of the output probabilities for all four sites. First, the 3-site classifier for each site (i.e., trained on all other sites but the target site) is applied to the 4th site validation set. Then, the output probabilities for each class are manually inspected in CloudCompare, and a threshold value is chosen for each class. These thresholds are then applied to the probabilities, such that a class can only be applied as a label if its probability is above the threshold. Any point whose probabilities fall below all the thresholds is left unlabeled in the final classification.
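The confidence metric and thresholding logic can be sketched as follows (an assumed NumPy implementation; the class names and threshold values in the usage comment are illustrative):

```python
import numpy as np

def apply_thresholds(proba, classes, thresholds):
    """Keep the most probable class among those clearing their per-class threshold;
    points that fail every threshold are left unlabeled (None)."""
    confidence = proba.max(axis=1)        # 1 = one class certain; 0.25 = all four classes tied
    ok = proba >= np.asarray(thresholds)  # admissible classes, point by point
    best = np.where(ok, proba, -np.inf).argmax(axis=1)
    labels = np.where(ok.any(axis=1), np.asarray(classes, dtype=object)[best], None)
    return labels, confidence

# Illustrative use: demand a higher probability before labeling talus
# labels, conf = apply_thresholds(proba, ["veg", "soil", "talus", "rock"], [0.4, 0.4, 0.6, 0.4])
```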
3.4.2 Active learning

To further ensure high accuracy, a limited set of new training labels can be created using data from the target site. Creation of new labels can be time consuming, but the process can be expedited using concepts from the field of Active Learning in Domain Adaptation, in which new labels are created in areas where they will have the highest impact on a modified classifier (Crawford et al., 2013; Tuia et al., 2016). For example, new labels can be focused on regions of the point cloud identified as low confidence in a preliminary classification. With a small number of iterations, an improved classifier can be developed while essentially minimizing the need for human intervention (Maiora et al., 2014). We hypothesize that labeling low-confidence points will have a larger marginal increase in accuracy than labeling points for which the classifier is already highly confident.

We demonstrate this technique for point cloud classification as follows. We use a pre-made classifier to identify which labels are most uncertain, and add labels for those points to the classifier. First, the 3-site classifier is applied to the 4th site training set. Then, our confidence metric is calculated, and points in the 4th site training set are divided into groups based on their confidence values. Four point groups were created, containing confidence values <0.5, 0.5-0.6, 0.6-0.8, and >0.8. The manual ground truth labels for each group were separately added to the 3-site training data, creating four new classifiers. Each new classifier was then tested on the validation point set for that site.
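A single round of this procedure can be sketched as follows (an assumed implementation; rf_3site denotes a hypothetical pre-trained 3-site classifier):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def active_learning_step(rf_3site, X_src, y_src, X_tgt, y_tgt, conf_cut=0.5):
    """One Active Learning round: add to the training set only those target-site
    points the preliminary 3-site classifier is least confident about."""
    conf = rf_3site.predict_proba(X_tgt).max(axis=1)  # preliminary confidence
    pick = conf < conf_cut                            # e.g., the <0.5 group
    X_new = np.vstack([X_src, X_tgt[pick]])
    y_new = np.concatenate([y_src, y_tgt[pick]])      # ground truth supplied manually
    return RandomForestClassifier(n_estimators=100).fit(X_new, y_new)
```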
Figure 5. General Random Forest training and validation workflow. Note that the "Balance classes between sites" step was only performed for the site generalization experiment, but all other steps are the same for all experiments.

4. Results
4.1 Point spacing effects

Variations in output classifier performance (as measured by individual class and mean F scores) were observed based on corresponding input variations in point spacing and occlusion (Figures 6 and 7). For point spacing, the main observation was that using dissimilar training and testing point spacings resulted in a reduction in performance. Training and testing using the same point spacings resulted in the highest performance of around 0.9, and for most cases there was a decreasing trend as spacing became more dissimilar between training and testing. A model trained on 0.01 m spacing and tested on 0.16 m spacing had a mean F score of 0.84, while the opposite case (training 0.16 m, testing 0.01 m) had a mean F score of 0.75. This suggests that attempting to generalize to a lower point density (larger distances between points) than the source is less detrimental than generalizing to a higher point density than the source.
Figure 6. Results of Random Forest classification of the Glenwood site using varying combinations of training and testing point spacing. Mean F scores are shown in each cell, with cell colors indicating relative differences in mean F score (green = higher, red = lower).

4.2 Occlusion effects

The introduction of occluded areas into the point cloud also appears to reduce classification accuracy (Figure 7). Individual scan positions at the Glenwood site consistently showed reduced F scores relative to a case where all positions were merged together. The effect was generally reproduced both when Glenwood itself was used for training, and in the generalization case when Indian Head was used for training. In terms of scan positions, Position 1 had the largest reduction in F score of the three. In terms of material types, the talus class had the largest reduction, about -0.1 F score, followed by vegetation, with about -0.05. For bedrock, Position 3 actually had higher accuracy than for the merged case. This is because individual positions may be more likely to over-represent smooth surfaces perpendicular to the scanner line of sight, which are more easily classified (correctly) as bedrock. It follows that this bias toward simple, planar surfaces will result in a decrease in the classification performance for materials that inherently have more 3-dimensional characteristics and/or are oblique to the line of sight (such as talus and vegetation).
Figure 7. Occlusion experiment results. F scores are shown for testing on Glenwood scans taken from three different positions (red, green, and blue bars), and for all three positions merged into a single cloud (black bars). Two classifiers were trained: one using the merged Glenwood cloud (left) and one using the merged point cloud for a different site (right).
4.3 Multi-site validation metrics
Validation results for site generalization tests are summarized in Figure 8 and Table 2. Each vertical cluster of points represents a classifier trained on a number of other sites and tested on the validation set indicated in the subfigure title. A primary observation is that performance varies significantly depending on the sites used in training. Variability in F scores is reduced, however, with an increase in the number of training sites, and performance appears to converge to a roughly "average" value for the 3-site and 4-site classifiers. F score increases by as much as 0.25 for some classes between the 3-site and 4-site classifiers, where training data from the target site is added.
Different colors in Figure 8 represent the individual F scores reported for each material class. Bedrock tends to be the most accurately distinguished class at all sites, with F scores typically above 0.9. The general trend of increasing or relatively unchanging accuracy with more trained sites is visible in the individual F scores, with a few exceptions. Vegetation is consistently misclassified at the Palisade site (F scores between 0.25 and 0.65), and soil is misclassified at the Glenwood Canyon site (F scores between 0.3 and 0.6). These errors have reasonable explanations related to particular circumstances at these sites. At Glenwood Canyon, for example, there are several shallow-dipping, smooth discontinuity surfaces in bedrock which are indistinguishable from soil based on geometry characteristics alone. This is why, compared to other sites, Glenwood Canyon metrics for soil are lower. Similarly for DeBeque, sufficiently large boulders (>6 m diameter) are commonly labeled as bedrock because the feature neighborhood radius is not large enough to capture multiple faces of the boulder, and so the geometric features are indistinguishable from those of bedrock.

Pairwise comparisons of Random Forest classifiers trained on individual sites appear in Table 2. The diagonal entries in the left portion of the table indicate training and testing on the same site (using separate training and validation points). Off-diagonals indicate training and testing on different individual sites. Generally, the highest accuracy classifiers are those including training data from the target site (i.e., training and testing on the same site), and the 3-site classifiers, which did not include training data from the target site, were second highest in accuracy. Classifying a pair of sites generally results in similar accuracy regardless of which site was used for training and which was used for testing (e.g., PAL-IH: 0.69; IH-PAL: 0.74). This is not the case, however, with the Palisade-DeBeque pair, where training on DeBeque data results in a mean F score of 0.80 when testing on Palisade data, but training on Palisade data results in a mean F score of 0.64 when testing on DeBeque data.
Figure 8. Results of incremental training and validation of the Random Forest classifier, with all F scores shown for individual classes.
Table 2. Mean F score results for various combinations of training and testing of the RF classifier. Bolded values indicate the highest value for each row.

Test Label   Training Site
             DB      IH      PAL     GW      2-site (mean)   3-site   4-site
DB           0.87    0.78    0.64    0.67    0.76            0.79     0.87
IH           0.82    0.80    0.69    0.76    0.78            0.85     0.87
PAL          0.80    0.74    0.87    0.58    0.81            0.80     0.87
GW           0.61    0.70    0.61    0.66    0.67            0.66     0.70

4.3.1 Qualitative observations
Visual observations can be made to confirm the accuracy and usefulness of the classifier in interpreting and extracting slope morphology. The 4-site Random Forest algorithm was applied to the full point clouds for all sites for analysis. For DeBeque (Figure 9), smooth surfaces, including bare soil and short grasses, tend to be classified as the soil class as expected. Distinct shrubs of various sizes are correctly identified as vegetation. The DeBeque landslide is of particular interest because it consists of a large sliding sandstone block (width up to around 130 m) and a rubble zone consisting of many smaller rotating and translating blocks 5 to 20 m in diameter. These are all for the most part labeled as bedrock by the classifier. Common misclassifications are small shrubs being labeled as talus, the undersides of and notches in rock blocks being labeled as vegetation, and the existence of a "halo" effect of talus labels around isolated shrubs (bottom center in Figure 9).

Confidence metrics allow the user to quickly interpret and correct some of the errors described above. For example, a comparison of our confidence metric, material labels, and actual topography for the Palisade site is presented in Figure 10. Confidence is high, close to 1, for some bedrock and soil surfaces, and confidence decreases in areas where multiple classes are in close proximity. Boulders in this case have the lowest confidence, and this is reflected in the alternating of talus and bedrock labels (Figure 10b). Small notches and crevices in bedrock and between boulders are commonly mislabeled as vegetation, and this is particularly evident at Palisade (see also Figure 8).
4.4 Comparison of classification approaches

The results presented in Section 4.3 generally represent three different approaches to classifier generalization: 1) obtain a pre-trained classifier from other sites; 2) train a new classifier on the target site (i.e., do not attempt to use a generalized classifier); 3) combine a pre-trained classifier with training from the target site. In addition to these basic approaches, we tested two additional methodologies which are essentially slight modifications of these three main techniques. Accuracy results for all of these techniques are summarized in Table 3.

The first additional method is to apply probability thresholds to the pre-trained classifier (Section 3.4.1). This is visually illustrated in Figure 11 for the Indian Head site. The slope shown is a combination of bedrock outcrops, soil, and talus. The outcrops are exposed on the right side of the image, while the left side is mostly a cover of soil and talus, and the transition between the two sides is not visually obvious. The default classification in Figure 11a is manually modified with user-defined thresholds in Figure 11b, such that a higher probability threshold is required for the labeling of talus. Areas that fall below the threshold values are shown as grey points. For a small number of unlabeled points, this can result in an increase in classification accuracy compared to the default classification (Table 3).

The second additional method is to add a small number of high-impact training points from the target site using Active Learning (see Section 3.4.2). Figure 12 confirms that low-confidence points have as much as five times the per-point value added as high-confidence points. Adding these high-impact points (i.e., preliminary confidence < 0.5) to the training results in a significant increase in accuracy, close to or even exceeding the 4-site classifier accuracy (Table 3), with over a 90% reduction in the number of target site training points used.
Figure 9. DeBeque Landslide point cloud classification results using the 4-site classifier (top). Photograph of approximately the same area taken in August 2019. The main rubble zone of the landslide is visible in the upper left portion of the scene.
Figure 10. Palisade site classification results using the 4-site classifier. a) Confidence metric results. b) Full classification of the same area. White boxes highlight confusion between the talus and bedrock classes in large boulders. c) Photograph of approximately the same area taken August 2019.
Figure 11. Example of modifying probability thresholds (Section 3.4.1) for the Indian Head site. a) The classification output without applying thresholds. b) The same classification after manual thresholds are applied. c) Photograph of approximately the same area taken August 2019.
Table 3. Mean F score comparison of alternative approaches to improve classifier generalization. Columns are ordered in increasing manual effort required from left to right.

Test                 3-site   3-site MM*   AL**   Target only   4-site
DB                   0.79     0.83         0.86   0.87          0.87
IH                   0.85     0.86         0.87   0.80          0.87
PAL                  0.80     0.83         0.84   0.87          0.87
GW                   0.66     0.72         0.73   0.66          0.70
Relative effort***   0        1            2      5             5

*Manually Modified thresholds, <5% points unlabeled
**Active Learning, preliminary confidence < 0.5 added to training
***Presumes availability of a pre-existing generalized classifier (3-site)
Figure 12. Evaluation of the Active Learning method for different sets of added training points. Preliminary point confidence is calculated using the 3-site classifier, and per-point increase in mean F score is calculated on the 4th site validation set.
5. Discussion

The results described above illustrate several challenges and considerations for transferring and generalizing point cloud hillslope classifiers. In addition, several solutions are proposed and compared for classifier generalization in practice.

One might expect that sites with similar hillslope characteristics could be better suited for classifier generalization. The DeBeque, Indian Head, and Palisades sites, for example, exist within similar sedimentary packages and are being shaped by similar geologic processes. However, several other factors can result in lower-than-expected accuracy, even if geologic factors are held roughly constant. Differences in point density and scanning geometry (causing occlusion) reduce classification accuracy (Figures 6 and 7). Multi-scale features are often suggested as being resistant to point density effects (Brodu and Lague, 2012), but we nonetheless observe significant differences in accuracy. This means that point density and occlusion should be considered important factors when obtaining training data for a generalized classifier (Lin et al., 2014; Wang et al., 2015; Weinmann et al., 2015).

Another factor of equal importance, but more difficult to quantify, is domain bias (Mills, 2015). Domain bias is the result of inconsistent labeling, wherein the training labels manually created by the user do not accurately represent distinct hillslope materials in reality. Previous work has illustrated that even among experts, there exists some variability in the manual interpretation of basic landscape features in point clouds (Weidner et al., 2019b), and in geologic data more generally (Bond et al., 2007). This variability is reflected in the automatic interpretations of a classifier trained on imperfect labels.

Becker et al. (2018) note the "importance of reliable and varied training data" in creating a point cloud classifier. In assessing domain bias, Mills (2015) showed that increasing the spatial variety of training points resulted in more consistent and accurate classifiers, an effect we have reproduced here on a larger scale (Figure 8 and Table 2). We find that classifiers built with a larger variety of point clouds from different localities tended to have more consistent performance, but not necessarily optimal performance, when tested on unseen data. Note also that this is independent of the total number of training points.

Obtaining training data can be one of the most difficult steps in implementing an ML classifier, especially in a geoscience context. Our results show that using outsourced training data from the scientific community can be a feasible option, and several methods can be used to augment these data and improve their site-specific accuracy (Table 3). Pre-existing training data may be more likely to exist when the label domain is fairly general, such as when filtering vegetation. Note that if such data are to be made publicly available to the research and engineering communities, point spacing, occlusion, and domain characteristics should be considered and presented.

We have shown two main methods for improving generalization accuracy: imposing probability thresholds and adding some amount of training data from the target site. These methods require varying amounts of additional manual time and intervention, and this is dependent to a large degree on the user interface. Steps such as file input and output, visualization tools, and algorithm execution could be performed semi-automatically, but for this study, all processing was performed manually using MATLAB and CloudCompare. Regardless, we can still provide a first-order estimate of the relative effort required for classification methods (Table 3). The 3-site classifier represents the case where a pre-made dataset is obtained from an outside source and is applied without modification. This requires the minimum additional effort, but it also has the lowest accuracy. This pre-made classifier can be modified to improve accuracy by applying probability thresholds, a process which in this case took about 0.5 hours.

Accuracy can be further improved by adding training data from the target site. Importantly, we have shown that this process can be expedited using Active Learning, resulting in comparably high accuracy to the 4-site classifier with an over 90% reduction in the number of additional training points used. This process is much more dependent on the ability of the user to interpret point cloud features, and could take on the order of 1-3 hours. Adding a large amount of training data from the target site is typically the best option in terms of improving classification accuracy, but this also requires the most manual input. If deciding between a target-only classifier and a classifier combining target and external data, the latter is to be preferred, assuming the label domain is substantially similar between target and source. We found in two cases that the target-only classifier had lower performance than the 4-site classifier, indicating that even if target site data is available, there is still some potential benefit, and no observed detriment in terms of accuracy, to obtaining data from other sites.

6. Conclusions
Point cloud classifiers in engineering geology and geomorphology have the potential to rapidly perform basic landscape characterization for a variety of applications. The generalization of such classifiers, possibly making use of emerging open-data sharing initiatives, could drastically improve their accessibility and usefulness for both engineers and researchers alike. Guidelines for classifier generalization are lacking in the research literature, however. This is especially the case given the challenging conditions presented by point cloud data and geoscience problems. This study presents several challenges for geomaterial classifier generalization, and we also propose a number of potential solutions. Consideration of these should enable
Journal Pre-proof more effective data sharing within the research and engineering communities. We find that variations in several point cloud characteristics can result in major reductions in classifier performance when using commonly implemented geometric point cloud descriptors as features. We also find that training on multiple varied sites in addition to the target site improves generalization accuracy. Our main findings include the following: Differences in point density between training and testing result in a reduction in
of
ro
classifier accuracy; training on dense data and testing on sparse data is
Partial occlusion of objects in the scene results in variance in calculated features
re
-p
preferable to the inverse case.
lP
and biased representations of some classes (such as vertical bedrock surfaces) over others (such as talus).
Domain biases influence the generalization accuracy of classifiers. This can be
na
ur
caused by variance in labeling interpretations made by the user or by inherent differences in the geometric characteristics of objects between sites. All of the above issues can be mitigated by aggregating point clouds from
Jo
multiple sites and by supplementing a generalized classifier with training data from the target site.
Active Learning can be used to identify a small number of new training points that will have a high impact on generalization accuracy. Using a prototype Active Learning framework on our sites resulted in comparable accuracy improvement to that achieved using a full set of training data from the target site, but with over
Journal Pre-proof 90% fewer added training points. Therefore, more future experimentation with Active Learning methodologies is merited. This study highlights the need for effective data sharing between practitioners, including the communication of relevant metadata. While most data types can be hosted in online repositories, the size of point cloud datasets can be prohibitive in some cases. The reader is encouraged to contact the authors for access to the point clouds used for this
ro
of
study.
Acknowledgements

We would like to acknowledge the Colorado Department of Transportation for funding the data collection for this project. We would also like to thank all those who have assisted in data collection field trips for this work, including Michelle Mullane, Amber Hill, Caroline Lefeuvre, Heather Schovanec, and Brian Gray.
Journal Pre-proof
References Abellán, A., Oppikofer, T., Jaboyedoff, M., Rosser, N.J., Lim, M., Lato, M.J., 2014.
of
Terrestrial laser scanning of rock slope instabilities. Earth Surface Processes and
ro
Landforms 39, 80–97. https://doi.org/10.1002/esp.3493
-p
Becker, C., Rosinskaya, E., Häni, N., D‟Angelo, E., Strecha, C., 2018. Classification of
re
aerial photogrammetric 3D point clouds. Photogrammetric Engineering & Remote
lP
Sensing 84, 287–295. https://doi.org/10.14358/PERS.84.5.287 Beretta, F., Rodrigues, A.L., Peroni, R.L., Costa, J.F.C.L., 2019. Automated lithological
na
classification using UAV and machine learning on an open cast mine. Applied
ur
Earth Science 128, 79–88. https://doi.org/10.1080/25726838.2019.1578031
Jo
Bond, C.E., Shipton, Z.K., Jones, R.R., Butler, R.W.H., Gibbs, A.D., 2007. Knowledge transfer in a digital world: Field data acquisition, uncertainty, visualization, and data management. Geosphere 3, 568–576. https://doi.org/10.1130/GES00094.1 Bonneau, D.A., Hutchinson, D.J., 2019. The use of terrestrial laser scanning for the characterization of a cliff-talus system in the Thompson River Valley, British Columbia,
Canada.
Geomorphology
https://doi.org/10.1016/j.geomorph.2018.11.022
327,
598–609.
Journal Pre-proof Brodu, N., Lague, D., 2012. 3D terrestrial lidar data classification of complex natural scenes
using
a
multi-scale
dimensionality
criterion:
applications
in
geomorphology. ISPRS Journal of Photogrammetry and Remote Sensing 68, 121–134. https://doi.org/10.1016/j.isprsjprs.2012.01.006 Carter, R., 2018. Identifying rockfall hazards in the Fraser Canyon, British Columbia: a semi-automated approach to the classification and assessment of topographic
ro
University, Kingston, Ontario, Canada.
of
information from airborne LiDAR and orthoimagery. MSc Thesis, Queen‟s
-p
Crawford, M.M., Tuia, D., Yang, H.L., 2013. Active learning: any value for classification
re
of remotely sensed data? Proceedings of the IEEE 2013, vol. 101, 593–608.
lP
https://doi.org/10.1109/JPROC.2012.2231951
na
Cruden, D.M., Varnes, D.J., 1996. Landslides: Investigation and Mitigation. Chapter 3 Landslide types and processes. Transportation Research Board Special Report,
ur
n. 247, pp. 36-75. https://trid.trb.org/view/462501
Jo
Dunham, L., Wartman, J., Olsen, M.J., O‟Banion, M., Cunningham, K., 2017. Rockfall Activity Index (RAI): A lidar-derived, morphology-based method for hazard assessment.
Engineering
Geology
221,
184–192.
https://doi.org/10.1016/j.enggeo.2017.03.009 Eitel, J.U.H., Höfle, B., Vierling, L.A., Abellán, A., Asner, G.P., Deems, J.S., Glennie, C.L., Joerg, P.C., LeWinter, A.L., Magney, T.S., Mandlburger, G., Morton, D.C., Müller, J., Vierling, K.T., 2016. Beyond 3-D: The new spectrum of lidar
Journal Pre-proof applications for earth and ecological sciences. Remote Sensing of Environment 186, 372–392. https://doi.org/10.1016/j.rse.2016.08.018 Eltner, A., Baumgart, P., 2015. Accuracy constraints of terrestrial Lidar data for soil erosion measurement: Application to a Mediterranean field plot. Geomorphology 245, 243–254. https://doi.org/10.1016/j.geomorph.2015.06.008
of
Fernandez-Delgado, M., Cernadas, E., Barro, S., Amorim, D., 2014. Do we need
Machine
Learning
ro
hundreds of classifiers to solve real world classification problems? Journal of Research
3133–3181.
-p
http://jmlr.org/papers/v15/delgado14a.html
15,
re
Girardeau-Montaut, D. 2018. CloudCompare 3D point cloud and mesh processing
lP
software (version 2.10). [GPL software]. http://www.cloudcompare.org/
na
Jaboyedoff, M., Oppikofer, T., Abellán, A., Derron, M.-H., Loye, A., Metzger, R., Pedrazzini, A., 2012. Use of LIDAR in landslide investigations: a review. Natural
ur
Hazards 61, 5–28. https://doi.org/10.1007/s11069-010-9634-2
Jo
Kirkham, R.M., Streufert, R.K., Cappa, J.A., Shaw, C.A., Allen, J.L., Schroeder, T.J.I., 2009. Geologic Map of the Glenwood Springs Quadrangle, Garfield County, Colorado. MS-38. Scale 1:24,000, Colorado Geological Survey. Kromer, R.A., Abellan, A., Hutchinson, D.J., Lato, M., Chanut, M.-A., Dubois, L., Jaboyedoff, M., 2017. Automated terrestrial laser scanning with near real-time change detection - monitoring of the Séchilienne landslide. Earth Surface Dynamics Discussions 1–33. https://doi.org/10.5194/esurf-2017-6
Kromer, R., Walton, G., Gray, B., Lato, M., Group, R., 2019. Development and optimization of an automated fixed-location time lapse photogrammetric rock slope monitoring system. Remote Sensing 11, 1890. https://doi.org/10.3390/rs11161890
Lan, H., Martin, C.D., Zhou, C., Lim, C.H., 2010. Rockfall hazard analysis using LiDAR and spatial modeling. Geomorphology 118, 213–223. https://doi.org/10.1016/j.geomorph.2010.01.002
Lary, D.J., Alavi, A.H., Gandomi, A.H., Walker, A.L., 2016. Machine learning in geosciences and remote sensing. Geoscience Frontiers, Special Issue: Progress of Machine Learning in Geosciences 7, 3–10. https://doi.org/10.1016/j.gsf.2015.07.003
Lato, M.J., Vöge, M., 2012. Automated mapping of rock discontinuities in 3D lidar and photogrammetry models. International Journal of Rock Mechanics and Mining Sciences 54, 150–158. https://doi.org/10.1016/j.ijrmms.2012.06.003
Lin, C.-H., Chen, J.-Y., Su, P.-L., Chen, C.-H., 2014. Eigen-feature analysis of weighted covariance matrices for LiDAR point cloud classification. ISPRS Journal of Photogrammetry and Remote Sensing 94, 70–79. https://doi.org/10.1016/j.isprsjprs.2014.04.016
Maiora, J., Ayerdi, B., Graña, M., 2014. Random forest active learning for AAA thrombus segmentation in computed tomography angiography images. Neurocomputing, Recent trends in Intelligent Data Analysis 126, 71–77. https://doi.org/10.1016/j.neucom.2013.01.051
Mayr, A., Rutzinger, M., Bremer, M., Elberink, S.O., Stumpf, F., Geitner, C., 2017. Object-based classification of terrestrial laser scanning point clouds for landslide monitoring. The Photogrammetric Record 32, 377–397. https://doi.org/10.1111/phor.12215
Mejía-Navarro, M., Wohl, E.E., Oaks, S.D., 1994. Geological hazards, vulnerability, and risk assessment using GIS: model for Glenwood Springs, Colorado. Geomorphology 10, 331–354. https://doi.org/10.1016/0169-555X(94)90024-8
Mills, G., 2015. Numerical tools for interpreting rock surface roughness. PhD Thesis, Queen's University, Kingston, Ontario, Canada. https://qspace.library.queensu.ca/handle/1974/13097
Mills, G., Fotopoulos, G., 2015. Rock surface classification in a mine drift using multiscale geometric features. IEEE Geoscience and Remote Sensing Letters 12, 1322–1326. https://doi.org/10.1109/LGRS.2015.2398814
Pan, S.J., Yang, Q., 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22, 1345–1359. https://doi.org/10.1109/TKDE.2009.191
Qi, C.R., Su, H., Mo, K., Guibas, L.J., 2017. PointNet: Deep learning on point sets for 3D classification and segmentation. The IEEE Conference on Computer Vision and Pattern Recognition, 652–660. http://arxiv.org/abs/1612.00593
Reichenbach, P., Rossi, M., Malamud, B.D., Mihir, M., Guzzetti, F., 2018. A review of statistically-based landslide susceptibility models. Earth-Science Reviews 180, 60–91. https://doi.org/10.1016/j.earscirev.2018.03.001
Tarolli, P., 2014. High-resolution topography for understanding Earth surface processes: Opportunities and challenges. Geomorphology 216, 295–312. https://doi.org/10.1016/j.geomorph.2014.03.008
Telling, J., Lyda, A., Hartzell, P., Glennie, C., 2017. Review of Earth science research using terrestrial laser scanning. Earth-Science Reviews 169, 35–68. https://doi.org/10.1016/j.earscirev.2017.04.007
Tuia, D., Persello, C., Bruzzone, L., 2016. Domain adaptation for the classification of remote sensing data: an overview of recent advances. IEEE Geoscience and Remote Sensing Magazine 4, 41–57. https://doi.org/10.1109/MGRS.2016.2548504
Van Den Eeckhaut, M., Kerle, N., Poesen, J., Hervás, J., 2012. Object-oriented identification of forested landslides with derivatives of single pulse LiDAR data. Geomorphology 173–174, 30–42. https://doi.org/10.1016/j.geomorph.2012.05.024
van Veen, M., Hutchinson, D.J., Kromer, R., Lato, M., Edwards, T., 2017. Effects of sampling interval on the frequency-magnitude relationship of rockfalls detected from terrestrial laser scanning using semi-automated methods. Landslides 14, 1579–1592. https://doi.org/10.1007/s10346-017-0801-3
Wagner, W., Lague, D., Mohrig, D., Passalacqua, P., Shaw, J., Moffett, K., 2017. Elevation change and stability on a prograding delta. Geophysical Research Letters 44, 1786–1794. https://doi.org/10.1002/2016GL072070
Walton, G., Fotopoulos, G., Radovanovic, R., 2019. Extraction and comparison of spatial statistics for geometric parameters of sedimentary layers from static and mobile terrestrial laser scanning data. Environmental and Engineering Geoscience 25, 155–168. https://doi.org/10.2113/EEG-2068
Walton, G., Mills, G., Fotopoulos, G., Radovanovic, R., Stancliffe, R.P.W., 2016. An approach for automated lithological classification of point clouds. Geosphere 12, 1833–1841. https://doi.org/10.1130/GES01326.1
Wang, Z., Zhang, L., Fang, T., Mathiopoulos, P.T., Tong, X., Qu, H., Xiao, Z., Li, F., Chen, D., 2015. A multiscale and hierarchical feature extraction method for terrestrial laser scanning point cloud classification. IEEE Transactions on Geoscience and Remote Sensing 53, 2409–2425. https://doi.org/10.1109/TGRS.2014.2359951
Weidner, L., DePrekel, K., Oommen, T., Vitton, S., 2019a. Investigating large landslides along a river valley using combined physical, statistical, and hydrologic modeling. Engineering Geology 259, 1–12. https://doi.org/10.1016/j.enggeo.2019.105169
Weidner, L., Walton, G., Kromer, R., 2019b. Classification methods for point clouds in rock slope monitoring: A novel machine learning approach and comparative analysis. Engineering Geology 263, 105326. https://doi.org/10.1016/j.enggeo.2019.105326
Weinmann, M., Jutzi, B., Hinz, S., Mallet, C., 2015. Semantic point cloud interpretation based on optimal neighborhoods, relevant features and efficient classifiers. ISPRS Journal of Photogrammetry and Remote Sensing 105, 286–304. https://doi.org/10.1016/j.isprsjprs.2015.01.016
Westoby, M.J., Brasington, J., Glasser, N.F., Hambrey, M.J., Reynolds, J.M., 2012. 'Structure-from-Motion' photogrammetry: A low-cost, effective tool for geoscience applications. Geomorphology 179, 300–314. https://doi.org/10.1016/j.geomorph.2012.08.021
White, J.L., 2005. The Debeque Canyon Landslide at Interstate 70, Mesa County, West-Central Colorado. 2005 Rocky Mountain Section, Geological Society of America, Field Trip Guidebook, pp. 1–8.
Declaration of interests
☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
☐ The authors declare the following financial interests/personal relationships which may be considered as potential competing interests:
Highlights:
• Generalizing point cloud classifiers can be a time-saving tool
• Point spacing, occlusions, and domain variances all affect classifier accuracy
• Combining training data from multiple sites results in more consistent performance
• Adding training data from the target site with Active Learning minimizes manual intervention