ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
Contents lists available at ScienceDirect
ISPRS Journal of Photogrammetry and Remote Sensing journal homepage: www.elsevier.com/locate/isprsjprs
A greedy-based multiquadric method for LiDAR-derived ground data reduction Chuanfa Chen a,b,⇑, Changqing Yan c, Xuewei Cao b, Jinyun Guo a,b, Honglei Dai b a State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, 266590 Qingdao, China b College of Geomatics, Shandong University of Science and Technology, 266590 Qingdao, China c Department of Information Engineering, Shandong University of Science and Technology, 271019 Taian, China
a r t i c l e
i n f o
Article history: Received 17 October 2014 Received in revised form 20 January 2015 Accepted 21 January 2015
Keywords: LiDAR Generalization Accuracy DEM Interpolation Data reduction
a b s t r a c t A new greedy-based multiquadric method (MQ-G) has been developed to perform LiDAR-derived ground data reduction by selecting a certain amount of significant terrain points from the raw dataset to keep the accuracy of the constructed DEMs as high as possible, while maximally retaining terrain features. In the process of MQ-G, the significant terrain points were selected with an iterative process. First, the points with the maximum and minimum elevations were selected as the initial significant points. Next, a smoothing MQ was employed to perform an interpolation with the selected critical points. Then, the importance of all candidate points was assessed by interpolation error (i.e. the absolute difference between the interpolated and actual elevations). Lastly, the most significant point in the current iteration was selected and used for point selection in the next iteration. The process was repeated until the number of selected points reached a pre-set level or no point was found to have the interpolation error exceeding a user-specified accuracy tolerance. In order to avoid the huge computing cost, a new technique was presented to quickly solve the systems of MQ equations in the global interpolation process, and then the global MQ was replaced with the local one when a certain amount of critical points were selected. Four study sites with different morphologies (i.e. flat, undulating, hilly and mountainous terrains) were respectively employed to comparatively analyze the performances of MQ-G and the classical data selection methods including maximum z-tolerance (Max-Z) and the random method for reducing LiDAR-derived ground datasets. Results show that irrespective of the number of selected critical points and terrain characteristics, MQ-G is always more accurate than the other methods for DEM construction. Moreover, MQ-G has a better ability of preserving terrain feature lines, especially for the undulating and hilly terrains. Ó 2015 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
1. Introduction At present, the use of LiDAR-derived datasets for producing DEMs is becoming a standard practice in the spatial science community (Aguilar and Mills, 2008; Guo et al., 2010; Hodgson et al., 2005). Compared with photogrammetry, one of the competing advantages with LiDAR is the capability of canopy penetration with high density and high accuracy (Kraus and Pfeifer, 1998). However, the high density of LiDAR data leads to an enormous increase in data volume, which brings great challenges with respect to data ⇑ Corresponding author at: State Key Laboratory of Mining Disaster Prevention and Control Co-founded by Shandong Province and the Ministry of Science and Technology, Shandong University of Science and Technology, 266590 Qingdao, China. E-mail address:
[email protected] (C. Chen).
storage, processing, display and transmission (Liu and Zhang, 2011). Furthermore, processing all LiDAR datasets is uncertain to provide a significant improvement on DEM accuracy. Therefore, data reduction might be useful to produce a more manageable and operationally sized terrain dataset for DEM generation (Anderson et al., 2005, 2006; Guo et al., 2010). Data reduction starts with high density dataset and then repeatedly reduces the number of points for terrain modeling (Nordvik and Midtbø, 2007). Theoretically, an excellent reduction method should maximally reduce the number of sampling points while minimally losing the accuracy of produced DEMs. In practice, to what extent a set of sample points to be reduced generally depends on a user-specified accuracy tolerance (e.g. maximum interpolation error), which is determined by the elevation discrepancy between the raw dataset and the reduced one (Chen and
http://dx.doi.org/10.1016/j.isprsjprs.2015.01.012 0924-2716/Ó 2015 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS). Published by Elsevier B.V. All rights reserved.
111
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
Zhou, 2013; Lee, 1991). Generally, there are two forms of data reduction algorithm including conservative reduction and aggressive reduction (Chou et al., 1999). In the process of conservative reduction, only those redundant points are eliminated. This indicates that the remaining data can correctly recover the surface without losing any accuracy. However, the aggressive reduction eliminates not only the redundant points, but also some relatively important points, aiming to achieve a higher level of reduction ratio. As a result of the increasing availability of LiDAR data with high density for terrain modeling, the need of an efficient aggressive reduction method has improved. There exist various methods to generalize different structures of DEMs, such as grid, triangulated irregular network (TIN) and contour (Ai and Li, 2010; Bjørke and Nilsen, 2003; Heckbert and Garland, 1999). Based on the fundamental principle and suitability for LiDAR data reduction, the existing algorithms can be classified into three categories: the random method, the point subtractive method and the point additive method. The simplest method for reducing LiDAR data is the random reduction (Anderson et al., 2005; Guo et al., 2010; Liu and Zhang, 2011), where a portion of LiDAR points are randomly reduced based on a user-specified reduction ratio, and then the remaining points are used for terrain modeling, and vice versa. Although this method is simple and fast, it has low accuracy, because it does not regard the importance of each candidate point as well as areas of specific interest for an application (De Floriani et al., 2000). The point subtractive method (Lee, 1991) starts a triangulation of all LiDAR points and iteratively drops the points from the triangulation until a user-specified tolerance is reached. For each iteration, the error at each point is computed and the point with the lowest error is removed. The error at a point can be computed by temporarily deleting the point from the triangulation, doing a local Delaunay triangulation, and measuring the vertical distance from the point to its corresponding triangle (Schröder and Roßbach, 1994; Schroeder et al., 1992). Although this method performs very well, it is extremely time consuming (Lee, 1991). Thus, it is impractical to reduce LiDAR data with huge data volumes, especially under the common computation environment (Oryspayev et al., 2012). The point additive method (Chang, 2010; Heller, 1990), also termed maximum z-tolerance (Max-Z), first generates a candidate TIN using points selected from the LiDAR data to fully cover the perimeter of raw dataset, and then the point, which has the maximum variation to the surface defined by the TIN generated by the points selected in the previous iterations, is added to form a new TIN, until the threshold of desired error level is reached or the desired number of vertices is used. However, as this method never alters previous results, it always produces long and extremely thin sliver triangles that do not fit the surface well (Lee, 1991; Rippa, 1992). In principle, the data must be reduced in such a way that critical points are kept while less important points are removed. Hence, the key step is how to assess the importance of each data point. In almost all cases for the classical data reduction methods, the significance of each point is evaluated based on a simple linear interpolation with its neighbors. Namely, if the elevation of a point can be accurately estimated, then the point is defined as not ‘terrain significant’; consequently, it should not be involved in the final output. However, compared with the nonlinear interpolators such as kriging and radial basis function, linear interpolation is too simple to describe surfaces with complex morphology (Aguilar et al., 2005; Chen and Yue, 2010). In order to avoid the shortcoming of linear interpolation methods, an orthogonal least square (OLS)-based multiquadric (MQ) algorithm (MOLS) was proposed for DEM generalization by the retrieval of critical points from a grid-based DEM to construct a tri-
angulated irregular network surface (Chen and Li, 2013). Realworld examples proved the feasibility and efficiency of MOLS for DEM generalization. Nevertheless, MOLS is computationally expensive, as at each iteration, it always employs the Gram– Schmidt algorithm to compute orthogonal basis vectors of all sample points, aiming to select the grid point giving the maximal increment to the explained variance of the desired output. Furthermore, the classical MQ used in MOLS is an exact interpolation method disregarding noises inherent in grid points, which may lead to the over-fitting problem (Marquardt and Snee, 1975). Motivated by the aforementioned researches, in this paper, a greedy-based MQ method (MQ-G) is proposed to perform LiDARderived ground data reduction. MQ-G iteratively assesses the importance of all candidate points by calculating these interpolation errors. The point with the biggest interpolation error is considered as the most significant and will be grouped into the critical points. The process is repeated until the number of selected points reaches a pre-set level or no point is found to have the interpolation error exceeding a user-specified accuracy tolerance. Furthermore, in order to resist the effect of sample errors, a smoothing MQ is introduced to perform interpolations. In conclusion, this paper aims to (1) evaluate the feasibility of MQ-G to collect significant terrain points from LiDAR-derived ground dataset; (2) explore the accuracy of MQ-G for DEM construction with the selected terrain points; and (3) compare the performance of MQG with those of maximum z-tolerance (Max-Z) and the random method, both of which are widely used in LiDAR data reduction (Chen and Li, 2013; Chen and Zhou, 2013; Guo et al., 2010; Zhou and Chen, 2011). The remainder of this paper is organized as follows. The principle of MQ-G for LiDAR data reduction is given in Section 2. In Section 3, four study sites with different morphologies are respectively employed to assess the performances of MQ-G, MaxZ and the random method. Section 4 gives the discussions and conclusions. 2. Principle of MQ-G for LiDAR-derived ground data reduction 2.1. Smoothing MQ Multiquadric method (MQ) is an analytical method of representing irregular surfaces that involve the summation of equations of quadric surfaces located at significant topographic points (Hardy, 1971). At present, there have been many applications of this interpolation method in such diverse fields as photogrammetry, surveying and mapping, geodesy, geophysics, remote sensing, digital terrain modeling, hydrology, signal processing, and artificial intelligence (Franke et al., 1994; Hardy, 1990; Syed et al., 2003). As a rule, the value of variable z at a point(x, y) can be expressed as a sum of two components (Mitasova and Mitas, 1993):
zðx; yÞ ¼
k n X X bj pj ðx; yÞ þ bj qðr j Þ j¼1
ð1Þ
j¼1
where n is the number of sample points; rj is the distance from the interpolated point (x, y) to the jth sample point; q(r) is a radial basis function; {pj} is a set of k polynomials forming the basis of polynomials; bi and bj are the coefficients, obtained by the solution of a system of linear equations prescribing interpolation of sample points, and exactness for the polynomials:
8 k n X X > > > > bj pj ðxi ; yi Þ þ bj qðr j Þ ¼ zi ; > < j¼1
n > X > > > bj pi ðxj ; yj Þ ¼ 0; > : j¼1
i ¼ 1; 2; . . . ; n
j¼1
i ¼ 1; 2; . . . ; k
ð2Þ
112
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
For MQ, q(r) is expressed as,
pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi qðrÞ ¼ r2 þ c2
ð3Þ
where c is a shape parameter. Its optimal value depends on the number and the distribution of data points, on the data vector z and on the precision of the computation (Rippa, 1999). The matrix formulation of MQ based on Eq. (2) is expressed as,
Q
P
T
0
P
b b
¼
z
ð4Þ
0
0
1 0 1 p1 ðx1 ; y1 Þ pk ðx1 ; y1 Þ q11 q1n B .. C B C .. .. .. .. where Q ¼ @ . P¼@ A; . A; . . . qn1 . . . qnn p1 ðxn ; yn Þ pk ðxn ; yn Þ 0 1 0 1 b1 z1 B .. C B . C b ¼ @ . A; z ¼ @ .. A; zi is the value of the ith sample point; b is zn bn the coefficients of the polynomial. In order to perform interpolations with MQ (i.e. Eq. (1)), the coefficients b and b must be obtained beforehand by solving Eq. (4). Theoretically, the classical MQ is an exact interpolation method; that is, the interpolated surface must pass through each data point. However, in the process of data capture, sample points are always subject to measurement errors due to the inaccurate instrument calibration, improper operation and bad weather condition (Fisher, 1999; Fisher and Tate, 2006). Thus, over-fitting problem may occur in the exact MQ for interpolating noisy points. In statistics, ridge regression is an important tool to avoid the problems of overfitting and multicollinearity. In terms of ridge regression, the objective function of the smoothing MQ can be formulated as, T
minkðz Pb Q bÞ ðz Pb Q bÞ þ bT Q b b;b
Q þ I=k P P
T
0
b b
¼
z 0
2.2.1. Procedure of global MQ-G and speed improvement The detailed procedure of MQ-G with the global interpolation is as follows: (1) Choosing initial critical points. In principle, all LiDAR points can be considered as candidates for the critical points. Here, the data points with the maximum and the minimum elevations are first selected, as they are the most probable to be terrain feature points. For example, points located at hilltops and pits always have the extreme elevations. After this step, there are two groups of dataset, namely, critical terrain points and candidate points. (2) Performing an interpolation to all candidate points using the global smoothing MQ with the pre-selected critical points. Thus, we can obtain the interpolation errors (i.e. the difference between the sample and interpolated elevations) of all candidate points. (3) Selecting the critical point with the biggest interpolation error. Based on the aforementioned explanation, the point with the biggest interpolation error is always more important than others. (4) Repeating (2) and (3) until the number of selected points reaches a pre-defined level.
ð5Þ
where k is a smoothing parameter which controls the smoothness and goodness-of-fit of MQ. Letting the derivatives of Eq. (5) with respect to b and b equal to zero, we can obtain the following expression,
However, with the increase of the number of critical points, the computing speed of MQ-G would become lower, as its computation time is approximately proportional to the third power of the number of data points. Thus, when a certain amount of critical points were selected, the local MQ would replace the global one to perform interpolations. For the global MQ, the elevation of a candidate point was interpolated with all the pre-selected critical points; whereas for the local MQ, the elevation of a given point was interpolated only with its k nearest neighbors. So, no matter how many data points there are, MQ-G can be easily implemented.
ð6Þ
Comparing Eq. (4) with (6), it can be concluded that Eq. (4) is a special case of Eq. (6) with k being infinite. Eq. (6) indicates that the smoothing MQ is an approximate interpolation method. Thus, it is more reasonable to perform interpolations with the smoothing MQ than with the exact one when sample points are subject to measurement errors. There are two parameters to be pre-determined when the smoothing MQ is performed, i.e. the shape parameter c and the smoothing parameter k. In this paper, these optimal values are determined based on the k-fold cross-validation technique (e.g. k = 10) (Hofierka et al., 2002). 2.2. Greedy algorithm for selecting critical points One of the key steps for selecting critical points is to determine the importance of all candidate points. A point is considered to be significant when its elevation cannot be closely interpolated using elevations of other sample points. In the process of MQ-G, each point was interpolated using the smoothing MQ with the preselected critical points. The point with the biggest interpolation error was grouped into the critical points. The process was terminated when the number of selected points reached a pre-defined level or when no point was found to have the importance exceeding a user-specified tolerance.
Performing MQ-G is a time consuming task, as selecting one critical point requires solving Eq. (6) one time. Based on the classical matrix inversion algorithm (e.g. Gaussian elimination method), the computational cost for solving Eq. (6) is at least O(n3), where n is the number of pre-selected critical points. Thus, if N points were selected from the raw LiDAR data, the whole computational cost was O(23 + + (N 1)3). In order to quickly solve the systems of MQ equations, we here derived an efficient technique from matrix inversion lemma. The detailed information is as follows. In terms of matrix inversion lemma, Eq. (6) can be transformed into the form,
b b
¼
Q þ I=k P PT
1 z 0
0
¼
M MPUP T M
MPU
UP T M
U
! z 0
ð7Þ 1
where M ¼ ðQ þ I=kÞ1 ; U ¼ ðP T MPÞ . Eq. (7) indicates that the main cost for obtaining b and b is from computing M, especially when the matrix order is large. If a new critical point is selected, Q þ I=k (i.e. M1) in Eq. (7) should be updated. Its iterative formula is expressed as,
0 1 ðM k Þ 1 B M kþ1 ¼ @ nn qT ðnþ1Þðnþ1Þ 1n
1
q n1
1=k
C A
ð8Þ
11
or
0 M kþ1
ðnþ1Þðnþ1Þ
B ¼@
ðM k Þ
1
q
nn
n1
qT
1=k
1n
11
11 C A
ð9Þ
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
113
where n is the number of selected critical points after k iterations (if k = 0, then n = 2); q ¼ ðq01 q0n ÞT ; q0i represents the radial basis function of the distance between the newly selected point and the ith pre-selected critical point. In terms of matrix inversion lemma, M kþ1 can be expressed as,
M kþ1 ¼
M k þ M k qqT M k =a M k q=a qT M k =a
!
1=a
ð10Þ
where a is a scalar, expressed as a ¼ 1=k qT M k q. It can be found that if Mk was obtained beforehand, the cost for computing Mk+1 with Eq. (10) is only O(n2). Thus, the whole computing cost for selecting N points by solving the systems of MQ equations is O(23 + 22 + + (N 1)2). Thus, this skill is much faster than the original method for solving the systems of MQ equations. In this paper, 100 critical points were first selected from the raw LiDAR-derived ground dataset by the global MQ-G, and then the remaining critical points were picked by the local MQ-G.
(a) Interpolation to candidate points 2.2.2. Procedure of local MQ-G and speed improvement With the pre-selected critical points, MQ-G with a local interpolation was implemented with the following steps: (1) Forming DEM grids which cover the whole domain of the study site. The resolution of the DEM is h. (2) Performing an interpolation to all DEM grids using the local MQ with the k nearest neighboring critical points around each DEM grid. In this paper, we set k as 12. The kd-tree technique was used to search for the 12 nearest neighbors of a given grid, as this search can be done efficiently by using the tree properties to quickly eliminate large portions of the search space (Cormen et al., 2009). (3) Computing interpolation error of each candidate point (i.e. elevation difference between candidate point and the DEM grid where this candidate point located). (4) Grouping the point with the biggest interpolation error into the critical points. (5) Repeating (2) and (4) until the number of selected points reaches a pre-defined level or there are no points whose significance is bigger than a user-specified tolerance. From the above procedure we can find that selecting one critical point requires the computation of elevations of all DEM grids. However, if the k nearest neighbors of a given grid are not changed at the current iteration, then its interpolated value keeps unchanged at the next iteration. Thus, interpolation should be performed only on those grids whose k neighbors are changed. This skill can greatly improve the speed of MQ-G, as there are only a few grids whose neighbors are changed after an iteration. It is worth noting that when conducting interpolations with MQ, the values of all DEM grids rather than all candidate points should be interpolated. This is more reasonable for DEM construction, as the location difference (Fig. 1) between the sample points (e.g. A) and the corresponding central points (e.g. B) of DEM grids is avoided (Yue et al., 2007). 3. Experimental tests 3.1. Study sites and LiDAR data acquisition The study area is located in the region of the Heihe River basin (HRB) in the arid region of northwest China. HRB is sited within 37.7°–42.7°N, 97.1°–102.0°E, covering an area of approximately 1,432,000 km2 (Li et al., 2013). This basin is characterized by its
(b) Interpolation to DEM grids Fig. 1. Interpolation to (a) candidate points and (b) DEM grids with pre-selected critical points. ‘circle’ represents one of the candidates for critical points; ‘dot’ represents the central of DEM grid; and ‘triangle’ represents the pre-selected critical points.
distinct cold and arid landscapes: glaciers, frozen soil, alpine meadow, forest, irrigated crops, riparian ecosystem, and desert, which are distributed upstream to downstream. In order to cover varying morphologies, four study sites with different topographies (i.e. flat, undulating, hilly and mountainous terrains) were selected. The four sites were respectively dominated by flat terrain with some U-shaped gullies (Fig. 5), by undulating land with a prominent ridge line cross the area from the lower right to the upper middle (Fig. 6), by steep hilly slopes with a deep-cutting stream channel (Fig. 7) and by high mountains with a small catchment (Fig. 8). All of the study sites have an area of about 0.363 km2 (i.e. 602 m 602 m). Some terrain characteristics of the four sites were shown in Table 1, which includes relief features that are closely related to terrain variability, such as standard deviation of elevations. It is worth noting that the terrain roughness in Table 1 was computed with Eq. (13) in Section 3.2 of our paper. The raw data of all the study sites were captured using Leica ALS70 with the laser wavelength of 1064 nm in the year of 2012. During the data capture, for the flat site, the relative flying height
114
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
the non-ground LiDAR points. A human operator analyzed the candidate set of ‘‘ground’’ points to further improve the accuracy of classification. This is generally achieved by analyzing the relationship of each given point with its neighbors according to the 3D map of point cloud, or by overlaying the point cloud onto the imagery of the study site. Those points which are much different from its neighbors (termed as outliers) or located at unground objects (termed as non-ground points) should be deleted. The number of resulting ground points and the corresponding point post-spacing of the four sites were shown in Table 1. 3.2. Accuracy measures
Fig. 2. Flow chart of the scheme used to obtain the accuracy measures of the three data reduction methods for DEM construction.
was about 1500 m with the mean data density of 4 point/m2, and for the other three sites, the absolute flying height was about 4800 m with the mean data density of 1 point/m2. TerraSolid’s TerraScan software was employed to classify the ground points from
(a) Flat
(b) Undulating
In addition to MQ-G, the maximum z-tolerance (Max-Z) and the random method were also employed to select critical points from LiDAR-derived ground data. Theoretically, both MQ-G and Max-Z have two stopping rules, namely, the maximum interpolation error and the number of points to be selected. In our study, we employed the latter stopping rule to give a comparative analysis between the aforementioned three methods. We set the number of critical points as 100, 200, 400, 600, 800 and 1000, respectively. The flow chart of the scheme used to obtain the accuracy measures of the three methods was shown in Fig. 2. Here, accuracy measures including root mean square error (RMSE) and error range (i.e. difference between maximum error (MaxE) and minimum error (MinE)) were adopted. The RMSE is expressed as,
RMSE ¼
qX ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi n e2 =n i¼1 i
ð11Þ
(c) Hilly
(d) Mountainous
Fig. 3. Comparison of number of selected points versus RMSE between the three methods in the sites with (a) flat, (b) undulating, (c) hilly and (d) mountainous terrains.
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
(a) Flat
(c) Hilly
(b) Undulating
(d) Mountainous
115
Fig. 4. Comparison of number of selected points versus error range (i.e. MaxE–MinE) between the three methods in the sites with (a) flat, (b) undulating, (c) hilly and (d) mountainous terrains.
where e represents the difference of DEMs constructed with all ground points (original DEM) and the selected critical points (reduced DEM) by using a data selection method (e.g. MQ-G) as indicated in Fig. 2; n is the total number of DEM grids. In our study, the DEM resolution is 2 m, which is approximate to the mean postspacing of LiDAR points in the hilly terrain site. In order to assess the effect of all data reduction methods on preserving the morphological nature of the terrain surface, two DEM derivatives including mean slope ( S) and terrain roughness (K) were also used for comparison. They are respectively expressed as (Zhou and Chen, 2011),
S ¼
n X Si A i
,
i¼1
¼ K ¼ A=A
n X Ai
ð12Þ
i¼1 n X Ai sec Si i¼1
,
n X Ai
ð13Þ
i¼1
where S is the slope, measured in degrees; A is the projected area; A is the surface area, i is the ith grid cell; and n is the total number of the grid cells. In principle, the bigger the mean slope and surface roughness, the better the data reduction method for preserving morphological nature on the reduced DEMs. Thus, good reduction methods should keep the two measures as high as possible. In addition, we also quantitatively assessed the effectiveness of the three methods for keeping the important terrain features in terms of streamline matching rate (SMR) (Zhou and Chen, 2011). SMR is expressed as,
SMR ¼ ^L 100=L
ð14Þ
where ^L is the length of feature lines located into the corresponding feature line buffers; L is the total length of the original feature lines. Practically, the original feature lines such as ridges and valleys were first retrieved from the original DEM to generate ‘buffers’ with a threshold. Then the buffer zones were overlaid with the feature lines retrieved from the reduced DEMs. A high SMR means high performance accuracy. In this paper, the buffer threshold was 4 m, and hydrology module in ArcGIS 10.1 was used to extract feature lines. 3.3. Results Comparison of number of selected critical points versus accuracy measures of the three methods were shown in Figs. 3 and 4. It can be found that with the increase of number of selected points, the results of all methods become more and more accurate in terms of RMSE, except for Max-Z in the flat terrain (Fig. 3). However, in terms of error range, this trend is not obvious for Max-Z and the random method, which is proved by the serious oscillations of these trend lines (Fig. 4). This indicates that MQ-G is more stable than Max-Z and the random method for selecting significant terrain points. Furthermore, irrespective of accuracy measures, MQ-G always produces better results than the other methods, especially for the undulating and hilly terrains. The spatial distribution maps of 100 significant points selected by the three methods in the four study sites were shown in Figs. 5–8. It can be found that for MQ-G, more points are distributed in complex areas than in plain areas. Moreover, terrain features, such as gullies and valleys in the flat and hilly terrains (Figs. 5a and 7a), and ridges in the other sites (Figs. 6a and 8a) are accurately captured. However, the points selected by Max-Z
116
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
(a) MQ-G
(b) Max-Z
(c) Random method Fig. 5. 100 significant points selected by (a) MQ-G, (b) Max-Z, and (c) the random method in the flat terrain. ‘red dot’ represents the selected critical points. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
are prone to locate at the boundary of the study sites, especially for the mountainous terrain (Fig. 8b). This is due to the fact that TIN with a linear interpolation theoretically has a lower interpolation accuracy at the side of triangle than within the triangle (Li, 1993). Moreover, points selected by Max-Z suffer from uneven distributions, especially for the hilly and mountainous terrains (Figs. 7b and 8b). This is expected as TIN with a linear interpolation is a local method and has a poor interpolation accuracy in the complex topography. In comparison, points selected by the random method are randomly distributed without considering any terrain features (Figs. 5c, 6c, 7c and 8c). However, they are slightly more evenly distributed than those of Max-Z, which leads to its higher accuracy than Max-Z for DEM construction in almost all study sites (Figs. 3 and 4). In a word, MQ-G is more accurate than the classical methods for capturing feature points. Comparison of DEM derivatives versus number of selected points between the three methods was shown in Table 2. We can
find that as the number of selected points increases, both the mean slope and terrain roughness become bigger regardless of the terrains of the study sites. Irrespective of the number of selected points, MQ-G is always more accurate than the other methods in almost all study sites, except for Max-Z in the mountainous terrain. In comparison, the random method consistently produces the worst results, which proves its poor ability of capturing terrain features. Fig. 9 provided the SMRs of the three methods in the four study sites. It can be seen that with the increase of number of selected points, the SMR of MQ-G gradually becomes bigger in almost all study sites, except for the undulating terrain. In contrast, the trend lines of the other methods have obvious oscillations, especially for the random method. This further indicates the poor performance of the random method for capturing terrain features. Comparatively, MQ-G is more accurate than Max-Z and the random method for all cases in terms of SMR. In a word, MQ-G has a good ability of
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
117
(b) Max-Z
(a) MQ-G
(c) Random method Fig. 6. 100 significant points selected by (a) MQ-G, (b) Max-Z, and (c) the random method in the undulating terrain. ‘red dot’ represents the selected critical points. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
faithfully selecting significant points, while maximally retaining terrain features. 4. Discussions and conclusions 4.1. Discussions Although MQ-G obtains good performances for picking significant terrain points from the LiDAR-derived ground points, it is helpless to determine the number of data points required for a specified DEM accuracy. In practice, this is generally achieved by obtaining the relationship between DEM accuracy and all accuracy influencing factors by means of some empirical and theoretical models. For &&&&example, Guo et al. (2010) and
Aguilar et al. (2005) gave empirical investigations on the effect of topographical variability and sample density on the accuracy of DEMs with respect to different interpolation methods. Kraus et al. (2006) presented an empirical model based on moving least squares to deduce the local and relative accuracy measures for every interpolated grid point in DTMs, such as point density, terrain curvature and the number and alignment of the neighboring original points. Aguilar et al. (2010) developed an empirical-theoretical model for establishing relationships between LiDAR sampling density, terrain slope, IDW interpolation method and DEM accuracy. Theoretically, under the guide of the aforementioned models, MQ-G can faithfully select those data points to meet the requirement of DEM construction with a specified accuracy.
118
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
(a) MQ-G
(b) Max-Z
(c) Random method Fig. 7. 100 significant points selected by (a) MQ-G, (b) Max-Z, and (c) the random method in the hilly terrain. ‘red dot’ represents the selected critical points. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
It should be noted that our method is originally developed to reduce LiDAR-derived ground data in the context of DEM construction, and its good performance is partly owed to the high interpolation accuracy of the smoothing MQ, especially for the study sites with complex terrains. However, it poses great challenges to extend this method to deal with high dense Digital Surface Models (DSMs) of the raw LiDAR data because of the inherent complexities of the DSMs including the frequent discontinuities, the variety of textures and shapes, dense forests and the rate of surface change. Furthermore, the transferability of the MQ method for modeling the highly discontinuous surface requires further investigation. 4.2. Conclusion In this article, a greedy-based multiquadric method (MQ-G) has been developed to reduce LiDAR data by selecting critical points
with an iterative process. Namely, the point with the biggest interpolation error was grouped into the critical points after each iteration. In order to avoid the huge computing cost problem, an efficient technique was developed to solve the systems of the MQ equations in the global interpolation context, and the local MQ interpolation replaced the global one when a certain amount of critical points were selected. Four study sites with different morphologies were respectively employed to comparatively analyze the performances of MQ-G and two classical data selection methods including Max-Z and the random method. Results demonstrate that irrespective of the number of selected critical points and terrain characteristics, MQ-G is always more accurate than the other methods. Furthermore, MQ-G has a better ability in preserving terrain features in terms of SMR. Theoretically, geomorphological feature lines and points contain more important information that others for terrain analysis. Thus, they should be preserved during DEM generalization and
119
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
(a) MQ-G
(b) Max-Z
(c) random method Fig. 8. 100 significant points selected by (a) MQ-G, (b) Max-Z, and (c) the random method in the mountainous terrain. ‘red dot’ represents the selected critical points. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)
Table 1 Terrain characteristics of the four study sites. Terrain statistics
Flat
Undulating
Hilly
Mountainous
Minimum elevation (m) Maximum elevation (m) Mean elevation (m) Standard deviation of elevations (m) Mean slope (°) Terrain roughness Number of ground points Point post-spacing (m)
1404.2 1411.2 1408.0 1.15 1.86 1.001 587,809 0.8
3072.4 3266.1 3164.8 46.66 23.36 1.101 241,422 1.3
910.1 1210.7 1018.3 69.99 38.28 1.337 100,056 1.9
3479.6 3997.5 3711.3 126.35 40.96 1.358 188,629 1.4
data reduction. Liu and Zhang (2011) indicated that terrain feature lines played an important role in improving the accuracy of DEM construction in data reduction. Zhou and Chen (2011) demonstrated that when taking the feature lines and points into account, we can maintain RMSE at an acceptable level, while reducing the
elevation data points by over 99% in the context of DEM generalization. Thus, our further focus was on the integration of feature lines into the process of MQ-G to improve the data reduction ratio, while maximally maintaining the terrain features in the reduced DEMs.
120
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
Table 2 Comparison of DEM derivatives versus number of selected points between MQ-G, Max-Z and the random method in the four study sites. Number of selected points Flat
MQ-G Max-Z Random
Undulating
MQ-G Max-Z Random
Hilly
MQ-G Max-Z Random
Mountainous
MQ-G Max-Z Random
100
200
400
600
800
1000
S (°) K S (°) K S (°) K
1.17 1.0003 0.73 1.0001 0.56 1.0001
1.41 1.0005 0.93 1.0002 0.66 1.0001
1.56 1.0006 1.09 1.0003 0.81 1.0002
1.79 1.0008 1.16 1.0003 0.88 1.0002
1.79 1.0008 1.18 1.0004 0.97 1.0002
1.80 1.0008 1.06 1.0003 1.05 1.0003
S (°) K S (°) K S (°) K
22.02 1.0896 21.83 1.0852 20.50 1.0747
22.60 1.0947 22.08 1.0885 20.69 1.076
22.94 1.0979 22.39 1.0919 21.83 1.0857
23.06 1.0987 22.49 1.0932 22.05 1.0881
23.08 1.0993 22.58 1.0941 21.90 1.0871
23.10 1.0993 22.58 1.0941 22.22 1.0902
S (°) K S (°) K S (°) K
36.29 1.283 35.23 1.266 33.51 1.228
37.08 1.298 35.48 1.270 35.09 1.257
37.28 1.304 36.46 1.283 35.58 1.265
37.53 1.308 37.04 1.294 36.45 1.281
37.67 1.311 37.32 1.301 36.69 1.286
37.74 1.312 37.28 1.301 36.79 1.287
S (°) K S (°) K S (°) K
39.10 1.323 39.60 1.331 38.05 1.286
39.81 1.338 39.80 1.342 38.15 1.289
40.17 1.347 40.30 1.354 39.07 1.308
40.25 1.350 40.51 1.360 39.25 1.314
40.30 1.351 40.72 1.365 39.43 1.318
40.41 1.353 40.81 1.367 39.82 1.327
(a) Flat
(c) Hilly
(b) Undulating
(d) Mountainous
Fig. 9. SMRs of MQ-G, Max-Z and the random methods in the study sites with (a) flat, (b) undulating, (c) hilly and (d) mountainous terrains.
Acknowledgments This work is supported by National Natural Science Foundation of China (Grant Nos. 41101433, 41371367), by Young and Middle-
Aged Scientists Research Awards Fund of Shandong Province (Grant No. BS2012HZ010), by Qingdao Science and Technology Program of Basic Research Project (Grant No. 13-1-4-239-jch), by the Key Laboratory of Marine Surveying and Mapping in
C. Chen et al. / ISPRS Journal of Photogrammetry and Remote Sensing 102 (2015) 110–121
Universities of Shandong (Shandong University of Science and Technology) (Grant No. 2013B03), by SDUST Research Fund, by Joint Innovative Center for Safe and Effective Mining Technology and Equipment of Coal Resources and by the Special Project Fund of Taishan Scholars of Shandong Province. The dataset used in this paper was provided by Cold and Arid Regions Science Data Center at Lanzhou (http://westdc.westgis.ac.cn).
References Aguilar, F.J., Mills, J.P., 2008. Accuracy assessment of lidar-derived digital elevation models. Photogram. Rec. 23 (122), 148–169. Aguilar, F.J., Mills, J.P., Delgado, J., Aguilar, M.A., Negreiros, J.G., Pérez, J.L., 2010. Modelling vertical error in LiDAR-derived digital elevation models. ISPRS J. Photogramm. Rem. Sens. 65 (1), 103–110. Ai, T.H., Li, J.Z., 2010. A DEM generalization by minor valley branch detection and grid filling. ISPRS J. Photogramm. Rem. Sens. 65 (2), 198–207. Anderson, E.S., Thompson, J.A., Austin, R.E., 2005. LIDAR density and linear interpolator effects on elevation estimates. Int. J. Rem. Sens. 26 (18), 3889– 3900. Anderson, E.S., Thompson, J.A., Crouse, D.A., Austin, R.E., 2006. Horizontal resolution and data density effects on remotely sensed LIDAR-based DEM. Geoderma 132 (3–4), 406–415. Bjørke, J.T., Nilsen, S., 2003. Wavelets applied to simplification of digital terrain models. Int. J. Geogr. Inform. Sci. 17 (7), 601–621. Chang, K.-T., 2010. Introduction to Geographic Information Systems. McGraw-Hill Science, New York, 425pp. Chen, C.F., Li, Y.Y., 2013. An orthogonal least-square-based method for DEM generalization. Int. J. Geogr. Inform. Sci. 27 (1), 154–167. Chen, C.F., Yue, T.X., 2010. A method of DEM construction and related error analysis. Comput. Geosci. 36 (6), 717–725. Chen, Y.M., Zhou, Q.M., 2013. A scale-adaptive DEM for multi-scale terrain analysis. Int. J. Geogr. Inform. Sci. 27 (7), 1329–1348. Chou, Y.H., Liu, P.S., Dezzani, R.J., 1999. Terrain complexity and reduction of topographic data. J. Geogr. Syst. 1, 179–198. Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C., 2009. Introduction to Algorithms. MIT Press, Massachusetts, 1312pp. De Floriani, L., Magillo, P., Puppo, E., 2000. VARIANT: a system for terrain modeling at variable resolution. GeoInformatica 4 (3), 287–315. Fisher, P.F., 1999. Models of uncertainty in spatial data. Geogr. Inform. Syst. 1, 191– 205. Fisher, P.F., Tate, N.J., 2006. Causes and consequences of error in digital elevation models. Prog. Phys. Geogr. 30 (4), 467–489. Franke, R., Hagen, H., Nielson, G., 1994. Least squares surface approximation to scattered data using multiquadratic functions. Adv. Comput. Math. 2 (1), 81–99. Guo, Q., Li, W., Yu, H., Alvarez, O., 2010. Effects of topographic variability and lidar sampling density on several DEM interpolation methods. Photogramm. Eng. Rem. Sens. 76 (6), 701–712. Hardy, R.L., 1971. Multiquadric equations of topography and other irregular surfaces. J. Geophys. Res. 76 (8), 1905–1915. Hardy, R., 1990. Theory and applications of the multiquadric-biharmonic method. 20 years of discovery 1968–1988. Comput. Math. Appl. 19 (8), 163–208. Heckbert, P.S., Garland, M., 1999. Optimal triangulation and quadric-based surface simplification. Comput. Geom. 14 (1–3), 49–65.
121
Heller, M., 1990. Triangulation algorithms for adaptive terrain modelling. In: Proceedings 4th International Symposium on Spatial Data Handling, vol. 1, Zürich, pp. 163–174. Hodgson, M.E., Jensen, J., Raber, G., Tullis, J., Davis, B.A., Thompson, G., Schuckman, K., 2005. An evaluation of lidar-derived elevation and terrain slope in leaf-off conditions. Photogramm. Eng. Rem. Sens. 71 (7), 817–823. Hofierka, J., Parajka, J., Mitasova, H., Mitas, L., 2002. Multivariate interpolation of precipitation using regularized spline with tension. Trans. GIS 6 (2), 135– 150. Kraus, K., Pfeifer, N., 1998. Determination of terrain models in wooded areas with airborne laser scanner data. ISPRS J. Photogramm. Rem. Sens. 53 (4), 193– 203. Kraus, K., Karel, W., Briese, C., Mandlburger, G., 2006. Local accuracy measures for digital terrain models. Photogramm. Rec. 21 (116), 342–354. Lee, J., 1991. Comparison of existing methods for building triangular irregular network, models of terrain from grid digital elevation models. Int. J. Geogr. Inform. Sci. 5 (3), 267–285. Li, Z.L., 1993. Mathematical models of the accuracy of digital terrain model surfaces linearly constructed from square gridded data. Photogramm. Rec. 14 (82), 661– 674. Li, X., Cheng, G., Liu, S., Xiao, Q., Ma, M., Jin, R., Che, T., Liu, Q., Wang, W., Qi, Y., Wen, J., Li, H., Zhu, G., Guo, J., Ran, Y., Wang, S., Zhu, Z., Zhou, J., Hu, X., Xu, Z., 2013. Heihe watershed allied telemetry experimental research (HiWATER): scientific objectives and experimental design. Bull. Am. Meteorol. Soc. 94 (8), 1145– 1160. Liu, X., Zhang, Z., 2011. Effects of LiDAR data reduction and breaklines on the accuracy of digital elevation model. Surv. Rev. 43 (323), 614–628. Marquardt, D.W., Snee, R.D., 1975. Ridge regression in practice. Am. Stat. 29 (1), 3– 20. Mitasova, H., Mitas, L., 1993. Interpolation by regularized spline with tension: I. Theory and implementation. Math. Geol. 25 (6), 641–655. Nordvik, T., Midtbø, T., 2007. Application of data reduction methods in dynamic TIN models to topographic LIDAR data. In: Proceedings of the 11th Scandinavian Research Conference on Geographical Information Science, 5th–7th September, pp. 111–128. Aguilar, F.J., Aguera, F., Aguilar, M.A., Carvajal, F., 2005. Effects of terrain morphology, sampling density, and interpolation methods on grid DEM accuracy. Photogramm. Eng. Rem. Sens. 71 (7), 805–816. Oryspayev, D., Sugumaran, R., DeGroote, J., Gray, P., 2012. LiDAR data reduction using vertex decimation and processing with GPGPU and multicore CPU technology. Comput. Geosci. 43, 118–125. Rippa, S., 1992. Adaptive approximation by piecewise linear polynomials on triangulations of subsets of scattered data. SIAM J Sci. Stat. Comput. 13, 1123– 1141. Rippa, S., 1999. An algorithm for selecting a good value for the parameter c in radial basis function interpolation. Adv. Comput. Math. 11 (2), 193–210. Schröder, F., Roßbach, P., 1994. Managing the complexity of digital terrain models. Comput. Graph. 18 (6), 775–783. Schroeder, W.J., Zarge, J.A., Lorensen, W.E., 1992. Decimation of triangle meshes. Comput. Graph. 26 (2), 65–70. Syed, K.H., Goodrich, D.C., Myers, D.E., Sorooshian, S., 2003. Spatial characteristics of thunderstorm rainfall fields and their relation to runoff. J. Hydrol. 271 (1–4), 1– 21. Yue, T.X., Du, Z.P., Song, D.J., 2007. A new method of surface modelling and its application to DEM construction. Geomorphology 91 (1–2), 161–172. Zhou, Q.M., Chen, Y.M., 2011. Generalization of DEM for terrain analysis using a compound method. ISPRS J. Photogramm. Rem. Sens. 66 (1), 38–45.