Development of a pit filling algorithm for LiDAR canopy height models

Development of a pit filling algorithm for LiDAR canopy height models

ARTICLE IN PRESS Computers & Geosciences 35 (2009) 1940–1949 Contents lists available at ScienceDirect Computers & Geosciences journal homepage: www...

1MB Sizes 10 Downloads 41 Views

ARTICLE IN PRESS Computers & Geosciences 35 (2009) 1940–1949

Contents lists available at ScienceDirect

Computers & Geosciences journal homepage: www.elsevier.com/locate/cageo

Development of a pit filling algorithm for LiDAR canopy height models Joshua R. Ben-Arie a, Geoffrey J. Hay a,, Ryan P. Powers a, Guillermo Castilla a, Benoıˆt St-Onge b a b

Department of Geography, University of Calgary, 2500 University Dr. NW, Calgary, AB, Canada T2N 1N4 ` Montre ´al, Case postale 8888, succursale Centre-ville, Montre´al (Que´bec), Canada H3C 3P8 De´partement de Ge´ographie, Universite´ du Que´bec a

a r t i c l e in f o

a b s t r a c t

Article history: Received 7 June 2008 Received in revised form 25 January 2009 Accepted 20 February 2009

LiDAR canopy height models (CHMs) can exhibit unnatural looking holes or pits, i.e., pixels with a much lower digital number than their immediate neighbors. These artifacts may be caused by a combination of factors, from data acquisition to post-processing, that not only result in a noisy appearance to the CHM but may also limit semi-automated tree-crown delineation and lead to errors in biomass estimates. We present a highly effective semi-automated pit filling algorithm that interactively detects data pits based on a simple user-defined threshold, and then fills them with a value derived from their neighborhood. We briefly describe this algorithm and its graphical user interface, and show its result in a LiDAR CHM populated with data pits. This method can be rapidly applied to any CHM with minimal user interaction. Visualization confirms that our method effectively and quickly removes data pits. Crown Copyright & 2009 Published by Elsevier Ltd. All rights reserved.

Keywords: LiDAR Data pits Noise removal Canopy height model Interactive data language (IDL)

1. Introduction Light detection and ranging (LiDAR) is an active remote sensing technology that emits pulses of near infra-red light and records the backscatter, resulting in a three-dimensional (3D) point cloud. Once collected, this raw point cloud is typically filtered and classified into first and last returns. When captured over a forest environment, the first returns correspond to the energy echoed from the uppermost vegetation layer of a canopy. Once classified, these points are interpolated to a digital surface model (DSM) representing surface elevation above sea level. The last returns correspond to the last detectable signal when a pulse is intercepted by an opaque object, normally the ground (St-Onge et al., 2003). These classified points are interpolated into a digital terrain model (DTM), which represents bare terrain elevation above sea level. When the DTM is subtracted from the DSM, the result is a canopy height model (CHM), which represents absolute canopy height above the terrain surface.1 As a result, LiDAR technology allows for large-area acquisition of unprecedented 3D structural forest information. However, such data are not without their challenges. Data pits are typically visible in raster CHMs as (apparently) randomly distributed dark holes that are digitally represented by exceptionally lower digital height values than their neighbors. It is  Corresponding author.

E-mail address: [email protected] (G.J. Hay). The common term digital elevation model (DEM), as used later in this paper, represents any type of digital model composed of elevation or height values, thus DTMs, DSMs and CHMs are specific types of DEMs. 1

believed that these artifacts are caused by a combination of factors, from data acquisition to post-processing, though no specific cause has been defined in the literature. It is important to distinguish between data pits and canopy gaps. Gaps are natural openings in a forest canopy that range in size (Spies and Franklin, 1989). However, canopy gaps are not simply bare earth; rather they are often populated with shrubs and saplings at different stages of growth. In fact, canopy gaps are expected when studying forests and are integral to a healthy and dynamic ecosystem (Whitmore, 1989). It is easy to visually discriminate small canopy gaps from data pits. Canopy gaps are asymmetrical, are generally composed of many pixels (even small canopy gaps), and appear ‘natural’ in a forest landscape. Pits exhibit none of these characteristics. In Fig. 1, the large dark mass (image lower right) is a natural canopy gap, while the scattered small dark rectangles (more than one pixel) and squares (single pixels) represent data pits. In addition to lending a poor visual appearance to an image (Fig. 1), analyzing a pit-filled dataset may produce inaccurate biophysical and ecological measurements (elaborated below). The challenge is that not all pits are the same value, or differ the same amount in value relative to their neighbors, thus using a simple threshold to define pits does not work. If a group of contiguous pixels corresponding to a tree crown contains a pit consisting of a single pixel, that pixel value can still be higher than many non-pit pixels in other areas of the CHM. If one were to simply apply smoothing filters – common in remote sensing image processing software – to a CHM with data pits, pits could certainly be removed, or at least their ‘influence’ in the image visually reduced. However, based on current methods,

0098-3004/$ - see front matter Crown Copyright & 2009 Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.cageo.2009.02.003

ARTICLE IN PRESS J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

Canopy height (m)

0

15 m

51

Fig. 1. Subset of Campbell River canopy height model (CHM). Comparing a canopy gap approximately 10 m wide (lower right) to data pits (small dark rectangles and squares).

all pixels in the dataset would also be altered, not just pits. Not only would this result in a visual smoothing of the image – dependent on the size and type of smoothing filter selected – but it is also an inefficient use of LiDAR technology, as smoothing would also alter the vertical-accuracy (i.e., height) of the image. As Hyyppa¨ et al. (2000) reports, forest LiDAR data has an average vertical standard error of approximately 22 cm (varying with the technology used). Thus, when considering LiDAR’s numerous and growing applications in forestry and landscape ecology (Dubayah and Drake, 2000; Hyde et al., 2005; Lefsky et al., 2002; Lim et al., 2003; Reutebuch et al., 2005; Hilker et al., 2008; St-Onge et al., 2008), it is imperative to obtain accurate CHMs, especially for individual tree-crown delineation (Leckie et al., 2003) and tree height estimation (Popescu and Wynne, 2004). Additionally, the presence of pits will result in average canopy height underestimation, which could lead to measurement errors of aboveground forest biomass and carbon sequestration. The purpose of this study is to report on a pit filling method for LiDAR CHM’s and to evaluate its results by visually comparing them to the effect common smoothing filters have on the same datasets. As Wood and Fisher (1993) discuss, visual assessment of CHMs is important. Indeed, it is an effective and often underestimated method in assessing data quality. In the following sections, we provide a more thorough review of data pits within the literature and discuss their potential causes; describe the study site and data; provide details on the methods developed; discuss the results; and provide a conclusion to our work.

2. Background In this section, we briefly review pertinent literature related to LiDAR pits and discuss their possible causes. 2.1. Insight from the literature There is a distinct paucity in the literature regarding data pits in LiDAR digital elevation models (DEMs). Leckie et al. (2003)

1941

states that these artifacts arise from ‘‘ground hits within a tree crown’’. Their pre-processing attempt to remove pits involved overlaying a 25 cm grid on a 3D point cloud and building a DSM from the highest hit in each cell. However, some pits remained, causing ‘‘artifacts in the surface model’’. Hyyppa¨ et al. (2000) describe some pixels as ‘‘no data’’, and filled them ‘‘by using interpolation and a knowledge of near-by pixels.’’ The interpolation method is not given, neither is a definition of these problematic pixels. Kraus and Pfeifer (1998), in their influential paper on DTM production, encountered ‘‘big negative blunders’’, and it is relatively clear that data pits are their equivalent. These ‘‘blunders’’ are not described in any detail, but they did ‘tweak’ their technique to be more robust in the presence of this problem. Zhang and Chen (2003), in attempting to remove non-ground LiDAR data points for DTM production, mention that some points have ‘‘large negative elevation values drastically lower than those of their neighbors’’. They also call this phenomenon ‘‘negative blunders’’. To fix this problem, a morphological filter was suggested, but not implemented or tested. They also mentioned that the source of negative blunders is not definitively known. MacMillan et al. (2003) attempted to fill pits in a LiDAR DTM by applying mean filters of varying sizes. They considered this a sub-optimal approach and called for the development of a more flexible DEM editor. There has been other research conducted on closing pits or minimizing errors in LiDAR DTMs (Briese et al., 2002; Younan et al., 2002). However, these papers do not elaborate on the relative number of pits in the study area or the amount that pits drop in value relative to their neighbors, nor do they explicitly define data pits or their causes. Additionally, there is scarce mention in the literature specifically on pits occurring in LiDAR DEMs, let alone the appropriate filling of these pits. In fact, not a single paper was found devoted to the subject of data pits in LiDAR DEMs. The issue of data pits in LiDAR CHMs is far from resolved.

2.2. Causes of data pits While not specifically defined, it is probable that the cause(s) of data pits are intricately linked to LiDAR technology and the preprocessing of raw point-clouds into meaningful raster DEMs. Thus, the raison d’eˆtre of data pits is complex and not fully resolved. The process from laser scanning to raster model production is a multi-step procedure uniting three different technologies: (i) laser ranging system, (ii) inertial navigation system (INS) and (iii) differential global positioning system (DGPS). Because of the possible compounding effect of multiple factors, it is difficult to measure the effect individual factors may have on pit formation. Range error, platform attitude variation, position accuracy and time misregistration can all contribute to errors in a LiDAR dataset. Attitude errors arise from increasing the flying height and scan angle (Baltsavias, 1999). Time misregistration between the laser instrument, INS and DGPS account for inaccurate 3D positioning (Baltsavias, 1999; Latypov, 2005). Position accuracy, mainly concerning the quality of DGPS processing, accounts for most of the intrinsic technological error associated with LiDAR use (Baltsavias, 1999; St-Onge et al., 2003). However, it is not clear whether system errors produce data pits. Leckie et al. (2003) postulate that data pits may form by combining different flight line datasets. LiDAR spatial resolution may be affected or unevenly distributed by variation in aircraft attitude, scan pattern, structure of the canopy and terrain and deflection of lost returns. Some areas can yield zero to more than ten pulse returns per square meter, resulting in a failure to record many treetops (Gaveau and Hill, 2003, St-Onge et al, 2003).

ARTICLE IN PRESS 1942

J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

Fig. 2. Study site: 3D CHM view of Campbell River study site, Vancouver Island, Canada. Dark areas represent ground level (including cut areas, roads and a river) and bright areas represent vegetation.

Table 1 LiDAR parameters.

(1) Laplacian filter applied to the original CHM (with pits)

Parameter

Performance

Sensor Laser scan frequency Laser impulse frequency Laser power Maximum scan angle Type of scanning mirror Laser beam divergence Measurement density Datum Projection Flight altitude above ground Flight speed

Mark II 25 Hz 40,000 Hz o4 W o201 Oscillating o0.5 milliradians 0.5–0.8 hits per sq. metre NAD 83 UTM 10 N 900 m 25–30 ms 1

-1 -1 -1 -1+8 -1

Laplacian Operator

-1 -1 -1 (2) Threshold Laplacian of CHM

(3) Generate binary mask of Laplacian (4) Median filter applied to

Related to this is that the minimum detectable object depends on reflectivity, not size (Baltsavias, 1999). Regarding the methods used in creating DEMs from LiDAR point data, there is no standard procedure, and companies typically use proprietary algorithms for this purpose. Not only are vegetation returns more complex to interpolate than ground returns as St-Onge et al. (2003) discuss, but the precise preprocessing method is often glossed over, as seen in Mei and Durrieu (2004). Published accounts of LiDAR pre-processing focus almost exclusively on DTM production, with very little on DSM or CHM production. For an example of this, see Briese et al. (2002), Sithole and Vosselman (2004), Lee and Younan (2003) and Kraus and Pfeifer (1998), although this last paper does express an interest in honing the modeling of vegetation laser returns. Furthermore, pre-processing techniques depend on the nature of the topography, sampling density and the type of scanner used (Petzold et al., 1999). For these reasons, causes of data pits in LiDAR CHMs due to pre-processing are difficult to assess. All we know is that they visibly exist. A comparison of different classification and interpolation techniques for DSM production along with standard

Original CHM (5) Extract pit mask from median dataset

(6) Fill pits in original CHM with filled pit mask values Fig. 3. Pit filling algorithm.

recommendations in this area is highly desirable, and a good start is found in Hyyppa¨ et al (2004), but is beyond the scope of this paper. Though we do note that Kraus and Pfeifer (1998) developed a recognized method of pre-processing that takes into account a skewed distribution of laser points on treetops, compared with those on the ground. In determining the weight value for their iterative linear prediction, it was found that one (out of three) of

ARTICLE IN PRESS J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

their methods was more robust in dealing with ‘‘negative blunders’’. When multiple echoes are returned in forested areas, information loss occurs when interpolating to a DSM, such as points with different z values but similar x, y values (Axelsson, 1999). Because of this loss of information, Vosselmann and Maas (2001) recommends that only dense pulse returns should be interpolated. However, if datasets less than a certain pulse return density were interpolated, data pits may result. Gaveau and Hill (2003) found that linearly interpolating a LiDAR point cloud to create a DSM resulted in data smoothing; and that smoothing increased if a larger pixel size was selected. This ‘inappropriate’ selection of a smaller pixel size may result in data pits. However, it is important to consider that modeling an optimal spatial resolution (i.e., pixel size) for a forested area is not trivial (Marceau et al., 1994; Marceau and Hay, 1999). Using inverse distance weighting (IDW) to derive interpolated models from point-clouds may cause data pits, since IDW does not smooth datasets as well as other interpolation methods such as Kriging (Clark et al., 2004); however its results may be more accurate, as it models an area within a given distance of known

1943

points. We note that IDW was applied to the Campbell River dataset used in this study, and many data pits are evident herein.

3. Data and study area To evaluate the pit filling algorithm described in this study, a CHM from Vancouver Island, Canada is used. LiDAR data acquisition took place on June 08, 2004 over a forested area 10 km S.W. of Campbell River, on the east coast of Vancouver Island, Canada (Fig. 2). At this location, the terrain ranges from 120 to 470 m above sea level with vegetation consisting of 80% Douglas-fir (Pseudotsuga menziesii), 17% Western red cedar (Thuja plicata), 3% hemlock (Tsuga heterophylla) and small stands of red alder (Alnus rubra) in wetter areas. The understory is comprised of salal (Gaultheria shallon), oregon grape (Berberis nervosa), vanillaleaf deer foot (Achlys triphylla) and various ferns and mosses (Morgenstern et al., 2004). LiDAR data collection was conducted by Terra Remote Sensing (Sidney, British-Columbia, Canada). A small footprint LiDAR instrument mounted on a Bell 206 Jet Ranger helicopter achieved

Fig. 4. GUI basic view: pit filling graphical user interface basic view applied to a 400  400 pixel Campbell River CHM subset at 5% threshold.

ARTICLE IN PRESS 1944

J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

hit densities of 0.7 hits/m2 and a footprint (spot) size of 0.19 m (based on pulse frequency, lowest sustainable flight speed and altitude). In processing the (raw) LiDAR point cloud, separation of vegetation and terrain was carried out with TerraScan version 004.006 (Terrasolid, Helsinki, Finland) using an iterative algorithm that combines filtering and thresholding methods (Kraus and Pfeifer, 1998; Axelsson,1999). This resulted in a classification of ground (last returns) and non-ground (first returns).

More specifically, the DTM and DSM were produced by interpolation of the appropriate classified return. That is, if more than one return fell within a given pixel, only the lowest elevation value was retained in the case of the DTM and the highest value in the case of the DSM. Pixels without LiDAR returns received a value obtained using an inverse distance weighted interpolation by means of the four closest returns distributed in four quadrants (one per quadrant) around the empty pixel. The CHM was

Gap

Original

Subscene

Crown

Pits

250 m

3 x 3 Median

Pit filled

3 x 3 Mean

3 x 3 Gaussian

0

51

15 m Canopy height (m) Fig. 5. Pit filling comparison: Campbell River CHM subscene—an example comparing pit filling (using a 3  3 median filter) to standard 3  3 smoothing filters.

ARTICLE IN PRESS J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

obtained by subtracting the DTM values from the DSM. A 3000  3000 pixel subset of the original CHM is used in this study. It has a spatial (pixel) resolution of 0.5 m, making the study area within the scene 1500 m  1500 m (2.25 km2). See Table 1 for additional LiDAR acquisition metadata.

4. Methods The pit filling algorithm and its related graphical user interface (GUI) are written in interactive data language (IDL version 6.2 Win32  86, a product of ITT Visual Information Solutions). The following summary of the pit filling algorithm corresponds to the flowchart in Fig. 3. This is followed by a more detailed account of key components in Sections 4.1–4.3: (1) A Laplacian (edge) operator is applied to the original CHM. (2) A user-defined threshold is applied to the cumulative histogram of the resulting Laplacian image. (3) A binary mask is generated (from the Laplacian image) where data pits equal one, and non-data pit pixels equal zero. (4) A (3  3) median filter is then applied to the original CHM, resulting in a pit-filled CHM. (5) Pixels marked as data pits (one’s) in the binary mask are used to extract the corresponding digital number (DN) from the ‘median’ filled CHM. (6) These new median values are then used to replace (i.e., fill) the pit values in the original CHM. 4.1. Edge detection In step 1, pits are identified and extracted from the original CHM to create a binary ‘pit’ mask. However, a user cannot simply threshold the original dataset to create a pit mask based on DN’s

1945

alone, since the canopy height values in the Campbell River CHM range from 0 to 51 m, and data pits range from approximately 0–28 m. Thus, selecting specific DN’s (or a range) from the original dataset may exclude many data pits (omission error) and include many non-data pits (commission error). To overcome this, a Laplacian edge detector was applied to the original dataset, as pits can be recognized as abrupt changes (i.e., edges) in DN’s. Pits were then defined by thresholding the cumulative histogram of the resulting edge image, where data pits reveal a narrower and exclusive range of DNs. A Laplacian operator is an optimal edge detector for this application, since it is not directionally biased. In addition, physiological research suggests that humans visually define edges similar to responses from Laplacian operators (Jensen, 2005), which is helpful as visual acuity plays an important role in the threshold selection (elaborated in the following section). The particular Laplacian filter type used (shown in Fig. 3) was chosen, because it defined pits as large negative values, thus producing a stark visual and quantitative contrast with non-data pits. 4.2. Threshold selection Pit threshold selection was defined after much heuristic evaluation and is based on defining a percentage of the cumulative histogram of the Laplacian image. This is equivalent to specifying an expected percent of pits in the original dataset; which is preferable to using a fixed Laplacian value as a threshold, since it allows for explicit control of the percent of CHM pixels that have their DN altered by the pit filling procedure. This threshold can be selected interactively using a pit filling threshold bar GUI (Fig. 4), which allows the user to observe (and change) in real-time the effect of varying thresholds in a subset of the CHM. Visual evaluation of the output from different threshold values indicates that there is a relatively narrow range of values that

Fig. 6. Pit filling before and after. (A) Pre- and (B) post-pit filling x-axis profiles of Campbell River CHM and their related images (x-axis locations in each histogram represent DN values defined from dashed line in corresponding CHM image).

ARTICLE IN PRESS 1946

J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

yield a visually satisfying result. Within this range, the threshold will include the most problematic pits, i.e., those with large decreases in value relative to their neighbors. If the majority of pits are visually captured in the threshold, omission errors (i.e., pits not included within the threshold range) are not a great concern as (i) they will barely (if at all) be visually noticeable and (ii) they represent a very small percentage of total image pixels. Commission errors (i.e., non-pits defined as pits within the threshold range), are also not a great concern, as their numbers will be small and their values will be replaced with the median value of their (non-pit) neighbors. Thus, they will tend to fall within the vertical resolution accuracy of the sensor anyway. 4.3. Pit filling Once a suitable threshold has been selected, a binary ‘pit’ mask is generated where data pits equal one and all other pixels (defined as non-data pits) equal zero (Fig. 4 shows a subset of the Campbell River CHM and its associated binary ‘pit’ mask based on a 5% threshold). Pixels marked as pits subsequently have their DN

value replaced with the value obtained from the same location in a median-filtered CHM image. We have found that a single application of this program provides very satisfactory results. However, if pits remain, the program can simply be run again at an equivalent or smaller threshold.

4.4. Application of the pit filling algorithm in IDL In this study, the pit filling algorithm was applied to the Campbell River CHM using a 3  3 median filter. A threshold of 5% was heuristically determined to be suitable for this site. The hardware used was an Intel Pentium(R) 4, 2.8 GHz CPU, with 512 MB of RAM. In total, this program took 18.4 s to process the 36 MB, 3000  3000 pixel CHM. In addition to the pit-filled CHM, the program also reports (i) minimum Laplacian value (ii) maximum Laplacian value, (iii) threshold Laplacian value, (iv) number of data pits defined by threshold, (v) number of negative values in filled dataset (all negative values are set to zero) and (vi) processing time in seconds.

Fig. 7. GUI 3D view: pit filling graphical user interface 3D view applied to a 400  400 pixel Campbell River CHM subset at 5% threshold.

ARTICLE IN PRESS J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

1947

5. Results

5.2. x-Axis profile pre- and post-pit filling

5.1. Comparing the pit filling algorithm with smoothing filters

Another technique to visually assess the efficacy of the proposed pit filling algorithm is illustrated in Fig. 6, which compares a subset of the Campbell River CHM pre- and post-pit filling with its associated x-axis histogram. By observing the original profile, data pits are easily visible—the most obvious is pointed out with an arrow. In contrast, this data pit is not found in the pit-filled profile. One can also observe that aside from data pits, the x-axis profile structure remains unchanged between the two. In fact, even the structure of canopy gaps, as shown with a circle, remains essentially unchanged.

Fig. 5 visually compares the efficacy of the pit filling algorithm to three different 3  3 smoothing filters applied to a small subscene of the Campbell River CHM. As illustrated, pits are effectively removed by the 3  3 median filter, but as a whole the image appears blurred (i.e., over-smoothed) compared to the original. The 3  3 mean filter does remove pits, but in-turn introduces (larger) smoothed square artifacts along with an increased smoothing to the scene—even more than that generated by the median filter. The 3  3 Gaussian image appears visually identical to the original; consequently, data pits are still problematic. In examining the canopy gaps and crowns highlighted in Fig. 5, it appears that their shape is best maintained in the pit-filled image. Not only does the pit filling algorithm effectively remove pits, but it also preserves distinct edges such as canopy gaps and canopy crowns.

6. Discussion Apart from the physical problems associated with pits, including poor visual appearance in LiDAR height models, and their potential to lead to the underestimation of related

Fig. 8. GUI expert view: pit filling graphical user interface expert view applied to a 400  400 pixel Campbell River CHM subset at 5% threshold and four times zoom.

ARTICLE IN PRESS 1948

J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

biophysical measures, the conceptual problem with pits is that they are inconsistent with Tobler (1970) First law of Geography, a.k.a ‘‘everything is related to everything else, but near things are more related than distant things’’. Pits violate this principle, since they are visually and quantitatively incongruous with their surroundings and do not accurately represent canopy height at their location, or at immediate neighbors. In an effort to mitigate these problems, pits need to be located and filled. Based on our results, this pit filling algorithm is visually superior to the tested smoothing filters for filling data pits while preserving the edges, shape and structure of canopy gaps and canopy crowns. Only the 3  3 median filter is comparable in removing pits; however, the resulting median image does not preserve the height integrity or structural complexity of the original CHM. The 3  3 mean and Gaussian filters are wholly inferior to the pit filling algorithm, in terms of both removing pits and maintaining distinct canopy edges. Gaussian filters are ineffective as data pits remain. The 5% threshold evaluated for this dataset appears optimal, but it may not be so for other CHMs. To assist in the visualization and threshold selection of the pit filling algorithm for other scenes, the GUI can be used to evaluate pit filling at various spatial extents and thresholds (ranging from 1% to 30% of the cumulative Laplacian histogram). In this example, the GUI uses values extracted from a median 3  3 filter to fill data pits (we note that the filter size can be user defined). There are three GUI interfaces in the presented pit filling program: (i) basic, (ii) 3D and (iii) expert, each of which provide a statistical summary of the pit-filled dataset such as minimum, maximum and mean values as well as the absolute number of pixels filled. For the basic and expert options, a lookup table palette is available with user-defined coloring schemes to aid in threshold selection—as different colors will represent different heights in the canopy, providing insight into their origin (i.e., gap or pit). The basic interface (Fig. 4) displays the original image with pits, the pit-filled image and the binary pit mask (at the same spatial extent and window size), the latter two dynamically updated at user-defined thresholds. The 3D interface (Fig. 7) displays the original 2D image with data pits plus 3D renderings of the original image and the pitfilled image. View direction, texture interpolation and elevation can also be defined. The 3D image can also be rendered in realtime at various thresholds. The expert interface (Fig. 8) displays the original 2D image with pits at the largest spatial extent, while two smaller, corresponding windows display a zoomed window of the pit-filled and binary pit mask images. All three interfaces provide an option to save the pit-filled output in GeoTiff format, along with a text file of corresponding statistics.

7. Conclusion LiDAR canopy height models often exhibit unnatural looking holes or pits. While the exact cause of these pits is unknown, they most likely represent a combination of factors ranging from data acquisition to post-processing. The problem with these pits is that not only are they inconsistent with Tobler’s first law of geography—where near things are more related than distant things—but they result in a visually noisy appearance to the CHM. They can also lead to an underestimation of biophysical measurements, and skew ecological analyses based on these height models. In this paper, we present a robust semi-automated pit filling algorithm (developed in IDL) and compare its results to those generated from median, mean and Gaussian smoothing filters typically found in remote sensing software. We briefly describe our algorithm, its graphical user interface, and the results

from applying it to a 3000  3000 pixel LiDAR CHM populated with data pits. Visualization confirms that our method effectively and quickly removes data pits. Future work will include the incorporation of a built in tiling function so that input size will not be an issue. The IDL binary can be freely downloaded from: http:// www.ucalgary.ca/f3gisci/software

Acknowledgments This research has been generously supported in grants to Dr. Hay from the University of Calgary, the Alberta Ingenuity Fund, and the Natural Sciences and Engineering Research Council (NSERC). We also acknowledge an NSERC-BIOCAP research grant awarded to Dr. Benoıˆt St-Onge from which the original Campbell River data were provided and the problem defined. The opinions expressed here are those of the authors, and do not necessarily reflect the views of their funding agencies. We also thank the anonymous reviewers for their constructive feedback. References Axelsson, P., 1999. Processing of laser scanner data—algorithms and applications. ISPRS Journal of Photogrammetry and Remote Sensing 54, 138–147. Baltsavias, E.P., 1999. Airborne laser scanning: basic relations and formulas. ISPRS Journal of Photogrammetry and Remote Sensing 54, 199–214. Briese, C., Pfeifer, N., Dorninger, P., 2002. Applications of the robust interpolation for DTM determination. In: Proceedings International Archives of Photogrammetry and Remote Sensing 34. ISPRS symposium, Graz, Austria, pp. A-55–61. Clark, M.L., Clark, D.B., Roberts, D.A., 2004. Small-footprint lidar estimation of subcanopy elevation and tree height in a tropical rain forest landscape. Remote Sensing of Environment 91, 68–89. Dubayah, R.O., Drake, J.B., 2000. Lidar remote sensing for forestry applications. Journal of Forestry 98, 44–46. Gaveau, D.L.A., Hill, R.A., 2003. Quantifying canopy height underestimation by laser pulse penetration in small-footprint airborne laser scanning data. Canadian Journal of Remote Sensing 29, 650–657. Hilker, T., Wulder, M.A., Coops, N.C., 2008. Methods to extract and update forest inventory info using airborne LiDAR and high spatial resolution satellite imagery. Canadian Journal of Remote Sensing 34, 5–12. Hyde, P., Dubayah, R., Peterson, B., Blair, J.B., Hofton, M., Hunsaker, C., Knox, R., Walker, W., 2005. Mapping forest structure for wildlife habitat analysis using waveform lidar: validation of montane ecosystems. Remote Sensing of Environment 96, 427–437. Hyyppa¨, J., Hyyppa¨, H., Litkey, P., Yu, X., Haggre´n, H., Ro¨nnholm, P., Pyysalo, U., Pitka¨nen, J., Maltamo, M., 2004. Algorithms and methods of airborne laser scanning for forest measurements. In: Proceedings International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 36. Freiburg, Germany, Part 8/W2, Session 4, pp. 82–89. /http://www.isprs.org/ commission8/workshop_laser_forest/S. Hyyppa¨, J., Pyysalo, U., Hyyppa¨, H., Samberg, A., 2000. Elevation accuracy of laser scanning-derived digital terrain and target models in forest environment. In: EARSEL eProceedingsLIDAR Workshop, Dresden, FRG, No. 1, pp. 139–147. Jensen, J.R., 2005. Introductory Digital Image Processing: A Remote Sensing Perspective, third ed. Pearson Prentice Hall, Upper Saddle River, NJ, 526p. Kraus, K., Pfeifer, N., 1998. Determination of terrain models in wooded areas with airborne laser scanner data. ISPRS Journal of Photogrammetry and Remote Sensing 53, 193–203. Latypov, D., 2005. Effects of laser beam alignment tolerance on lidar accuracy. ISPRS Journal of Photogrammetry and Remote Sensing 59, 361–368. Leckie, D., Gougeon, F., Hill, D., Quinn, R., Armstrong, L., Shreenan, R., 2003. Combined high-density lidar and multispectral imagery for individual treecrown analysis. Canadian Journal for Remote Sensing 29, 633–649. Lee, H.S., Younan, N.H., 2003. DTM extraction of lidar returns via adaptive processing. IEEE Transactions on Geoscience and Remote Sensing 41, 2063–2069. Lefsky, M.A., Cohen, W.B., Parker, G.G., Harding, D.J., 2002. Lidar remote sensing for ecosystem studies. BioScience 52, 19–30. Lim, K., Treitz, P., Wulder, M., St-Onge, B., Flood, M., 2003. Lidar remote sensing of forest structure. Progress in Physical Geography 27, 88–106. MacMillan, R.A., Martin, T.C., Earle, T.J., McNabb, D.H., 2003. Automated analysis and classification of landforms using high-resolution digital elevation data: applications and issues. Canadian Journal of Remote Sensing 29, 592–606. Marceau, D.J., Hay, G.J., 1999. Remote sensing contributions to the scale issue. Canadian Journal of Remote Sensing 25, 357–366. Marceau, D.J., Gratton, D.J., Fournier, R.A., Fortin, J-P., 1994. Remote sensing and the measurement of geographical entities in a forested environment. 2. The optimal spatial resolution. Remote Sensing of the Environment 49, 105–117.

ARTICLE IN PRESS J.R. Ben-Arie et al. / Computers & Geosciences 35 (2009) 1940–1949

Mei, C., Durrieu, S., 2004. Tree-crown delineation from digital elevation models and high resolution imagery. In: Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences 36, working group VIII/2, Laser-Scanners for Forest and Landscape Assessment. Freiburg, Germany. /http://www.isprs.org/commission8/workshop_laser_forest/S. Morgenstern, K., Black, T.A., Humphreys, E.R., Griffis, T.J., Drewitt, G.B., Cai, T., Nesic, Z., Spittlehouse, D.L., Livingston, N.J., 2004. Sensitivity and uncertainty of the ˜ o/La carbon balance of a Pacific Northwest Douglas-fir forest during an El Nin ˜ a cycle. Agricultural and Forest Meteorology 123, 201–219. Nin Petzold, B., Reiss, P., Sto¨ssel, W., 1999. Laser scanning—surveying and mapping agencies are using a new technique for the derivation of digital terrain models. ISPRS Journal of Photogrammetry and Remote Sensing 54, 95–104. Popescu, S.C., Wynne, R.H., 2004. Seeing the trees in the forest: using lidar and multispectral data fusion with local filtering and variable window size for estimating tree height. Photogrammetric Engineering and Remote Sensing 70, 589–604. Reutebuch, S.E., Anderson, H., McGaughey, R.J., 2005. Light detection and ranging (LIDAR): an emerging tool for multiple resource inventory. Journal of Forestry 103, 286–292. Sithole, G., Vosselman, G., 2004. Experimental comparison of filter algorithms for bare-earth extraction from airborne laser scanning point clouds. ISPRS Journal of Photogrammetry and Remote Sensing 59, 85–101. Spies, T.A., Franklin, J.F., 1989. Gap characteristics and vegetation response in coniferous forests of the Pacific Northwest. Ecology 70, 543–545.

1949

St-Onge, B., Hu, Y., Vega, C., 2008. Mapping the height and above-ground biomass of a mixed forest using lidar and stereo Ikonos images. International Journal of Remote Sensing 29 (5), 1277–1294. St-Onge, B., Treitz, P., Wulder, M.A., 2003. Tree and canopy height estimation with scanning lidar. In: Wulder, M.A., Franklin, S.E. (Eds.), Remote Sensing of Forest Environments: Concepts and Case Studies. Kluwer Academic Publishers, Boston, pp. 489–509. Tobler, W.R., 1970. A computer movie simulating urban growth in the Detroit region. Economic Geography 46, 234–240. Vosselmann, G., Maas, H., 2001. Adjustment and filtering of raw laser altimetry data. In: Proceedings of the OEEPE (The European Organisation for Experimental Photogrammetric Research) Workshop on Airborne Laser scanning and Interferometric SAR for Detailed Digital Terrain Models, Stockholm, Sweden. Whitmore, T.C., 1989. Canopy gaps and the two major groups of forest trees. Ecology 70, 536–538. Wood, J.D., Fisher, P.F., 1993. Assessing interpolation accuracy in elevation models. IEEE Computer Graphics and Applications 13, 48–56. Younan, N.H., Lee, H.S., King, R.L., 2002. DTM error minimization via adaptive smoothing. In: Proceedings IEEE Geoscience and Remote Sensing Symposium, vol. 6, Toronto, Canada, pp. 3611–3613. doi:10.1109/IGARSS.2002. 1027266. Zhang, K., Chen, S., 2003. A progressive morphological filter for removing nonground measurements from airborne lidar data. IEEE Transactions on Geoscience and Remote Sensing 41, 872–882.