Model-assisted estimation of growing stock volume using different combinations of LiDAR and Landsat data as auxiliary information

Model-assisted estimation of growing stock volume using different combinations of LiDAR and Landsat data as auxiliary information

Remote Sensing of Environment 158 (2015) 431–440 Contents lists available at ScienceDirect Remote Sensing of Environment journal homepage: www.elsev...

930KB Sizes 0 Downloads 24 Views

Remote Sensing of Environment 158 (2015) 431–440

Contents lists available at ScienceDirect

Remote Sensing of Environment journal homepage: www.elsevier.com/locate/rse

Model-assisted estimation of growing stock volume using different combinations of LiDAR and Landsat data as auxiliary information Svetlana Saarela a,⁎, Anton Grafström b, Göran Ståhl b, Annika Kangas a, Markus Holopainen a, Sakari Tuominen c, Karin Nordkvist b, Juha Hyyppä d a

Department of Forest Sciences, University of Helsinki, PO Box 27, FI-00014 Helsinki, Finland Department of Forest Resource Management, Swedish University of Agricultural Sciences, SLU Skogsmarksgränd, SE-901 83 Umeå, Sweden Finnish Forest Research Institute, PO Box 18, FI-01301 Vantaa, Finland d Finnish Geodetic Institute, PO Box 15, FI-02431 Masala, Finland b c

a r t i c l e

i n f o

Article history: Received 13 November 2013 Received in revised form 15 November 2014 Accepted 17 November 2014 Available online 11 December 2014 Keywords: Airborne LiDAR Landsat Model-assisted estimator Probability sampling REDD monitoring

a b s t r a c t Airborne Light Detection and Ranging (LiDAR) and Landsat data were evaluated as auxiliary information with the intent to increase the precision of growing stock volume estimates in field-based forest inventories. The aim of the study was to efficiently utilize both wall-to-wall Landsat data and a sample of LiDAR data using model-assisted estimation. Variables derived from the Landsat 7 ETM+ satellite image were spectral values of blue, green, red, near infra-red (IR), and two shortwave IR (SWIR) bands. From the LiDAR data twenty-six height and density based metrics were extracted. Field plots were measured according to a design similar to the 10th Finnish National Forest Inventory, although with an increased number of plots per area unit. The study was performed in a 30000 ha area of Kuortane, Western Finland. Three regression models based on different combinations of auxiliary data were developed, analysed, and applied in the model-assisted estimators. Our results show that adding auxiliary Landsat and LiDAR data improves estimates of growing stock volume. Very precise results were obtained for the case where wall-to-wall Landsat data, LiDAR strip samples, and field plots were combined; for simple random sampling of LiDAR strips the relative standard errors (RSE) were in the range of 1–4%, depending on the size of the sample. With only LiDAR and field data the RSEs ranged from 4% to 25%. We also showed that probability-proportional-to-size sampling of LiDAR strips (utilizing predicted volume from Landsat data as the size variable) led to more precise results than simple random sampling. © 2014 Elsevier Inc. All rights reserved.

1. Introduction Forest resources are required for an increasing number of purposes globally, including wood- and fibre-based raw materials, maintenance of biodiversity, and mitigation of climate change (Mery et al., 2005). As a consequence, the demands for information from forests are steadily increasing (Kangas & Maltamo, 2006; UNECE and FAO, 2011). National forest inventories (NFIs) have been established for a long time in many countries (e.g. Tomppo et al., 2010). Normally, they are based on statistical samples of field plots (McRoberts et al., 2009, 2010; Woodall et al., 2009) as a means for ensuring trustworthy information, i.e. information derived from estimators that are unbiased and have high precision. Field-based forest inventories have many advantages. However, they become expensive when large sample size is required to reach the needed levels of precision. Furthermore, sparse road networks or ⁎ Corresponding author at: Latokartanonkaari 7, 00014 University of Helsinki, Finland. Tel.: +358 442177795; fax: +358 919158100. E-mail address: svetlana.saarela@helsinki.fi (S. Saarela).

http://dx.doi.org/10.1016/j.rse.2014.11.020 0034-4257/© 2014 Elsevier Inc. All rights reserved.

other conditions in a country may prevent easy access to the plots. Also, NFI information from field plots alone often leads to imprecise estimates for small regions within a country. This has stimulated the development of solutions where field plots and remotely sensed data are combined in order to provide the required information (Holmström et al., 2001; Maltamo et al., 2007; Næsset, 2004). Lately, the REDD+ mechanism (reducing emissions from deforestation and forest degradation; Brockhaus (2009)), which has been developed under the United Nations' Framework Convention on Climate Change, has led to an even stronger focus on forest information and NFIs, and on how to utilize remote sensing within NFIs, especially in countries with poor infrastructure conditions. Several approaches based on remote sensing have been developed and demonstrated , (e.g. Gobakken et al., 2012; Næsset et al., 2006; Nelson et al., 2009). However, inventories that make use of auxiliary information from remote sensing are not only relevant for developing countries and REDD + (e.g. Asner, 2009; Saatchi et al., 2011), but also for remote areas in developed countries, such as Siberia and Alaska (e.g. Andersen et al., 2009; Ene et al., 2012; Nelson et al., 2009). Further, in countries with well-established field-based NFIs, sample-based combinations of

432

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

field and remotely sensed data may offer new possibilities to make inventories cost-efficient. The problems involved in reaching good inventory solutions include variable and sometimes limited information in remotely sensed data, the need to combine remote sensing with field information in order to obtain reliable results, the lack of adequate field samples, the need to apply advanced statistical methods, and the challenge to make the solutions straightforward enough so that they can be easily operated in practice. Regarding the first issue, it is well known that remotely sensed data only contain auxiliary information about a limited number of all the variables that are typically addressed in NFIs. For example, land-use classification can seldom be based on image data alone (Anderson, 1976; Lo & Choi, 2004; Shalaby & Tateishi, 2007). However, for some important variables – for example tree biomass and volume – several remote sensing techniques such as LiDAR and RADAR have great potential (e.g. Wu, 1987; Hyyppä & Hallikainen, 1996; Boudreau et al., 2008; Asner et al., 2011; Næsset, 2011; Bollandsås et al., 2013). Thus, the focus on biomass and carbon in many emerging inventories makes remotesensing-based solutions potentially very useful. LiDAR data are known to provide auxiliary data that are highly correlated with growing stock volume, biomass and aboveground carbon in forests (Hyyppä & Inkinen, 1999; Hyyppä et al., 2008; Næsset, 1997; Næsset, 2009, 2011; Næsset & Gobakken, 2008; Nelson et al., 1988; Stephens et al., 2012). In many applications, LiDAR data have been acquired wall-to-wall over the target forest areas and standlevel estimates have been derived either based on the area method (Næsset, 2011) or based on the identification of individual trees (Hyyppä et al., 2001). For applications over large areas, such as countries, the acquisition of LiDAR data is prohibitively expensive; however, the data acquisition can be carried out as part of a sampling scheme to improve the precision of estimates. For example, Nelson et al. (2004) used a profiling LiDAR to estimate the forest resources of Delaware and Andersen et al. (2009) used data from an airborne laser scanner to estimate forest resources within a region of Alaska. The statistical inference may be either model-based (e.g. McRoberts, 2010; Ståhl et al., 2011) or design-based (e.g. Gregoire et al., 2011). In the design-based approach, model-assisted estimators (Gregoire et al., 2011; Næsset et al., 2013) are typically used based on data from a probability sample; the deviations between the model predictions and the field reference data are calculated and used to correct the model predictions. Main advantages of this approach are that all the attractive properties of design-based inference can be utilized, while at the same time, the LiDAR data and the model can improve the precision of estimates substantially. Drawbacks include that a strict probability sample from the entire population must be acquired and that mismatches in geopositioning between field plots and remotely sensed data may have a negative effect on the accuracy of estimators. Several studies have been conducted where LiDAR samples and field samples have been combined utilizing two-phase sampling and modelassisted estimation (e.g. Gregoire et al., 2011; Næsset et al., 2011); multi-spectral satellite data have sometimes been used to stratify the target population in these studies. Results have highlighted the usefulness of LiDAR as auxiliary data, although in some cases the gain over traditional field sampling has been modest (e.g. Gobakken et al., 2012). However, so far relatively few studies have been performed where several sources of remotely sensed predictors have been utilized in connection with model-assisted estimators. Examples include Andersen et al. (2012), who applied LiDAR and satellite data in a post-stratification approach utilizing multilevel sampling, and Strunk et al. (2014), who utilized several sources of remotely sensed data and compared linear regression and k nearest neighbours (kNN) methods to link field reference data, LiDAR and Landsat data. In developing this type of inventory the choice of sampling strategy is important. It is well known (e.g Särndal et al., 1992) that auxiliary information can be utilized both for improving the design, i.e. how the

sample is selected, and for improving the estimators, i.e. how the target quantities are computed once data have been acquired. Most studies so far have utilized fairly straightforward sampling designs, such as simple random or systematic sampling of LiDAR strips that traverse the entire study area (e.g. Andersen et al., 2009; Gobakken et al., 2012). Auxiliary data have been used to improve the estimators. However, as shown by Grafström et al. (2014), the potential gain from choosing an appropriate sampling design may be substantial and needs to be further evaluated. The objective of this study was to explore how three sources of information that would typically be available in connection with LiDAR sample surveys could be combined utilizing model-assisted estimation. Our data were wall-to-wall Landsat data, strip samples of LiDAR data, and field plots. We compare different cases of auxiliary data usage, show how estimates as well as uncertainty estimates can be derived, evaluate the effect of different sampling designs incorporating LiDAR, and discuss the advantages and disadvantages of the proposed sampling strategies. A novel feature of the study was that a strategy based on probability-proportional-to-size sampling was evaluated for the selection of LiDAR strips. The study was performed in the Kuortane area in the boreal forests of western Finland. 2. Materials 2.1. Study area The Kuortane study area is located in western Finland in the southern Ostrobothnia region (see Fig. 1). It is mainly covered by middle-aged Scots pine boreal forest in the Suomenselkä watershed area. Norway spruce and deciduous trees, mainly birches, usually occur as mixtures. The landscape is composed of forests on mineral soils, peatlands drained for forestry, open mires, and agricultural fields at lower elevations. The terrain depressions are covered by lakes. Non-forest areas – about one third of the total area – were masked out using digital map data (Tomppo et al., 2008). The area was tessellated into 16 × 16 m grid cells for which Landsat and LiDAR data were acquired. Our population was 818 017 grid cells corresponding to 20 942 ha of forest. 2.2. Data The material comprised three datasets: field data (Section 2.2.1), airborne LiDAR data (Section 2.2.2), and Landsat 7 ETM + data (Section 2.2.3). 2.2.1. Field data The field data were sampled using a modification of the design of the Finnish NFI (Tomppo et al., 2008), where the sampling density was significantly increased and the number of measured variables was somewhat reduced. The NFI is a sample-based inventory system that covers all land-use classes and ownership categories in Finland. The aim of the NFI is to produce reliable information on forest resources at national and regional level. The NFI is based on statistical sampling. The sampling design is systematic cluster sampling. Here, the plots were clustered into rectangular clusters of 18 plots, with 200 m distance between plots within clusters. The distance between clusters was 3 500 m. Each circular field plot has a radius of 9 m, so that the size of a plot area corresponds to the grid cell size of the tessellated area. Trees with a diameter at breast height (DBH) larger than 5 cm were measured as tally trees. DBH, tree storey class and tree species were recorded for each tally tree. Height was measured for one sample tree of each species and storey class per plot. The height measurements were used to calibrate the height estimates of the tree species-specific height models of Veltheim (1987). Volume models (Laasasenaho, 1982) were used to estimate tree volumes. The tree-level volumes were then transformed to volumes per hectare for each plot. The field plot locations were

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

433

Fig. 1. The location of the Kuortane study area (upper left), its coverage by LiDAR strips and field data (lower right), and details about the clusters of field plots (upper right).

determined with about 1 m accuracy using a Trimble ProXH GPS receiver. The field work was conducted during the summer 2006. In total 702 plots were inventoried within the study area, 441 of them were located on forest land. Table 1 provides an overview of reference data. 2.2.2. LiDAR data The LiDAR data were collected in July 2006 with an Optech 3100 laser scanning system operated at an altitude of 2000 m above terrain

Table 1 Overview of field data. Variable

DBH (cm) Height (m) Age (years) Volume (m3/ha)

Scots pine

Norway spruce

Deciduous

All

Mean

SD

Mean

SD

Mean

SD

Mean

SD

11.4 9.0 32 60.6

9.1 7.2 32 70.3

4.4 3.8 13 13.7

7.3 6.2 24 45.0

4.2 4.4 15 8.0

6.2 6.2 23 23.2

– – – 82.3

– – – 91.5

using a half scan angle of 15° and a side overlap of about 20%; the outgoing pulse density was 0.64 m−2. This resulted in a swath width of 1 070 m. The divergence of the laser beam (1 064 nm) was 0.3 mrad, which produced a footprint of 60 cm at ground level. LiDAR data were collected from within 21 laser swaths, of which 2 were used for calibration purposes. The Optech 3100 laser scanning system produces four types of returns (labelled “only”, “first”, “intermediate”, and “last”). A digital terrain (bare-earth) surface model was created using the Orientation and Processing of Airborne Laser Scanning data (OPALS) software (Kraus & Pfeifer, 2001), based on the return categories “only” and “last”. For the extraction of LiDAR height and density based metrics, we utilized the categories “only” and “first” together with the digital surface model in the FUSION software (McGaughey, 2012). Twenty-six LiDAR metrics were extracted from the laser point cloud data using the FUSION software. This was done for each 16 × 16 m grid cell and for each circular area corresponding to the field plots. The metrics were selected based on findings in previous studies (Evans et al., 2009; Hopkinson & Chasmer, 2009; Næsset, 2002). They were

434

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

maximum height, percentiles of the LiDAR height distribution from the 5th to the 95th percentile by 5% classes, and also the 99th percentile, mean and modal height, standard deviation of height, median absolute deviation and average absolute deviation of height, skewness and kurtosis of height, canopy relief ratio (which is a difference between mean and minimum canopy heights divided by a difference between maximum and minimum canopy heights), and percentage of first returns above 2 m (a crown cover estimate) (Table A2 in Appendix A). 2.2.3. Landsat data The Landsat 7 Enhanced Thematic Mapper Plus (ETM+) data were acquired in June 2006 (WRS path and row 190/16). The orthorectified (L1T) imagery data were downloaded from the U.S. Geological Survey (accessed in 2011) . Landsat 7 ETM + has 8 bands of different spatial resolution. For our study purposes we used bands 1 to 5 and 7 corresponding to blue, green, red, near infra-red (IR), and two shortwave IR (SWIR) bands (Table A2 in Appendix A) with a spatial resolution of 30 m. The image was geo-referenced to the ETRS35-FIN metric coordinate system, and the pixel size was re-sampled to 16 × 16 m using the nearest neighbour re-sampling method in ArcGIS 10 (ESRI, 2011). Spectral values were extracted for each 16 × 16 m grid cell and for each circular field plot. The digital numbers of the spectral values were used in the regression modelling (see Sections 3.2 and Appendix A). 3. Methods In this section we first present the statistical estimation through model-assisted estimators, and secondly the regression prediction models that we utilized within this framework. 3.1. Statistical estimation The methodological approach of this study was to apply modelassisted estimation (Särndal et al., 1992) to estimate the volume of growing stock within Kuortane, using different combinations of auxiliary data. Landsat data were always assumed to be available wall-towall, while LiDAR data were available in 1 km wide strips. In the study, LiDAR strip samples of different size were chosen, and the field sample of plots was obtained by selecting all the plots within the sampled strips. Fig. 1 provides an overview of Kuortane together with the location of LiDAR strips and field plots. We evaluated the following cases: Case A: Estimation based on the field plots only. To compare with the strip sampling designs, we only utilized field plots from within sampled strips. Case B: Two-phase model-assisted estimation with data from LiDAR strips as the first phase and field plot data as the second phase. Only field plots within sampled LiDAR strips were utilized in the estimation. Case C: Model-assisted estimation with wall-to-wall Landsat data and field plots. In order to make straightforward comparisons with the other alternatives, the field plots were selected only from within strips corresponding to the LiDAR samples. Case D: Two-phase model-assisted estimation with wall-to-wall Landsat data, a LiDAR strip sample as the first phase of sampling, and field plot data as the second phase. Case E: Model-assisted estimation with wall-to-wall Landsat and LiDAR data, and field plots. Like for the other cases only the field plots within sample strips were utilized in the estimation. In all cases, we treated the field plots as if they were selected according to a simple random sampling scheme, although in reality they were clustered. The distances between plots in the clusters were, however, chosen so that the intra-cluster correlation would lower (Tomppo, 2006). For all the cases two different principles for selecting sample strips were evaluated. Our baseline approach was to make the selection through simple random sampling without replacement (Särndal et al.,

1992). The more advanced – and according to our hypothesis the more efficient – alternative was to select the strips through a probabilityproportional-to-size procedure without replacement (πps), where model-predicted total growing stock volumes for the strips based on Landsat data were used as the “size” variable. The πps method is described in Grafström et al. (2012) and briefly in Appendix B. Within the first phase, only auxiliary data were acquired whereas field reference data were obtained from the second phase field plots. Since LiDAR data within the study area in reality were also acquired wall-to-wall we divided the area into 32 strips of equal width (1 000 m). In the evaluations, LiDAR strip samples of different sizes were chosen while the Landsat data were always assumed to be wall-to-wall; the number of field plots was set to the number obtained from within the sampled strips. Regression estimation was applied in this study. As shown by Särndal et al. (1992) and Massey et al. (2014), estimators and variance estimators that are (almost) unbiased can be derived when linear models are applied for the predictions based on auxiliary data. However, when non-linear models are applied (like in our study) no generally applicable approach to deriving unbiased variance estimators appears to be available (Massey et al., 2014; Opsomer et al., 2007). Thus, there is a need to resort to approximate formulas for the variance estimation. As described by Massey et al. (2014), one straightforward approximation approach is to apply formulas for difference estimators, i.e. estimators where the model predictions are assumed to be based on an external model, although the model has been estimated from the sample data. Massey et al. (2014, p.4) points out that in this case the point estimator is still approximately unbiased but the variance estimator will probably slightly underestimate the true variance. In the study we followed this approach. In the following we first provide some further background information about the estimators we applied. Then we present the estimators and the variance estimators for the cases A to E, following the notation used by Särndal et al. (1992). The population of grid cells is denoted by U. We wish to estimate the total Y = ∑Uyk, where yk is the true value of growing stock volume for unit k. We use the indices k and l to refer to units (grid cells). The first phase sample is a sample of n out of N = 32 LiDAR strips, denoted as Sa. Thus Sa contains all the grid cells in the n selected strips. The probability that unit k is included in Sa is denoted as πak and the probability that units k and l are both included in Sa is denoted as πakl. Usually πakl is called the second order inclusion probability (of k and l). The second order inclusion probabilities are needed for the variance and the variance estimators. The second phase sample S of field plots corresponds to sampled grid cells within the selected strips. Thus, the second phase sample is seen as a subset of the strip sample (S ⊂ Sa). The conditional probability that a unit k will be included in S, given that k ∈ Sa is denoted as π kjSa . Similarly, the conditional second order inclusion probability is denoted as π kljSa . We provide more details on our first case A, because the subsequent estimators and their variances can be derived in a similar manner and they are also given in Särndal et al. (1992). For case A, a Horvitz–Thompson estimator of Y = ∑Uyk based on the full strip sample Sa is Y^ ¼

X y k : π ak S

ð1Þ

a

However, the yks normally are not known for all units in Sa since S is a subsample of Sa. Thus we use the estimator Y^ A ¼

X S

Xy yk k ¼ πak πkjSa πk S

ð2Þ

which is an unbiased estimator of Y, see e.g. Särndal et al. (1992, Eq. 9.3.5, p. 348). We have E(ŶA) = E(E(ŶA|Sa)) = E(Ŷ) = Y. Sometimes ŶA is referred to as the π-star or double-expansion estimator.

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

  XX Δ y −y ^1k yl −y ^1l XX ΔkljSa yk −y ^k yl −y ^l akl k ^ Y^ ¼ V þ D   : ð10Þ S π S π π π π π ak al kljSa kl k l

For the variance we use the law of total variance         V Y^ A ¼ V E Y^ A jSa þ E V Y^ A jSa :

ð3Þ

An unbiased (provided all πakl and all πkljSa are strictly positive) variance estimator is given by   XX Δ y y XX ΔkljS y y akl k l k l a ^ Y^ ¼ þ V A  ; S π π π S π ak al kljSa π k π l kl

X y ^k ^k X yk −y þ πak π k S S

ð4Þ

ð5Þ

a

  XX Δ y y XX ΔkljS y −y ^k yl −y ^l akl k l k a ^ Y^ ¼ V þ : B S π π π S π πk π l ak al kljSa kl

ð6Þ

In addition to the notation already introduced, ŷk (ŷl) is the model predicted value for unit k (l), utilizing the LiDAR-based model (see Section 3.2 and Appendix A). For case C, the population total estimator and the corresponding variance estimator are (Särndal et al., 1992, Eq. 9.6.12, p. 358): Y^ C ¼

X

^1k þ y

U

X y −y ^1k k  π k S

ð7Þ

  XX Δ y −y ^1k yl −y ^1l XX ΔkljSa yk −y ^1k yl −y ^1l akl k ^ Y^ ¼ þ : ð8Þ V C   S π S π π π π π ak al kljSa kl k l In addition to the previously introduced notation, ŷ 1k (ŷ1l ) are model predicted values utilizing Landsat data (see Section 3.2 and Appendix A). For case D, the population total estimator and the corresponding variance estimator are (Särndal et al., 1992, Eq. 9.6.8 and Eq. 9.6.10, p. 357):

Y^ D ¼

X U

^1k þ y

Xy ^k −y ^1k X yk −y ^k þ  π π ak k S S

Finally, for case E, the population total estimator and the corresponding variance estimator are (Särndal et al., 1992, Eq. 9.6.12, p. 358): Y^ E ¼

where Δakl = πakl − πakπal, ΔkljSa ¼ πkljSa −π kjSa πljSa and πkl ¼ πakl πkljSa , see Särndal et al. (1992, Eq. 9.3.7, p. 348). The two components in Eq. (4) estimate the two components in Eq. (3). The Δs are known as the covariances of the inclusion indicators. For case B, the population total estimator and the corresponding variance estimator are (Särndal et al., 1992, Eq. 9.6.13 and Eq. 9.6.16, p. 358): Y^ B ¼

435

ð9Þ

a

X U

^k þ y

X y −y ^k k  π k S

ð11Þ

  XX Δ y −y ^k yl −y ^l XX ΔkljSa yk −y ^k yl −y ^l akl k ^ Y^ ¼ þ : V E S π S π πak πal πk πl kljSa kl

ð12Þ

Note that for case E, ŷk (and ŷl) are predictors based on a model that makes use of both Landsat and LiDAR auxiliary data (see Section 3.2 and Appendix A). In Table 2 we show how the first and second order inclusion probabilities and the covariance of the inclusion indicators can be expressed for the two design approaches evaluated in the study, i.e. (i) simple random sampling without replacement of both strips and plots (denoted SI, SI) and (ii) probability proportional to size selection of strips without replacement, based on Landsat model predictions of strip total growing stock values, followed by simple random sampling of plots (denoted πps,SI). For the πps selection we utilized the Landsat regression model, although in a practical approach this model would not be available before field data are collected. Thus, in practice older field data or field data from some neighbouring region would be applied for preparing a model that would be used for the sample selection. In addition to the previous notation, in Table 2 and elsewhere U (k) is the set of grid cells within the strip where unit k is located, mi is the number of field plots in the ith strip, and Mi is the total number of grid cells in the ith strip. As the true population values were not available from the Kuortane area, our evaluation of the performance of the different sampling strategies was based on the estimated variances of the estimators of total growing stock volume. In order to overcome some of the drawbacks with this, we applied Monte-Carlo simulation for the strip samples, and averaged the estimated variances over the outcomes from the simulations. This was possible since both LiDAR and field data were available for the entire study area. However, the field plots within a given sample strip were always the same and thus our Monte-Carlo approach captures the variability due to random sampling of strips but not the variability due to random sampling of plots. Eqs. (1) to (12) are valid for the cases when the y-values are grid cell totals. For practical reasons we made the calculations based on perhectare values and we report the results as per-hectare values as well. Thus, all estimates following Eqs. (2), (5), (7), (9) and (11) – with y-values given as per hectare values – were divided with the total number of grid cells in our target population. Similarly, the variance estimators

Table 2 First and second order probabilities of inclusion and covariances of inclusion indicators for the two design approaches SI,SI and πps,SI. Term

SI,SI

πps,SI

πak

n N

^1 j n∑ j∈U ðkÞ y

π kjSa

∑Sa mi ∑S Mi ( a πak if in same strip; n n−1 otherwise: 8 N N−1 if in same strip; < π kjSa ∑ m ∑Sa mi −1 otherwise: : Sa i ∑Sa Mi ∑Sa M i −1 πak π kjSa πakl π kljSa πkljSa −π kjSa π ljSa πakl − πakπal

πakl π kljSa π⁎k πkl⁎ ΔkljSa Δakl

∑i∈U y^1i ∑Sa mi ∑Sa Mi  π ak if in same strip; see Appendix B otherwise: 8 < π kjSa ∑ m ∑Sa mi −1 : Sa i ∑Sa M i ∑Sa Mi −1 π ak π kjSa π akl πkljSa π kljSa −π kjSa π ljSa πakl − πakπal

if in same strip; otherwise:

436

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

(4), (6), (8), (10) and (12) were divided with the square of the total number of grid cells. Lastly, although grid cells and field plots were of equal size the field plots are circular and their centre points seldom coincide with grid cell centre points. Thus, when differences between field plot values and model predictions were needed in the estimators, the predicted values were derived for a circular plot co-located with the field plot.

where Y is a vector of growing stock volume, X is a matrix of predictors, β is a vector of fixed effects and ε is a vector of error terms. The model form has been found appropriate in previous studies (e.g. Næsset, 2011). The main model evaluation criterion used was the root mean square error of the residual errors.

4. Results 3.2. Regression models To perform the model-assisted estimation, we developed three different regression models. All of them predict growing stock volume (m3/ha) at the level of 16 × 16 m grid cells (or 9 m radius circular plots). The first model utilized Landsat data only, the second LiDAR data only, while the third utilized a combination of Landsat and LiDAR data (called Combined model in the following), see Appendix A. For the LiDAR-based model a large number of LiDAR features were available. Some of them were highly correlated with each other, which may cause problems in regression analysis. A special selection algorithm, described in Appendix A) , was used for selecting an appropriate subset of LiDAR predictor variables for the regression model. For the Landsat-based model spectral values of bands 1 to 5, and 7, were used. All the regression models had the form: pffiffiffiffi T Y ¼ X βþε

ð13Þ

The estimated per-hectare growing stock volumes for Kuortane were found to be in the range 78.2 m3/ha to 106.2 m3/ha for the different cases and sample sizes. For case E and the entire field sample the estimate was 96.1 m3/ha (this value was used in the denominator in calculating relative standard errors). As expected, the standard errors (the square root of the estimated variances) of the growing stock volumes were found to decrease with increasing LiDAR sample size (Fig. 2 and Table 3). Further, the introduction of additional auxiliary data clearly improved the precision of the estimators, i.e. adding Landsat wall-to-wall data improved the precision of the LiDAR strip-based model-assisted estimators (compare case B to case D in Fig. 2 and Table 3). The designs with probability-proportional-to-size sampling in the first phase always resulted in a lower relative standard error than their counterparts based on simple random sampling. As expected, case E – the combination of all available auxiliary data – was the most precise strategy.

Fig. 2. Relative standard errors for the different sampling strategies: A — design-based estimation based on field plots only; B — two-phase model-assisted estimation with data from LiDAR strips as the first phase and field plot data as the second phase; C — two-phase model-assisted estimation with wall-to-wall Landsat data and field plots; D — two-phase model-assisted estimation with wall-to-wall Landsat data, a LiDAR strip sample as the first phase of sampling, and field plot data as the second phase; E — model-assisted estimation with wall-to-wall Landsat and LiDAR data, and field plots (1 — SI,SI, and 2 — πps,SI).

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

437

Table 3 Relative standard errors for the different sampling strategies: A — design-based estimation based on field plots only; B — two-phase model-assisted estimation with data from LiDAR strips as the first phase and field plot data as the second phase; C — two-phase modelassisted estimation with wall-to-wall Landsat data and field plots; D — two-phase modelassisted estimation with wall-to-wall Landsat data, a LiDAR strip sample as the first phase of sampling, and field plot data as the second phase; E — model-assisted estimation with wall-to-wall Landsat and LiDAR data, and field plots (%). Sample size, strips

Case

SI,SI

πps,SI

4

A B C D E A B C D E A B C D E A B C D E A B C D E

28.1 25.1 11.0 5.5 3.6 17.8 15.5 7.6 3.8 2.5 12.0 10.0 5.6 2.6 1.8 8.7 6.7 4.6 2.0 1.5 6.5 4.3 3.9 1.6 1.3

15.0 9.6 10.0 5.4 3.1 9.6 5.3 6.8 3.6 2.2 7.4 4.3 5.0 2.5 1.6 6.1 3.4 4.1 1.8 1.3 4.9 2.2 3.5 1.4 1.1

8

14

20

26

Studying the SI,SI design approach, the results show how the precision of estimates increases from Case A to Case E, i.e. when additional auxiliary information is added. However, the standard errors are fairly large. For a given case and sample size, the design approach πps,SI considerably increased the precision compared to SI,SI. However, the trend with increased precision from Case A to Case E is not as clear for πps,SI; LiDAR sample data performed better than wall-towall Landsat data in this case. Very high precision was attained for the πps,SI two-phase sampling strategy at moderate and large LiDAR strip sample sizes. Regarding the models used in the estimators, Table 4 presents statistical model fit measures after the response variables have been backtransformed. Predicted values versus field reference values are reported in Fig. 3. Details are presented in Appendix A.

5. Discussion The study showed that increased use of auxiliary information from remote sensing improved the precision of the model-assisted estimators. A clear trend of increased precision was obtained from using only field plots, through wall-to-wall Landsat data and LiDAR strip samples, to two-phase sampling and wall-to-wall data of both LiDAR and Landsat data. Further, it was found that πps sampling,

Table 4 Model fit measures after the response variables have been back-transformed. All available field data (441 plots) were used in the models. Measure

LiDAR

Landsat

Combined

Root Mean Square Error (RMSE), m3/ha Relative RMSE, % Coefficient of determination R2

24.5 29.7 0.93

70.2 85.1 0.40

22.6 27.3 0.94

Fig. 3. Field reference values versus predicted values of growing stock volume for the regression models. The line depicts a 1:1 relationship between the two variables. A — Landsat-based, B — LiDAR-based, C — Combined.

438

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

based on Landsat information, was superior to simple random sampling of LiDAR strips. The reduction in relative standard error (relative to simple random sampling) is much less for cases C, D, and E (see Table 3). The reason could be that πps sampling design is based on model predictions derived from Landsat, so when the Landsat data are incorporated in the estimator (as in C, D, and E with simple random sampling) there is less advantage to unequal probability design because the Landsat information is being used in the estimator. The results suggest that there is a choice for how to use Landsat data, in the estimator or in the design. As shown by Ene et al. (2012) caution must be exercised when the performance of different sampling strategies is compared based on estimated variances, as in our case. For example, by sampling simulation Ene et al. (2012) showed that the estimated variances from a LiDARbased model-assisted study with systematic sampling were substantially overestimated. To avoid this problem, in our study the LiDAR strips were selected through simple random sampling and through πps sampling. However, the field plots were located in systematically allocated clusters, although in the calculations we treated the plots as a simple random sample. In several previous studies (e.g. Ene et al., 2012; Næsset et al., 2013) it has been demonstrated that the combination of LiDAR and field samples can be very efficient for obtaining estimates with high precision of growing stock variables, such as total biomass or volume, in largearea forest surveys. Compared to these studies, where satellite spectral data typically were used only for purposes of stratification, we showed that adding wall-to-wall Landsat data to the model-assisted estimation considerably improved the precision of the estimators. Similar conclusions were reached by Andersen et al. (2012) and Strunk et al. (2014). Our study showed a clear gain in precision from using πps sampling compared to using simple random sampling. This is mainly due to the variable size of the LiDAR sample strips (e.g. Næsset et al., 2013), which is accounted for in the case of πps but not in the case of simple random sampling. However, πps sampling may not be efficient in multi-purpose surveys in which many parameters are estimated, some of which may not be related to the size variable used in the πps design. Further, the implication of using old field data or field data from a neighbouring region for establishing the “size” variables would probably be a slightly diminished reduction in standard error compared to what was reported in this study. We used total growing stock per strip predicted based on Landsat data as the “size” variable. However, since volume per area unit is relatively stable over the Kuortane area most likely it would not make a big difference to use forest area instead of predicted growing stock volume. An alternative to πps sampling to address the problem of unequally long strips would be to use a ratio estimator (Särndal et al., 1992), where the size of the strips is included as an additional auxiliary source of information. In any case it is evident that standard simple random sampling strategies normally perform rather poorly in cases where the length of the LiDAR sample strips varies. Our study focused on the variance of different model-assisted estimators under different sampling strategies. Although we made no formal cost-precision evaluation, we made the field sample sizes for the different alternatives comparable through utilizing field plots only from within sampled strips in all cases. Thus, as a crude assertion, the alternatives with equally many sample strips would cost roughly the same for all cases where LiDAR sampling was applied and roughly the same for all cases where LiDAR sampling was not part of the strategy (as Landsat data were assumed to be freely available). Comparing the three models developed for the model-assisted estimation, the LiDAR model and the combined model (utilizing a combination of LiDAR and Landsat data) were about equally good, with relative Root Mean Squared Error (RMSE) of 30% and 27%, respectively, and coefficient of determination (R2) of 0.93 and 0.94. The model based on Landsat data was rather poor, relative RMSE was 85% and R2 was 0.40 (Table 4). Using the Landsat model, some predicted values

deviated substantially from the field reference values; especially this was the case for some peatland plots, which obtained large predicted values of growing stock volume, whereas the field reference values were very low (Fig. 3). Somewhat surprisingly, the 75th, 80th and 90th LiDAR height percentiles were all selected into the LiDAR and Combined models (see Table A3 in Appendix A). Thus, one would expect multicollinearity to be a problem in our models. However, as shown by Fu (2005), when the purpose is to utilize the models for predictions only, multicollinearity is not a major problem. A problem in remote sensing based model-assisted estimation may be the poor spatial matching of field and remotely sensed data. In case there are deviations, errors occur both when the models are developed and when they are applied. In the application phase, geolocation mismatches are likely to increase the variance estimates. In our study, problems of this kind were small as the accuracy of the plot locations was high and since a procedure was applied where the remote sensing features were extracted to fit exactly the shape of the plots (circular). We conclude that it would be important to further explore what combinations of auxiliary data would underpin the most cost-efficient model-assisted strategies. However, in such studies it would be important to consider also the problem of land use classification (Tomppo et al., 2008), which is pertinent to most applications but typically neglected in research studies. While most research studies in this area, including ours, have assumed an adequate land use map to be available this is typically not the case in applications. Acknowledgement This study was financed by the Finnish Graduate School in Forest Sciences and the Finnish Academy project “Science and Technology Towards Precision Forestry” (PreciseFor). The authors wish to express their gratitude to Dr. Eva Lindberg for her expert help in developing regression models as well as Dr. Kenneth Olofsson and Jonas Bohlin for their help with airborne LiDAR data processing. Further, we wish to thank four anonymous reviewers for their comments and suggestions, which helped us to improve the manuscript. Appendix A. Regression models Altogether twenty-six LiDAR and six Landsat variables were available (see Tables A1 and A2) for model development. For detailed description of the LiDAR metrics, see McGaughey (2012). In order to obtain good model fits, we evaluated three different forms of variable transformations: log–log transformation (i.e. both responses and predictors are transformed logarithmically), semilog (i.e. when only responses are logarithmically transformed) and a model with square root transformation of the response variable. The square root transformation led to the smallest back-transformed RMSEs and, thus, was chosen for the models. Similar results are reported by Næsset (2011). In the case of Landsat data, we also noted that the use of square root transformed predictors resulted in lower back-

Table A1 Landsat variables available in the study. Landsat variable

Description

Band1 Band2 Band3 Band4 Band5 Band7

Band 1 — blue (wavelength 450–520 nm) Band 2 — green (wavelength 520–600 nm) Band 3 — red (wavelength 630–690 nm) Band 4 — near infrared (wavelength 770–900 nm) Band 5 — short-wave infrared (wavelength 1550–1750 nm) Band 7 — short-wave infrared (wavelength 2090–2350 nm)

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

Appendix B. Inclusion probabilities for the conditional Poisson design

Table A2 LiDAR variables available in the study. LiDAR variable

Description

max P01, P05, P10, P20, P25, P30, P40, P50, P60, P70, P75, P80, P90, P95, P99 min mean modal stdev medabdev avabdev skewness kurtosis CanopyReliefRatio

Maximum LiDAR height Percentiles of the LiDAR height distribution: 1th, 5th, 10th, 20th, 25th, 30th, 40th, 50th, 60th, 70th, 75th, 80th, 90th, 95th, 99th Minimum LiDAR height Mean LiDAR height Modal LiDAR height Standard deviation of LiDAR height Median absolute deviation of LiDAR height Average absolute deviation of LiDAR height Skewness of LiDAR height Kurtosis of LiDAR height mean−min Canopy relief ratio: max−min Percentage of LiDAR first returns above 2 m (a crown cover estimate)

above2m

439

transformed RMSEs. Thus, all regression models were based on a model of the following type: pffiffiffiffi T Y ¼X βþε

Let pi, i = 1,2,…,N, be the parameters for Poisson sampling, i.e. each unit i is included independently of the others with probability pi. Also let Ii, i = 1,2,…,N, be the inclusion indicators for Poisson sampling, i.e. the Iis are independent and Ii ~ Bin(1,pi). If only samples of size n are accepted, we get conditional Poisson sampling (Hájek, 1964) and the inclusion probabilities are given by  1 X  N  @ ¼ Pr Ii ¼ 1 I j ¼ nA:  j¼1 0

ðnÞ πi

ðB:1Þ

If the parameters pi = πi, i = 1,2,…,N, are used we only have πi(n) ≈ πi. The inclusion probabilities π(n) can be calculated recursively i by the following formula

ðnÞ πi

  ðn−1Þ pi =ð1−pi Þ 1−πi   ; ¼n ðn−1Þ ∑Nj¼1 p j = 1−p j 1−π j

ðB:2Þ

ðA:1Þ

Y = (Y1,..., Yz)T is the z-dimensional vector of observed values of growing stock volume for the z sampling units (m3/ha), β = (β0, β1,..., βp)T is the (p + 1)-dimensional vector of model parameters, XT is a matrix of dimension z by (p + 1) of different Landsat and LiDAR variables, and ε is the z-dimensional vector of error terms. The square root of the digital numbers of all six Landsat bands (Band1, Band2, Band3, Band4, Band5 and Band7) was used in the Landsat and Combined models. For the selection of LiDAR predictor variables we used an algorithm available in the R package FSelector (Romanski, 2013). This algorithm is based on the forward selection principle, i.e. it selects predictor variables into the model based on their correlation with the response variable after the response variable has been adjusted for the effect of the predictor variables already included (e.g. Chatterjee & Hadi, 2012). As a result of applying the algorithm the following LiDAR variables were selected: mean height (mean), height to the 75th (P75), 80th (P80) and 90th (P90) canopy cover percentiles, and percentage of first returns above 2 m height (above2m). Model parameter estimates, and their standard errors and t-values are presented in Table A3. No corrections for back-transformation bias were made, since such bias (Gregoire et al., 2008) would be corrected for by the model-assisted estimators.

= 0, i = 1,2,…,N. Formula (B.2) is due where the start is given by π(0) i to Chen et al. (1994). The second order inclusion probabilities can be derived from the following recursive relationship

ðnÞ

ðnÞ

πi j ¼ πi

ðn−1Þ

πj

ðn−1Þ

−πi j

ðB:3Þ

ðn−1Þ 1−πi

for i ≠ j (and πii(n) = πi(n)). We need to adjust the pis in order to get the inclusion probabilities πi, i = 1,2,…,N. The parameters can be adjusted by an iterative procedure by Aires (2000)   ðn Þ pi ðt þ 1Þ ¼ pi ðt Þ þ π i −πi ðt Þ ;

t ¼ 0; 1; 2; …

ðB:4Þ

where πi(n)(t) corresponds to the inclusion probabilities of Conditional Poisson (CP)-sampling when the parameters pi (t) are used. These πi(n)(t)s need to be calculated in each step, preferably by using (B.2). However, if pi(0) = πi, we usually only need to do a few number of iterations.

Table A3 Estimated model parameters. Variable

Landsat

Intercept ðBand1Þ =

2

ðBand2Þ =

2

ðBand3Þ =

2

ðBand4Þ =

2

ðBand5Þ =

2

1

1

1

1

1

ðBand7Þ = mean P75 P80 P90 above2m 1

2

LiDAR

Combined

Est.

SE

t value

Est.

SE

t value

Est.

SE

t value

6.92 −0.55

2.25 0.40

3.07 −1.39

0.13 –

0.02 –

7.29 –

0.05 −0.05

0.63 0.11

0.08 −0.43

0.95

0.34

2.83







0.20

0.10

2.13

−0.44

0.19

−2.32







0.08

0.05

1.48

−0.18

0.09

−1.87







−0.11

0.03

−3.93

−0.56

0.17

−3.39







−0.05

0.05

−1.0

0.09

0.21

0.44







−0.04

0.06

−0.73

– – – – –

– – – – –

– – – – –

0.06 −0.02 0.10 0.01 0.01

0.01 0.014 0.02 0.01 0.001

5.65 −1.39 4.86 0.74 7.54

0.06 −0.02 0.10 0.003 0.01

0.01 0.01 0.02 0.01 0.001

6.16 −1.83 5.23 0.37 7.26

440

S. Saarela et al. / Remote Sensing of Environment 158 (2015) 431–440

References Aires, N. (2000). Comparisons between conditional Poisson sampling and Pareto πps sampling designs. Journal of Statistical Planning and Inference, 88, 133–147. Andersen, H. -E., Barrett, T., Winterberger, K., Strunk, J., & Temesgen, H. (2009). Estimating forest biomass on the western lowlands of the Kenai Peninsula of Alaska using airborne LiDAR and field plot data in a model-assisted sampling design. Proceedings of the IUFRO Division 4 conference: Extending forest inventory and monitoring over space and time (pp. 19–22). Andersen, H. -E., Strunk, J., Temesgen, H., Atwood, D., & Winterberger, K. (2012). Using multilevel remote sensing and ground data to estimate forest biomass resources in remote regions: A case study in the boreal forests of interior Alaska. Canadian Journal of Remote Sensing, 37, 596–611. Anderson, J.R. (1976). A land use and land cover classification system for use with remote sensor data, vol. 964, US Government Printing Office. Asner, G.P. (2009). Tropical forest carbon assessment: Integrating satellite and airborne mapping approaches. Environmental Research Letters, 4, 034009. Asner, G.P., Hughes, R.F., Mascaro, J., Uowolo, A.L., Knapp, D.E., Jacobson, J., et al. (2011). High-resolution carbon mapping on the million-hectare Island of Hawaii. Frontiers in Ecology and the Environment, 9, 434–439. Bollandsås, O., Gregoire, T., Næsset, E., & Øyen, B. -H. (2013). Detection of biomass change in a Norwegian mountain forest area using small footprint airborne laser scanner data. Statistical Methods and Applications, 22, 113–129. Boudreau, J., Nelson, R.F., Margolis, H.A., Beaudoin, A., Guindon, L., & Kimes, D.S. (2008). Regional aboveground forest biomass using airborne and spaceborne LiDAR in Québec. Remote Sensing of Environment, 112, 3876–3890. Brockhaus, M. (2009). REDD+: National strategy and policy options. Cifor. Chatterjee, S., & Hadi, A.S. (2012). Regression analysis by example. John Wiley & Sons. Chen, X. -H., Dempster, A.P., & Liu, J.S. (1994). Weighted finite population sampling to maximize entropy. Biometrika, 81, 457–469. Ene, L.T., Næsset, E., Gobakken, T., Gregoire, T.G., Ståhl, G., & Nelson, R. (2012). Assessing the accuracy of regional LiDAR-based biomass estimation using a simulation approach. Remote Sensing of Environment, 123, 579–592. ESRI (2011). ArcGIS Desktop: Release 10. Redlands, CA: Environmental Systems Research Institute. Evans, J.S., Hudak, A.T., Faux, R., & Smith, A. (2009). Discrete return LiDAR in natural resources: Recommendations for project planning, data processing, and deliverables. Remote Sensing, 1, 776–794. Fu, W.J. (2005). A note on prediction error with collinearity. Communications in Statistics — Theory and Methods, 34, 619–623. U.S. Geological Survey (accessed in 2011). Landsat Missions. http://landsat.usgs.gov/ index.php. Accessed: 28 March 2011. Gobakken, T., Næsset, E., Nelson, R., Bollandsås, O.M., Gregoire, T.G., Ståhl, G., et al. (2012). Estimating biomass in Hedmark County, Norway using national forest inventory field plots and airborne laser scanning. Remote Sensing of Environment, 123, 443–456. Grafström, A., Qualité, L., Tillé, Y., Matei, A., et al. (2012). Size constrained unequal probability sampling with a non-integer sum of inclusion probabilities. Electronic Journal of Statistics, 6, 1477–1489. Grafström, A., Saarela, S., & Ene, L.T. (2014). Efficient sampling strategies for forest inventories by spreading the sample in auxiliary space. Canadian Journal of Forest Research, 44, 1156–1164. Gregoire, T.G., Lin, Q.F., Boudreau, J., & Nelson, R. (2008). Regression estimation following the square-root transformation of the response. Forest Science, 54, 597–606. Gregoire, T.G., Ståhl, G., Næsset, E., Gobakken, T., Nelson, R., & Holm, S. (2011). Modelassisted estimation of biomass in a LiDAR sample survey in Hedmark County, Norway. Canadian Journal of Forest Research, 41, 83–95. Hájek, J. (1964). Asymptotic theory of rejective sampling with varying probabilities from a finite population. The Annals of Mathematical Statistics, 1491–1523. Holmström, H., Nilsson, M., & Ståhl, G. (2001). Simultaneous estimations of forest parameters using aerial photograph interpreted data and the k nearest neighbour method. Scandinavian Journal of Forest Research, 16, 67–78. Hopkinson, C., & Chasmer, L. (2009). Testing LiDAR models of fractional cover across multiple forest ecozones. Remote Sensing of Environment, 113, 275–288. Hyyppä, J., & Hallikainen, M. (1996). Applicability of airborne profiling radar to forest inventory. Remote Sensing of Environment, 57, 39–57. Hyyppä, J., Hyyppä, H., Leckie, D., Gougeon, F., Yu, X., & Maltamo, M. (2008). Review of methods of small-footprint airborne laser scanning for extracting forest inventory data in boreal forests. International Journal of Remote Sensing, 29, 1339–1366. Hyyppä, J., & Inkinen, M. (1999). Detecting and estimating attributes for single trees using laser scanner. The Photogrammetric Journal of Finland, 16, 27–42. Hyyppä, J., Kelle, O., Lehikoinen, M., & Inkinen, M. (2001). A segmentation-based method to retrieve stem volume estimates from 3-D tree height models produced by laser scanners. IEEE Transactions on Geoscience and Remote Sensing, 39, 969–975. Kangas, A., & Maltamo, M. (2006). Forest inventory: Methodology and applications, vol. 10, Springer. Kraus, K., & Pfeifer, N. (2001). Advanced DTM generation from LIDAR data. International Archives Of Photogrammetry Remote Sensing And Spatial Information Sciences, 34, 23–30. Laasasenaho, J. (1982). Taper curve and volume functions for pine, spruce and birch [Pinus sylvestris, Picea abies, Betula pendula, Betula pubescens]. The Finnish Forest Research Institute. Lo, C., & Choi, J. (2004). A hybrid approach to urban land use/cover mapping using Landsat 7 Enhanced Thematic Mapper Plus (ETM+) images. International Journal of Remote Sensing, 25, 2687–2700. Maltamo, M., Packalén, P., Peuhkurinen, J., Suvanto, A., Pesonen, A., & Hyyppä, J. (2007). Experience and possibilities of ALS based forest inventory in Finland. ISPRS Workshop on Laser Scanning 2007 and SilviLaser 2007, Espoo, September 12–14, 2007, Finland.

Massey, A., Mandallaz, D., & Lanz, A. (2014). Integrating remote sensing and past inventory data under the new annual design of the Swiss National Forest Inventory using three-phase design-based regression estimation. Canadian Journal of Forest Research, 44, 1177–1186. McGaughey, R. (2012). FUSION/LDV: Software for LIDAR data analysis and visualization, version 3.01. http://forsys.cfr.washington.edu/fusion/fusionlatest.html (Accessed: 24 August 2012.) McRoberts, R.E. (2010). Probability-and model-based approaches to inference for proportion forest using satellite imagery as ancillary data. Remote Sensing of Environment, 114, 1017–1025. McRoberts, R.E., Ståhl, G., Vidal, C., Lawrence, M., Tomppo, E., Schadauer, K., et al. (2010). National forest inventories: Prospects for harmonised international reporting. National Forest Inventories (pp. 33–43). Springer. McRoberts, R.E., Tomppo, E., Schadauer, K., Vidal, C., Ståhl, G., Chirici, G., et al. (2009). Harmonizing national forest inventories. Journal of Forestry, 107, 179–187. Mery, G., Alfaro, R., Kanninen, M., & Lovobikov, M. (2005). Forests in the global balancechanging paradigms. Helsinki: International Union of Forest Research Organizations [IUFRO], 17. Næsset, E. (1997). Estimating timber volume of forest stands using airborne laser scanner data. Remote Sensing of Environment, 61, 246–253. Næsset, E. (2002). Predicting forest stand characteristics with airborne scanning laser using a practical two-stage procedure and field data. Remote Sensing of Environment, 80, 88–99. Næsset, E. (2004). Practical large-scale forest stand inventory using a small-footprint airborne scanning laser. Scandinavian Journal of Forest Research, 19, 164–179. Næsset, E. (2009). Effects of different sensors, flying altitudes, and pulse repetition frequencies on forest canopy metrics and biophysical stand properties derived from small-footprint airborne laser data. Remote Sensing of Environment, 113, 148–159. Næsset, E. (2011). Estimating above-ground biomass in young forests with airborne laser scanning. International Journal of Remote Sensing, 32, 473–501. Næsset, E., & Gobakken, T. (2008). Estimation of above- and below-ground biomass across regions of the boreal forest zone using airborne laser. Remote Sensing of Environment, 112, 3079–3090. Næsset, E., Gobakken, T., Bollandsås, O.M., Gregoire, T.G., Nelson, R., & Ståhl, G. (2013). Comparison of precision of biomass estimates in regional field sample surveys and airborne LiDAR-assisted surveys in Hedmark County, Norway. Remote Sensing of Environment, 130, 108–120. Næsset, E., Gobakken, T., & Nelson, R. (2006). Sampling and mapping forest volume and biomass using airborne LIDARs. Proceedings of the eight annual forest inventory and analysis symposium (pp. 297–301). Næsset, E., Gobakken, T., Solberg, S., Gregoire, T.G., Nelson, R., Ståhl, G., et al. (2011). Modelassisted regional forest biomass estimation using LiDAR and InSAR as auxiliary data: A case study from a boreal forest area. Remote Sensing of Environment, 115, 3599–3614. Nelson, R., Boudreau, J., Gregoire, T.G., Margolis, H., Næsset, E., Gobakken, T., et al. (2009). Estimating Quebec provincial forest resources using ICESat/GLAS. Canadian Journal of Forest Research, 39, 862–881. Nelson, R., Krabill, W., & Tonelli, J. (1988). Estimating forest biomass and volume using airborne laser data. Remote Sensing of Environment, 24, 247–267. Nelson, R., Short, A., & Valenti, M. (2004). Measuring biomass and carbon in Delaware using an airborne profiling LIDAR. Scandinavian Journal of Forest Research, 19, 500–511. Opsomer, J.D., Breidt, F.J., Moisen, G.G., & Kauermann, G. (2007). Model-assisted estimation of forest resources with generalized additive models. Journal of the American Statistical Association, 102, 400–409. Romanski, P. (2013). FSelector: Selecting attributes. R package version 0.19. Saatchi, S.S., Harris, N.L., Brown, S., Lefsky, M., Mitchard, E.T., Salas, W., et al. (2011). Benchmark map of forest carbon stocks in tropical regions across three continents. Proceedings of the National Academy of Sciences, 108, 9899–9904. Särndal, C. -E., Swensson, B., & Wretman, J.H. (1992). Model assisted survey sampling. Springer. Shalaby, A., & Tateishi, R. (2007). Remote sensing and GIS for mapping and monitoring land cover and land-use changes in the Northwestern coastal zone of Egypt. Applied Geography, 27, 28–41. Ståhl, G., Holm, S., Gregoire, T.G., Gobakken, T., Næsset, E., & Nelson, R. (2011). Modelbased inference for biomass estimation in a LiDAR sample survey in Hedmark County, Norway. Canadian Journal of Forest Research, 41, 96–107. Stephens, P.R., Kimberley, M.O., Beets, P.N., Paul, T.S., Searles, N., Bell, A., et al. (2012). Airborne scanning LiDAR in a double sampling forest carbon inventory. Remote Sensing of Environment, 117, 348–357. Strunk, J.L., Temesgen, H., Andersen, H. -E., & Packalén, P. (2014). Prediction of forest attributes with field plots, Landsat, and a sample of LiDAR strips. Photogrammetric Engineering & Remote Sensing, 80, 143–150. Tomppo, E. (2006). The Finnish national forest inventory. Forest Inventory (pp. 179–194). Springer. Tomppo, E., Gschwantner, T., Lawrence, M., McRoberts, R., Gabler, K., Schadauer, K., et al. (2010). National forest inventories. Pathways for common reporting. Springer. Tomppo, E., Haakana, M., Katila, M., & Peräsaari, J. (2008). Multi-source national forest inventory: Methods and applications, vol. 18, New York: Springer. UNECE and FAO (2011). State of Europe's forests 2011. Status and trends in sustainable forest management in Europe. Veltheim, T. (1987). Pituusmallit männylle, kuuselle ja koivulle. Master's thesis Helsinki, Finland: Faculty of Agriculture and Forestry, University of Helsinki. Woodall, C.W., Rondeux, J., Verkerk, P.J., & Ståhl, G. (2009). Estimating dead wood during national forest inventories: A review of inventory methodologies and suggestions for harmonization. Environmental Management, 44, 624–631. Wu, S. -T. (1987). Potential application of multipolarization SAR for pine-plantation biomass estimation. Geoscience and Remote Sensing, IEEE Transactions on, 3, 403–409.