Combining UAV and Sentinel-2 auxiliary data for forest growing stock volume estimation through hierarchical model-based inference

Combining UAV and Sentinel-2 auxiliary data for forest growing stock volume estimation through hierarchical model-based inference

Remote Sensing of Environment xxx (xxxx) xxx–xxx Contents lists available at ScienceDirect Remote Sensing of Environment journal homepage: www.elsev...

2MB Sizes 65 Downloads 69 Views

Remote Sensing of Environment xxx (xxxx) xxx–xxx

Contents lists available at ScienceDirect

Remote Sensing of Environment journal homepage: www.elsevier.com/locate/rse

Combining UAV and Sentinel-2 auxiliary data for forest growing stock volume estimation through hierarchical model-based inference Stefano Pulitia,⁎, Svetlana Saarelab, Terje Gobakkena, Göran Ståhlb, Erik Næsseta a b

Faculty of Environmental Sciences and Natural Resource Management, Norwegian University of Life Sciences, NMBU, P.O. Box 5003, NO-1432 Ås, Norway Department of Forest Resource Management, Swedish University of Agricultural Sciences, SLU, Skogsmarksgränd, SE-901 83 Umeå, Sweden

A R T I C L E I N F O

A B S T R A C T

Keywords: Unmanned aerial vehicles Sentinel-2 Growing stock volume Hierarchical model-based inference

Remotely sensed (RS) data are becoming increasingly important as sources of auxiliary information in forest resource assessments. Data from several satellites providing moderate image resolution are freely available (e.g. Sentinel-2). In addition, very-high-resolution three-dimensional data are available due to the advent of unmanned aerial vehicles (UAV). The increasing availability of auxiliary data offers new opportunities for largescale forest surveys using UAVs. A recently developed hierarchical model-based mode of inference makes it possible to use hierarchically nested auxiliary data in estimating population properties, such as total or mean biomass or volume, and their corresponding uncertainties in a statistically appropriate manner. In this study, hierarchical model-based inference was used to estimate growing stock volume (GSV; m3 ha− 1) and its variance using a small sample of field data, a larger sample of UAV data, and wall-to-wall Sentinel-2 data in a study area in SE Norway. The main objective of the study was to compare the performance, in terms of precision, of hierarchical model-based inference (denoted Case C) against two alternative cases. These were (1) model-based inference based on field data and wall-to-wall data, collected either with airborne laser scanning (Case A.1) or Sentinel-2 data (Case A.2), and (2) hybrid inference using a small sample of field data and a larger sample of UAV data (Case B). A second objective was to assess the possibility of reducing the UAV sampling intensity when adopting Case C rather than B, without decreasing the precision of the GSV estimates. The results, calculated as  % ), indicated that in case C the precision was of similar magnitude standard error as percentage of the mean (SE  % = 3.44%) as for Case A.1 (SE  % = 3.69%) and for Case B (SE  % = 3.58%). The standard error of Case A.2 (SE  % = 5.81%) as the rest of the cases. The results also indicated possibilities of was nearly twice as large (SE reducing the UAV sampling intensity without losing precision in cases where wall-to-wall Sentinel-2 data are available (Case C). The same precision for Case C with only five UAV samples was achieved as for Case B with 55 UAV samples. Thus, the study highlights the cost-efficiency of applications of UAV as in Case C and also provides first insights in the use of Sentinel-2 data for GSV estimation in boreal conditions.

1. Introduction 1.1. Background Unmanned aerial vehicles (UAVs) have been proposed as an innovative and accessible tool to support the acquisition of three dimensional (3D) remotely sensed (RS) data for forest management inventories (e.g, Dandois and Ellis, 2013; Lisein et al., 2013; Puliti et al., 2015) and for large-scale forest surveys (Puliti et al., 2017). UAV data have proven to provide detailed information on forest structures, yet their application is hampered by cost limitations when mapping large forest areas. In recent years, the availability of global coverage of RS auxiliary data has also increased rapidly, offering unique opportunities



for the production of large-scale forest resource estimates and maps. Nowadays, several satellite missions exist, providing freely available imagery (i.e., Landsat 8 and Sentinel-2). However, due to the moderate resolution of these satellite products (i.e., 10–30 m) and limited correlation of spectral variables with forest structural properties, the direct use of satellite imagery in forest inventories might not be compliant with the uncertainty prerequisites of current reporting schemes (e.g., REDD +; Reducing Emissions from Deforestation and Forest Degradation in Developing Countries; UNFCC, 2010). In parallel to the increased availability of auxiliary RS data, there have been major advances in developing statistical frameworks to ensure a rigorous uncertainty assessment when increasing the sources of auxiliaries in the estimation (Gregoire et al., 2016). A relevant example is the study by

Corresponding author. E-mail address: [email protected] (S. Puliti).

http://dx.doi.org/10.1016/j.rse.2017.10.007 Received 13 February 2017; Received in revised form 6 October 2017; Accepted 6 October 2017 0034-4257/ © 2017 Published by Elsevier Inc.

Please cite this article as: Puliti, S., Remote Sensing of Environment (2017), http://dx.doi.org/10.1016/j.rse.2017.10.007

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

like tree height or growing stock volume (Næsset et al., 2016; Saarela et al., 2016). Sentinel-2 data have been available since June 2015 and to the very best of our knowledge such data have to date been used in only one forestry related study on tree species classification (Immitzer et al., 2016) and there are no experiences so far on estimation of GSV. In this study, Sentinel-2 data were used as they are expected to represent an important data source for future forest monitoring schemes. It is therefore highly relevant to understand the level of uncertainty that can be expected when using Sentinel-2 to enhance the estimation.

Saarela et al. (2016), who developed estimators enabling the use of two levels of hierarchically nested RS auxiliary data. Applying this method, the combination of UAV data with satellite imagery might lead to an increase of the precision of estimators of key forest properties and also enable the production of maps. This approach may therefore benefit from the high resolution UAV data and from the large coverage of satellite imageries and, potentially, offer a cost-effective alternative to existing methods for large-area forest surveys. 1.2. UAVs in forest inventory

1.4. Combining multiple levels of RS auxiliary

The use of UAVs in forest inventories has gained attention in recent years because these RS data acquisition platforms allow a large number of users to capture very detailed information on the 3D structure of forest canopy. For example, the point clouds generated from photogrammetric processing of UAV imagery were proven to produce very high quality results even when compared to state-of-the-art RS data sources like airborne laser scanning (ALS) (Wallace et al., 2016; Puliti et al., 2017). To date, most applications of UAVs to derive 3D data in forest inventories have been based on optical imagery, but application of UAV-borne lasers is also a potentially intesting, but less explored, alternative. The use of UAVs to acquire full-coverage data is costly for largescale forest (i.e., area > 1000 ha) inventories and for this reason, Puliti et al. (2017) proposed a method using UAV data as part of a sampling strategy for large-scale forest inventory (i.e., area = 7330 ha). Their approach resulted in an increase in precision of field based estimates while not incurring in prohibitive costs of UAV wall-to-wall data acquisition. However, the method proposed by Puliti et al. (2017) can only be used to produce point estimates over a defined area of interest and not a continuous map of forest resources. The production of maps to support decision making and for local accounting of forest resources (McRoberts, 2010) is, however, a desired property of RS based forest inventories.

Traditionally, RS based forest inventories have relied upon models linking field plot information with RS data. The estimation in these cases has often been done in a model-based framework (Ståhl et al., 2016), where the precision of the estimators is based on the uncertainty of the model parameter estimates (Gregoire, 1998; McRoberts, 2006, 2010). With the increased availability of RS auxiliary data, there is an increased need for rigorous statistical frameworks that enable a clear reporting of the uncertainty of the estimators (Gregoire et al., 2016). Due to the large costs of acquiring expensive RS auxiliary data (i.e., ALS or UAV data) over large areas, statistical frameworks that enable the use of two-phase or two-stage sampling designs have recently been developed (Gregoire et al., 2011; Ståhl et al., 2011; Ringvall et al., 2016). Ståhl et al. (2011) proposed an inferential framework drawing partly on design-based inference and partly on model-based inference, which has later been referred to as hybrid inference, a notion introduced by Corona et al. (2014). With hybrid inference, a probabilistic sample of RS data is acquired (first phase) and then a second sample of field plots (second phase) is acquired where RS data are also available. The hybrid variance estimator includes two sources of uncertainty, namely: (1) the sampling error in the first phase and (2) the uncertainty due to model parameter estimates. In response to the wide availability of RS data from various sources, there have been increased efforts to combine multiple RS data types at different resolution for increasing the precision of estimators in largearea forest surveys. Some of the first examples of the use of multiple RS data in forest inventories used space-borne lidar samples in combination with samples of ALS and field plots (Boudreau et al., 2008; Nelson et al., 2009; Neigh et al., 2013; Margolis et al., 2015; Nelson et al., 2016). All these studies relied on a nested structure of the models linking biomass with ALS data and then ALS predicted above ground biomass or carbon with space-borne Lidar, but they admitted that they estimated the variance by accounting only for uncertainty associated with the second model. However, given the increase in complexity due to the nested structure of the models, it is crucial to account for every source of uncertainty in the estimation. A recent study by Saarela et al. (2016) aimed at filling this gap by developing hierarchical model-based estimators to estimate GSV and its uncertainty when using a combination of field, ALS, and Landsat data. The merit of the work by Saarela et al. (2016) lies in the development of variance estimators accounting for the uncertainties of all the models linking the different levels of information. In their work, Saarela et al. (2016) demonstrated that ignoring the uncertainty related to the field-to-RS model can lead to an underestimation of the variance by approximately 70%. Overall, the hierarchical model-based estimators also proved to be more precise than an hybrid estimator. In light of these results, the proposed inferential approach opens up many opportunities to develop new techniques for large-scale forest surveys using multiple RS data sources. When using UAVs as part of sampling strategies (e.g., Puliti et al., 2017) the estimators by Saarela et al. (2016) do offer opportunities for augmenting samples of UAV data with wall-to-wall satellite imagery (i.e. Sentinel-2). Such an approach could increase the precision of the estimates at no additional cost. Additionally, the availability of wall-towall coverage of RS auxiliary data enables the production of continuous

1.3. Sentinel-2 The need to further assess the contribution of regional/local forest ecosystems to the global carbon cycle has been one of the motivation behind the development of earth observation missions (e.g., Landsat, SPOT, Sentinel-2) devoted to providing users with free RS optical multispectral data acquired at regular intervals in time. The newly operational Sentinel-2 multispectral mission represents one of the most innovative and promising thus far, as it is the first to offer an unpreceded combination of high spectral, spatial, and temporal resolution (Drusch et al., 2012). Given the large focus of the Sentinel-2 mission on vegetation classification and forest monitoring, a key factor in designing the multispectral instrument (MSI) was the spectral resolution in the red-edge region of the spectrum. The result was the incorporation of three narrow bands (15–20 nm wide) centered at 705, 740, and 783 nm, in the red-edge region of the spectrum characterized by a high correlation with vegetation biophysical properties like chlorophyll content and leaf area index (Danson and Plummer, 1995; Delegido et al., 2011; Clevers and Gitelson, 2013). In addition to the spectral component, the increased spatial resolution of Sentinel-2 (10–20 m for the bands of interest for vegetation mapping) offers an increased level of detail compared to Landsat data (30 m resolution), which could contribute to improve the correlation between growing stock volume (GSV) and RS data. The frequent revisit time planned for the Sentinel-2 mission (3–5 days) also presents new opportunities to obtain cloud free imagery at regular intervals and to make use of dense time series of imagery. Previous experiences using Landsat data revealed that moderate resolution satellite optical data can be used for precise estimation and mapping of forest area (McRoberts, 2006). However, the drawback of multispectral satellite imagery is the difficulty of describing forest biophysical properties characterized by a strong vertical component 2

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

2.3. Remotely sensed data

GSV maps. This type of application would be relevant for the future inclusion of UAV samples in large-area forest resource assessment programs. Regardless of the small area coverage of a single UAV, the low cost and high availability of UAVs could provide valuable contributions to large-scale estimation of forest resources.

2.3.1. ALS data Wall-to-wall ALS data were collected in May–June 2015 using two Leica ALS70 sensors with a pulse density of 7.45 points m− 2. The normalized height point cloud was used to extract ALS variables (Næsset, 2004; Gobakken and Næsset, 2008) such as height percentiles (p10, p20, …, p90, p95, p100) and density variables (d0, …, d9) for the field plots and for the N grid cells of the target population U.

1.5. Aim This study proposes a novel application for large-area estimation of GSV using a combination of three sources of information, namely: 1) a wall-to-wall layer of Sentinel-2 data, 2) a sample of UAV data, and 3) a sample of field observations. This study builds upon the UAV sampling case proposed in the study by Puliti et al. (2017) and the estimation is carried out through the hierarchical model-based inferential framework proposed by Saarela et al. (2016). The precision of the proposed case is evaluated by comparing with existing cases considered as benchmark alternatives. These differed in the availability of RS data (i.e., wall-towall ALS, wall-to-wall Sentinel-2, and partial-coverage UAV) and in the estimators used (i.e., model-based and hybrid). The main objectives of this study were to:

2.3.2. UAV block data Partial-coverage UAV data (UAV blocks) were acquired during summer and fall 2015 as a probability sample selected from the population U. We denote the sample as Sa. The sample consisted of 55 systematically distributed blocks (1.1 km spacing) of average size of 41.06 ha and ranging from 29.4 to 58.8 ha. The variable size was due to varying wind speed and topography for the different UAV data acquisitions. To ensure UAV data coverage of the field plots the origin of the UAV block systematic grid was the same as for the field plot sampling grid. The 55 UAV blocks covered M = 29,598 grid cells, hence our sample Sa consists of M grid cells. Thus, the UAV samples covered 20.2% of the defined population. UAV imagery was acquired using a Canon IXUS/ELPH camera (SenseFly, 2014) mounted on senseFly eBee fixed wing UAV (senseFly, 2014). Photogrammetric processing of the images was carried out for each UAV block using Agisoft Photoscan (Agisoft, 2014). Five ground control points were measured in each UAV block and used in the camera alignment optimization. The absolute height values of the resulting dense point clouds were normalized using the digital terrain model obtained from the ALS data. The same height and density variables as extracted for the ALS data were computed for the UAV data for the m field plots as well as for the M grid cells. Further details on the acquisition and processing of the UAV data can be found in Puliti et al. (2017).

(1) Assess the precision of the estimates based on hierarchical modelbased inference and compare it with two benchmark cases; (2) Determine whether the inclusion of wall-to-wall Sentinel-2 data in a UAV sampling scheme offers any opportunities of reducing the UAV sampling intensity without reducing the precision of the estimates.

2. Materials 2.1. Study area The study area was located within Gran municipality (60°29′21″N, 10°16′52′ E) in southeastern Norway. It was defined to comprise forests with mean height > 7 m. The total area was 7330 ha. The study area was identified by classifying available ALS data according to height following the method described in Puliti et al. (2017). The main tree species in the area are Norway spruce (Picea abies (L.) Karst.), followed by Scots pine (Pinus sylvestris L.), and deciduous trees dominated by birch (Betula pubescens Ehrh.). The study area was tessellated into N = 146,619 squared grid cells with an area of 500 m2, which is equal to the field plot size (see Section 2.2). The N grid cells constitute our target population U.

2.3.3. Sentinel-2 imagery Wall-to-wall Sentinel-2 cloud-free imagery was acquired on 15 August 2015 at the same time as the field UAV data collection over the entire population U. Sentinel-2 imagery is captured using the multispectral imager (MSI), a push-broom imaging sensor that measures the Earth's top of atmosphere (TOA) reflected radiance in 13 spectral bands (443–2190 nm). The ground sampling distance (GSD) of the bands is 10 m for bands 2, 3, 4 (490–665 nm), and 8 (842 nm); 20 m for bands 5, 6, 7 (705–783 nm), 8A (865 nm), 11, and 12 (1610–2190 nm); and 60 m for bands 1 (443 nm), 9, and 10 (945–1375 nm). According to an evaluation of the geolocation uncertainty done in September 2016 (ESA, 2016a), the uncertainty was < 14.6 m at 95% confidence interval at the time of the acquisition of Sentinel-2 data used in this study. The Sentinel-2 data were delivered as level-1C 12-bit encoded TOA reflectance values. The orthorectification of level 1-C products was done by the European Space Agency using PlanetDEM 90 m resolution digital elevation model (Hoersch, 2015). Our study population was covered by two overlapping level 1-C tiles. These were processed to level 2-A bottom-of-atmosphere (BOA) 12-bit reflectance values according to the indications from the European Space Agency (ESA) using the freely available SNAP software (ESA, 2016d) and the associated Sen2Cor tool (ESA, 2016b). The conversion from level 1-C TOA to level 2-A BOA consisted in scene classification and atmospheric correction. A more detailed description of the steps involved in the processing of level 2-A products can be found in ESA (2016c). From the Sentinel-2 level 2-A product, band mean values (B1, …, B12) and standard deviation (B1SD, …, B12SD) were computed as independent variables using ArcGIS zonal statistics tool (ESRI, 2016). The variables were calculated for field plots and for all the grid cells within the population U. Fig. 1 provides an overview of the study area and data sources used

2.2. Field data Field data were collected during summer 2015 and consisted of m = 33 circular field plots of size 500 m2. The data were collected as part of a larger survey and originated from two subareas. Twenty-six field plots were located within the defined study area while the remaining seven were located in a contiguous area external to the study area. Both field datasets were collected according to systematic grids but with different plot spacing (i.e., 1.5 km for internal plots and 1.7 km for external plots). We denote the sample of field plots as S. The position of each plot center was recorded with sub-meter accuracy using a Topcon Legacy-E+ 40 channel dual-frequency receiver observing pseudorange and carrier phase of both global positioning system and global navigation satellite system. Ground reference GSV (m3 ha− 1) was computed using the method described in Puliti et al. (2017) based on allometric models devised by Braastad (1966), Brantseg (1967), Vestjordet (1967, 1968), and Fitje and Vestjordet (1977) using measured diameter at breast height and a sample of heights. The average GSV value for the field plots was 223.1 m3 ha− 1, ranging from 60.3 to 441.3 m3 ha− 1. Further details on the acquisition of the field data can be found in Puliti et al. (2017). 3

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Fig. 1. Overview of the study area and sampling design adopted for field plots and UAV samples. Sentinel-2 and UAV data are also included to show the remotely sensed data used in the proposed hierarchical model-based inference.

Case A. Model-based inference.

in this study, with particular focus on the remotely sensed auxiliary data used in the hierarchical model-based inference.

Case A.1. Model-based inference with wall-to-wall ALS data. Case A.2. Model-based inference with wall-to-wall Sentinel-2 data.

3. Methods

Case B. Hybrid inference with a sample of UAV data. Case C. Hierarchical model-based inference with wall-to-wall Sentinel2 and a sample of UAV data.

In addition to the estimator based on hierarchical model-based inference (denoted Case C) we applied two benchmark cases (Cases A and B) for comparison. Thus, the following cases were evaluated: 4

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Table 1 Summary of the selected variables for each model used in the study. Model

Predictor variablesa

GSV ~ ALS GSV ~ Sentinel-2 GSV ~ UAV UAV ~ Sentinel-2 GSV

p80, d1 B5, B6 p80, d2 B5, B6

B5, B6 : mean band values for band 5 and 6 of Sentinel-2 imagery. a p80, d1, and d2: height and density variables from ALS and UAV point clouds.

important to mention that MB relies on the assumption that the values that are linked to the elements in the population are realizations of random variables; hence, also the population parameters (i.e., total and mean) are random variables. In Case A.1, the GSV ~ALS model (yS = KSδS + gS) was used to link the column vector of m length of GSV values (yS) and r = 2 ALS predictor variables (KS), and gS is a column vector of random errors. The ordinary least squares (OLS) estimator of model parameters is:

δS = (KST KS )−1KST yS

(1)

The mean is estimated as: Fig. 2. Schematic illustration of the cases included in this study: A.1) model-based inference with field and wall-to-wall ALS data; A.2) model-based inference with field and wall-to-wall Sentinel-2 data; B) hybrid inference using field and UAV data; and C) hierarchical model-based inference using field, UAV, and Sentinel-2 data.

 E (μ)A .1 = ιUT KU δS

(2)

where ιU is a vector of length N where each element equals N × (r + 1) matrix of ALS variables. The variance of the estimator is obtained as:

Cases A and B represented the benchmarks, where Case A.1 was the possible best case (with ALS) and A.2 for the possibly worst case (with Sentinel-2) in terms of quality of the RS auxiliary data used in this study. Case B was included to assess the effect on the estimated variance when excluding the Sentinel-2 data. Fig. 2 illustrates the cases included in the study.

to 1

N,

KU is a

V [ E (μ)A .1] = ιUT KU Cov (δS ) KUT ιU

(3)

where Cov (δS ) in case of homoskedastic residuals is estimated by:

OLS (δS ) = Cov

 g ST  gS m−r−1

(KST KS )−1

(4)

gS is a m-length column vector of estimated residuals over the where  sample S. Furthermore, Saarela et al. (2016) described a modified version of the heteroskedasticity-consistent (HC) estimator from White (1980) to be used in presence of heteroskedasticity. This estimator was used to reduce possible overestimation of the actual variances of the model parameters when using an OLS estimator in presence of heteroskedasticity (White, 1980). The use of such an estimator is also advisable when other assumptions of OLS (i.e., normality, independence) are violated (White, 1980). The HC estimator of Cov ( δŜ ) was:

3.1. Model fitting Models that link the field observations to the RS data are needed to provide model-based and hybrid estimates that exploit the information in the RS data. Four different ordinary least squares (OLS) regression models were constructed linking: (1) GSV (yS) and ALS (KS) variables using information from the field data set (GSV ~ALS); (2) GSV (yS) and Sentinel-2 (ZS) variables using information from the field data (GSV ~Sentinel-2); (3) GSV (yS) and UAV (XS) variables using information from the field data (GSV~ UAV); ySa ) and Sentinel-2 (ZSa) (4) UAV-predicted GSV using the model in (3) (  UAV ~Sentinel-2). variables using Sa (GSV

m

T −1 2 T ⎤ HC (δS ) = (KST KS )−1 ⎡∑  Cov ⎢ gi k i k i⎥ (KS KS ) ⎣ i=1 ⎦

(5)

gi is the estimated residual and ki is a (r + 1) length row vector where  gi 2 term of ALS predictors for the ith observation from sample S. Each  m was multiplied by a correction factor in order to overcome an m−r−1 gi 2 being biased estimators of gi 2 (Davidson and issue of the residuals  MacKinnon, 1993). Case A.2 was almost identical to Case A.1, the only difference being that another source of auxiliary data was used (i.e., Sentinel-2 data instead of ALS). Given the GSV~ Sentinel-2 regression model (yS = ZSαS + wS) linking the GSV as the response (yS) and q = 2 of ZS Sentinel-2 predictor variables developed using the sample S, the estiE (μ)A.1]) E (μ)A.1 ) and the variance of the mean (V [ mators of mean ( were applied using the same estimators used in A.1 (see Eqs. (1)–(5)). The symbols K, δ, r, and g found in A.1 were substituted with Z, α, q, and w, respectively, in A.2. In Case B for the GSV ~UAV model (yS = XSβS + eS) linking GSV (yS) and p = 2 UAV predictor variables (XS), the estimator of the model parameters is:

Because the estimators developed by Saarela et al. (2016) currently only allow the use of linear models, no transformations of the variables were performed for any of the models. The independent variables used in these models were selected according to a branch-and-bound search for the best subset and model selection based on the Bayesian information criterion. Further details on the variable selection method are described in Puliti et al. (2017). Table 1 summarizes the variables selected for each of the abovementioned models. 3.2. Inference In the following section, the estimators of expected value of popuE (μ) and its variance V [ E (μ)] are presented for each case, lation mean  all of which relied in some way on model-based inference. Thus, it is 5

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

 = (X T XS )−1X T y β S S S S

The HC estimator from White (1980) is applied in case of heteroskedasticity and is obtained as:

(6)

The model was applied to predict GSV for each of the M grid cells in  ), where XSa represents the M × (p + 1) ySa = XSa β the sample Sa (  S matrix of UAV predictor variables. The estimator of the population mean is:

M

⎡ 2 T ⎤ T HC (α Sa) = (ZSa Cov ZSa)−1 ⎢∑ w i z i z i⎥ ⎣ i=1 ⎦ T T  ) X T ] ZSa (ZT ZSa)−1 HC (β + (ZSa ZSa)−1ZSa [XSa Cov Sa Sa

(7)

wi is the residual and zi is the (q + 1) length row vector of where  Sentinel-2 predictor variables for the ith observation in Sa. As for Cases M i 2 term was multiplied by a correction factor A and B, the w

where ιSaT is an M-length vector of 1/M values. The variance estimator for hybrid inference was obtained according to Ståhl et al. (2011) and Ståhl et al. (2014). In their estimator, both model uncertainty and sampling error due to the first-phase sample (Sa) contribute to the estimated variance. Differently from Saarela et al. (2016), we included M the finite population correction factor (FPC = 1 − N ) because the firstphase sample was large (i.e., the sampling fraction was 20.2%). Under the assumption that the UAV blocks were selected according to simple random sampling without replacement the variance estimator was:

1 T  ) X T ιSa V [ E (μ)B ] = FPC ⎛ ω2⎞ + ιSa XSa Cov (β Sa S ⎝M ⎠

M−q−1

(Davidson and MacKinnon, 1993). Further details on the development of Case C and on the derivation of the estimators in Eqs. (14) and (15) can be found in Saarela et al. (2016). 3.3. Reduction of sampling intensity for UAV samples One of the aims of this study was to assess whether or not the inclusion of wall-to-wall auxiliary data in a two-phase design could reduce the UAV sampling intensity without loss of precision. This was addressed by comparing the estimated variances when reducing the UAV sample sizes for Cases B and C. The number of UAV blocks was iteratively reduced by two samples at a time until a minimum number of three UAV sample blocks was reached. The total number of steps was 27 and for each step a random sample of UAV blocks was selected 6000 times. The reduction of UAV samples did not affect the sample S, meaning that the GSV~UAV model was the same – this represented the case of model-based/hybrid inference where an external model is available. The mean and variance were estimated for Cases B and C. Finally, the average value over all iterations was used to quantify the effect of the reduction of the number of UAV sample blocks on the difference in variance between Cases B and C. The relative efficiency (RE), calculated as the ratio between the estimated variance by B and that estimated by C, was used as a measure of the improvement in efficiency for a certain case. The contribution of the different components in the variance estimators (see Eq. (9) for case B and Eqs. (14) and (15) for Case C) was also reported in order to provide better understanding of the most influential sources of uncertainty.

(8)

ySa grid cell where ω2 is the sample-based variance for the vector of   ) is the covariance matrix of estimated model predictions and Cov (β S  . The OLS estimator for Cov (β  ) is: parameters β S S

) = OLS (β Cov S

T S e S e (XST XS ) m−p−1

(9)

 is the vector of residuals over the sample S. where eS = ys − XS β S Similar to case A.1 and A.2, the HC estimator of the covariance matrix of estimated model parameters was applied in case of heteroskedasticity and was:  ) = (X T XS )−1 ⎡∑m ei 2x T xi⎤ (X T XS )−1 HC (β Cov S i S ⎣ i=1 ⎦ S

(10)

where ei ̂ is the estimated residual and xi is a (p + 1) length row vector gi 2 term of ALS predictors for the ith observation in S. As for Case A, the  m was multiplied by a correction factor (Davidson and m−p−1 MacKinnon, 1993). In this study, the hierarchical model-based case proposed by Saarela et al. (2016) was adopted. According to this case, first the GSV~UAV ySa for all M in Sa. Furthermore, the model was applied to predict  UAV ~ Sentinel-2 model (  Sa ) linking the  GSV ySa = ZSa α ySa prediction Sa with q = 2 Sentinel-2 variables ZSa was fitted. The OLS estimator of α was obtained according to Saarela et al. (2016) as: T T Sa = (ZSa α ZSa)−1ZSa XSa (XST XS )−1XST yS

4. Results 4.1. GSV models

(11)

The different models that were fitted for the purpose of this study were characterized by a large range of adjusted coefficients of determination (Ra2; 0.30–0.83), revealing great differences in the ability to describe the variability in GSV when using different types of RS data. The ALS and UAV data appeared to have a greater explanatory power with Ra2 of 0.83 and 0.75, respectively, compared to Sentinel-2 (0.30 and 0.48), as one would expect. Table 2 summarizes the model performances in terms of Ra2, root mean square error (RMSE), and RMSE as percentage of the mean. The analysis of the residuals (see Fig. 3) revealed that all the models had an increasing variation in the residuals with increasing values of

The (q + 1) length column vector of estimated model parameters Sa was then used to compute the expected value of the superpopulation α E (μ)C ) as: mean for Case C (

 Sa E (μ)C = ιUT ZU α

(12)

where ZU is the N × (q + 1) matrix of Sentinel-2 variables for the entire population. Assuming simple random sampling of UAV and field plots, the E (μ)C is obtained as: variance estimator of 

 (α  [ Sa ) ZUT ιU V E (μ)C ] = ιUT ZU Cov

(13) Table 2 Summary of the model fit for models used for the purpose of this study. Ra2, RMSE, and RMSE as percentage of the mean (in brackets) are reported.

Under the OLS assumptions, the estimator of the covariance matrix Sa is: of α

OLS (α Sa) = Cov

(15)

S

T   E (μ)B = ιSa XSa β S

 T  w Sa Sa w T (ZSa ZSa)−1 M−q−1 T T  ) X T ] ZSa (ZT ZSa)−1 OLS (β + (ZSa ZSa)−1ZSa [XSa Cov Sa Sa S

(14)

UAV ~Sentinel-2 wSa is the M-length vector of residuals of the GSV where  model. 6

Model

Ra2

RMSE m3 ha− 1 (%)

GSV~ ALS GSV~ Sentinel-2 GSV~ UAV UAV ~ Sentinel-2 GSV

0.83 0.48 0.75 0.30

38.4 67.2 45.8 83.9

(17.2%) (30.1%) (20.5%) (37.47%)

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Fig. 3. Plots of the residuals against estimated values (m3 ha− 1) for the different models constructed in this study.

Fig. 4. Normal QQ plots of the standardized residuals against estimated values for the different models constructed in this study.

results obtained using the HC estimators should be more realistic. Hence, in the rest of the study the focus will be on the HC estimators. E (μ) ) and of the The results for the estimation of expected GSV values ( E (μ)]) for all the studied cases and for the variance of the estimator (V [ HC variance estimators are reported in Table 3. The estimated mean values varied substantially between the different cases, with A.1 yielding the largest value (249.86 m3 ha− 1) and A.2 the lowest (209.32 m3 ha− 1). The estimated standard error  as percentage of the average of point es = V [ E (μ)] ) and the SE (SE  timates of all cases (SE% ) were used as measures of precision. The re and SE  % for Case C (7.76 m3 ha− 1; 3.44%), sults indicated smallest SE

predicted GSV, suggesting the presence of heteroskedasticity. In addition, as shown in Fig. 4, the assumption of normality seemed to some extent to be violated for most of the models. The bottom right plot in UAV ~ Sentinel-2 model negative predicFig. 3 shows that for the GSV tions occurred, however these were kept for estimation.

4.2. Inference According to the visual inspection of the residuals (see Section 4.1), it is reasonable to state that all the models to some extent violated the OLS assumptions of homoskedasticity and normality and therefore the 7

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

the sampling intensity for Sa as such reduction did not affect the sample S. Hence, the decrease in the proportion of the GSV ~UAV model error was due to the increase in the sampling error for Case B and in the UAV ~Sentinel-2 model error for Case C. For Case B, the sampling GSV error had a steady increase from 55 blocks to approximately 20, followed by an exponential increase from 20 to three block. A similar trend was found for the component representing the uncertainty of the UAV ~Sentinel-2 model in the variance estimator for Case C. GSV

Table 3 Summary of the results from the estimation of AGV under the different cases studied with HC variance estimators. Case

 E (μ)

V [ E (μ)]

 SE

% SE

A.1 A.2 B C

249.86 209.32 223.85 219.59

69.16 171.87 65.26 60.23

8.32 13.11 8.08 7.76

3.69 5.81 3.58 3.44

5. Discussion followed by Case B (8.08 m3 ha− 1; 3.58%), and A.1 (8.32 m3 ha− 1;  % was found for Case A.2 being almost two times 3.69%). The largest SE as large (5.81%) as for the other cases. Fig. 5 shows the prediction maps of GSV for each case. Here it is possible to see clearly that the use of Sentinel-2 data variables led to lower predicted values than ALS or UAV data, especially in the areas with large timber stocks. The example in Fig. 6 shows that the use of Sentinel-2 data led to severe under-prediction or even failed to predict existence of trees (positive GSV values) compared to ALS or UAV in the area characterized by steep slopes and by the largest illumination at the time of the acquisition of the Sentinel-2 data.

The current study introduced a novel approach to the use of UAV data in combination with moderate resolution satellite imagery for large-scale forest surveys. The proposed Case C used a combination of wall-to-wall Sentinel-2 imagery, a probability sample of UAV data, and a sample of field plots. A hierarchical model-based inferential framework (Saarela et al., 2016) was adopted as it enables a rigorous reporting of the uncertainty of the estimates when using multiple sources of RS auxiliary data. 5.1. Comparison of the different cases All the studied cases used models linking the different levels of information. All of the models linking GSV with RS auxiliary data showed Ra2 and RMSE values consistent with previous research. It is worthwhile mentioning that the present study is the first to provide insight on the modelling of GSV using Sentinel-2 data. The selected Sentinel-2 variables (see Section 3.1) were two bands (i.e., band 5 and 6; 20 m resolution) in the red-edge region of the spectrum, revealing the importance of these bands. It was an interesting finding that the bands representing the red-edge region of the spectrum with 20 m resolution were selected before the bands acquired with a finer spatial resolution (10 m). This illustrates the great utility of the spectral properties of these bands for GSV estimation. As expected, the models fitted using 3D RS auxiliary data (i.e., ALS and UAV data) showed better model fit (Ra2 = 0.75–0.83) and lower RMSE values (38.4–45.8 m3 ha− 1; 17.2%–20.5%) than the model based on Sentinel-2 imagery (Ra2 = 0.30–0.48). This large difference could possibly be attributed to the combined effect of poor ability of optical data to describe GSV and co-location errors between field plots and Sentinel-2 data. The results obtained using Sentinel-2 data were in line with previous research carried out using moderate resolution satellite imagery. Both Trotter et al. (1997) and Dube and Mutanga (2015) used Landsat data to model timber volume and found Ra2 values ranging from 0.3 to 0.58. The study by Næsset et al. (2016) also showed similar performances in terms of Ra2 (0.53) when using 5 m resolution RapidEye data. The  GSVUAV ~Sentinel-2 model was characterized by a large number of

4.3. Varying UAV sample size The results of the comparison between Cases B and C when varying the UAV sample size are presented in this section. The method for simulating a reduction in the sample size of UAV blocks based on simple random sampling without replacement selection of the blocks was adopted to obtain the continuous curves shown in Fig. 8. The applied number of iterations to estimate the mean variance (n = 6000) was considered appropriate as the values converged even for the cases with the lowest number of UAV blocks (see Fig. 7). As shown in Fig. 8, the averages of estimated variances for Case C was consistently smaller than for Case B. The analysis of the effect of the varying sample size on the RE, revealed that when reducing the number of UAV blocks, there was an increasing trend of the efficiency for Case C over Case B (see Fig. 9), however, the magnitude of the increase was very modest, with the RE ranging from 1.08 to 1.1. The contribution of the different variance components in Cases B and C is illustrated in Fig. 10. The results indicate that the largest contribution to the variance for both Cases B and C was attributed to the model uncertainty from the GSV ~UAV model. The model error contribution in percent of the estimated variance ranged from 99.6% to 91.9% and from 99.6% to 92.9% for Cases B and C, respectively. The magnitude of this source of uncertainty was constant when decreasing

Fig. 5. Overview of GSV predictions for all the studied cases. The upper row shows the predictions at a larger scale while the lower row shows details at the scale of a single UAV block for all cases.

8

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Fig. 6. Detail of one of the areas mostly affected by under-prediction of GSV due to the use of Sentinel-2 data. a) Shaded relief. Illumination according to sun azimuth (169.89) and zenith (47.09) angles at the time of acquisition of Sentinel-2 imagery (the brighter the pixels the larger the illumination); b) Slope raster. Brighter values represent larger slopes; c) GSV predictions for each of the cases.

Appropriate treatment of these values could be carried out through either truncating the predictions to a minimum of zero, or by adopting models limiting the range of predictions, for example by using logistic regression models as demonstrated by McRoberts et al. (2013). Nevertheless, our study was constrained by the use of linear models, hence no particular treatment of the negative predictions was adopted. Some of the main factors affecting the accuracy of the predictions and UAV ~ Sentinel-2 model were the errors in the UAV model fit for the GSV predictions, co-location errors, and possibly the violation of the assumption of independence due to the clustered structure of UAV data. The first issue can be somewhat reduced by increasing the number of measured field plots and by adopting non-linear models. The second factor, i.e., violation of independency assumption, represents a limitation of the estimators developed by Saarela et al. (2016), which accounts for heteroskedasticity but not for the clustered UAV samples. The development of estimators using generalized least squares regression instead of OLS, could be beneficial to reduce the uncertainty due to the violation of the independency assumption. The results revealed rather large differences between the point estimates obtained for the different cases. Generally, for the cases when Sentinel-2 data were used (i.e., Cases A.2 and C) the estimates were smallest. In particular, the difference was largest between case A.1 E (μ)A.1 = 249.86 m3 ha− 1) and Case A.2 ( E (μ)A.2 = 209.32 m3 ha− 1). ( The visual comparison of the wall-to-wall predictions in Fig. 5 revealed an overall tendency of Sentinel-2 based models to predict lower GSV values compared to ALS and UAV based models. Such an effect is typical when modelling GSV using optical spectral data and is often referred to as an effect of saturation. This relates to the lack of sensitivity of spectral reflectance values to the variation in GSV for forests characterized by

Fig. 7. The convergence of the estimated variance over 6000 iterations for the step including the minimum number of UAV blocks (i.e., three UAV blocks). The solid and dashed lines represent the mean over all of the iterations for Case B (top) and C (bottom), respectively.

observations to fit the model (M = 29,598). Regardless of the sample size, the poorest performance in terms of Ra2 (0.3) and relative RMSE UAV ~Sentinel-2 (37.47%) was observed for this model. For the GSV model, negative predictions were also observed (see Fig. 3). 9

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Fig. 8. Averages of estimated variances for Cases B  [  [ E (μ)B ]) and C (V (V E (μ)C ]) over 6000 iterations for each UAV sample size.

 values (7.8 m3 ha− 1; 3.5%). The precision reported slightly smaller SE of Case C was larger than Case B but the improvement was only marginal (0.32 m3 ha− 1). The magnitude of this improvement is consistent with what Saarela et al. (2016) reported using a sample of ALS strips and Landsat data (0.06–0.30 m3 ha− 1). These results were encouraging as the precision of the estimates for Case C was found to be larger than for Case B and that these estimates were close to what is obtainable under the best-case scenario using wall-to-wall ALS data in a modelbased framework. However, the differences in precision between Case C and Cases A.1 and B were very small, making it hard to give a conclusive judgment on the improvement of C over B. It is important to mention that for Cases B and C, variance estimators assuming simple random sampling were adopted while in reality the sampling (i.e., field plots and UAV blocks) was systematic. When adopting systematic rather than random sampling the variance estimators can be biased, overestimating the true variance (see e.g. Ene et al., 2012). Thus, the evaluation of the uncertainty of Cases B and C may have been conservative as the true values might have been smaller than what we reported in this study. In spite of the limitations of GSV maps produced from optical multispectral data, the possibility to produce wall-to-wall maps is an attractive property of Case C. The use of digital surface models from satellite image stereo pairs or interferometric synthetic aperture radio detection and ranging (InSAR) instead of Sentinel-2 data could lead to more precise estimation and more valuable map products.

large GSV values. Saturation of biomass models has been widely observed when using moderate resolution optical data (Lu, 2006; Zhao et al., 2016) and clearly represents a major limitation when using such data for GSV estimation. In the current study, in addition to the general trend of underprediction of large GSV values (see Fig. 4), we observed severe underprediction of large GSV values compared to ALS and UAV predictions in certain areas (see Fig. 5). These areas were characterized by steep terrain and by the largest illumination at the time of the acquisition of the Sentinel-2 data within the study population. As described by Zhao et al. (2016) the topography has an effect on the saturation, and in this case resulted in extreme underprediction. A potential reduction of the uncertainty due to topographic features could be obtained by orthorectifying and recomputing the spectral values of the Sentinel-2 imagery accounting for the topography using more accurate digital terrain models than the 90-m resolution PlanetDEM 90 used in the Sentinel2 level 1-C processing pipeline. It is important to acknowledge such limitations in the use of optical multispectral data in order to detect these types of errors, especially because they are not easily detectable by visual assessment of the imagery. The first objective of this study (see Section 1.5) was to assess the precision of Case C in relation to the alternative cases. The assessment  ) indicated that the estimator for Case C was as of the precision (SE  % = 3.53%) as the one for A.1 (SE  % = 3.33%). Puliti et al. precise (SE (2017) using the same ALS data as in the current study and by adopting  % of 2.4%. model-based inference and a log-log model, reported an SE Similarly, in boreal forests McRoberts et al. (2013, 2014) reported an  % in the range of 4.3%–5.3% for ALS model-based inference. As exSE pected, the estimator for Case A.2 had the poorest performance in terms  % almost twice as large (6.26%) as for the rest of of precision, with SE the cases. The abovementioned limitations when modelling GSV using moderate resolution optical data (i.e., co-location errors and saturation)  % for Case A.2. represent some of the likely causes behind the large SE For Case B, there were no terms of comparison other than the study by Puliti et al. (2017), who, with the same dataset but with log-log models,

5.2. Comparison of Cases B and C when varying UAV sampling intensity The second objective of this study (see Section 1.5) was to determine whether the adoption of Case C rather than Case B allowed for a reduced sampling intensity of UAV auxiliary data. The results showed that when the UAV sample size was reduced from a total of 53 to a minimum of 3, there was an increase in variance for both cases. Such a trend was expected because of the increase in the sampling error for Fig. 9. Average relative efficiency, calcu [ E (μ)B ] and lated as the ratio between V  [ V E (μ)C ] over 6000 iterations.

10

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Fig. 10. Contribution of different components to the estimated variance for Cases B and C when reducing the number of UAV blocks. For Case C the two parts represent the uncertainty UAV ~ Sentinel-2 model (part 1 cov Eq. (15)) and the uncertainty due to the GSV~ UAV model (part 2 cov Eq. (15)). due to the GSV

Case B, and because of the reduction of the sample size for Sa to fit the UAV ~ Sentinel-2 model. The results also revealed that the increase GSV in the estimated variance was, even if moderate, larger for Case B than Case C. This was demonstrated by the observed increase in RE (see Fig. 8). One interesting finding was that the variance estimator in Case C when using only 5 UAV blocks was as precise as B with 55 blocks. According to these results the cost figures reported by Puliti et al. (2017) for the acquisition and processing of the 55 UAV blocks (3.42 EUR ha− 1) can theoretically be reduced by approximately 90% (0.31 EUR ha− 1) when using only 5 UAV samples. The results also revealed that the model error related to the GSV~ UAV model was the largest component both for Case C and Case B, i.e., always > 90% of the total estimated variance (see Fig. 9). This finding illustrates the importance of reducing this source of error, for example by increasing the size of the sample S. Given these results, it is important in future applications of Case C to invest further effort in finding an optimal balance between costs for data acquisition (UAV and field data) and the precision of the estimates. The combination of an increase of the sampling intensity of field data with a reduction of the sampling intensity of UAV data could indeed lead to a reduction of the GSV ~UAV model uncertainty, while reducing the costs for performing extensive UAV campaigns. In addition to identifying optimal sampling intensities, it would be beneficial to find optimal ways of locating the samples (i.e., UAV data and field plots). More efficient sampling strategies achieved by spreading the samples in the auxiliary space (Hawbaker et al., 2009; Gobakken et al., 2013; Grafström et al., 2014) could be beneficial to ensure full coverage of the range of GSV values given the limited resources available for field data collection.

auxiliaries. The low number of observations also made the analysis of the residuals more difficult. As observed by Puliti et al. (2017), acquiring UAV data over small areas (i.e., 500 m2) corresponding to single field plots (even located outside of the study area) for the purpose of model fitting can be done effectively due to the rapidity and ease of conducting such operations. Therefore, complementing the combined field/UAV samples with field plots with UAV data collected exclusively within the borders of the plots (not a full UAV sample block) may be an effective way of extending the sample data for model construction. This approach to reduce model uncertainty may alter some of the conclusions pertaining to the UAV sub-sampling simulation that we performed since model uncertainty accounts for a major fraction of the overall estimated uncertainty. Furthermore, in this study, the use of Sentinel-2 was sub-optimal and improvements may be achieved with a more thorough use of the data. As previously mentioned, the use of more precise DEMs for orthorectification could lead to better geometric accuracy, i.e., reducing the co-location errors, and possibly reduce the negative effects of extreme topography on the saturation. It is worth mentioning that improvements in geometric accuracy have been carried out after spring 2016, improving the geolocation error from 14.6 m (at the time of this study), to 8 m (ESA, 2016a). Hence, part of the issues related to colocation errors are nowadays reduced compared to this study. Additionally, the use of dense time series of Sentienl-2 imagery could provide more valuable information than the use of data from only a single point in time as it can result in variables that are less affected by saturation. The use of higher resolution imagery could also result in an increase in the precision of the estimates because of the reduced colocation errors. Some of the issues reported in this study for Case C also relate to the estimator used. The work by Saarela et al. (2016) represents a milestone in the use of multiple sources of RS auxiliary data for estimation of forest resource parameters. However, a main limitation of the hierarchical model-based estimator is that it currently only allows for linear models. The relationship between GSV and RS data is not necessarily linear and the use of non-linear models or use of other parametric regression methods can be beneficial. Further, it would be useful to allow for models accounting for clustered samples and models restricting the predictions to avoid negative values. As mentioned by Saarela et al. (2016), ongoing research is aimed at accounting for some of these issues, one of these being the development

5.3. General considerations The current study serves as a useful illustration of the potentials of the survey application we denoted Case C. It also provided some first and encouraging results of the potential of the proposed case in operational surveys. Yet, general conclusions can hardly be drawn on the basis of the evidence provided by this study; there are too many restrictions, assumptions, and specific limitations that preclude general inference. First, a main limitation was the low number of field plots (m = 33), which directly affected the models linking the GSV and the RS 11

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

fileadmin/user_upload/documents/manuals/Extended_User_Manual_eBee_and_eBee_ Ag_v16.pdf (16.04.2016). Gobakken, T., Næsset, E., 2008. Assessing effects of laser point density, ground sampling intensity, and field sample plot size on biophysical stand properties derived from airborne laser scanner data. Can. J. For. Res. 38, 1095–1109. Gobakken, T., Korhonen, L., Næsset, E., 2013. Laser-assisted selection of field plots for an area-based forest inventory. Silva Fennica 47, 1–20. Grafström, A., Saarela, S., Ene, L.T., 2014. Efficient sampling strategies for Forest inventories by spreading the sample in auxiliary space. Can. J. For. Res. 44. Gregoire, T.G., 1998. Design-based and model-based inference in survey sampling: appreciating the difference. Can. J. For. Res. 28, 1429–1447. Gregoire, T.G., Ståhl, G., Næsset, E., Gobakken, T., Nelson, R., Holm, S., 2011. Modelassisted estimation of biomass in a Lidar sample survey in Hedmark County, Norway this article is one of a selection of papers from extending forest inventory and monitoring over space and time. Can. J. For. Res. 41, 83–95. Gregoire, T.G., Næsset, E., McRoberts, R.E., Ståhl, G., Andersen, H.-E., Gobakken, T., Ene, L., Nelson, R., 2016. Statistical rigor in Lidar-assisted estimation of aboveground forest biomass. Remote Sens. Environ. 173, 98–108. Hawbaker, T.J., Keuler, N.S., Lesak, A.A., Gobakken, T., Contrucci, K., Radeloff, V.C., 2009. Improved estimates of forest vegetation structure and biomass with a Lidaroptimized sampling design. J. Geophys. Res. Biogeosci. 114. Hoersch, B., 2015. Sentinel-2 Mission: Observation Scenario, Products and Mission Status. Available at: due.esrin.esa.int/mwbs2015/files/3_Hoersch.pdf (08.12.2016). Immitzer, M., Vuolo, F., Atzberger, C., 2016. First experience with Sentinel-2 data for crop and tree species classifications in Central Europe. Remote Sens. 8, 166. Lisein, J., Pierrot-Deseilligny, M., Bonnet, S., Lejeune, P., 2013. A photogrammetric workflow for the creation of a forest canopy height model from small unmanned aerial system imagery. Forests 4, 922–944. Lu, D., 2006. The potential and challenge of remote sensing-based biomass estimation. Int. J. Remote Sens. 27, 1297–1328. Margolis, H.A., Nelson, R.F., Montesano, P.M., Beaudoin, A., Sun, G., Andersen, H.-E., Wulder, M.A., 2015. Combining satellite Lidar, airborne Lidar, and ground plots to estimate the amount and distribution of aboveground biomass in the boreal forest of North America. Can. J. For. Res. 45, 838–855. McRoberts, R.E., 2006. A model-based approach to estimating forest area. Remote Sens. Environ. 103, 56–66. McRoberts, R.E., 2010. Probability- and model-based approaches to inference for proportion forest using satellite imagery as ancillary data. Remote Sens. Environ. 114, 1017–1025. McRoberts, R.E., Næsset, E., Gobakken, T., 2013. Inference for Lidar-assisted estimation of forest growing stock volume. Remote Sens. Environ. 128, 268–275. McRoberts, R.E., Næsset, E., Gobakken, T., 2014. Estimation for inaccessible and nonsampled forest areas using model-based inference and remotely sensed auxiliary information. Remote Sens. Environ. 154, 226–233. Næsset, E., 2004. Practical large-scale forest stand inventory using a small-footprint airborne scanning laser. Scand. J. For. Res. 19, 164–179. Næsset, E., Ørka, H.O., Solberg, S., Bollandsås, O.M., Hansen, E.H., Mauya, E., Zahabu, E., Malimbwi, R., Chamuya, N., Olsson, H., Gobakken, T., 2016. Mapping and estimating forest area and aboveground biomass in Miombo woodlands in Tanzania using data from airborne laser scanning, tandem-X, Rapideye, and global forest maps: a comparison of estimated precision. Remote Sens. Environ. 175, 282–300. Neigh, C.S.R., Nelson, R.F., Ranson, K.J., Margolis, H.A., Montesano, P.M., Sun, G., Kharuk, V., Næsset, E., Wulder, M.A., Andersen, H.-E., 2013. Taking stock of circumboreal forest carbon with ground measurements, airborne and spaceborne Lidar. Remote Sens. Environ. 137, 274–287. Nelson, R., Boudreau, J., Gregoire, T.G., Margolis, H., Næsset, E., Gobakken, T., Ståhl, G., 2009. Estimating Quebec provincial forest resources using Icesat/Glas. Can. J. For. Res. 39, 862–881. Nelson, R., Margolis, H., Montesano, P., Sun, G., Cook, B., Corp, L., Andersen, H.-E., deJong, B., Pellat, F.P., Fickel, T., Kauffman, J., Prisley, S., 2016. Lidar-based estimates of aboveground biomass in the continental US and Mexico using ground, airborne, and satellite observations. Remote Sens. Environ. 188, 127–140. Puliti, S., Ørka, H.O., Gobakken, T., Næsset, E., 2015. Inventory of small forest areas using an unmanned aerial system. Remote Sens. 7, 9632–9654. Puliti, S., Ene, L.T., Gobakken, T., Næsset, E., 2017. Use of partial-coverage Uav data in sampling for large scale forest inventories. Remote Sens. Environ. 194, 115–126. Ringvall, A.H., Ståhl, G., Ene, L.T., Næsset, E., Gobakken, T., Gregoire, T.G., 2016. A poststratified ratio estimator for model-assisted biomass estimation in sample-based airborne laser scanning surveys. Can. J. For. Res. 46, 1386–1395. Saarela, S., Holm, S., Grafström, A., Schnell, S., Næsset, E., Gregoire, T.G., Nelson, R.F., Ståhl, G., 2016. Hierarchical model-based inference for forest inventory using three sources of information. Ann. For. Sci. 73, 895–910. SenseFly, 2014. User Manual Ixus/Elph Camera. Available at: http://www. mencisoftware.com/documentUAV/camera%20manual/[ENG]_2014_user_manual_ ixuselph_v2.pdf (18.01.2016). Ståhl, G., Holm, S., Gregoire, T.G., Gobakken, T., Næsset, E., Nelson, R., 2011. Modelbased inference for biomass estimation in a Lidar sample survey in Hedmark County, Norway. Can. J. For. Res. 41, 96–107. Ståhl, G., Heikkinen, J., Petersson, H., Repola, J., Holm, S., 2014. Sample-based estimation of greenhouse gas emissions from forests - a new approach to account for both sampling and model errors. For. Sci. 60, 3–13. Ståhl, G., Saarela, S., Schnell, S., Holm, S., Breidenbach, J., Healey, S.P., Patterson, P.L., Magnussen, S., Næsset, E., McRoberts, R.E., Gregoire, T.G., 2016. Use of models in large-area forest surveys: comparing model-assisted, model-based and hybrid estimation. Forest Ecosyst. 3, 1–11. Trotter, C.M., Dymond, J.R., Goulding, C.J., 1997. Estimation of timber volume in a

of more general estimators to tackle clustered Sa samples. 6. Conclusion The main merit of this study is the illustration of a novel application for large-scale forest inventories using innovative RS data sources and estimators. The results were encouraging for further use of the case proposed in this study (Case C), nevertheless they should be confirmed by similar findings in other areas and for other sample sizes. The merit of this study also lies in the fact that it provided first insights into the opportunities and limitations related to the use of Sentinel-2 data for estimation of GSV in boreal forest conditions. It is concluded that Sentinel-2 is a promising data source for future large-scale forest surveys, however, it is important to acknowledge the intrinsic limitations that optical data have when modelling GSV or similar volumetric properties like biomass or carbon. Acknowledgements The field data were collected as part of the hyperBio project (project info: 244599 hyperBio - using new technology to reduce costs and improve the accuracy of forest inventory mapping TERRATEC AS) and funded by the Norwegian Research Council (project number 244599). The UAV-data collection was funded by the Norwegian Univers ity of Life Sciences (NMBU). References Agisoft, 2014. Agisoft Photoscan User Manual. Agisoft LLC, St. Petersburg, Russia Available at: http://www.agisoft.com/pdf/photoscan-pro_1_1_en.pdf (29.12.2016). Boudreau, J., Nelson, R.F., Margolis, H.A., Beaudoin, A., Guindon, L., Kimes, D.S., 2008. Regional aboveground forest biomass using airborne and spaceborne Lidar in Québec. Remote Sens. Environ. 112, 3876–3890. Braastad, H., 1966. Volume tables for birch. Meddr norske SkogforsVes 21, 23–78. Brantseg, A., 1967. Volume functions and tables for scots pine. South Norway. Meddr norske SkogforsVes 22, 689–739. Clevers, J.G.P.W., Gitelson, A.A., 2013. Remote estimation of crop and grass chlorophyll and nitrogen content using red-edge bands on Sentinel-2 and -3. Int. J. Appl. Earth Obs. Geoinf. 23, 344–351. Corona, P., Fattorini, L., Franceschi, S., Scrinzi, G., Torresan, C., 2014. Estimation of standing wood volume in forest compartments by exploiting airborne laser scanning information: model-based, design-based, and hybrid perspectives. Can. J. For. Res. 44, 1303–1311. Dandois, J.P., Ellis, E.C., 2013. High spatial resolution three-dimensional mapping of vegetation spectral dynamics using computer vision. Remote Sens. Environ. 136, 259–276. Danson, F.M., Plummer, S.E., 1995. Red-edge response to forest leaf area index. Int. J. Remote Sens. 16, 183–188. Davidson, R., MacKinnon, J.G., 1993. Estimation and Inference in Econometrics. Delegido, J., Verrelst, J., Alonso, L., Moreno, J., 2011. Evaluation of sentinel-2 red-edge bands for empirical estimation of green lai and chlorophyll content. Sensors 11, 7063. Drusch, M., Del Bello, U., Carlier, S., Colin, O., Fernandez, V., Gascon, F., Hoersch, B., Isola, C., Laberinti, P., Martimort, P., Meygret, A., Spoto, F., Sy, O., Marchese, F., Bargellini, P., 2012. Sentinel-2: Esa's optical high-resolution mission for Gmes operational services. Remote Sens. Environ. 120, 25–36. Dube, T., Mutanga, O., 2015. Evaluating the utility of the medium-spatial resolution Landsat 8 multispectral sensor in quantifying aboveground biomass in Umgeni catchment, South Africa. ISPRS J. Photogramm. Remote Sens. 101, 36–46. Ene, L.T., Næsset, E., Gobakken, T., Gregoire, T.G., Ståhl, G., Nelson, R., 2012. Assessing the accuracy of regional Lidar-based biomass estimation using a simulation approach. Remote Sens. Environ. 123, 579–592. ESA, 2016a. S2 Mpc Data Quality Report. Available at: https://sentinel.esa.int/ documents/247904/685211/Sentinel-2-Data-Quality-Report (28.09.2016). ESA, 2016b. Sentinel-2 Msi – Level-2a Prototype Processor Installation and User Manual. Available at: step.esa.int/thirdparties/sen2cor/2.2.1/S2PAD-VEGA-SUM-0001-2.2. pdf (29.09.2016). ESA, 2016c. Sentinel-2 Msi – Level 2a Products Algorithm Theoretical Basis Document. Available at: https://earth.esa.int/c/document_library/get_file?folderId=349490& name=DLFE-4518.pdf (29.09.2016). ESA, 2016d. Snap Sentinels Application Platform, 3.0. Available at: http://step.esa.int/ main/download/ (29.12.2016). ESRI, 2016. How Zonal Statistics Works. Available at: http://desktop.arcgis.com/en/ arcmap/10.3/tools/spatial-analyst-toolbox/h-how-zonal-statistics-works.htm (09.12.2016). Fitje, A., Vestjordet, E., 1977. Stand height curves and new tariff tables for Norway spruce. Meddr norske SkogforsVes 34, 23–62. senseFly, 2014. Ebee Extended User Manual. Available at: https://www.sensefly.com/

12

Remote Sensing of Environment xxx (xxxx) xxx–xxx

S. Puliti et al.

Wallace, L., Lucieer, A., Malenovský, Z., Turner, D., Vopěnka, P., 2016. Assessment of forest structure using two Uav techniques: a comparison of airborne laser scanning and structure from motion (Sfm) point clouds. Forests 7, 62. White, H., 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica: J. Econometric Soc. 817–838. Zhao, P., Lu, D., Wang, G., Wu, C., Huang, Y., Yu, S., 2016. Examining spectral reflectance saturation in Landsat imagery and corresponding solutions to improve Forest aboveground biomass estimation. Remote Sens. 8, 469.

coniferous plantation forest using Landsat TM. Int. J. Remote Sens. 18, 2209–2223. UNFCC, 2010. Decision 1/Cp.16. The Cancun AgreementsBonn, Germany. Available at: http://unfccc.int/resource/docs/2010/cop16/eng/07a01.pdf. Vestjordet, E., 1967. Functions and tables for volume of standing trees. Norway spruce. Meddr norske SkogforsVes 22, 539–574. Vestjordet, E., 1968. Volum Av Nyttbart Virke Hos Gran Og Furu Basert På Relativ Høyde Og Diameter I Brysthøyde Eller Ved 2,5 M Fra Stubbeavskjær (Merchantable Volume of Norway Spruce and Scots Pine Based on Relative Height and Diameter at Breast Height or 2.5 M above Stump Level). Meddr norske SkogforsVes 25, 411–459.

13