Scale effects in survey estimates of proportions and quantiles of per unit area attributes

Scale effects in survey estimates of proportions and quantiles of per unit area attributes

Forest Ecology and Management 364 (2016) 122–129 Contents lists available at ScienceDirect Forest Ecology and Management journal homepage: www.elsev...

546KB Sizes 1 Downloads 24 Views

Forest Ecology and Management 364 (2016) 122–129

Contents lists available at ScienceDirect

Forest Ecology and Management journal homepage: www.elsevier.com/locate/foreco

Scale effects in survey estimates of proportions and quantiles of per unit area attributes Steen Magnussen a,⇑, Daniel Mandallaz b, Adrian Lanz c, Christian Ginzler c, Erik Næsset d, Terje Gobakken d a

Natural Resources Canada, Canadian Forest Service, 506 West Burnside Road, Victoria, BC V8Z 1M5, Canada Chair of Land Use Engineering, ETH Zurich, CH 8092 Zurich, Switzerland c Swiss Federal Research Institute, WSL, Zürcherstrasse 111, 8903 Birmensdorf ZH, Switzerland d Norwegian University of Life Sciences, P.O. Box 5003, NO1432 Ås, Norway b

a r t i c l e

i n f o

Article history: Received 22 October 2015 Received in revised form 7 January 2016 Accepted 11 January 2016 Available online 20 January 2016 Keywords: Quantiles Area proportions Spatial support Spatial autocorrelation Scaled quantiles Scaled area proportions

a b s t r a c t Quantiles and proportions in a sampling distribution of a per unit area attribute (Y) depend on the spatial support (area) of employed survey plots. This is a nuisance for managers, and policy developers; in particular when the underlying data have been collected with different spatial supports. Users of these statistics may wish to calibrate their estimates to a common scale of spatial support. The easiest way to do this is through scaling to a common plot size. We demonstrate a statistical method for upscaling. The method is illustrated in the context of a design-based forest inventory of a target attribute Y with a census of a co-located vector of auxiliary variables (X) correlated with Y. Two case studies from Norway and Switzerland confirmed significant and practically important scale effects in quantiles and proportions of above ground live tree biomass (Mg ha1) and stem volume (m3 ha1). Upscaling requires an estimate of the spatial autocorrelation of Y given X at the scale of the original spatial support. We present an expedient method to this end. Our method affords estimation of scaled quantiles and proportions and assures consistency of sampling distribution across scales. Crown Copyright Ó 2016 Published by Elsevier B.V. All rights reserved.

1. Introduction Most per unit area forest survey results are summarized to means or totals (Schreuder et al., 1993, ch. 7; Köhl et al., 2006, ch. 2). Means and totals are generally easy to interpret, albeit with the caveat that the user should realize that the estimates have been scaled from the support area of a survey plot to a commonly accepted area unit (e.g. one hectare). For estimates of means and totals the scaling is unproblematic (Zeide, 1980; Kangas and Maltamo, 2006, ch. 4.1.1). With respect to estimates of quantiles and area proportions, the user should be aware of scale effects as these statistics depend on the sampling distribution of the attribute which in turn depends on the support area (size) of the survey plots (Smith, 1938; Correll and Cellier, 1987; Magnussen, 1989; Mandallaz and Ye, 1999; Mandallaz and Lanz, 2001; Gray, 2003; Duane et al., 2010; Hou et al., 2015). As plot size increases, the sampling distribution of per unit area attribute values becomes increasingly concentrated around the mean leading to changes quantiles and area proportions. ⇑ Corresponding author. E-mail address: [email protected] (S. Magnussen). http://dx.doi.org/10.1016/j.foreco.2016.01.013 0378-1127/Crown Copyright Ó 2016 Published by Elsevier B.V. All rights reserved.

Estimates of quantiles and area proportions are important for management and policy development purposes (Lagergren et al., 2006; Kurz et al., 2009; Yan et al., 2014). Questions like: ‘‘what is the proportion of forest land with per hectare volume (or biomass) above (below) a threshold value for economic exploitation?”; and ‘‘what is the upper (lower) limits of volume (biomass) in the 10% of the forest with the most (least) amount of per ha volume?”; can be answered from a sample-based estimate of the distribution function of per unit area volume (biomass). Yet a replication of the survey with a larger or smaller support area would generate different estimates. Hence users of statistics like quantiles and proportions of a per unit area survey attribute may rightly wish to remove these scale effects. Unfortunately, scale effects are an intrinsic property of quantiles and proportions of a per unit area attribute (Francisco and Fuller, 1991; Bellhouse and Stafford, 1999; Chen and Wu, 2002; del Mar Rueda et al., 2003). In forestry, the scale-size effect will depend on the specific forest, the species composition and structure, the attribute of interest, and the spatial autocorrelation in attribute values obtained from plots in close spatial proximity to each other (Smith, 1938; Correll and Cellier, 1987; Magnussen, 1989; Mandallaz and Ye, 1999; Mandallaz and Lanz, 2001).

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129

Scale effects in quantiles and proportions are compounded if the sampling distribution arises from data collected with different spatial support and also when the statistics with different spatial support are compared across time and space. To assist managers and policy developments using per unit area estimates of quantiles and proportions from surveys with different spatial support (plot sizes), this study proposes a method for upscaling a sampling distribution to a desired target of spatial support. Our method is therefore most relevant in the context of multi-regional or multi-national comparisons, compilations, and analyses of scale dependent per unit area resource statistics (Kindermann et al., 2008; Köhl et al., 2009, 2015; McRoberts et al., 2010; Martin and Margaret, 2011; Stinson et al., 2011; Pelletier et al., 2015).

2. Material and methods 2.1. The scaling method in brief In absence of a census of the attribute Y of interest, quantiles and proportions must be estimated from a design-based sampling distribution of Y or via a model of Y given the value of one or more auxiliary variables X correlated with Y. The sampling distribution will depend on two design parameters: the number of sample plots, and the size (area) of the sample plots. The effect of sample size goes to zero as sample size increase, but for small sample sizes one should expect an underestimation of the variance (David and Mishriky, 1968). We only address the issue of plot size effects. In the current context a user wants to upscale an estimate of the sampling distribution of a per unit area attribute observed in plots of size a, to a prediction of the anticipated sampling distribution with an m times larger spatial support. The upscaling would be almost trivial if the only effect of a scaling of the spatial support by a factor m > 1 would be a concomitant decrease in the variance by a factor m1. However, in a spatial population (e.g. a forest) one should anticipate a spatial autocorrelation that will modify the effect of a scaling on the sampling distribution (Kint et al., 2003; Finley et al., 2008; Fortin et al., 2012; Ver Hoef and Temesgen, 2013; Breidenbach et al., 2015). The proposed upscaling will therefore demand an estimate of the autocorrelation at the scale of the original plots. Without a census or near census of auxiliary variables, estimates of the spatial autocorrelation are most expediently obtained from observations of Y in spatially compact clusters composed of m subunits of size a (Cochran, 1977, p. 209). This scenario is not covered here. With a census or a near census of one or more auxiliary variables correlated with Y, the required estimate of autocorrelation can be obtained from spatial explicit predictions of Y given X. For computational reasons it is substantially easier to work with an integer valued scale factor (m). If a desired scale factor is not integer, we recommend a scaling to the two enclosing scale factors of plus and minus one of the rounded value of m, followed by an interpolation between corresponding distribution functions. A simplified outline of the proposed scaling method is outlined next. Necessary details for an implementation and computation of predictions of scaled quantiles and area proportions and their variances are given in Appendix A. Under the assumption of an integer valued target for the scaling factor (m > 1) and a census or near census of auxiliary variables (X) correlated with a target attribute (Y), our proposed method starts  ma among with an estimate of the average spatial autocorrelation q m spatially compact (clustered) units each with a spatial support area equal to a – which is the support area underlying the sampling distribution to be scaled. This average autocorrelation is estimated by generating a map with predictions of Y given X plus a random

123

error-term (at the resolution of the original spatial support) followed by a sampling of spatially compact clusters with m units of size a. An analysis of variance with clusters as sampling units ^  ma (Donner, 1986). The estimate of (replications) then furnishes q  ma allows a simulation of the desired sampling distribution F of q Y|X with plots of size m  a. A user who is only interested in obtained scaled estimates of quantiles or area proportions can apply a simple distribution matching by recognizing that the variance of the scaled distribu ma Þ where n is the sample size tion is r2yjx  ðmnÞ1 ð1 þ ðm  1Þq and

r2yjx is the variance of Y given X (Cochran and Cox, 1957,

p. 209; Baffetta et al., 2012). Those who wish to have estimates of the variance of a scaled quantile or a scaled area proportion with the assurance that the scaled distribution function is consistent with the design-based estimate of the distribution function of Y| X will have to consult the more elaborate and complex method outlined in Appendix A. The simulations of a scaled distribution function start with a ^  ma Þ values of Y|X drawn from a draw of m equal correlated ðq design-based and model-calibrated distribution function F^a ðyjxÞ (Wu and Sitter, 2001). This sampling is repeated a large number of times, ideally N ma ¼ bA  ðmaÞ1 c times where A is the area of the sampled population. Desired quantiles and proportions are then obtained from the ensuing distribution function F ma of the simulated size m  a plots. To ensure that the marginal distribution of the N ma  m individual simulations of Y|X is consistent with the model-calibrated estimate F^a ðyjxÞ, the random draws were constrained to meet this requirement through an application of a multivariate Gaussian Copula (Fischer, 2010; Lopez-Paz et al., 2013). Variance estimators for the predicted proportions and quantiles are in Appendix A. 2.2. Two application examples The first example uses data from a design-based forest inventory in the municipality of Våler in south-eastern Norway. The second example uses data from the third (2004–2006) design-based Swiss national forest inventory (SNFI). Only a brief summary of the data is provided. Additional details about the Våler data can be found in Næsset et al. (2012). Magnussen et al. (2014) detail the SNFI data. A census of auxiliary variables (X) correlated with the target variable (Y) are available in both cases. Model-assisted estimators of means (totals) are based on a working (linear) model of the relationship between Y and X. A scaling of a modelcalibrated distribution model (see Appendix A for details) based on the actual employed plot sizes to a predicted distribution function for plots m times larger is pursued in both examples. Scaled quantiles and area proportions are then obtained from the scaled distribution functions. 2.2.1. Våler Data consisting of estimates of above ground biomass (AGB, Mg ha1), derived from tree level observations and models, were collected in 2010 in 145 circular a = 200 m2 forested plots in the municipality of Våler in south-eastern Norway. The forested area is 852.6 ha. The Våler forest has four strata: recently regenerated stands (Str. 1); young forest (Str. 2); mature forest with spruce as the dominant species (Str. 3); and mature forest with pine as the dominant species (Str. 4). In this study our target population is strata 2, 3, and 4. The areas (A) of the three strata are: A2 = 120.9 ha; A3 = 140.4 ha; and A4 = 195.6 ha, respectively, with a total of A234 = 456.9 ha. A systematic stratified design with samples located on a 150 m  150 m grid in Str. 2 and 3, and a 150 m  450 m grid in Str. 4 was employed. Strata sample sizes

124

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129

(nj, j = 2, 3, 4) were n2 = 55, n3 = 58, and n4 = 32. Accordingly, the total sample size is n = 145. Two auxiliary variables – indicating the height (zCH) and the standard deviation (sCH) of canopy heights – are available for each of M200 = 22,858 square 200 m2 plots in the target population. The two auxiliary variables were derived from a 2010 airborne laser scanning (ALS) of the Våler forest, and both are correlated with the target variable Y = AGB (Magnussen et al., 2015). Strata specific linear models with y as the dependent variable and zCH and sCH as predictors were refitted for this study by ordinary least squares techniques. The estimated regression models are in Table 1. A model-calibrated estimate of the distribution function of AGB was obtained for each of the three strata as outlined in Wu and Sitter (2001) and detailed in Appendix A. An estimate of the average autocorrelation among m units of a = 200 m2 plots in a spatially compact cluster needed for the Copula based simulations of larger plots with an area of m  a was obtained by a stratum specific sampling of clusters with m predictions of Y|X plus a random residual error with a variance equal to the estimated residual variance (see Table 1) followed by an analysis of variance (Donner, 1986). Scaled distribution functions were computed for m = 2, 9, 25, and 49. Clusters of size 400 m2 (m = 2) were rectangular with the longest side in an East–West direction. Larger clusters were square. Details on estimating the required autocorrelations are deferred to Appendix A.

2.2.2. SNFI Field data from the third (2004–2006) Swiss national forest inventory (SNFI) were used. Each plot in the SNFI represents a 2 km2 area (Brändli, 2010; Lanz et al., 2010). Field data were only collected in plots where the plot center was located within a forest. Appropriate forest boundary corrections were applied throughout (Mandallaz, 2008, p. 59). Plot-level estimates of stem volume (VOL) in m3 ha1, obtained from nested circular plots of 200 m2 and 500 m2, is our attribute (Y) of interest. For the purpose of demonstration we consider a = 500 m2 as the nominal plot size and ignore the nested structure. A full census of the average canopy height (zCH) in square plots of 500 m2 was generated from a national digital surface model based on ALS point clouds with 2 m  2 m grid cells and a corresponding high precision digital elevation model provided by the Swiss Federal Office of Topography. The forests of Switzerland are post-stratified to five Production Regions (PR). With the same methods as used for the Våler data we predict, for three randomly selected PRs, the cumulative distribution functions of Y|X for spatially compact clusters of size 1000 m2 (m = 2), 2000 m2 (m = 4), 4500 m2 (m = 9), and 8000 m2 (m = 16). As for Våler, orientation for m = 2 clusters is East–West. Summary statistics of the selected PRs are in Table 2.

Table 1 Summary statistics for three strata in the Våler forest. The regression model for predicting Y = AGB Mg ha1 from two auxiliary variables zCH and sCH is: pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Y ¼ b0 þ b1 zCH þ b2 sCH þ e zCH þ 1; e  Nð0; 1Þ. Stratum

2

3

4

Forest area (ha) Sample size (a = 200 m2) Sample mean of AGB (Mg ha1) (standard error in % of mean) ^0 ÞÞ ^0 ðs:e:ðb b

121 55 164 (6.4) 9.91 (11.7) 2.64 (0.14) 6.08 (2.85) 34.3

140 58 113 (12.3) 0.74 (10.5) 2.60 (0.12) 3.43 (1.83) 41.0

196 32 93 (9.8) 2.47 (12.5) 1.99 (0.16) 1.94 (2.11) 22.6

0.78

0.80

0.73

^1 ðs:e:ðb ^1 ÞÞ b ^2 ðs:e:ðb ^2 ÞÞ b Mean residual standard deviation (Mg ha1) ^2 Þ Coefficient of determination ðR

Table 2 Summary statistics for the Swiss Production Regions (PR). Regression model for from canopy height (zCH, m): predicting Y = VOL m3 ha1 pffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi Y ¼ bzCH þ e zCH þ 1; e  Nð0; 1Þ. Production region

1 Jura

2 Plateau

4 Alps

Forest area (km2) SFNI sample plots (a = 500 m2) Sample mean of Y (standard error in % of mean) ^ ^ bðs:e:ð bÞÞ

2052 940 365 (1.8)

2432 1120 393 (2.0)

3651 1499 337 (1.8)

22.5 (0.53) 177

21.5 (0.34) 202

25.2 (0.38) 176

0.66

0.78

0.75

Mean residual standard deviation (m3 ha1) ^2 Þ Coefficient of determination ðR

3. Results With a zero spatial autocorrelation among neighboring fixed area plots, a prediction of the effect of increasing the size (area) of a sample plot on the cumulative distribution function of Y|X would be a simple matter of a scaling determined by the ratio of plot sizes. However, in both the Våler and Swiss forests, the ANOVA-based estimates of the average intra-cluster correlation ^  ma confirmed the presence of a non-trivial spatial autocorrelation q in all clusters with m plots of size a. The average correlation among plots within a cluster declines, as expected, with an increasing number (m) of plots in a cluster. For Våler, we used the average ^  ma (ANOVABOOT and ANOVANN, of two methods for computing q ^  ma is in Tables 3 and 4. see Appendix A for details). A summary of q Enlarging the area of a sampling plot by a factor m > 1 had the anticipated effect of concentrating the cumulative distribution function of Y|X around its mean and a concomitant thinning of the probability mass in the tails. The changes are a function of both ^  ma . They are largest when m increases from 1 to 2 in comm and q ^ ma . With m increasing beyond 2, the changes  bination with a weak q in F~ma ðyjxÞ become less and less pronounced. Examples from Str. 3 in Våler, and PR 4 in Switzerland are in Fig. 1. Most striking is the effect of m on the decline in the predicted proportion of the area with a near zero value of AGB Mg ha1 in Str. 3, and a near zero value of VOL m3 ha1 in PR 4. Continuing with the examples in Fig. 1: selected estimates (m = 1) and predictions (m P 2), of the proportion of the total area in Str. 3 with less than AGB⁄ Mg ha1, and the area in PR 4 with less than VOL⁄ m3 ha1, are in Tables 5 and 6. For the lowest values of AGB⁄ and VOL⁄ there are large and significant decreases in the estimated proportions as cluster-size (m) increases. For intermediate values of AGB⁄ and VOL⁄ the impact of cluster size is, as expected, less pronounced. At the highest chosen level of AGB⁄ and VOL⁄, the tabled proportions of the total area with lower values vary by 6% in Str. 3

Table 3 ^ ma Þ of Y = AGB Mg ha1 in an m  200 m2 Expected average intra cluster correlation ðq spatially compact cluster in Våler forest strata (Str). See Appendix A for details on ANOVABOOT and ANVOANN. m 2

9

25

49

ANOVABOOT Str. 2 Str. 3 Str. 4

0.48 0.62 0.42

0.39 0.52 0.39

0.28 0.37 0.32

0.17 na 0.24

ANOVANN Str. 2 Str. 3 Str. 4

0.36 0.62 0.32

0.26 0.53 0.25

0.18 0.46 0.20

0.11 0.38 0.13

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129 Table 4 ^ ma Þ of Y = VOL m3 ha1 in an m  500 m2 Expected average intra cluster correlation ðq in spatially compact cluster in a Swiss Production Region (PR). See Appendix A for details on estimation method. m

PR = 1 PR = 2 PR = 4

2

4

9

16

0.53 0.51 0.64

0.50 0.50 0.60

0.44 0.46 0.60

0.42 0.39 0.47

and just 2% in PR 4. Standard errors of estimated and predicted proportions are also in Tables 5 and 6. In the case of Våler Str. 3, the relative errors decline from 6% (m = 1) and 23% (m = 49) to a level of 1–2% as the threshold of AGB⁄ is increased from 50 to 300 Mg ha1. For a fixed AGB⁄, the effect of m on the relative error was modest. Relative errors in estimated and predicted area proportions in PR 4 with VOL < VOL⁄ were highest for the lowest values of VOL⁄ and m = 1 or m = 16. For VOL⁄ > 300 m3 ha1 the influence of m was minor. Apart from a few combinations of a low value of VOL⁄ and large clusters, the estimated and predicted area proportions in the Swiss PRs can be considered as precise (errors less than 3%). For a fixed set of chosen area proportions p⁄ = 0.10, 0.25, 0.50, 0.75, and 0.90 the corresponding estimated (m = 1) and predicted (m P 2) quantiles in Våler Str. 3 and Swiss PR 4 are listed in Tables 7 and 8 with corresponding estimates of standard errors. The effect of m is clear and significant: for p⁄ < 0.5 the quantiles increase with m and at a faster rate as p⁄ decreases. The opposite trend is observed for p⁄ > 0.5. For p⁄ = 0.10 the relative precision of Våler quantiles was generally poor with errors at or above 20%. Errors of 12% or less were generally limited to p⁄ P 0.75. Swiss quantiles were, due to a much larger sample size estimated (predicted) with a relative error in the order of 2–4% for p⁄ > 0.10, and with relative errors between 4% (m P 4) and 9% (m = 1) for p⁄ = 0.10. 4. Discussion Survey statistics beyond estimates of a mean (total) are needed by managers, policy developers, and analysts. Quantiles and proportions are prime examples. Estimation of a design-based distribution function (F) of Y or Y|X goes a long way towards meeting

125

needs of this kind (Bellhouse and Stafford, 1999; Bellhouse et al., 2003). We chose the method by Wu and Sitter (2001) for the estimation of a (finite) model-calibrated distribution function. Recently proposed methods by Muñoz et al. (2014a) and Muñoz et al. (2014b) failed due to either numerical instability or to highly erratic estimates of variance. Model-dependent methods by Chambers and Clark (2012) were considered but appeared as less efficient than the method by Wu and Sitter (2001). An empirical likelihood method (Rueda and Muñoz, 2009) appeared to be at par with the method by Wu and Sitter (2001) in terms of efficiency, but computing time became a detractor. Our two examples demonstrated practically important scale effects in estimates of quantiles and proportions. In and by itself, this scale-dependency is not a problem per se. At least not as long as it is understood that the statistics are with reference to a specific population surveyed with a specific fixed area plot. Changing the plot size will change the estimated quantiles and proportions. Hence statistics of this kind cannot be compared directly nor can they be pooled if they arise from surveys with different spatial support. When faced with a task of comparing or analysis of quantiles and proportions – from a multitude of surveys with different spatial support – the need to find a common scale of the spatial support and rescaling of the sampling distributions arise. The scaling would be simple and straightforward in absence of a spatial autocorrelation (Legendre, 1993). In forestry one should, as demonstrated and mentioned earlier, anticipate a positive autocorrelation among plots separated by a relatively short distance (Mandallaz, 2000; Kint et al., 2003; Finley et al., 2008; Corona et al., 2014; Breidenbach et al., 2015). In absence of external information and estimates regarding the anticipated autocorrelation, our proposed method of estimation is expedient and pragmatic when predictions of Y|X can be mapped over a sufficiently large area to provide a robust estimate of the required autocorrelation. A highly fragmented forest may, due to edge-effects, compromise our method. Purely geostatistical approaches would face the same challenge (Aldworth and Cressie, 2003; Hofer and Papritz, 2010, 2011). A study in Våler with 400 m2 plots collocated with the 200 m2 plots used in this study and arranged as concentric circles (Gobakken and Næsset, 2008; Næsset et al., 2015) ^ þ eÞ.  2200 based on ðy corroborated for m = 2 our estimates of q

Fig. 1. Examples of estimated (m = 1) and predicted (m P 2) cumulative distribution functions F~ma ðyjxÞ in Våler (top) and Switzerland (below).

126

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129

Table 5 Estimated (m = 1) and predicted (m P 2) proportions of the area in Våler Str. 3 with AGB 6 AGB⁄ Mg ha1. Estimates of standard errors are in parentheses. AGB⁄ Mg ha1

50 100 150 200 300

Cluster size: m  200 m2 m=1

m=2

m=9

m = 25

m = 49

0.33 (0.02) 0.42 (0.04) 0.62 (0.04) 0.78 (0.04) 0.94 (0.02)

0.26 (0.02) 0.45 (0.04) 0.62 (0.05) 0.79 (0.04) 0.95 (0.02)

0.20 (0.02) 0.43 (0.04) 0.65 (0.05) 0.82 (0.04) 0.97 (0.02)

0.15 (0.02) 0.41 (0.04) 0.67 (0.05) 0.85 (0.04) 0.99 (0.01)

0.12 (0.03) 0.39 (0.05) 0.68 (0.06) 0.88 (0.05) 1.00 (0.01)

Appendix A A.1. A model calibrated distribution function

Table 6 Estimated (m = 1) and predicted (m P 2) proportions of the area in the Swiss PR 4 with VOL 6 VOL⁄ m3 ha1. Estimates of standard errors are in parentheses. VOL⁄ m3 ha1

100 300 500 700 900

0.10 0.25 0.50 0.75 0.90

m=1

m=2

m=4

m=9

m = 16

0.15 (0.008) 0.51 (0.010) 0.77 (0.009) 0.92 (0.006) 0.98 (0.004)

0.11 (0.002) 0.49 (0.003) 0.79 (0.002) 0.94 (0.001) 0.98 (0.001)

0.08 (0.002) 0.47 (0.004) 0.81 (0.003) 0.95 (0.002) 0.99 (0.001)

0.06 (0.002) 0.46 (0.006) 0.83 (0.005) 0.97 (0.003) 1.00 (0.001)

0.05 (0.004) 0.45 (0.007) 0.83 (0.006) 0.97 (0.002) 1.00 (0.001)

Cluster size: m  200 m2 m=1

m=2

m=9

m = 25

m = 49

1 (6) 8 (3) 121 (11) 180 (10) 261 (9)

3 (3) 48 (22) 115 (21) 186 (22) 257 (28)

26 (11) 61 (13) 115 (15) 176 (17) 235 (23)

39 (10) 71 (11) 116 (12) 169 (14) 219 (19)

46 (9) 77 (11) 118 (11) 164 (13) 208 (16)

Table 8 Estimated (m = 1) and predicted (m P 2) p⁄th quantiles of VOL m3 ha1 in Swiss PR 4. Estimates of standard errors are in parentheses. p⁄

0.10 0.25 0.50 0.75 0.90

Our proposed method for scaling quantiles and proportions involves a scaling of a sampling distribution of Y, or in our case, Y|X. We therefore begin with details for estimating this distribution. We largely follow notation in Wu and Sitter (2001). The plotlevel relationship between per unit area Y and X in the population of interest is formulated as

yi ¼ lðxi ; hÞ þ v ðxi Þei ;

Cluster size: m  500 m2

Table 7 Estimated (m = 1) and predicted (m P 2) p⁄th quantiles of AGB Mg ha1 in Våler Str. 3. Estimates of standard errors are in parentheses. p⁄

however, only arise when an application requires (i) an estimates of variance in scaled estimates of quantiles and proportions, and (ii) an assurance that a scaled distribution function is internally consistent with the original sample-based estimate of the distribution of Y|X. Dropping these requirements means that the scaling can be done expeditiously by a distribution matching (Baffetta et al., 2012).

Cluster size: m  500 m2 m=1

m=2

m=4

m=9

m = 16

64 (6) 157 (7) 295 (8) 485 (10) 669 (15)

92 (5) 178 (6) 307 (7) 466 (9) 631 (13)

115 197 314 455 599

131 209 317 445 577

140 217 320 442 566

(5) (5) (6) (8) (11)

(5) (5) (6) (7) (10)

(5) (5) (6) (7) (9)

 2200 : 0.52 Specifically we obtained the following estimates of q (Str. 2), 0.75 (Str. 3) and 0.55 (Str. 4). A retrospective analysis of the autocorrelation in SFNI volume between estimates from a 300 m2 (outer) and a 200 m2 (nested) inner plot also suggested that our proposed method provides realistic (reasonable) estimates of the autocorrelation. Our estimates of the intra-cluster correlations may, for similar forest types, serve as optional ‘default’ values when managers, policy makers, and analysts have no access to a direct estimation. Our method for scaling a distribution function may appear complex, perhaps even overly so. Much of the apparent complexity,

i ¼ 1; 2; . . . ; M a ðA ¼ M a  aÞ

ðA:1Þ

where lðxi ; hÞ is a model-based prediction of yi given xi , lðxi jhÞ is a known function, h is a vector of model parameters, v ðxi Þ is the square root of a known variance function (Davidian and Carroll, 1987), and ei is a residual error assumed normally distributed with a mean of zero a variance of 1, Ma is the number of fixed-area plots of size a in the population with an area A. In this study we assume a spatial autocorrelation in xi and thus lðxi ; hÞ but not in v ðxi Þei . For a relatively small sample (as in our context) taken from a large population, it is reasonable to assume independence of both lðxi ; hÞ, and v ðxi Þei within the sample (i = 1, . . ., n) (Breidenbach et al., 2015). To simplify notation, we will suppress the conditioning on the vector of estimated model parameters ^ h when it is clear from the context. Model calibration rests on the idea of estimating the unknown true cumulative distribution function G of the unobservable true residuals ei , i.e. Pðyi < y Þ ¼ Gðfy  lðxi jhÞgv ðxi Þ1 Þ, where y is a user specified value of Y. For an arbitrary residual value, say u⁄, we have the following unbiased estimate from a sample of size ^ n ¼ n1 P d^e 6u with ^ei ¼ ðfy  lðxi j^ hÞgv ðxi Þ1 Þ, and where n: G i i2s i dI is an indicator variable taking the value of 1 when the event I is true and 0 otherwise. ^ by, for example, least squares techFollowing an estimation of h niques, a model-calibrated estimator with design-based properties of the proportion (viz. probability integral transform) p for a given (chosen) y satisfying pa ðy jxÞ ¼ F a ðy jxÞ is (Wu and Sitter, 2001, Eq. 19) F^a ðy jxÞ ¼ M 1 a

X

p

(

1 i dyi 6y

þ M 1 a

i2s

) Ma X X  1 ^  ^ Gj ðy Þ  pi Gi ðy Þ B^y j¼1

ðA:2Þ

i2s

where pi is the sample inclusion probability of the ith plot in the sample (s), dI is an indicator variable taking the value of 1 when ^ i is a design-based estimate the event I is true and 0 otherwise, G ^ y is a design-based estimate of the slope in a of p (c.f. (A.3)), and B ^ i on dy 6y (c.f. (A.4)). linear regression of G i

^ n ðu Þ ¼ n1 ^ i ðy Þ ¼ G G

X

p1 i d^ei 6u

i2s

with ^ei ¼

ðyi  lðxi ; ^hÞÞ ; v ðxi Þ

u ¼

y  lðxi ; ^hÞ v ðxi Þ

ðA:3Þ

X ^ y ¼ B

^ ^   p1 i ðGi  Gn Þðdyi 6y  dÞ

i2s

X

^ ^ p1 i ðGi  Gn Þ

i2s

2

;

^ ¼ M1 G n a

Ma X ^j G j¼1

ðA:4Þ

127

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129

^ y depends on the chosen, but otherwise arbitrary, value of Note, B y . The choice may or may not coincide with an observed value in the sample. Under the SRS design considered here with

pi  n  N1 , and N ¼ a  A1 , the asymptotic design variance of

^a ðy Þ ¼ F^a ðy jxÞ is (Wu and Sitter, 2001, Eq. 11 after adaption to a p SRS design)

^rðF^a ðy jxÞÞ ¼ Va

1X ^ ÞÞ2 ^ i  ðdy 6y  G ðdyi 6y  G n i n i2s

provides the three-step process.

ðA:5Þ

Estimates of F^a ðy jxÞ and its variance in (A.5) were computed for L values of y uniformly distributed on the interval from 0 to the maximum of the observed and predicted values of Y. In the two application examples, L was 410, and 1200, respectively. Conversely, for a given (chosen) proportion, p ¼ k  M1 a ; k 2 f1; 2; . . . ; M a g the corresponding quantile estima^pa ¼ F^1 ðpjxÞ. Francisco and Fuller (1991) provide a tor of y is y jx

pj1 ¼ Uðzj1 Þ; . . . ; pjm ¼ Uðzjm Þ whereby U denotes a distribution function of a univariate standard Gaussian variable. In the third and last step, the proportions (in step two) are backtransformed to the desired predictions of the m values of Y|X in a cluster, i.e. ^1 ^kx;jk ¼ F^1 y a ðpjk jxÞ ¼ F a ðpjk jxÞ; k ¼ 1; . . . ; m; j ¼ 1; . . . ; M a . Eq. (A.6)

a

least squares solution for non-integer k. A.2. Scaling a model calibrated distribution function Here we detail how the model calibrated distribution function in A1 can be rescaled from a spatial support a m2 to a spatial support of m  a m2. The method allows for a Monte-Carlo based estimation of variances of scaled quantiles and proportions. If these are not needed then a simpler distribution matching can be pursued (Baffetta et al., 2012). ma To predict a cumulative distribution function for the mean y in a spatially compact cluster with m plots of size a, one needs an estimate of the anticipated spatial auto-correlation among the ^ m see (A.6)). For the distance (u) separatm plots in a cluster (i.e. R ing the centroids of two plots in a cluster, we shall assume an exponential correlation function: qðuÞ ¼ expðu=/Þ (Diggle and Ribeiro, 2007, p. 52) with parameter /. To estimate / requires an  ma among the m plots. estimate of the average auto-correlation q  ma is outlined in section A6. How we obtained an estimate of q We acknowledge that an exponential correlation function is unrealistic for large distances (Mandallaz, 2000). Fitted Matérn correlation functions (not shown) had ranges in the order of 400–900 m in the two case studies, well below any within-cluster distance encountered in our study. To note, probability density functions ^ m or with  m in all off-diagonal positions of R estimated with either q

^ ma Þ, were similar. derived distance dependent estimates expðu=/ No hypothesis of equality of the distributions was rejected at the 5% level of significance (test: Anderson–Darling, (Anderson and Darling, 1952)). In the rare case with an approximate normal distribution function the scaling to a larger spatial support would be a simple matter since all moments greater than two will be approximately zero (Wackernagel, 2003, p. 72). However, the estimated distribution function of Y|X may be non-normal and require a more involved procedure (Gurley and Kareem, 1998; Legendre and Legendre, 2012). Since we aspire to estimate variances of scaled quantiles and proportions with the assurance that the scaled distribution function is internally consistent with F^a we opted for a Monte-

Carlo simulation approach detailed next. In the first step of the Monte-Carlo procedure, we draw repeated sets of m correlated standard multivariate normal (MVN) random variables ðzj1 ; . . . ; zjm Þ; j ¼ 1; . . . ; M ma with a mean of 0m1 , a variance of 1.0, and a m  m correlation matrix Rm. The number of draws is the number of clusters in the population, i.e. Mma ¼ dA  m1  a1 e. In the second step, a univariate standard normal probability integral transform converts ðzj1 ; . . . ; zjm Þ to proportions

step2 :

^ mÞ ðzj1 ; . . . ; zjm Þ are distributed as a MVNð0; R ^j1 ; . . . ; p ^jm Þ ¼ Uðzj1 ; . . . ; zjm Þ; ðp

step3 :

^jx;j1 ; . . . ; y ^jx;jm Þ ¼ F^1 ^ ^ ðy a ðpj1 ; . . . ; pjm jxÞ;

step1 :

ðA:6Þ

Repeat 1—3 for j ¼ 1; . . . ; M ma The chain from step 1 to step 3 ensures that the marginal distribution of cluster means is internally consistent with F^a (Fischer, 2010; Kurowicka, 2010; Gräler and Pebesma, 2012). After the third step we computed the predicted cluster means j ^ ma ; j ¼ 1; . . . ; M a , and obtained the desired distribution function y by pairing a sorted (ascending) list of cluster means with associ1 ^ ^ ðrÞ ^ðrÞ ated proportions, i.e. p ma ¼ F ma ðy ma jxÞ ¼ r  M ma ; r ¼ 1; . . . ; M ma ðrÞ ^ma is the rth smallest cluster mean. The predicted distribu where y tion function is dependent on the random draws in step 1. To reduce this dependency, we generated BR = 100 replications of the ordered pairs. Our final approximation to F ma becomes

F~ma ðy jxÞ ¼ BR1

B X  ðrÞ ðrÞ  ma ; p ^ma Þb ; r ¼ 1; . . . ; Mma ½y  IPF ðy

ðA:7Þ

b¼1

where y is an arbitrary user specified value of a cluster mean, IPF is a linear interpolation function that for a given y and M ma pairs of   ðrÞ ðrÞ ma ^ma y ;p predicts the proportion F^bma ðy jxÞ. To control for potenb

tial extrapolation errors, all values of F^bma ðy jxÞ were restricted to the permissible interval of [0, 1].

A.3. Scaled proportions and their variance For a fixed value of Y, say y⁄, the area proportion with a value of Y less than y⁄, is estimated to F~ma ðy jxÞ, an approximation to the variance of this proportion was obtained for each value of F^b ðy jxÞ; b ¼ 1; . . . ; B as per (A.5) and averaged over the B replicama

tions. The computation included a scaling of ya;i and ^ea;i by the ^ ma ¼ ð1 þ ðm  1Þq ^  ma Þ of square root of the estimated ratio V R the expected variance among means of Y|X in clusters of m plots of size a to the variance among plots of size a (Cochran, 1977, Eq. 8.10). Accordingly, we propose the following approximation to the variance of F~ma ðy jxÞ 1

~rðF~ma ðy jxÞÞ ¼ BR1 Va

B   X ^r F^bma ðy jxÞ Va b¼1

þ ðB  1Þ1

B  2 X F^bma ðy jxÞ  F~ma ðy jxÞ

ðA:8Þ

b¼1

i.e. the expected variance plus the variance of the expectations (Fuller, 2009, Eq. 1.2.78). In an attempt to avoid excessive times for the simulations in the large Swiss populations, we limited Mma to 70,000. The impact of this curtailment is only material in the extreme tails of the predicted distributions where prediction errors are (anyhow) very large. The practical importance of the curtailment is negligible.

128

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129

A.4. Scaled quantiles and their variance Estimating the pth quantile of Y|X requires an inverse to a dis⁄ tribution function, i.e. F 1 ma ðpjxÞ. For a user specified value p we obtain a desired quantile via the inverse to the interpolation functions in (A.7), i.e.

 ~ma ðp Þ ¼ F~1 y ma ðp jxÞ

¼ BR1

B X  ðrÞ ðrÞ  ^ma ; y ma Þb ; r ¼ 1; . . . ; M ma ½p  IPF ðp

ðA:9Þ

b¼1

An approximation to the variance of pth quantile of y|x is found ~ma ðpÞ (Francisco and via a Bahadur representation of the error in y Fuller, 1991; Fuller, 2009, 1.3.98) and (Serfling, 1980, p. 80). Specifically,  ~r 1 ðy ~ma ðp ÞÞ ¼ p ð1  p Þ^f 2 ~ Va ma ðyma ðp ÞjxÞ

ðA:10Þ

 ~ where ^f 2 ma ðyma ðp ÞjxÞ is an estimate of the probability density func~ma ðp Þ. Our estimates of the pdfs were tion (pdf) of Y|X evaluated at y obtained from normalized derivatives of the interpolation functions (IPF) in (A.7). The variance estimator in (A.10) ignores the variance in F~ma . We therefore added an estimate of the variance in F~ma to get

~rðy~ma ðp ÞÞ ¼ V a ~r1 ðy ~ma ðp ÞÞ þ ðBR  1Þ Va

1

B X   2 ^ðrÞ ðrÞ ~ ðIPF½ðp ma ; yma Þb ; r ¼ 1; .. . ; M ma ½p   yma ðp ÞÞ b¼1

ðA:11Þ  ma A.5. Estimation of the intra-cluster correlation q In the current context with no sampling of Y outside the n sam ma , we pled field plots, and no available information about q  m from spatially explicit predictions of Y, resorted to estimate q hÞ þ v ðxÞe where e is a random draw from a standard nori.e. lðxj^  ma was estimated via a spatial sammal distribution. Specifically, q pling (without replacement) of spatially compact clusters with m plots of size a followed by a one-way analysis of variance (ANOVA) with clusters in the role of random blocks (Donner, 1986).  ma was straightforward to impleThis approach to estimate q ment with the large SFNI data samples. However, a Våler stratum constituted a more fragmented population with an irregular outline. Thus the approach was only feasible without modifications for clusters with m = 2  200 m2 plots. With m = 49, only 13% in Str. 2, 11% in Str. 3, and 3% in Str. 4 had a full complement of plots. With m = 25 and 49, no single cluster contained the target number of plots. Thus, for m > 2 we obtained, by two different methods, an ^ in clusters with at least d0:7me  ma by imputing lðxjhÞ estimate of q plots. In the first method (ANOVABOOT), we completed deficient clusters via a random (with-replacement) sampling from the avail^ In the second method able within-cluster values of lðxjhÞ. (ANOVANN), missing values of were imputed from the nearest available plots outside the cluster but within a maximum distance pffiffiffiffiffi of 2 m from its center. All incomplete clusters were discarded from the ANOVA. The BOOT method is expected to over-estimate q ma , while the NN method is expected under-estimate q ma . References Aldworth, J., Cressie, N., 2003. Prediction of nonlinear spatial functionals. J. Statist. Plan. Infer. 112, 3–41. Anderson, T.W., Darling, D.A., 1952. Asymptotic theory of certain ‘‘goodness of fit” criteria based on stochastic processes. Ann. Math. Stat. 23, 193–212. Baffetta, F., Corona, P., Fattorini, L., 2012. A matching procedure to improve k-NN estimation of forest attribute maps. For. Ecol. Manage. 272, 35–50. Bellhouse, D.R., Goia, C.M., Stafford, J.E., 2003. Graphical displays of complex survey data though kernel smoothing. In: Chambers, R.L., Skinner, C.J. (Eds.), Analysis of Survey Data. Wiley, Chichester, pp. 133–150.

Bellhouse, D.R., Stafford, J.E., 1999. Density estimation from complex surveys. Stat. Sin. 9, 407–424. Brändli, U.-B., 2010. Schweizerisches Landesforstinventar. Ergebnisse der dritten Erhebung 2004–2006. Bern, p. 312. Breidenbach, J., McRoberts, R.E., Astrup, R., 2015. Empirical coverage of modelbased variance estimators for remote sensing assisted estimation of stand-level timber volume. Remote Sens. Environ. Chambers, R.L., Clark, R.G., 2012. An Introduction to Model-based Survey Sampling with Applications. Oxford University Press, New York. Chen, J., Wu, C., 2002. Estimation of distribution function and quantiles using the model-calibrated pseudo empirical likelihood method. Stat. Sin. 12, 1223–1240. Cochran, W.G., 1977. Sampling Techniques. Wiley, New York. Cochran, W.G., Cox, G.M., 1957. Experimental Designs. Wiley. Corona, P., Fattorini, L., Franceschi, S., Chirici, G., Maselli, F., Secondi, L., 2014. Mapping by spatial predictors exploiting remotely sensed and ground data: a comparative design-based perspective. Remote Sens. Environ. 152, 29–37. Correll, R.L., Cellier, K.M., 1987. Effects of plot size, block size and buffer rows on the precision of forestry trials. Aust. For. Res. 17, 11–18. David, H.A., Mishriky, R.S., 1968. Order statistics for discrete populations and for grouped samples. J. Am. Stat. Assoc. 63, 1390–1398. Davidian, M., Carroll, R.J., 1987. Variance function estimation. J. Am. Stat. Assoc. 82, 1079–1091. del Mar Rueda, M., Arcos, A., Martínez, M.D., 2003. Difference estimators of quantiles in finite populations. Test 12, 481–496. Diggle, P.J., Ribeiro, P.J.J., 2007. Model-based Geostatistics. Springer, New York. Donner, A., 1986. A review of inference procedures for the intraclass correlation coefficient in the one-way random effects model. Int. Stat. Rev. 54, 67–82. Duane, M.V., Cohen, W.B., Campbell, J.L., Hudiberg, T., Turner, D.P., Weyermann, D.L., 2010. Implications of alternative field-sampling designs on Landsat-based mapping of stand age and carbon stocks in Oregon forests. For. Sci. 56, 405–416. Finley, A.O., Banerjee, S., Ek, A.R., McRoberts, R.E., 2008. Bayesian multivariate process modeling for prediction of forest attributes. J. Agric. Biol. Environ. Stat. 13, 60–83. Fischer, M., 2010. Multivariate copulae. In: Kurowicka, D., Joe, H. (Eds.), Dependence Modeling. World Scientific, Singapore, pp. 19–36. Fortin, M.-J., James, P.M.A., MacKenzie, A., Melles, S.J., Rayfield, B., 2012. Spatial statistics, spatial regression, and graph theory in ecology. Spat. Stat. 1, 100–109. Francisco, C.A., Fuller, W.A., 1991. Quantile estimation with a complex survey design, pp. 454–469. Fuller, W.A., 2009. Sampling Statistics. Wiley, New York. Gobakken, T., Næsset, E., 2008. Assessing effects of laser point density, ground sampling intensity, and field sample plot size on biophysical stand properties derived from airborne laser scanner data. Can. J. For. Res. 38, 1095–1109. Gräler, B., Pebesma, E., 2012. Modelling dependence in space and time with vine copulas. In: Ninth International Geostatistics Congress, Oslo, NO, pp. 11–15. Gray, A., 2003. Monitoring stand structure in mature coastal Douglas-fir forests: effect of plot size. For. Ecol. Manage. 175, 1–16. Gurley, K., Kareem, A., 1998. Meccanica 33, 309–317. Hofer, C., Papritz, A., 2010. Predicting threshold exceedance by local block means in soil pollution surveys. Math. Geosci. 42, 631–656. Hofer, C., Papritz, A., 2011. ConstrainedKriging: an R-package for customary, constrained and covariance-matching constrained point or block kriging. Comput. Geosci. 37, 1562–1569. Hou, Z., Xu, Q., Hartikainen, S., Antilla, P., Packalen, T., Maltamo, M., Tokola, T., 2015. Impact of plot size and spatial pattern of forest attributes on sampling efficacy. For. Sci. Kangas, A., Maltamo, M., 2006. Forest Inventory: Methodology and Applications. Springer. Kindermann, G.E., McCallum, I., Fritz, S., Obersteiner, M., 2008. A global forest growing stock, biomass and carbon map based on FAO statistics. Silv. Fenn. 42, 387. Kint, V., Van Meirvenne, M., Nachtergale, L., Geudens, G., Lust, N., 2003. Spatial methods for quantifying forest stand structure development: a comparison between nearest-neighbor indices and variogram analysis. For. Sci. 49, 36–49. Köhl, M., Baldauf, T., Plugge, D., Krug, J., 2009. Reduced emissions from deforestation and forest degradation (REDD): a climate change mitigation strategy on a critical track. Carbon Balance Manage. 4, 10. Köhl, M., Lasco, R., Cifuentes, M., Jonsson, Ö., Korhonen, K.T., Mundhenk, P., de Jesus Navar, J., Stinson, G., 2015. Changes in forest production, biomass and carbon: results from the 2015 UN FAO Global Forest Resource Assessment. For. Ecol. Manage. 352, 21–34. Köhl, M., Magnussen, S., Marchetti, M., 2006. Sampling Methods, Remote Sensing and GIS Multiresource Forest Inventory. Springer, Berlin. Kurowicka, D., 2010. Introduction: dependence modeling. In: Kurowicka, D., Joe, H. (Eds.), Dependence Modeling. World Scientific, pp. 1–17. Kurz, W.A., Dymond, C.C., White, T.M., Stinson, G., Shaw, C.H., Rampley, G.J., Smyth, C., Simpson, B.N., Neilson, E.T., Trofymow, J.A., Metsaranta, J., Apps, M.J., 2009. CBM-CFS3: a model of carbon-dynamics in forestry and land-use change implementing IPCC standards. Ecol. Model. 220, 480–504. Lagergren, F., Grelle, A., Lankreijer, H., Mölder, M., Lindroth, A., 2006. Current carbon balance of the forested area in Sweden and its sensitivity to global change as simulated by biome-BGC. Ecosystems 9, 894–908. Lanz, A., Brändli, U.B., Brassel, P., Ginzler, C., Thürig, E., 2010. Switzerland. In: Tomppo, E., Gschwantner, T., Lawrence, M., McRoberts, R.E. (Eds.), National Forest Inventories – Pathways for Common Reporting. Springer, Dordrecht, pp. 555–566.

S. Magnussen et al. / Forest Ecology and Management 364 (2016) 122–129 Legendre, P., 1993. Spatial autocorrelation: trouble or new paradigm? Ecology 74, 1659–1673. Legendre, P., Legendre, L.F., 2012. Numerical Ecology. Elsevier. Lopez-Paz, D., Hernández-Lobato, J.M., Ghahramani, Z., 2013. Gaussian process vine copulas for multivariate dependence. In: Proceedings of the 30th International Conference on Machine Learning, JMLR: W&CP, Atlanta, GA, pp. 1–9. Magnussen, S., 1989. Inter-plant interactions and their influence on within and among plot variances. Scand. J. For. Res. 4, 369–377. Magnussen, S., Mandallaz, D., Breidenbach, J., Lanz, A., Ginzler, C., 2014. National forest inventories in the service of small area estimation of stem volume. Can. J. For. Res. 44, 1079–1090. Magnussen, S., Næsset, E., Gobakken, T., 2015. LiDAR supported estimation of change in forest biomass with time invariant regression models Can. J. For. Res. 45, 1514–1523. Mandallaz, D., 2000. Estimation of the spatial covariance in Universal Kriging: application to forest inventory. Environ. Ecol. Stat. 7, 263–284. Mandallaz, D., 2008. Sampling Techniques for Forest Inventories. Chapman and Hall, Boca Raton, Florida. Mandallaz, D., Lanz, A., 2001. Forest inventory: further results for optimal sampling schemes based on the anticipated variance. Can. J. For. Res. 31, 1845–1853. Mandallaz, D., Ye, R., 1999. Forest Inventory with optimal two-phase, two-stage sampling schemes based on the anticipated variance. Can. J. For. Res. 29, 1691– 1708. Martin, H., Margaret, S., 2011. Monitoring, reporting and verification for national REDD+ programmes: two proposals. Environ. Res. Lett. 6, 014002. McRoberts, R.E., Tomppo, E.O., Næsset, E., 2010. Advances and emerging issues in national forest inventories. Scand. J. For. Res. 25, 368–381. Muñoz, J., Álvarez, E., Rueda, M., 2014a. Optimum design-based ratio estimators of the distribution function. J. Appl. Stat. 41, 1395–1407. Muñoz, J., Arcos, A., Álvarez, E., Rueda, M., 2014b. New ratio and difference estimators of the finite population distribution function. Math. Comput. Simul. 102, 51–61. Næsset, E., Bollandsås, O.M., Gobakken, T., Gregoire, T.G., Ståhl, G., 2012. Modelassisted estimation of change in forest biomass over an 11 year period in a

129

sample survey supported by airborne LiDAR: a case study with poststratification to provide ‘‘activity data”. Remote Sens. Environ. 128, 299–314. Næsset, E., Bollandsås, O.M., Gobakken, T., Solberg, S., McRoberts, R.E., 2015. The effects of field plot size on model-assisted estimation of aboveground biomass change using multitemporal interferometric SAR and airborne laser scanning data. Remote Sens. Environ. 168, 252–264. Pelletier, J., Busch, J., Potvin, C., 2015. Addressing uncertainty upstream or downstream of accounting for emissions reductions from deforestation and forest degradation. Clim. Change 130, 635–648. Rueda, M., Muñoz, J., 2009. New model-assisted estimators for the distribution function using the pseudo empirical likelihood method. Stat. Neerl. 63, 227– 244. Schreuder, H.T., Gregoire, T.G., Wood, G.B., 1993. Sampling Methods for Multiresource Forest Inventory. Wiley, New York. Serfling, R.J., 1980. Approximation Theorems of Mathematical Statistics. J. Wiley Sons, New York. Smith, H.F., 1938. An empirical law describing heterogeneity in the yields of agricultural crops. J. Agric. Sci. 28, 1–23. Stinson, G., Kurz, W.A., Smyth, C.E., Neilson, E.T., Dymond, C.C., Metsaranta, J.M., Boisvenue, C., Rampley, G.J., Li, Q., White, T.M., Blain, D., 2011. An inventorybased analysis of Canada’s managed forest carbon dynamics, 1990 to 2008. Glob. Change Biol. 17, 2227–2244. Ver Hoef, J.M., Temesgen, H., 2013. A comparison of the spatial linear model to nearest neighbor (k-NN) methods for forestry applications. PLoS ONE 8, e59129. Wackernagel, H., 2003. Multivariate Geostatistics. Springer Science & Business Media. Wu, C., Sitter, R.R., 2001. A model-calibration approach to using complete auxiliary information from survey data. J. Am. Stat. Assoc. 96, 185–193. Yan, E., Lin, H., Wang, G., Sun, H., 2014. Multi-scale simulation and accuracy assessment of forest carbon using Landsat and MODIS data. In: Earth Observation and Remote Sensing Applications (EORSA), 2014 3rd International Workshop on. IEEE, pp. 195–199. Zeide, B., 1980. Plot size optimization. For. Sci. 26, 251–257.