Artificial neural networks on integrated multispectral and SAR data for high-performance prediction of eucalyptus biomass


Computers and Electronics in Agriculture xxx (xxxx) xxxx

Contents lists available at ScienceDirect

Computers and Electronics in Agriculture journal homepage: www.elsevier.com/locate/compag

Artificial neural networks on integrated multispectral and SAR data for high-performance prediction of eucalyptus biomass

Getulio Fonseca Domingues (a, e, ⁎), Vicente Paulo Soares (a), Helio Garcia Leite (a), Antônio Santana Ferraz (b), Carlos Antonio Alvares Soares Ribeiro (a), Alexandre Simões Lorenzon (a), Gustavo Eduardo Marcatti (a), Thaisa Ribeiro Teixeira (a), Nero Lemos Martins de Castro (a), Pedro Henrique Santos Mota (a), Guilherme Silverio Aquino de Souza (a, f), Sady Júnior Martins da Costa de Menezes (c), Alexandre Rosa dos Santos (d), Cibele Hummel do Amaral (a)

(a) Federal University of Viçosa/UFV, Department of Forest Engineering, Av. Peter Henry Rolfs, s/n, 36570-000 Viçosa, MG, Brazil
(b) Federal University of Viçosa/UFV, Department of Civil Engineering, Av. Peter Henry Rolfs, s/n, 36570-000 Viçosa, MG, Brazil
(c) Federal Rural University of Rio de Janeiro/UFRRJ, Three Rivers Institute, Av. Prefeito Alberto Lavinas, 1847, 25802-100, Centro, Três Rios, RJ, Brazil
(d) Federal University of Espírito Santo/UFES, Department of Rural Engineering, Alto Universitário, s/n, 29500-000 Alegre, ES, Brazil
(e) National Institute of the Atlantic Forest/INMA, Av. José Ruschi, 29650-000 Santa Teresa, ES, Brazil
(f) State University of Mato Grosso do Sul/UEMS, Km12 Aquidauana-Camisão Road, 79200-000 Aquidauana, MS, Brazil

ARTICLE INFO

Keywords: Forest biomass; Synthetic aperture radar; Multispectral images; Machine learning

ABSTRACT

Biomass estimation plays an important role in forest management. It is applied in most carbon sequestration studies, in the assessment of forest succession, in the conservation of natural resources, in the quantification of nutrient cycling, in energy planning where forest biomass is the primary fuel for power generation, and in harvest planning and stock management in the pulp industry. Using data from the Advanced Visible and Near Infrared Radiometer type 2 (AVNIR-2) and Phased Array type L-band Synthetic Aperture Radar (PALSAR) sensors onboard the Advanced Land Observing Satellite (ALOS), above-ground biomass (AGB) estimates were generated via artificial neural networks for a eucalyptus planting area in Minas Gerais State, Brazil. With 206 inventory plots, the coefficient of determination between AGB estimates and observed values within the validation sample was 0.95. The relative root mean square error was 2.87%, with errors ranging from −8% to 4%. These results demonstrate the higher performance of artificial neural networks in modeling eucalyptus biomass from multispectral and SAR data compared with a previous study, in which multiple linear regression was applied to the same dataset and achieved an R² of 0.71.

1. Introduction

The Brazilian wood production system based on eucalyptus forests is one of the most technologically advanced in the world, achieving high productivity (IBÁ, 2015; Stape et al., 2010) and representing an extraordinary potential for using biomass as a renewable energy source (ANEEL, 2016). Currently, 5,560,000 ha of eucalyptus forests are used for wood production in Brazil. In the last few years, however, production costs have increased substantially due to rising wages and stagnating workforce productivity, leading forest companies to search for new processes to optimize production management (IBÁ, 2015).

Forest inventory is the common method for biomass assessment. It is based on one or multiple vegetation measurements made in representative plots to determine forest biomass through allometric equations. However, innovations aim to reduce the number of field plots, which demand extensive field work, as well as to provide more detail about the forest area, towards precision forestry (Holopainen et al., 2014). Remote sensing data are widely recognized for their potential application in ecosystem modeling and management (de Araujo Barbosa et al., 2015; Lopez and Frohn, 2018), in mapping species and forest composition (Corbane et al., 2015; Vaglio Laurin et al., 2016) and to obtain



Corresponding author at: Federal University of Viçosa/UFV, Department of Forest Engineering, Av. Peter Henry Rolfs, s/n, 36570-000 Viçosa, MG, Brazil.
https://doi.org/10.1016/j.compag.2019.105089
Received 11 January 2019; Received in revised form 10 October 2019; Accepted 2 November 2019
0168-1699/ © 2019 Elsevier B.V. All rights reserved.


(AVNIR-2) and Phased Array type L-band Synthetic Aperture Radar (PALSAR) sensors, both onboard the Advanced Land Observing Satellite (ALOS) (Table 1), constituted the primary data source for this study. Images from the PALSAR sensor were obtained on May 2, 2009, while AVNIR-2 images were collected on May 27, 2009. AVNIR-2 images were corrected for atmospheric effects using the dark object digital number subtraction technique (Chavez, 1975). Geometric and radiometric calibration were performed according to preprocessing level 1B2-G, and the images were georeferenced using a vector file of roads and corridors within the study area. PALSAR images, with a 16-bit radiometric resolution, were converted to 8 bits, thereby producing four polarimetric images (LHH, LHV, LVV and LVH) with the same radiometric resolution as the AVNIR-2 images. Next, the backscatter coefficient (σ°) of each polarization was computed with the equation of Shimada et al. (2006), as follows:

estimates of forest biophysical parameters (White et al., 2016; Galidaki et al., 2017). One important aspect of remote sensing data is the ability to provide information on large areas with greater efficiency and lower cost than other techniques, which include extensive field campaigns. When used together with inventory data, remote sensing products can provide parameter estimates with adequate accuracy for forest management (Englhart et al., 2011).

Passive optical remote sensing, such as multispectral imaging, was the first technology widely used to map forest biomass, due to its temporal scope and feasibility. Moreover, the possibility of using it in conjunction with other types of data, e.g. texture information derived from the multispectral image itself and microwaves (Djiongo Kenfack et al., 2018), could help overcome some issues reported in biomass modelling, especially when canopy structure no longer correlates strongly with biomass or when the signal saturates (Koch, 2010; Song, 2013).

Active microwave remote sensors such as Synthetic Aperture Radar (SAR) are particularly suitable for estimating biophysical parameters of forest cover (height, diameter, volume and biomass), which is possible due to the interaction of the microwave signal with vegetation at certain frequencies and polarizations (Gama et al., 2010). SAR can operate in different bands. Short wavelengths, such as those covered by the X and C bands, carry information about the canopy surface through backscatter produced by leaves and small branches. Longer wavelengths, such as the L and P bands, penetrate deeper into the canopy, where branches and trunks cause the backscatter (Dobson, 2000). Microwaves mainly interact with plant components of the same order of magnitude as their wavelength (Le Toan et al., 1992).

Microwave and multispectral data complement each other in forest parameter assessments because multispectral data are related to the chemical and biophysical characteristics of the canopy surface, while SAR data provide dielectric (related to moisture) and structural information from the vertical layers of the forest. Hence, combining the electromagnetic response of different forest attributes can lead to improvements in forest modelling (Bagan et al., 2012; Vaglio Laurin et al., 2013).

Machine learning algorithms and remote sensing data have been efficient in estimating biophysical parameters (Verrelst et al., 2012). One of the most popular models in this domain is the artificial neural network (ANN), a technique that is widely used and effective for modeling forest parameters from multivariate data, including remote sensing data (Del Frate and Solimini, 2004; Muukkonen and Heiskanen, 2005). An ANN simulates the operation of a network of biological neurons in an effort to model relationships among variables. For example, the multilayer perceptron (MLP), a type of artificial neural network, is capable of approximating any measurable function (Hornik et al., 1989); that is, if an MLP is allowed to vary freely, it can take the form of any continuous curve (Almeida, 2002). This makes the MLP a powerful modeling tool, especially when the implicit relationships in the data are unknown.

The purpose of this study is (1) to apply an artificial neural network technique, i.e. the multilayer perceptron, to estimate the above-ground biomass (AGB) of eucalypt stands with Advanced Visible and Near Infrared Radiometer type 2 (AVNIR-2) and Phased Array type L-band Synthetic Aperture Radar (PALSAR) data, (2) to evaluate the efficiency of ANN, and (3) to evaluate the integration of multispectral and SAR data for that goal.

$\sigma^{\circ} = 10 \log_{10}(DN^{2}) + CF$

where DN is the pixel digital number and CF (Conversion Factor) = −83.0.

From February to September 2009, a forest inventory with 1924 permanent plots was conducted. For this study, 206 plots were selected from this dataset by Oliveira (2011) according to the following requirements: (1) be within stands more than five years old; (2) be located in areas imaged by both sensors (AVNIR-2 and PALSAR); (3) be in areas without cloud cover in the AVNIR-2 images; (4) be outside shadow areas in the PALSAR images. In the inventory, height and diameter at breast height were measured to model total biomass within each plot, using biomass measurements of preselected trees representative of the site. Among the 206 plots, mean observed biomass was 128.73 t/ha, ranging from 90.13 t/ha to 190.25 t/ha.

In geographic information system software, we used square masks of 341 m², an area equal to the plot size, to extract digital values of the overlapping pixels (≅100 m² each). The mask value was thus the area-weighted average of the pixels overlapping the mask (Fig. 2).

A Multilayer Perceptron (MLP) artificial neural network obtained by supervised training was used to estimate biomass. MLPs were trained to estimate stand biomass based on the four spectral bands (B, G, R, and NIR) of the AVNIR-2 sensor, the four polarizations (LHH, LHV, LVV and LVH) of the PALSAR sensor and the observed biomass values of the training plots. Plots were divided randomly into subsets for training, testing and validation. Initially, 50% of plots were used for MLP training, 25% for testing and the remaining 25% to validate biomass estimates. First, 5000 MLPs were trained for each sensor; subsequently, MLPs were trained on data from both sensors together. Supervised training used observed biomass as the dependent variable, while variables from AVNIR-2, from PALSAR, and from both sensors together served as independent variables.
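As a worked illustration of the backscatter conversion of Shimada et al. (2006) quoted in this passage, the following minimal Python sketch applies the equation to single pixel values. The function name and the sample digital numbers are hypothetical and chosen only for illustration; the conversion factor of −83.0 is the one stated in the text.

```python
import math

# Conversion factor quoted in the text (Shimada et al., 2006).
CONVERSION_FACTOR = -83.0


def backscatter_coefficient(dn, cf=CONVERSION_FACTOR):
    """Return sigma-naught = 10 * log10(DN^2) + CF for one pixel value."""
    if dn <= 0:
        raise ValueError("DN must be positive for the logarithm to be defined")
    return 10.0 * math.log10(dn ** 2) + cf


# Hypothetical digital numbers, not values taken from the study.
for dn in (100, 1000, 5000):
    print(dn, round(backscatter_coefficient(dn), 2))
```

For example, a digital number of 1000 gives 10 · log10(10^6) − 83 = 60 − 83 = −23.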
During the training of the 5000 MLPs, the number of hidden neurons in each MLP was chosen randomly between 1 and 50. The activation functions were tanh, logistic, exponential and identity. The average error of each combination of activation function and number of neurons was computed in order to evaluate the reliability of all architectures. We considered 5000 MLPs enough to analyze trends within the results of each architecture and to find a well-fitted MLP without high computational cost. Among the 5000 MLPs trained for each sensor, the best-fitted MLP was selected using the validation sample. Factors that influenced network selection based on the validation sample were (a) the coefficient of determination between observed values and values estimated by the MLP, (b) the number of neurons in the hidden layer, (c) the error distribution and (d) the root mean square error in percentage (RMSE%). Multilayer perceptrons were trained using Statistica 12 software (StatSoft, Inc., 2013), which includes (a) the Broyden-Fletcher-Goldfarb-Shanno algorithm, which minimizes the error function, and (b) backpropagation for artificial

2. Material and methods

This study was conducted within eucalypt stands located in the Belo Oriente, Santana do Paraíso, Ipaba and Caratinga municipalities, Minas Gerais state, southeastern Brazil (Fig. 1). Stands consist of Eucalyptus urophylla × Eucalyptus grandis hybrid clones, spaced 3 × 2 m, with ages ranging from 4 to 8 years at the time of the study and a mean height of 25.9 m. The plantation is intended for cellulose pulp production and has a total area of 837.12 km². Images from Advanced Visible and Near Infrared Radiometer type 2


Fig. 1. Inventory plots location within study area.

amount equally divided and used for testing and validation. Dividing data into training, test and validation samples helps to avoid overfitting (Hawkins, 2004). The test data were used in training itself, as part of an iterative process that checks the estimated error during training and prevents overfitting through early stopping. For example, MLP training is interrupted if the computed test error increases for a few cycles. In that case, the MLP returns to the result obtained in the cycle before the test error started to increase (Fig. 3). Thus, training and test samples were used during MLP training, and the validation sample was used to compute the model's accuracy.
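The data division and the early-stopping rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure implemented in Statistica: the function names, the random seed and the `patience` parameter (how many cycles count as "a few") are hypothetical, and the sketch only selects which training cycle would be kept from a recorded sequence of test errors.

```python
import random


def split_plots(plot_ids, train_fraction=0.5, seed=0):
    """Randomly split plots into training, test and validation subsets.

    Plots left over after training are divided as equally as possible
    between the test and validation subsets.
    """
    ids = list(plot_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    n_test = (len(ids) - n_train) // 2
    train = ids[:n_train]
    test = ids[n_train:n_train + n_test]
    validation = ids[n_train + n_test:]
    return train, test, validation


def early_stopping_cycle(test_errors, patience=3):
    """Return the index of the training cycle whose result would be kept.

    Training stops once the test error has failed to improve for
    `patience` consecutive cycles; the cycle with the lowest test error
    seen up to that point is returned.
    """
    best_cycle, best_error, waited = 0, float("inf"), 0
    for cycle, error in enumerate(test_errors):
        if error < best_error:
            best_cycle, best_error, waited = cycle, error, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_cycle


train, test, validation = split_plots(range(206), train_fraction=0.5)
print(len(train), len(test), len(validation))
print(early_stopping_cycle([5.0, 4.0, 3.0, 3.5, 3.6, 3.7, 2.0]))
```

With 206 plots and a training fraction of 0.5, the split yields 103 training plots and divides the remaining 103 between test and validation. In the second call the test error rises for three cycles after cycle 2, so training stops there and the later, lower value is never reached.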

Table 1
Specification of multispectral and SAR data used in the study.

AVNIR-2
Band                       Wavelength region (µm)   Resolution (m)
1 (blue - B)               0.42–0.50                10
2 (green - G)              0.52–0.60                10
3 (red - R)                0.61–0.69                10
4 (near infrared - NIR)    0.76–0.89                10

PALSAR
Band     Frequency (GHz)   Polarization      Resolution (m)
SAR-L    1.3               HH, HV, VV, VH    10

3. Results

Training of 5000 MLPs using SAR and multispectral data over 50% of plots yielded an average of twelve MLPs for each combination of (a) number of neurons, (b) activation function in the hidden layer and (c) output activation function. Thus, it was possible to observe the

Fig. 2. Example of the weighted average value computed from each pixel's overlapping area within the mask (M), where $DN_M = \sum_i (DN_i \cdot A_i) / \sum_i A_i$ and i is the ith pixel.
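The area-weighted average in the Fig. 2 caption can be written as a short function. The sketch below is illustrative only: the function name and the pixel values are hypothetical, and each pixel is represented simply as a pair of digital number and overlap area.

```python
def mask_value(pixels):
    """Area-weighted mean digital number for one mask.

    `pixels` is an iterable of (dn, overlap_area) pairs, one pair per
    pixel that intersects the mask.
    """
    pixels = list(pixels)
    total_area = sum(area for _, area in pixels)
    if total_area <= 0:
        raise ValueError("mask does not overlap any pixel")
    return sum(dn * area for dn, area in pixels) / total_area


# Hypothetical mask overlapping two pixels: DN 10 over 1 m2 and DN 20 over 3 m2.
print(mask_value([(10, 1.0), (20, 3.0)]))
```

In this toy case the result is (10 · 1 + 20 · 3) / 4 = 17.5, so the pixel covering more of the mask dominates the mask value.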

neural network learning. Next, based on the combined AVNIR-2 and PALSAR data, the number of plots within the training subset was reduced to 40% of the total, with the remaining plots partitioned into two subsets of identical size (30% each), one for testing and the other for validation. This procedure continued until 10% of plots were used for training and remaining

Fig. 3. Training stop using a sample for test (adapted from StatSoft, Inc., 2013).


each independent variable in multilayer perceptron training. The green wavelength, along with LVH and LVV backscatter, stand out as the most important variables (Table 3).

4. Discussion

According to Tetko et al. (1995), while a sufficient number of neurons can provide good predictions, an excess of neurons can cause overfitting. This happens when the ANN learns both the tendencies and the noise of the training data, thereby reducing its generalizability. In Fig. 5, only the combination of the tanh function in the hidden layer with the identity function in the output layer clearly showed an increase in errors within the validation sample beyond thirteen neurons. Moreover, for different architectures it is possible to observe in Figs. 4 and 5 that validation error decreases and stabilizes when the number of neurons within the hidden layer decreases to approximately thirteen. Although overfitting cannot be observed in all MLP architectures tested, it is recommended that a simple MLP architecture be selected. Simple MLP architectures have fewer neurons while achieving results comparable to complex architectures, as can be seen in Figs. 4 and 5. The computed standard deviation showed large variability in error estimates. Since initial weights were the primary difference between the training parameters of MLPs with the same architecture, the large variation suggests that this parameter had a significant influence on MLP training for this dataset.

The same training, test and validation plots were used to generate two estimates of biomass: one from AVNIR-2 (multispectral) data and a second from PALSAR (SAR, L band) data. Although several authors highlight the potential of Synthetic Aperture Radar (SAR) for forest AGB estimation (Patenaude et al., 2005; Hyde et al., 2007), the results obtained with AVNIR-2 data reinforce the importance of passive optical data for estimating forest biophysical parameters with MLP models.

The use of SAR X and P bands from the OrbiSAR-1 sensor with multivariable regression techniques for biomass estimation of Eucalyptus saligna stands in Brazil (Gama et al., 2010) achieved results similar to those obtained here with the PALSAR L band. On the other hand, better results were found for biomass estimation of Eucalyptus saligna in Australia and Pinus pinaster in France with SAR L-band data (Le Toan et al., 1992; Austin et al., 2003), when linear and nonlinear regression analyses were used. However, methods for eucalyptus AGB prediction still needed to be improved to allow proper estimates over large areas, towards precision forest management.

In the present study, RMSE improved from 6.47 t/ha or 5.17% (AVNIR-2 data) to 3.59 t/ha or 2.87% when combining AVNIR-2 and PALSAR data. Even when restricting the number of plots used for training, i.e. using only 10% of plots for training, results here were superior to those obtained by the abovementioned authors. In this case, the MLP showed an RMSE of 9.25 t/ha or 7.35% with R² equal to 0.64. However, estimated errors were highly dispersed (−18 to 20%). Using 20% of plots for training, R² increased to 0.66, but RMSE and error dispersion also increased (9.42% and from −32 to 28%, respectively). Although more

Fig. 4. Behavior of RMSE validation and training by numbers of neurons and Logistic function in the hidden layer and activation functions in the output layer to estimate biomass.

Fig. 5. Behavior of RMSE validation and training by numbers of neurons and Tanh function in the hidden layer and activation functions in the output layer to estimate biomass.

behavior and trends of each MLP architecture to assist in choosing the best MLP (Figs. 4 and 5). Model validation reveals better biomass estimates when using AVNIR-2 (multispectral) and PALSAR (SAR) data together, followed by AVNIR-2 alone and PALSAR alone (Table 2). Moreover, model accuracy when using data from both sensors was extremely high; errors from −8 to 4% within plots distributed over 837.12 km² were achieved with MLP 8-3-1 and 50% of data used for training (Table 2 and Fig. 6). Global sensitivity analysis of MLP shows the importance level of

Table 2
Multilayer Perceptron (MLP) model accuracies for Eucalyptus stand biomass estimation, Minas Gerais State, Brazil, using AVNIR-2 and PALSAR data, MLP combinations, and different percentages of training data.

Sensor              MLP     Training data (%)   RMSE (%)   R²     Errors range (%)
AVNIR-2             4-5-1   50                  5.17       0.86   −8 to 22
PALSAR              4-4-1   50                  8.93       0.52   16 to 22
AVNIR-2 & PALSAR    8-6-1   10                  7.35       0.64   −18 to 20
AVNIR-2 & PALSAR    8-2-1   20                  9.42       0.66   −32 to 28
AVNIR-2 & PALSAR    8-2-1   30                  5.63       0.79   −12 to 22
AVNIR-2 & PALSAR    8-3-1   40                  3.04       0.94   −8 to 8
AVNIR-2 & PALSAR    8-3-1   50                  2.87       0.95   −8 to 4


Fig. 6. Scatterplot of predicted versus observed biomass values and histogram of residuals for the validation sample, for different combinations of AVNIR-2 and PALSAR data.

Table 3
Global sensitivity analysis of MLP to estimate Eucalyptus biomass in Brazil, using AVNIR-2 and PALSAR data and different percentages of training data. Independent variables in descending order of importance from left to right.

Data source and training data (%)   Independent variables in descending order of importance
AVNIR-2 – 50%                       G     NIR   R     B
PALSAR – 50%                        LVV   LHH   LVH   LHV
AVNIR-2 and PALSAR – 10%            LVH   LVV   LHH   LHV   NIR   G     R     B
AVNIR-2 and PALSAR – 20%            G     B     NIR   R     LVV   LHV   LVH   LHH
AVNIR-2 and PALSAR – 30%            G     LVV   R     NIR   LVH   LHH   B     LHV
AVNIR-2 and PALSAR – 40%            G     LVH   LHH   LVV   R     NIR   B     LHV
AVNIR-2 and PALSAR – 50%            B     LVH   R     NIR   LVV   LHV   G     LHH

efficiency to estimate biomass using both SAR and multispectral data. The MLPs showed an RMSE of 3.82 t/ha (3.04%) and 3.59 t/ha (2.87%), respectively, and normally distributed errors according to the Shapiro-Wilk test, with associated p-values of 0.608 and 0.114, respectively (Table 2 and Fig. 6).
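The accuracy figures reported here pair an absolute RMSE (t/ha) with a relative RMSE (%). The text does not state the formulas, so the sketch below rests on two assumptions: that RMSE% is the RMSE divided by the mean observed biomass of the validation sample, and that R² is the squared Pearson correlation between observed and estimated values. Under the first assumption, 3.59 t/ha and 2.87% would imply a validation-sample mean of roughly 125 t/ha. The function names and sample values are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rmse_percent(observed, predicted):
    """RMSE relative to the mean observed value, in percent (assumed definition)."""
    mean_observed = sum(observed) / len(observed)
    return 100.0 * rmse(observed, predicted) / mean_observed


def r_squared(observed, predicted):
    """Squared Pearson correlation (assumed definition of R2)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical biomass values in t/ha, not data from the study.
observed = [100.0, 120.0, 140.0]
predicted = [102.0, 118.0, 141.0]
print(rmse(observed, predicted), rmse_percent(observed, predicted))
```

If R² were instead defined as one minus the ratio of residual to total sum of squares, the value could differ from the squared correlation whenever predictions are biased, so the definition matters when comparing with other studies.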

plots were used for this training, the loss of predictive performance may be related to the low representativeness of the randomly selected plots. When using 30% of plots for training, the fit showed a considerable gain, with an RMSE of 7.08 t/ha or 5.63%. Results using 40% and 50% of plots for training prove MLP


Acknowledgements

Results obtained in this study were far superior to those achieved by regression analysis using the same dataset. The multiple linear regression model produced a multiple R² of 0.71 for biomass (Oliveira, 2011), while the MLP showed an R² of 0.95 with 50% of data used for training, an outstanding improvement in accuracy and in data demand. Diamantopoulou (2005) and Özçelik et al. (2013) also compared ANN and regression analysis for modelling forest parameters and concluded that ANN is more effective in handling various problems, e.g. non-linear relationships, non-normal distributions, outliers and noise.

For each MLP a global sensitivity analysis was performed, in which the model error obtained when a variable was set to its mean was divided by the model error obtained when the variable was used normally. In this analysis, higher values indicate more important variables. In descending order of importance, the variables are: AVNIR-2 green band (G); PALSAR L band vertical-horizontal polarization (LVH); PALSAR L band vertical-vertical polarization (LVV); AVNIR-2 near infrared band (NIR); PALSAR L band horizontal-horizontal polarization (LHH); AVNIR-2 blue band (B); AVNIR-2 red band (R); and PALSAR L band horizontal-vertical polarization (LHV) (Table 3). This conclusion was based on how often each variable occupied a given position in the MLPs' sensitivity analyses. The first position received a score of eight and the last a score of zero. Next, we ranked all variables based on their scores across all MLPs.

The importance of the AVNIR-2 green band may be explained by the correlation between reflectance at this wavelength (around 500 nm) and the amount of photosynthetic material (Jensen, 2007). Photosynthetic activity is strongly related to the AGB produced in Eucalyptus plantations. Since NIR reflectance is related to leaf cell structural properties, e.g. epidermis and mesophyll layer (Jensen, 2007), the invariance among Eucalyptus clone trees may have caused this effect. The importance of LVH polarization backscatter is a response to variation in forest structural complexity (foliage, twigs, and small and large branches), which is also strongly correlated with AGB. In addition, the third most important variable, LVV backscatter, is mainly affected by tree trunks, where most AGB is stored in clonal eucalypt plantations. Based on that, it is possible to infer the importance of integrating multispectral and SAR data to retrieve AGB in eucalyptus stands, whose different information patterns among variables were efficiently addressed by ANN. The other variables were similar to one another in level of importance.
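The global sensitivity analysis and the positional scoring described in this section can be sketched as follows. This is a hedged illustration, not the Statistica implementation: the model is any hypothetical callable, RMSE is assumed as the error measure, and the scoring scheme (a fixed top score decreasing by one per position, with nothing added for variables absent from an ordering) is an assumption, since the text gives only the first and last scores.

```python
import math


def model_rmse(model, rows, targets):
    """RMSE of a model (a callable on one row) over a set of rows."""
    squared = [(model(r) - t) ** 2 for r, t in zip(rows, targets)]
    return math.sqrt(sum(squared) / len(squared))


def sensitivity_ratios(model, rows, targets, names):
    """Error with one variable fixed at its mean, divided by the baseline error."""
    baseline = model_rmse(model, rows, targets)
    if baseline == 0:
        raise ValueError("baseline error is zero; the ratio is undefined")
    ratios = {}
    for j, name in enumerate(names):
        mean_j = sum(r[j] for r in rows) / len(rows)
        frozen = [list(r[:j]) + [mean_j] + list(r[j + 1:]) for r in rows]
        ratios[name] = model_rmse(model, frozen, targets) / baseline
    return ratios


def rank_by_position(orderings, top_score=8):
    """Rank variables by summed positional scores over several orderings."""
    totals = {}
    for ordering in orderings:
        for position, name in enumerate(ordering):
            totals[name] = totals.get(name, 0) + (top_score - position)
    return sorted(totals, key=lambda n: (-totals[n], n))


def toy_model(row):
    """Hypothetical model in which x0 matters far more than x1."""
    return 3.0 * row[0] + 0.1 * row[1]


rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 15.0], [4.0, 5.0]]
noise = [0.1, -0.1, 0.1, -0.1]
targets = [toy_model(r) + e for r, e in zip(rows, noise)]
print(sensitivity_ratios(toy_model, rows, targets, ["x0", "x1"]))
print(rank_by_position([["a", "b", "c"], ["b", "a", "c"], ["b", "c", "a"]], top_score=3))
```

A ratio well above one means the model degrades sharply when the variable is replaced by its mean, which is why higher values mark more important variables. Because the scoring scheme here is assumed, applying it to Table 3 is not guaranteed to reproduce the ranking reported in the text.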

We thank F. S. Oliveira for providing the dataset used in this work. This research was supported by the National Council for Scientific and Technological Development (CNPq) in Brazil.

References

Almeida, J.S., 2002. Predictive non-linear modeling of complex data by artificial neural networks. Curr. Opin. Biotechnol. 13, 72–76. https://doi.org/10.1016/S0958-1669(02)00288-4.
Austin, J.M., Mackey, B.G., Van Niel, K.P., 2003. Estimating forest biomass using satellite radar: an exploratory study in a temperate Australian Eucalyptus forest. For. Ecol. Manage. 176, 575–583. https://doi.org/10.1016/S0378-1127(02)00314-6.
Bagan, H., Kinoshita, T., Yamagata, Y., 2012. Combination of AVNIR-2, PALSAR, and polarimetric parameters for land cover classification. IEEE Trans. Geosci. Remote Sens. 50, 1318–1328. https://doi.org/10.1109/TGRS.2011.2164806.
Corbane, C., Lang, S., Pipkins, K., Alleaume, S., Deshayes, M., García Millán, V.E., Strasser, T., Vanden Borre, J., Toon, S., Michael, F., 2015. Remote sensing for mapping natural habitats and their conservation status – New opportunities and challenges. Int. J. Appl. Earth Obs. Geoinf. 37, 7–16. https://doi.org/10.1016/j.jag.2014.11.005.
de Araujo Barbosa, C.C., Atkinson, P.M., Dearing, J.A., 2015. Remote sensing of ecosystem services: a systematic review. Ecol. Indic. 52, 430–443. https://doi.org/10.1016/j.ecolind.2015.01.007.
Del Frate, F., Solimini, D., 2004. On neural network algorithms for retrieving forest biomass from SAR data. IEEE Trans. Geosci. Remote Sens. 42, 24–34. https://doi.org/10.1109/TGRS.2003.817220.
Diamantopoulou, M.J., 2005. Artificial neural networks as an alternative tool in pine bark volume estimation. Comput. Electron. Agric. 48, 235–244. https://doi.org/10.1016/j.compag.2005.04.002.
Djiongo Kenfack, C.B., Monga, O., Mpong, S.M., Ndoundam, R., 2018. Quaternion-based texture analysis of multiband satellite images: application to the estimation of aboveground biomass in the east region of Cameroon. Acta Biotheor. 66, 17–60. https://doi.org/10.1007/s10441-018-9317-z.
Dobson, M.C., 2000. Forest information from synthetic aperture radar. J. For. 98, 41–43. https://doi.org/10.1093/jof/98.6.41.
Englhart, S., Keuck, V., Siegert, F., 2011. Aboveground biomass retrieval in tropical forests—The potential of combined X- and L-band SAR data use. Remote Sens. Environ. 115, 1260–1271. https://doi.org/10.1016/j.rse.2011.01.008.
Galidaki, G., Zianis, D., Gitas, I., Radoglou, K., Karathanassi, V., Tsakiri-Strati, M., Woodhouse, I., Mallinis, G., 2017. Vegetation biomass estimation with remote sensing: focus on forest and other wooded land over the Mediterranean ecosystem. Int. J. Remote Sens. 38, 1940–1966. https://doi.org/10.1080/01431161.2016.1266113.
Gama, F.F., Dos Santos, J.R., Mura, J.C., 2010. Eucalyptus biomass and volume estimation using interferometric and polarimetric SAR data. Remote Sens. 2, 939–956. https://doi.org/10.3390/rs2040939.
Hawkins, D.M., 2004. The problem of overfitting. J. Chem. Inf. Comput. Sci. 44, 1–12. https://doi.org/10.1021/ci0342472.
Holopainen, M., Vastaranta, M., Hyyppä, J., 2014. Outlook for the next generation’s precision forestry in Finland. Forests 5, 1682–1694. https://doi.org/10.3390/f5071682.
Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer feedforward networks are universal approximators. Neural Networks 2, 359–366. https://doi.org/10.1016/0893-6080(89)90020-8.
Hyde, P., Nelson, R., Kimes, D., Levine, E., 2007. Exploring LiDAR–RaDAR synergy—predicting aboveground biomass in a southwestern ponderosa pine forest using LiDAR, SAR and InSAR. Remote Sens. Environ. 106, 28–38. https://doi.org/10.1016/j.rse.2006.07.017.
IBÁ, 2015. Indústria brasileira de Árvores. Anu. estatístico da IBA ano base 2014 (2015), 100. https://doi.org/10.1007/s13398-014-0173-7.2.
Jensen, J.R., 2007. Remote Sensing of the Environment: An Earth Resource Perspective, second ed. Prentice Hall, Upper Saddle River, New Jersey.
Koch, B., 2010. Status and future of laser scanning, synthetic aperture radar and hyperspectral remote sensing data for forest biomass assessment. ISPRS J. Photogramm. Remote Sens. 65, 581–590. https://doi.org/10.1016/j.isprsjprs.2010.09.001.
Le Toan, T., Beaudoin, A., Riom, J., Guyon, D., 1992. Relating forest biomass to SAR data. IEEE Trans. Geosci. Remote Sens. 30, 403–411. https://doi.org/10.1109/36.134089.
Lopez, R.D., Frohn, R.C., 2018. Remote Sensing for Landscape Ecology. Monitoring, Modeling, and Assessment of Ecosystems, second ed. CRC Press, Boca Raton, Florida.
Muukkonen, P., Heiskanen, J., 2005. Estimating biomass for boreal forests using ASTER satellite data combined with standwise forest inventory data. Remote Sens. Environ. 99, 434–447. https://doi.org/10.1016/j.rse.2005.09.011.
Oliveira, F.S. De, 2011. Uso de Imagens de satélite Alos para estimativa de parâmetros dendrométricos de plantios de eucalipto. PhD thesis. Federal University of Viçosa, Viçosa, Brazil.
Özçelik, R., Diamantopoulou, M.J., Crecente-Campo, F., Eler, U., 2013. Estimating Crimean juniper tree height using nonlinear regression and artificial neural network models. For. Ecol. Manage. 306, 52–60. https://doi.org/10.1016/j.foreco.2013.06.009.
Patenaude, G., Milne, R., Dawson, T.P., 2005. Synthesis of remote sensing approaches for forest carbon estimation: reporting to the Kyoto Protocol. Environ. Sci. Policy 8, 161–178. https://doi.org/10.1016/j.envsci.2004.12.010.

5. Conclusions

Artificial neural networks (ANN) and integrated AVNIR-2 and PALSAR data have proven to be effective tools for estimating Eucalyptus above-ground biomass (AGB). Multilayer Perceptron (MLP) artificial neural networks were efficient for modeling remote sensing data to estimate the AGB of Eucalyptus stands in southeastern Brazil. They are flexible, and their results were far superior to those obtained with regression analysis on the same dataset. In addition, 103 plots were sufficient to train artificial neural networks to estimate biomass with high accuracy over an area of 83,712 ha, reaching an RMSE of 3.59 t/ha (2.87%), normally distributed errors and an R² of 0.95. The MLP estimated biomass more effectively when using AVNIR-2 multispectral data alone (R² = 0.86, RMSE% = 5.17) than when using PALSAR L-band data alone (R² = 0.52, RMSE% = 8.93). However, the use of both AVNIR-2 and PALSAR data resulted in the most efficient model, as shown above (R² = 0.95, RMSE% = 2.87). The most important variables for Eucalyptus AGB estimation were the green band and LVH polarization, showing how the integration of multispectral and SAR data improved AGB estimation using ANN.

Declaration of Competing Interest

The authors declared that there is no conflict of interest.

Shimada, M., Itoh, N., Watanabe, M., Moriyama, T., Tadono, T., 2006. PALSAR initial calibration and validation results. In: Meynart, R., Neeck, S.P., Shimoda, H. (Eds.), p. 636103. https://doi.org/10.1117/12.689363.
Song, C., 2013. Optical remote sensing of forest leaf area index and biomass. Prog. Phys. Geogr. Earth Environ. 37, 98–113. https://doi.org/10.1177/0309133312471367.
Stape, J.L., Binkley, D., Ryan, M.G., Fonseca, S., Loos, R.A., Takahashi, E.N., Silva, C.R., Silva, S.R., Hakamada, R.E., Ferreira, J.M. de A., Lima, A.M.N., Gava, J.L., Leite, F.P., Andrade, H.B., Alves, J.M., Silva, G.G.C., Azevedo, M.R., 2010. The Brazil Eucalyptus Potential Productivity Project: Influence of water, nutrients and stand uniformity on wood production. For. Ecol. Manage. 259, 1684–1694. https://doi.org/10.1016/j.foreco.2010.01.012.
StatSoft, Inc., 2013. STATISTICA (data analysis software system), version 12.0. www.statsoft.com.
Tetko, I.V., Livingstone, D.J., Luik, A.I., 1995. Neural network studies. 1. Comparison of overfitting and overtraining. J. Chem. Inf. Model. 35, 826–833. https://doi.org/10.1021/ci00027a006.
Vaglio Laurin, G., Liesenberg, V., Chen, Q., Guerriero, L., Del Frate, F., Bartolini, A., Coomes, D., Wilebore, B., Lindsell, J., Valentini, R., 2013. Optical and SAR sensor synergies for forest and land cover mapping in a tropical site in West Africa. Int. J. Appl. Earth Obs. Geoinf. 21, 7–16. https://doi.org/10.1016/j.jag.2012.08.002.
Vaglio Laurin, G., Puletti, N., Hawthorne, W., Liesenberg, V., Corona, P., Papale, D., Chen, Q., Valentini, R., 2016. Discrimination of tropical forest types, dominant species, and mapping of functional guilds by hyperspectral and simulated multispectral Sentinel-2 data. Remote Sens. Environ. 176, 163–176. https://doi.org/10.1016/j.rse.2016.01.017.
Verrelst, J., Muñoz, J., Alonso, L., Delegido, J., Rivera, J.P., Camps-Valls, G., Moreno, J., 2012. Machine learning regression algorithms for biophysical parameter retrieval: opportunities for Sentinel-2 and -3. Remote Sens. Environ. 118, 127–139. https://doi.org/10.1016/j.rse.2011.11.002.
White, J.C., Coops, N.C., Wulder, M.A., Vastaranta, M., Hilker, T., Tompalski, P., 2016. Remote sensing technologies for enhancing forest inventories: a review. Can. J. Remote Sens. 42, 619–641. https://doi.org/10.1080/07038992.2016.1207484.
