Modeling of eucalyptus productivity with artificial neural networks



Industrial Crops & Products 146 (2020) 112149

Contents lists available at ScienceDirect

Industrial Crops & Products journal homepage: www.elsevier.com/locate/indcrop

Eliane Cristina Sampaio de Freitas a,*, Haroldo Nogueira de Paiva b, Júlio César Lima Neves c, Gustavo Eduardo Marcatti d, Helio Garcia Leite b

a Departamento de Ciência Florestal, Universidade Federal Rural de Pernambuco, Manuel de Medeiros Street, 870, Dois Irmãos, Recife, Pernambuco, 52171-900, Brazil
b Departamento de Engenharia Florestal, Universidade Federal de Viçosa, Purdue Avenue, no number, University Campus, Reinaldo de Jesus Araújo Building, Viçosa, Minas Gerais, 36570-900, Brazil
c Departamento de Solos, Universidade Federal de Viçosa, Peter Henry Rolfs Avenue, no number, University Campus, Viçosa, Minas Gerais, 36570-900, Brazil
d Departamento de Ciências Agrárias, Universidade Federal de São João del-Rei, Sete Lagoas Campus, Highway MG 424 - Km 47, Sete Lagoas, Minas Gerais, 35701-970, Brazil

ARTICLE INFO

Keywords: Meteorological factors; Genotype; Spacing; Edaphic factors; Fertilization; Climatic factors

ABSTRACT

Although it is easy to list the factors that influence forest productivity, it is almost impossible to isolate or measure all the biotic and abiotic components that influence crop growth and development on a field scale. Artificial neural networks (ANN) have been widely used to model forest productivity, since conventional modeling is hampered by the inclusion of categorical variables, by the large number of independent variables, and by their complex relationships with the dependent variable. This study aimed to obtain ANN to estimate eucalyptus productivity as a function of environmental variables, genotype, and silvicultural practices, and to infer which of these are most important for forest productivity. Data from continuous forest inventory, climate, soil analysis, and fertilization were used, covering 507 eucalyptus stands with different genotypes and spacings. Multilayer perceptron networks were trained to estimate the mean annual increment of eucalyptus stands at six years of age (MAI6), testing different combinations of input variables, numbers of neurons in the hidden layer, training algorithms, percentages of data in the training and validation subsets, and activation functions. In validation, ANN were obtained with a correlation between estimated and observed MAI6 higher than 85 % and a root mean square error below 15 %. Despite the complexity of the data, ANN made it possible to estimate MAI6 with good precision and to include numerous variables easily, even categorical ones. Genotype, spacing, edaphic characteristics (clay, organic matter, and cation exchange capacity, CEC), climatic characteristics (rainfall, temperature, and water deficit), and fertilization were the predictor variables that most influenced eucalyptus productivity at the end of the rotation.

* Corresponding author.
E-mail addresses: [email protected] (E.C.S.d. Freitas), [email protected] (H.N.d. Paiva), [email protected] (J.C.L. Neves), [email protected] (G.E. Marcatti), [email protected] (H.G. Leite).

https://doi.org/10.1016/j.indcrop.2020.112149
Received 12 September 2018; Received in revised form 30 December 2019; Accepted 18 January 2020
0926-6690/ © 2020 Elsevier B.V. All rights reserved.

1. Introduction

Among cultivated tree species, the genus Eucalyptus stands out. Its plantations have expanded to meet the demand for wood products because of its plasticity in relation to environmental conditions, rapid growth, diversity of uses, and the extensive knowledge of its silviculture. According to the FAO, in 2000, forest plantations covered 187 million hectares worldwide, of which almost 18 million were eucalyptus. In 2009, this area increased to over 20 million hectares (Git Forestry Consulting, 2009). In Brazil alone, the area currently planted with this genus is almost 6 million hectares (IBÁ, 2019).

Forest productivity is the result of the interaction between edaphic, climatic, physiographic, and biotic factors (Campos, 1970), and is influenced by forest management and silvicultural practices. Although it is relatively simple to enumerate the factors influencing forest growth, understanding and evaluating the sum of the interactions among these factors and their effects on the site is complex (Braga et al., 1999). Billings (1952) points out that, from the ecological point of view, no environmental factor can be very significant when studied alone. However, it is practically impossible to isolate or measure, on a field scale, all the biotic and abiotic components that influence crop development (Montezano et al., 2006).

Several studies have shown the influence of genotype, spacing, edaphic and climatic variables, and fertilization on forest productivity, either in isolation or through the interaction of some of these factors, by means of growth and production models or ecophysiological models (Berger et al., 2002; Stape et al., 2004a, b; Ortiz et al., 2006; Ferreira, 2009; Borges, 2012; Campoe et al., 2013; Marcatti et al., 2017). Growth and production models are a set of mathematical relationships that quantitatively describe a particular system, for example, forest stands (Berger et al., 2002). Simulating the growth of a single tree or stand allows its behavior to be analyzed under certain conditions, providing important information for the forest manager's decisions (Maestri et al., 2013).

Among the ecophysiological models, 3-PG stands out; it makes it possible to predict the productive potential of the forest as a function of environmental variables and management practices (Almeida et al., 2004; Stape et al., 2004b; Guimarães et al., 2007; Ferreira, 2009; Borges, 2012). Although generalist, these models are complex and require a large number of parameters, which limits their practical application (Maestri et al., 2013). Another important consideration is that the potential productivity obtained by these models is generally not achieved because of possible nutritional and water limitations during the cutting cycle (Whitehead and Beadle, 2004). For descriptive models, adding environmental variables explains more of the variability and improves their generalization capacity (Maestri et al., 2013). However, conventional modeling is hampered by the inclusion of categorical variables, as well as by the large number of predictor variables and their complex, usually non-linear, relationships with the predicted variable.

Thus, Artificial Neural Networks (ANN) are currently an additional tool for modeling the relationship between forest productivity and the biotic and abiotic factors that influence it (Lafetá, 2012; Özçelik et al., 2013; Binoti et al., 2013, 2015a; Alcântara, 2015; Medeiros, 2016; Campos et al., 2016; Silva Ribeiro et al., 2016; Martins et al., 2016; Vahedi, 2017). An ANN is a parallel computing system composed of several simple processing elements (artificial neurons) that are connected in a specific way to perform a given task (Bullinaria, 2014). Among the advantages of ANN, Haykin (2001) mentions the ability to learn and generalize, which allows complex problems to be solved, and noise tolerance. Among the tasks solved by ANN, function approximation is the modeling of the relationship between the variables of a system from a known set of its representative values (Silva et al., 2010). ANN are capable of modeling complex functions, such as non-linear functions with a large number of variables, including categorical variables.

With ANN, it is possible to relate productivity to biotic and abiotic factors, as well as to identify whether the adopted practices will guarantee adequate development of forest stands (Alcântara, 2015; Medeiros, 2016). Monitoring and managing forest data, which vary in space and time, allows accurate interventions in forests, aiming at maximum yield according to soil potential and environmental factors (Brandelero et al., 2007). Thus, this study aimed to configure, train, and validate artificial neural networks to estimate the productivity of eucalyptus stands in Minas Gerais, Brazil, using environmental factors (climatic and edaphic), genotype, spacing, and fertilization as input variables, and to infer which of these most influence forest productivity.

2. Material and methods

Data were obtained from 507 eucalyptus stands, ranging from 3 to 79 ha, belonging to one forest company that has plantations in different regions of Minas Gerais state, Brazil (Fig. 1). This number resulted from a consistency analysis of the data received from the company and, after the different datasets were compiled, from the selection of the stands that had all the information needed to carry out the study.

Fig. 1. Location of the eucalyptus stands centroids.

There is no methodology for determining the optimal data size for predictive modeling with ANN; in other words, this problem is solved through empirical investigation. The amount of data required for ANN training is related to the complexity of the function the network will model. However, studies on similar problems can indicate the size of the dataset needed to train an ANN. Another important point is that the data sample should be representative of the problem studied, since ANN learn only from the observations presented. Thus, different sizes of the training dataset were evaluated in this study.

The data refer to the mean annual increment in wood volume with bark, estimated by continuous forest inventory (CFI) started at three years of age. Information on soil physical and chemical analyses, performed before the third year of the forest stands, and on fertilization carried out during the cutting cycle was added to the CFI database. As the soil analyses were made in temporary plots, before the CFI started, they do not coincide with the permanent plots of the inventory. Thus, the data were worked at the stand level, using the averages of the plots of each stand. These datasets were processed by the company and provided for the study.

Meteorological data were obtained for the stands, considering the geographical location of their centroids, from a database developed by Xavier et al. (2015), who generated a grid for Brazil (pixel resolution of 0.25° × 0.25°, latitude × longitude) with daily data from 1 January 1980 to 31 December 2013. With the rainfall and potential evapotranspiration data, the normal (1980–2013) and sequential (covering the eight years prior to obtaining MAI6) climatological water balances were calculated using the Thornthwaite and Mather (1955) method, exemplified by Pereira et al. (2002), with an available water capacity (AWC) of 300 mm, as suggested by the same authors. For the water deficit and rainfall variables, the six years of rotation and the two years prior to planting were considered, following Alcântara (2015), since water availability does not depend exclusively on the climatic conditions of the evaluation year. For the other climatic variables, the means of the six years of rotation were used, since their values varied little among those years. The averages from 1980 to 2013 were also obtained for the climatic variables. Table 1 summarizes the data used to train the artificial neural networks.

The trained networks were of the Multilayer Perceptron (MLP) type, consisting of two layers of artificial neurons that process the data (intermediate layer and output layer) and one layer of artificial neurons that receives the data (input layer) and directs them to the intermediate layer. The software NeuroForest (Binoti et al., 2015b) was used to obtain the artificial neural networks.

Before training the ANN, Pearson correlation analysis was performed among the climatic variables and among the edaphic variables through correlation matrices. Thus, in addition to networks including all variables, networks without highly correlated variables were trained. This resulted in numerous combinations of continuous input variables.

The networks were trained by varying the input variables (networks with all variables; networks without the highly correlated variables, aiming to reduce the number of input variables and the redundancy of information; and networks using annual values, the sum, or the mean of the 8, 7, or 6 years prior to obtaining MAI6 for the rainfall and water deficit variables); the number of neurons in the hidden layer (the number of neurons in the input layer varied according to the number of independent variables considered in each configuration); the training algorithm (Backpropagation, Resilient Propagation (Rprop+), and Quick Propagation); the percentage of data in the training subset (90 to 50 %, with the remainder used for validation), with random selection; and the activation function (logistic, hyperbolic tangent, or linear) in the intermediate layer (except the linear function) and in the output layer. For each configuration, 250 networks were trained, totaling 85,750 networks.

The chosen algorithms are the most used in function approximation problems in the forest area. Backpropagation is one of the best-known algorithms; however, it has some disadvantages, such as slow convergence, which led to the development of other algorithms, such as Resilient Propagation and Quick Propagation.

Obtaining the best ANN topology requires numerous attempts until satisfactory results are generated. As a starting point, a simpler initial configuration was chosen: one hidden layer with a defined number of neurons (half the sum of the number of input units). If the performance achieved in training was not acceptable, more neurons were added to the hidden layer. Conversely, if networks with a larger number of hidden-layer neurons performed well in training but poorly in validation, overfitting (memorization of the training data) had probably occurred, and the number of neurons was reduced. The stopping criteria were the number of cycles (between 2000 and 5000) and the mean error (0.0001); that is, network training was interrupted when either criterion was reached.

The networks were selected based on the correlation between the observed and estimated MAI6 ($r_{y\hat{y}}$), the root mean square error in percentage, RMSE (%) (Eq. 1), and the percent frequency histogram of the percentage errors (Eq. 2).
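The training setup described in the Methods (one hidden layer, logistic activations, Resilient Propagation, and stopping at a maximum number of cycles or a target mean error) can be illustrated with a minimal NumPy sketch. This is not the NeuroForest implementation, which the text does not detail. The function name `train_mlp_rprop`, the use of the mean squared error as the "mean error", the weight initialization range, the Rprop step constants (commonly used defaults), and the assumption that targets are scaled to the interval (0, 1) are all illustrative choices.

```python
import numpy as np

def logistic(z):
    # Clipping avoids overflow in exp for very large |z|.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -60.0, 60.0)))

def train_mlp_rprop(X, y, n_hidden, max_cycles=3000, target_error=1e-4, seed=0):
    """Full-batch training of an n_in:n_hidden:1 MLP with Rprop+ (assumed settings)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float).reshape(-1, 1)   # targets assumed scaled to (0, 1)
    ones = np.ones((X.shape[0], 1))
    Xb = np.hstack([X, ones])                       # inputs plus bias column
    weights = [rng.uniform(-0.5, 0.5, (X.shape[1] + 1, n_hidden)),
               rng.uniform(-0.5, 0.5, (n_hidden + 1, 1))]
    steps = [np.full(w.shape, 0.1) for w in weights]        # per-weight step sizes
    prev_grads = [np.zeros(w.shape) for w in weights]
    prev_updates = [np.zeros(w.shape) for w in weights]

    def forward():
        hidden = logistic(Xb @ weights[0])
        hidden_b = np.hstack([hidden, ones])
        return hidden, hidden_b, logistic(hidden_b @ weights[1])

    cycles = 0
    while cycles < max_cycles:                      # stopping criterion 1: cycles
        hidden, hidden_b, out = forward()
        err = out - y
        if np.mean(err ** 2) <= target_error:       # stopping criterion 2: mean error
            break
        # Gradients of the mean squared error by backpropagation.
        delta_out = 2.0 * err * out * (1.0 - out) / len(y)
        delta_hid = (delta_out @ weights[1][:-1].T) * hidden * (1.0 - hidden)
        grads = [Xb.T @ delta_hid, hidden_b.T @ delta_out]
        for k, grad in enumerate(grads):
            product = grad * prev_grads[k]
            # Same gradient sign: enlarge the step; sign change: shrink it.
            steps[k] = np.where(product > 0, np.minimum(steps[k] * 1.2, 50.0), steps[k])
            steps[k] = np.where(product < 0, np.maximum(steps[k] * 0.5, 1e-6), steps[k])
            # On a sign change, revert the previous update (weight backtracking).
            update = np.where(product < 0, -prev_updates[k], -np.sign(grad) * steps[k])
            weights[k] = weights[k] + update
            prev_updates[k] = update
            prev_grads[k] = np.where(product < 0, 0.0, grad)
        cycles += 1
    final_mse = float(np.mean((forward()[2] - y) ** 2))
    return weights, cycles, final_mse
```

Because Rprop uses only the sign of each partial derivative, the step size of every weight adapts independently, which is one reason it tends to converge faster than plain Backpropagation. Calling the function with different values of `n_hidden` and `max_cycles` and comparing training and validation errors mirrors the trial-and-error topology search described in the text.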

$$\mathrm{RMSE}\,(\%) = \frac{100}{\bar{y}} \sqrt{\frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{n}} \qquad (1)$$

$$\mathrm{Error}\,(\%) = \frac{\hat{y}_i - y_i}{y_i} \times 100 \qquad (2)$$

in which $\bar{y}$ is the mean of the observed values; $n$ is the total number of observations; $y_i$ is the observed value; and $\hat{y}_i$ is the estimated value. The root mean square error in percentage evaluates the accuracy of the estimates (the smaller the value, the more accurate the estimate); the correlation indicates the degree and direction of the association between the estimated and observed MAI6.

3. Results

Of all the configurations, the 10 best networks were selected (Table 2), all of which showed $r_{y\hat{y}}$ higher than 85 % in validation. All networks were obtained with the Resilient Propagation (Rprop+) training algorithm using the logistic function, both in the hidden layer (6 or 8 neurons) and in the output layer. The best $r_{y\hat{y}}$ and RMSE (%) results were obtained using 90 % of the data for training.

The best networks were those in which, for the rainfall and water deficit variables, the eight years (six years of rotation and two years prior to planting) were considered, instead of only the six years of rotation. The network considering all the edaphic (0–20 cm depth layer) and climatic variables (ANN14) showed results similar to those of the networks that excluded the highly correlated variables. ANN44, which took into account the climatological normals (1980–2013), also showed satisfactory results. However, the ANN53 and ANN99 networks, with the lowest RMSE (%) and the highest correlation coefficient in validation, respectively, were trained with the annual rainfall of the eight years prior to obtaining MAI6. The difference between these two networks lay in the edaphic variables and the number of cycles used as the stopping criterion.

According to the frequency histograms of the percentage errors of the MAI6 estimates in validation (Fig. 2), approximately 50 % of the errors of ANN53 were concentrated within ± 5 %. All the selected networks had more than 75 % of their percentage errors between −15 % and +15 %, and ANN99 showed the highest frequency in this range (85 %). A map of the MAI6 estimated by the ANN53 network was developed for the 507 eucalyptus stands (Fig. 3); the lowest MAI6 values (18.7–28 m³ ha−1 year−1) were predominantly observed in the northern region of Minas Gerais state.

4. Discussion

Several studies have shown that artificial neural networks estimate variables related to forest production more precisely than regression models (Özçelik et al., 2010; Diamantopoulou, 2012; Bhering et al., 2015; Silva Ribeiro et al., 2016; Vahedi, 2017). Özçelik et al. (2013), when estimating the height of Juniperus trees as a function of diameter, observed that ANN reduced RMSE by more than 20 % compared with a non-linear regression model. Among the advantages of the networks, these authors highlight the ability to detect complex non-linear relationships between dependent and independent variables, with no need for prior knowledge of the function to be estimated.
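The statistics used in this study to compare networks (correlation between observed and estimated MAI6, RMSE (%), and the share of percentage errors inside a given band) follow directly from their definitions. The sketch below shows one way to compute them with NumPy; the function names and the small example values are hypothetical and are not taken from the study data.

```python
import numpy as np

def rmse_percent(observed, estimated):
    """Root mean square error as a percentage of the mean observed value."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return 100.0 / observed.mean() * np.sqrt(np.mean((observed - estimated) ** 2))

def percent_errors(observed, estimated):
    """Percentage error of each estimate relative to its observed value."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return (estimated - observed) / observed * 100.0

def correlation(observed, estimated):
    """Pearson correlation between observed and estimated values."""
    return float(np.corrcoef(observed, estimated)[0, 1])

def share_within(errors, limit):
    """Percentage of observations whose percentage error lies within +/- limit."""
    return 100.0 * float(np.mean(np.abs(np.asarray(errors, dtype=float)) <= limit))

# Hypothetical MAI6 values (m3 ha-1 year-1), for illustration only.
observed = [30.0, 40.0, 50.0]
estimated = [33.0, 38.0, 50.0]
errors = percent_errors(observed, estimated)
```

A correlation reported as "higher than 85 %" corresponds to a coefficient above 0.85 from `correlation`, and a statement such as "more than 75 % of the percentage errors between −15 % and +15 %" corresponds to `share_within(errors, 15)` exceeding 75.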

Table 1 Variables used to train artificial neural networks.

Output variable | Unit | Minimum | Quartile 1 | Quartile 2 | Quartile 3 | Maximum
MAI6 - Mean annual increment at 6 years of age | m³ ha−1 year−1 | 16.43 | 27.54 | 31.96 | 36.54 | 50.99

Categorical variables | Categories
Genotype | Clone1; Clone2; Clone3; Clone4; Clone5; Clone6
Spacing (m) | 3.5 × 2.0; 3.5 × 2.5; 3.0 × 3.0; 3.0 × 2.5; 4.0 × 2.0; 3.0 × 2.0

Continuous variables | Unit | Minimum | Quartile 1 | Quartile 2 | Quartile 3 | Maximum
TS - Total sand (0−20 cm) | % | 8.78 | 39.84 | 69.52 | 78.26 | 87.70
CS - Coarse sand (0−20 cm) | % | 3.68 | 15.16 | 27.06 | 35.35 | 72.40
FS - Fine sand (0−20 cm) | % | 3.06 | 18.36 | 28.29 | 45.86 | 66.12
CLA - Clay (0−20 cm) | % | 6.00 | 14.00 | 20.00 | 50.00 | 76.00
SIL - Silt (0−20 cm) | % | 1.90 | 6.58 | 8.92 | 11.11 | 36.69
m - Aluminum saturation (0−20 cm) | % | 0.72 | 50.91 | 66.00 | 77.72 | 93.90
V - Base saturation (0−20 cm) | % | 0.81 | 3.92 | 7.39 | 12.4 | 49.17
P - Available phosphorus (0−20 cm) | mg dm−3 | 0.47 | 1.48 | 1.88 | 2.34 | 16.86
K - Available potassium (0−20 cm) | mg dm−3 | 2.52 | 17.84 | 25.39 | 33.62 | 239.08
pH (H2O) - pH measured in water (0−20 cm) | – | 3.99 | 4.6 | 4.77 | 4.97 | 6.14
pH (CaCl2) - pH measured in solution of CaCl2 (0−20 cm) | – | 3.80 | 4.03 | 4.10 | 4.19 | 5.10
CEC - Cation exchange capacity at pH 7.0 (0−20 cm) | cmolc dm−3 | 2.19 | 3.73 | 5.15 | 7.72 | 17.16
H + Al - Potential acidity (0−20 cm) | cmolc dm−3 | 1.82 | 3.30 | 4.46 | 7.26 | 16.83
Al - Exchangeable acidity (0−20 cm) | cmolc dm−3 | 0.02 | 0.48 | 0.67 | 1.01 | 2.76
Mg - Exchangeable magnesium (0−20 cm) | cmolc dm−3 | 0.01 | 0.09 | 0.13 | 0.19 | 0.92
Ca - Exchangeable calcium (0−20 cm) | cmolc dm−3 | 0.01 | 0.10 | 0.18 | 0.29 | 1.87
OM - Organic matter (0−20 cm) | dag kg−1 | 0.70 | 1.34 | 2.07 | 3.15 | 8.17
Prem - Remaining phosphorus (0−20 cm) | mg L−1 | 2.01 | 11.33 | 22.41 | 29.81 | 47.93
Rain1 - Rainfall of the fifth year after planting | mm year−1 | 732.92 | 938.56 | 1067.76 | 1241.59 | 1946.25
Rain2 - Rainfall of the fourth year after planting | mm year−1 | 620.68 | 801.82 | 1053.23 | 1370.56 | 2037.49
Rain3 - Rainfall of the third year after planting | mm year−1 | 620.68 | 801.82 | 1157.90 | 1333.00 | 2137.08
Rain4 - Rainfall of the second year after planting | mm year−1 | 620.68 | 783.54 | 1207.13 | 1384.08 | 1878.03
Rain5 - Rainfall of the first year after planting | mm year−1 | 557.37 | 1024.38 | 1184.64 | 1323.46 | 2137.08
Rain6 - Rainfall of the planting year | mm year−1 | 677.52 | 924.01 | 1110.67 | 1286.92 | 1741.40
Rain7 - Rainfall of the year prior to planting | mm year−1 | 647.24 | 1050.42 | 1139.99 | 1359.53 | 1915.64
Rain8 - Rainfall of two years prior to planting | mm year−1 | 647.24 | 924.01 | 999.91 | 1272.41 | 1682.24
Tmax - Maximum temperature (average of 6 years of planting) | °C | 25.19 | 29.56 | 30.04 | 30.82 | 31.10
Tmin - Minimum temperature (average of 6 years of planting) | °C | 14.43 | 17.57 | 17.95 | 18.64 | 19.14
Tmea - Mean temperature (average of 6 years of planting) | °C | 20.03 | 23.61 | 24.00 | 24.79 | 25.12
WS - Wind speed (average of 6 years of planting) | m s−1 | 1.17 | 1.29 | 1.33 | 1.38 | 1.73
Etp - Potential evapotranspiration (average of 6 years of planting) | mm year−1 | 1191.81 | 1478.4 | 1515.26 | 1555.46 | 1615.72
RH - Relative humidity (average of 6 years of planting) | % | 62.98 | 63.66 | 66.67 | 67.32 | 78.35
SRa - Solar radiation (average of 6 years of planting) | MJ m−2 | 16.35 | 19.37 | 19.55 | 19.59 | 20.45
WD1 - Water deficit of the first year after planting | mm year−1 | 185.20 | 406.01 | 548.15 | 624.63 | 857.42
WD2 - Water deficit of the second year after planting | mm year−1 | 146.52 | 343.31 | 421.03 | 736.54 | 1000.66
WD3 - Water deficit of the third year after planting | mm year−1 | 76.62 | 324.52 | 543.27 | 753.95 | 1000.66
WD4 - Water deficit of the fourth year after planting | mm year−1 | 122.66 | 354.38 | 475.09 | 753.95 | 1000.66
WD5 - Water deficit of the fifth year after planting | mm year−1 | 184.67 | 412.83 | 485.89 | 685.00 | 869.22
WD6 - Water deficit of the planting year | mm year−1 | 146.88 | 373.86 | 524.92 | 735.49 | 915.52
WD7 - Water deficit of the year prior to planting | mm year−1 | 65.39 | 354.38 | 420.50 | 479.41 | 915.52
WD8 - Water deficit of two years prior to planting | mm year−1 | 78.49 | 387.96 | 497.66 | 532.40 | 838.73
Rain_nor - Rainfall (average from 1980 to 2013) | mm year−1 | 812.63 | 861.5 | 1153.33 | 1194.62 | 1513.39
Tmax_nor - Maximum temperature (average from 1980 to 2013) | °C | 25.87 | 29.53 | 29.91 | 30.50 | 30.75
Tmin_nor - Minimum temperature (average from 1980 to 2013) | °C | 14.66 | 17.08 | 17.53 | 18.56 | 18.99
Tmea_nor - Mean temperature (average from 1980 to 2013) | °C | 20.32 | 23.3 | 23.72 | 24.59 | 24.87
WS_nor - Wind speed (average from 1980 to 2013) | m s−1 | 1.18 | 1.24 | 1.26 | 1.38 | 1.44
Etp_nor - Potential evapotranspiration (average from 1980 to 2013) | mm year−1 | 1182.81 | 1401.67 | 1435.15 | 1537.24 | 1582.27
RH_nor - Relative humidity (average from 1980 to 2013) | % | 63.84 | 64.70 | 69.75 | 70.00 | 78.02
WD_nor - Water deficit (average from 1980 to 2013) | mm year−1 | 111 | 339 | 377 | 672 | 799
SRa_nor - Solar radiation (average from 1980 to 2013) | MJ m−2 | 16.26 | 18.61 | 18.88 | 19.08 | 19.26
N_fert - Total N added during the 6 years of rotation | kg ha−1 | 0.00 | 18.71 | 21.40 | 30.60 | 60.93
P_fert - Total P2O5 added during the 6 years of rotation | kg ha−1 | 0.00 | 141.92 | 156.02 | 178.95 | 325.81
K_fert - Total K2O added during the 6 years of rotation | kg ha−1 | 53.71 | 141.25 | 158.81 | 215.05 | 512.14
Ca_fert - Total CaO added during the 6 years of rotation | kg ha−1 | 0.00 | 338 | 498.79 | 542.03 | 1100.78
Mg_fert - Total MgO added during the 6 years of rotation | kg ha−1 | 0.00 | 71.79 | 90 | 134.91 | 255.45
S_fert - Total S added during the 6 years of rotation | kg ha−1 | 0.00 | 0 | 0 | 0 | 45.92
B_fert - Total B added during the 6 years of rotation | kg ha−1 | 0.00 | 2.77 | 3.55 | 4.25 | 9.26
Cu_fert - Total Cu added during the 6 years of rotation | kg ha−1 | 0.00 | 0 | 0.41 | 0.75 | 5.75
Zn_fert - Total Zn added during the 6 years of rotation | kg ha−1 | 0.00 | 0.75 | 1.49 | 3.09 | 7.72

Most studies use dendrometric data in the ANN training base, which generates better correlation and precision of the estimates. Binoti et al. (2015a) predicted the production of eucalyptus stands using quantitative (age, basal area, volume) and categorical variables (soil class, texture, spacing, relief, design, clone). These authors obtained RMSE below 5 % and correlation above 95 %, a result similar to that of Alcântara (2015), who used meteorological and CFI data in training. Using environmental variables in addition to CFI data as input variables in an ANN is important because it accounts for atypical natural effects, such as a year with high water deficit, and makes it possible to simulate predictions of nonstandard behavior (Alcântara, 2015). Although less accurate than networks that include CFI data when estimating forest productivity, networks that consider only environmental, genotype, and management factors allow a better analysis of the influence of these factors on productivity. These networks also allow productivity to be estimated in locations with no inventory data or with no plantations (Alcântara, 2015).

In relation to the best ANN settings, other studies have also demonstrated better performance of the Resilient Propagation algorithm in solving forest problems. Martins et al. (2016) also observed good performance of this algorithm when estimating eucalyptus height. The number of neurons in the hidden layer and the number of cycles were altered according to the occurrence of undertraining or overtraining. According to Martins et al. (2016), a high number of cycles can lead to overtraining and increase processing time, whereas a reduced number of cycles can lead to undertraining. The same applies to the number of neurons in the hidden layer and the number of hidden layers.

The percentage of data used to train the network influenced the accuracy of the MAI6 estimates and the association between the estimated and observed values in validation. As the amount of data used in training was reduced, RMSE (%) increased and $r_{y\hat{y}}$ decreased. The need for a higher percentage of data for training (90 %) may be due to the large number of input variables: a larger training set covers more of their variability and improves the network's capacity to generalize. Alcântara (2015) observed the same when using meteorological data and categorical variables (clone, spacing, age, rotation, predominant soil) to estimate MAI7 of eucalyptus stands in Minas Gerais. This author recommended using the entire database to generate better productivity estimates when CFI data are not among the input variables.

With a larger share of the database used to train the networks, they presented good accuracy in estimating MAI6 and good correlation with the observed data, considering that no CFI data were used as predictor variables and that the data were at stand level. Alcântara (2015), with no CFI data as predictor variables and using plot-level data, obtained networks with 12 % RMSE in training and 22 % in validation, using 90 % of the data for training, and the correlation in validation was approximately 54 %. The better MAI estimates in this study, compared with that work, may be due to the inclusion of fertilization and edaphic factors as input variables, since they have great influence on forest productivity.

Most of the selected networks took into account one of the following variables of the 0–20 cm soil layer: clay, organic matter, or CEC, which are strongly correlated. Gava and Gonçalves (2008) evaluated the effect of soil physical and chemical attributes on the productivity and quality of Eucalyptus grandis wood (6.5–7 years of age) and concluded that, among the soil physical attributes, clay content, which is directly related to the amount of water and nutrients available, was the most relevant. These authors did not find a significant correlation between timber production and P, Ca, and K contents; however, there was a high correlation with soil organic matter, exchangeable Al, and CEC.

All the selected networks considered the amounts of nutrients added during the rotation, and when networks were trained with the same settings as the best networks but without fertilization as an input variable, they showed lower $r_{y\hat{y}}$ and higher RMSE. Silva et al. (2016) observed a 24 % increase in the annual increment of fertilized eucalyptus stands (N, P, K, Ca, and Mg), and this response was influenced by soil characteristics, mainly clay and organic matter contents. The greatest response to fertilization occurred in regions with no water deficit and with higher contents of organic matter and of clay, which allows greater moisture retention. Other studies with forest species also show that the response to fertilization is greater where water is available. Campoe et al. (2013) observed that fertilization plus irrigation increased the leaf area index by 79 % in Pinus stands, whereas fertilization without irrigation increased it by only 57 %, both compared with the control treatment.

As observed in the cited studies, water availability has a great influence on the use of environmental resources by tree species and, consequently, on forest productivity. Stape et al. (2004a) evaluated wood production in eucalyptus stands as a function of water availability for two years and observed its influence on tree growth. Growth in the control treatment (no irrigation) corresponded to 86 % and 42 % of the growth in irrigated plots in the year of higher precipitation (1770 mm year−1) and in the year of lower precipitation (1210 mm year−1), respectively. Otto et al. (2014), in regions with an average annual rainfall of 1390 mm, also observed that net primary production in clonal eucalyptus plots without irrigation was 17–36 % lower than in irrigated plots.

The lowest MAI6 values (< 30 m³ ha−1 year−1), observed in northern Minas Gerais (Fig. 3), may be due to lower rainfall (mean of 800 mm year−1) compared with the other regions of the study, which underlines the importance of this environmental factor for eucalyptus productivity. Lemos Filho et al. (2007), analyzing 17 years of meteorological data, observed that the lowest annual average rainfall values in the state of Minas Gerais were recorded in the North region, reaching a minimum of 512 mm year−1 in the extreme north, while the highest values were recorded in the South region, reaching a maximum of 1748 mm year−1 at the southern end.

Sette et al. (2010) evaluated the diameter increment of Eucalyptus grandis trees over 24 months in relation to climatic variables and mineral fertilization. They observed that the period of maximum trunk growth was related to high levels of precipitation, temperature, soil water availability, and daylight hours. During the evaluation period, the rate of diameter increment was high in one month with low precipitation and temperature, as a result of the absorption of water stored in the deeper soil layers. According to the authors, trees express trunk growth in response to climatic variables after a time interval, considered a lag period. In this study, it was possible to perceive the importance of taking into account the meteorological data of a period prior to planting, as well as the annual meteorological data, since, when only the climatological normals are considered, a change in forest productivity caused by a climatic extreme in a particular year may be erroneously attributed to other factors.

In relation to the categorical variables used to train the networks, spacing and genotype were very important to obtain estimates more

Table 2 Characteristics and accuracy of artificial neural networks (ANN) selected to estimate MAI6 (mean annual increment at 6 years of age) of eucalyptus stands in Minas Gerais.

For all networks: training data = 90 %; validation data = 10 %; categorical variables = genotype and spacing; training algorithm = Resilient Propagation (Rprop+); activation function = logistic in both the hidden layer and the output layer.

| Code | Number of cycles | MLP | Train ryŷ | Train RMSE (%) | Validation ryŷ | Validation RMSE (%) |
| ANN66 | 3000 | 61:8:1 | 0.9137 | 8.9209 | 0.8704 | 11.4007 |
| ANN14 | 3000 | 48:8:1 | 0.8887 | 10.0522 | 0.8585 | 11.4044 |
| ANN16 | 3000 | 30:8:1 | 0.8636 | 11.0679 | 0.8503 | 11.6922 |
| ANN68 | 3000 | 47:8:1 | 0.8153 | 12.6230 | 0.8621 | 11.9371 |
| ANN29 | 3000 | 37:8:1 | 0.9071 | 9.2803 | 0.8583 | 10.8172 |
| ANN44 | 3000 | 34:8:1 | 0.9076 | 9.2578 | 0.8621 | 11.5354 |
| ANN74 | 3000 | 34:6:1 | 0.8864 | 10.1323 | 0.8633 | 11.8688 |
| ANN53 | 3000 | 28:6:1 | 0.8258 | 12.3178 | 0.8876 | 11.1252 |
| ANN99 | 2000 | 36:8:1 | 0.8851 | 10.2569 | 0.8525 | 11.2832 |
| ANN74 | 2000 | 35:8:1 | 0.8875 | 9.9585 | 0.8927 | 11.3177 |

Continuous variables of each network:
ANN66 (61:8:1): CS; FS; CLA; SIL; m; V; P; K; pH(H2O); pH(CaCl2); CEC; H + Al; Al; Mg; Ca; OM; Prem; Rain1; Rain2; Rain3; Rain4; Rain5; Rain6; Rain7; Rain8; Tmax; Tmin; WS; Etp; RH; SRa; WD1; WD2; WD3; WD4; WD5; WD6; WD7; WD8; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN14 (48:8:1): CS; FS; CLA; SIL; m; V; P; K; pH(H2O); pH(CaCl2); CEC; H + Al; Al; Mg; Ca; OM; Prem; WS; WD1; WD2; WD3; WD4; WD5; WD6; WD7; WD8; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN16 (30:8:1): CLA; P; K; pH(H2O); Mg; Ca; Rain_sum (8 years); Tmax; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN68 (47:8:1): CS; FS; CLA; SIL; m; V; P; K; pH(H2O); pH(CaCl2); CEC; H + Al; Al; Mg; Ca; OM; Prem; Rain_nor; Tmax_nor; Tmin_nor; WS_nor; Etp_nor; RH_nor; SRa_nor; WD_nor; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN29 (37:8:1): P; K; pH(H2O); Mg; Ca; OM; Rain1; Rain2; Rain3; Rain4; Rain5; Rain6; Rain7; Rain8; Tmin; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN44 (34:8:1): V; P; K; CEC; Rain1; Rain2; Rain3; Rain4; Rain5; Rain6; Rain7; Rain8; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN74 (34:6:1): CLA; m; P; K; WD1; WD2; WD3; WD4; WD5; WD6; WD7; WD8; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN53 (28:6:1): CLA; m; P; K; Rain_sum (8 years); Tmin; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN99 (36:8:1): P; K; pH(H2O); Mg; Ca; OM; WD1; WD2; WD3; WD4; WD5; WD6; WD7; WD8; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert
ANN74 (35:8:1): V; P; K; CEC; Rain1; Rain2; Rain3; Rain4; Rain5; Rain6; Rain7; Rain8; Tmin; N_fert; P_fert; K_fert; Ca_fert; Mg_fert; S_fert; B_fert; Cu_fert; Zn_fert

Genotype (six clones). Spacing (3.5 × 2.0; 3.5 × 2.5; 3.0 × 3.0; 3.0 × 2.5; 4.0 × 2.0; 3.0 × 2.0). Edaphic variables (0−20 cm depth layer): CS - Coarse sand; FS - Fine sand; CLA - Clay; SIL - Silt; m - Aluminum saturation; V - Base saturation; P - Phosphorus; K - Potassium; pH(H2O) - pH measured in water; pH(CaCl2) - pH measured in CaCl2 solution; CEC - Cation exchange capacity at pH 7.0; H + Al - Potential acidity; Al - Exchangeable acidity; Mg - Magnesium; Ca - Calcium; OM - Organic matter; Prem - Remaining phosphorus. Climatic variables: Rain1 to Rain8 and WD1 to WD8 - annual rainfall and water deficit, respectively, of the eight years prior to obtaining MAI6; Rain_sum (8 years) - total rainfall of the six years of rotation plus the two years prior to planting; Tmax - Maximum temperature, Tmin - Minimum temperature, WS - Wind speed, Etp - Potential evapotranspiration, RH - Relative humidity, SRa - Solar radiation: average of the 6 years of planting; Rain_nor - Rainfall, Tmax_nor - Maximum temperature, Tmin_nor - Minimum temperature, WS_nor - Wind speed, Etp_nor - Potential evapotranspiration, RH_nor - Relative humidity, WD_nor - Water deficit, SRa_nor - Solar radiation: average from 1980 to 2013. Fertilization: N_fert - N, P_fert - P2O5, K_fert - K2O, Ca_fert - CaO, Mg_fert - MgO, S_fert - S, B_fert - B, Cu_fert - Cu, Zn_fert - Zn, added during the 6 years of rotation.
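The two accuracy statistics reported in Table 2 can be illustrated with a short sketch. It assumes that ryŷ is the Pearson correlation between observed and estimated MAI6 and that RMSE (%) is the root mean square error expressed as a percentage of the mean observed value. This part of the text does not state the formulas, so both definitions, the function names and the sample values below are assumptions made only for illustration.

```python
import math


def pearson_r(observed, estimated):
    """Pearson correlation between observed and estimated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_o * var_e)


def rmse_percent(observed, estimated):
    """RMSE as a percentage of the mean observed value (assumed definition)."""
    n = len(observed)
    mse = sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n
    return 100.0 * math.sqrt(mse) / (sum(observed) / n)


# Hypothetical MAI6 values (m3 ha-1 year-1); not data from the study.
observed = [30.0, 40.0, 50.0, 60.0]
estimated = [33.0, 38.0, 52.0, 57.0]

r = pearson_r(observed, estimated)
rmse = rmse_percent(observed, estimated)
print(f"r = {r:.4f}, RMSE = {rmse:.2f} %")
```

Computing both statistics separately on the training and validation subsets, as in Table 2, helps to show whether a network that fits the training data well also generalizes to stands it has not seen.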


Fig. 2. Frequency of percentage error class of the MAI6 estimates obtained with validation data subset.


Fig. 3. Average productivity of eucalyptus stands at 6 years of age (MAI6), in MG, estimated by the artificial neural network ANN53.

accurate, since the networks that did not include these variables showed a lower correlation and a greater difference between the observed and estimated MAI6 values. Binoti et al. (2015a) used ANN to project eucalyptus volume and observed that, among the categorical variables (soil class, texture, spacing, relief, design and genotype), genotype was selected in all networks.

Although all the selected networks (using all the climate and soil variables; eliminating the highly correlated variables; using climatological normals) showed similar results, preference should be given to those with fewer variables that are easy to obtain, which optimizes data processing. In addition, it is recommended to consider the climatic variables of the rotation period as well as data prior to planting.

The possibility of including numerous predictive variables, even categorical ones, makes ANN a more generalist tool. Thus, with the trained network it is possible to obtain the average productivity of eucalyptus in areas where there are no plantations, using values for the predictive variables within the maximum and minimum limits used to train the network, which can reduce errors in decision making about forest plantations in a given location. Obtaining ANN without continuous forest inventory (CFI) data as predictor variables is of practical importance, as it supports the elaboration of strategic and tactical management plans, in which productivity estimates are needed at the age defined for production regulation (Alcântara, 2015). Understanding the influence of environmental factors on productivity enables the forest manager to choose the best location for planting, select the best genotypes for a given condition, and manage the forests to make better use of available resources.
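The recommendation to apply a trained network only to predictor values within the maximum and minimum limits of the training data can be sketched as a simple range check. This is a minimal illustration and not the procedure used in the study: the function names are hypothetical, the variable names are borrowed from Table 2, and all numeric values are invented.

```python
def training_limits(rows):
    """Return {variable: (min, max)} over the training rows (a list of dicts)."""
    limits = {}
    for row in rows:
        for name, value in row.items():
            lo, hi = limits.get(name, (value, value))
            limits[name] = (min(lo, value), max(hi, value))
    return limits


def out_of_range(candidate, limits):
    """List the variables of a candidate site that fall outside the training limits."""
    return [
        name
        for name, value in candidate.items()
        if not (limits[name][0] <= value <= limits[name][1])
    ]


# Hypothetical training records; the values are invented for illustration.
training_rows = [
    {"CLA": 35.0, "Rain_sum": 8200.0, "Tmin": 15.1},
    {"CLA": 52.0, "Rain_sum": 9900.0, "Tmin": 16.4},
    {"CLA": 18.0, "Rain_sum": 6400.0, "Tmin": 17.0},
]
limits = training_limits(training_rows)

# Two hypothetical candidate sites without plantations.
inside = {"CLA": 40.0, "Rain_sum": 7000.0, "Tmin": 16.0}
outside = {"CLA": 60.0, "Rain_sum": 7000.0, "Tmin": 14.0}

print(out_of_range(inside, limits))   # variables to review before estimating MAI6
print(out_of_range(outside, limits))
```

A site that returns an empty list lies inside the range covered by the training data for every variable checked; any listed variable signals that an estimate for that site would be an extrapolation.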

5. Conclusions

Artificial neural networks allow MAI6 of eucalyptus stands to be estimated with good precision, despite the data complexity, and allow the easy inclusion of many predictive variables, even categorical ones. Among the most important explanatory variables for estimating eucalyptus productivity at the end of rotation are genotype, spacing, edaphic characteristics (clay, organic matter and CEC), climatic characteristics (rainfall, temperature and water deficit), and fertilization performed during rotation. Eucalyptus productivity in the northern region of Minas Gerais state, Brazil, may be limited by the water deficit.

Credit author statement

The authors have made a substantial contribution to this study. PAIVA, NEVES and LEITE conceived the project and provided data. PAIVA, LEITE, MARCATTI and FREITAS determined the data analysis methods. FREITAS executed the methodology and interpreted the data. All authors contributed to the writing, critical review, and discussion of the results, in particular FREITAS, PAIVA and LEITE.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This study was financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES) Finance Code 001. The data of eucalyptus stands was provided by Gerdau S.A.

References

Alcântara, A.E.M., 2015. Redes neurais artificiais para prognose do crescimento e da produção de povoamentos de eucalipto em Minas Gerais. Universidade Federal de Viçosa, Viçosa.
Almeida, A.C., Landsberg, J.J., Sands, P.J., 2004. Parameterisation of 3-PG model for fast-growing Eucalyptus grandis plantations. For. Ecol. Manage. 193, 179–195. https://doi.org/10.1016/j.foreco.2004.01.029.
Berger, R., Schneider, P.R., Finger, C.A.G., Haselein, C.R., 2002. Efeito do espaçamento e da adubação no crescimento de um clone de Eucalyptus saligna Smith. Ciência Florestal 12, 75–87. https://doi.org/10.5902/198050981682.
Bhering, L.L., Cruz, C.D., Peixoto, L.D.A., Rosado, A.M., Laviola, B.G., Nascimento, M., 2015. Application of neural networks to predict volume in eucalyptus. Crop. Breed. Appl. Biotechnol. 15, 125–131. https://doi.org/10.1590/1984-70332015v15n3a23.
Billings, W.D., 1952. The environmental complex in relation to plant growth and distribution. Q. Rev. Biol. 27, 251–265.
Binoti, D.H.B., Binoti, M.L.M.S., Leite, H.G., 2013. Redução dos custos em inventário de povoamentos equiâneos. Rev. Bras. Ciências Agrárias - Braz. J. Agric. Sci. 8, 125–129. https://doi.org/10.5039/agraria.v8i1a2209.
Binoti, M.L.M.S., Leite, H.G., Binoti, D.H.B., Gleriani, J.M., 2015a. Prognose em nível de povoamento de clones de eucalipto empregando redes neurais artificiais. Cerne 21, 97–105. https://doi.org/10.1590/01047760201521011153.
Binoti, D.H.B., Leite, H.G., Lopes, P.F., Lopes, S.S.P., 2015b. Neuroforest: Sistema para geração e aplicação de Redes Neurais Artificiais. http://neuroforest.ucoz.com (accessed 20 April 2015).
Borges, J.S., 2012. Modulador edáfico para uso em modelo ecofisiológico e produtividade potencial de povoamentos de eucalipto. Universidade Federal de Viçosa, Viçosa.
Braga, F.A., Barros, N.F., Souza, A.L., Costa, L.M., 1999. Características ambientais determinantes da capacidade produtiva de sítios cultivados com eucalipto. Revista Brasileira de Ciência do Solo 23, 291–298. https://doi.org/10.1590/S0100-06831999000200013.
Brandelero, C., Antunes, M.U.F., Giotto, E., 2007. Silvicultura de precisão: nova tecnologia para o desenvolvimento florestal. Ambiência 3, 269–281.
Bullinaria, J.A., 2014. Introduction to Neural Computation. http://www.cs.bham.ac.uk/~jxb/inc.html (accessed 27 May 2015).
Campoe, O.T., Stape, J.L., Albaugh, T.J., Allen, H.L., Fox, T.R., Rubilar, R., Binkley, D., 2013. Fertilization and irrigation effects on tree level aboveground net primary production, light interception and light use efficiency in a loblolly pine plantation. For. Ecol. Manage. 288, 43–48. https://doi.org/10.1016/j.foreco.2012.05.026.
Campos, J.C.C., 1970. Principais fatores do meio que afetam o crescimento das árvores. Floresta 2, 45–52.
Campos, B.P.F., Silva, G.F., Binoti, D.H.B., Mendonça, A.R., Leite, H.G., 2016. Predição da altura total de árvores em plantios de diferentes espécies por meio de redes neurais artificiais. Pesqui. Florest. Bras. 36, 375–385. https://doi.org/10.4336/2016.pfb.36.88.1166.
Diamantopoulou, M.J., 2012. Assessing a reliable modeling approach of features of trees through neural network models for sustainable forests. Sustain. Comput. Inform. Syst. 2, 190–197. https://doi.org/10.1016/j.suscom.2012.10.002.
FAO - Food and Agriculture Organization of the United Nations, 2000. Global Data on Forest Plantations Resources. http://www.fao.org/docrep/004/Y2316E/y2316e0b.htm.
Ferreira, M.Z., 2009. Modelagem da influência de variáveis ambientais no crescimento e na produção de Eucalyptus sp. Universidade Federal de Lavras, Lavras.
Gava, J.L., Gonçalves, J.L.M., 2008. Soil attributes and wood quality for pulp production in plantations of Eucalyptus grandis clone. Sci. Agric. 65, 306–313. https://doi.org/10.1590/S0103-90162008000300011.
GIT Forestry Consulting, 2009. Eucalyptus Global Map. http://git-forestry-blog.blogspot.com/2008/09/eucalyptus-global-map-2008-cultivated.html.
Guimarães, D.P., Silva, G.G.C., Sans, L.M.A., Leite, F.P., 2007. Uso do modelo de crescimento 3-PG para o zoneamento do potencial produtivo do eucalipto no estado de Minas Gerais. Revista Brasileira de Agrometeorologia 15, 192–197.
Haykin, S., 2001. Neural Networks: Principles and Practice, 2 ed. Bookman, Porto Alegre.
IBÁ - Indústria Brasileira de Árvores, 2019. Annual Report 2019. https://iba.org/datafiles/publicacoes/relatorios/iba-relatorioanual2019.pdf.
Lafetá, B.O., 2012. Eficiência nutricional, área foliar e produtividade de plantações de eucalipto em diferentes espaçamentos estimados com redes neurais artificiais. Universidade Federal dos Vales do Jequitinhonha e Mucuri, Diamantina.
Lemos Filho, L.C.D.A., Carvalho, L.G.D., Evangelista, A.W., Carvalho, L.M.T.D., Dantas, A.A., 2007. Análise espaço-temporal da evapotranspiração de referência para Minas Gerais. Ciência e Agrotecnologia 31, 1462–1469. https://doi.org/10.1590/S1413-70542007000500029.
Maestri, R., Sanquetta, C.R., Scolforo, J.R., Machado, S.A., Corte, A.P.D., 2013. Modelagem do crescimento florestal considerando variáveis do ambiente: revisão. Scientia Agraria 14, 103–110.
Marcatti, G.E., Resende, R.T., Resende, M.D.V., Ribeiro, C.A.A., Santos, A.R., Cruz, J.P., Leite, H.G., 2017. GIS-based approach applied to optimizing recommendations of Eucalyptus genotypes. For. Ecol. Manage. 392, 144–153. https://doi.org/10.1016/j.foreco.2017.03.006.
Martins, E.R., Binoti, M.L.M.S., Leite, H.G., Binoti, D.H.B., Dutra, G.C., 2016. Configuração de redes neurais artificiais para estimação do afilamento do fuste de árvores de eucalipto. Rev. Bras. Ciências Agrárias - Braz. J. Agric. Sci. 11, 117–123. https://doi.org/10.5039/agraria.v11i1a5354.
Medeiros, R.A., 2016. Potencial produtivo, manejo e experimentação em povoamentos de Tectona grandis L.f. no Estado de Mato Grosso. Universidade Federal de Viçosa, Viçosa.
Montezano, Z.F., Corazza, E.J., Muraoka, T., 2006. Variabilidade espacial da fertilidade do solo em área cultivada e manejada homogeneamente. Revista Brasileira de Ciência do Solo 30, 839–847. https://doi.org/10.1590/S0100-06832006000500010.
Ortiz, J.L., Vettorazzi, C.A., Couto, H.T.Z., Gonçalves, J.L.M., 2006. Relações espaciais entre o potencial produtivo de um povoamento de eucalipto e atributos do solo e do relevo. Sci. For. 67–79.
Otto, M.S.G., Hubbard, R.M., Binkley, D., Stape, J.L., 2014. Dominant clonal Eucalyptus grandis x urophylla trees use water more efficiently. For. Ecol. Manage. 328, 117–121. https://doi.org/10.1016/j.foreco.2014.05.032.
Özçelik, R., Diamantopoulou, M.J., Brooks, J.R., Wiant, H.V., 2010. Estimating tree bole volume using artificial neural network models for four species in Turkey. J. Environ. Manage. 91, 742–753. https://doi.org/10.1016/j.jenvman.2009.10.002.
Özçelik, R., Diamantopoulou, M.J., Crecente-Campo, F., Eler, U., 2013. Estimating Crimean juniper tree height using nonlinear regression and artificial neural network models. For. Ecol. Manage. 306, 52–60. https://doi.org/10.1016/j.foreco.2013.06.009.
Pereira, A.R., Angelocci, L.R., Sentelhas, P.C., 2002. Agrometeorologia – Fundamentos e Aplicações. Agropecuária Publishing Company, Guaíba.
Sette Jr, C.R., Tomazello Filho, M., Dias, C.T.S., Laclau, J.P., 2010. Crescimento em diâmetro do tronco das árvores de Eucalyptus grandis W. Hill. ex. Maiden e relação com as variáveis climáticas e fertilização mineral. Revista Árvore 34, 979–990. https://doi.org/10.1590/S0100-67622010000600003.
Silva, I.N., Spatti, H.D., Flauzino, R.A., 2010. Redes Neurais Artificiais: para engenharia e ciências aplicadas. Artliber, São Paulo.
Silva, R.M.L., Hakamada, R.E., Bazani, J.H., Otto, M.S.G., Stape, J.L., 2016. Fertilization response, light use, and growth efficiency in Eucalyptus plantations across soil and climate gradients in Brazil. Forests 7, 1–12. https://doi.org/10.3390/f7060117.
Silva Ribeiro, R.B., Gama, J.R.V., Souza, A.L., Leite, H.G., Soares, C.P.B., Silva, G.F., 2016. Métodos para estimar o volume de fustes e galhos na Floresta Nacional do Tapajós. Revista Árvore 40, 81–88. https://doi.org/10.1590/0100-67622016000100009.
Stape, J.L., Binkley, D., Ryan, M.G., Gomes, A.N., 2004a. Water use, water limitation, and water use efficiency in a Eucalyptus plantation. Bosque 25, 35–41. https://doi.org/10.4067/S0717-92002004000200004.
Stape, J.L., Ryan, M.G., Binkley, D., 2004b. Testing the utility of the 3-PG model for growth of Eucalyptus grandis x urophylla with natural and manipulated supplies of water and nutrients. For. Ecol. Manage. 193, 219–234. https://doi.org/10.1016/j.foreco.2004.01.031.
Vahedi, A.A., 2017. Monitoring soil carbon pool in the Hyrcanian coastal plain forest of Iran: artificial neural network application in comparison with developing traditional models. Catena 152, 182–189. https://doi.org/10.1016/j.catena.2017.01.022.
Whitehead, D., Beadle, C.L., 2004. Physiological regulation of productivity and water use in Eucalyptus: a review. For. Ecol. Manage. 193, 113–140. https://doi.org/10.1016/j.foreco.2004.01.026.
Xavier, A.C., King, C.W., Scanlon, B.R., 2015. Daily gridded meteorological variables in Brazil (1980–2013). Int. J. Climatol. 36, 2644–2659. https://doi.org/10.1002/joc.4518.