Catena 149 (2017) 176–184
Contents lists available at ScienceDirect
Catena journal homepage: www.elsevier.com/locate/catena
Using residual analysis in electromagnetic induction data interpretation to improve the prediction of soil properties Chunfeng Lu a, Zhiwen Zhou b, Qing Zhu b,c,⁎, Xiaoming Lai b, Kaihua Liao b a b c
School of Geographic and Oceanographic Sciences, Nanjing University, Nanjing 210023, China Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing 210008, China Jiangsu Collaborative Innovation Center of Regional Modern Agriculture & Environmental Protection, Huaiyin Normal University, Huaian 223001, China
a r t i c l e
i n f o
Article history: Received 15 June 2016 Received in revised form 8 September 2016 Accepted 21 September 2016 Available online 9 November 2016 Keywords: Applied geophysics Data mining Hydropedology Spatial analysis
a b s t r a c t The electromagnetic induction (EMI) technique has been widely used to survey soil properties at intermediate spatial scales. However, EMI data interpretation remains a challenge for more accurate and robust mapping. Residual analysis is an alternative approach that can be used to improve the EMI data mining. On a tea garden (TG) hillslope and a bamboo forest (BF) hillslope, terrain indices were used to regress the apparent electrical conductivity (ECa) in ten repeated EMI surveys using stepwise multiple linear regressions (SMLR). Residuals of ECa in these regressions, which had terrain influence removed, were then calculated. The classification and regression tree (CART) model was adopted to quantify the relative contributions of terrain indices (elevation, slope, plane curvature–PLC, profile curvature–PRC, and topographic wetness index–TWI), static soil properties (rock fragment content–RFC, depth to bedrock–DB, contents of clay, silt and sand), and dynamic soil property (volumetric soil moisture–θ) to ECa and their residuals. Results show that contributions of terrain indices to ECa are around 50%. However, contributions of terrain indices to ECa residuals are b 20%, while great contributions of different soil properties to ECa residuals can be observed in some cases. On both hillslopes, better predicting accuracies were achieved when using ECa residuals as independent variables in SMLRs to predict soil properties than using only terrain indices or ECa as independent variables. Similarly, on both hillslopes, using terrain indices plus ECa residuals as independent variables also yield better prediction of θ than using only terrain indices or using terrain indices plus ECa as independent variables. Findings of this study indicate that residual analysis can be a useful technique in improving EMI data interpretation for estimating the spatial variations of soil properties. In cases that relationship between target soil properties and ECa readings are weak, this approach can probably be used to improve the mapping accuracy of the target soil properties. © 2016 Elsevier B.V. All rights reserved.
1. Introduction Spatial distributions of soil properties (e.g., soil moisture–θ, soil texture, depth to bedrock–DB and rock fragment content–RFC) are critical factors determining various hydrological and biogeochemical processes (e.g., subsurface flow, denitrification, and soil water storage) across spatial scales (e.g., Lin et al., 2005; Vereecken et al., 2008; Parn et al., 2012). At the intermediate spatial scales (e.g., hillslope and catchment scales with an area from 0.1–100 ha), soil sampling and in-situ observation usually lack of high spatio-temporal resolution since they are costly Abbreviations: BF, bamboo forest; CART, classification and regression tree; DB, depth to bedrock; ECa, apparent electrical conductivity; EMI, electromagnetic induction; PLC, plan curvature; PRC, profile curvature; RFC, rock fragment content; SMLR, stepwise multiple linear regression; TG, tea garden; TWI, topographic wetness index. ⁎ Corresponding author at: Key Laboratory of Watershed Geographic Sciences, Nanjing Institute of Geography and Limnology, Chinese Academy of Sciences, Nanjing 210008, China. E-mail address:
[email protected] (Q. Zhu).
http://dx.doi.org/10.1016/j.catena.2016.09.018 0341-8162/© 2016 Elsevier B.V. All rights reserved.
and time consuming. In addition, at the intermediate spatial scales, remote sensing is also restricted by spatial resolution of the satellite sensor, as well as the weather, surface roughness and vegetation cover conditions (Zhu et al., 2012; Mulder et al., 2011; McColl et al., 2012). Therefore, developing effective tools and approaches is urgently needed to acquire the spatial information of soil properties at the intermediate scales. Electromagnetic induction (EMI) is gaining great acceptance as a way of obtaining spatially distributed apparent electrical conductivity (ECa) that can be used to map soil properties (e.g., salinity, water content and clay content) at the intermediate spatial scale (e.g., Corwin and Lesch, 2003; Robinson et al., 2009, 2012; Neely et al., 2016). Details about the working principles of EMI can be found in studies by Corwin and Lesch (2003) and Doolittle and Brevik (2014). To date, various EMI instruments (e.g., EM meters by Geonics Ltd., Mississauga, ON, Canada; DUALEM meters by Dualem Inc., Milton, ON, Canada; Profiler sensor by Geophysical Survey Systems, Inc., Salem, NH, USA) have been commercially produced. They can be integrated with GPS and data
C. Lu et al. / Catena 149 (2017) 176–184
loggers to perform continuous or station-to-station collection of georeferenced ECa. Influencing factors of soil ECa are complex and spatio-temporal dynamic, which results in the ambiguous integral EMI signal and hinders the straightforward deviation of soil properties. Salinity, clay, mineralogy, θ, and soil temperature are direct influencing factors of soil ECa since they alter the conductivity of the soil bulk electrical conductivity (Archie, 1942; Rhoades et al., 1976). Other soil properties, for example organic matter, indirectly affect ECa since they influence soil properties that are directly related to bulk electrical conductivity (McNeill, 1980; Neely et al., 2016). In addition, terrain attributes (e.g., slope, curvature, and elevation) determined the spatial distributions of soil properties including texture, organic matter DB and θ (e.g., Western et al., 1999; Zhou et al., 2005; Zhu et al., 2010a, 2010b). Therefore, they also have great effects on soil ECa (Liao et al., 2014). The complicated and intertwined relationships among soil ECa, terrain attributes and soil properties hamper the interpretation of soil ECa data for mapping the target variable. For example, Mueller et al. (2003) and Doolittle and Brevik (2014) pointed out that when using EMI to map different soil properties (e.g., clay, pH, θ and ions), results were site/date-specific and varied depending on the complex interactions among multiple soil properties and terrain indices. Therefore, accurate interpretation of EMI data remains a challenge. Modelling the relationships between soil ECa and soil properties is critical in EMI-based soil mapping. To map the spatial variations of “dynamic” soil properties (e.g., θ or subsurface flow), previous studies usually calculated the differences among repeated EMI surveys to remove the influences from static factors. Robinson et al. (2012) subtracted the soil ECa map of the driest day from the ECa map collected during subsequent wetting to estimate relative changes in θ. Altdorff and Dietrich (2014) suggested that the standard deviations from normalized soil ECa collected on different dates can reflect the patterns of θ. When using EMI to map the “static” soil properties (soil texture, organic matter content and DB), simple statistical analyses (e.g., regression and correlation) were used and thus mixed results were reported (Mueller et al., 2003; Zhu et al., 2010b; Doolittle and Brevik, 2014). In addition, in many cases, relationships between soil ECa and target dynamic/static soil properties are difficult to interpret due to that they may not be the primary influencing factors of soil ECa. Usually, the primary influencing factors of soil ECa are the ones that spatially vary most in the study area (Doolittle and Brevik, 2014). Thus, advanced data mining approaches are needed to remove the influences of these most spatially varied factors on soil ECa and thus capture the relationship between soil ECa and target variables. The objective of this study was to investigate whether residual analysis can be used to capture the relationships between soil properties (e.g., particle size distribution, RFC, DB and θ) and soil ECa, which were potentially masked due to the strong influences of terrain on soil ECa. Based on these modelled relationships, this study also tested whether the spatial estimations of these soil properties can be improved on two different land use hillslopes. 2. Materials and methods 2.1. Study hillslopes This study was conducted on two typical land use (tea garden–TG and bamboo forest–BF) hillslopes (each has an area of 0.3 ha) in Taihu Lake Basin, China (31°21′N, 119°03′E) (Fig. 1). This area has a north subtropical to a middle subtropical transition monsoon climate with four distinctive seasons. The annual mean temperature is 15.9 °C and the annual mean precipitation is 1157 mm. The TG and BF hillslopes are adjacent to each other (Fig. 1). Green tea (Camellia sinensis (L.) O. Kuntze) and Moso bamboo (Phyllostachys edulis (Carr.) H. de Lehaie) are dominant on the TG and BF hillslopes, respectively. The elevation of the TG hillslope ranges from 80 to 88 m and the slope ranges from 2 to 21%,
177
while those of the BF hillslope ranges from 77 to 83 m and 0 to 19%, respectively. The dominant soil type on these two hillslopes is shallow lithosols according to the FAO soil classification. Parent material is quartz sandstone. Soils are described as silt loam texture with silt content N 60% and soil organic matter contents are about 2% in the surface layer on both hillslopes. The DB varies from b 0.3 m on the summit slope position to about 1.0 m on the foot slope position. Typical soil profiles on both hillslopes can be divided into three horizons: A (0–10 cm), B (10–40 cm) and C (N40 cm). When the tea plants were first planted on the TG hillslope 15 years ago, tillage was applied and large rocks were removed. Spacing between two rows is about 1–1.5 m wide. The annual N, P and K rates for the TG hillslope are 700–800, 150–200 and 300–400 kg ha−1, respectively. Organic fertilizer (rapeseed residuals after oil extraction) is usually applied by digging shallow trenches (b 0.2 m deep) between rows in November. In the tea growing season (from February to May), manure is broadcasted 2–3 times on the surface. After the tea leafs are harvested (late May), tea plants are pruned and the residuals are left on the surface of the soil. On the BF hillslope that has been established for 35 years, no fertilizer or tillage is applied. Bamboos are occasionally harvested once they are more than five years old. 2.2. EMI survey The EM38-MK2 meter (Geonics Ltd., Mississauga, ON, Canada) was used to measure soil ECa in this study. This instrument has two transmitter-receiver coil spacing of 0.5 and 1.0 m and can be operated in vertical (V) and horizontal (H) dipole orientations. Therefore, it has four different configurations including 0.5H, 0.5V, 1.0H and 1.0V. These four configurations have effective investigation depths of 0.38, 0.75, 0.75 and 1.5 m, respectively. Details of determining the integrated investigation depth can be found in McNeill (1980). Electromagnetic induction surveys were conducted ten times from March 2013 to March 2014 (March 27, May 10, May 13, May 15, July 31, September 26, October 18, and December 16 in 2013; February 26 and March 27 in 2014). In May 2013, EMI surveys were conducted right before (May 10) and after (May 13 and 15) a storm with 58-mm precipitation amount. The EM38-MK2 meter was hand-carried during these surveys and operated in the manual mode. The measurement spatial resolution was designed to be approximately 3 m and reading was taken at the center of each cell. This resulted in N200 measurement points for each hillslope (Fig. 1). All readings were converted to conductivities at a standard temperature of 25 °C using the function established by Reedy and Scanlon (2003). A calibration site was established on the ridgetop where exposed bedrock with strong electrically resistive nature and very low ECa. Before each survey, the EM38-MK2 meter was calibrated 1.5 m above the ground at this calibration site following the user manual. 2.3. Soil properties and terrain attributes Soil properties considered in this study were classified into static and dynamic properties. Particle size distribution, organic matter content and DB are static soil properties, while θ at different depths is dynamic soil property. For monitoring θ, access polyvinyl chloride tubes were installed at 39 and 38 sites on the TG and BF hillslopes, respectively, with a spatial resolution of about 8 m (Fig. 1). A portable time-domain reflectometry TRIME-PICO-IPH soil moisture probe (IMKO, Ettlingen, Germany) was used to measure θ on the same days that the EMI surveys were conducted. Readings were measured at four depth intervals each time: 0 to 0.2, 0.2 to 0.4, 0.4 to 0.6 and 0.6–0.8 m (note that the TRIME-PICO-IPH probe has a length of 0.18 m) to represent the depths of 0.1, 0.3, 0.5, and 0.7 m, respectively. However, due to the shallow soil depths at some locations, only 37, 20 and 2 sites had soil moisture readings recorded at 0.3, 0.5 and 0.7-m depths on the TG hillslope, respectively. Similarly, only 36,
178
C. Lu et al. / Catena 149 (2017) 176–184
Fig. 1. The geographic location of the study tea garden and bamboo forest hillslopes and their soil sampling, soil moisture collection and EMI survey sites. In the figure, TDR (time-domain reflectometry) sites are the sites for soil sampling and soil moisture collection.
27 and 8 sites had soil moisture readings recorded at 0.3, 0.5 and 0.7-m depths on the BF hillslope, respectively. For all measurements, the TRIME-PICO-IPH probe was twisted in the access tube to face different directions and 2–3 readings were then taken, and the average values were used as the measured θ. For each site, the soil profile mean θ were calculated as the average θ at different depths. However, due to the small number of sampling sites, θ at the depth of 0.7-m was not considered in the statistical analysis. Within 1-m distance of each θ access tube, soil samples at 0 to 0.2, 0.2 to 0.4, and 0.4 to 0.6 m depth intervals were collected using a hand auger. Three subsamples were collected for each depth and then fully mixed. These samples were air dried, weighted, ground and sieved through a 2-mm polyethylene sieve. Particles larger than 2 mm were weighed to determine the RFC in each soil sample. Soils that passed through the 2-mm polyethylene sieve were used to analyze the particle size distribution using the Malvern Mastersizer 2000 laser diffraction particle size analyzer (Malvern Instruments Inc., Worcestershire, UK). The fractions of b0.002 mm (clay), 0.002–0.05 mm (silt), and 0.05– 2 mm (sand) fractions were determined for each soil sample. Similar
to θ, the average value of each soil property at all depths was calculated for each sampling site. The DB of all 77 sites were measured when installing the access tubes for θ measurements and taking soil samples using a hand auger. A high-resolution (1 m) digital elevation model (DEM) of the study hillslopes was derived from a 1:1000 contour map. This contour map was generated from a local differential GPS survey. Terrain indices including elevation, slope, plan curvature (PLC), profile curvature (PRC), and topographic wetness index (TWI) were determined from this DEM in ArcGIS 10.0 (ESRI, Redlands, CA). The terrain indices of all 77 sampling points were then extracted. 2.4. Data analysis Textbooks (e.g., Draper and Smith, 1966) recommend that the regression residuals, εi, are analyzed with respect to: 1) where the ε are normally distributed with expectation equal to 0 and with constant variance; 2) whether the ε are independent of the predicted dependent variable and the independent variables; and 3) whether the ε are
C. Lu et al. / Catena 149 (2017) 176–184
179
3. Results and discussion
on both hillslopes) among all terrain indices. Among all considered soil properties, sand content has the strongest spatial variation (CVs = 33% and 49% on TG and BF hillslopes, respectively), while DB and RFC has moderate spatial variations (CVs = 19–28%) and silt and clay had the small spatial variations (CVs b 13%). On both hillslopes, terrain indices (elevation, slope, PLC, PRC and TWI) had large contributions to soil ECa. For different dipole orientations, the mean contribution rates of terrain indices to soil ECa on different dates were 45–52% (Table 2). On some specific survey dates, contributions of terrain indices to soil ECa were even N 70%. Among these terrain indices, slope and elevation had the greatest and the second greatest contributions to soil ECa (contributions rates ranged from 0 to 86.3% and from 0 to 36.7%, respectively), while TWI and PRC had the least and the second least contributions to soil ECa (contributions rates ranged from 0 to 18.1% and from 0 to 25.4%, respectively). The great contributions of terrain indices to soil ECa on our study hillslopes indicates that they are correlated with each other. This has been further proved in the correlation analysis. Significant correlations (P b 0.05 and absolute correlation coefficients from 0.35 to 0.60) were observed between terrain indices and soil ECa on both hillslopes. In previous studies, correlations between terrain indices and soil ECa were attributed to that topography determined the spatial distributions of soil properties (e.g., θ, clay, DB and RFC) that directly or indirectly affected the soil ECa (Martinez et al., 2010; Liao et al., 2014). On both hillslopes, mean contributions of static soil properties (RFC, DB, contents of clay, silt and sand) to soil ECa were 40–50% for different dipole orientations, while those of θ to soil ECa were 5–10% (Table 2). Among these static soil properties, DB and RFC content had the greatest and the second greatest contributions to soil ECa (contributions rates ranged from 0 to 72.1% and from 0 to 47.4%, respectively), while sand content had the least contributions to soil ECa (contributions rates ranged from 0 to 18.5%) (Table 2). Under wet condition, θ contributed more to the spatial variation of soil ECa (Zhu et al., 2010a), thus static soil properties contributed comparatively less to soil ECa. For example, a 22-mm rainfall event occurred on May 12, 2013. On May 13, 2013, contribution of θ to 0.5H ECa increased to 16.2%, while contribution of soil properties to 0.5H ECa reduced to 37.4%. Therefore, the varied contributions of static soil properties and θ to soil ECa were observed in the current study. In our study, relationships between selected soil properties and ECa (correlation coefficients b 0.30) are weak compared with the same properties observed in other studies (correlation coefficients N 0.60) (e.g., Brevik and Fenton, 2002; Sherlock and McDonnell, 2003). This indicates that these selected soil properties are not the primary influencing factors of soil ECa on our study hillslopes. Therefore, the EMI may not be an effective tool to map these soil properties if regular regressions are used.
3.1. Contributions of terrain indices and soil properties to soil ECa
3.2. Relationships between soil properties and terrain indices
The statistical summaries of different terrain indices and soil properties on different hillslopes are provided in Table 1. The PRC (CVs = 175% and 95% on TG and BF hillslopes, respectively) has the greatest spatial variation among all terrain indices. Next to PRC, PLC and TWI have the second and third greatest spatial variations. The CVs of PLC were 144% and 93% and those of TWI are 54% and 78% on TG and BF hillslopes, respectively. However, elevation has the least spatial variation (CVs = 2%
Generally, selected terrain indices are not significantly correlated with most soil properties considered on both hillslopes (Table 3). Exceptions include correlations between slope and RFC on the TG hillslope (correlation coefficients = − 0.39), between slope and DB on the BF hillslope (correlation coefficients = −0.48), and between all terrain indices and silt content on the BF hillslope (absolute correlation coefficients from 0.34 to 0.59). Actually, besides topography, previous
independent of each other. Residual analysis is then carried out for checking with these three aspects. In many cases, the independent variables used in the regression models are not sufficient to predict the dependent variable. The ε must contain information from other independent variables and thus can be further used to model these independent variables (e.g., Akdeniz, 2001; Keijizer-Veen et al., 2005). The stepwise multiple linear regression (SMLR) was processed in SAS (SAS Institute, Cary, NC, USA) to construct the regression model between the dependent variables and independent variables: n
Y ¼ a0 þ ∑ ai X i i¼1
Where Y is the dependent variable; Xi is the independent variable at site i; a is the regression coefficient; n is the number of sites. The significance entry level for an independent variable was set as P b 0.10. To calculate the ECa residuals after removing terrain effect, the four different sets (0.5H, 0.5V, 1.0H, 1.0V) of soil ECa data recorded on ten different survey dates were used as dependent variables, while terrain indices including elevation, slope, TWI, PRC and PLC were used as independent variables. When predicting static soil properties including RFC, sand, silt, clay and DB, terrain indices (elevation, slope, TWI, PRC, PLC), soil ECa in different surveys or ECa residuals on different dates were used as independent variables. When predicting temporally varied θ, besides terrain indices, soil ECa and their residuals measured on the same date as θ collected were used as independent variables. The robustness of the regression models was evaluated by the model R2 value and root mean square error (RMSE). Pearson correlation analysis with statistical significant level as P b 0.05 was also conducted in SAS to reveal the relationships among terrain indices, soil properties and soil ECa. The best regression models that used terrain attributes, soil ECa, or residuals of ECa to predict different soil properties were then determined. The CART model was applied to quantify the relative contributions of different factors to the target variables (soil ECa and ECa residuals). The underlying principle behind CART is to identify and partition the target variable into relatively homogeneous (low standard deviation) terminal nodes, and use the mean value of each node as the node's predicted value. Different types of predictive variables (categorical and continuous) can be integrated into the model (Selle et al., 2007). The CART analysis was conducted in Clementine 12.0 (SPSS, 2008). Detailed description of this model can be found in the studies by Breiman et al. (1984) and Cheng et al. (2009). Using CART, influences of soil properties and terrain indices to soil ECa and ECa residuals can be determined.
Table 1 Means (CVs, %) of terrain attributes (slope, elevation, TWI, PLC and PRC), soil properties (DB, RFC and contents of sand, silt and clay) on different land-use hillslopes. Hillslopes
TG BF
Terrain attributes Slope
Elevation
(%)
(m)
11.5(27) 8.5(46)
84.1(2) 79.5(2)
Soil properties TWI
PLC
PRC
DB
RFC
(cm) 0.3(54) 1.0(78)
0.61(144) 0.12(93)
−1.31(175) 0.22(95)
48.7(19) 56.4(28)
Sand
Silt
Clay
72.2(5) 75.0(8)
15.0(13) 12.5(10)
(%, gravimetric) 50(15) 42(24)
12.8(33) 12.5(49)
180
C. Lu et al. / Catena 149 (2017) 176–184
Table 2 Mean, minimum and maximum relative contributions of terrain attributes (slope, elevation, TWI, PLC and PRC), static soil properties (DB, RFC, contents of sand, silt and clay) and dynamic soil property (volumetric soil water content–θ) to ECa on all ten dates on different land-use hillslopes. Hillslope
Factors
TG
Mean (minimum-maximum) relative contribution (%)
Terrain indices Soil properties θ Terrain indices Soil properties θ
BF
0.5Va
1.0Va
0.5Ha
1.0Ha
49.2 (31.6–76.5) 45.8 (23.5–65.2) 5.0 (0–10.5) 51.8 (45–72) 39.2 (20.4–46.0) 9.1 (6.8–11.1)
52.0 (35.6–91.1) 42.1 (6.7–57.5) 5.9 (0–11.2) 52.6 (36.9–67) 41.3 (30.8–53.2) 6.2 (0.2–9.9)
45.7 (17.2–59.0) 48.4 (34.5–80.2) 5.9 (1.3–9.2) 45.4 (39.4–54.3) 45.0 (37.4–53.4) 9.6 (5.9–16.2)
51.3 (39.2–60.0) 43.4 (34.5–54.5) 5.2 (0.2–8.7) 49.7 (45.3–56.2) 41.1 (35.9–45.5) 9.2 (6.7–12.9)
a 0.5V and 1.0V means EM38-MK2 meter operated vertical dipole orientation with 0.5 and 1.0 m coil spacing, respectively; 0.5H and 1.0H means EM38-MK2 meter operated horizontal dipole orientation with 0.5 and 1.0 m coil spacing, respectively.
studies also reported that spatial distributions of soil properties can be influenced by a multitude of different factors (e.g., land use type and management practices) (Fu et al., 2003; Biro et al., 2013). Correlations between θ and terrain indices are weak on most dates on the TG hillslope, while they are comparatively stronger on the BF hillslope (Table 4). On the TG hillslope, elevation is the only terrain index that was significantly (P b 0.05) correlated with θ at all three depths on most dates (absolute correlation coefficients from 0.33 to 0.46). This suggests that the selected terrain indices are not good predictors for θ on the TG hillslope. In comparison, on the BF hillslope, terrain indices (elevation, slope, TWI, PLC and PRC) are frequently significantly (P b 0.05) correlated with θ and most depths (absolute correlation coefficients from 0.34 to 0.62). The mean θ on the BF hillslope were 1.5–2.5 times greater than those on the TG hillslope, which can be attributed to its lower elevation, gentler slope, higher TWI and thicker surface organic layer (Table 1). Therefore, stronger correlations are observed between terrain indices and θ on the BF hillslope than on the TG hillslope. Similarly, in previous studies, strong correlations between topography and θ have also been observed under wetter conditions (Western et al., 1999; Grayson et al., 2002). Since the selected terrain indices are not closely correlated with the selected soil properties on both hillslopes (Tables 3 and 4), they are not good predictors for these soil properties. In addition, previous section has suggested that EMI may not be an effective tool to map these soil properties either. Therefore, advanced data processing techniques are needed to predict the spatial distributions of these soil properties. 3.3. Contributions of terrain indices and soil properties to ECa residuals Terrain indices were used to regress soil ECa in different surveys and corresponding ECa residuals were then calculated. Elevation, TWI and slope were included as independent variables in all SMLR models, while PLC and PRC were only adopted in SMLP models for EMI surveys conducted on December 16th 2013 and February 26th 2014 on the BF hillslopes. The ECa residuals, which have removed the influences from the selected terrain indices, were more closely related to the selected soil properties (absolute correlation coefficients N 0.50). Soil properties (except silt content) have great contributions to the ECa residuals on some dates, while the contributions of terrain indices Table 3 Correlation coefficients between terrain attributes and static soil properties. “–” in the table means the correlation is not significant at P b 0.05. Factors
Slope Ea TWI PLC PRC a
RFC
Sand content
Silt content
Clay content
DB content
TG
BF
TG
BF
TG
BF
TG
BF
TG
BF
−0.39 – – – –
– – – – –
– – – – –
0.51 – – – −0.39
– – – – –
−0.59 −0.39 0.34 −0.34 0.55
– – – – –
– – – – –
– – – – –
−0.48 – – – –
Elevation.
are comparatively low (Table 5). For example, contributions of RFC, sand content, clay content and DB to ECa residuals are as great as 42.4, 46.7, 93.0, and 41.5% on the TG hillslope, respectively, while those values are 77.1, 57.7, 83.5, and 64.9% on the BF hillslope, respectively. Strong contributions of θ to ECa residuals are also observed on both hillslope. For example, on the TG hillslope, θ on Oct. 18, 2013 contributed 88.1% and 36.6% to residuals of 1.0V and 1.0H, respectively. On the BF hillslope, θ on Mar 27, 2013 contributed 66.0% and 44.2% to residuals of 0.5V and 1.0V, respectively; θ on July 31, 2013 contributed 63.8% to residuals of 1.0V; θ on Oct 18, 2013 contributed 64.7% and 46.4% to residuals of 0.5H and 1.0H, respectively.
3.4. Spatial estimation of static soil properties Three scenarios of independent variable combinations were considered in the spatial estimation of static soil properties on these two hillslopes. Scenarios include the use of i) only terrain indices, ii) only soil ECa from different EMI surveys, and iii) only ECa residuals from different EMI surveys (Fig. 2). When only terrain indices were used to predict static soil properties, the accuracies of the SMLRs were generally low on both hillslopes. For example, the R2 values of the SMLRs using terrain indices to regress RFC, sand and silt contents are b0.31 on the TG hillslope (Fig. 2a). Even more, no SMLR can be constructed to predict clay content and DB on the TG hillslope (Fig. 2a). Soils on the TG hillslope received disturbance from the management practices including tillage, fertilization, tea plant pruning, RFC removing, soil loosening and ground surface leveling. This could result in the comparatively random and less topography dependent spatial distributions of these soil properties. Similarly, the accuracies of the SMLRs were also low when only terrain indices were used to predict static soil properties on the BF hillslopes (R2 from 0.15 to 0.59 and RMSE from 0.019 to 0.140) (Fig. 2b). The only exception is the SMLR for predicting silt content (R2 = 0.59 and RMSE = 0.031) (Fig. 2b). This is due to that silt content is the only selected soil property that is significantly (P b 0.05) correlated with all terrain indices on the BF hillslope (Table 3). When only soil ECa values on all dates were used to predict the selected soil properties, the accuracies of the SMLRs were also low on both hillslopes. On the TG hillslope, the R2 values of the SMLRs using soil ECa to regress different soil properties are b0.32 (Fig. 2a). On the BF hillslope, the R2 values of the SMLR using soil ECa to regress RFC, silt and clay are b0.30, while the SMLR cannot be constructed to predict DB (Fig. 2b). The only exception is the SMLR for predicting sand content on the BF hillslope (R2 = 0.68 and RMSE = 0.032) (Fig. 2b). The low accuracies of using soil ECa to regress the selected soil properties can be attributed to that the spatial variations of soil ECa being largely influenced by terrain indices in this study (Table 2). Since terrain indices are not good predictors of these selected soil properties, soil ECa were not good predictors either. Previous studies also reported weak or varied relationships between soil ECa and soil properties. For example, Zhu et al. (2010b) found that soil ECa were not correlated with clay
C. Lu et al. / Catena 149 (2017) 176–184
181
Table 4 Correlation coefficients between terrain attributes and θ. Since slope, TWI, PLC and PRC are not significantly correlated with θ at the 0.1, 0.3 and 0.5-m depths on TG hillslope on all dates, they are not listed in the table. Similarly, correlation coefficients between θ at 0.1 and 0.5-m depths and PRC and TWI are not listed in the table for BF hillslope at 0.1 and 0.5-m depths, respectively. “–” in the table means the correlation is not significant at P b 0.05. Hillslope
Factors
Depth (m)
θ on different dates 2013
TG
BF
a
Ea E E Slope E TWI PLC Slope E TWI PLC PRC Slope E PLC PRC
0.1 0.3 0.5 0.1
0.3
0.5
2014
27-Mar
10-May
13-May
15-May
31-Jul
26-Sep
18-Oct
16-Dec
26-Feb
27-Mar
−0.43 −0.37 −0.44 −0.62 −0.36 – – −0.55 −0.44 0.33 – 0.36 −0.57 −0.54 – –
−0.34 −0.34 −0.42 −0.60 −0.34 0.33 −0.34 – −0.36 – – – – −0.46 – –
−0.38 −0.33 – −0.57 – – −0.33 −0.49 −0.44 – −0.36 0.39 – −0.47 −0.45 0.44
−0.35 −0.35 – −0.58 – – – −0.46 −0.39 – −0.38 0.39 −0.48 – −0.47 0.46
−0.38 −0.38 −0.42 −0.45 −0.39 – – −0.38 −0.34 – −0.39 0.37 −0.48 −0.45 −0.49 –
−0.42 −0.39 −0.43 −0.35 – – – −0.35 −0.34 – −0.37 0.36 −0.45 – −0.42 –
−0.36 −0.36 – −0.57 −0.34 – – −0.47 −0.46 – −0.33 0.33 −0.42 −0.46 – –
−0.36 −0.35 – – – – – – – – −0.34 0.36 – – – 0.46
−0.46 – −0.45 −0.58 – – – −0.49 −0.38 – – – −0.49 −0.44 – –
−0.44 – – −0.56 – – – −0.37 – – −0.40 0.39 −0.47 – −0.40 0.49
Elevation.
content and organic matter content; Mueller et al. (2003) reported mixed results of correlations between soil ECa and different soil properties. The accuracies of SMLRs using only ECa residuals on all dates to predict soil properties are generally the best in all three scenarios. On the TG hillslope, using ECa residuals, the R2 values are all N0.50 (Fig. 2a), while on the BF hillslope, except sand and silt contents, the R2 values are all N0.46 (Fig. 2b). This indicates that on both hillslopes, ECa residuals are generally better predictors than terrain indices or original soil ECa in estimating the spatial distributions of these selected static soil properties. 3.5. Spatial estimation of θ In our study, θ is significantly correlated with terrain indices on both hillslope (Table 4). In addition, previous studies reported that variations in θ are influenced by both soil properties and terrain indices (e.g., Western et al., 1999; Grayson et al., 2002; Zhu and Lin, 2011). Therefore, in order to better estimate spatial distributions of temporally varied θ on our study hillslopes, both terrain indices and soil properties should be considered in the SMLRs. Three scenarios of independent variable
combinations were considered here: i) only terrain indices, ii) terrain indices plus soil ECa from an EMI survey conducted on the same day as θ was collected, and iii) terrain indices plus ECa residuals from an EMI survey conducted on the same day as θ were collected. When only terrain indices were used to predict θ at different depths on different dates (scenario i), the accuracies of the SMLRs were generally low on both hillslopes. At different depths on the TG hillslope, R2 values range from 0.27 to 0.59 (mean = 0.41), while the RMSE values range from 0.030 to 0.054 (mean = 0.041) (Fig. 3). At different depths on the BF hillslope, R2 values range from 0.26 to 0.48 (mean = 0.40) and RMSE values range from 0.030 to 0.069 (mean = 0.050) (Fig. 4). These R2 values are comparable to those obtained in the studies of Zhu and Lin (2010), Ibrahim and Huggins (2011) and Yao et al. (2013) conducted in Pennsylvania and Washington, USA, and Loess Plateau of China, respectively. When soil ECa collected on the same day as θ was added with terrain indices as independent variables (scenario ii), the accuracies of the θ prediction did not improved on most dates. At different depths, R2 values still range from 0.27 to 0.59 (mean = 0.42) and from 0.26 to 0.49 (mean = 0.42) on the TG and BF hillslopes, respectively (Figs. 3 and 4). In very few cases, scenario ii achieved the best accuracies for θ
Table 5 Mean, minimum and maximum relative contributions of all terrain attributes (including slope, elevation, TWI, PLC and PRC), static soil properties (DB, RFC, contents of sand, silt and clay) and dynamic soil property (θ) to ECa residuals on all ten dates. The ECa residuals are the residuals of the SMLRs using terrain attributes to predict ECa on different dates and hillslopes. Hillslope
TG
BF
Factors
Terrain indices RFC Sand Silt Clay DB θ Terrain indices RFC Sand Silt Clay DB θ
Mean (minimum-maximum) relative contribution (%) 0.5Va
1.0Va
0.5Ha
1.0Ha
12.9 (7.7–18) 6.4 (0–19.3) 8.1 (0–15.7) 5.4 (0–16.2) 46.9 (15.2–76.0) 18.2 (10–31.8) 5.4 (0–16.2) 10.6 (5.5–17) 20.5 (0–77.1) 8.2 (0–33.6) 1.7 (0–10.1) 18.2 (0–79.6) 34.0 (0–75.7) 26.9 (0–66.0)
4 (0–8.5) 3.7 (0–10.4) 0.2 (0–0.7) 0.2 (0–0.7) 47.0(0–93.0) 10.5 (1.7–17.9) 34.7 (0.7–88.1) 12.9 (5.6–19.9) 10.1 (0–30.4) 5.6 (0–15.6) 3.0 (0–8.9) 23.9 (0–83.5) 26.7 (0–64.9) 17.8 (0.4–63.8)
16.2 (13.8–19.2) 23.6 (13.4–42.4) 15.6 (0–46.7) 19.9 (12–27.1) 7.6 (0–22.8) 14.8 (0–41.5) 2.2 (0–4.0) 2.7 (2.3–3.0) 1.9 (0–3.7) 28.9 (0–57.7) 3.6 (0–7.2) 14.2 (0–28.4) 3.1 (1.5–4.6) 45.8 (26.9–64.7)
13.0 (12.3–13.6) 1.5 (1.2–1.7) 1.6 (1.5–1.7) 10.9 (0–21.7) 41.0 (23–58.9) 13.4 (1.4–25.4) 19.2 (1.7–36.6) 8.3 (0–14.4) 21.5 (0–52.9) 8.4 (0–26.2) 6.9 (0–19.3) 39.9 (0–78.6) 2.3 (0–9.4) 12.7 (0–46.4)
a 0.5V and 1.0V means EM38-MK2 meter operated vertical diploe orientation with 0.5 and 1.0 m coil spacing, respectively; 0.5H and 1.0H means EM38-MK2 meter operated horizontal diploe orientation with 0.5 and 1.0 m coil spacing, respectively.
182
C. Lu et al. / Catena 149 (2017) 176–184
a) TG hillslope 1.0
R2 values
0.8
0.6 0.4
0.2
RMSE (g g-1 or m)
0.20 Only terrain indices Only ECa in different surveys Only residuals of ECa in different surveys
0.0
0.15 0.10 0.05
0.00 RFC
Sand
Silt
Clay
DB
RFC
Sand
Silt
Clay
RFC
Sand Silt Clay Soil properties
DB
b) BF hillslope 0.20
1.0 RMSE (g g-1 or m)
R2 values
0.8
0.6 0.4
0.2
0.15 0.10 0.05 0.00
0.0 RFC
Sand Silt Clay Soil Properties
DB
DB
Fig. 2. The R2 and RMSE values of the SMLRs that used only terrain indices, only soil ECa and only ECa residuals in different EMI surveys to predict RFC, sand, silt, clay and DB on a) the TG hillslope and b) the BF hillslope. No SMLR can be constructed for predicting clay and DB on the TG hillslope using only terrain indices, for predicting DB on the BF hillslope using only soil ECa, and for predicting sand and silt on the BF hillslope using only ECa residuals. The unit of RMSE values for RFC, sand, silt and clay is g g−1, while that for DB is m.
Fig. 3. The R2 and RMSE values for the SMLRs that used only terrain indices, terrain indices + soil ECa and terrain indices + ECa residuals to predict θ on the same date that the soil ECa was collected on the TG hillslope at a) 0.1-m depth, b) 0.3-m depth and c) 0.5-m depth.
C. Lu et al. / Catena 149 (2017) 176–184
183
Fig. 4. The R2 and RMSE values for the SMLR that used only terrain indices, terrain indices + soil ECa and terrain indices + ECa residuals to predict θ on the same date that the soil ECa was collected on the BF hillslope at a) 0.1-m depth and b) 0.3-m depth and c) 0.5-m depth.
prediction (Figs. 3 and 4). This indicates that on both hillslopes, inclusion of soil ECa as independent variables had limit benefit for improving the θ prediction. This is inconsistent with previous studies, in which soil ECa were reported to be good predictors of θ (e.g., Reedy and Scanlon, 2003; Sherlock and McDonnell, 2003; Zhu et al., 2010a). Terrain indices and soil ECa are not independent from each other in our study (Table 2). Therefore, adding soil ECa with terrain indices as predictors of θ did not really supplement much new information in the SMLRs. Actually, in most cases, the independent variables included in the SMLRs are the same in scenario i and ii. This also suggests that soil ECa may be more
a) Using terrain indicesto predict θ
strongly affected by other soil properties that are more spatially varied than θ but not considered in this study. For example, organic matter content, CEC and bulk density are not considered, but have been widely reported to have important influences on soil ECa (e.g., Archie, 1942; Rhoades et al., 1976; Brevik and Fenton, 2002; Corwin and Lesch, 2003). On both hillslopes, the θ prediction achieved the best accuracy when a combination of terrain indices and ECa residuals were used as independent variables in the SMLRs (scenario iii). On the TG hillslope, the R2 values of this scenario are the greatest among all three scenarios at most depths on most dates (mean = 0.49) (Fig. 3). On this hillslope,
b) Using terrain indices+soil ECa to predict θ
R2=0.38, NSE=0.37
R2=0.45, NSE=0.45
c) Using terrain indices+ECa residuals to predict θ
R2=0.54, NSE=0.53
Fig. 5. The examples of predicted θ maps that used a) terrain indices, b) terrain indices + soil ECa and c) terrain indices + ECa residuals as independent variables on July 31th 2013 on the BF hillslope at 0.1-m depth using SMLR.
184
C. Lu et al. / Catena 149 (2017) 176–184
the R2 values of the SMLRs in scenario iii are the greatest on 6, 6 and 7 dates at the depths of 0.1, 0.3 and 0.5 m, respectively (Fig. 3). Similarly, on the BF hillslope, the R2 values of this scenario are also the greatest at most depths on most dates (mean = 0.48) (Fig. 4). The R2 values of the SMLRs in scenario iii are the greatest on 6, 7 and 7 dates at the depths of 0.1, 0.3 and 0.5 m, respectively (Fig. 4). In addition, for these dates on which scenario iii achieved the best prediction, θ contributed N 40% to the ECa residuals in the CART analysis. It may be a surprise that scenario ii performed worse than scenario iii, since both of them used the same information (terrain and soil ECa). Soil ECa and terrain indices are not independent in this study (Table 2). Therefore, when both of them were used in the SMLRs, some of these independent variables may automatically be excluded from the final regression equations. For example, on most dates, SMLRs in scenario ii had the same R2 and RMSE values as those in scenario i (Figs. 3 and 4). This is due to that the additional soil ECa variables have been automatically removed in the stepwise procedure. However, ECa residuals are independent from terrain indices. When ECa residuals and terrain indices were used in SMLR to predict θ (scenario iii), more independent variables were included in the final regression equations and thus better accuracies were achieved. As an example, Fig. 5 showed that better accuracy was achieved in scenario iii at 0.1-m depth on the BF hillslope. 4. Conclusions Better EMI data interpretation is important in detecting the spatial variations of soil properties at the intermediate spatial scale. In this study, SMLRS between soil ECa (dependent variables) and terrain indices (independent variables) were constructed on two hillslopes under different uses and managements. This process generated the ECa residuals that had little the influence from terrain. While contributions of terrain indices to soil ECa are strong, great contributions of soil properties to ECa residuals of are observed. When ECa residuals were used in the SMLRs to predict soil properties on both hillslopes, better accuracies were achieved than using only terrain indices and/or ECa as independent variables. This indicated that residual analysis should receive more attention to improve the accuracies in EMI data interpretation for mapping different variables. Acknowledgements This work was supported by the National Natural Science Foundation of China (41622102 and 41571080) and the Natural Science Foundation of Jiangsu Province (BK20151613). References Akdeniz, F., 2001. The examination and analysis of residuals for some biased estimators in linear regression. Commun. Stat. Theor. M. 30, 1171–1183. Altdorff, D., Dietrich, P., 2014. Delineation of areas with different temporal behavior of soil properties at a landslide affected alpine hillside using time-lapse electromagnetic data. Environ. Earth Sci. 72, 1357–1366. Archie, G.E., 1942. The electrical resistivity log as an aid on determining some reservoir characteristics. Trans. Am. Inst. Min. Metal. Am. Petr. Eng. 146, 54–62. Biro, K., Pradhan, B., Buchroithner, M., Makeschin, F., 2013. Land use/land cover change analysis and its impact on soil properties in the northern part of Gadarif region, Sudan. Land. Degrad. Dev. 24, 90–102. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J., 1984. Classification and Regression Trees. Pacific Grove, CA, Wadsworth. Brevik, E.C., Fenton, T.E., 2002. The relative influence of soil water, clay, temperature, and carbonate minerals on soil electrical conductivity readings with an EM-38 along a Mollisol catena in central Iowa. Soil. Surv. Hori. 43, 9–13. Cheng, W., Zhang, X.Y., Wang, K., Dai, X.L., 2009. Integrating classification and regression tree (CART) with GIS for assessment of heavy metals pollution. Environ. Monit. Assess. 158, 419–431.
Corwin, D.L., Lesch, S.M., 2003. Application of soil electrical conductivity to precision agriculture: theory, principle, and guidelines. Agron. J. 95, 455–471. Doolittle, J.A., Brevik, E.C., 2014. The use of electromagnetic induction techniques in soils studies. Geoderma 223-225, 33–45. Draper, N.R., Smith, H., 1966. Applied Regression Analysis. Wiley, New York. Fu, B.J., Wang, J., Chen, L.D., Qiu, Y., 2003. The effects of land use on soil moisture variation in the Danangou catchment of the Loess Plateau, China. Catena 54, 197–213. Grayson, R.B., Blöschl, G., Western, A.W., McMahon, T.A., 2002. Advances in the use of observed spatial patterns of catchment hydrological response. Adv. Water Resour. 25, 1313–1334. Ibrahim, H.M., Huggins, D.R., 2011. Spatial-temporal patterns of soil water storage under dryland agriculture at the watershed scale. J. Hydrol. 404, 186–197. Keijizer-Veen, M.G., Euser, A.M., van Montfoort, N., Dekker, F.W., Vandenbroucke, J.P., Van Houwelingen, H.C., 2005. A regression model with unexplained residuals was preferred in the analysis of the fetal origins of adult diseases hypothesis. J. Clin. Epidemiol. 58, 1320–1324. Liao, K.H., Zhu, Q., Doolittle, J.A., 2014. Temporal stability of apparent soil electrical conductivity measured by electromagnetic induction techniques. J. MT Sci. ENGL 11, 98–109. Lin, H.S., Bouma, J., Wilding, L., Richardson, J., Kutílek, M., Nielsen, D., 2005. Advances in hydropedology. Adv. Agron. 85, 1–89. Martinez, G., Vanderlinden, K., Giraldez, J.V., Espejo, A.J., Muriel, J.L., 2010. Field-scale soil moisture pattern mapping using electromagnetic induction. Vadose Zone J. 9, 871–881. McColl, K.A., Ryu, D., Matic, V., Walker, J.P., Costelloe, J., Rudiger, C., 2012. Soil salinity impacts on l-band remote sensing of soil moisture. IEEE. Geosci. Remote. S. 9, 262–266. McNeill, J.D., 1980. Electromagnetic Terrain Conductivity Measurement at Low Induction Numbers. Tech. Note TN-6. Geonics Ltd., Mississauga, ON, Canada. Mueller, T.G., Hartsock, N.J., Stombaugh, T.S., Shearer, S.A., Cornelius, P.L., Barnhisel, R.I., 2003. Soil electrical conductivity map variability in limestone soils overlain by loess. Agron. J. 95, 496–507. Mulder, V.L., de Bruin, S., Schaepman, M.E., Mayr, T.R., 2011. The use of remote sensing in soil and terrain mapping: a review. Geoderma 162, 1–19. Neely, H.L., Morgan, C.L.S., Hallmark, C.T., Mclnnes, K.J., Molling, C.C., 2016. Apparent electrical conductivity response to spatially variable vertisol properties. Geoderma 263, 168–175. Parn, J., Pinay, G., Mander, U., 2012. Indicators of nutrients transport from agricultural catchments under temperate climate, a review. Ecol. Indi. 22, 4–15. Reedy, R.C., Scanlon, B.R., 2003. Soil water content monitoring using electromagnetic induction. J. Geotech. Geoenviron. 129, 1028–1039. Rhoades, J.D., Raats, P.A.C., Prather, R.S., 1976. Effects of liquid-phase electrical conductivity water content and surface conductivity on bulk soil electrical conductivity. Soil Sci. Soc. Am. J. 40, 651–665. Robinson, D.A., Lebron, I., Kocar, B., Phan, K., Sampson, M., Crook, N., Fendorf, S., 2009. Time-lapse geophysical imaging of soil moisture dynamics in tropical deltaic soils: an aid to interpreting hydrological and geochemical processes. Water. Resour. Res., W00D32 http://dx.doi.org/10.1029/2008WR006984. Robinson, D.A., Abdu, H., Lebron, I., Jones, S.B., 2012. Imaging of hillslope soil moisture wetting patterns in a semi-arid oak savanna catchment using time-lapse electromagnetic induction. J. Hydrol. 417–419, 39–49. Selle, B., Lischeid, G., Huwe, B., 2007. Effective modeling of percolation at the landscape scale using data- based approaches. Comput. Geosci. UK 34, 699–713. Sherlock, M.D., McDonnell, J.J., 2003. A new tool for hillslope hydrologists: spatially distributed groundwater level and soil water content measured using electromagnetic induction. Hydrol. Process. 17, 1965–1977. Vereecken, H., Huisman, J., Bogena, H., Vanderborght, J., Vrugt, J., Hopmans, J., 2008. On the value of soil moisture measurements in vadose zone hydrology: a review. Water. Resour. Res., W00D06 http://dx.doi.org/10.1029/2008WR006829. Western, A.W., Grayson, R.B., Blöschl, G., Willgoose, G.R., McMahon, T.A., 1999. Observed spatial organization of soil moisture and its relation to terrain indices. Water Resour. Res. 35, 797–810. Yao, X.L., Fu, B.J., Lu, Y.H., Sun, F.X., Wang, S., Liu, M., 2013. Comparison of four spatial interpolation methods for estimating soil moisture in a complex terrain catchment. PLOS ONE, e54660 http://dx.doi.org/10.1371/journal.pone.0054660. Zhou, B., Zhang, X.G., Wang, F., Wang, R.C., 2005. Soil organic matter mapping by decision tree modeling. Pedosphere 15, 103–109. Zhu, Q., Lin, H.S., 2010. Comparing ordinary kriging and regression kriging for soil properties in contrasting landscapes. Pedosphere 20, 594–606. Zhu, Q., Lin, H.S., 2011. Influences of soil, terrain, and crop growth on soil moisture variation from transect to farm scales. Geoderma 163, 45–54. Zhu, Q., Lin, H.S., Doolittle, J.A., 2010a. Repeated electromagnetic induction surveys for determining subsurface hydrologic dynamics in an agricultural landscape. Soil Sci. Soc. Am. J. 74, 1750–1762. Zhu, Q., Lin, H.S., Doolittle, J.A., 2010b. Repeated electromagnetic induction surveys for improved soil mapping in an agricultural landscape. Soil Sci. Soc. Am. J. 74, 1763–1774. Zhu, Q., Liao, K.H., Xu, Y., Yang, G.S., Wu, S.H., Zhou, S.L., 2012. Monitoring and prediction of soil moisture spatial-temporal variations from a hydropedological perspective: a review. Soil Res. 50, 625–637.