Rescue and renewal of legacy soil resource inventories in Iran as an input to digital soil mapping

Rescue and renewal of legacy soil resource inventories in Iran as an input to digital soil mapping

Journal Pre-proof Rescue and renewal of legacy soil resource inventories in Iran as an input to digital soil mapping Zahra Rasaei, David G. Rossiter,...

2MB Sizes 3 Downloads 24 Views

Journal Pre-proof Rescue and renewal of legacy soil resource inventories in Iran as an input to digital soil mapping

Zahra Rasaei, David G. Rossiter, Abbas Farshad PII:

S2352-0094(20)30011-0

DOI:

https://doi.org/10.1016/j.geodrs.2020.e00262

Reference:

GEODRS 262

To appear in:

Geoderma Regional

Received date:

6 November 2019

Revised date:

6 February 2020

Accepted date:

7 February 2020

Please cite this article as: Z. Rasaei, D.G. Rossiter and A. Farshad, Rescue and renewal of legacy soil resource inventories in Iran as an input to digital soil mapping, Geoderma Regional(2020), https://doi.org/10.1016/j.geodrs.2020.e00262

This is a PDF file of an article that has undergone enhancements after acceptance, such as the addition of a cover page and metadata, and formatting for readability, but it is not yet the definitive version of record. This version will undergo additional copyediting, typesetting and review before it is published in its final form, but we are providing this version to give early visibility of the article. Please note that, during the production process, errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.

© 2020 Published by Elsevier.

Journal Pre-proof

Rescue and renewal of legacy soil resource inventories in Iran as an input to digital soil mapping

Zahra Rasaei

a,*

, David G. Rossiter b,c,d , Abbas Farshad e

a College of Agriculture, Shahrekord University, Shahrekord, Iran b ISRIC World Soil Information, PO Box 353, Wageningen 6700 AJ, The Netherlands 
 c Section of Soil & Crop Sciences, Cornell University, 242 Emerson Hall, Ithaca NY 14853, USA 


of

d School of Geography Science, Nanjing Normal University, 1 Wenyuan Road, Nanjing, Jiangsu 210046, China e Retired scientific member of the ESA Dept., ITC, Twente University, Enschede, The Netherlands

ro

Abstract

-p

Maps of soil properties and classes are key inputs to land management, especially with the increasing awareness of ecosystem services provided by soils. Due to limited resources for new

re

soil surveys, legacy soil inventories are often the major public source of soil data, and this is the

lP

case in Iran. One line of previous work has presented methods for data archaeology, rescue and

na

renewal, and another line has shown the value of legacy surveys as covariates for digital soil mapping (DSM). The present study therefore aimed to integrate these two streams, adding

ur

another step of adequacy evaluation of the rescued surveys according to the Cornell guidelines.

Jo

The study area is a 10,480 km2 region located at the border of Isfahan and Chaharmahal-vaBakhtiari provinces, Iran, covered by three legacy studies at the scale of 1: 50,000. The legacy maps were georeferenced and geocorrected, after which the Cornell adequacy guidelines were used to assess the quality of soil unit separation as evaluated at four levels of Soil Taxonomy. Evaluation was by weighted accuracy and the Tau coefficient, based on forty-one legacy soil profiles from a correlation study. The weighted accuracy of the map and the Tau index at all classification levels were respectively greater than 70% and 50%, which can be considered as *

Corresponding author Email addresses: [email protected] (Zahra Rasaei), [email protected] (David G. Rossiter), [email protected] (Abbas Farshad)

1

Journal Pre-proof satisfactory separation of soil units. Multinomial logistic regression (MLR) predictive models of soil classes with and without the legacy map as covariate were fit. The predictive accuracy of the models was improved at all taxonomic levels when the renewed legacy soil map was included as a covariate. We conclude that the selected legacy soil maps are of reasonable quality and can be used as reliable and useful inputs to DSM, and we propose that these procedures be widely

of

applied.

ro

Keywords: Legacy soil inventories, Cornell adequacy criteria, weighted Tau index, weighted

-p

overall accuracy, conventional soil mapping, Alfisols, Aridisols, Entisols, Inceptisols, Mollisols,

Jo

ur

na

lP

re

Vertisols

2

Journal Pre-proof 1. Introduction Soil plays an important role in numerous environmental and social fields including land use planning management, ecosystem services, and food security, as outlined in the UN Sustainable Development Goals (Bouma et al., 2019), so spatial information on soil properties and classes provide crucial information for these. Hence, attempts to provide high accuracy and reliable soil point data and maps are increasing (Keesstra et al., 2016). New surveys are in many contexts

of

infeasible, whereas many areas of the world have been surveyed in the past, the results of these

ro

known as “legacy” soil information, here referred to as conventional soil maps (CSM) and their

-p

accompanying legends and data tables (Waltner et al., 2014). Some previous works (Cambule et

re

al., 2015; Dent and Ahmed, 1995; Yang et al., 2011) have shown that legacy information can be re-used in the current context.

lP

However, since CSM were created without the benefit of digital technologies such as digital

na

elevation models and GPS location of point observations, they may not be geometrically precise. Further, the map unit composition and characterization were done by expert judgement which is

ur

complex and qualitative, and the quality of these products is unknown. Consequently, CSM are

Jo

somewhat subjective, and at the same time their final outputs are suffering from some drawbacks e.g., a low to medium efficiency accuracy, and also a lack of clearly quantified uncertainty assessment (Kempen et al., 2012; Zhu et al., 2001). Therefore, there is a reluctance to use them as reliable sources of soil information. Furthermore, the lack of access to data (or even unawareness of its existence), preparation of maps according to old standards, soil changes over time, inaccurate georeferencing, and the destruction of soil maps have limited their use in such a way that there is less inclination to reuse them (Bui and Moran, 2001; Láng et al., 2013; Odeh et al., 2012; Rossiter, 2008). Therefore, this valuable legacy information is often discarded or kept unused in the archives. Taking into

3

Journal Pre-proof account the dependence of better soil and environmental management on this information, serious attempts should be made to preserve, evaluate, and reuse these old soil inventories (Baxter and Crawford, 2008) in building national soil geographic databases. These are expected to help policymakers and planners to address future soil-related environmental and social challenges. Accordingly, attempts to keep this valuable information from destruction (Cambule et al., 2015; Rossiter, 2008), reuse and enhance their reliability through updating via new field

of

samplings and Digital Soil Mapping (DSM) (McBratney et al., 2003) techniques (Ahmed and

ro

Dent, 1997; Kempen et al., 2012; Li et al., 2013; Mayr et al., 2010; Yang et al., 2011), and use as

-p

an input in DSM process (Balkovič et al., 2013; Mayr et al., 2008) have been increasing over the

re

past decades.

Rossiter (2008) distinguishes three stages of using legacy soil data: data archaeology, rescue,

lP

and renewal. In the archaeology stage attempts are made to find all possible legacy soil resources

na

at hand in the desired study area. The discovered legacy soil surveys are then grouped based on some of their general characteristics such as researcher, year, scale, and categorical detail. The

ur

studies are afterwards digitized, at least by scanning as images and preferably converted to text if

Jo

possible, in order to save them from any danger of destruction or loss; this is the data rescue step. In the renewal step the studies are put into a new georeferenced and digitized structure, with a well-structured

point

and

polygon

database.

Several researches have emphasized

that

preprocessing processes such as evaluating, arranging and quality control are needed before reusing legacy soil inventories (see e.g., Cambule et al., 2015; Lagacherie and McBratney, 2006; Leenaars et al., 2014; Sulaeman et al., 2013; Waltner et al., 2014). Recently, there have been some efforts at a global level to control the quality of these studies and renew them. Among others, Sulaeman et al. (2013) and Dewitte et al. (2013) investigated legacy soil maps to harmonize and to prepare them for digital soil mapping. Sarmento et al. (2014) checked the

4

Journal Pre-proof quality of some Brazilian legacy soil maps and emphasized the importance of controlling the quality of this information before reusing. After rescuing some legacy soil maps from Mozambique, Cambule et al. (2015) evaluated their accuracy through the Cornell Adequacy Criteria (Forbes et al., 1982). These criteria summarized the state-of-the-art in evaluating published soil surveys, based on two expert consultations at Cornell in the late 1970’s and are a suitable set of guidelines for such evaluation.

of

This paper first (Section 1) provides a brief background on soil survey in Iran; for details see

ro

the recent book by Roozitalab et al. (2018). It then (Section 2) discusses the methodology of soil

-p

survey archeology, rescue and renewal, assessment of map quality, and the use of renewed

re

surveys as a covariate in digital soil mapping. The results of applying the methodology in our study area are then (Section 3) presented and discussed. Finally (Section 4), some conclusions are

na

lP

drawn from this study, with recommendations for future work.

1.1. Traditional soil survey in Iran

ur

Scientific soil survey in Iran was initiated in 1953 with modernization of agriculture sector and

Jo

construction of big dams (Farshad et al., 2015; Moemeni, 2000; Roozitalab et al., 2018; Zeraatpisheh et al., 2020). The whole country has been studied through different soil surveys from 1944 to 2007. After this period of intense activity, soil survey was rarely carried out, most likely owing to lack of funds. A thorough and detailed representation over background of soil mapping in Iran and different soil survey projects is given by Roozitalab et al. (2018) and Farshad et al. (2015). More recently, Zeraatpisheh et al. (2020) have reviewed the history and the methodology of CSM in Iran. A result of this long history of soil survey is that Iran is rich in legacy soil maps and information. As of 2016 the area mapped at semi-detailed (scale of 1: 50,000) or larger scales

5

Journal Pre-proof exceeds 20 Mha (more than 12% of the country), mostly in the inter-mountain plains and valleys (Roozitalab et al., 2018). These studies mostly aimed to evaluate land suitability for various crops based on the physiographic analysis of aerial photographs. The Iranian Soil and Water Research Institute (SWRI) under supervision of ministry of Agriculture – Jihad is the legally responsible organization for soil and land resources study at different scales from reconnaissance to detailed. Soil studies in Iran with the aim of providing land classification maps for development

of

projects were done in two stages; pedologic soil surveys, and land classification surveys for

ro

general planning. Traditional soil survey generally followed the method proposed by Mahler

-p

(Mahler, 1970) which ends in soil series and mapping phases classified to family level, based on

re

the Keys to Soil Taxonomy current at the date of the survey (Farshad et al., 2015; Roozitalab et al., 2018; Zeraatpisheh et al., 2020). The Iranian soil series concept differs from that of the

lP

USDA system. In the absence of a national soil correlation scheme, across survey areas, soil

na

series are specifically defined for each study area separately according to the surveyor’s judgment. As a result, some soils with the same series name, but in different areas, are quite

ur

different from each another, and some soil series with different names are similar or identical.

Jo

The final outputs of the pedologic soil survey attempts were then interpreted as land classification or evaluation maps, considering soil properties and limitations relevant to the intended land use, such as infiltration, soil surface texture, depth, salinity, and drainage, as well as site characteristics such as flooding, ponding, and erosion. These two maps were afterwards combined to obtain phases of the series.

1.2. Recent developments in Iranian soil survey A further development from the officially recognized by SWRI and widely used Mahler’s method was the geopedologic approach of Zinck (Zinck et al., 2016), which since a decade have

6

Journal Pre-proof been applied mainly at research level (MSc and PhD theses), and most recently DSM (McBratney et al., 2003) for mapping grid cells at fixed resolution. It was realized that geopedology (Zinck et al., 2016), where geomorphology, remote sensing, and modelling within a GIS environment are used is more cost-effective way to map soils (Roozitalab et al., 2018) though it is not officially considered as a method for soil mapping in Iran (Zeraatpisheh et al., 2020). Based on a strong integration of geomorphology and pedology,

of

the former being used as a tool to improve the quality of soil survey, the approach makes it

ro

possible to predict, map, and explain the occurrence of soils on the landscape with reasonable

-p

accuracy. SWRI surveyors have also attempted to apply DSM methods (McBratney et al., 2003)

re

in several pilot projects. Likewise, legacy soil information has been increasingly used by Iranian soil researchers (see Zeraatpisheh et al., 2020). One stream has been to update legacy surveys by

lP

additional sampling. Pahlavan-Rad et al. (2014) updated some legacy soil maps of northern Iran

na

by merging information from new soil sampling with these maps. Another stream has been to use legacy information as a source for DSM. Pahlavan-Rad et al. (2016) verified the qualification of

ur

these legacy soil maps as a covariate in DSM based on new soil sampling. Similarly,

Jo

Zeraatpisheh et al. (2019) disaggregated and updated legacy soil maps in central Iran using the DSMART DSM algorithm. Pakparvar et al. (2012) used legacy soil data to detect salinity change in southern Iran. Rasaei and Bogaert (2019a, 2019b) assessed the quality of legacy soil data as inputs to DSM in a huge study area in the East Zagros region of Iran. In summary, legacy soil maps have been used for two main purposes: (1) updating the legacy maps to be used as an input in DSM (e.g., see Pahlavan-Rad et al., 2016; Zeraatpisheh et al., 2019), (2) updating the maps by a new sampling campaign (e.g., see Pahlavan-Rad et al., 2014). However, none of these studies explain the preprocessing steps or quality assessment of the

7

Journal Pre-proof legacy maps, and indeed it is not clear that a standard procedure has been applied to these ad hoc uses of legacy maps.

1.3. Objectives This study aims to develop a systematic approach to archaeology, rescue and renewal of legacy soil data from several representative Iranian soil surveys to see if this legacy information

of

can be brought to the standard needed to support mapping soil functions via DSM. First, their

ro

adequacy is investigated based on the Cornell adequacy criteria standards (Forbes et al., 1982),

-p

followed by the renewal stage according to the guidelines presented in Rossiter (2008) and

re

Cambule et al. (2015), and finally their application for DSM, using environmental covariates to produce predictive soil maps of soil classes. Following traditional Iranian soil survey procedure,

lP

the developed procedure is for mapping soil series, which can then be interpreted for land

Jo

2.1. Study area

ur

2. Material and methods

na

suitability, or from which maps of soil properties can be extracted from representative profiles.

The study area is located between 31° 45′ to 33° 15′ N and between 50° 0′ to 51° 30′ E at the border of Isfahan and Chaharmahal-va-Bakhtiari provinces, in the East Zagros region of Iran (Fig. 1). It covers an area of 10,480 km2 , with mean annual temperature ranging from 9.5°C to 12°C and mean annual rainfall ranging from 320 mm to 410 mm. Major parent materials are sedimentary rocks (limestone, sandstone, shale and marl) and alluvial sediments derived from these. Topography is diverse, with elevation ranging from 1802 m to 3835 m a.s.l. Accordingly, the study area includes different physiographic units including mountains, hills, alluvial gravelly fans, piedmont plains, plains, and lowlands. Soils formed on alluvial gravelly fans and high

8

Journal Pre-proof piedmont plains contain coarse parent materials and are shallow, with coarse soil textures. The lowlands contain fine-textured Quaternary limestone-derived deposits or calcareous deposits, so that soils become deeper and with a high clay content. The Soil Taxonomy soil temperature regime is Mesic (Soil Survey Staff, 2014); the Soil Taxonomy soil moisture regime in lowlands with high groundwater level and gleyic conditions is Aquic, in some small parts in the Northeast of the study area Aridic, and in the rest Xeric (Mohammadi, 1999). The area is the source of three

-p

ro

different Taxonomic Orders are formed throughout the area.

of

important Iranian rivers: Zayandehrud, Karoun and Dez. Diverse well-developed soils from

re

2.2. Methods 2.2.1. Data archaeology

lP

An inventory was made of the legacy soil surveys and information in the studied area. This

na

revealed that the study area has long attracted the attention of surveyors in variety of fields, because of its specific climatic and hydrological and consequently soil conditions for agriculture.

ur

One group of studies had soil survey and land evaluation for agriculture as the main aim. A

Jo

second group surveyed hydrological conditions, soil erosion and conservation, preservation of natural resources, and land use planning. Legacy maps and reports are mostly forgotten. Therefore, to collect available legacy soil data took a considerable amount of time and repeated visits to source institutions and libraries, about four months. A list of traditional soil inventories carried out over the study area from 1953 to 2009 is available in Table A.1. Although geopedologic soil mapping approaches (Zinck et al., 2016) is a more modern traditional soil mapping technique, the physiographic approach (Mahler, 1970) was more widely used and thus most soil inventories available in the study area use this method. In all

9

Journal Pre-proof these legacy soil surveys soils of only inter-mountain plains and valleys were studied and mapped.

2.2.2. Data rescue We selected three soil surveys at the scale of 1: 50,000 (semi-detailed) for rescue and renewal, because they have complete soil information, i.e., soil map, soil report, morphological and

of

analytical information of soil profiles, and profile location. Together they cover the study area.

ro

They are specified in the last column of the Table A. 1 as “Yes” (numbers 13, 14, and 16) and are

-p

fully presented in Table 1. All three selected legacy soil surveys were prepared based on the Mahler method. In the rescue stage, paper maps were scanned at a horizontal and vertical

re

resolution of 300 dpi, and thereby converted to the ‘bitmap’ (file extension .bmp) digital format

2.2.3. Data renewal

na

lP

which can be read by any graphics program or GIS that supports raster data.

ur

Data renewal of the mapped parts of the study area consisted of four steps: georeferencing,

Jo

digitizing, map legend, reclassification of soil profiles, statistical quality assessment. These are now explained in turn.

Georeferencing In the first step of data renewal process, the georeference of the legacy maps was determined and related to a coordinate reference system (CRS), with coordinate system, projection and datum (Iliffe and Lott, 2008). The three selected maps were referenced to geographic coordinates (longitude and latitude) on the European Datum 1950 (ED 50), as shown at map corners and ticks every 5’. Since the paper maps have become warped or folded over time, we did not use the

10

Journal Pre-proof corner ticks as tie points and an affine transformation. Rather, a number of reliable control points, i.e., intersection of roads and some villages, were selected on the paper maps and then identified on Google Earth images, which uses geographic coordinates on the WGS84 datum. The WGS84 coordinates were assigned to each control point digitized on the scanned map, which was then warped using a first-order polynomial transformation from this set of points. Thus, the original datum was not used, since the paper maps were directly referenced to WGS84. The quality of this

of

step was evaluated with respect to the root mean square error (RMSE) of the ground control point

-p

ro

transformation (Hughes et al., 2006).

re

Digitizing

In the second step, the boundaries of soil units and the location of 123 soil profiles which had

lP

full physio-chemical information in the accompanying soil survey reports were digitized on-

na

screen from their symbol on the georeferenced scan in the same coordinate reference system (WGS84). Then, the quality of the soil maps units’ separation was checked and evaluated through

ur

the Cornell adequacy criteria: (i) scale and texture, and (ii) map legend, as follows.

Jo

Scale and texture: These criteria are used to evaluate the smallest legible area of soil map units using the definitions proposed by Forbes et al. (1982). Minimum legible area (MLA), the minimum legible land area on a soil map which can be displayed on its printed scale, was calculated based on a minimum legible delineation (MLD) of 0.4 cm2 on the paper map. The maps were then evaluated based on the Index of Maximum Reduction (IMR), which is the factor by which the map scale can be decreased until the average size delineation (ASD) is equal to the MLD. The optimal IMR was set at 2 by the Cornell expert consultation, in which case the ASD is equal to 1.6 cm2 , four times the MLD (Forbes et al., 1982). High IMR values indicate that the soil

11

Journal Pre-proof map does not show details as its scale indicates, whereas small values indicate that the map is illegible, and its printed scale should be increased. Map legend: This defines the map units and refers to the definitions presented in the relevant soil report and may show a brief explanation. Identification is based on the information provided for each soil map unit. Farshad et al. (2016) have shown that the information given in map legend is subjective and generally depends on the surveyor’s experience. It was however confirmed that

of

the physiographic soil maps contain much less information than the geopedologic maps.

ro

To evaluate the map legend, the concept of taxonomic classes is used as a proxy for overall

-p

information on the map unit. In this regard, a map unit is called “well-defined” if its soils are classified as the taxonomic classes without any ambiguity and confusion. Specificity and

re

homogeneity are two components of map legend quality. A unit is considered as “adequately

lP

defined” in the case (i) the soils of the unit are classified in a soil taxonomy class which shows

na

the main characteristics of the unit, and (ii) most of the soils are classified in the same taxonomic class. The map legend information is considered as “adequate” if proportion or percentage of

ur

number or area size of whole adequately defined map units occupies more than 80% of the study

Jo

area. The overall quality of the map legends was then expressed in terms of the proportion of land unit or study area to the total number of units or total area (Forbes et al., 1982). In the third step, the displacement of soil map units’ boundaries is controlled and improved. Considering the relationship between soil formation and topographic features, these latter can be effectively used to this end. Since in the Iranian approach to soil survey the soil units were defined and separated based on the physiographic units, we used the DEM and some of its derivatives including slope, aspect, and hillshade, as well as its basic topographic lines with 100 m interval, to reveal physiography and compare it with the lines from the original survey.

12

Journal Pre-proof The derivative maps were computed from a 90 m resolution DEM (Rasaei and Bogaert, 2019a) with SAGA GIS routines within QGIS 3 (Conrad et al., 2015). The digitized boundaries from the legacy maps were imported to QGIS 3 and displayed over the DEM and its derivatives, without changing the CRS, by “on-the-fly” reprojection within QGIS, in order to find possible dislocation of soil unit boundaries with respect to clearly-defined topographic features. The dislocation was evaluated by spatial accuracy criterion proposed by Goodchild and Hunter

of

(1997), that is part of the total length of the digital lines that are located within a threshold

ro

distance from the topographic lines visualized on the DEM. Here, the buffer intervals began from

-p

the maximum location accuracy (12.5 m at 1: 50,000 map scale) and continued until reaching the

re

90% ratio, within 100 m buffer intervals in accordance with the topographic lines spacing. That is, the digitized boundaries of soil map units with unknown spatial accuracy (test lines) were

lP

compared with the topographic lines with certain spatial accuracy (reference lines). The

na

percentage of the total length of test lines fallen in the buffer areas, with distinct intervals, was calculated as a ratio to the total length of the soil map boundaries’ line. This was continued until

ur

the total length of the soil map boundaries to the total length of the map boundaries was equal to

Jo

90% or more. In this level, a map considered as an acceptable spatial accuracy under the condition that the total length of map unit boundaries fallen in between the buffer areas at 90% percentile was lower than the side length of the MLA (316.23 m at 1: 50,000 map scale). Finally, in the cases where there were significant differences between the boundaries of soil units and the topographic features, the boundaries of soil units were corrected by on-screen digitizing. In the end, the final spatially-corrected soil maps were merged together in order to have single legacy soil map.

Reclassification of soil profiles

13

Journal Pre-proof As legacy soil profiles were classified using several obsolete versions of Soil Taxonomy, their classifications were updated to the family level by following the keys in the latest edition of Soil Taxonomy (Soil Survey Staff, 2014). This was easily done for representative soil profiles, with full description of morphological, physical, and chemical characteristics. For supplementary profiles with no

morphological description and only limited laboratory information, the

classification was inferred from available information and characteristics of the soil series of the

ro

of

map unit in which the georeferenced profile is located.

-p

Statistical evaluation of soil map quality

re

In order to quantitatively evaluate the quality of the conventional soil maps (CSMs), an error matrix, often called a confusion matrix, is formed whose rows represent estimated classes from

lP

the map, and columns represent actual classes from the field (Stehman and Czaplewski, 1998).

na

The kappa index of agreement and overall accuracy are the most used indices computed from the error matrix. To calculate the kappa index, it is assumed that the rows and columns of the

ur

error matrix are independent, and their marginal probability is known before the classification.

Jo

While the user's marginal probability is not specified until the map is completed, so the calculations need to determine the conditional probability for the map user (Rossiter et al., 2017). The Tau index (Ma and Redmond, 1995) was defined to overcome this deficiency. The Tau index requires the analyst to specify the prior probabilities of finding each class. Like kappa, it penalizes map accuracy as the number of classes is reduced and as the balance between classes becomes less equal (Rossiter et al., 2017). In addition to these naïve indices, however, weighted indexes can be used, in which another matrix indicating weights assigned to each classification error is also required. The weighted overall accuracy of the map and the Tau index are calculated as follows (Rossiter et al., 2017);

14

Journal Pre-proof 𝐴𝑜𝑤 = ∑𝑟𝑖=1 ∑𝑟𝑗=1 𝑤𝑖𝑗 . 𝑝𝑖𝑗 𝑇𝑤 =

(1)

( ∑𝑟𝑖=1 ∑𝑟𝑗=1 𝑤𝑖𝑗 .𝑝𝑖𝑗 )−( ∑𝑟𝑖=1 ∑𝑟𝑗=1 𝑤𝑖𝑗 .𝑝𝑖 .𝑝+𝑗 )

(2)

1 −( ∑𝑟𝑖=1 ∑𝑟𝑗=1 𝑤𝑖𝑗 .𝑝𝑖 .𝑝+𝑗 )

Where, Aow: weighted overall accuracy; Tw: weighted Tau index; wij: Cell weight; pij: the probability of each cell; pi: the probability of each row; and p+j: the sum of the probabilities of each column.

of

Various methods can be used to determine weights, as explained by Rossiter et al. (2017). One

ro

preferable way is to determine the weights matrix based on the degree of similarity between soils, either in their intrinsic properties or their use potential. Since there are many possible uses it is

-p

simpler to use taxonomic similarity, under the assumption that the designers of the hierarchical

re

classification systems grouped soils by their overall similarity, which should translate to similar

lP

land use potentials. This rationale was clearly stated by Smith (1986). Since a major motivation for renewing legacy soil data is to compensate for the inability to

na

perform a new soil survey, requiring new field data to evaluate the accuracy of a legacy soil map

ur

is likely unfeasible. Therefore, we used other legacy soil data covering the study area as the

Jo

reference data, assumed correct, to fill the confusion matrix. The legacy data selected for evaluation is from a comprehensive legacy soil survey entitled “correlation study”. In this study, soils of the three selected legacy soil maps were conceptually correlated together in the term of their soil series’ definition by SWRI (Mohammadi, 1999). See Table A. 1 (number 24) and the last row of Table 1 for details. This study covers the three legacy soil surveys under evaluation. Another point which encourages us to prioritize it over others is that all these legacy soil surveys were implemented by almost the same team of soil surveyors therefore data arising from the correlation study can be considered as confirmation work of the selected legacy soil surveys. This data is thus expected to be a good representative of the area.

15

Journal Pre-proof Consequently, 41 soil profiles of the correlation study that fall in the study area were selected. The mean and standard deviation of this data set were compared with 123 data at hand from three selected legacy data for different soil variables (Taghizadeh-Mehrjardi, 2016) to be evaluated to what extent they represent the soil distribution of the study area (Brus et al., 2011). To evaluate the quality of soil map groups separation at four hierarchical Soil Taxonomic levels: Order, Suborder, Great Group, and Subgroup, weighted naïve accuracy (formula 1) and

of

the weighted Tau index (formula 2) were calculated using the known proportion of the legacy

ro

profiles, i.e., the producer’s marginal probabilities, in each class as the prior for the user’s

-p

marginal probabilities. This is because we expect the actual area proportions of classes to be like

re

the mapped proportions, i.e., assuming that the prime surveyors were competent if not perfect. Soil series were considered as members of their respective higher- level soil taxonomic groups.

lP

A similarity matrix was determined based on the taxonomic distance between soils as a weight

na

matrix, as follows.

Available physical and chemical properties of 123 legacy profiles were extracted from three

ur

renewed legacy soil maps including particle-size class separates, i.e., percentages of gravel, sand,

Jo

silt, and clay, and the percentage of organic carbon (OC), calcium carbonate (CaCO3), and volumetric water percentage at saturation (SPz), as well as electrical conductivity (EC) and acidity (pH). Properties were vertically averaged using the aqp “Algorithms for Quantitative Pedology” package in R software (R Development Core Team, 2017) by weighted spline interpolation (Malone et al., 2009) at the five standard depth slices 0–5, 5–15, 15–30, 30–60 and 60–150 cm (or fewer if the profile is shallower) from the GlobalSoilMap.net standard (Arrouays et al., 2014), resulting in a single value of each soil property for each profile. We terminated the standard depths at 150 cm as an “effective rooting depth” of plants typical of the study area.

16

Journal Pre-proof We then used these per-profile properties to compute weights to adjust the unweighted accuracy measures, as follows. Centroids (mean values) of each group for each Soil Taxonomy level of each soil property were calculated from the set of whole-profile averaged properties. The Mahalanobis distance (Formula 3) was then computed as a taxonomic distance between each soil taxonomy level using pairwise.mahalanobis function of the HDMD “Structural Analysis Tools for High Dimensional Molecular Data” R package (McFerrin, 2013). The weights matrix was

of

then computed by dividing the distance matrix by the complement of the maximum distance.

ro

Classes with less than two pedons available for centroid calculations were excluded from

-p

taxonomic distance studies. The Mahalanobis distance dij between two groups i and j is: 𝑡

𝑑𝑖𝑗 = √(𝒙𝑖 − 𝒙𝑗 ) 𝑆 −1 (𝒙𝑖 − 𝒙𝑗 )

re

(3)

lP

where xi and xj are the average vectors of the soil properties of the profiles in groups i and j,

na

respectively, and S is the pooled covariance matrix of the properties.

ur

2.2.4. Use in Digital Soil Mapping

Jo

In order to test the usability of the renewed legacy soil maps, soil classes at four taxonomic levels (Order, Suborder, Great Group, and Subgroup) were predicted across the study area by DSM considering two sets of covariates: (1) environmental covariates extracted from DEM and remote sensing (RS) images only; (2) these with the legacy soil map arising from the renewal step classified to the appropriate level (SoilMap). Twenty environmental covariates were developed from the DEM and Landsat 8 Operational Land Imager (OLI) images of July 2016: Elevation (El), Slope (Sl), Aspect (As) and Flow accumulation (FlAc) and Convexity (Con) (Böhner et al., 2006), Valley depth (VD) (AbdelKader, 2011), Midslope position (MidS) (Böhner and Antonić, 2009), Topographic wetness

17

Journal Pre-proof index (TWI) (Sørensen et al., 2006), Module Multiresolution Index of Valley Bottom Flatness (MRVBF) (Gallant and Dowling, 2003), Stream power index (SPI) (Moore et al., 1991), Vertical Distance to Channel Network (VDCHN) (Bock and Köthe, 2008), Length slope factor (LSf) (Böhner and Selige, 2006), Normalized difference vegetation index (NDVI) (Rouse Jr et al., 1974), grain size index (GSI) (Xiao et al., 2006), Clay index (ClI) and Carbonate index (CarI) (Boettinger et al., 2008), and Gypsum index (GypI) (Nield et al., 2006).

of

In addition, maps of geology, landform, and landuse of the study area were prepared and used.

ro

The geology map was prepared by digitizing geologic maps of the study area at scale of Iran, 2016). This map is of help to

-p

1:100,000 (Geological survey and mineral exploration of

re

extract parent materials of the soils. The landform map was produced from the DEM in 90 m spatial resolution by using LandMapR software (Macmillan, 2003). The landuse map was

lP

extracted from the OLI images by unsupervised classification using ENVI, followed by manual

na

interpretation of classes: range lands, bare lands, water, agricultural lands including cultivation, dry farming, and gardens. All covariates were resampled to 90 m spatial resolution and projected

ur

to the same coordinate system as DEM. The values of the covariates at the locations of the 123

Jo

legacy soil profiles were obtained by map overlay with the ‘over’ function of ‘sp’ R package (Bivand et al., 2013).

To determine which covariates to include in the DSM models, we first used the ‘Boruta’ R package (Kursa and Rudnicki, 2010). The Boruta algorithm finds all “relevant” covariates to predict a target variable for regression and classification (as in this case) random forest models, as well as variable importance. To confirm that the predictors identified as relevant by Boruta have a reasonable relationship with soil classes, binomial logistic regression models (BLR) for each class at each soil Taxonomic level were then built, and the Wald statistic was used to evaluate the significance of the regression coefficient of each predictor for each soil class

18

Journal Pre-proof (Hosmer and Hjort, 2002). Predictors confirmed as important for any class were then used in the following step. We then built Multinomial logistic regression (MLR) models of the relationship between soil classes at each of the four levels using the selected covariates for that level. Models were trained based on 123 legacy soil profiles extracted from three renewed legacy soil maps. This approach was also taken by Prado et al. (2008), who predicted potassium classes from a limited legacy data

of

set in Brazil, and by Rasaei and Bogaert (2019b), who predicted WRB soil classes by using an

ro

unfairly distributed small legacy soil dataset over a rather huge area in Iran with reasonable

-p

results. Internal model evaluation was with the Akaike Information criterion (AIC) (Hosmer and

re

Hjort, 2002). The best model shows the lowest AIC and residual deviance. Also, ANOVA was used to check the variables importance to each soil class. Predictive model performances were

lP

evaluated by 10-fold cross validation. The accuracy statistics (see above, Section 2.2.3) for the

na

two models (with and without legacy map) were compared at each taxonomic level to quantify the effect of using the legacy soil map as a covariate in DSM.

ur

The MLR models were then applied over the portion of the study area that had been included

Jo

in the legacy maps, i.e., the inter-mountain plains and valleys. The models were not applied outside these areas because there were no observations with which to train the models, nor map units to use in the “with legacy maps” DSM. The spatial difference between maps was computed with a cross-classification test over these two maps using ‘diffR’ R package (Pontius Jr and Santacruz, 2019).

3. Results and Discussion 3.1. Data archology and rescue

19

Journal Pre-proof Although the soils in this region have been frequently studied, there are few useable legacy soil studies, as summarized in Table A. 1. These can be grouped into three categories, based on their amount of available information. The first category includes legacy soil studies with complete information (i.e., Ghazi-Zahedi et al., 1992; Mohammadi, 1986; Yazdani et al., 1986) including profile descriptions and locations, laboratory analysis, and maps. The second category includes all information except the soil map (i.e., Mohammadi, 1999). The last category only has

of

laboratory information of geo-referenced soil profiles (sometimes by named location, not

ro

coordinates), with no profile or site description (i.e., Anonymous, 1994; Jalalian and Khademi,

-p

1987).

re

Three legacy soil surveys from the first group were selected for rescue and renewal: (i) Chaharmahal-va-Bakhtiari province (Shahrekord-Borujen regions) (Mohammadi,

1986),

(ii)

lP

Juneghan (Yazdani et al., 1986), and (iii) Fereydan (Ghazi-Zahedi et al., 1992) (Fig. 1, Tables A.

na

1 and 1). They have been carried out by SWRI, on a topographic map base (scale 1: 50,000), with the purpose of identifying soils of inter-mountain plains and evaluating their capability for

ur

agricultural production. Representative profiles were selected in each legend category for full

Jo

description and laboratory analysis. Legend entries are phases of soil series. For each legend entry the capability of the soils of each unit for agricultural production is also indicated. Soils were classified to the Soil Taxonomy family level (Soil Survey Staff, 1999), and also in the FAO (FAO, 1974). The detail of these three selected legacy studies are accessible in three volumes of written soil reports (in Persian) and 18 printed sheets of soil maps, each with a corresponding land evaluation map printed at scale 1: 50,000, 36 sheets of maps in all. The general properties of each soil series, along with description of their soil profiles are mentioned in full detail in the corresponding reports. Morphological descriptions are at disposal only for the representative soil

20

Journal Pre-proof profiles whilst analytical information for all layers of each profile, both representative and supplementary, are reported.

3.2. GIS layers The scanned and georeferenced soil maps, before manual digitizing and boundary correction, are given in Fig. A. 1a. In all studied areas the georeferencing of the evaluation maps is more

of

accurate than for the soil maps, due to the availability of original maps in all study regions with

ro

better map sheet quality. As both soil and evaluation maps have got the same map unit

-p

boundaries in all selected legacy soil surveys, the results of quality control of georeferencing

re

processes of only evaluation maps are presented in Table 2. The relative georeferencing RMSE ranges from 2% of the side length of the MLA (square root of 100 000 m = 316.23 m) in

lP

Juneghan survey to 6% in Fereydan survey. The RMSE is 0.49 to 1.44 times the maximum

na

location accuracy in the scale (12.5 m). Thus, low RMSE values comparing to MLA and the

acceptable.

ur

maximum location accuracy at the map scale indicate that the georeferencing of maps is

Jo

The digitized vector maps corresponding to the mapped parts of the study area are accessible in Fig. A. 1b. The geographic locations of soil profiles were extracted from the georeferenced evaluation maps by digitizing the point symbols. The location of the 123 representative and supplementary soil profiles coloured by soil Order are shown in Fig. 2.

3.3. Reclassification of soils Soil profiles were allocated to six Orders of Soil Taxonomy (Fig. 2, labelled as “calibration”). The majority of the profiles (68%) are Inceptisols. Reclassification rarely affected the Order level and family specification of the original surveys; changes were mostly at the Subgroup level.

21

Journal Pre-proof

3.4. Assessment the quality of soil maps 3.4.1. Map scale and texture Table 3 displays the scale and map texture assessment based on ASD and IMR of the studied legacy soil maps. All the delineations are larger than the MLA of the printed map (10 ha). Therefore, the legacy maps meet the standard. While all IMR values are more than its optimal

of

value (2), which means that maps are legible enough, but map scale can be reduced to a level that

-p

ro

soil units do not lose their legibility.

re

3.4.2. Map legend

In all three studies, soil unit symbols are shown in each delineation. Soil units in the

lP

accompanying reports are described by their differentiation from other units, soil type and

na

properties, area and percentage of coverage, and land evaluation class. In addition to soil map delineations containing soil profiles with either complete information

ur

(representative ones) or only analytical data (see section 2.2.3), there are some delineations, in

Jo

which soil profiles were investigated (as shown by map symbols), but details about their characteristics are not given in the related soil report. In these cases, the classification of soil units is assigned to the map unit or units with similar description in the same soil series. To give an illustration, let us consider a soil series, for example soil series number 5 in one of these study areas with four soil map units – 5.1, 5.2, 5.3, and 5.4 that they all have soil profiles shown on the map, but only 5.3 has the representative soil profile for the series, classified as a Typic Calcixerepts. Following the upper-mentioned example, it is supposed that unit 5.2 has a supplementary soil profile with full information and classified as a Lithic Calcixerepts. To assign an appropriate classification to the two remaining soil units, their definitions, given in the related

22

Journal Pre-proof soil report, are adjusted to 5.2’ and 5.3’. They are considered as Lithic subgroup if their definitions go with 5.2’, having a limiting layer in 50 cm. Otherwise the Typic subgroup is used. In a few units, covering a small area, no soil profiles are available. Following the definitions and criteria suggested by Forbes et al. (1982) in the case of taxonomic classes, the adequacy criterion of only the units with soil profiles or the ones which their detail are mentioned in the soil reports were considered as "adequate" and the others were

of

considered as "inadequately defined”. When the value or percentage of the area of the well-

ro

defined soil unit is significant compared to the amount or percentage of the number of well-

-p

defined soil unit’s components, the overall quality of the information was considered as

re

"sufficient" (more than 80%). Based on the results, information of the legends of legacy soil maps was assessed as sufficient (Table A. 2). E.g., considering Fereydan survey, with 212 soil

lP

map unit delineations: 67 units with soil profiles completely described in the related soil report,

na

70 units with soil profiles in it which their detail are not at hand in the report, and 75 ones without any soil sampling. Therefore, we have 137 (67+70=137) soil map units with adequate

ur

information. Thus only 65% of these surveyor’s map units are “adequately defined”. Therefore,

Jo

the map legend does not convey adequate information in the case of number of soil profiles. However, when the area covered by the map units is taken into consideration, soil map units with adequate information cover 84% of the area which indicates legend of this soil survey offers sufficient information in this case. This is a positive indication of the quality of legacy surveys – the soils were well-explained over a large majority of the survey area.

3.4.3. Control and modification of boundary dislocations In Chaharmahal-va-Bakhtiari and Juneghan study regions, the entire length of the test lines in the buffer zone is equal to the square root of MLA (316.23 m). In Fereydan area, however, it

23

Journal Pre-proof slightly exceeds this threshold (Table A. 3). By overlaying georeferenced and digitized soil unit boundaries on slope and hillshade maps, and Google Map images, displacement at the soil unit boundaries in some cases was revealed and corrected (Fig. A. 2).

3.4.4. Merger to a single soil map The corrected legacy soil map polygons were merged in the GIS. To produce a common

of

legend, seven soil series were found in all three source maps. Referring to the corresponding soil

ro

reports, these series were merged based on their morphological characteristics and taxonomic

-p

classification. This resulted in a soil map with 44 soil series and 167 soil map units (Fig. A. 3),

re

dissolved to Orders and shown in Fig. 3. This final renewed soil map has adequate accuracy in its legend information in the case of both number of soil profiles and the area (Table A. 2).

lP

However, we did not evaluate its scale and texture as we did not disaggregate the final map

na

owing to lack of enough point at hand to deal with. As a result, we will have the same ASD and IMR.

ur

Three tables linked to the map legend were produced (Tables A. 4, A. 5, and A. 6). The first

Jo

contains soil series general description and original classifications, in addition to their division into different soil map units along with their final land classification class. The second contains morphological properties of representative soil profiles in each soil series. The third contains soil profiles’ geographic and physicochemical properties coupled with the number of each soil map units they are located in and their classification using Key to Soil Taxonomy based on its legacy editions (Soil Survey Staff, 1985, 1975), available in the soil reports, and its latest edition (Soil Survey Staff, 2014).

3.4.5. Statistical assessment

24

Journal Pre-proof The distribution of 41 legacy data selected for evaluation set are shown in Fig. 2, labelled as “validation”. Appropriate consistency and validity of these data to be trusted and used in validation procedure are confirmed (Table A. 7). The results of the evaluations and the assessment of the separation quality of soil units in each study region and summarized over the whole area are presented in Table 4. The overall accuracy of the maps at all the classification levels is more than 70%, and the Tau value is more than 50%.

of

This shows that the studied legacy maps have good quality in terms of separation of soil units.

ro

The value of these indices increase in final renewed soil map in all Taxonomic levels comparing

-p

to initial legacy maps. This thus indicate good quality of the final map due to the lower number

re

of soil delineations as a result of merging some soil series together. The low Tau values are due to a lack of balance between the numbers of soils examined in each Order. This is because in all

lP

studied regions, Inceptisols and Alfisols have the highest percentages among others, while

na

Vertisols, Mollisols, and Aridisols are fairly scattered throughout the region (Fig. 2, Tables A. 6 and A. 8). Tau generally decreases with decreasing taxonomic levels because of the larger

ur

number of classes, despite that each one represents less area. This result is similar to that of

Jo

Bazaglia Filho et al. (2013) who evaluated the quality of the conventional soil maps using the Kappa index and found that this index’s value reduces as going to the lower classification levels.

3.4.6. Digital Soil Mapping Fig. 4 shows the most important environmental covariates for modelling soil Orders, as determined by the Boruta package (see Fig. A. 4 for results of other Taxonomic levels). The importance of the covariates largely differs for the various soil Taxonomic levels, that is the number of essential covariates increases at lower levels. Sl and LSf are essential at all levels. El is important to only three first ones, in contrast; MRVBF and Con are considered important at

25

Journal Pre-proof Suborder, Great Group and Subgroup levels. That is, the model needs more secondary information in the case of facing more details by going through lower levels. Including Sl, LSf, and El lower the residual deviance of MLR models. Hence, these are the most informative factors at almost all levels, and correspondingly justify most of the soil variations especially at two first levels. This makes sense because soils at the first level were separated based on different physiographic units (see Section 1.1). This close relationship between soil classes and topography

of

can be also considered as a confirmation of good quality of the renewed legacy soil map in the

ro

term of identification and separation various soils. At Suborder level, soils are mostly classified

-p

considering their soil moisture regime (i.e., Xeric, Aridic, and Aquic). Topographic features (i.e.,

re

Sl and El) indirectly control the depth of ground water (i.e., Aqualfs and Aquepts in Lowlands), and also Aridic regime (i.e., Calcids). MRVBF is in conjunction with sediment accumulations

lP

especially at the lower levels e.g., Typic Haplocalcids, Typic Calcixerepts, Calcic Haploxerepts

na

(Jafari et al., 2012). All together, these emphasize the role of topographic features as the most dominant soil forming factor in the study area which highly controls the distribution of soil types

ur

over the area, as already confirmed in different studies (e.g., Debella-Gilo and Etzelmüller, 2009;

Jo

Jafari et al., 2012; Taghizadeh-Mehrjardi et al., 2015). The legacy soil map (SoilMap) was identified as the most important covariate at all taxonomic levels (Figs. 4b and A. 4). This confirms the value of the legacy soil map for modelling. Similarly, Pahlavan-Rad et al. (2016) found some legacy soil maps from Northern Iran useful to be used as covariates for DSM. BLR models mostly confirmed the most informative covariates from the Boruta (Table A. 9). The performances of models largely vary for the different soil Orders. NDVI, TWI, Geology and Landuse showed relation to only Alfisols, while CarI was significantly associated to only Aridisols. In contrast; VD, LSf, El, Sl, and SoilMap were significantly related to at least three of

26

Journal Pre-proof soil Orders. That is, for Mollisols only one covariate (Sl) and for Aridisols (El, VD) and Vertisols (LSf, SoilMap) only two covariates among others have significantly low p-values, while all these covariates are significantly related to Inceptisols. These findings also confirm that Sl, El, and LSf are related to most of the soils which reflect the general topographic characteristics used in dividing soils, while other topographic features are associated to soil characteristics at the lower levels. E.g., TWI and CarI are respectively related to Endoaqualfs and Haplocalcids Great

of

Groups. As they are well associated with Inceptisols which covers most of the study area, VD,

ro

LSf, El, and Sl are considered as the best covariates to predict soil Orders (Fig. A. 5).

-p

MLR models were then built with the most relevant covariates, with and without the legacy

re

map (Fig. A. 3a) for modelling all soil Taxonomic levels. Table 5 shows that including legacy soil map improved model accuracy. In both cases, with and without considering the legacy map,

lP

the AIC of the models increases (i.e., the model is poorer) with decreasing Taxonomic level. This

na

is reflected in the predictive accuracy (naïve accuracy and the Tau index), which decrease with taxonomic level. This is a natural result of the increasing number of soil classes in the lower

ur

levels and their increasing imbalance. Referring to the number of soil profiles attached with four

Jo

Taxonomic levels in the training and the test data (Table A. 8), one can see that both sets are dominated by Inceptisols and Alfisols. In contrast, in the lower levels, i.e., Great Group and Subgroup, some classes contain only one soil profile. On the other hand, soil classes present in the training data set are missing in the evaluation data set especially in the lower levels, and on the other hand there are two Subgroups in the test set which are not in the training set. Overall, the results look acceptable for modelling soil Orders and even somewhat for Suborders. Indeed, in the case of having limited numbers of soil data over a huge area while are not fairly distributed over the study area (Fig. 2) it can be considered as an unexpected and positive finding (Rasaei and Bogaert, 2019b). Therefore, we feel confident to map over a spatial

27

Journal Pre-proof grid of 90 m resolution covering the study area using the renewed soil series map (SoilMap) as a covariate (Fig. A. 3a). Though some classes at the Order level may not be specific enough for map users interested in land suitability and soil functions, for example when the Suborder level can be Aquic (e.g., Aquepts) or based on regional climate (e.g., Xerepts). Thus a more useful result would be a DSM at a lower Taxonomic level. However, with the limited number of legacy soil data overall, which implies per-class training points at lower Taxonomic levels, this study

of

was forced to work at the Order level (Rasaei et al., 2019b). It may be also argued that a

ro

surveyor-based-judgment legacy soil map with perhaps unsatisfactory, and certaintly unknown

-p

accuracy is not a proper basis for DSM, unless first disaggregated into its components (Odgers et

re

al., 2014).

Figure 5 shows the predictive soil maps from the DSM process. Without using the legacy soil

lP

map as a covariate, MLR misses prediction of Vertisols with the fewest number of observations

na

(Rasaei and Bogaert, 2019b). Inceptisols are mapped over most of the areas, the others are

six orders.

ur

distributed in small patches. By contrast, the model including the renewed legacy map predicts all

Jo

Considering the difference between these two maps (Fig. 6 and Table 6), 73% of the area is predicted the same in both maps, as shown in grey on Fig. 6. The most agreement between two maps is in the case of Inceptisols with the highest number of observations. In the map made without the legacy survey, the area predicted as Inceptisols is greatly increased, at the expense of Alfisols, and Entisols (Table 6). This is not the case for Mollisols and Aridisols with almost same number of observations as Entisols. Likewise, the first model was not able to predict Alfisols, even though there are more observations of these than of Entisols. Consequently, as a result, one can see that the prediction of Alfisols, Vertisols, and Entisols were greatly improved by including the SoilMap in the model.

28

Journal Pre-proof

4. Conclusion This study builds on the few previous efforts on data archaeology, rescue and renewal (Ahmed and Dent, 1997; Arrouays et al., 2017; Cambule et al., 2015; Dent and Ahmed, 1995; Yang et al., 2011) and further confirms that many legacy surveys, long ignored or even forgotten, contain valuable information, and can be rescued to varying degrees. Although emphasis in the digital

of

soil mapping community has been on rescuing legacy profile information (e.g., Arrouays et al.,

ro

2017), the present study shows that rescued and renewed polygon maps can be useful as

-p

covariates for digital soil mapping.

re

The prediction of soil Orders was greatly improved by using the renewed legacy soil map, as shown both by model performance at evaluation points and by the spatial distribution on the

lP

digital soil maps. This clearly shows the importance of the legacy soil map as a covariate for

legacy surveys.

na

digital soil mapping, and justifies the effort needed for the archaeology, rescue and renewal of

ur

In the specific case of Iran, this study shows clearly that there is wealth of previous studies

Jo

that contain valuable information and can be rescued and renewed. The legacy maps in this study were of sufficient accuracy to be used in DSM as a covariate. We chose the most complete studies for this exercise; clearly less complete studies will give less information but could still be useful as a covariate for DSM. Therefore, it is proposed that the quality of legacy soil studies in other parts of Iran be evaluated according to the standard criteria, and based on this either ignoring or rescuing them. Even the preliminary steps of archaeology and rescue into a consistent database of soil studies would be a valuable first step which could be undertaken by a responsible agency. A systematic project to renew them, according to the steps shown in this paper and

29

Journal Pre-proof following recompilation procedures developed for the US soil survey (D’Avello and McLeese, 1998), would be even better.

References Abdel-Kader, F.H., 2011. Digital soil mapping at pilot sites in the northwest coast of Egypt: A multinomial logistic regression approach. The Egyptian Journal of Remote Sensing and

of

Space Science 14, 29–40. https://doi.org/10.1016/j.ejrs.2011.04.001

ro

Ahmed, F.B., Dent, D.L., 1997. Resurrection of Soil Surveys: A case study of the acid sulphate

-p

soils of The Gambia. II. Added value from spatial statistics. Soil Use and Management 13,

re

57–59. https://doi.org/10.1111/j.1475-2743.1997.tb00557.x Anonymous, 1994. Comprehensive study of soil of Chaharmahal-and-Bakhtiari province. Natural

lP

resource organization of Chaharmahal-va-Bakhtiari province (in Persian).

na

Arrouays, D., Leenaars, J.G.B., Richer-de-Forges, A.C., Adhikari, K., Ballabio, C., Greve, M., Grundy, M., Guerrero, E., Hempel, J., Hengl, T., Heuvelink, G., Batjes, N., Carvalho, E.,

ur

Hartemink, A., Hewitt, A., Hong, S.-Y., Krasilnikov, P., Lagacherie, P., Lelyk, G.,

Jo

Libohova, Z., Lilly, A., McBratney, A., McKenzie, N., Vasquez, G.M., Mulder, V.L., Minasny, B., Montanarella, L., Odeh, I., Padarian, J., Poggio, L., Roudier, P., Saby, N., Savin, I., Searle, R., Solbovoy, V., Thompson, J., Smith, S., Sulaeman, Y., Vintila, R., Rossel, R.V., Wilson, P., Zhang, G.-L., Swerts, M., Oorts, K., Karklins, A., Feng, L., Ibelles Navarro, A.R., Levin, A., Laktionova, T., Dell’Acqua, M., Suvannang, N., Ruam, W., Prasad, J., Patil, N., Husnjak, S., Pásztor, L., Okx, J., Hallett, S., Keay, C., Farewell, T., Lilja, H., Juilleret, J., Marx, S., Takata, Y., Kazuyuki, Y., Mansuy, N., Panagos, P., Van Liedekerke, M., Skalsky, R., Sobocka, J., Kobza, J., Eftekhari, K., Alavipanah, S.K., Moussadek, R., Badraoui, M., Da Silva, M., Paterson, G., Gonçalves, M. da C.,

30

Journal Pre-proof Theocharopoulos, S., Yemefack, M., Tedou, S., Vrscaj, B., Grob, U., Kozák, J., Boruvka, L., Dobos, E., Taboada, M., Moretti, L., Rodriguez, D., 2017. Soil legacy data rescue via GlobalSoilMap

and

other international and

national initiatives.

GeoResJ 14,

1–19.

https://doi.org/10.1016/j.grj.2017.06.001 Arrouays, D., McKenzie, N., Hempel, J., de Forges, A.R., McBratney, A.B., 2014. GlobalSoilMap: Basis of the global spatial soil information system. CRC Press.

conventional

field

soil

observations.

Soil

and

Water

Research

8,

13–25.

ro

from

of

Balkovič, J., Rampašeková, Z., Hutár, V., Sobocka, J., Skalský, R., 2013. Digital soil mapping

-p

https://doi.org/10.17221/43/2012-SWR

re

Baxter, S.J., Crawford, D.M., 2008. Incorporating Legacy Soil pH Databases into Digital Soil Maps, in: Hartemink, A.E., McBratney, A., Mendonça-Santos, M. de L. (Eds.), Digital Soil with

Limited

Data.

Springer

lP

Mapping

Netherlands,

Dordrecht,

pp.

311–318.

na

https://doi.org/10.1007/978-1-4020-8592-5_27 Bazaglia Filho, O., Rizzo, R., Lepsch, I.F., Prado, H. do, Gomes, F.H., Mazza, J.A., Demattê,

complex

geology.

Revista

Brasileira

de

Ciência

do

Solo

37,

1136–1148.

Jo

with

ur

J.A.M., 2013. Comparison between detailed digital and conventional soil maps of an area

https://doi.org/10.1590/S0100-06832013000500003 Bivand, R.S., Pebesma, E., Gómez-Rubio, V., 2013. Spatial Data Import and Export, in: Bivand, R.S., Pebesma, E., Gómez-Rubio, V. (Eds.), Applied Spatial Data Analysis with R, Use R! Springer, New York, pp. 83–125. https://doi.org/10.1007/978-1-4614-7618-4_4 Bock, M., Köthe, R., 2008. Predicting the depth of hydromorphic soil characteristics influenced by ground water. SAGA-seconds Out 19, 13–22.

31

Journal Pre-proof Boettinger, J., Ramsey, R., M. Bodily, J., Cole, N., Kienast-brown, S., J. Nield, S., M. Saunders, A., K. Stum, A., 2008. Landsat Spectral Data for Digital Soil Mapping. Digital Soil Mapping with Limited Data. https://doi.org/10.1007/978-1-4020-8592-5_16 Böhner, J., Antonić, O., 2009. Chapter 8 Land-Surface Parameters Specific to Topo-Climatology, in: Hengl, T., Reuter, H.I. (Eds.), Developments in Soil Science, Geomorphometry. Elsevier, pp. 195–226. https://doi.org/10.1016/S0166-2481(08)00008-1

of

Böhner, J., McCloy, K.R., Strobl, J., 2006. SAGA: Analysis and Modelling Applications. No.

ro

115. Goltze.

-p

Böhner, J., Selige, T., 2006. Spatial prediction of soil attributes using terrain analysis and climate

re

regionalisation. SAGA - Analysis and Modelling Applications 115, 13–27. Bouma, J., Montanarella, L., Evanylo, G., 2019. The challenge for the soil science community to

Wiley Online Library.

URL https://doi.org/10.1111/sum.12518 (accessed

na

Management.

lP

contribute to the implementation of the UN Sustainable Development Goals. Soil Use and

8.24.19).

Journal

of

Soil

Science

62,

394–407.

https://doi.org/10.1111/j.1365-

Jo

European

ur

Brus, D., Kempen, B., Heuvelink, G., 2011. Sampling for validation of digital soil maps.

2389.2011.01364.x

Bui, E.N., Moran, C.J., 2001. Disaggregation of polygons of surficial geology and soil maps using spatial modelling and legacy data. Geoderma, Estimating uncertainty in soil models 103, 79–94. https://doi.org/10.1016/S0016-7061(01)00070-2 Cambule, A.H., Rossiter, D.G., Stoorvogel, J.J., Smaling, E.M.A., 2015. Rescue and renewal of legacy soil resource inventories: A case study of the Limpopo National Park, Mozambique. CATENA 125, 169–182. https://doi.org/10.1016/j.catena.2014.10.019

32

Journal Pre-proof Conrad, O., Bechtel, B., Dietrich, H., Fischer, E., Gerlitz, L., Wehberg, J., Wichmann, V., Böhner, J., 2015. System for Automated Geoscientific Analyses (SAGA) v. 2.1.4. Geoscientific Model Development 8, 1991–2007. https://doi.org/10.5194/gmd-8-1991-2015 D’Avello, T.P., McLeese, R.L., 1998. Why Are Those Lines Placed Where They Are?: An Investigation

of

Soil

Map

Recompilation

Methods.

Soil Horizons

39,

119–126.

https://doi.org/10.2136/sh1998.4.0119

of

Debella-Gilo, M., Etzelmüller, B., 2009. Spatial prediction of soil classes using digital terrain

ro

analysis and multinomial logistic regression modeling integrated in GIS: Examples from

-p

Vestfold County, Norway. CATENA 77, 8–18. https://doi.org/10.1016/j.catena.2008.12.001

re

Dent, D.L., Ahmed, F.B., 1995. Resurrection of Soil Surveys - a Case-Study of the Acid Sulfate Soils of the Gambia .1. Data Validation, Taxonomic and Mapping Units. Soil Use and

lP

Management 11, 69–76. https://doi.org/10.1111/j.1475-2743.1995.tb00499.x

na

Dewitte, O., Jones, A., Spaargaren, O., Breuning-Madsen, H., Brossard, M., Dampha, A., Deckers, J., Gallali, T., Hallett, S., Jones, R., Kilasara, M., Le Roux, P., Michéli, E.,

ur

Montanarella, L., Thiombiano, L., Van Ranst, E., Yemefack, M., Zougmore, R., 2013.

Jo

Harmonisation of the soil map of Africa at the continental scale. Geoderma 211–212, 138– 153. https://doi.org/10.1016/j.geoderma.2013.07.007 FAO, 1974. Soil classification system. UNESCO. Farshad, A., Mohammadi, M., Masihabadi, M.H., Farzaneh, A., 2015. Geopedology; Application of RS and GIS in soils studies. Iranian Soil and Water Research Institute (in Persian), Karadj, Iran. Farshad, A., Zinck, J.A., Shrestha, D.P., 2016. Geopedology promotes precision and efficiency in soil mapping: photo - interpretation application in the Henares River Valley, Spain, in: Zinck,

J.A.,

Metternicht,

G.I.,

Bocco-Verdinelli,

33

G.H.R.,

Del Valle,

H.F.

(Eds.),

Journal Pre-proof Geopedology: An Integration of Geomorphology and Pedology for Soil and Landscape Studies, 20. Springer, pp. 347–360. Forbes, T.R., Wambeke, A. van, Rossiter, D., Services, U.S.S.M.S., 1982. Guidelines for Evaluating the Adequacy of Soil Resource Inventories. US Department of Agriculture. Gallant, J.C., Dowling, T.I., 2003. A multiresolution index of valley bottom flatness for mapping depositional

areas.

Water

Resources

39,

1347–1359.

of

https://doi.org/10.1029/2002WR001426

Research

ro

Geological survey and mineral exploration of Iran, 2016. Geological survey and mineral

-p

exploration of Iran. URL https://www.gsi.ir

Ghazi-Zahedi, A.A., Mohammadi, M., Noorbakhsh, F., 1992. Semi-detailed soil survey and land

re

classification of Fereydan region. Ministry of Agriculture, Iranian Soil and Water Research

lP

Institute (in Persian).

International

Journal

na

Goodchild, M., Hunter, G.J., 1997. A simple positional accuracy measure for linear features. of

Geographical

Information

Science

11,

299–306.

ur

https://doi.org/10.1080/136588197242419

Jo

Hosmer, D., Hjort, N., 2002. Goodness-of-fit processes for logistic regression: Simulation results. Statistics in medicine 21, 2723–38. https://doi.org/10.1002/sim.1200 Hughes, M.L., McDowell, P.F., Marcus, W.A., 2006. Accuracy assessment of georectified aerial photographs:

Implications

for

measuring

lateral

channel

movement

in

a

GIS.

Geomorphology 74, 1–16. https://doi.org/10.1016/j.geomorph.2005.07.001 Iliffe, J., Lott, R., 2008. Datums and Map Projections: for Remote Sensing, GIS and Surveying. Whittles Publishing, Caithness, Scotland. Jafari, A., Finke, P.A., Wauw, J.V., Ayoubi, S., Khademi, H., 2012. Spatial prediction of USDAgreat soil groups in the arid Zarand region, Iran: comparing logistic regression approaches to

34

Journal Pre-proof predict diagnostic horizons and soil types. European Journal of Soil Science 63, 284–298. https://doi.org/10.1111/j.1365-2389.2012.01425.x Jalalian, A., Khademi, H., 1987. Studies on determination of resources and land suitability in Semirom region. Ministry of Agriculture, Jihad construction of Isfahan province (in Persian). Keesstra, S.D., Bouma, J., Wallinga, J., Tittonell, P., Smith, P., Cerdà, A., Montanarella, L., Quinton, J.N., Pachepsky, Y., van der Putten, W.H., Bardgett, R.D., Moolenaar, S., Mol, G.,

the

United

Nations

Sustainable

Development

Goals.

SOIL

2,

111–128.

ro

of

of

Jansen, B., Fresco, L.O., 2016. The significance of soils and soil science towards realization

-p

https://doi.org/10.5194/soil-2-111-2016

re

Kempen, B., Brus, D.J., Stoorvogel, J.J., Heuvelink, G.B.M., de Vries, F., 2012. Efficiency comparison of conventional and digital soil mapping for updating soil maps. Soil Science

lP

Society of America Journal 76, 2097–2115. https://doi.org/10.2136/sssaj2011.0424

na

Kursa, M.B., Rudnicki, W.R., 2010. Feature Selection with the Boruta Package. Journal of Statistical Software 36, 1–13. https://doi.org/10.18637/jss.v036.i11

ur

Lagacherie, P., McBratney, A.B., 2006. Chapter 1 Spatial Soil Information Systems and Spatial

Jo

Soil Inference Systems: Perspectives for Digital Soil Mapping, in: Lagacherie, P., McBratney, A.B., Voltz, M. (Eds.), Developments in Soil Science, Digital Soil Mapping. Elsevier, pp. 3–22. https://doi.org/10.1016/S0166-2481(06)31001-X Láng, V., Fuchs, M., Waltner, I., Michéli, E., 2013. Soil taxonomic distance, a tool for correlation: As exemplified by the Hungarian Brown Forest Soils and related WRB Reference

Soil

Groups.

Geoderma

192,

269–276.

https://doi.org/10.1016/j.geoderma.2012.07.023 Leenaars, J.G.B., Kempen, B., van Oostrum, A., Batjes, N., 2014. Africa Soil Profiles Database: a compilation of georeferenced and standardised legacy soil profile data for Sub-Saharan

35

Journal Pre-proof Africa, in: GlobalSoilMap: Basis of the Global Spatial Soil Information System Proceedings

of

the

1st

GlobalSoilMap

Conference.

pp.

51–57.

https://doi.org/10.1201/b16500-13 Li, W., Zhang, C., Dey, D., R Willig, M., 2013. Updating Categorical Soil Maps Using Limited Survey Data by Bayesian Markov Chain Cosimulation. TheScientificWorldJournal 2013, 587284. https://doi.org/10.1155/2013/587284

of

Ma, Z., Redmond, R.L., 1995. Tau coefficients for accuracy assessment of classification of

ro

remote sensing data.

-p

Macmillan, R.A., 2003. LandMapR Software Toolkit- C++ Version (2003): User’s Manual. Ministry of Agriculture, Soil

re

Mahler, P.J., 1970. Manual of Land Classification for Irrigation. Institute of Iran. Pub. No. 205.

lP

Malone, B.P., McBratney, A.B., Minasny, B., Laslett, G.M., 2009. Mapping continuous depth

na

functions of soil carbon storage and available water capacity. Geoderma 154, 138–152. https://doi.org/10.1016/j.geoderma.2009.10.007

ur

Mayr, T., Rivas-Casado, M., Bellamy, P., Palmer, R., Zawadzka, J., Corstanje, R., 2010. Two

Jo

Methods for Using Legacy Data in Digital Soil Mapping, in: Boettinger, J.L., Howell, D.W., Moore, A.C., Hartemink, A.E., Kienast-Brown, S. (Eds.), Digital Soil Mapping: Bridging Research, Environmental Application, and Operation, Progress in Soil Science. Springer Netherlands, Dordrecht, pp. 191–202. https://doi.org/10.1007/978-90-481-8863-5_16 Mayr, T.R., Palmer, R.C., Cooke, H.J., 2008. Digital Soil Mapping Using Legacy Data in the Eden Valley, UK, in: Hartemink, A.E., McBratney, A., Mendonça-Santos, M. de L. (Eds.), Digital Soil Mapping with Limited Data. Springer Netherlands, Dordrecht, pp. 291–301. https://doi.org/10.1007/978-1-4020-8592-5_25

36

Journal Pre-proof McBratney, A.B., Mendonça Santos, M.L., Minasny, B., 2003. On digital soil mapping. Geoderma 117, 3–52. https://doi.org/10.1016/S0016-7061(03)00223-4 McFerrin, L., 2013. HDMD: Statistical analysis tools for high dimension molecular data (HDMD). R package version 1. Moemeni, A., 2000. Production capacity of land resources of Iran. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian).

of

Mohammadi, M., 1986. Semi-detailed soil survey of Chaharmahal-Va-Bakhtiari province

ro

(Shahrekord and Borujen area). Ministry of Agriculture, Iranian Soil and Water Research

-p

Institute (in Persian).

re

Mohammadi, M., 1999. Correlation study of soils in central arid south of Iran, vol. 2: East of Zagros region (Fereydan, Chaharmahal-va-Bakhtiari and Semirom). Ministry of Agriculture,

lP

Iranian Soil and Water Research Institute (in Persian).

na

Moore, I.D., Grayson, R.B., Ladson, A.R., 1991. Digital terrain modelling: A review of hydrological,

geomorphological,

and

biological

applications.

ur

https://doi.org/10.1002/hyp.3360050103

Jo

Nield, J.S., Boettinger, J., Ramsey, R., 2006. Digitally Mapping Gypsic and Natric Soil Areas Using Landsat ETM Data. Soil Science Society of America Journal 71, 245–252. https://doi.org/10.2136/sssaj2006-0049 Odeh, I., Leenaars, J.G.B., Hartemink, A., Amapu, I., 2012. The challenges of collating legacy data for digital mapping of Nigerian soils, in: Digital Soil Assessments and Beyond Proceedings

of

the

Fifth

Global

Workshop

https://doi.org/10.1201/b12728-88

37

on

Digital

Soil

Mapping.

Journal Pre-proof Odgers, N.P., Sun, W., McBratney, A.B., Minasny, B., Clifford, D., 2014. Disaggregating and harmonising soil map units through resampled classification trees. Geoderma 214–215, 91– 100. https://doi.org/10.1016/j.geoderma.2013.09.024 Pahlavan-Rad, M.R., Toomanian, N., Khormali, F., Brungard, C.W., Komaki, C.B., Bogaert, P., 2014. Updating soil survey maps using random forest and conditioned Latin hypercube sampling in the loess derived soils of northern Iran. Geoderma 232–234, 97–106.

of

https://doi.org/10.1016/j.geoderma.2014.04.036

ro

Pahlavan-Rad, M.R., Khormali, F., Toomanian, N., Brungard, C.W., Kiani, F., Komaki, C.B.,

-p

Bogaert, P., 2016. Legacy soil maps as a covariate in digital soil mapping: A case study from

re

Northern Iran. Geoderma 279, 141–148. https://doi.org/10.1016/j.geoderma.2016.05.014 Pakparvar, M., Gabriëls, D., Aarabi, K., Edraki, M., Raes, D., Cornelis, W., 2012. Incorporating

International

Journal

of

Remote

Sensing

33,

6215–6238.

na

Iran.

lP

legacy soil data to minimize errors in salinity change detection: a case study of Darab Plain,

https://doi.org/10.1080/01431161.2012.676688

ur

Pontius Jr, R.G., Santacruz, A., 2019. diffeR: Metrics of Difference for Comparing Pairs of Maps

Jo

or Pairs of Variables.

Prado, R.B., Benites, V.M., Machado, P.L.O.A., Polidoro, J.C., Dart, R.O., Naumov, A., 2008. Mapping Potassium Availability from Limited Soil Profile Data in Brazil, in: Hartemink, A.E., McBratney, A., Mendonça-Santos, M. de L. (Eds.), Digital Soil Mapping with Limited Data. Springer Netherlands, Dordrecht, pp. 91–101. https://doi.org/10.1007/978-1-40208592-5_8 R Development Core Team, 2017. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna Austria.

38

Journal Pre-proof Rasaei, Z., Bogaert, P., 2019a. Spatial filtering and Bayesian data fusion for mapping soil properties: A case study combining legacy and remotely sensed data in Iran. Geoderma 344, 50–62. https://doi.org/10.1016/j.geoderma.2019.02.031 Rasaei, Z., Bogaert, P., 2019b. Bayesian data fusion for combining maps of predicted soil classes: A case study using legacy soil profiles and DEM covariates in Iran. CATENA 182, 104138. https://doi.org/10.1016/j.catena.2019.104138

of

Roozitalab, M.H., Siadat, H., Farshad, A. (Eds.), 2018. The Soils of Iran, World Soils Book

ro

Series. Springer International Publishing.

-p

Rossiter, D.G., 2008. Digital soil mapping as a component of data renewal for areas with sparse

re

soil data infrastructures, in: Hartemink, A.E., McBratney, A., Mendonça-Santos, M. (Eds.), Digital Soil Mapping with Limited Data. Springer, Dordrecht, pp. 69–80.

of

soil

class predictions.

Geoderma

292,

118–127.

na

assessment

lP

Rossiter, D.G., Zeng, R., Zhang, G.-L., 2017. Accounting for taxonomic distance in accuracy

https://doi.org/10.1016/j.geoderma.2017.01.012

ur

Rouse Jr, J.W., Haas, R.H., Schell, J.A., Deering, D.W., 1974. Monitoring Vegetation Systems in

Jo

the Great Plains with Erts. NASA Special Publication 351, 309–317. Sarmento, E.C., Giasson, E., Weber, E.J., Flores, C.A., Rossiter, D.G., Hasenack, H., 2014. Caracterização de mapas legados de solos: uso de indicadores em mapas com diferetens escalas no Rio Grande do Sul. Revista Brasileira de Ciência do Solo 38, 1672–1680. https://doi.org/10.1590/S0100-06832014000600002 Smith, G.D., 1986. The Guy Smith Interviews: Rationale for Concepts in Soil Taxonomy. Technical Monograph. Soil Management Support Service, Washington, DC., Washington, DC.

39

Journal Pre-proof Soil Survey Staff, 2014. Keys to Soil Taxonomy, 12th ed. USDA-Natural Resources Conservation Service, Washington DC. Soil Survey Staff, 1999. Soil Taxonomy: a basic system of soil classification for making and interpreting

soil surveys,

2nd

ed.

USDA-Natural Resources

Conservation

Service,

Washington, DC. Soil Survey Staff, 1985. Key to Soil Taxonomy, Technical Monograph No. 6. ed. USDA- Soil

of

Management Support Service, Ithaca, New York.

ro

Soil Survey Staff, 1975. Soil Taxonomy: A Basic System of Soil Classification for Making and

-p

Interpreting Soil Surveys. USDA- Agricultural Handbook.

re

Sørensen, R., Zinko, U., Seibert, J., 2006. On the calculation of the topographic wetness index: evaluation of different methods based on field observations. Hydrology and Earth System

lP

Sciences 10, 101–112. https://doi.org/10.5194/hess-10-101-2006

na

Stehman, S.V., Czaplewski, R.L., 1998. Design and Analysis for Thematic Map Accuracy Assessment: Fundamental Principles.

Remote Sensing of Environment 64, 331–344.

ur

https://doi.org/10.1016/S0034-4257(98)00010-8

legacy

Jo

Sulaeman, Y., Minasny, B., McBratney, A.B., Sarwani, M., Sutandi, A., 2013. Harmonizing soil data

for

digital soil mapping

in

Indonesia.

Geoderma 192,

77–85.

https://doi.org/10.1016/j.geoderma.2012.08.005 Taghizadeh-Mehrjardi, R., Nabiollahi, K., Minasny, B., Triantafilis, J., 2015. Comparing data mining classifiers to predict spatial distribution of USDA-family soil groups in Baneh region, Iran. Geoderma 253–254, 67–77. https://doi.org/10.1016/j.geoderma.2015.04.008 Taghizadeh-Mehrjardi (Ed.), 2016. Modern concepts in soil science (Pedometrics). Ardakan University (in Persian), Ardakan, Iran.

40

Journal Pre-proof Waltner, I., Csákiné Michéli, E., Fuchs, M., Láng, V., Pásztor, L., Bakacsi, Z., Laborczi, A., Szabó, J., 2014. Digital mapping of selected WRB units based on vast and diverse legacy data, in: Arrouays, D., McKenzie, N., Hempel, J., DeForges, A.C.R., McBratney, A. (Eds.), GlobalSoilMap. CRC Press - Taylor and Francis Group, Boca Raton FL, pp. 313–317. Xiao, J., Shen, Y., Tateishi, R., Bayaer, W., 2006. Development of topsoil grain size index for monitoring desertification in arid land using remote sensing. International Journal of Remote

of

Sensing 27. https://doi.org/10.1080/01431160600554363

ro

Yang, L., Jiao, Y., Fahmy, S., Zhu, A.-X., Hann, S., Burt, J.E., Qi, F., 2011. Updating

-p

Conventional Soil Maps through Digital Soil Mapping. Soil Science Society of America

re

Journal 75, 1044–1053. https://doi.org/10.2136/sssaj2010.0002 Yazdani, A., Eskandarzadeh, Y., Mohammadi, M., 1986. Semi-detailed soil survey of Juneghan

na

Research Institute (in Persian).

lP

region, Chaharmahal-va-Bakhtiari province. Ministry of Agriculture, Iranian Soil and Water

Zeraatpisheh, M., Ayoubi, S., Brungard, C.W., Finke, P., 2019. Disaggregating and updating a

ur

legacy soil map using DSMART, fuzzy c-means and k-means clustering algorithms in

Jo

Central Iran. Geoderma 340, 249–258. https://doi.org/10.1016/j.geoderma.2019.01.005 Zeraatpisheh, M., Jafari, A., Bagheri Bodaghabadi, M., Ayoubi, S., Taghizadeh-Mehrjardi, R., Toomanian, N., Kerry, R., Xu, M., 2020. Conventional and Digital Soil Mapping in Iran: Past,

Present,

and

Future.

URL

https://www.researchgate.net/publication/337945434_Conventional_and_Digital_Soil_Mapp ing_in_Iran_Past_Present_and_Future (accessed 1.19.20). Zhu, A.X., Hudson, B., Burt, J., Lubich, K., Simonson, D., 2001. Soil mapping using GIS, expert knowledge, and fuzzy logic. Soil Science Society of America Journal 65, 1463–1472.

41

Journal Pre-proof Zinck, J.A., Metternicht, G., Bocco, G., Francisco, D.V., Héctor (Eds.), 2016. Geopedology: An Integration of Geomorphology and Pedology for Soil and Landscape Studies, 1st ed. 2016

Jo

ur

na

lP

re

-p

ro

of

edition. ed. Springer, New York, NY.

42

Journal Pre-proof List of Tables Table 1 General characteristics of the selected legacy soil surveys (the three first rows) and the correlation study used for evaluation of DSM models (the last row). All used legacy soil maps are in the scale of 1: 50,000.

Soil survey Chaharmahalva-Bakhtiari

Studied year 1970-1980

Location 31° 45′ to 32° 30′ N 50° 45′ to 51° 30′ E 32° 0′ to 32° 15′ N 50° 30′ to 50° 45′ E

Total area (km2 )

Studied area (km2 )

Nr soil map sheet

3893

1700

6

318

154.8

1

3085

Juneghan

1982

Fereydan

1983-1987

32° 15′ to 33° 15′ N 50° 0′ to 51° 15′ E

6269

1992-1993

31° 0′ to 33° 15′ N 50° 0′ to 52° 0′ E

16500

Correlation

l a n

15000

Nr soil series/ representative soil profiles

r P

Without soil map

r u o

J

43

Nr available soil profiles

67

48

8

12

8

24

99

67

44

-

44

f o

o r p

e

11

Nr soil units

19

Reference (Mohammadi, 1986) (Yazdani et al., 1986) (GhaziZahedi et al., 1992) (Mohammadi, 1999)

Journal Pre-proof Table 2 The quality of georeferencing process of legacy evaluation maps at the scale of 1: 50,000 with Maximum location accuracy = 12.5 m, MLA = 10 ha, and side length of MLA= 316.23 m. Number of

RMSE proportion of

RMSE proportion of

side length MLA

Maximum location

(%)

accuracy

1.18

RMSE Region

ground control (m) points

Chaharmahal10

14.75

4.66

Juneghan

4

6.12

1.94

Fereydan

15

18.01

ro

of

va-Bakhtiari

Jo

ur

na

lP

re

-p

5.70

44

0.49 1.44

Journal Pre-proof Table 3 Assessment of map scale and map texture through the ASD and IMR of studied legacy soil maps. Area of soil map units (ha)

Map texture

Region Minimum

Mean

Maximum

ASD (cm2 )

IMR

44.38

811.40

6357.78

162.28

20.14

Juneghan

34.12

624.42

3762.00

124.88

17.67

Fereydan

36.94

1274.09

9732.78

254.82

25.24

Chaharmahal- va-

Jo

ur

na

lP

re

-p

ro

of

Bakhtiari

45

Journal Pre-proof Table 4 Weighted statistical evaluation of legacy soil maps’ units for four taxonomic levels. Suborder

Great Group

Subgroup

Tau

Overall

Tau

Overall

Tau

Overall

Tau

Overall

index

accuracy

index

accuracy

index

accuracy

index

accuracy

0.786

0.851

0.741

0.817

0.704

0.811

0.704

0.791

Juneghan

0.630

0.776

0.541

0.733

0.503

0.696

0.501

0.696

Fereydan

0.756

0.841

0.763

0.846

0.759

0.841

0.717

0.824

Total

0.773

0.855

0.742

0.844

0.733

0.798

0.791

0.792

Final renewed map

0.826

0.927

0.780

-p

Order

0.764

0.823

0.802

0.809

Chaharmahal- va-

ro

of

Bakhtiari

Jo

ur

na

lP

re

0.917

46

Journal Pre-proof Table 5 MLR models of predicting soil classes in different Taxonomic levels along with their accuracy based on 41 validation data set, without using legacy soil map as a covariate (Dataset1) and when it is included (Dataset2).

Taxonomic

Overall AIC

Dataset

Tau index

Predictors

accuracy 496.196

Dataset1

0.732

0.678

Sl, LSf, El, VD

246.507

Dataset2

0.951

0.912

Sl, LSf, El, SoilMap

889.992

Dataset1

0.678

0.609

Sl, LSf, El, Con, MRVBF

330.026

Dataset2

0.801

0.712

Great

1248

Dataset1

0.526

re

level

Group

402.532

Dataset2

0.543

1606.415

Dataset1

555.504

Dataset2

ro

0.502

Sl, LSf, El, GSI, NDVI, Con, ClI, MRVBF, SoilMap

0.376

0.301

LSf, Sl, Con, GSI, MRVBF, FlAc

0.409

0.416

LSf, Sl, Con, GSI, MRVBF, El, SoilMap

ur

lP

Sl, LSf, El, GSI, NDVI, Con, ClI, MRVBF

na

Sl, LSf, Con, El, FlAc, SoilMap

0.464

Jo

Subgroup

-p

Suborder

of

Order

47

Journal Pre-proof Table 6 The cross-classification table of difference areas between soil Orders predictions (km2 ) from two MLR models, with (Map2) and without (Map1) using the legacy soil map (SoilMap) as a covariate. Map2 (with SoilMap) Alfisols

Aridisols

Alfisols

0

0

Aridisols

5.1

Entisols

Entisols

Inceptisols

Mollisols

Vertisols

0

2.6

0

0

131.6

1.2

43.1

0.6

0

0.1

0.0

0.4

1.7

0

0.1

Inceptisols

404.4

205.2

331.0

145.0

26.8

Mollisols

1.9

0.9

0.8

12.7

0.2

0.1

Vertisols

0

0

0

0

0

of

Orders

(without

3094.0

re

-p

SoilMap)

ro

Map1

Jo

ur

na

lP

0

48

Journal Pre-proof

na

lP

re

-p

ro

of

List of Figures

Fig. 1 Location of the study area in the border of Isfahan (upper right of the study area) and

Jo

ur

Chaharmahal- va-Bakhtiari (lower part of the study area) provinces. The three selected legacy soil surveys are shown too.

49

lP

re

-p

ro

of

Journal Pre-proof

na

Fig. 2 The geographic distribution of legacy soil profiles coloured by soil Orders. Legacy soil profiles digitized from three selected legacy soil maps are shown in circle and derived from the

Jo

ur

Correlation study used to evaluate the quality of soil maps in triangles.

50

lP

re

-p

ro

of

Journal Pre-proof

Fig. 3 Final renewed polygon map of soil units coloured by soil orders. The blank (white) parts

Jo

ur

na

are related to the non-studied and non-mapped parts (mostly mountains).

51

Journal Pre-proof

-p

ro

of

a

Jo

ur

na

lP

re

b

Fig. 4 The results of Boruta for identifying the most important covariates to prediction of soil Orders for two different covariate sets; (a) without and (b) with legacy soil map. Red, yellow and green boxplots represent Z scores of rejected, tentative and confirmed attributes respectively. (El: Elevation, Sl: Slope, As: Aspect, VD: Valley depth, MidS: Midslope position, TWI: Topographic wetness index, MRVBF: Module Multiresolution Index of Valley Bottom Flatness,

52

Journal Pre-proof SPI: Stream power index, VDCHN: Vertical Distance to Channel Network, FlAc: Flow accumulation, LSf: LS factor, Con: Convexity, NDVI: Normalized difference vegetation index,

Jo

ur

na

lP

re

-p

ro

of

GSI: grain size index, ClI: Clay index, GypI: Gypsum index, CarI: Carbonate index)

53

Journal Pre-proof

re

-p

ro

of

a

Jo

ur

na

lP

b

Fig. 5 Maps of predicted soil Orders over the study area by using the selected covariates (a) without and (b) with legacy soil map as a covariate.

54

lP

re

-p

ro

of

Journal Pre-proof

Fig. 6 Difference between two soil Orders predictions through MLR models with (Map2) and

na

without (Map1) legacy soil map using a cross-classification table. Differences are related to

Jo

ur

prediction in the Map2 rather than Map1.

Supplementary data

Supplementary material 1 Supplementary material 2

References (Supplementary materials) Abtahi, A., Yusefi, A., 1968. Comprehensive detailed and semi-detailed soil survey and land classification of Zayandehrud watershed. Ministry of Agriculture, Iranian Soil and Water Research Institute. (in Persian).

55

Journal Pre-proof Anonymous, 1994. Comprehensive study of soil of Chaharmahal-and-Bakhtiari province. Natural resource organization of Chaharmahal-va-Bakhtiari province (in Persian). Ghaumi-Mohammadi, H., 2007. Geopedologic studies of Daran-Damane region. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Ghaumi-Mohammadi, H., 1994. Detailed soil survey and land classification of north of Ardal region. Ministry of Agriculture, Iranian Soil and Water Research Institute. (in Persian).

of

Ghazi-Zahedi, A.A., Mohammadi, M., Noorbakhsh, F., 1992. Semi-detailed soil survey and land

ro

classification of Fereydan region. Ministry of Agriculture, Iranian Soil and Water Research

-p

Institute (in Persian).

re

Jalalian, A., 1997a. Studies on determination of resources and land suitability in Semirom region. Ministry of Agriculture, Jihad construction of Isfahan Province (in Persian).

lP

Jalalian, A., 1997b. Studies on resource and land suitability evaluation in Farsan area. Ministry

Persian).

na

of Jihad construction. Ministry of Agriculture, Jihad construction of Isfahan Province (in

ur

Jalalian, A., Khademi, H., 1987. Studies on determination of resources and land suitability in

Persian).

Jo

Semirom region. Ministry of Agriculture, Jihad construction of Isfahan province (in

Jalalian, A., Mohammadi, M., 1997. Resource studies and land capability of northern watershed of Karoun river. Ministry of Agriculture, planning department (in Persian). Jalalian, A., Mohammadi, M., 1986. Soil survey studies of Chaharmahal-va-Bakhtiari province. Ministry of Agriculture, Iranian plan and Budget organization (in Persian).

56

Journal Pre-proof Mahjuri, S., 1974. Semi-detailed soil survey of Fereydan research farm of rectification and preparation of sapling and seed. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Mohammadi, M., 1999. Correlation study of soils in central arid south of Iran, vol. 2: East of Zagros region (Fereydan, Chaharmahal-va-Bakhtiari and Semirom). Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian).

of

Mohammadi, M., 1986. Semi-detailed soil survey of Chaharmahal-Va-Bakhtiari province

ro

(Shahrekord and Borujen area). Ministry of Agriculture, Iranian Soil and Water Research

-p

Institute (in Persian).

re

Mohammadi, M., 1977a. Semi-detailed and reconnaissance soil survey of different parts of Chaharmahal-va-Bakhtiari province (around Borujen). Ministry of Agriculture, Iranian Soil

lP

and Water Research Institute (in Persian).

na

Mohammadi, M., 1977b. Semi-detailed soil survey of Agriculture and animal college of Shahrekord. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian).

ur

Mohammadi, M., 1970a. Semi-detailed soil survey of Avargan region (Borujen). Ministry of

Jo

Agriculture, Iranian Soil and Water Research Institute (in Persian). Mohammadi, M., 1970b. Semi-detailed soil survey of Dezak region of Shahrekord. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Molazadeh, I., 1977. Land capability evaluation of Zayandehrud watershed. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Noorbakhsh,

F.,

classification

Ghaumi-Mohammadi, research

centers

of

H.,

1984.

Isfahan

57

and

Semi-detailed

soil survey and

Chaharmahal-va-Bakhtiari

land

provinces

Journal Pre-proof (Golpaygan, Borujen, Braan, Shahrekord). Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Ormazdi, B., 1968. Semi-detailed soil survey of some Shahrekord area. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Saadatmand, GH.R., 2009. Qualitative assessment of land suitability for Wheat, Barley, Alfalfa and Potato in Fereydan-Daran region. Ministry of Agriculture, Iranian Soil and Water

of

Research Institute (in Persian).

ro

Saadatmand, GH.R., 2006. Qualitative assessment of land suitability for Wheat, Barley, Alfalfa

-p

and Potato in Fereydan-Chadegan region. Ministry of Agriculture, Iranian Soil and Water

re

Research Institute (in Persian).

Saadatmand, GH.R., 2003. Semi detailed studies of qualitative assessment of land suitability for

lP

Wheat, Barley, Alfalfa and Potato in Fereydan-Asgaran region. Ministry of Agriculture,

na

Iranian Soil and Water Research Institute (in Persian). Samadi, M., 1972. Land capability evaluation of bottom of Zayandehrud, Eastern parts of

Jo

Persian).

ur

Isfahan (part 3). Ministry of Agriculture, Iranian Soil and Water Research Institute (in

Samadi, M., 1940. Land capability evaluation of Isfahan region (Zayandehrud) for irrigation. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Schoeneberger, P.J., Wysocki, D.A., Benham, E.C., Soil Survey Staff, 2012. Field book for describing and sampling soils, Version 3.0. Natural Resources Conservation Service, National Soil Survey Center, Lincoln, NE.

58

Journal Pre-proof Shayesteh, P., Mohajer-Shojaei, M.H., 1991. Evaluation of resources and land suitability of Daran-Shahrekord region. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian). Yazdani, A., Eskandarzadeh, Y., Mohammadi, M., 1986. Semi-detailed soil survey of Juneghan region, Chaharmahal-va-Bakhtiari province. Ministry of Agriculture, Iranian Soil and Water Research Institute (in Persian).

of

Yusefi, A., Zaringhalam, A., 1953. Semi-detailed soil survey and land classification of Isfahan

ro

project (Zayandehrud). Ministry of Agriculture, Iranian Soil and Water Research Institute

lP

re

-p

(in Persian).

na

Highlights

Jo

DSM

ur

 Legacy soil maps in the case of Iran are of reasonable quality and can be used as inputs to

 A MLR model performance is improved by using the renewed legacy soil map as a covariate

 Findings justify the effort needed for the archaeology, rescue, and renewal of legacy surveys

 The used procedures can be found helpful to evaluate legacy soil studies in other parts of Iran

59