A modular and parallelized watershed modeling framework

A modular and parallelized watershed modeling framework

Environmental Modelling and Software 122 (2019) 104526 Contents lists available at ScienceDirect Environmental Modelling and Software journal homepa...

4MB Sizes 3 Downloads 96 Views

Environmental Modelling and Software 122 (2019) 104526

Contents lists available at ScienceDirect

Environmental Modelling and Software journal homepage: http://www.elsevier.com/locate/envsoft

A modular and parallelized watershed modeling framework Liang-Jun Zhu a, b, Junzhi Liu c, d, e, *, Cheng-Zhi Qin a, b, e, **, A-Xing Zhu a, b, c, d, e, f a

State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing, 100101, China b College of Resources and Environment, University of Chinese Academy of Sciences, Beijing, 100049, China c School of Geography, Nanjing Normal University, Nanjing, 210023, China d Key Laboratory of Virtual Geographic Environment (Nanjing Normal University), Ministry of Education, Nanjing, 210023, China e Jiangsu Center for Collaborative Innovation in Geographical Information Resource Development and Application, Nanjing, 210023, China f Department of Geography, University of Wisconsin-Madison, Madison, WI, 53706, USA

A R T I C L E I N F O

A B S T R A C T

Keywords: Watershed modeling framework Modular Parallelization Watershed process simulation SEIMS

It is necessary to develop a flexible and extensible watershed modeling framework with the support of parallel computing to conduct long-term high-resolution simulations over large areas with diverse watershed charac­ teristics. This paper introduced an open-source, modular, and parallelized watershed modeling framework called SEIMS (short for Spatially Explicit Integrated Modeling System) to meet this need. First, a flexible modular structure with standard interfaces was designed, in which each module corresponds to one simulation algorithm for a watershed subprocess. Then, a parallel-computing middleware based on an improved two-level parallel computing approach was constructed to speed up the computational efficiency. With SEIMS, users can add their own algorithms in a nearly serial programming manner and construct parallelized watershed models. SEIMS also supports model level parallel computation for applications which need numerous model runs. The effectiveness and efficiency of SEIMS were illustrated through the simulation of streamflow in the Youwuzhen watershed, Southeastern China.

Software availability Software: SEIMS (Spatially Explicit Integrated Modeling System) Operating systems supported: Windows, Linux, and macOS Language: Cþþ and Python Availability: SEIMS is open-sourced on Github (https://github. com/lreis2415/SEIMS) 1. Introduction Watershed modeling is the process of building computer-based simulation models to aid users in understanding watershed processes (such as hydrology, soil erosion, and nutrient cycling), and has been widely used in applications such as the flood prediction and the evalu­ ation of watershed management practices. A typical procedure for watershed modeling can be summarized as follows: 1) Select or customize watershed models according to the physical characteristics of

the study area and the requirements of applications; 2) Collect and preprocess input data including both the basic data such as climate, landuse, soil, and Digital Elevation Model (DEM) data and applicationspecific data such as the environmental efficiency data for the Best Management Practices (BMPs) under consideration (Qin et al., 2018); 3) Perform sensitivity analysis and calibration manually or automatically (Arnold et al., 2012; Rouholahnejad et al., 2012; Zhan et al., 2013; Zhang et al., 2013); 4) Analyze the model outputs or apply the calibrated model to evaluation purposes such as decision-making support based on BMP scenarios analysis (Maringanti et al., 2009; Qin et al., 2018; Zhu et al., 2019). Two major challenges need to be overcome in order to carry out the above procedures effectively and efficiently. On the one hand, a flexible and extensible modeling system is needed to meet varied modeling purposes, since an ideal and all-purpose watershed model does not exist (Kneis, 2015). On the other hand, parallel computing is required because a large amount of computation is needed by both the model itself and the

* Corresponding author. School of Geography, Nanjing Normal University, Nanjing, 210023, China. ** Corresponding author. State Key Laboratory of Resources and Environmental Information System, Institute of Geographic Sciences and Natural Resources Research, CAS, Beijing, 100101, China. E-mail addresses: [email protected] (L.-J. Zhu), [email protected] (J. Liu), [email protected] (C.-Z. Qin), [email protected] (A.-X. Zhu). https://doi.org/10.1016/j.envsoft.2019.104526 Received 18 September 2018; Received in revised form 10 February 2019; Accepted 24 September 2019 Available online 26 September 2019 1364-8152/© 2019 Elsevier Ltd. All rights reserved.

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 1. (a) Delineation of spatial hierarchical units of a watershed from coarse to fine levels, i.e., subbasins, hillslopes (source-, left-, and right-hillslope), slope positions (e.g., ridge, backslope, and valley), and basic simulation units (e.g., grid cells and irregularly shaped fields); and topography-based flow routing re­ lationships of (b) basic simulation units for overland and subsurface flow routing processes, and (c) subbasins (or reaches) for channel flow routing process.

flexibility rather than good parallel performance. In OMS3, the simu­ lation of different watershed subprocesses is organized into self-contained modules (David et al., 2013), while in ECHSE, several types of spatial units (e.g., subbasin, reach, lake, and node) with asso­ ciated variables and methods are treated as different simulation objects (Kneis, 2015). These frameworks achieve parallel computing by executing modules or objects without dependencies concurrently and are mainly implemented using shared-memory multithreaded pro­ gramming approaches such as Open Multi-Processing (OpenMP), the de facto standard API (Application Programming Interface) for parallel computing on shared-memory machines, which uses notation (directives or pragmas) to indicate the code region that should be executed concurrently (Chapman et al., 2007). Such shared-memory multi­ threaded implementations cannot make good use of more scalable distributed-memory parallel platforms such as the symmetric multi­ processing (SMP) clusters, which limits the scalability of these frame­ works (David et al., 2013; Kneis, 2015; Wenderholm, 2005). In fact, several parallelization strategies which can effectively utilize both SMP cluster and shared-memory parallel platforms have been proposed in recent years (Liu et al., 2016; Vivoni et al., 2011; Wang et al., 2011, 2013; Yalew et al., 2013), but have not yet been integrated into flexible watershed modeling frameworks. These parallelization strategies often conduct task scheduling based on spatial discretization (Liu et al., 2016; Vivoni et al., 2011; Yalew et al., 2013) or spatio-temporal discretization (Wang et al., 2013). The implementations are mainly based on the Message Passing Interface (MPI) programming model or a hybrid of MPI and OpenMP models (Liu et al., 2016). However, to obtain the high efficiency offered by these parallelization strategies requires users to pay a high cost in terms of learning pro­ gramming skills as well as spending lots of effort on parallel program­ ming details, such as domain decomposition, task scheduling, and data communication (Liu et al., 2016; Wang et al., 2013). Meanwhile, these parallelized watershed models often lack standard and concise in­ terfaces for extending new watershed subprocess modules according to corresponding parallelization strategies. Therefore, it is difficult for re­ searchers with less parallel computing experience to add new algorithms for watershed subprocesses (e.g., infiltration) to the specific parallelized

evaluation applications that require numerous model runs (Clark et al., 2017; David et al., 2013; Zhang et al., 2013), especially given the emerging trends of increasing complexity of process-based models, finer spatial and temporal resolutions, larger study areas, and longer simu­ lation periods (Freeze and Harlan, 1969; Liu et al., 2016). Environmental modeling frameworks provide an effective way to address the above-mentioned challenges (David et al., 2013; Formetta et al., 2014; Kneis, 2015; Wagener et al., 2001). General-purpose envi­ ronmental modeling frameworks define standard interfaces to couple the models for different processes, which make them flexible and extensible. This type of framework includes the Earth System Modeling Framework (ESMF; Hill et al., 2004), Open Modeling Interface (OpenMI; Moore and Tindall, 2005), Community Surface Dynamics Modeling System (CSDMS; Peckham et al., 2013), and so on. These general-purpose frameworks often provide parallel-computing support for common operations such as data communications and trans­ formation (e.g. regridding) (Hill et al., 2004). However, since they are designed for coupling the existing models for multidisciplinary appli­ cations, they do not provide specific support for the parallelization of spatially-explicit watershed modeling as well as for the development of new watershed models (Kneis, 2015). Watershed modeling frameworks are those environmental modeling frameworks specifically designed for watershed modeling with remarkable characteristics in terms of flexibility, re-usability, and high performance (David et al., 2013; Kneis, 2015; Buahin and Horsburgh, 2018). Specifically, these characteristics commonly include 1) adopting the object-orientated design to provide standard interfaces for the extension of new modeling objects or algorithms; 2) assembling existing modules or objects for specific watershed modeling task in a config­ urable manner; 3) hiding the implementation details of the framework as much as possible and only exposing concise interfaces to facilitate rapid development; and 4) supporting parallel computing inherently without needing users to handle too much parallel computing pro­ gramming details. Object Modeling System (OMS3, David et al., 2013; Formetta et al., 2014) and Eco-Hydrological Simulation Environment (ECHSE, Kneis, 2015) are typical examples. However, the existing watershed modeling frameworks put emphasis on providing good 2

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 2. Architecture of Spatially Explicit Integrated Modeling System (SEIMS) which consists of the SEIMS module library, two versions of SEIMS main programs (i.e., OpenMP version and OpenMP&MPI version), the watershed database, and utility tools for watershed model applications, and can be conducted on multiple parallel computing platforms (see main text in Section 2.1 for detailed explain).

watershed models, especially those in which different watershed sub­ processes are tightly coupled by sharing global variables, or the code of parallel computing is tightly bound to specific variables or modules. Besides the flexible modeling structure and the parallelization of the model itself, the parallel computing of model level applications (such as parameter sensitivity analysis and auto-calibration) should also be supported by a comprehensive watershed modeling framework. Currently, various tools have been developed for specific watershed models (such as the Soil and Water Assessment Tool [SWAT]; Arnold et al., 1998), which are conducted on multi-core computers (Rouho­ lahnejad et al., 2012), SMP clusters (Zhang et al., 2013), or grid-computing platforms (Zhao et al., 2013). However, to the best of our knowledge, existing parallelized watershed modeling frameworks often lack the support for model level parallelization on multiple parallel-computing platforms (Buahin and Horsburgh, 2018). Therefore, the community still lacks a watershed modeling frame­ work which has a flexible and extensible modular structure and an efficient parallel-computing middleware to facilitate rapid development of parallelized watershed models on distributed-memory parallel plat­ forms. This paper introduces a modular and parallelized watershed

modeling framework (called Spatially Explicit Integrated Modeling System, or SEIMS for short) to fill this gap. The design of SEIMS is introduced in Section 2, and its implementation is discussed in Section 3. A case study is shown in Section 4 to illustrate a typical watershedmodeling procedure based on SEIMS, including model construction, parameter sensitivity analysis, and auto-calibration. The computational efficiencies of the single SEIMS-based model and applications that required repeated model runs are also presented. Conclusions and future perspectives are given in Section 5. 2. Design of SEIMS 2.1. Basic idea and overall design SEIMS is currently designed to focus mainly on one type of distrib­ uted watershed models: those in which both overland and subsurface flow routing and channel flow routing are conducted sequentially and follow upstream-downstream orders (Liu et al., 2014). Accordingly, two common assumptions (Liu et al., 2014, 2016; Wang et al., 2011, 2013) for this type of distributed watershed models implemented in SEIMS are 3

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

presented as follows.

will be dispatched to one of the multiple nodes of an SMP cluster using message-passing programming model for execution, while for each subbasin the simulation of basic simulation units will be further dispatched to multi-core within each node using shared-memory programming model and then be executed. With the help of the modular interface, the details of parallel computing at the subbasin level could be hidden from users, and the parallel computing at the basic-unit level based on OpenMP is quite simple and requires little extra programming skill. This al­ lows users to develop parallelized watershed models easily. Be­ sides the two-level parallelization strategy adopted for inside-model execution, a model level parallelization tool based on job management is also designed to speed up those model level applications which require repeated model runs, such as spatial optimization of BMP scenarios.

(1) A watershed can be partitioned into spatial hierarchical units from coarse to fine levels, such as subbasins, hillslopes, slope positions (also known as “landforms” in Band (1999)), and basic simulation units (e.g., grid cells, Voronoi polygons, or hydrologic response units) (Fig. 1a; Band, 1999; Band et al., 2000; Bieger et al., 2016; Vivoni et al., 2011). A subbasin is defined as a relative closed and independent catchment area which contains two or three types of hillslopes (i.e., source-, left-, and right-hillslope) and drains to one reach (or channel) of the watershed drainage network (Fig. 1a). Slope positions are comparatively homogeneous spatial units with physical geographic features at hillslope scale, which are often too coarse to be simulation units for distributed watershed modeling and however are suitable as configuration units for the evaluation of watershed management practices (Qin et al., 2018; Zhu et al., 2019). Those finer spatial units with more homogeneous hydro­ logic responses (such as grid cells, and irregularly shaped fields) are defined as basic simulation units (Fig. 1a) in SEIMS, on which the hillslope processes at both vertical and horizontal directions are simulated. (2) A topography-based flow direction is assigned to each spatial unit. Computing dependencies exist at different spatial levels and often accord with upstream-downstream routing relationships, such as overland and subsurface flow routing in layers of basic simulation units within each subbasin (Fig. 1b; Liu et al., 2014) and channel flow routing among subbasins (Fig. 1c; Wang et al., 2011).

Based on the basic idea above, the overall architecture of SEIMS was designed as shown in Fig. 2, and consists mainly of the SEIMS module library based on the modular structure, the SEIMS main programs (i.e., OpenMP version and MPI&OpenMP version), the watershed database, and utility tools for watershed model applications (such as parameter sensitivity analysis tool, and auto-calibration tool). The parallel computing middleware at multiple levels is implemented at the basicunit level in SEIMS modules, the subbasin level in the MPI&OpenMP version of SEIMS main program, and the model level in watershed model applications, respectively. A SEIMS-based model consists of one SEIMS main program, several customized SEIMS modules, and the watershed database. Each SEIMS module inherits from the base module class (i.e., SimulationModule) with standard and concise interfaces (e.g., SetData, GetData, and Execute functions) and dependents on base modules such as I/O module (Fig. 2). The basic-unit level parallelization is achieved using OpenMP in the execution function of each SEIMS module. The basic version of SEIMS main program, which is the OpenMP version, is responsible for loading a set of user-configured modules to build and execute a simulation workflow. The MPI&OpenMP version of SEIMS main program uses the subbasin level domain decomposition and task scheduling to create instances of the OpenMP versioned SEIMS main program for each individual subbasins and distribute them among different computing nodes with MPI-based communication, so to ach­ ieve subbasin level parallel computing. Watershed model applications that requires numerous model runs (e.g., sensitivity analysis of a watershed model) are parallelized at model level based on job management. With the support of parallel computing, SEIMS is compatible with common operating systems (such as Windows® and Linux®) and par­ allel computing platforms such as personal computers with multi-core CPU (Central Processing Unit) and SMP clusters (Qin et al., 2014). The detailed design of the modular structure and the parallel computing middleware in current SEIMS is presented as follows.

In order to function as a watershed modeling framework with high extendibility and computational efficiency, SEIMS is designed with two particular features. (a) A flexible modular structure. Much like OMS3 (David et al., 2013), SEIMS adopts a modular structure, which has the advantages of good flexibility and easy maintenance (Leavesley et al., 2006; David et al., 2013; Peckham et al., 2013). Each module corre­ sponds to one simulation algorithm for a watershed subprocess, such as the Penman-Monteith method for simulating potential evapotranspiration. Every module inherits from a standard and concise interface and exposes input and output information via metadata. According to the metadata, a SEIMS-based watershed model for a specific application can be constructed in a loosely coupled manner, i.e., a SEIMS main program dynamically as­ sembles user-configured modules according to their input-output relationships, so to build an application-specific workflow and conducts the simulation. (b) A parallel computing middleware. A parallel-computing middle­ ware is designed to support inside-model and model level parallel computation. Since subbasins can be treated as relative inde­ pendent spatial units for watershed modeling, they are suitable for candidates for being distributed among and executed by the different nodes in an SMP cluster or other distributed-memory parallel platforms (Liu et al., 2016; Vivoni et al., 2011; Wang et al., 2013; Wu et al., 2013; Yalew et al., 2013). For basic simulation units inside each subbasin, which often interact frequently with each other during the simulation, a shared-memory programming model such as OpenMP is suitable and efficient (Liu et al., 2014). Thus, a two-level parallelization strategy proposed by Liu et al. (2014, 2016) is adopted for its exploitation of the parallelizability at both coarse-grained (i.e., subbasin level) and fine-grained (i.e., basic simulation unit level, hereafter referred to basic-unit level for short) levels. With such a strategy, the simulation of subbasins is treated as relative inde­ pendent parallel tasks of individual subbasins. Each of these tasks

2.2. Modular structure Each module of SEIMS module library inherits from a unified module interface (Fig. 2) and is compiled as a separate dynamic link library file. The user-configured modules for a specific watershed modeling appli­ cation are dynamically loaded and combined as a workflow by SEIMS main programs at runtime to conduct the watershed simulation. The unified module interface consists of four types of functions, i.e., Meta­ dataInformation, SetData and GetData, CheckInputData and InitialOutputs, and Execute (Fig. 2). With the unified module interface, users can write SEIMS modules in a nearly serial programming manner and achieve parallel computing at both the basic-unit level inside these modules and the subbasin level without taking care of the details of data communi­ cation in the MPI&OpenMP version of SEIMS main program.

4

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 3. Pseudo-code of the input and output information according to the compatible metadata scheme of SEIMS module.

Fig. 4. Example of the calculation of an InOutput variable (i.e., an output variable that is needed as input for the simulation of one subbasin in MPI&OpenMP version) by the channel flow routing module in the OpenMP version and MPI&OpenMP version of SEIMS main programs. (This InOutput variable has a data type of DT_Array1D for output and a type of TF_SingleValue for transferred data.)

2.2.1. MetadataInformation function The MetadataInformation function provides the metadata of the module, and the SEIMS main program can assemble the selected mod­ ules according to these metadata. The metadata mainly includes basic information about the module, such as its designer and description, as well as the input and output information. The input and output infor­ mation includes static input parameters recorded in database, such as saturated hydraulic conductivity (e.g., lines 1–6 in Fig. 3), dynamic input variables output from other modules, such as surface runoff (e.g., lines 7–9 in Fig. 3), and output variables for the module, such as streamflow at each reach outlet (e.g., lines 10–12 in Fig. 3). Each item of

the input and output information mainly includes the name, unit, description, and data type. The basic data types of SEIMS include single float value (DT_Single, e.g., global parameters of the watershed), 1dimension float array (DT_Array1D and DT_Raster1D, e.g., compact rater data by unfolding 2-dimension array excluding no-data values, Liu et al., 2014), and 2-dimension float array (DT_Array2D and DT_Raster2D, e.g., soil properties of multi-layers). To support the expandability of complex input data with unified interfaces of setting and getting data (see Section 2.2.2), three complex data types are also designed in SEIMS. The first is the reach object (DT_Reach) which contains the reach related parameters (e.g., flow routing layer, task scheduling group, and initial 5

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 5. Pseudo-code of outer loop of time-steps handled in SEIMS main program.

geometric parameters). The second is the subbasin object (DT_Subbasin) which contains subbasin-scale statistical parameters (e.g., area, and average slope). The third is the scenario object (DT_Scenario) that con­ tains BMP scenarios information (e.g., operation schedules of plant management practices with corresponding parameters). The extension of these complex objects can be implemented in the base I/O modules of SEIMS module library (Fig. 2). Since subbasins are modeled independently in the MPI&OpenMP version of SEIMS, data transmission occurs between upstream subbasins and downstream subbasins, especially for channel-routing related modules. This means that some variables could be output and input simultaneously (hereafter referred to as InOutput variable). For example, in the channel flow routing module, the streamflow is an output variable of the array data type with a length of subbasin count (DT_Array1D), while in the MPI&OpenMP version the execution of this module for each subbasin needs the streamflow values from its upstream subbasins as inputs of the single-value data type (e.g., line 11 in Fig. 3). Fig. 4 shows an example of the calculation of an InOutput variable within the channel flow routing module of the OpenMP version and MPI&OpenMP version of SEIMS main programs, respectively. In the OpenMP version, this InOutput variable acts as an ordinary output variable, which is firstly initialized as a 1-dimension array, then assigned values by the execution order of subbasins which following the order of flow routing layers, and finally output as an array with valid simulated values (Fig. 4). In the MPI&OpenMP version which is built on the OpenMP version, the channel flow routing module of a subbasin (e.g., the subbasin D in Fig. 4) cannot be executed until the output values of this variable of all of its upstream subbasins (e.g., subbasin F and G) have been received and the value to this variable of the subbasin has been assigned correspondingly (Fig. 4). Therefore, an optional argument was designed to specify the data type of transferred input and output variables (e.g., lines 11 in Fig. 3) among upstream-downstream subbasins, which made the meta­ data interface compatible with both the OpenMP version and the MPI­ &OpenMP version. There are three options for this argument: TF_None is the default, indicating that the variable doesn’t need to be transferred and can be omitted; TF_SingleValue is used for variables of types DT_Array1D and DT_Raster1D; and TF_OneArray1D is used for variables of types DT_Array2D and DT_Raster2D. With the corresponding setting and getting operations (see Section 2.2.2), the values of transferred variables across subbasins can be dynamically determined after loading the required modules at runtime in the MPI&OpenMP version (see Section 2.3.3). In this way, users can achieve parallel computing at the subbasin level without worrying about the details of data communica­ tion in the SEIMS main program.

include SetValue, Set1DData, and Set2DData, while the GetData functions include GetValue, Get1DData, and Get2DData, correspondingly. For complex data types, only SetData functions are needed, i.e., SetReaches, SetSubbasins, and SetScenario. In order to improve data transfer effi­ ciency across modules in the OpenMP version, array-type and complextype variables are passed by memory address, which means the same variable in different modules points to the same area in memory. Note that, in order to enable the OpenMP version and MPI&OpenMP version of SEIMS main program to invoke SEIMS modules built by the same module code, the variables with certain transferred data types (i.e., TF_SingleValue and TF_OneArray1D) should be set and gotten according to the original data type and transferred data type, respectively. For example, the streamflow variable (line 11 in Fig. 3 and the example in Fig. 4) should be gotten as DT_Array1D to act as an ordinary output variable for output data of the simulation or input of other modules and should also be set as DT_Single for receiving the simulated value of this variable from its upstream subbasin and gotten as DT_Single for its downstream subbasin. 2.2.3. CheckInputData and InitialOutputs functions The CheckInputData function is responsible for checking the validity of input data. For example, it verifies that the input array exists and is accessible and raises an exception if any unavailable or illegal data ex­ ists. The InitialOutputs function is used for creating and initializing output variables and temporary variables of the current module so that the output variables required by other modules as inputs are valid. 2.2.4. Execute function The Execute function, as the core of a module, is for the simulation of a specific watershed subprocess within a time-step. The outer loop of time-steps is handled in SEIMS main program. The nested time-steps between channel/groundwater-routing processes and hillslope pro­ cesses are supported, which means the time-step of hillslope processes can be finer than channel/groundwater-routing processes (Fig. 5). Generally, the customized modules of one SEIMS-based model can be divided into three categories which should be executed sequentially, i.e., driver factor related modules (e.g., reading and preprocessing of climate data), hillslope processes modules, and channel/groundwater-routing modules (Fig. 5). The inner loop in the Execute function is to perform simulation on basic simulation units for hillslope processes and subbasin units (or reaches) for channel/groundwater-routing processes (see Sec­ tion 2.3.2). 2.3. Parallel computing middleware

2.2.2. SetData and GetData functions The SetData and GetData functions are responsible for setting and getting parameters or variables listed in the metadata of one module (e. g., Fig. 3), respectively. The SetData functions for basic data types

Decomposing a modeling issue into tasks that can be executed concurrently is the first step to achieving parallelization (Foster, 1995). The parallel computing middleware of SEIMS implements two types of parallelization strategy, i.e., parallelization of inside-model execution 6

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 6. Partition of (a) subbasins into layers by (b) upstream-downstream strategy and (c) downstream-upstream strategy. Label numbers of each node in the graph (b) and (c) represent computing weights. Adapted from Liu et al. (2016).

Fig. 7. Division of grid cells according to (a) flow directions into layers by (b) upstream-downstream strategy and (c) downstream-upstream strategy. Numbers in grids represent grid indexes while numbers at the bottom line represent layer orders. Adapted from Liu et al. (2014).

based on the spatial discretization of the watershed (Section 2.3.1–2.3.3) and model level parallel job management (Section 2.3.4).

subbasin and basic-unit levels. Subbasins are partitioned into layers via either an upstream-downstream strategy (Fig. 6b) or a downstream-upstream strategy (Fig. 6c). The subbasin layers are treated as a graph, in which one node represents a subbasin and the edges represent the upstream-downstream relationships. Unlike the binary-tree code adopted by Wang et al. (2011), the graph structure

2.3.1. Domain decomposition of watershed The two-level domain decomposition method proposed by Liu et al. (2014, 2016) is adopted to utilize the parallelizability at both the

Fig. 8. Pseudo-code of parallel computing at basic-unit level. Adapted from Liu et al. (2014). 7

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 9. Pseudo-code of the static task scheduling strategy from the perspective of spatial discretization at the subbasin level.

adopted in SEIMS can support subbasins with more than two upstream subbasins. Each node is assigned a computing weight (e.g., by subbasin area in the current version) which is useful to account for load balancing in task scheduling (Liu et al., 2016) (See Section 2.3.2). Taking grid cells as an example of basic simulation units, Fig. 7 shows the division of basic simulation units into layers using an upstream-downstream strategy (Fig. 7b) or downstream-upstream strategy (Fig. 7c) according to flow directions (Fig. 7a). The current version of SEIMS adopts the commonly used D8 single flow direction algorithm (O’Callaghan and Mark, 1984), while multiple flow direction algorithms such as MFD-md algorithm (Qin et al., 2007) could also be supported in future versions. In addition, support for using irregularly shaped fields, such as hydrologic response units (Arnold et al., 1998) and patches (Tague and Band, 2004), as basic simulation units is under development.

2.3.2. Parallelization at the basic-unit level At the basic-unit level (e.g., grid cells in current version of SEIMS), simulation methods for watershed processes can be divided into two types according to their computational characteristics, i.e., local inde­ pendent methods (such as runoff generation processes and plant growth simulations) and sequential dependent methods, such as overland flow routing using 1-D kinematic wave method (Liu et al., 2014). For local independent methods, parallel computing can be conducted simply by dividing simulation units into equal parts, while for sequential depen­ dent methods, parallel computing can be achieved in the same layer divided by flow directions (Fig. 7), in which there are no upstream and downstream relationships between the corresponding simulation units. The OpenMP programming model is used in current version of SEIMS, which makes it quite easy to achieve parallel computing in most circumstances by embedding compiler directives (start with #pragma) in the source code without damaging the original code structure (Fig. 8). Other circumstances which need a few more OpenMP compiler

Fig. 10. Pseudo-code of the execution workflow of one subbasin object in a channel routing time-step. 8

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

directives, such as reduce operations on array-type data according to subbasins, can be found in https://github.com/lreis2415/SEIMS/iss ues/36.

2.4. Watershed database The watershed database for the SEIMS-based model is designed to store all data related to watershed modeling, such as the input and output data, the parameter-adjusting data for sensitivity analysis, and so on. The input data include, but are not limited to, climate data (e.g., precipitation and meteorological data such as temperature), geospatial data (e.g., DEM and the derived spatial parameters, landuse map, soil type map, and soil attribute data), and management data (e.g., plant management schedules), depending on the purpose of watershed modeling. Two types of output data can be specified by a configuration file, i.e., time-series outputs (e.g., discharge at the outlet of the water­ shed) and spatio-temporal distribution variables (e.g., the average of soil water storage). The output data can also be exported as single files (e.g., plain text and GeoTiff files). All spatial input data are decomposed into small parts according to subbasin boundaries and stored in the database as binary types during data preprocessing, so as to improve I/O efficiency (Liu et al., 2016). For the execution of the OpenMP version of the SEIMS-based model, data for the whole watershed are stored as well. With the flexible watershed database design, there is no need to duplicate the input data when conducting numerous model runs differentiated by calibrated parameter-settings or scenario settings, a measure which is required by some other watershed models such as SWAT (Zhang et al., 2013).

2.3.3. Parallelization at the subbasin level SEIMS adopts the MPI programming model to conduct parallel computing at the subbasin level. To make the best use of computational resources, load imbalance should be reduced by distributing the amount of computation in each computing node as evenly as possible, while minimizing the amount of communication (e.g., try to dispatch sub­ basins with upstream-downstream relationships to the same node) to reduce communication overhead (Liu et al., 2016). To achieve such a tradeoff between load balance and communication overhead, current SEIMS adopts the widely used METIS graph-partitioning algorithm (Karypis and Kumar, 1998) for static task scheduling with the graph of subbasins (Fig. 6b and c) as input, and estimates the computing weight of each node roughly by subbasin area (Liu et al., 2016). The master-slave strategy for task scheduling and data communica­ tion, which was commonly used in previous studies (Liu et al., 2016; Wang et al., 2011), may cause unnecessary communication overhead. For example, with the master-slave strategy, the transferred values be­ tween upstream and downstream subbasins will be sent and receive twice, i.e., from slave processes to master process, and from master process to another slave process. Therefore, a static peer-to-peer communication strategy is designed to ensure that the transferred values only need to be sent and received once from a process to another. Fig. 9 shows the pseudo-code of the static task scheduling strategy from the perspective of spatial dis­ cretization at the subbasin level. The predetermined task scheduling plan is loaded by the master process and then broadcasted to all pro­ cesses (lines 1–6 in Fig. 9), which makes the peer-to-peer communica­ tion available. Then, each process is responsible for creating and executing the task of subbasins assigned to it (lines 7–18 in Fig. 9). The execution workflow of one subbasin object within a channel routing time-step based on the peer-to-peer communication strategy is described by the pseudo-code in Fig. 10. The hillslope processes and the channel routing processes can proceed separately one after another (lines 2 and 11 in Fig. 10, respectively). The transferred data from up­ stream subbasins should be updated by reading data from the current process or receiving from another process before the execution of channel routing processes (lines 3–10 in Fig. 10). After that, the trans­ ferred data derived from the current subbasin should be saved for its downstream subbasin, if such a subbasin exists (lines 13–19 in Fig. 10). If the downstream subbasin is not assigned to the current process, the transferred data will be sent directly to another process which handles the downstream subbasin (line 15 in Fig. 10). Finally, the specified output data of the current subbasin will be saved (line 20 in Fig. 10).

3. Implementation of SEIMS 3.1. Overall implementation The proposed modular and parallelized watershed modeling frame­ work, namely SEIMS, was implemented using standard Cþþ and Python programming languages, which means it is cross-platform compatible. SEIMS uses CMake (https://cmake.org), a cross-platform build tool, to manage the entire project in order to achieve compatibility on main­ stream compilation environments. The compiled Cþþ programs include the SEIMS main programs (including the OpenMP version and the MPI&OpenMP version), SEIMS module library, and executable pro­ grams for data preprocessing. Python is used for utility tools including data preprocessing, calibration, sensitivity analysis, scenario analysis, and so on. To simplify the implementation, a parallel job management strategy based on SCOOP (Hold-Geoffroy et al., 2014) using Python language was adopted for the current SEIMS. SCOOP automatically distributes tasks among available computing resources using dynamic load balancing (Hold-Geoffroy et al., 2014), which means it has the potential to be compatible with multiple computing platforms. By drawing lessons from the WetSpa model (Water and Energy Transfer between Soil, plant, and atmosphere) (Liu et al., 2003; Liu, 2004), the SWAT model (Arnold et al., 1998) and other watershed models or algorithms, current SEIMS module library can support the simulation of hydrology, soil erosion, and nutrient cycling processes for storm event models (Liu et al., 2014, 2016; Wu et al., 2018) and long-term (e.g., daily) models (Qin et al., 2018; Zhu et al., 2019). With the parallel middleware of SEIMS at subbasin level and its standard module interfaces, modelers need not take care of programming details of MPI and can use serial programming code to develop their own SEIMS modules achieving subbasin level parallelization with high parallel ef­ ficiency (see Section 2.3.3). Furthermore, modelers can conveniently achieve the OpenMP-based parallelization at basic-unit level through programming with a fairly low learning curve, in most cases without damaging the serial code structure (see Section 2.3.2). Please refer to the user manual document (https://github.com/lreis2415/SEIMS/blob/m aster/SEIMS-UserManual.pdf) for more detailed information on pro­ gramming with SEIMS (such as the demo of developing a new SEIMS module). Given the requirements of flexible data structures, elastic scalability, and high performance, a widely-used NoSQL database, MongoDB (http

2.3.4. Parallelization at the model level Parameter sensitivity analysis (Zhan et al., 2013), auto-calibration (Rouholahnejad et al., 2012; Zhang et al., 2013), and scenario analysis based on optimization algorithms (Qin et al., 2018) are common ap­ plications which need numerous model runs with different model in­ puts. Each model run is independent of others and thus the parallelism at the model level exists, although the results of a series of model runs are summarized. Note that although the multiple model runs can be executed concurrently using multithreading programming techniques on shared-memory machines (e.g., Rouholahnejad et al., 2012), the computational efficiency of each model on individual thread may degrade due to the contention for shared hardware resources, especially when the number of threads exceeds the number of processors. There­ fore, the model level parallelization is suitable to be conducted through parallel task distribution on distributed-memory parallel computing platforms (e.g., SMP cluster, grid computing platform) by job manage­ ment (Zhao et al., 2013) or MPI (Zhang et al., 2013). Current SEIMS provides the model level parallelization based on job management. 9

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 11. Map of the Youwuzhen watershed in Fujian province, China. Adapted from Qin et al. (2018).

s://www.mongodb.com), was adopted to manage all kinds of data in the current version of SEIMS. SEIMS is still under continuous development. Please visit https ://github.com/lreis2415/SEIMS for more details and latest updates.

any sampling method and sensitivity analysis method. The most time-consuming step is repetitive model evaluations, which is paral­ lelized at model level by parallel job management based on SCOOP. Like the parameter sensitivity tool, auto-calibration and scenario analysis tools also need to generate many different model inputs, eval­ uate models, and perform analysis based on model outputs. In the cur­ rent SEIMS version, the commonly used Non-dominated Sorting Genetic Algorithm (NSGA-II; Deb et al., 2002) was integrated as the auto-calibration and scenario analysis tool. The support for other pop­ ular optimization algorithms is on the development plan. Running all utility tools requires a corresponding configuration file in which the application-related parameters are specified. SEIMS pro­ vides different templates for the configuration files for data pre­ processing, parameter sensitivity analysis, auto-calibration, etc. More details please refers to the user manual document (https://github.com/ lreis2415/SEIMS/blob/master/SEIMS-UserManual.pdf).

3.2. Utility tools The data preprocessing tools, which were written in Python and Cþþ languages, include four categories of functions, i.e., watershed spatial discretization, spatial parameters extraction, hydroclimate data pro­ cessing, and watershed database management. The watershed spatial discretization functions include the functions of delineating spatial units at different scales (Fig. 1a) such as subbasins (implemented based on TauDEM v5.3.7; Tarboton, 2016), hillslopes, hydrologically connected fields (Wu et al., 2018), and slope positions (Zhu et al., 2018), and the functions of spatial domain decomposition of the watershed at subbasin and basic-unit levels (see Section 2.3.1). The spatial parameters extraction functions mainly include the calculation of terrain and loca­ tion related parameters, such as depression capacity, and the mapping of soil and landuse/landcover related parameters based on lookup tables. The hydroclimate data processing functions mainly refer to the data formatting (such as interpolating irregular precipitation records to be regular time interval data), and the statistical calculation (such as annual number of heat units). The watershed database management functions are responsible for creating the database and importing data with two types of formats. The first is spatial parameters unfolded as a 1-dimension array and stored as binary data; and the second is plain text data organized and stored as the format of key-value pairs (i.e., JSON format). A main Python script is provided to perform the whole work­ flow of data preprocessing. Present data preprocessing tools in SEIMS are implemented to support the currently available SEIMS modules. This means that additional tools should be added if the input data of a newly developed module is not available from the database or outputs of other modules in SEIMS. Parameter sensitivity analysis is useful for identifying the most important or influential parameters for a specified simulation target (e. g., discharge of the watershed outlet) (Zhan et al., 2013). A typical sensitivity analysis procedure includes sampling from the value ranges of determined model inputs, evaluating the model with the generated samples and saving the interested outputs, and calculating sensitivity indices by various methods such as the Morris screening method (Mor­ ris, 1991) and FAST (Fourier Amplitude Sensitivity Test, Cukier et al., 1978). The sensitivity analysis tool in SEIMS is organized as separated functions according to these steps, which means it is easy to incorporate

4. Case study 4.1. Study area and dataset The Youwuzhen watershed (~5.39 km2), which is located in Changting county of Fujian province of China, was selected as the case study area (Fig. 11). This area belongs to the typical red-soil hilly region in southeastern China and suffers from severe soil erosion (Chen et al., 2013). The study area has hills with steep slopes (up to 52.9� and with an average slope of 16.8� ) and broad alluvial valleys (Qin et al., 2018). The elevation ranges from 295.0 m to 556.5 m. The study area is charac­ terized by a mid-subtropical monsoon moist climate and has an annual average temperature of 18.3 � C. The annual average precipitation is 1697.0 mm and intense short-duration thunderstorm events contribute about three-quarters of annual precipitation from March to August (Chen et al., 2013). The land-use types are mainly forest (59.8%), paddy field (20.6%), and orchard (12.8%). The dominant soil type is red earth (Humic Acrisols in FAO soil taxonomy, or Ultisols in US soil taxonomy) (Qin et al., 2018). The basic data of the Youwuzhen watershed collected for watershed modeling include spatial data (i.e., a gridded DEM with 10 m resolution, a soil type map with a scale of 1:50,000, and a land-use type map interpreted from ALOS [Advanced Land Observation Satellite] images from 2009), climate data, and periodic site-monitoring streamflow and sediment discharge data collected at the watershed outlet (Qin et al., 2018). According to an accumulated threshold of 0.185 km2 (Chen et al., 2013), the Youwuzhen watershed was delineated into 17 subbasins 10

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Table 1 List of hydrological processes related parameters and their value ranges for the parameter sensitivity analysis. Subprocess

Parameter

Description

Max.

Min.

Initial

Change

Range

Evapotranspiration

K_pet gsi

Correction factor Max stomatal conductance (in drough condition) (m/s)

1.3 5.0

0.7 0

1.0 9999

AC RC

[-0.3, 0.2] [1.2, 2.0]

Interception

Interc_max Interc_min Pi_b

Maximum Interception Capacity (mm) Minimum Interception Capacity (mm) Interception Storage Capacity Exponent

100 100 1.5

0 0 0.5

9999 9999 1.35

RC RC AC

[1.0, 2.0] [0.5, 1.6] [-0.15, 0]

Surface runoff

K_run P_max Runoff_co sw_cap

Runoff exponent for a near zero rainfall intensity Rainfall intensity corresponding to a K_run of 1 (mm) Potential runoff coefficient Soil water capacity such as available water (mm)

5.0 1000 1.0 1000

0 10 0.5 0.01

2.5 30 9999 9999

AC AC RC RC

[-1.5, 1.0] [-20.0, 0] [1.2, 1.8] [1.0, 1.5]

Depression storage Percolation Interflow

Depression Conductivity Ki Poreindex

Depression storage capacity (mm) Saturated hydraulic conductivity (mm/hr) Scaling factor Pore size distribution index

5.0 2000 10.0 1.2

0.5 0 0 0.8

9999 9999 3.0 9999

RC RC AC RC

[0.8, 1.5] [0.6, 1.0] [-1.8, 1.1] [0.8, 2.0]

Groundwater

Base_ex Kg df_coef gwmax

Baseflow exponent Baseflow recession coefficient Deep percolation coefficient Maximum groundwater storage (mm)

4.0 0.1 1.0 50000

1.0 0.001 0 1

1.0 0.005 0 300

AC AC AC AC

[0.5, 1.2] [-0.004, 0.001] [0.3, 0.5] [0, 300]

Channel routing

ep_ch MSK_co1 MSK_X ch_n ch_width ch_depth

Channel evaporation adjustment factor Coefficient of the storage time impact for normal flow Muskingum weighting factor Manning coefficient n value of the channel Width of the channel (m) Depth of the channel (m)

1.0 1.0 0.3 0.3 1000 100

0 0 0 0 0 0

1.0 0.75 0.2 9999 9999 9999

AC AC AC RC RC RC

[-0.5, 0] [0, 0.25] [-0.1, 0.1] [0.4, 0.8] [2.0, 4.0] [0.8, 1.5]

Note: 1) Max. and Min. are absolute maximum and minimum values of the parameter, respectively. Initial is the default initial value. Change is the value-changing method, such as AC for adding and RC for multiplying a Change value defined by Range. 2) The default initial value of 9999 means the parameter is spatially distributed data or array-type data.

(Fig. 11) (Qin et al., 2018). Soil properties such as mechanical composition and organic matter were derived from field sampling data (Chen et al., 2013). Soil erod­ ibility factors, cover management factors, and conservation practice factors of the USLE (Universal Soil Loss Equation) model were drawn from the study conducted in the same area by Chen and Zha (2016). Climate data contains daily meteorological data derived from National Meteorological Information Center of China Meteorological Adminis­ tration data and precipitation data derived from local monitoring stations. The periodic site-monitoring data regarding streamflow and sedi­ ment discharge at the watershed outlet from 2012 to 2015 were pro­ vided by the Soil and Water Conservation Bureau of Changting county, Fujian province, China. Limited by the quality of this data, those periods of more than three consecutive days of rainfall with complete records of runoff generation and sediment yield data were selected for modeling. As a result, the year 2012 was selected as a warm-up period for the SEIMS-based model, the years 2013 and 2014 for calibration, and the year 2015 for validation.

evaluate the reduction rate of multi-year average soil erosion under different BMP scenarios. The combination of adopted modules is pre­ pared as a list of module names in a plain text file which will be parsed by the SEIMS main program in order to assembling the corresponding modules as described in Section 2.2. For more details about the corre­ sponding simulation algorithms, please refer to Qin et al. (2018). 4.3. Experiment design 4.3.1. Experimental environments A high-performance computing platform with 134 computing nodes and a General Parallel File System (GPFS) was used. Each computing node was equipped with 2-way Intel® Xeon® E5650 6-cores CPUs (i.e., 12 physical cores in total), 24 GB memory, and one InfiniBand (40 GB/s) network card used for high-speed interconnection of parallel computing jobs. The Red Hat® Enterprise Linux® Server 6.2 was used as Operation System. The compilation environment of SEIMS included Intel® Cþþ 12.1 compiler with the support of OpenMP 3.1, Intel® MPI Library 4.0.3, GDAL-1.11.5, and mongo-c-driver-1.6.1. The virtual environment for running SEIMS utility tools was Python-2.7.13 with several third-party packages such as GDAL-1.11.5, NumPy-1.12.1, matplotlib-1.5.3, pymongo-3.4.0, DEAP-1.2, SALib-1.1.2, and SCOOP-0.7.

4.2. Construction of the SEIMS-based model In this case study, SEIMS was used to construct a grid-based water­ shed model for simulating streamflow and sediment discharge at a daily time-step for the period from 2012 to 2015, according to the available dataset. The simulated hydrologic processes included interception, surface depression storage, surface runoff, potential evapotranspiration, percolation, interflow, groundwater flow, and channel flow. Soil erosion by water on hillslopes and sediment routing in channels were simulated. To be a comprehensive watershed model, the simulation of plant growth process and nutrient (i.e., nitrogen and phosphorous) cycling were also included. The combination of adopted modules is the same as that used and manually calibrated by Qin et al. (2018), which allows for a com­ parison between the manual calibrated model and the auto-calibration conducted by SEIMS (see Section 4.3). In the study of Qin et al. (2018) and the follow-up study of Zhu et al. (2019), the calibrated SEIMS-based watershed model with a daily time-step was used to

4.3.2. Experiment for evaluating parallel performance of SEIMS-based model The simulation results of the OpenMP version and the MPI&OpenMP version were confirmed to be the same before the parallel performance of the SEIMS-based model was tested. According to the parallelization of SEIMS at the subbasin level, the maximum number of processes should not exceed the count of subbasins (i.e., 17 in this case study). Therefore, the OpenMP version and the MPI&OpenMP version were executed separately using the same number (ranging from 1 to 16) of threads and processes (using one thread per process) respectively, for convenience of comparison. For the MPI&OpenMP version, executions using 2, 3, 4, 6, and 8 threads per process were also tested. The actual simulation time, excluding the time for data input and output (I/O), is regarded as 11

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 12. The (a) computing time and (b) speedup ratios of the OpenMP version and the MPI&OpenMP version using different combinations of process and thread numbers and the theoretical maximum speedup ratio (TMSR). The process refers to the instance of SEIMS main program, and the multiple processes of the MPI­ &OpenMP version are created by MPI. A thread is a component of a process. Multiple threads of the OpenMP version and each process of the MPI&OpenMP version are created by OpenMP. With certain numbers of threads per process, the speedup ratio of all combinations reached a plateau after 10 processes. For all numbers of processes, the speedup ratio reached a maximum when the threads number per process is 4.

sensitivity analysis process such as FAST (Cukier et al., 1978) or auto-calibration based on optimization algorithms such as NSGA-II (Deb et al., 2002). Two parameters should be set for the Morris screening method, i.e., the number of replications R and the number of levels p (Morris, 1991). The level number p is for gridding the value range of parameters (Table 1), and it is normally between 4 and 10 (Yang, 2011). In this study, the replication number R was set to 100 and the level number p was 10, which means a total of 2400 (i.e., 100 * (23 þ 1)) models were evaluated for computing the sensitivity indices. The simulation period used for parameter sensitivity analysis was 2012–2014 and the results from the 2013–2014 period are used to evaluate the model performance. Commonly used model performance indicators, NSE (Nash-Sutcliffe efficiency, equation (1)), RSR (root mean square error-standard devia­ tion ratio, equation (2)), and PBIAS (percent bias, equation (3)) (Moriasi et al., 2007), were used to evaluate the performance of the SEIMS-based model in this study.

computing time in this study. For the MPI&OpenMP version, the computing time is the maximum computing time among all processes. Each execution was repeated three times to get an average computing time as the result for evaluation. The speedup ratio, which is defined as the ratio between the serial computing time and the parallel computing time, was used to measure the parallel performance. For the OpenMP version, the serial computing time was the computing time of the execution using one thread, while the computing time of the execution using one process and one thread per process was regarded as the serial computing time for the MPI­ &OpenMP version. The theoretical maximum speedup ratio (TMSR) was also estimated by the method proposed by Liu et al. (2013) which is suitable for grid-based distributed watershed models. The proportion of computing channel routing processes in the total amount of computa­ tions was set as 1% for estimating TMSR. 4.3.3. Experiment of parameter sensitivity analysis in SEIMS The most important step for parameter sensitivity analysis is to determine the involved parameters and their proper change ranges ac­ cording to prior knowledge (Zhan et al., 2013). The simulation of streamflow in a watershed model, compared with that of sediment and nutrient, is often the priority in sensitivity analysis and calibration (Abbaspour et al., 2015). Therefore, this study takes the simulation of streamflow as an example to illustrate the sensitivity analysis and auto-calibration by SEIMS. A total of 23 parameters related to hydro­ logical processes were selected. Table 1 lists the basic information (such as description, absolute maximum and minimum values, and default initial value), the value-changing method, and the range of each selected parameter. The Morris screening method (Morris, 1991) implemented SEIMS was used in this study to qualitatively identify important parameters which impact the model performance of streamflow at the watershed outlet. Although the Morris screening method cannot provide an accu­ rate quantitative estimate of how much a parameter contributes to the model performance indicators or distinguish the non-linearity of a parameter from the interaction with others (Yang, 2011; Zhan et al., 2013), it can be used to exclude non-sensitive parameters through one-at-a-time sensitivity analysis for a subsequent quantitative

n P

NSE ¼ 1

i¼1

2

ðY obs i

Y sim i Þ

ðY obs i

Y mean Þ

i¼1 n P

2

rffiffinffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi P obs 2 ðY i Y sim i Þ i¼1 RSR ¼ rffiffinffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi P obs 2 ðY i Y mean Þ

(1)

(2)

i¼1

n P ðY obs i

PBIAS ¼ i¼1

Y sim i Þ � 100 n P i¼1

Y obs i

(3)

where Yobs and Ysim are the ith observed and simulated values, respec­ i i tively; Ymean is the average of all observed values; n is the number of observed values. The job of parameter sensitivity analysis was distributed between 32 computing nodes. Each computing node was allowed to run 3 model 12

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

evaluations concurrently (except the master computing node, which only ran 2) which meant that up to 95 models could be executed simultaneously. Therefore, the speedup ratio at the model level of SEIMS can be estimated by the ratio between the sum of the running time (including I/O time and computing time) of all SEIMS-based model and the running time of the parallel job. 4.3.4. Experiment of auto-calibration in SEIMS After parameter sensitivity analysis, a small number of the most sensitive parameters were selected for auto-calibration. The value ranges of parameters may be the same with those defined in the parameter sensitivity analysis, or they may be narrowed by excluding values that result in an unacceptable model performance, such as in cases where NSE is less than zero. In the current version of SEIMS, the NSGA-II algorithm was used for auto-calibration. During NSGA-II execution, the Latin-hypercube sam­ pling method (Iman and Shortencarier, 1984) was used to generate initial samples of calibrated parameters. The crossover and mutation operations were similar to the original implementations in NSGA-II by Deb et al. (2002). The initial population number was 240 with a selec­ tion rate of 0.8. The maximum generation number was 50. The crossover probability and the mutate probability were 0.8 and 0.1, respectively. The multi-objectives for auto-calibration of the SEIMS-based model for streamflow simulation in this case study were maximizing the NSE and minimizing both the RSR and the absolute value of PBIAS. The settings for the parallel job management of auto-calibration were the same as those in the parameter sensitivity analysis experiment. 4.4. Experimental results and discussions 4.4.1. Parallel performance of single run of the SEIMS-based model Fig. 12 presents the computing time and speedup ratios of the OpenMP version and the MPI&OpenMP version using different combi­ nations of process and thread numbers. Generally, the computing times decreased and the speedup ratios increased with the number of pro­ cesses for the MPI&OpenMP version (or the thread number for the OpenMP version) in all experiments, except for the OpenMP version when the threads number exceeds the physical cores number (i.e., 12; see the line with black squares in Fig. 12). For the one-tier parallel computation at the basic-unit level (i.e., the OpenMP version), the speedup ratios reached a plateau after the threads number exceeded 8 and got a maximum of 2.26 with 12 threads. In contrast, the speedup ratios of the parallel computation at the subbasin level (i.e., the MPI­ &OpenMP version using 1 thread per process) reached stability after the processes number exceeded 10 and got a maximum of 6.24 with 16 processes, which is very close to the theoretical maximum speedup ratio in this case (i.e., 6.44). Thus, the parallelization strategy based on MPI at the subbasin level can achieve a much higher efficiency than the par­ allelization strategy based on OpenMP at the basic-unit level. When the thread number exceeded 1 for the MPI&OpenMP version, i.e. the situation where the two-level parallel computation was con­ ducted, the speedup ratios significantly improved, as shown by Liu et al. (2016). The stable situation also appeared when the process number exceeded 10. With any number of processes in the experiment, the speedup ratios increased with the thread number first and then decreased after it exceeded 4. The maximum speedup ratios reached 19.15 under 14 processes and 4 threads per process, which was nearly three times that of the TMSR (Fig. 12b). This means that the two-level parallelization strategy integrated into SEIMS has much better scal­ ability than the parallelization at either the subbasin level or the basic-unit level only (Liu et al., 2016). Considering the parallel performance of the MPI&OpenMP version and the computing resources of each computing node, the combination of 4 processes and 2 threads per process were used by every single model run in the following model level parallel computation for parameter sensitivity analysis and auto-calibration.

Fig. 13. Parameter sensitivity analysis results with respect to the (a) NSE, (b) RSR, and (c) PBIAS of the simulated streamflow (m3s 1) at the water­ shed outlet.

4.4.2. Results of parameter sensitivity analysis by SEIMS The running time of the parallel job which included all 2400 SEIMSbased models for parameter sensitivity analysis was 7629.42 s (~2.12 h) by SEIMS, while the sum of the running time (including I/O time and computing time) of individual SEIMS-based models was 539195.25 s (~149.78 h) in this study. From the perspective of model level paralle­ lization, the speedup ratio was 70.67, which indicates a good parallel efficiency of the parallel computing middleware in SEIMS. The parameter sensitivity analysis result from the Morris screening method in SEIMS is shown as screening plots (Fig. 13). Each point in the screening plot represents one parameter of the SEIMS-based model in Table 1. The larger the modified means μ* (Campolongo et al., 2007), the more sensitive the watershed response is to variation in the parameter (Yang, 2011; Zhan et al., 2013). As shown in Fig. 13, the importance ranks of parameters were slightly different depending on the model performance indicators (i.e., NSE, RSR, and PBIAS). For example, the four parameters with the highest sensitivity for NSE and RSR are Base_ex, df_coef, Kg, and sw_cap while those for PBIAS are K_pet, ch_n, MSK_co1, and df_coef. Finally, a total of twelve parameters (i.e., Base_ex, df_coef, Kg, sw_cap, ch_n, K_pet, MSK_co1, Runoff_co, ch_width, Conductiv­ ity, ch_depth, Ki) were selected as sensitivity parameters for this case and were used for the auto-calibration of streamflow simulation.

13

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

Fig. 14. The (a) changes of hypervolume index with generations and the near optimal Pareto fronts of selected generations: (b) the first generation, (c) the 14th generation, and (d) the 32nd generation.

Fig. 15. The calibration and validation of the simulated streamflow (m3s 1) at the watershed outlet of the study area of one solution selected from the 50th near optimal Pareto fronts.

4.4.3. Auto-calibration results by SEIMS The value ranges for the calibrated parameters were adopted as same as those in the parameter sensitivity analysis experiment. During the evolution portion of the auto-calibration process, a total of 7916 SEIMSbased models were executed. The running time of the parallel job which included all 7916 SEIMS-based models from all generations was 38648.03 s (~10.74 h), while the sum of the running times (including I/ O time and computing time) of the individual SEIMS-based models was

2504051.23 s (~695.57 h). From the perspective of model level paral­ lelization, the speedup ratio was 64.79. The relative low speedup ratio compared with that in the parameter sensitivity analysis experiment (i. e., 70.67) could be attributed to the oversupply of computational re­ sources. For example, there were about 153 new models which needed to be executed for each generation, which means that after the first 95 model runs were finished only 58 models left to be executed on 95 computational resources. 14

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526

The hypervolume index (Zitzler and Thiele, 1999) is often used to indicate the quality of the near Pareto optimal fronts by considering both convergence and diversity (Zitzler et al., 2003). A higher hyper­ volume indicates a better quality of solutions. As shown in Fig. 14a, the hypervolume index increased slowly continuously until keeping stable after the 32nd generation. Fig. 14 also presents near optimal Pareto fronts of the generation 1st, 14th, and 32nd, from which the evolution trend from scattering to concentrate can be obviously observed. Any solution from the near optimal Pareto fronts after the 32nd generation could be selected for further applications. Fig. 15 shows the calibration and validation of the simulated streamflow (m3s 1) at the watershed outlet of the study area of one solution selected from the 50th near optimal Pareto fronts. According to the general performance ratings for streamflow simulations at a monthly time step by Moriasi et al. (2007), a model simulation can be judged as satisfactory if NSE > 0.50, RSR � 0.70, and PBIAS is within �25%. Thus, the SEIMS-based model performance of streamflow, in this case, is satisfactory since the NSE, RSR, and PBIAS for the calibration period are 0.58, 0.65, and 1.52%, respectively, and they are 0.52, 0.69, and 11.40%, respectively, for the validation period. In the study of Qin et al. (2018), a total of 10 streamflow simulation related parameters were determined by trial-and-error and then half of them were included in the set of parameters selected for auto-calibration in this study (i.e., df_coef, Kg, sw_cap, Runoff_co, and Conductivity) with close calibrated values. For example, the calibrated value of Conductivity was 0.8 in Qin et al.’s (2018) study and 0.777 for the selected solution in this study. Compared to the manual calibration conducted by Qin et al. (2018), which had NSE, RSR, and PBIAS of 0.48, 0.72, and 16.24% for the calibration period and 0.28, 0.85, and 10.69% for the validation period, respec­ tively, the result of auto-calibration in SEIMS showed better perfor­ mance in terms of extrapolation.

Acknowledgments The work reported here was funded by the National Natural Science Foundation of China (No. 41431177, 41601413, 41871362, and 41871300), the Innovation Project of LREIS (No. O88RA20CYA), Na­ tional Basic Research Program of China (Project No.: 2015CB954102), PAPD (Project No.: 164320H116), and Outstanding Innovation Team in Colleges and Universities in Jiangsu Province. Supports to A-Xing Zhu through the Vilas Associate Award, the Hammel Faculty Fellow Award, and the Manasse Chair Professorship from the University of WisconsinMadison are greatly appreciated. Cheng-Zhi Qin thanks the support through the “Excellent Young Scholars” project from National Natural Science Foundation of China. The authors thank all contributors to SEIMS project with their brilliant code and advices, and a range of opensource software used by SEIMS such as GDAL (https://github.com/O SGeo/gdal), mongo-c-driver (https://github.com/mongodb/mongo -c-driver), METIS (http://glaros.dtc.umn.edu/gkhome/metis/metis /overview), TauDEM (https://github.com/dtarb/TauDEM), tinyxml (https://www.sourceforge.net/projects/tinyxml), SALib (https://gith ub.com/SALib/SALib), DEAP (https://github.com/DEAP/deap), and SCOOP (https://github.com/soravux/scoop), etc., which motivate SEIMS to be an open-source framework. The authors appreciate the valuable suggestions and comments from the anonymous reviewers, which help us to improve the manuscript. References Abbaspour, K.C., Rouholahnejad, E., Vaghefi, S., Srinivasan, R., Yang, H., Kløve, B., 2015. A continental-scale hydrology and water quality model for Europe: calibration and uncertainty of a high-resolution large-scale SWAT model. J. Hydrol 524, 733–752. https://doi.org/10.1016/j.jhydrol.2015.03.027. Arnold, J.G., Moriasi, D.N., Gassman, P.W., Abbaspour, K.C., White, M.J., Srinivasan, R., Santhi, C., Harmel, R.D., Van Griensven, A., Van Liew, M.W., 2012. SWAT: model use, calibration, and validation. Trans. ASABE (Am. Soc. Agric. Biol. Eng.) 55, 1491–1508. https://doi.org/10.13031/2013.42256. Arnold, J.G., Srinivasan, R., Muttiah, R.S., Williams, J.R., 1998. Large area hydrologic modeling and assessment part I: model development. J. Am. Water Resour. Assoc. JAWRA 34, 73–89. https://doi.org/10.1111/j.1752-1688.1998.tb05961.x. Band, L.E., Tague, C.L., Brun, S.E., Tenenbaum, D.E., Fernandes, R.A., 2000. Modelling watersheds as spatial object hierarchies: structure and dynamics. Trans. GIS 4, 181–196. https://doi.org/10.1111/1467-9671.00048. Band, L.E., 1999. Spatial hydrography and landforms. In: Longley, P., Goodchild, M., Maguire, D., Rhind, D. (Eds.), GIS: Management Issues and Applications. John Wiley & Sons, pp. 527–542. Bieger, K., Arnold, J.G., Rathjens, H., White, M.J., Bosch, D.D., Allen, P.M., Volk, M., Srinivasan, R., 2016. Introduction to SWATþ, a completely restructured version of the soil and water assessment tool. J. Am. Water Resour. Assoc. JAWRA 53, 115–130. https://doi.org/10.1111/1752-1688.12482. Buahin, C.A., Horsburgh, J.S., 2018. Advancing the open modeling interface (OpenMI) for integrated water resources modeling. Environ. Model. Softw 108, 133–153. https://doi.org/10.1016/j.envsoft.2018.07.015. Campolongo, F., Cariboni, J., Saltelli, A., 2007. An effective screening design for sensitivity analysis of large models. Environ. Model. Softw. Modelling, Comput-Ass. Simul. Map. Dangerous Phenom. Hazard Assess. 22, 1509–1518. https://doi.org/10. 1016/j.envsoft.2006.10.004. Chapman, B., Jost, G., Van Der Pas, R., 2007. Using OpenMP: Portable Shared Memory Parallel Programming, Scientific and Engineering Computation. The MIT press, Cambridge, Massachusetts, London, England. Chen, S., Zha, X., 2016. Evaluation of soil erosion vulnerability in the Zhuxi watershed, Fujian Province, China. Nat. Hazards 82, 1589–1607. https://doi.org/10.1007/s110 69-016-2258-4. Chen, Z.-B., Chen, Z.-Q., Yue, H., 2013. Comprehensive Research on Soil and Water Conservation in Granite Red Soil Region: A Case Study of Zhuxi Watershed, Changting County, Fujian Province. Science Press, Beijing (in Chinese). Clark, M.P., Bierkens, M.F.P., Samaniego, L., Woods, R.A., Uijlenhoet, R., Bennett, K.E., Pauwels, V.R.N., Cai, X., Wood, A.W., Peters-Lidard, C.D., 2017. The evolution of process-based hydrologic models: historical challenges and the collective quest for physical realism. Hydrol. Earth Syst. Sci. 21, 3427–3440. https://doi.org/10.519 4/hess-21-3427-2017. Cukier, R.I., Levine, H.B., Shuler, K.E., 1978. Nonlinear sensitivity analysis of multiparameter model systems. J. Comput. Phys. 26, 1–42. https://doi. org/10.1016/0021-9991(78)90097-9. David, O., Ascough II, J.C., Lloyd, W., Green, T.R., Rojas, K.W., Leavesley, G.H., Ahuja, L. R., 2013. A software engineering perspective on environmental modeling framework design: the Object Modeling System. Environ. Model. Softw., Thematic Issue on the Future of Integrated Modeling Science and Technology 39, 201–213. https://doi. org/10.1016/j.envsoft.2012.03.006.

5. Conclusions This paper proposes an open-source, modular, and parallelized watershed modeling framework (SEIMS), which makes it easier for re­ searchers to build their own watershed models effectively and effi­ ciently. With a fairly low learning curve of OpenMP-based parallel computing technique, users of this framework can add their own algo­ rithms in a nearly serial programming manner and construct parallelized models. As a comprehensive modeling framework, SEIMS also provides utility tools for parallel computation at the model level to support ap­ plications such as sensitivity analysis and auto-calibration, which need numerous model runs. The effectiveness and efficiency of SEIMS were illustrated through a case study in the Youwuzhen watershed, which included model construction, parameter sensitivity analysis based on the Morris screening method, and auto-calibration based on the NSGA-II algorithm. SEIMS still needs to be improved continuously. On the aspect of basic structure, the support of irregularly shaped fields as basic simulation units and the multiple flow direction model is under development. To extend the scalability and further improve the computing efficiency of SEIMS, general-purpose computing on graphics processing units (GPUs) at the basic-unit level may be a promising direction (Qin, 2015). New task scheduling strategies at the subbasin level, which consider load balance based on the concept of the spatial computational domain (Wang and Armstrong, 2009) by means of pre-experiment or quantity estimation equations, might produce a higher speedup ratio than the static task scheduling strategy based on the area of subbasins in current version of SEIMS. Declaration of competing interest None.

15

L.-J. Zhu et al.

Environmental Modelling and Software 122 (2019) 104526 Qin, C.-Z., Zhan, L.-J., Zhu, A.-X., Zhou, C.-H., 2014. A strategy for raster-based geocomputation under different parallel computing platforms. Int. J. Geogr. Inf. Sci. 28, 2127–2144. https://doi.org/10.1080/13658816.2014.911300. Qin, C.-Z., Zhu, A.-X., Pei, T., Li, B., Zhou, C., Yang, L., 2007. An adaptive approach to selecting a flow-partition exponent for a multiple-flow-direction algorithm. Int. J. Geogr. Inf. Sci. 21, 443–458. https://doi.org/10.1080/13658810601073240. Rouholahnejad, E., Abbaspour, K.C., Vejdani, M., Srinivasan, R., Schulin, R., Lehmann, A., 2012. A parallelization framework for calibration of hydrological models. Environ. Model. Softw 31, 28–36. https://doi.org/10.1016/j.envsoft.2011. 12.001. Tague, C.L., Band, L.E., 2004. RHESSys: regional hydro-ecologic simulation system-an object-oriented approach to spatially distributed modeling of carbon, water, and nutrient cycling. Earth Interact. 8, 1–42. https://doi.org/10.1175/1087-3562(2004) 8<1:RRHSSO>2.0.CO;2. Tarboton, D.G., 2016. Terrain Analysis Using Digital Elevation Models (TauDEM 5.3.7). http://hydrology.usu.edu/taudem/. (Accessed 19 October 2016). Vivoni, E.R., Mascaro, G., Mniszewski, S., Fasel, P., Springer, E.P., Ivanov, V.Y., Bras, R. L., 2011. Real-world hydrologic assessment of a fully-distributed hydrological model in a parallel computing environment. J. Hydrol 409, 483–496. https://doi.org/10.10 16/j.jhydrol.2011.08.053. Wagener, T., Boyle, D.P., Lees, M.J., Wheater, H.S., Gupta, H.V., Sorooshian, S., 2001. A framework for development and application of hydrological models. Hydrol. Earth Syst. Sci. 5, 13–26. https://doi.org/10.5194/hess-5-13-2001. Wang, H., Fu, X., Wang, G., Li, T., Gao, J., 2011. A common parallel computing framework for modeling hydrological processes of river basins. Parallel Comput. 37, 302–315. https://doi.org/10.1016/j.parco.2011.05.003. Wang, H., Fu, X., Wang, Y., Wang, G., 2013. A High-performance temporal-spatial discretization method for the parallel computing of river basins. Comput. Geosci. 58, 62–68. https://doi.org/10.1016/j.cageo.2013.04.026. Wang, S., Armstrong, M.P., 2009. A theoretical approach to the use of cyberinfrastructure in geographical analysis. Int. J. Geogr. Inf. Sci. 23, 169–193. htt ps://doi.org/10.1080/13658810801918509. Wenderholm, E., 2005. Eclpss: a Java-based framework for parallel ecosystem simulation and modeling. Environ. Model. Softw 20, 1081–1100. https://doi.org/10.1016/j. envsoft.2004.06.006. Wu, H., Zhu, A.-X., Liu, J., Liu, Y., Jiang, J., 2018. Best management practices optimization at watershed scale: Incorporating spatial topology among fields. Water Resour. Manag. 32, 155–177. https://doi.org/10.1007/s11269-017-1801-8. Wu, Y., Li, T., Sun, L., Chen, J., 2013. Parallelization of a hydrological model using the message passing interface. Environ. Model. Softw 43, 124–132. https://doi.org/10. 1016/j.envsoft.2013.02.002. Yalew, S., van Griensven, A., Ray, N., Kokoszkiewicz, L., Betrie, G.D., 2013. Distributed computation of large scale SWAT models on the Grid. Environ. Model. Softw 41, 223–230. https://doi.org/10.1016/j.envsoft.2012.08.002. Yang, J., 2011. Convergence and uncertainty analyses in Monte-Carlo based sensitivity analysis. Environ. Model. Softw 26, 444–457. https://doi.org/10.1016/j.envsoft.20 10.10.007. Zhan, C.-S., Song, X.-M., Xia, J., Tong, C., 2013. An efficient integrated approach for global sensitivity analysis of hydrological model parameters. Environ. Model. Softw 41, 39–52. https://doi.org/10.1016/j.envsoft.2012.10.009. Zhang, X., Beeson, P., Link, R., Manowitz, D., Izaurralde, R.C., Sadeghi, A., Thomson, A. M., Sahajpal, R., Srinivasan, R., Arnold, J.G., 2013. Efficient multi-objective calibration of a computationally intensive hydrologic model with parallel computing software in Python. Environ. Model. Softw 46, 208–218. https://doi.org/10.1016/j. envsoft.2013.03.013. Zhao, G., Bryan, B.A., King, D., Luo, Z., Wang, E.L., Bende-Michl, U., Song, X., Yu, Q., 2013. Large-scale, high-resolution agricultural systems modeling using a hybrid approach combining grid computing and parallel processing. Environ. Model. Softw 41, 231–238. https://doi.org/10.1016/j.envsoft.2012.08.007. Zhu, L.-J., Zhu, A.-X., Qin, C.-Z., Liu, J.Z., 2018. Automatic approach to deriving fuzzy slope positions. Geomorphology 304, 173–183. https://doi.org/10.1016/j. geomorph.2017.12.024. Zhu, L.-J., Qin, C.-Z., Zhu, A.-X., Liu, J., Wu, H., 2019. Effects of different spatial configuration units for spatial optimization of watershed best management practices scenarios. Water 11, 262. https://doi.org/10.3390/w11020262. Zitzler, E., Thiele, L., 1999. Multiobjective evolutionary algorithms: a comparative case study and the strength Pareto approach. IEEE Trans. Evol. Comput. 3, 257–271. https://doi.org/10.1109/4235.797969. Zitzler, E., Thiele, L., Laumanns, M., Fonseca, C.M., da Fonseca, V.G., 2003. Performance assessment of multiobjective optimizers: an analysis and review. IEEE Trans. Evol. Comput. 7, 117–132. https://doi.org/10.1109/TEVC.2003.810758.

Deb, K., Pratap, A., Agarwal, S., Meyarivan, T., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput. 6, 182–197. https://doi.org/ 10.1109/4235.996017. Formetta, G., Antonello, A., Franceschi, S., David, O., Rigon, R., 2014. Hydrological modelling with components: a GIS-based open-source framework. Environ. Model. Softw 55, 190–200. https://doi.org/10.1016/j.envsoft.2014.01.019. Foster, I., 1995. Designing and Building Parallel Programs: Concepts and Tools for Parallel Software Engineering. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA. Freeze, R.A., Harlan, R.L., 1969. Blueprint for a physically-based, digitally-simulated hydrologic response model. J. Hydrol 9, 237–258. https://doi. org/10.1016/0022-1694(69)90020-1. Hill, C., DeLuca, C., Balaji, Suarez, M., Silva, A. da, 2004. The architecture of the earth system modeling framework. Comput. Sci. Eng. 6, 18–28. https://doi.org/10.1109 /MCISE.2004.1255817. Hold-Geoffroy, Y., Gagnon, O., Parizeau, M., 2014. Once you SCOOP, no need to fork. In: Proceedings of the 2014 Annual Conference on Extreme Science and Engineering Discovery Environment. ACM, Atlanta, GA. https://doi.org/10.1145/2616498.261 6565. Iman, R.L., Shortencarier, M.J., 1984. Fortran 77 Program and User’s Guide for the Generation of Latin Hypercube and Random Samples for Use with Computer Models (No. NUREG/CR-3624; SAND-83-2365). Sandia National Labs., Albuquerque, NM (USA). https://doi.org/10.2172/7091452. Karypis, G., Kumar, V., 1998. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20, 359–392. https://doi.org/10.1137 /S1064827595287997. Kneis, D., 2015. A lightweight framework for rapid development of object-based hydrological model engines. Environ. Model. Softw 68, 110–121. https://doi.org/10. 1016/j.envsoft.2015.02.009. Leavesley, G.H., Markstrom, S.L., Viger, R.J., 2006. USGS modular modeling system (MMS)–precipitation-runoff modeling system (PRMS). In: Singh, V.P., Frevert, D.K. (Eds.), Watershed Models. CRC Press, Boca Raton, FL, pp. 159–177. Liu, J., Zhu, A.-X., Liu, Y., Zhu, T., Qin, C.-Z., 2014. A layered approach to parallel computing for spatially distributed hydrological modeling. Environ. Model. Softw 51, 221–227. https://doi.org/10.1016/j.envsoft.2013.10.005. Liu, J., Zhu, A.-X., Qin, C.-Z., 2013. Estimation of theoretical maximum speedup ratio for parallel computing of grid-based distributed hydrological models. Comput. Geosci. 60, 58–62. https://doi.org/10.1016/j.cageo.2013.04.030. Liu, J., Zhu, A.-X., Qin, C.-Z., Wu, H., Jiang, J., 2016. A two-level parallelization method for distributed hydrological models. Environ. Model. Softw 80, 175–184. https:// doi.org/10.1016/j.envsoft.2016.02.032. Liu, Y.B., 2004. Development and Application of a GIS-Based Distributed Hydrological Model for Flood Prediction and Watershed Management. Vrije Universiteit Brussel, Belgium. Liu, Y.B., Gebremeskel, S., De Smedt, F., Hoffmann, L., Pfister, L., 2003. A diffusive transport approach for flow routing in GIS-based flood modeling. J. Hydrol 283, 91–106. https://doi.org/10.1016/S0022-1694(03)00242-7. Maringanti, C., Chaubey, I., Popp, J., 2009. Development of a multiobjective optimization tool for the selection and placement of best management practices for nonpoint source pollution control. Water Resour. Res. 45. W06406. https://doi. org/10.1029/2008WR007094. Moore, R.V., Tindall, C.I., 2005. An overview of the open modelling interface and environment (the OpenMI). Environ. Sci. Policy, Research & Technology Integration in Support of the European Union Water Framework Directive 8, 279–286. https ://doi.org/10.1016/j.envsci.2005.03.009. Moriasi, D.N., Arnold, J.G., Van Liew, M.W., Bingner, R.L., Harmel, R.D., Veith, T.L., 2007. Model evaluation guidelines for systematic quantification of accuracy in watershed simulations. Trans. ASABE (Am. Soc. Agric. Biol. Eng.) 50, 885–900. https://doi.org/10.13031/2013.23153. Morris, M.D., 1991. Factorial sampling plans for preliminary computational experiments. Technometrics 33, 161–174. https://doi.org/10.1080/00401706.1991.10484804. O’Callaghan, J.F., Mark, D.M., 1984. The extraction of drainage networks from digital elevation data. Comput. Vis. Graph Image Process 28, 323–344. https://doi. org/10.1016/S0734-189X(84)80011-0. Peckham, S.D., Hutton, E.W.H., Norris, B., 2013. A component-based approach to integrated modeling in the geosciences: the design of CSDMS. Comput. Geosci., Modeling for Environmental Change 53, 3–12. https://doi.org/10.1016/j.cageo.20 12.04.002. Qin, C.-Z., 2015. Cuda/GPU. In: Shekhar, S., Xiong, H., Zhou, X. (Eds.), Encyclopedia of GIS. Springer International Publishing, Cham, pp. 1–7. https://doi.org/10.1007/9 78-3-319-23519-6_1606-1. Qin, C.-Z., Gao, H.-R., Zhu, L.-J., Zhu, A.-X., Liu, J.-Z., Wu, H., 2018. Spatial optimization of watershed best management practices based on slope position units. J. Soil Water Conserv. 73, 504–517. https://doi.org/10.2489/jswc.73.5.504.

16