A cooperative target search method based on intelligent water drops algorithm ✩

Xixia Sun a, Chao Cai b,∗, Su Pan a, Zhengning Zhang c, Qiyu Li c

a Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, College of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, China
b State Key Laboratory for Multispectral Information Processing Technologies, School of Artificial Intelligence and Automation, Huazhong University of Science and Technology, Wuhan, China
c Tianjin Zhong Wei Aerospace Data System Technology Co. Ltd, Tianjin, China

Article info

Article history: Received 27 December 2018; Revised 9 October 2019; Accepted 12 October 2019; Available online xxx

Keywords: Unmanned aerial vehicle (UAV); Cooperative search; Path planning; Intelligent water drops (IWD) algorithm

Abstract

This study investigates the problem of cooperative search path planning for multiple unmanned aerial vehicles. Firstly, we present a search probability map based environmental model, and formulate a multi-UAV cooperative search path optimization problem. Then, we propose a solution strategy based on an improved intelligent water drops (IWD) optimization algorithm. Compared with the traditional IWD algorithm, the proposed algorithm can simultaneously model the cooperation and competition among UAVs by introducing several heterogeneous water drop populations and a co-evolutionary mechanism. Water drops in the same population cooperate with each other, whereas water drops among different populations compete with each other to obtain the optimal path and reduce meaningless search effort. Meanwhile, the proposed algorithm can improve its search capability using a new soil update mechanism. Simulation results demonstrate the advantages of the proposed method over other baselines.

© 2019 Published by Elsevier Ltd.

1. Introduction

Recently, unmanned aerial vehicles (UAVs) have been applied increasingly widely in military and civilian domains, including battlefield surveillance, aerial refueling, environmental monitoring, and map building [1–4]. For example, drones, as typical UAVs, have come into widespread use in many industrial fields [2,3]. A typical application of UAVs is target search, such as border patrol, mine sweeping, and perimeter surveillance [5–8]. These missions are usually urgent and encompass a wide area, which requires the collaboration of a team of UAVs, also known as multiple UAV (multi-UAV) cooperative search. In such missions, the exact locations of targets are unknown, but certain information regarding the target locations, such as the target location probability distributions in the search areas, can be extracted from prior knowledge. To locate the target, the UAVs use on-board platforms to coordinately and incessantly detect the environment while flying over the search area.

✩ This paper is for CAEE special section SI-umv. Note that it was originally submitted for SI-aicv3. Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. Huimin Lu. ∗ Corresponding author. E-mail addresses: [email protected] (X. Sun), [email protected] (C. Cai), [email protected] (S. Pan), [email protected] (Z. Zhang).

https://doi.org/10.1016/j.compeleceng.2019.106494 0045-7906/© 2019 Published by Elsevier Ltd.


As part of mission planning, target search path planning technology is critical in improving a UAV team's mission effectiveness. Its aim is to generate optimal cooperative paths for the UAVs to find the targets using on-board sensor platforms, possibly under resource and time constraints. According to the theory of Koopman [5,9,10], the search path planning problem can be classified from multiple aspects and perspectives, such as continuous or discrete search space, unilateral or bilateral search strategy, and so on [5–8]. Recently, search path planning has attracted increasing attention from researchers. For example, Berger and Lo established an optimization model for the multiple-agent discrete search path planning problem, represented by a mixed-integer linear programming formulation [6]. Xiao et al. introduced a multiple-agent cooperative search algorithm based on Bayesian theory to solve the moving target search path planning problem under multiple constraints [7]. More recently, Din et al. proposed a behavior-based model for solving the problem of multiple-robot coordinated search path planning [8].

Path planning for multi-UAV cooperative search is NP-complete, and the commonly used solution methods are classical methods and heuristic methods [11–13]. Classical methods such as Dijkstra's method can guarantee the optimality of the final solution, if one exists, or prove that no feasible solution exists [4,6]. However, they require a considerable amount of time and resources even when the size of the problem is small. Therefore, heuristic methods are more suitable for handling NP-complete problems [2,11,14]. Among the various types of heuristic methods, nature-inspired ones are gaining increasing popularity in UAV path planning [8,11,14–16].

As a typical nature-inspired method, the intelligent water drops (IWD) algorithm is a novel and popular swarm intelligence metaheuristic, which mimics how water drops (WDs) obtain the best path between their start and destination positions [17]. Since its introduction, many researchers have investigated it and proposed various IWD variants. For example, Alijla et al. improved the performance of the IWD algorithm by modifying its parameters [18]. Niu et al. presented an improved multi-objective IWD algorithm by incorporating a Pareto schedule checking process into the original IWD algorithm to optimize the scheduling of a job shop [19]. Teymourian et al. proposed an improved IWD algorithm, and then presented two hybrid nature-inspired optimization algorithms incorporating the improved IWD and cuckoo search algorithms [20]. Elsherbiny et al. appended a task assignment phase to the original IWD algorithm for the scheduling of workflows on the cloud [21]. In the past few years, the IWD algorithm has been widely applied in many domains, such as parallel machine scheduling and textural feature selection [18–22].

In the multi-UAV cooperative search path planning problem, there is not only cooperation within the UAV team to collaboratively search the target, but also competition among different UAVs to avoid searching the same region. The conventional IWD algorithm and its variants consider the cooperation within a WD population. However, they cannot model the competition among different agents [17–22]. In this study, we propose a lost-water-drop (WDs that are trapped in map dead-ends) based multiple-swarm (multiswarm) IWD (LMIWD) algorithm for multi-UAV cooperative search path planning.
Specifically, we first establish an optimization model for the problem of cooperative search path planning for multiple UAVs, which considers the turn angle constraint, collision avoidance constraint, and so on. Then, we propose an LMIWD algorithm exploiting a co-evolutionary strategy together with a novel soil update mechanism to solve the problem. The proposed co-evolutionary mechanism considers both the cooperation and competition among UAVs. Moreover, the proposed soil update mechanism improves the search ability of the conventional IWD algorithm. Extensive simulation experiments with real-world digital terrain data demonstrate that the proposed method outperforms common search methods such as random search, greedy search, the particle swarm optimization (PSO) algorithm, the ant colony optimization (ACO) algorithm, Q-learning, the conventional IWD algorithm, and a multiswarm IWD (MIWD) algorithm with a co-evolutionary strategy.

The remainder of this paper is structured as follows. Section 2 establishes an optimization model of the path planning for multi-UAV cooperative search. An LMIWD algorithm is proposed in Section 3 to solve the cooperative search path planning problem. Computational results are analyzed in Section 4, and Section 5 summarizes the study.

2. Problem formulation

This section first presents a search probability map based environmental model. Then, it establishes an optimization model for multi-UAV cooperative search path planning.

2.1. Environment description

Following the models in [5–8], the search region is modeled as a rectangular area, where a team of UAVs attempts to locate the target. As shown in Fig. 1, the search environment is discretized into a set of N = L × S cells of equal size. Thus, the search map comprising N discretized cells can be represented by C = {C1, C2, …, CN}. The target can be anything of importance, such as a hidden missile site of the enemy or a mineral resource. In this study, it is assumed that the target is static and occupies at most one cell. Certain information regarding the approximate position of the target is known in advance. In the literature, the target existence probability density distribution model is a popular model to represent the prior information regarding the target. The target existence probability of a cell is also known as the target occupancy probability, representing the chance of a target being located in that cell. In this study, the probability density function f of the target location is known, and each cell i has an associated target occupancy probability p_i = \int_{S_i} f(x)\, dx, where S_i is the region covered by cell i.

Fig. 1. Search environment discretization and example of orientation transition choices for the UAVs.

2.2. UAV model

A team of M UAVs U = {U1, U2, …, UM} fly coordinately, and they detect the search space using their on-board platforms. Their objective is to jointly observe the environment to collect as much information as possible, such as maximizing the target detection probability in T detection periods. At each discrete time step, a UAV can fly from one cell to the next and perform a detection. It can detect the entire cell in which it is located by using its on-board platforms and maneuvering within the cell. It is assumed that its conditional probability of detecting a target, given that the target is located in cell i (i = 1, 2, …, N, where N is the number of cells), is equal to the target occupancy probability of cell i. Owing to its physical constraints, the UAV can change its flying direction by at most one step. Therefore, at most three possible locations exist for the next detection period of each UAV: ahead, right, or left, as shown in Fig. 1.

2.3. Search objectives and constraints

The aim of cooperative search path planning is to generate optimal cooperative paths that jointly maximize the expected reward (search effectiveness) for the UAV team. This expected reward can be represented by the total target detection probability, environment uncertainty reduction, region coverage percentage, and so on. In this study, the expected reward is represented by the probability of detecting the target over a finite horizon (T detection time periods). Consequently, the cooperative search path planning problem considered in this study can be formulated as a combinatorial optimization problem, expressed as follows:

\max J = \sum_{j=1}^{N} p_j \left( 1 - \prod_{m=1}^{M} \prod_{t=1}^{T} \left( 1 - \psi_m(j,t) \right) \right)    (1)

\text{s.t.} \quad \psi_m(j,t) \in \{0, 1\}, \quad \forall j, m, t    (2)

\psi_m(s_m, 1) = 1, \quad \forall m    (3)

\sum_{j=1}^{N} \psi_m(j,t) = 1, \quad \forall m, t    (4)

\psi_m(j, t-1) \le \sum_{I \in I_m(j)} \psi_m(I, t), \quad \forall j, m, \; t = 2, 3, \ldots, T    (5)

\sum_{m=1}^{M} \psi_m(j,t) \le 1, \quad \forall j, t    (6)

where M and N denote the numbers of UAVs and cells, respectively, and p_j denotes the target occupancy probability of cell j. Constraint (2) means that ψ_m(j,t) is either 1 or 0: when UAV m detects cell j at the tth detection period, ψ_m(j,t) is 1; otherwise, ψ_m(j,t) is 0. Constraints (3) and (4) imply that UAV m starts from cell s_m and must be in exactly one cell at each detection period. If UAV m detects cell j at the tth detection period, namely ψ_m(j,t) = 1, it can only be in one of the neighboring cells I_m(j) satisfying the turn angle constraint at time t + 1, as represented by constraint (5). Constraint (6) is the collision avoidance constraint, meaning that the UAVs should not collide with each other and each cell is allowed to contain at most one UAV at each time.
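To make the formulation concrete, the following sketch evaluates the objective of Eq. (1) and checks constraints (2)–(6) for a candidate joint plan. It is a minimal illustration under our own data layout (the array psi, the start list, and the neighbor sets are hypothetical names, and the neighbor sets are assumed to already encode the turn angle constraint); it is not the authors' implementation.

```python
import numpy as np

def expected_reward(psi, p):
    """Objective J of Eq. (1): a cell contributes its occupancy probability
    p[j] as soon as any UAV visits it in any detection period.
    psi[m, j, t] in {0, 1} indicates UAV m detecting cell j at period t."""
    undetected = np.prod(1 - psi, axis=(0, 2))    # product over m and t
    return float(np.sum(p * (1 - undetected)))

def feasible(psi, starts, neighbors):
    """Check constraints (2)-(6); starts[m] is UAV m's start cell s_m and
    neighbors[j] lists the cells reachable from j in one step (assumed to
    already encode the turn angle constraint)."""
    M, N, T = psi.shape
    if not np.isin(psi, (0, 1)).all():                        # (2) binary variables
        return False
    for m in range(M):
        if psi[m, starts[m], 0] != 1:                         # (3) start cell
            return False
        if not (psi[m].sum(axis=0) == 1).all():               # (4) one cell per period
            return False
        for t in range(1, T):
            j = int(np.argmax(psi[m, :, t - 1]))              # cell occupied at t-1
            if psi[m, list(neighbors[j]), t].sum() < 1:       # (5) admissible move
                return False
    if (psi.sum(axis=0) > 1).any():                           # (6) collision avoidance
        return False
    return True
```

Under this layout, the planner described in Section 3 searches over feasible psi to maximize expected_reward.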


3. Cooperative search based on LMIWD algorithm

This section designs an LMIWD algorithm with a co-evolutionary strategy and a novel soil update mechanism for solving the cooperative search path optimization problem described in Section 2.

3.1. Overview of IWD algorithm

As a novel and popular optimization technique, the IWD algorithm is based on how WDs flow along riverbeds to reach their destinations. It exploits a population of WDs that move along the edges of a graph to obtain the optimal solution. Each WD possesses two significant attributes, velocity and soil, which change while it flows along the riverbed. WDs use soil to interact with each other and with the environment. As a WD flows, it collects soil from the riverbed, and the amount of collected soil is affected by its velocity: the higher the speed of a WD, the more soil it collects from the riverbed. Meanwhile, the soil trail on the riverbed affects a WD's flowing direction, which can be interpreted as friction. The parts of the riverbed from which soil has been removed become deeper, and they attract more WDs because WDs favor paths with less friction. Hence, shorter paths are selected by the WDs, and the optimal (or near optimal) path is produced at the end of the evolution.

During target search, there is not only cooperation within the UAV team to cooperatively search the target, but also competition among different UAVs to avoid searching the same area. The conventional IWD algorithm can emulate the cooperation within the UAV team, as well as the interactions between the UAV team and the environment [17–22]. However, it cannot model the competition among different UAVs during target search. For a better cooperation of the UAV team, an LMIWD algorithm is proposed in this study for cooperative search path planning. The following sections describe its main frame and steps.

3.2. Main frame of LMIWD algorithm

As shown in Fig. 2, the proposed algorithm adopts M swarms of WDs (one per UAV) that evolve cooperatively to obtain the best paths for the associated UAVs. Hence, M heterogeneous rivers exist in the environment (each river maintains its own parameters) and M heterogeneous types of soil exist in each cell. Let (m, k) denote the kth WD of the mth river. Each WD (m, k) represents a candidate path of the mth UAV, carries a certain amount of soil, and can move in the search map. To build a path, it flows from the start position of its associated UAV and then incrementally selects subsequent cells until a complete path is constructed (i.e., the WD has visited T cells, where T is the number of detection periods). Each WD (m, k) creates a path in discrete steps based on a novel probabilistic function proposed in this study. Once a WD (m, k) has flowed from cell i to cell j, it collects a certain amount of the mth type of soil from cell j, and its velocity increases. When it has completed its path, the quality of its path is evaluated. In the path construction and evaluation steps, WDs communicate with each other and interact with the environment through soil. Meanwhile, the M WD populations can exchange information (such as their soil maps of the environment and the best paths obtained) with each other. In this way, WDs in the same population cooperate with each other to obtain the best path for their associated UAV, while different WD populations compete with each other to reduce meaningless search effort. The proposed co-evolutionary strategy demonstrates better performance than the conventional IWD algorithm because it naturally models both the cooperation and competition among the UAVs, as well as the interactions between the UAVs and the environment.
When all WDs have constructed their paths, the soils on the best paths are reduced, and one generation of the algorithm is terminated. This evolutionary process is repeated until the terminating criterion is satisfied. The main phases and communication scheme of the proposed LMIWD algorithm can be summarized as follows (a compact sketch of this loop is given after Fig. 2):

Step 1: Initialize the LMIWD parameters.

Step 2: For each generation, each WD moves in the search map to search a path for its corresponding UAV. When it moves to a neighboring cell, increase its velocity and soil level; meanwhile, update the soil in the environment. Once it has finished its path, evaluate the performance of its path. In this step, the environment and the WDs affect each other, and WD populations can communicate with each other through soil.

Step 3: When all WDs have finished their paths, select the representative of each population. Then, permute and combine the paths of all populations based on the representatives to construct the cooperative paths (global solutions). In this step, WD populations can exchange information with each other through their representatives.

Step 4: Evaluate the quality of each global solution, and update all types of soil in the environment based on the global optimal solution of the current generation.

Step 5: If all generations have completed, output the global optimal solution. Otherwise, reinitialize the dynamic parameters and proceed to Step 2.

3.3. Details of the LMIWD algorithm

3.3.1. Initialization

In this step, static and dynamic parameters are initialized. Static parameters are initialized once at the start of the evolution and remain constant during the entire evolutionary process. They comprise the maximum number of generations, each population's size, the initial soil loaded in cell i (i = 1, 2, …, N, where N is the number of cells), and the velocity updating and soil updating coefficients. Meanwhile, dynamic parameters are reinitialized at the start of each generation, including the initial velocity, soil, and visited cells of each WD.


Fig. 2. Flow chart of the proposed LMIWD algorithm (static parameter initialization; per-population path construction, local soil update, and fitness evaluation; selection of representatives; permutation and combination; global soil update; loop over generations).
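The following sketch fixes only the control flow of the generation loop shown in Fig. 2 (Steps 1–5 above); all problem-specific operations are passed in as callables, and the function and variable names are our own, so this should be read as an illustrative skeleton under those assumptions rather than the authors' implementation.

```python
def lmiwd_search(n_generations, populations, construct_path, fitness,
                 global_fitness, global_soil_update):
    """populations maps a population index m to its size; construct_path(m)
    builds (and locally updates soil for) one WD path of population m;
    fitness(path) is the individual fitness; global_fitness(solution) scores
    a dict {m: path}; global_soil_update(solution) applies Eqs. (17)-(18)."""
    best_solution, best_value = None, float("-inf")
    for _ in range(n_generations):
        # Step 2: every WD of every population builds and evaluates a path.
        paths = {m: [construct_path(m) for _ in range(k)]
                 for m, k in populations.items()}
        # Step 3: the best individual of each population is its representative;
        # each candidate cooperative solution replaces one representative by
        # another path of that population.
        reps = {m: max(ps, key=fitness) for m, ps in paths.items()}
        candidates = [{**reps, m: p} for m, ps in paths.items() for p in ps]
        # Step 4: keep the best cooperative paths of this generation and
        # update all soil types along them.
        generation_best = max(candidates, key=global_fitness)
        global_soil_update(generation_best)
        value = global_fitness(generation_best)
        if value > best_value:
            best_solution, best_value = generation_best, value
    # Step 5: return the best cooperative paths found over all generations.
    return best_solution, best_value
```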

3.3.2. Path construction

When WD (m, k) builds its path, it can only move to neighboring cells that fulfill the turn angle constraint. In the original algorithm, the probability of a WD transiting from the cell it occupies to a neighboring cell is determined by the amount of soil in the neighboring cells, which does not consider the heterogeneity of different types of soil. In this study, we design a novel transition probability computation mechanism. In this mechanism, p_i^{mk}(j), the probability that WD (m, k) flows to cell j while it is in cell i, is determined by the soils of all the rivers (i.e., all the types of soil), calculated as follows:



p_i^{mk}(j) = \frac{ f^{\alpha}\big(soil_m(j)\big) \, \min_{h \neq m} g^{\beta}\big(soil_h(j)\big) \, \eta_{ij}^{\gamma} }{ \sum_{l \in I_m(i)} f^{\alpha}\big(soil_m(l)\big) \, \min_{h \neq m} g^{\beta}\big(soil_h(l)\big) \, \eta_{il}^{\gamma} }    (7)

where soil_m(j) represents the amount of the mth type of soil in cell j, and \min_{h \neq m} g(soil_h(j)) represents the minimum value of the function g over all the types of soil except the mth type in cell j. This information is adopted to reduce the probability of the WDs in different rivers searching paths passing through the same cells, thus reducing meaningless search effort of the UAV team. η_{ij} represents the heuristic information of the edge from cell i to cell j, and it is defined as the target existence probability p_j in this study. As a general rule, this proposed probabilistic function makes WD (m, k) favor a cell with less soil of the mth type and more soil of the other types. The tuning parameters α, β, and γ balance the effects of the mth type of soil, the other types of soil, and the heuristic information, respectively. I_m(i) denotes the set of neighboring cells of cell i that satisfy the turn angle constraint.


The function f(soil_m(j)) is calculated as

f\big(soil_m(j)\big) = \frac{1}{\varepsilon + g\big(soil_m(j)\big)}    (8)

where the constant ε represents a small positive number. Let s represent the minimum soil of type m in the cells belonging to I_m(i), that is, s = \min_{l \in I_m(i)} soil_m(l) (as in the MAX–MIN ant system, the amount of each type of soil can be bounded to an interval [s_{min}, s_{max}]). The function g(soil_m(j)) is used to transform soil_m(j) into a positive value, given by



g\big(soil_m(j)\big) =
\begin{cases}
soil_m(j), & \text{if } s \ge 0 \\
soil_m(j) - s, & \text{otherwise}
\end{cases}    (9)
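The following sketch puts Eqs. (7)–(9) together for one WD standing in cell i. The data layout (soil as a list of per-type maps indexed by cell) and the handling of the shift s for the soil types h ≠ m follow our reading of the equations and are assumptions, not the authors' code.

```python
def transition_probabilities(i, m, soil, neighbors, eta,
                             alpha=1.0, beta=4.0, gamma=3.0, eps=1e-9):
    """Probability of WD (m, k) in cell i moving to each cell in I_m(i).
    soil[h][j] is the amount of soil of type h in cell j, neighbors is the
    admissible set I_m(i), and eta[j] is the heuristic value (here p_j)."""
    M = len(soil)

    def g(h, j):
        # Eq. (9): shift soils of type h so that the minimum over the
        # candidate cells becomes non-negative.
        s = min(soil[h][l] for l in neighbors)
        return soil[h][j] if s >= 0 else soil[h][j] - s

    def f(h, j):
        # Eq. (8): favour cells holding little soil of the WD's own type.
        return 1.0 / (eps + g(h, j))

    weights = {}
    for j in neighbors:
        own = f(m, j) ** alpha
        others = (min(g(h, j) ** beta for h in range(M) if h != m)
                  if M > 1 else 1.0)
        weights[j] = own * others * eta[j] ** gamma      # numerator of Eq. (7)
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}    # normalized Eq. (7)
```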

3.3.3. Local soil update

Once WD (m, k) has computed the transition probabilities of all candidate cells and has transited from cell i to a neighboring cell j, its velocity is increased according to the amount of the mth type of soil in cell j, described as

v^{mk} = v^{mk} + \frac{a_v}{b_v + c_v \, soil_m^2(j)}    (10)

where v^{mk} represents the velocity of WD (m, k), soil_m(j) represents the amount of the mth type of soil in cell j, and a_v, b_v, and c_v are the velocity updating coefficients. Once WD (m, k) has flowed into cell j, it collects the mth type of soil from cell j, expressed by

soil_m(j) = (1 - \rho_n) \, soil_m(j) - \rho_n \, \Delta soil_m(j)    (11)

soil^{mk} = soil^{mk} + \Delta soil_m(j)    (12)

where soil^{mk} represents the amount of soil carried by WD (m, k), \rho_n is the local soil updating rate, and \Delta soil_m(j) is the amount of the mth type of soil that WD (m, k) picks up from cell j. The soil trail intensities directly affect the WDs in exploring their optimal paths. To improve the search performance of the WDs, an adaptive soil update mechanism is designed in this study; it allows the soil updating rate to change dynamically according to the evolutionary stage. In the early phase of the evolutionary process, the soil updating rate is relatively low and the WDs search more randomly to focus on exploration. Meanwhile, in the late phase of the evolutionary process, the soil updating rate is relatively high and the WDs focus on exploitation to search the optimal path. In this study, the local soil updating rate is expressed as

\rho_n = \rho_{min}^{l} + (\rho_{max}^{l} - \rho_{min}^{l}) \cdot N_{TB} / N_{max}    (13)

where N_{max} and N_{TB} are the total number of generations and the index of the current generation, respectively, and \rho_{min}^{l} and \rho_{max}^{l} are the minimum and maximum local soil updating rates, respectively. In Eq. (12), \Delta soil_m(j) represents the amount of the mth type of soil that WD (m, k) picks up from cell j, calculated as

\Delta soil_m(j) = \frac{a_s}{b_s + c_s \, t^2(i, j, v^{mk})}    (14)

where a_s, b_s, and c_s are the soil updating coefficients, and t(i, j, v^{mk}) represents the time required by WD (m, k) to flow from cell i into cell j, given by

t(i, j, v^{mk}) = \frac{H(j)}{v^{mk}}    (15)

where H(j) represents the undesirability for WD (m, k) of moving into cell j. In this study, H(j) is represented by the inverse of the target existence probability p_j, i.e., H(j) = 1/p_j: the lower the target existence probability p_j, the higher H(j) is. In each generation, each WD searches a path for its associated UAV according to the initialization, path construction, and local soil update processes described above until it has visited T cells. Let the individual fitness of the path discovered by WD (m, k) be the total target occupancy probability of the cells that WD (m, k) visits. When WD (m, k) has completed its path, the individual fitness of its path is calculated to evaluate the path quality. It is noteworthy that, owing to the turn angle and map constraints, not all WDs can find a path fulfilling the constraints expressed by Eqs. (2)–(5), because some WDs may be trapped in map dead-ends. To reduce the probability of the WDs selecting this type of path in the following generations, the soils on the paths found by lost WDs are increased, expressed as follows:

soil_m(j) = (1 + \rho_n) \, soil_m(j) + \frac{\rho_n Q}{q(T_{mk}^{IB})} \, soil_{mk}^{TB}    (16)

where soil_{mk}^{TB} represents the amount of soil carried by WD (m, k), which is trapped in a map dead-end in generation TB, and q(T_{mk}^{IB}) represents the individual fitness of the path discovered by WD (m, k) in generation TB. \rho_n is the local soil updating rate, and Q is a constant. soil_m(j) represents the amount of the mth type of soil in cell j.
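A compact sketch of the local update rules of Eqs. (10)–(16) is given below. The WD state is represented as a small dictionary and the parameter grouping is our own; the undesirability H and the fitness value are passed in, so this is an illustration of the equations under those assumptions rather than the authors' implementation.

```python
def local_update(wd, j, soil, params, N_TB, N_max, H):
    """Apply Eqs. (10)-(15) when WD `wd` (a dict with keys 'm', 'velocity',
    'soil') moves into cell j; soil[m][j] is the mth soil type of cell j."""
    m = wd['m']
    a_v, b_v, c_v = params['velocity']          # velocity updating coefficients
    a_s, b_s, c_s = params['soil']              # soil updating coefficients
    rho_min_l, rho_max_l = params['rho_local']  # bounds of the local rate

    # Eq. (10): velocity gain driven by the soil of the WD's own type in j.
    wd['velocity'] += a_v / (b_v + c_v * soil[m][j] ** 2)

    # Eq. (15): traversal time, with H(j) = 1 / p_j the undesirability of j.
    t = H(j) / wd['velocity']

    # Eq. (14): soil picked up on this move.
    delta = a_s / (b_s + c_s * t ** 2)

    # Eq. (13): adaptive local soil updating rate over the generations.
    rho_n = rho_min_l + (rho_max_l - rho_min_l) * N_TB / N_max

    # Eqs. (11)-(12): remove soil from the cell and load it onto the WD.
    soil[m][j] = (1 - rho_n) * soil[m][j] - rho_n * delta
    wd['soil'] += delta
    return rho_n


def penalize_lost_path(path, m, soil, rho_n, wd_soil, path_fitness, Q=1.0):
    """Eq. (16): add soil of type m along a path whose WD got trapped in a
    map dead-end, so later WDs of population m tend to avoid it."""
    for j in path:
        soil[m][j] = (1 + rho_n) * soil[m][j] + rho_n * Q / path_fitness * wd_soil
```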


Fig. 3. Search space and the target existence probabilities.

3.3.4. Permutation and combination

Each population searches paths for its corresponding UAV according to the evolutionary process described in the previous subsections. Once all the populations have built their paths, the representative of each population is selected and the global solutions are constructed. In this study, the path with the largest individual fitness in each population is specified as that population's representative. A global solution (a set of cooperative paths) is a group of paths selected from different populations (one path per population). The global fitness of a path is calculated based on the path itself and the representatives of the other populations according to Eqs. (1)–(6). By comparison, the set of cooperative paths with the largest global fitness value is specified as the global best solution of the current generation.

3.3.5. Global soil update

After the global optimal solution of the current generation is specified, the soils in the cells that constitute the best cooperative paths of the current generation are modified as follows [18]:

soil_m(i) = (1 + \rho_{IWD}) \, soil_m(i) - \rho_{IWD} \, q(T_{mk}^{TB}) \, soil_{mk}^{TB}    (17)

where soil_m(i) represents the amount of the mth (m = 1, 2, …, M) type of soil in cell i. Assume that WD (m, k) finds a path that constitutes the best cooperative paths of generation TB; soil_{mk}^{TB} represents the amount of soil carried by WD (m, k), and q(T_{mk}^{TB}) is the individual fitness of the path discovered by WD (m, k). \rho_{IWD} represents the global soil updating rate, which increases adaptively during the evolutionary process, expressed as

\rho_{IWD} = \rho_{min}^{g} + (\rho_{max}^{g} - \rho_{min}^{g}) \cdot N_{TB} / N_{max}    (18)

where \rho_{min}^{g} and \rho_{max}^{g} are the minimum and maximum global soil updating rates, respectively, and N_{max} and N_{TB} are the total number of generations and the index of the current generation, respectively. If the terminating criterion has been satisfied, the global optimal solution of the last generation is set as the optimal cooperative paths of the UAV team. Otherwise, we proceed to the initialization step to continue the evolutionary process.
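The global update of Eqs. (17) and (18) can be sketched as follows; the containers best_paths, soils_carried, and fitnesses are our own bookkeeping for the WDs whose paths form the best cooperative solution of generation N_TB (an assumption about the data layout, not the authors' code).

```python
def global_soil_update(best_paths, soils_carried, fitnesses, soil,
                       N_TB, N_max, rho_min_g=0.05, rho_max_g=0.15):
    """best_paths[m] is the cell sequence chosen for UAV m in the best
    cooperative solution; soils_carried[m] and fitnesses[m] belong to the
    WD of population m that produced it; soil[m][i] is the soil map."""
    # Eq. (18): the global soil updating rate grows with the generation index.
    rho_iwd = rho_min_g + (rho_max_g - rho_min_g) * N_TB / N_max
    for m, path in best_paths.items():
        for i in path:
            # Eq. (17): lowering the mth soil type along the best paths makes
            # these cells more attractive to population m in later generations.
            soil[m][i] = ((1 + rho_iwd) * soil[m][i]
                          - rho_iwd * fitnesses[m] * soils_carried[m])
    return rho_iwd
```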

4. Experimental results

4.1. Search environment and parameter settings

The search path planning methods based on the proposed LMIWD, random search, greedy search, PSO [11], ACO [15], Q-learning [16], conventional IWD [17], and MIWD algorithms were implemented in the MATLAB simulation environment and executed on a computer with an Intel Core i5 Duo U7200 CPU and the Windows 10 operating system, under the same parameter settings as follows. A stationary target is in a search region divided into 12 × 12 hexagonal cells, and three UAVs are adopted to execute the search mission. As shown in Fig. 3, the size of each cell is equivalent to the maximum detection range of the UAV's search sensors in a single detection period, and the number in each cell is 1000 times the target occupancy probability of that cell.

The algorithm parameters can be divided into public parameters and private parameters. Public parameters are universally used in the random search, greedy search, PSO, ACO, IWD, MIWD, and LMIWD algorithms. Two public algorithm parameters exist: the population size and the total number of generations. The population size K was set as 30, and the total number of generations N_max was set as 50. The private parameters of the IWD, MIWD, and LMIWD algorithms were set based on previous studies [17–22]. The parameters related to velocity and soil updating were set as (a_v, b_v, c_v) = (1, 0.01, 1) and (a_s, b_s, c_s) = (1, 0.01, 1), respectively.


Fig. 4. Cooperative paths planned by the proposed, random search and greedy search methods: (a) cooperative paths planned by the proposed method; (b) cooperative paths planned by the random search method; (c) cooperative paths planned by the greedy search method.

The initial amount of all types of soil s_{max} was set as 10,000, and the lower bound of the amount of each type of soil s_{min} was set as 2. The maximum and minimum local soil updating rates \rho_{max}^{l} and \rho_{min}^{l} were set as 0.15 and 0.05, respectively. The maximum and minimum global soil updating rates \rho_{max}^{g} and \rho_{min}^{g} were set as 0.15 and 0.05, respectively. The initial velocity and soil of each WD were set as 100 and 0, respectively. The tuning parameters α, β, and γ of the LMIWD and MIWD algorithms were set as 1, 4, and 3, respectively.

The private parameters of the PSO, ACO, and Q-learning algorithms were also set based on theoretical studies [2,11,15,16]. In the PSO algorithm, the inertia w was set as 0.9, and the parameters c1 and c2 that weigh the cognitive and social components were both set as 2. In the ACO algorithm, the pheromone evaporation rate ρ was set as 0.1. In the neighbor selection process of the ants, the parameters α_a and β_a related to the effects of heuristic information and pheromone were set as 1 and 2, respectively. In the Q-learning algorithm, the learning rate α was set as 0.2, the discount factor γ as 0.8, and the total number of iterations T_max (the terminating criterion) as 1500, where T_max = K × N_max, with K and N_max denoting the population size and total number of generations of the LMIWD algorithm, respectively.
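For convenience, the LMIWD settings listed above can be collected in a single configuration object, for example as follows (the key names are our own shorthand and only mirror the values stated in this subsection).

```python
# Experimental settings of Section 4.1 for the LMIWD planner
# (key names are our own shorthand, not from the paper).
LMIWD_PARAMS = {
    "population_size": 30,         # K
    "generations": 50,             # N_max
    "velocity": (1, 0.01, 1),      # (a_v, b_v, c_v)
    "soil": (1, 0.01, 1),          # (a_s, b_s, c_s)
    "soil_bounds": (2, 10000),     # (s_min, s_max)
    "rho_local": (0.05, 0.15),     # (rho^l_min, rho^l_max)
    "rho_global": (0.05, 0.15),    # (rho^g_min, rho^g_max)
    "initial_wd_velocity": 100,
    "initial_wd_soil": 0,
    "alpha": 1, "beta": 4, "gamma": 3,
}
```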

4.2. Comparisons with other approaches

Fig. 4(a) depicts the cooperative paths requiring T = 30 detection time steps generated by the proposed method in one simulation. In this figure, the pentagrams represent the start positions of the UAVs. As shown in Figs. 3 and 4(a), the cooperative paths encompass the cells with large target existence probabilities well, and they overlap little. Their overall detection probability is 0.77, and the region they encompass constitutes 55% of the entire search area, while the cells with low target existence probabilities are not searched. Therefore, the proposed search path planning method can reduce the probability of searching overlapping paths and improve the search effectiveness of the UAVs.

To validate the effectiveness of our method, we compared it with the random search and greedy search methods. In the random search method, the UAV flies randomly to a neighboring cell under the turn angle constraint. Meanwhile, in the greedy search method, the transition probability from cell i to the next cell j is proportional to the target existence probability of cell j. Fig. 4(b) and (c) show the cooperative paths generated by the random search and greedy search methods in one simulation, whose target detection probabilities are 0.42 and 0.55, respectively. The random search method is simple and computationally cheap. However, its search process does not use any heuristic information, and it searches paths blindly; thus, the paths that it produces are distributed randomly in the search area. A large portion of the produced paths overlap significantly and collide with each other. It cannot coordinate the search behaviors of the UAVs, and its performance is poor owing to this randomness. The greedy search method is prone to search the areas with large target existence probabilities. However, it does not consider the cooperation and competition among UAVs. Therefore, it cannot avoid path overlaps and often produces meaningless search effort.

To comprehensively validate the effectiveness of our method, we compared it with the random search, greedy search, PSO, ACO, Q-learning, and original IWD methods, which have been widely used to test various path planning methods, under the same parameter settings [5,11,16,18,20]. In addition, to validate the feasibility of the proposed soil update mechanism, we compared our method with the MIWD method under the same parameter settings. Considering the randomness of these methods, the comparative experiments were performed 50 times for each method.

Fig. 5(a) shows the increments of the average detection probabilities of the 50 runs with increasing detection time for the eight search methods. As detection time progresses, the proposed method outperforms the other seven methods. This superiority is primarily owing to the proposed co-evolutionary mechanism that simultaneously considers the cooperation and competition among UAVs. Moreover, the proposed soil update mechanism can reduce the probability of WDs selecting paths passed by lost WDs. This reduces the WDs' probabilities of becoming trapped in map dead-ends and guides the WDs to search paths along more reasonable directions. In addition, the designed adaptive soil update mechanism enables the WD populations to concentrate on exploration and exploitation in the early and late phases of the evolution, respectively.


Fig. 5. Increments of average detection probabilities and region coverage percentage with increasing detection time: (a) increments of average detection probabilities with increasing detection time; (b) increments of average region coverage percentage with increasing detection time.

Fig. 6. Decrements of environment uncertainty and increments of the average numbers of collisions with increasing detection time: (a) decrements of environment uncertainty with increasing detection time; (b) increments of the average numbers of collisions with increasing detection time.

Therefore, the proposed method enables the team of UAVs to search the areas with large target occupancy probabilities in a short time, and it improves the search capability effectively.

Fig. 5(b) shows the increments of the average region coverage percentage of the 50 runs with increasing detection time for the eight search methods. The region coverage percentage is defined as the proportion of the entire search region encompassed by the sensors of the UAV team. As shown in Fig. 5(b), the LMIWD method can reduce the probability of searching the same region. This is because the proposed co-evolutionary mechanism considers both the cooperation and competition among UAVs when they are in the same cell. Therefore, it can better coordinate the search behaviors of the UAVs, and the UAVs can search for the target in a broader area.

Fig. 6(a) shows the decrements of the average environment uncertainty of the 50 runs with increasing detection time for the eight search methods. In this study, environment uncertainty is defined as the Shannon entropy of the target location probability density distribution. As shown in Fig. 6(a), the LMIWD method has a certain superiority over the others in reducing environment uncertainty. By introducing heterogeneous soils to different WD populations, the proposed method can effectively reduce the probability of UAVs searching the same region and improve the probability of UAVs searching the target area. Moreover, the proposed soil update mechanism can further improve the search performance of the algorithm. Therefore, the proposed method can effectively reduce environment uncertainty, collect more environment information, and improve the cooperation capability of the UAV team.

Fig. 6(b) shows the increments of the average numbers of collisions of the 50 runs with increasing detection time for the eight search methods. In this study, the average number of collisions is defined as the average number of collisions of the cooperative paths generated by the last generation of the evolution. Owing to the randomness of each algorithm, the cooperative paths generated by these methods may collide with each other and thus not satisfy constraint (6). Fig. 6(b) shows that the proposed method outperforms the others, and its number of collisions is low (a large portion of the cooperative paths do not collide with each other).


Fig. 7. Probabilistic maps and cooperative paths planned by the proposed method: (a) probabilistic map and cooperative paths of a target search mission scenario in the sea; (b) probabilistic map and cooperative paths of a target search mission scenario on the ground.

By introducing heterogeneous soils to different WD populations, the proposed co-evolutionary mechanism can effectively reduce path overlaps. Moreover, the proposed soil update mechanism can reduce the WDs' probabilities of becoming trapped in map dead-ends. Therefore, the proposed method can improve the probability of generating feasible paths and avoiding collisions among UAVs.

To better validate the effectiveness of the proposed method in complicated environments, two comparative experiments were performed. In these cases, which are representative of likely scenarios, real-world digital terrain data with a resolution of 90 m × 90 m per pixel and a fixed-wing UAV model were used. All UAVs fly at a constant speed of 50 m/s, and their maximum flying range is 400 km. The camera equipped on each UAV performs measurements every 25 s, and its field of view is a circle of diameter 440 m. Both task regions for the UAVs are approximately 38 km in length and 33 km in width, and they are partitioned into 30 × 30 hexagonal cells. Some forbidden areas that the UAVs must not enter are overlaid on the terrain map; they are approximated by hexagonal cells. The two representative search mission scenarios are to search for targets located in the sea and on the ground. Fig. 7(a) and (b) show the prior likelihood distribution maps of the targets of these two search mission scenarios, respectively. The brighter the cell, the larger the target occupancy probability of that cell; the cells marked in red represent the forbidden areas.

The first mission scenario is to search for a target located in the sea (such as a lost boat), and the prior target occupancy probabilities can be approximated by combining a series of exponential functions, such as Dirac delta functions or Gaussian probability density functions. In this experiment, we adopted Gaussian probability density functions to approximate the prior target occupancy function. The second mission scenario is to search for a stationary target located on land, such as hidden missile launching equipment of the enemy. We assume that the prior target occupancy probability is inversely correlated with the terrain altitude: the higher the mean terrain altitude of a cell, the lower its target occupancy probability. Three and four UAVs search for the targets in the sea and land search mission scenarios, respectively.

Fig. 7(a) and (b) depict the cooperative paths requiring T = 40 and T = 50 detection time steps generated by the proposed method in one simulation of each of the two experiments, respectively. In these figures, the pentagrams represent the start positions of the UAVs. As shown by these two figures, the proposed method is prone to search cells with large target existence probabilities, and few overlaps exist among the cooperative paths produced by our method. In the two experiments, the overall detection probabilities are 0.25 and 0.26, and the regions encompassed constitute 13% and 22% of the entire search area, respectively.

To comprehensively validate the effectiveness of our method in complicated environments, we compared it with the random search, greedy search, PSO, ACO, Q-learning, original IWD, and MIWD methods, and the comparative experiments were executed 50 times for each method. Fig. 8(a)–(d) show the increments of the average detection probabilities and the decrements of the average environment uncertainty of the 50 runs with increasing detection time for the eight search methods in these two test cases (the comparison results of the eight methods on the region coverage and number of collisions are similar to those of Figs.
5(b) and 6(b)). As shown in Figs. 5–8, the LMIWD method outperforms the other seven methods in terms of target detection probability, region coverage percentage, environment uncertainty reduction, and probability of collision. This is because it adopts heterogeneous soils to model both the cooperation and competition among different UAVs, as well as a new soil update mechanism, which enables the UAVs to search regions with large target existence probabilities and improves the search efficiency of the UAV team.


Fig. 8. Increments of average detection probabilities and decrements of environment uncertainty with increasing detection time: (a) increments of average detection probabilities with increasing detection time of the sea search mission; (b) increments of average detection probabilities with increasing detection time of the land search mission; (c) decrements of environment uncertainty with increasing detection time of the sea search mission; (d) decrements of environment uncertainty with increasing detection time of the land search mission.

As shown in Figs. 5–8, no significant difference is observed among the performances of the PSO, ACO, Q-learning, and original IWD methods in these experiments, and they are superior to the random search and greedy search methods. This is because these agent-based algorithms can model the cooperation among different agents, and they use historical experience and heuristic information to guide the search behavior of the agents during the evolutionary process. Therefore, these four methods can model the cooperation among UAVs, and they are superior to the random search and greedy search methods. However, they cannot model the competition among different UAVs in the search process, and they are less effective than the proposed method, which simultaneously considers the cooperation and competition among UAVs.

4.3. Computational complexity

As shown in Fig. 2, the LMIWD algorithm comprises three main parts. The first part is executed once at the beginning of the algorithm, while the other two parts are executed iteratively in each generation. The computational complexity of each part in the entire evolutionary process can be obtained as follows.

Part 1: Static parameter initialization. The computational complexity of initializing the maximum number of generations, each population's size, and the velocity updating and soil updating coefficients is O(1). The computational complexity of initializing the initial soil of each population loaded on the search map is O(MN), where M and N are the numbers of WD populations and cells, respectively. Therefore, the computational complexity of this part is O(MN).

Part 2: Dynamic parameter initialization, path construction, and local soil update. The computational complexity of initializing the initial velocity, soil, and visited cells of all WDs is O(KM), where K and M represent the population size and the number of WD populations, respectively. At each path construction step of WD (m, k), the minimum soil of all types except the mth type in each neighboring cell j (assuming that s_min > 0) must be obtained.


Table 1
Computational complexities of the eight planning methods.

Method       Initialization   Path construction        Permutation and combination   Overall
Random       O(1)             O(TKMN_max)              O(TM^2 N_max)                 O(TM^2 N_max)
Greedy       O(1)             O(TKMN_max)              O(TM^2 N_max)                 O(TM^2 N_max)
PSO          O(KM)            O(TKMN_max)              O(TM^2 N_max)                 O(TM^2 N_max)
ACO          O(MN)            O(TKMN_max)              O(TM^2 N_max)                 O(TM^2 N_max)
Q-learning   O(MN)            O(TKMN_max)              O(TM^2 N_max)                 O(TKMN_max)
IWD          O(MN)            O(TKMN_max)              O(TM^2 N_max)                 O(TM^2 N_max)
MIWD         O(MN)            O(TKN_max M log2 M)      O(TM^2 N_max)                 O(TM^2 N_max)
LMIWD        O(MN)            O(TKN_max M log2 M)      O(TM^2 N_max)                 O(TM^2 N_max)
To improve performance, the amounts of all M types of soil in each cell can be stored on a heap (an illustrative sketch is given at the end of this subsection). Once WD (m, k) has transited to cell j, its velocity and soil level are increased, and the mth type of soil in cell j is updated. The computational complexity of updating the heap storing the M types of soil in cell j is O(log2 M). Therefore, the computational complexity of path construction and local soil updating in the entire evolutionary process is O(TKN_max M log2 M), where T, K, M, and N_max represent the number of detection periods, the population size, the number of WD populations, and the maximum number of generations, respectively. In summary, this part has a computational complexity of O(TKN_max M log2 M).

Part 3: Permutation and combination. To calculate the global fitness of a WD in population m, we can first calculate the target detection probabilities of the representatives of the other populations, which requires (M − 1)T computations, where M and T denote the numbers of WD populations and detection periods, respectively. Then, computing the global fitness of all the WDs in population m requires KT computations, where K represents the population size. Therefore, computing the global fitness of all WDs and selecting the optimal cooperative paths requires (M − 1 + K)TMN_max computations in the entire evolutionary process. Updating the soils of the cells that constitute the global cooperative paths requires TMN_max computations. Therefore, the computational complexity of this part is O(TM^2 N_max + TKMN_max).

Similarly, the evolutionary processes of the path planners based on the random search, greedy search, PSO, ACO, Q-learning, original IWD, and MIWD algorithms can be divided into three main parts: parameter initialization, path construction, and permutation and combination. Each part of the MIWD algorithm has the same computational complexity as the corresponding part of the LMIWD algorithm. For the random search, greedy search, PSO, ACO, IWD, and Q-learning algorithms, the computational complexities of each part can be analyzed in the same manner. Table 1 summarizes the comparison results (supposing M > K log2 M is satisfied), where T, K, M, and N_max represent the number of detection periods, population size, number of WD populations, and maximum number of generations, respectively.

When M is larger than K log2 M, the overall computational complexity of the path planners based on the random search, greedy search, PSO, ACO, original IWD, MIWD, and LMIWD algorithms is O(TM^2 N_max), and the overall computational complexity of the Q-learning based planner is O(TKMN_max). When M is smaller than K log2 M, which means that the number of UAVs M is relatively small, the overall computational complexity of the MIWD and LMIWD algorithms is O(TKN_max M log2 M), whereas the overall computational complexity of the random search, greedy search, PSO, ACO, Q-learning, and IWD algorithms is O(TKMN_max). Because the number of UAVs M is relatively small in this case, log2 M is small. Therefore, the overall computational complexities of the MIWD and LMIWD algorithms are approximately O(TKMN_max), and there is no significant difference among the overall computational complexities of the eight algorithms. For example, assume that the population size K is 30 (in swarm intelligence algorithms, the population size K is typically within two or three orders of magnitude). When the number of UAVs is less than 237, M is smaller than K log2 M and log2 M is smaller than 7.89. In this case, the overall computational complexity of the MIWD and LMIWD algorithms can also be approximated by O(TKMN_max).

According to the analysis above, we can conclude that when the number of UAVs is small, there is no significant difference among the overall computational complexities of the eight algorithm-based planners. When the number of UAVs is large, there is no significant difference among the overall computational complexities of the path planners based on the random search, greedy search, PSO, ACO, original IWD, MIWD, and proposed LMIWD algorithms.
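The per-cell bookkeeping behind the O(log2 M) update mentioned above can be organized, for example, with a lazily pruned min-heap per cell. The class below is an illustrative sketch of that idea (the name and interface are ours, not the authors'): it supports a soil update for one type and a minimum query that excludes the querying population's own type.

```python
import heapq

class CellSoilIndex:
    """Per-cell index over the M soil types: update(m, value) pushes in
    O(log M); min_excluding(m) prunes stale heap entries lazily.
    An illustrative sketch of the heap idea assumed in the complexity
    analysis, not the authors' implementation."""

    def __init__(self, initial_soils):
        # initial_soils: list of M soil amounts, one per WD population.
        self.current = list(initial_soils)              # authoritative values
        self.heap = [(v, m) for m, v in enumerate(initial_soils)]
        heapq.heapify(self.heap)

    def update(self, m, value):
        """Set the soil of type m; stale heap entries are purged lazily."""
        self.current[m] = value
        heapq.heappush(self.heap, (value, m))           # O(log M)

    def min_excluding(self, m):
        """Smallest current soil amount over all types h != m."""
        popped, answer = [], None
        while self.heap:
            v, h = self.heap[0]
            if v != self.current[h]:                    # stale entry, discard
                heapq.heappop(self.heap)
                continue
            if h == m:                                  # fresh but excluded type
                popped.append(heapq.heappop(self.heap))
                continue
            answer = v                                  # fresh entry of another type
            break
        for item in popped:                             # restore excluded entries
            heapq.heappush(self.heap, item)
        return answer
```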
5. Conclusion

Aiming at the path planning problem of cooperative target search for multiple unmanned aerial vehicles, an improved intelligent water drops algorithm is proposed. The heterogeneous water drop population design enables drops within the same population to cooperate with each other to obtain the optimal search path, whereas drops in different populations co-evolve with each other to obtain the cooperative search paths. The newly designed soil update mechanism enables the water drops to search paths along more reasonable directions. Therefore, the proposed path planning method can reduce the probability of searching overlapping paths and improve the search effectiveness of the unmanned aerial vehicles. However, it is noteworthy that this method cannot completely avoid path conflicts, and a certain probability of producing overlapping paths remains. These overlapping paths not only waste the limited search resources, but could also cause aircraft collisions. In practical applications, one solution to this problem is post-hoc collision checking and path re-planning.


That is to say, the proposed algorithm can be used to plan the cooperative paths, followed by collision checking; if a path conflict is found, the planning algorithm can be called again to produce new cooperative paths.

Our future work will focus on the following three issues. First, we will apply the proposed algorithm to online search path planning in a distributed way. Next, we will perform a thorough study on hardware implementations and field tests of the proposed algorithm to further validate its performance. Finally, we will incorporate the favorable traits of other swarm intelligence algorithms, such as the teaching-learning-based algorithm, into the proposed algorithm to improve its performance.

Declaration of Competing Interest The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment This study was co-supported by the Natural Science Foundation of Jiangsu Province [grant numbers BK20170914, BK20160903 and BK20160910], Scientific Research Funds of Nanjing University of Posts and Telecommunications [grant number NY217059], National Natural Science Foundation of China [grant numbers 61806100 and 61701251], and Open Research Fund of State Key Laboratory of Tianjin Key Laboratory of Intelligent Information Processing in Remote Sensing [grant number 2016-ZW-KFJJ-01].

References

[1] Horsman G. Unmanned aerial vehicles: a preliminary analysis of forensic challenges. Digit Invest 2016;16:1–11.
[2] Lu HM, Li YJ, Mu SL, Wang D, Kim H, Serikawa S. Motor anomaly detection for unmanned aerial vehicles using reinforcement learning. IEEE Internet Things J 2018;5(4):2315–22.
[3] Lu HM, Li YJ, Guna J, Serikawa S. Proposal of a power-saving unmanned aerial vehicle. In: Proceedings of the EAI international conference on testbeds and research infrastructures for the development of networks & communities; 2017. p. 1–9.
[4] Wu ZL, Guan ZY, Yang CW, Li J. Terminal guidance law for UAV based on receding horizon control strategy. Complexity 2017;2017(5):1–8.
[5] Sujit PB, Ghose D. Search using multiple UAVs with flight time constraints. IEEE Trans Aerosp Electron Syst 2004;40(2):491–509.
[6] Berger J, Lo N. An innovative multi-agent search-and-rescue path planning approach. Comput Oper Res 2015;53:24–31.
[7] Xiao H, Cui R, Xu D. A sampling-based Bayesian approach for cooperative multi-agent online search with resource constraints. IEEE Trans Cybern 2018;48(6):1773–85.
[8] Din A, Jabeen M, Zia K, Khalid A, Saini DK. Behavior-based swarm robotic search and rescue using fuzzy controller. Comput Electr Eng 2018;70:53–65.
[9] Breivik Ø, Allen A, Maisondieu C, Olagnon M. Advances in search and rescue at sea. Ocean Dyn 2013;63(1):83–8.
[10] Stone LD. Search theory. Encyclopedia of operations research and management science. Boston: Springer; 2013.
[11] Mac TT, Copot C, Tran DT, Keyser RD. Heuristic approaches in robot path planning: a survey. Robot Auton Syst 2016;86:13–28.
[12] Jiang TY, Li J, Li B, Huang KW, Yang CW, Jiang YM. Trajectory optimization for a cruising unmanned aerial vehicle attacking a target at back slope while subjected to a wind gradient. Math Probl Eng 2015;2015(1):1–14.
[13] Tian HF, Li J, Yang CW. Trajectory optimization of airdropped LAV based on gauss pseudospectral method considering the wind gradient. J Comput Theor Nanosci 2016;13(7):4390–8.
[14] Lu HM, Li YJ, Chen M, Kim H, Serikawa S. Brain intelligence: go beyond artificial intelligence. Mob Netw Appl 2018;23(2):368–75.
[15] Dorigo M, Blum C. Ant colony optimization theory: a survey. Theor Comput Sci 2005;344:243–78.
[16] You CX, Lu JB, Filev D, Tsiotras P. Advanced planning for autonomous vehicles using reinforcement learning and deep inverse reinforcement learning. Robot Auton Syst 2019;114:1–18.
[17] Hosseini HS. The intelligent water drops algorithm: a nature-inspired swarm-based optimization algorithm. Int J Bio-Inspired Comput 2009;1(1–2):71–9.
[18] Alijla BO, Wong LP, Lim CP, Khader AT, Al-Betar MA. A modified intelligent water drops algorithm and its application to optimization problems. Expert Syst Appl 2014;41(15):6555–69.
[19] Niu SH, Ong SK, Nee AYC. An improved intelligent water drops algorithm for solving multi-objective job shop scheduling. Eng Appl Artif Intell 2013;26:431–42.
[20] Teymourian E, Kayvanfar V, Komaki GHM, Zandieh M. Enhanced intelligent water drops and cuckoo search algorithms for solving the capacitated vehicle routing problem. Inf Sci 2016;334–335:354–78.
[21] Elsherbiny S, Eldaydamony E, Alrahmawy M, Reyad AE. An extended intelligent water drops algorithm for workflow scheduling in cloud computing environment. Egypt Inf J 2018;19(1):33–55.
[22] Alijla BO, Lim CP, Wong LP. An ensemble of intelligent water drop algorithm for feature selection optimization problem. Appl Soft Comput 2018;65:531–41.
Xixia Sun is a lecturer in the College of Internet of Things, Nanjing University of Posts and Telecommunications. She received the Master degree in computational mathematics and the Ph.D. degree in Control Science and Engineering from Huazhong University of Science and Technology. Her research interests include intelligent algorithms and UAV mission planning.

Chao Cai is an associate professor in the School of Artificial Intelligence and Automation, Huazhong University of Science and Technology. He received the Master degree in computational mathematics and the Ph.D. degree in Control Science and Engineering from Huazhong University of Science and Technology. His research interests include mission planning and pattern recognition.

Su Pan is a professor in the Faculty of Telecommunication Engineering of Nanjing University of Posts and Telecommunications. He received the Master of Engineering degree from Nanjing University of Posts and Telecommunications, and the Ph.D. degree in Electrical and Electronic Engineering from the University of Hong Kong. His research interests include resource management and wireless resource allocation.


Zhengning Zhang is a Ph.D. candidate in the Department of Electronic Engineering at Tsinghua University. He received the Master degree from the Shenzhou Institute at the China Academy of Space Technology, and has been an engineer with Tianjin Zhongwei Aerospace Data System Technology Co., Ltd. His current research interests include unmanned system intelligence and satellite/UAV data co-processing.

Qiyu Li is an engineer with Tianjin Zhongwei Aerospace Data System Technology Co., Ltd, and a researcher with the Tianjin Key Laboratory of Intelligent Information Processing in Remote Sensing. He received the Master degree in Electronic and Information Engineering at Tianjin University. His research interests include UAV system integration and application, and UAV remote sensing.